.. _debugging-guide:

Debugging Guide
===============

This section describes how to debug your kernel code.
See :ref:`working-with-code-samples` for how to compile and simulate (run the program).

To debug, you can use the following tools:

- ``csdb`` debugger for interactive debugging on hardware.
- ``sdk_debug_shell visualize``, which launches the SDK GUI to look at all simulation results such as timeline and traces. See :ref:`sdk-gui` for more information.
- ``sim.log`` simulator log file, which records a cycle-by-cycle log of wavelets or instructions executed on each PE.

CSDB debugger
-------------

CSDB is the Cerebras Software Language Debugger for the Wafer-Scale Engine.
CSDB can be run on the host machine for interactive debugging with the Wafer-Scale Engine on issues such as hangs and functional failures.
CSDB can also be used to inspect and debug coredumps produced from a simulator run.

Below is a tutorial on how to use ``csdb`` to inspect a coredump from a simulator run.

Tutorial
~~~~~~~~

We will use the :ref:`sdkruntime-gemv-checkerboard` example program for this tutorial.

First, to produce corefiles, add the following line to ``run.py`` right before ``runner.stop()`` is called:

.. code-block:: python

    runner.dump_core("corefile.cs1")

Note that the specified filename for the coredump MUST be ``corefile.cs1`` to produce the correct file types for ``csdb``.

Run ``commands.sh`` to compile and execute the program and produce the corefiles.
The run will produce four files: ``corefile.cs1_0``, ``corefile.cs1_1``, ``corefile.cs1_2``, and ``corefile.cs1_3``.

We are now ready to use ``csdb``. Start ``csdb`` from the current working directory:

.. code-block:: bash

    $ csdb .
    INFO:csdb: . contains more than one CSL compile directory.
    Starting debug shell...

``csdb`` reports that we have multiple compile directories: this is because the top level compile directory, ``out``, contains subdirectories containing compile output for the ``memcpy`` infrastructure.

Select ``out`` as our compile context, and target the produced corefiles:

.. code-block:: bash

    (csdb) context select out
    (csdb) target create --core-file=corefile.cs1

Run ``settings`` to see the current working directory, compile context, and target, along with the fabric rectangle dimensions:

.. code-block:: bash

    (csdb) settings
    INFO:csdb: Workdir: .
    INFO:csdb: Compile context: gemv-checkerboard-pattern/out/
    INFO:csdb: Target (core file): corefile.cs1
    INFO:csdb: Rectangle(s):
    INFO:csdb: Rect (x = 0, y = 0, width = 11, height = 6) selected
    INFO:csdb: Trace: no selected.

Run ``help`` to take a look at the available options:

.. code-block:: bash

    (csdb) help

    Documented commands (type help <topic>):
    ========================================
    context  memory     register  target  wavelet
    image    rectangle  settings  trace   workdir

    Undocumented commands:
    ======================
    exit  help  quit

Select a new subrectangle of PEs, containing only a single PE, and deselect the default rectangle containing the whole fabric.
Show the current rectangle(s) with ``rectangle show``:

.. code-block:: bash

    (csdb) rectangle show
    INFO:csdb: Rectangle(s):
    INFO:csdb: Rect (x = 0, y = 0, width = 11, height = 6) selected
    (csdb) rectangle select 4,1,1,1
    (csdb) rectangle show
    INFO:csdb: Rectangle(s):
    INFO:csdb: Rect (x = 0, y = 0, width = 11, height = 6) selected
    INFO:csdb: Rect (x = 4, y = 1, width = 1, height = 1) selected
    (csdb) rectangle deselect 0,0,11,6
    INFO:csdb: Removing ('', Rect (x = 0, y = 0, width = 11, height = 6))
    (csdb) rectangle show
    INFO:csdb: Rectangle(s):
    INFO:csdb: Rect (x = 4, y = 1, width = 1, height = 1) selected

We can read memory values of the PE in the rectangle by using the ``memory`` command:

.. code-block:: bash

    (csdb) memory read --address 0x9e0 --length 4
    MSGS155 21:48:27 GMT Output will be directed to file 'memory-x4y1w1h1_09e0_09e4.log'
    MSGS155 21:48:27 GMT Log file: 'memory-x4y1w1h1_09e0_09e4.log'

The memory values will be written to the log file specified above.

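
The log file can also be post-processed outside of ``csdb``. The following is a minimal Python sketch, not part of ``csdb`` itself. It assumes that every data line in the log has the layout shown in the memory command example in this guide, ``(x,y) address: word word ...``, with a hexadecimal address and hexadecimal 16-bit words. The helper names ``parse_memory_log_line`` and ``read_memory_log`` are hypothetical.

.. code-block:: python

    import re

    # Assumed line layout, taken from the memory command example:
    #   (4,1) 09e0: 06af 8af0 06af 8060
    LINE_RE = re.compile(r"^\((\d+),(\d+)\)\s+([0-9a-fA-F]+):\s*(.*)$")


    def parse_memory_log_line(line):
        """Return (x, y, address, words) for one log line, or None if it does not match."""
        match = LINE_RE.match(line.strip())
        if match is None:
            return None
        x, y = int(match.group(1)), int(match.group(2))
        address = int(match.group(3), 16)  # address in 16-bit word units
        words = [int(token, 16) for token in match.group(4).split()]
        return x, y, address, words


    def read_memory_log(path):
        """Parse every matching line of a memory log file, skipping other lines."""
        with open(path) as handle:
            parsed = (parse_memory_log_line(line) for line in handle)
            return [entry for entry in parsed if entry is not None]

For instance, ``read_memory_log("memory-x4y1w1h1_09e0_09e4.log")`` would return one tuple per PE in the selected rectangle if the log follows the assumed layout. Because addresses are in units of 16 bits, the byte offset of an entry is ``2 * address``.
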

Terminology
~~~~~~~~~~~

- Compilation context: The directory generated by ``cslc``. By default, the name is ``out``.
- Trace: The directory generated after a simulation is run. By default, the name is ``simfab_traces``.
- Working directory: Also known as workdir, this is the directory to which the debugger writes its output.

Commands
~~~~~~~~

Context command
"""""""""""""""

The context command is used to select or change the compile context created by the CSL compiler.
Once a context is selected, a debug session can be started by creating a target.

.. code-block:: bash

    Usage: context [OPTIONS] COMMAND [ARGS]...

      Set compile context.

    Options:
      --help  Show this message and exit.

    Commands:
      list    List all the compile context in workdir.
      select  Select the directory that contains the ELF binaries as compile...
      show    Show the selected compile context.

**Example: list all the contexts and select one**

.. note::
    The "." after "[2]" in the example below means the current directory.

.. code-block:: bash

    (csdb) context list
    INFO:csdb: [0] orig_hw2/out
    INFO:csdb: [1] orig_hw/out
    INFO:csdb: [2] .
    (csdb) context select orig_hw/out
    # is same as
    (csdb) context select [1]

Memory command
""""""""""""""

To read from memory, the user must first specify a rectangle and a target.
When ``memory read`` is called, CSDB will read from a core file or a device.
The output of the read is written to a log file whose name begins with "memory".
All addresses and lengths are in units of 16 bits (2 bytes).

.. code-block:: bash

    Usage: memory [OPTIONS] COMMAND [ARGS]...

      Read and write to memory locations in PEs.

    Options:
      --help  Show this message and exit.

    Commands:
      read   Read memory from a core file or a device.
      write  Write memory to a device in units of 2-bytes.

**Example: output from reading tile (4,1) on address 0x09e0, length 4**

.. code-block:: bash

    (4,1) 09e0: 06af 8af0 06af 8060

Rectangle command
"""""""""""""""""

The rectangle command allows the user to select a rectangle within the fabric.
By default, the selected rectangle is the entire fabric.
The context must be selected before you can use the rectangle commands.

.. code-block:: bash

    Usage: rectangle [OPTIONS] COMMAND [ARGS]...

      Select rectangle

    Options:
      --help  Show this message and exit.

    Commands:
      reset   Resets the current rectangle to fabric dimension.
      select  Selects a rectangle.
      show    Shows the current rectangle.

**Example: select a rectangle on (1,2) w3 h4**

.. code-block:: bash

    (csdb) rectangle select 1,2,3,4

Settings command
""""""""""""""""

The settings command is used to see the work directory, compile context, target, rectangle and trace.

Target command
""""""""""""""

The purpose of the target command is to create a debug session. It is similar to attaching ``gdb`` to a process.
You can create an interactive debug session by connecting to a CM IP address, or perform post-mortem debugging by examining a core file.
During an interactive debug session, you can use ``save-core`` to save a core file for examination later.

.. code-block:: bash

    (csdb) target
    Usage: target [OPTIONS] COMMAND [ARGS]...

      Connect to CM for interactive debugging or examine a core file.

    Options:
      --help  Show this message and exit.

    Commands:
      create     Connects to CM or read core file as target.
      list       List all the core files.
      save-core  Save a core file after connecting to a CM target
      show       Show selected core file.

**Example: create an interactive debug session**

.. code-block:: bash

    (csdb) target create --cmaddr 12.34.56.78:9000

**Example: list and load the core file**

.. code-block:: bash

    (csdb) target list
    INFO:csdb: [0] core-ckpt
    INFO:csdb: [1] my_try1-ckpt
    (csdb) target create --core-file core-ckpt

Trace command
"""""""""""""

The purpose of the trace command is to specify a directory in which a ``simfab_traces`` directory has been generated, so that it can be read for wavelet trace information.

.. code-block:: bash

    (csdb) trace --help
    Usage: trace [OPTIONS] COMMAND [ARGS]...

      Select trace

    Options:
      --help  Show this message and exit.

    Commands:
      list    List all the valid post-run generated directory.
      select  Select the directory that contains post-run traces.
      show    Show the post-run directory that is set.

**Example: select current run directory with simfab_traces**

.. code-block:: bash

    (csdb) trace select .

At this point, you can use the ``wavelet`` command to inspect the wavelet traces of this run.

Workdir command
"""""""""""""""

The purpose of the workdir command is to specify a directory for output files to be written.

.. code-block:: bash

    (csdb) workdir --help
    Usage: workdir [OPTIONS] COMMAND [ARGS]...

      workdir is the directory for debug session.

    Options:
      --help  Show this message and exit.

    Commands:
      select  Select a workdir.

**Example: select a workdir**

.. code-block:: bash

    (csdb) workdir select path/to/workdir

sdk_debug_shell
---------------

The ``sdk_debug_shell`` tool is used to run a smoke test or launch the SDK GUI visualizer.

.. code-block:: bash

    $ sdk_debug_shell --help
    Usage: sdk_debug_shell [OPTIONS] COMMAND [ARGS]

      Debugger tool for the Cerebras WSE kernel code.

    Options:
      --help  Show this message and exit.

    Commands:
      smoke      Run the smoke tests.
      visualize  Invokes the visualization tool csviz.

Smoke test
~~~~~~~~~~

The ``smoke`` command runs the smoke tests in the specified directory.

.. code-block:: bash

    Usage: sdk_debug_shell smoke [OPTIONS] [CSL_EXAMPLES_DIR]...

      Run the smoke tests.

    Options:
      --help  Show this message and exit.

Visualizer
~~~~~~~~~~

When you use the ``visualize`` command, the debugger invokes the :ref:`sdk-gui`, and you can visually inspect the debugging information in a web browser.
The default ``artifact_dir`` is the current directory.

.. code-block:: bash

    $ sdk_debug_shell visualize --help
    Usage: sdk_debug_shell visualize [OPTIONS]

      Visualize routing between PEs, post-simulation run results. For example,
      wavelet trace and instruction trace in a web browser.

    Options:
      --artifact_dir PATH
      --help               Show this message and exit.

Simulator Logs
--------------

When running in the simulator, you can produce a simulator log file ``sim.log`` with cycle-by-cycle information about wavelets or instructions.
The ``SINGULARITYENV_SIMFABRIC_DEBUG`` environment variable is used to control the output of ``sim.log``.

Landing Logs
~~~~~~~~~~~~

``SINGULARITYENV_SIMFABRIC_DEBUG=landing`` produces a log of wavelet landings on each PE's router, giving the cycle and color on which the wavelet lands, the direction from which it landed, its data, and its identity.

Instruction Trace Logs
~~~~~~~~~~~~~~~~~~~~~~

``SINGULARITYENV_SIMFABRIC_DEBUG=inst_trace`` produces an instruction trace that shows which instruction a PE is executing at each cycle.
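
One way to set the variable for a single run is from a small Python wrapper. The sketch below is only an illustration under stated assumptions: it assumes the example's ``commands.sh`` script is in the current directory and that the simulator picks the variable up from the environment of the launched process. The helper names ``simfabric_env`` and ``run_with_sim_log`` are hypothetical, and only the two values described above are accepted here; other values may exist.

.. code-block:: python

    import os
    import subprocess

    # The two values described in this section.
    KNOWN_MODES = ("landing", "inst_trace")


    def simfabric_env(mode, base_env=None):
        """Return a copy of the environment with the sim.log mode set."""
        if mode not in KNOWN_MODES:
            raise ValueError(f"unknown sim.log mode: {mode!r}")
        env = dict(os.environ if base_env is None else base_env)
        env["SINGULARITYENV_SIMFABRIC_DEBUG"] = mode
        return env


    def run_with_sim_log(mode, script="./commands.sh"):
        """Run the example script (assumed to exist) with the variable set."""
        return subprocess.run(["bash", script], env=simfabric_env(mode), check=True)

Calling ``run_with_sim_log("landing")`` would then launch the script with the variable set for that run only, leaving the parent shell's environment unchanged.
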