Chapter 13: Modeling Best Practices

Modeling Best Practices: Debugging Playbook

How experts debug SystemC failures by separating elaboration, scheduling, binding, timing, payload, and configuration problems.

How to Read This Lesson

SystemC failures often look mysterious because C++ execution, kernel scheduling, and modeled hardware behavior are heavily interleaved. A segmentation fault could be a C++ pointer error, or it could be a TLM payload mismatch. A simulation hang could be an infinite loop, or it could be a delta-cycle deadlock.

Expert debugging requires classifying the failure first, then applying the correct tools and examining the correct kernel data structures in your debugger.

Class 1: Elaboration and Binding Failures

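Symptoms: Exceptions thrown during elaboration, before any process runs (binding is completed at the start of the first sc_start() call); messages about unbound ports.

Cause: By default, the LRM requires every port and export to be bound by the end of elaboration, before simulation starts.

Playbook: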

  • Use the LRM API sc_core::sc_get_top_level_objects() to print the hierarchy.
  • In GDB, set a breakpoint on sc_core::sc_port_base::complete_binding().
  • Ensure no modules were instantiated as local stack variables inside constructors. This breaks sc_object_manager::m_hierarchy_curr, corrupting the kernel's object tree.
  • Check that multi-ports (sc_port<IF, N>) have the correct number of bindings.

Class 2: Scheduling Failures (Hangs and Deadlocks)

Symptoms: Simulation time (sc_time_stamp()) stops advancing, but CPU usage is 100%. Or, the simulation freezes entirely.

Cause:

  • CPU 100%: A delta-cycle loop. Process A triggers B and B triggers A, all without simulation time advancing.
  • Freeze: An SC_THREAD never calls wait(), so it never returns control to the cooperative scheduler.

Playbook:

  • If frozen, attach GDB and inspect sc_core::sc_get_curr_simcontext()->m_runnable (an internal member of the Accellera reference implementation). If it is empty and the current frame is inside user code rather than sc_simcontext::crunch(), a thread never yielded. A thread yields by calling wait(), which performs the coroutine context switch (qt_block when the kernel is built with QuickThreads).
  • If spinning at 100% CPU, inspect the primitive-channel update list kept by sc_prim_channel_registry and sc_simcontext::m_delta_events. A delta loop repopulates these continuously. Use the LRM API sc_core::sc_delta_count() to print the delta cycle count. If it increases while sc_time_stamp() stands still, you have found the loop.
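
To see why a delta loop keeps the kernel busy without advancing time, the sketch below models the evaluate/update cycle in plain C++. It is a toy model and not the SystemC kernel: ToySignal and run_deltas are hypothetical names, and for simplicity every process runs in every delta instead of being triggered by sensitivity.

```cpp
#include <functional>
#include <vector>

// Toy signal: reads see the current value, writes take effect on update().
struct ToySignal {
    bool cur = false;
    bool next = false;
    bool read() const { return cur; }
    void write(bool v) { next = v; }
    bool update() {                 // returns true if the value changed
        bool changed = (cur != next);
        cur = next;
        return changed;
    }
};

// Runs delta cycles at one fixed point in simulated time.
// Returns the number of deltas needed to settle, or -1 if the
// limit is reached (the toy equivalent of a delta-cycle loop).
int run_deltas(const std::vector<ToySignal*>& signals,
               const std::vector<std::function<void()>>& processes,
               int max_deltas) {
    for (int delta = 0; delta < max_deltas; ++delta) {
        for (const auto& process : processes) process();   // evaluate phase
        bool any_change = false;
        for (ToySignal* s : signals) {                      // update phase
            if (s->update()) any_change = true;
        }
        if (!any_change) return delta + 1;                  // settled
    }
    return -1;                                              // never settled
}
```

A ring with one inverting stage (a = !b, b = a) never settles and hits the limit, while two inverting stages (a = !b, b = !a) settle as soon as the signals hold opposite values. The delta limit plays the same role as watching sc_delta_count() in a real simulation.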

Class 3: TLM-2.0 Protocol Failures

Symptoms: Firmware reads garbage data, routers forward to the wrong target, or an initiator sees a transaction come back with a TLM_INCOMPLETE_RESPONSE status.

Cause: Violation of the TLM-2.0 base protocol.

Playbook:

  • In GDB, break on your b_transport function and inspect the tlm_generic_payload object. Check the m_address, m_command, and m_response_status fields.
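  • Verify that the target calls set_response_status() on every path through b_transport (for example with TLM_OK_RESPONSE). A payload the target never updates keeps its default status, TLM_INCOMPLETE_RESPONSE.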
  • If using DMI (Direct Memory Interface), disable it temporarily. If the bug disappears, your DMI invalidation logic is likely flawed. DMI accesses go through a raw pointer and bypass the transport calls, so memory overwrites are invisible to breakpoints and logging on b_transport.

Complete Example: Debugging a Delta Cycle Loop

Here is a complete sc_main program that demonstrates a classic delta-cycle loop (a Class 2 failure) and shows how to use the LRM API sc_delta_count() to detect and debug it programmatically.

#include <systemc>
#include <iostream>
 
SC_MODULE(DeltaLoopDemo) {
    sc_core::sc_signal<bool> sig_a{"sig_a"};
    sc_core::sc_signal<bool> sig_b{"sig_b"};
 
    SC_CTOR(DeltaLoopDemo) {
        SC_METHOD(process_a);
        sensitive << sig_b; // A reacts to B
        
        SC_METHOD(process_b);
        sensitive << sig_a; // B reacts to A
 
        SC_METHOD(monitor_deltas);
        sensitive << sig_a << sig_b;
    }
 
    void process_a() {
        // Invert B and write to A
        // write() requests an update; the new value becomes visible in the next update phase
        sig_a.write(!sig_b.read());
    }
 
    void process_b() {
        // Copy A to B without inverting. The ring then contains a single inversion,
        // so it can never settle (two inverting stages would form a stable latch).
        sig_b.write(sig_a.read());
    }
 
    void monitor_deltas() {
        // sc_delta_count() returns the kernel's running delta-cycle counter
        sc_dt::uint64 current_delta = sc_core::sc_delta_count();
        std::cout << "[Time: " << sc_core::sc_time_stamp() 
                  << "] Delta Count: " << current_delta << "\n";
 
        // Safeguard to prevent an actual infinite loop in this demonstration
        if (current_delta > 10) {
            SC_REPORT_ERROR("DEBUG", "Delta cycle loop detected! Aborting.");
        }
    }
};
 
int sc_main(int argc, char* argv[]) {
    // Make SC_ERROR reports display and call sc_stop() instead of throwing (the default)
    sc_core::sc_report_handler::set_actions(sc_core::SC_ERROR, sc_core::SC_DISPLAY | sc_core::SC_STOP);
 
    DeltaLoopDemo demo("demo");
 
    std::cout << "Starting Simulation... Watch the delta count explode.\n";
    
    // Kick off the loop: a write during elaboration is applied in the initialization update phase
    demo.sig_b.write(true); 
 
    sc_core::sc_start(1, sc_core::SC_MS);
    
    std::cout << "Simulation stopped cleanly after detecting the loop.\n";
    return 0;
}

Explanation of the Execution

When you run this code, process_a writes to sig_a. In the update phase, sig_a changes, triggering process_b. process_b writes to sig_b. In the next update phase, sig_b changes, triggering process_a.

Because signal updates take zero simulation time, sc_time_stamp() remains at 0 s, but the kernel's crunch() function evaluates continuously.

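The output will look roughly like this. The first delta count value and the exact layout of the error report depend on the SystemC implementation and version: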

Starting Simulation... Watch the delta count explode.
[Time: 0 s] Delta Count: 1
[Time: 0 s] Delta Count: 2
[Time: 0 s] Delta Count: 3
...
[Time: 0 s] Delta Count: 11
Error: (E0000) DEBUG: Delta cycle loop detected! Aborting.
Simulation stopped cleanly after detecting the loop.

By printing sc_time_stamp() alongside sc_delta_count(), the architect instantly diagnoses a combinatorial feedback loop rather than wondering why the simulation froze.
