VP Requirements & Abstraction Level

When architecting a Virtual Platform (VP), the first question an Electronic System Level (ESL) architect must answer is: What is the abstraction level?

If the goal is to boot an operating system (Linux, Android) as fast as possible to develop software early, you need Loosely Timed (LT) models. If the goal is to analyze bus contention, memory bandwidth, and cache hit ratios, you need Approximately Timed (AT) models.

Loosely Timed (LT) vs Approximately Timed (AT)

The TLM-2.0 standard, which is part of IEEE 1666-2011, defines these two coding styles.

Loosely Timed (LT)

Goal: Maximum simulation speed.
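Mechanism: Uses the b_transport blocking interface. A transaction executes in a single C++ function call. The timing annotation is passed by reference (sc_core::sc_time& delay), and each component on the path adds its latency to it. With temporal decoupling, the initiator runs ahead of simulation time and yields to the scheduler only once per time quantum rather than on every transaction, which avoids expensive context switches.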
Use Case: Software development, firmware validation, functional verification.
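To make the cost argument concrete, the following sketch counts how many times a temporally decoupled initiator has to synchronize with the scheduler. It is a toy model in plain C++ with no SystemC dependency; the names count_syncs and SyncStats, and the fixed per-transaction latency, are assumptions made for illustration only.

```cpp
#include <cstdint>

// Toy model, not SystemC: result of running a temporally decoupled initiator.
struct SyncStats {
    std::uint64_t syncs;        // number of scheduler synchronizations
    std::uint64_t residual_ns;  // local time still pending after the last transaction
};

// Each transaction adds latency_ns to a local time offset. The initiator
// synchronizes (and resets the offset) only when the offset reaches quantum_ns.
SyncStats count_syncs(std::uint64_t transactions,
                      std::uint64_t latency_ns,
                      std::uint64_t quantum_ns) {
    SyncStats stats{0, 0};
    std::uint64_t local_ns = 0;
    for (std::uint64_t i = 0; i < transactions; ++i) {
        local_ns += latency_ns;
        if (local_ns >= quantum_ns) {
            ++stats.syncs;   // stands in for a wait() call in a real model
            local_ns = 0;
        }
    }
    stats.residual_ns = local_ns;
    return stats;
}
```

Under these assumptions, count_syncs(1000, 10, 1000) reports 10 synchronizations, while a quantum equal to the per-transaction latency, count_syncs(1000, 10, 10), reports 1000. A nonzero residual_ns is local time that still has to be flushed with a final synchronization.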

Approximately Timed (AT)

Goal: Cycle-approximate performance analysis.
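Mechanism: Uses the nb_transport_fw and nb_transport_bw non-blocking interfaces. A single transaction is broken into multiple phases (BEGIN_REQ, END_REQ, BEGIN_RESP, END_RESP) that pass between initiator and target over the forward and backward paths, through any interconnect in between. Each phase marks a distinct timing point, so pipelining and bus arbitration can be modeled with approximate rather than cycle-exact timing.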
Use Case: Architectural exploration, performance bottleneck analysis.
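The phase ordering can be sketched as a small checker. This is a simplified plain C++ model, not the TLM-2.0 API: the Phase and Path enums and both function names are hypothetical, and the sketch covers only the full four-phase case, ignoring early completion and phase updates conveyed on the return path of a call.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-ins for the four base-protocol phases.
enum class Phase { BeginReq, EndReq, BeginResp, EndResp };

// Direction in which a phase travels in the full four-phase case.
enum class Path { Forward, Backward };

// BEGIN_REQ and END_RESP are sent by the initiator (forward path);
// END_REQ and BEGIN_RESP are sent by the target (backward path).
Path path_of(Phase p) {
    return (p == Phase::BeginReq || p == Phase::EndResp) ? Path::Forward
                                                         : Path::Backward;
}

// True only if the trace is exactly BEGIN_REQ, END_REQ, BEGIN_RESP, END_RESP.
bool is_four_phase_sequence(const std::vector<Phase>& trace) {
    static const Phase expected[] = {Phase::BeginReq, Phase::EndReq,
                                     Phase::BeginResp, Phase::EndResp};
    if (trace.size() != 4) {
        return false;
    }
    for (std::size_t i = 0; i < trace.size(); ++i) {
        if (trace[i] != expected[i]) {
            return false;
        }
    }
    return true;
}
```

A checker like this could sit in a hypothetical monitor between initiator and target to flag out-of-order phases. The four timing points it tracks are what an AT model has and an LT model does not.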

End-to-End LT Initiator Example

In a Doulos Simple Bus compliant VP, we predominantly use LT to boot software. The following is a minimal LT initiator that uses temporal decoupling.

#include <cstdint>
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
 
SC_MODULE(LT_CPU_Model) {
    tlm_utils::simple_initiator_socket<LT_CPU_Model> socket;
 
    SC_CTOR(LT_CPU_Model) : socket("socket") {
        SC_THREAD(execute_instructions);
    }
 
    void execute_instructions() {
        tlm::tlm_generic_payload trans;
        sc_core::sc_time local_delay = sc_core::SC_ZERO_TIME;
        uint32_t data = 0;
 
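        // Temporal decoupling: accumulate time locally instead of yielding to the SystemC scheduler on every access
        for (int i = 0; i < 1000; i++) {
            trans.set_command(tlm::TLM_READ_COMMAND);
            trans.set_address(0x1000);
            trans.set_data_ptr(reinterpret_cast<unsigned char*>(&data));
            trans.set_data_length(4);
            trans.set_streaming_width(4);       // Equal to data length: not a streaming access
            trans.set_byte_enable_ptr(nullptr); // No byte enables
            trans.set_dmi_allowed(false);       // Cleared by the initiator; the target may set it
            trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);

            // The target adds its access latency to local_delay
            socket->b_transport(trans, local_delay);
            if (trans.is_response_error()) {
                SC_REPORT_ERROR("LT_CPU_Model", trans.get_response_string().c_str());
            }

            // Add internal CPU instruction execution latency
            local_delay += sc_core::sc_time(10, sc_core::SC_NS);

            // Sync with global time only when the quantum is reached (e.g., 1000 ns)
            if (local_delay >= sc_core::sc_time(1000, sc_core::SC_NS)) {
                wait(local_delay); // Expensive context switch
                local_delay = sc_core::SC_ZERO_TIME;
            }
        }

        // Flush any time still accumulated locally before the thread ends
        if (local_delay != sc_core::SC_ZERO_TIME) {
            wait(local_delay);
        }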
    }
};
 
int sc_main(int argc, char* argv[]) {
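    // Placeholder so the file compiles standalone. A real VP instantiates
    // LT_CPU_Model and binds its socket to a target before calling sc_core::sc_start().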
    return 0;
}

Gathering requirements upfront avoids the costly mistake of writing slow, cycle-accurate models when the software team only needs a fast functional platform.