Chapter 4: TLM and Platforms

TLM Performance: DMI, Quantum Tuning, and Payload Discipline

A senior-level guide to making TLM virtual platforms fast without breaking timing, ordering, or debug behavior.

TLM performance is not one trick. It is a set of disciplined choices: transport style, payload reuse, direct memory interface, temporal decoupling, report cost, and how much timing fidelity the use case really needs.

DMI and Payload Reuse Example

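DMI (Direct Memory Interface) lets an initiator bypass repeated socket calls for memory-like regions by reading and writing the target's storage through a raw pointer. It is most valuable for RAM and ROM, where accesses have no side effects. Generic payloads should also be reused so that hot initiators do not pay for a heap allocation on every transaction.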

Here is a full compilable example demonstrating DMI and payload reuse:

#include <cstdint>
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/simple_target_socket.h>
 
using namespace sc_core;
 
SC_MODULE(FastMemory) {
  tlm_utils::simple_target_socket<FastMemory> socket{"socket"};
  unsigned char memory[1024];
 
  SC_CTOR(FastMemory) {
    socket.register_b_transport(this, &FastMemory::b_transport);
    socket.register_get_direct_mem_ptr(this, &FastMemory::get_direct_mem_ptr);
  }
 
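  void b_transport(tlm::tlm_generic_payload& trans, sc_time& delay) {
    // Normal slow-path transport: bounds-check, then copy to or from the array.
    // Byte enables and streaming width are ignored in this small example.
    const sc_dt::uint64 addr = trans.get_address();
    const unsigned int len = trans.get_data_length();
    if (addr >= sizeof(memory) || len > sizeof(memory) - addr) {
      trans.set_response_status(tlm::TLM_ADDRESS_ERROR_RESPONSE);
      return;
    }
    unsigned char* data = trans.get_data_ptr();
    if (trans.is_write()) {
      for (unsigned int i = 0; i < len; ++i) memory[addr + i] = data[i];
    } else if (trans.is_read()) {
      for (unsigned int i = 0; i < len; ++i) data[i] = memory[addr + i];
    }
    trans.set_response_status(tlm::TLM_OK_RESPONSE);
    trans.set_dmi_allowed(true); // Hint to initiator that DMI is available
  }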
 
  bool get_direct_mem_ptr(tlm::tlm_generic_payload& trans, tlm::tlm_dmi& dmi_data) {
    // Grant DMI access to the entire 1KB memory
    dmi_data.allow_read_write();
    dmi_data.set_dmi_ptr(memory);
    dmi_data.set_start_address(0);
    dmi_data.set_end_address(1023);
    dmi_data.set_read_latency(SC_ZERO_TIME);
    dmi_data.set_write_latency(SC_ZERO_TIME);
    return true;
  }
};
 
SC_MODULE(OptimizedInitiator) {
  tlm_utils::simple_initiator_socket<OptimizedInitiator> socket{"socket"};
  tlm::tlm_generic_payload reused_payload; // Payload reuse
  
  unsigned char* dmi_ptr = nullptr;
  uint64_t dmi_start = 0, dmi_end = 0;
  bool dmi_valid = false;
 
  SC_CTOR(OptimizedInitiator) { SC_THREAD(run); }
 
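  void run() {
    // Two one-byte writes: the first takes the slow path and requests DMI,
    // the second is served through the DMI pointer.
    for (uint64_t addr = 0x10; addr < 0x12; ++addr) {
      // Fast path: use DMI when the address falls inside the granted range
      if (dmi_valid && addr >= dmi_start && addr <= dmi_end) {
        dmi_ptr[addr - dmi_start] = 0xAA;
        continue;
      }

      // Slow path: reconfigure the reused payload for this access
      unsigned char data = 0xAA;
      reused_payload.set_command(tlm::TLM_WRITE_COMMAND);
      reused_payload.set_address(addr);
      reused_payload.set_data_ptr(&data);
      reused_payload.set_data_length(1);
      reused_payload.set_streaming_width(1);
      reused_payload.set_byte_enable_ptr(nullptr);
      reused_payload.set_dmi_allowed(false); // Clear any hint left from a previous use
      reused_payload.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);

      sc_time delay = SC_ZERO_TIME;
      socket->b_transport(reused_payload, delay);
      wait(delay); // Consume any latency the target annotated

      if (reused_payload.is_response_error()) {
        SC_REPORT_ERROR(name(), "Write transaction failed");
        continue;
      }

      // Check if target hinted at DMI
      if (reused_payload.is_dmi_allowed()) {
        tlm::tlm_dmi dmi_data;
        if (socket->get_direct_mem_ptr(reused_payload, dmi_data) &&
            dmi_data.is_write_allowed()) {
          dmi_valid = true;
          dmi_ptr = dmi_data.get_dmi_ptr();
          dmi_start = dmi_data.get_start_address();
          dmi_end = dmi_data.get_end_address();
          SC_REPORT_INFO(name(), "DMI successfully established!");
        }
      }
    }
  }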
};
 
int sc_main(int argc, char* argv[]) {
  OptimizedInitiator init("init");
  FastMemory mem("mem");
  init.socket.bind(mem.socket);
  sc_start();
  return 0;
}

DMI Safety

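Targets should grant DMI only when direct access is safe (e.g., the memory is contiguous, accesses have no side effects, and invalidation is implemented). Routers must translate DMI ranges. If a target grants a local address window, the router returns the corresponding system address window, and it applies the same translation to invalidation ranges travelling back toward the initiator.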
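The range translation a router performs can be sketched in plain C++. This is only an illustration under stated assumptions: `DmiRange`, `MapEntry`, and `to_system_range` are hypothetical names, `DmiRange` stands in for the start and end address fields of `tlm::tlm_dmi`, and the target's local addresses are assumed to start at 0.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the address fields of tlm::tlm_dmi (hypothetical type).
struct DmiRange {
  std::uint64_t start;
  std::uint64_t end;  // Inclusive, like the end address in a DMI grant
};

// One router map entry (hypothetical): system addresses
// [base, base + size - 1] forward to a target whose local addresses start at 0.
struct MapEntry {
  std::uint64_t base;
  std::uint64_t size;
};

// Convert a target-local DMI grant into system addresses, clamping it to the
// window this router actually maps so the initiator never sees more than that.
DmiRange to_system_range(const MapEntry& entry, DmiRange local) {
  assert(entry.size > 0);
  assert(local.start <= local.end && local.start < entry.size);
  const std::uint64_t last_local = entry.size - 1;
  if (local.end > last_local) local.end = last_local;
  return DmiRange{entry.base + local.start, entry.base + local.end};
}
```

In a real router the same arithmetic would presumably be applied to the granted `tlm::tlm_dmi` via `set_start_address` and `set_end_address` after forwarding `get_direct_mem_ptr`, and again to the range passed through `invalidate_direct_mem_ptr`.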

Extensions and Report Cost

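Project policy should define who owns extensions and when they are cleared; with reused payloads, a stale extension survives into the next transaction unless someone removes it. Avoid format strings in hot paths unless tracing is enabled, so that the cost of building a complex message is never paid for output that will not be displayed.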
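One way to keep report cost out of the hot path is to test a cheap flag before doing any formatting. The sketch below is plain C++ and makes several assumptions: `g_trace_enabled`, `g_format_calls`, `describe_access`, and `trace_access` are hypothetical names, and a real platform might derive the flag from its report verbosity setting instead of a global.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical project-level switch for transaction tracing.
static bool g_trace_enabled = false;
// Counts how often the expensive formatting path actually ran.
static unsigned g_format_calls = 0;

// Expensive part: builds the message with stream formatting.
std::string describe_access(std::uint64_t addr, unsigned len) {
  assert(len > 0);  // Zero-length accesses are not expected in this sketch
  ++g_format_calls;
  std::ostringstream os;
  os << "write addr=0x" << std::hex << addr << " len=" << std::dec << len;
  return os.str();
}

// Hot-path helper: formatting is skipped entirely when tracing is off.
// Returns true if a message was written to `out`.
bool trace_access(std::uint64_t addr, unsigned len, std::string& out) {
  if (!g_trace_enabled) return false;
  out = describe_access(addr, len);
  return true;
}
```

The design point is that the check happens before the string is built. Passing an already formatted string to a reporting call defeats the purpose even if the report itself is later suppressed.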

Expert Checklist

A performant TLM VP should:

  • use blocking transport for simple memory-mapped software access
  • use DMI for RAM/ROM
  • use temporal decoupling only with clear synchronization policy
  • reuse payloads in hot initiators
  • avoid dynamic allocation in per-transaction paths
  • set response status every time
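The temporal decoupling item in the checklist above can be made concrete with a simplified model of a synchronization policy. This is a plain C++ sketch, not the SystemC API: `LocalClock` and `run_accesses` are hypothetical names, times are plain nanosecond counts rather than `sc_time`, and the policy shown (synchronize once the local offset reaches the quantum) is one assumed choice among several.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a quantum keeper (hypothetical class).
class LocalClock {
 public:
  explicit LocalClock(std::uint64_t quantum_ns) : quantum_ns_(quantum_ns) {
    assert(quantum_ns > 0);
  }

  // Account for time consumed locally without yielding to the scheduler
  void advance(std::uint64_t ns) { offset_ns_ += ns; }

  // Policy: synchronize once the local offset reaches the quantum
  bool need_sync() const { return offset_ns_ >= quantum_ns_; }

  // Fold the local offset into global time; a real model would wait here
  void sync() {
    global_ns_ += offset_ns_;
    offset_ns_ = 0;
    ++sync_count_;
  }

  std::uint64_t global_ns() const { return global_ns_; }
  std::uint64_t offset_ns() const { return offset_ns_; }
  unsigned sync_count() const { return sync_count_; }

 private:
  std::uint64_t quantum_ns_;
  std::uint64_t offset_ns_ = 0;
  std::uint64_t global_ns_ = 0;
  unsigned sync_count_ = 0;
};

// Issue `count` accesses costing `cost_ns` each, synchronizing per the policy
void run_accesses(LocalClock& clock, unsigned count, std::uint64_t cost_ns) {
  for (unsigned i = 0; i < count; ++i) {
    clock.advance(cost_ns);
    if (clock.need_sync()) clock.sync();
  }
}
```

With a 100 ns quantum, 25 accesses of 10 ns each synchronize twice; with a 1000 ns quantum they never synchronize. That is the trade-off being tuned: a larger quantum means fewer context switches but a larger window in which other processes see stale time. In SystemC the analogous calls are `inc()`, `need_sync()`, and `sync()` on `tlm_utils::tlm_quantumkeeper`.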
