TLM Performance: DMI, Quantum Tuning, and Payload Discipline
A senior-level guide to making TLM virtual platforms fast without breaking timing, ordering, or debug behavior.
TLM performance is not one trick. It is a set of disciplined choices: transport style, payload reuse, direct memory interface, temporal decoupling, report cost, and how much timing fidelity the use case really needs.
DMI and Payload Reuse Example
DMI (Direct Memory Interface) lets an initiator bypass repeated socket calls for memory-like regions. It is most valuable for RAM and ROM. Generic payloads should also be reused to avoid constant heap allocation.
Here is a full compilable example demonstrating DMI and payload reuse:
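```cpp
#include <cstring>
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/simple_target_socket.h>

using namespace sc_core;

SC_MODULE(FastMemory) {
    tlm_utils::simple_target_socket<FastMemory> socket{"socket"};
    unsigned char memory[1024] = {};

    SC_CTOR(FastMemory) {
        socket.register_b_transport(this, &FastMemory::b_transport);
        socket.register_get_direct_mem_ptr(this, &FastMemory::get_direct_mem_ptr);
    }

    void b_transport(tlm::tlm_generic_payload& trans, sc_time& delay) {
        // Normal slow-path transport: bounds-check, then copy the data
        const sc_dt::uint64 addr = trans.get_address();
        const unsigned int len = trans.get_data_length();
        if (addr >= sizeof(memory) || len > sizeof(memory) - addr) {
            trans.set_response_status(tlm::TLM_ADDRESS_ERROR_RESPONSE);
            return;
        }
        if (trans.is_write()) {
            std::memcpy(&memory[addr], trans.get_data_ptr(), len);
        } else if (trans.is_read()) {
            std::memcpy(trans.get_data_ptr(), &memory[addr], len);
        }
        trans.set_response_status(tlm::TLM_OK_RESPONSE);
        trans.set_dmi_allowed(true); // Hint to initiator that DMI is available
    }

    bool get_direct_mem_ptr(tlm::tlm_generic_payload& trans, tlm::tlm_dmi& dmi_data) {
        // Grant DMI access to the entire 1KB memory
        dmi_data.allow_read_write();
        dmi_data.set_dmi_ptr(memory);
        dmi_data.set_start_address(0);
        dmi_data.set_end_address(sizeof(memory) - 1);
        dmi_data.set_read_latency(SC_ZERO_TIME);
        dmi_data.set_write_latency(SC_ZERO_TIME);
        return true;
    }
};

SC_MODULE(OptimizedInitiator) {
    tlm_utils::simple_initiator_socket<OptimizedInitiator> socket{"socket"};
    tlm::tlm_generic_payload reused_payload; // Payload reuse
    unsigned char* dmi_ptr = nullptr;
    sc_dt::uint64 dmi_start = 0, dmi_end = 0;
    bool dmi_valid = false;

    SC_CTOR(OptimizedInitiator) {
        socket.register_invalidate_direct_mem_ptr(
            this, &OptimizedInitiator::invalidate_direct_mem_ptr);
        SC_THREAD(run);
    }

    // Drop the cached pointer when the target revokes an overlapping range
    void invalidate_direct_mem_ptr(sc_dt::uint64 start, sc_dt::uint64 end) {
        if (start <= dmi_end && end >= dmi_start) dmi_valid = false;
    }

    void run() {
        write_byte(0x10, 0xAA); // First access: slow path, acquires DMI
        write_byte(0x11, 0xBB); // Second access: served through the DMI pointer
    }

    void write_byte(sc_dt::uint64 addr, unsigned char data) {
        // Fast path: use DMI when the address falls inside the granted window
        if (dmi_valid && addr >= dmi_start && addr <= dmi_end) {
            dmi_ptr[addr - dmi_start] = data;
            return;
        }
        // Slow path: configure reused payload, resetting every attribute we rely on
        reused_payload.set_command(tlm::TLM_WRITE_COMMAND);
        reused_payload.set_address(addr);
        reused_payload.set_data_ptr(&data);
        reused_payload.set_data_length(1);
        reused_payload.set_streaming_width(1);
        reused_payload.set_byte_enable_ptr(nullptr);
        reused_payload.set_dmi_allowed(false);
        reused_payload.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);
        sc_time delay = SC_ZERO_TIME;
        socket->b_transport(reused_payload, delay);
        wait(delay); // Consume any latency annotated by the target
        if (reused_payload.is_response_error()) {
            SC_REPORT_ERROR(name(), reused_payload.get_response_string().c_str());
            return;
        }
        // Check if target hinted at DMI
        if (reused_payload.is_dmi_allowed()) {
            tlm::tlm_dmi dmi_data;
            if (socket->get_direct_mem_ptr(reused_payload, dmi_data) &&
                dmi_data.is_write_allowed()) {
                dmi_valid = true;
                dmi_ptr = dmi_data.get_dmi_ptr();
                dmi_start = dmi_data.get_start_address();
                dmi_end = dmi_data.get_end_address();
                SC_REPORT_INFO(name(), "DMI successfully established!");
            }
        }
    }
};

int sc_main(int argc, char* argv[]) {
    OptimizedInitiator init("init");
    FastMemory mem("mem");
    init.socket.bind(mem.socket);
    sc_start();
    return 0;
}
```
DMI Safety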
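Targets should grant DMI only when direct access is safe: the region is backed by contiguous host memory, accesses have no side effects, and the target invalidates the grant whenever the mapping changes. Routers must translate DMI ranges. If a target grants a local address window, the router returns the corresponding system address window to the initiator, and it applies the same translation to invalidation ranges travelling back toward the initiator.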
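The range translation a router performs can be sketched without any SystemC dependency. This is a minimal standalone model, not TLM library code: DmiWindow mirrors only the start and end address fields of tlm::tlm_dmi, and RouteEntry and to_system_window are hypothetical names for one address-map entry and the translation step. It assumes the router forwards system range [base, base + size - 1] to the target with the base subtracted.

```cpp
#include <cassert>
#include <cstdint>

// Standalone stand-in for the address fields carried by tlm::tlm_dmi.
struct DmiWindow {
    std::uint64_t start;  // first granted address (inclusive)
    std::uint64_t end;    // last granted address (inclusive)
};

// Hypothetical router map entry: system addresses [base, base + size - 1]
// are forwarded to one target with the base subtracted.
struct RouteEntry {
    std::uint64_t base;
    std::uint64_t size;
};

// Convert a target-local DMI window into system addresses.
// The window is clamped to the mapped range so the initiator never
// receives addresses that this route does not actually cover.
DmiWindow to_system_window(const RouteEntry& route, DmiWindow local) {
    assert(route.size > 0 && local.start <= local.end);
    const std::uint64_t last_local = route.size - 1;
    assert(local.start <= last_local);  // grant must begin inside the mapping
    if (local.end > last_local) {
        local.end = last_local;
    }
    return DmiWindow{route.base + local.start, route.base + local.end};
}
```

With a route at base 0x80000000 and size 0x1000, a local grant of 0 to 1023 becomes 0x80000000 to 0x800003FF. Only the end address is ever clamped here, so the DMI pointer, which corresponds to the start address, needs no adjustment.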
Extensions and Report Cost
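Project policy should define who owns each payload extension and when it is cleared, which matters most when payloads are reused across transactions. Do not build formatted strings in hot paths unless tracing is enabled: check the enable flag or verbosity level first, so that messages that will never be displayed cost nothing to construct.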
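One way to keep report cost out of the hot path is to pass the message as a callable that runs only when tracing is on. The sketch below is a standalone illustration: TraceSink and describe_write are hypothetical names, and the sink stands in for whatever report mechanism the platform uses. In a SystemC model the same guard could plausibly be driven by the report handler's verbosity level instead of a plain flag.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical trace sink standing in for the platform's report handler.
struct TraceSink {
    bool enabled = false;
    std::vector<std::string> lines;

    // The callable is invoked only when tracing is enabled, so a disabled
    // trace point costs one branch and no string formatting.
    template <typename MakeMessage>
    void trace(MakeMessage&& make_message) {
        if (!enabled) return;
        lines.push_back(make_message());
    }
};

// Deliberately "expensive" formatter; format_calls counts how often it runs.
std::string describe_write(std::uint64_t addr, unsigned value, int& format_calls) {
    ++format_calls;
    std::ostringstream os;
    os << "write addr=0x" << std::hex << addr << " data=0x" << value;
    return os.str();
}
```

A call site then looks like sink.trace([&] { return describe_write(addr, value, calls); });. With the sink disabled, the lambda is never called and the counter stays at zero, however many transactions pass through.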
Expert Checklist
A performant TLM VP should:
- use blocking transport for simple memory-mapped software access
- use DMI for RAM/ROM
- use temporal decoupling only with clear synchronization policy
- reuse payloads in hot initiators
- avoid dynamic allocation in per-transaction paths
- set response status every time
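The temporal decoupling item above is where quantum tuning comes in: a larger quantum means fewer synchronizations with the kernel, at the price of coarser interleaving between initiators. The following is a minimal standalone sketch of that trade-off, loosely modeled on the inc / need_sync / sync pattern of tlm_utils::tlm_quantumkeeper. LocalTimeKeeper and run_transactions are illustrative names, time is a plain picosecond counter, and the sync point is simplified to "local time has reached the quantum".

```cpp
#include <cstdint>

// Standalone model of a temporally decoupled initiator's local clock.
// Not the TLM utility class; time is counted in picoseconds.
class LocalTimeKeeper {
public:
    explicit LocalTimeKeeper(std::uint64_t quantum_ps) : quantum_ps_(quantum_ps) {}

    // Accumulate annotated delay locally instead of yielding to the kernel.
    void inc(std::uint64_t delay_ps) { local_ps_ += delay_ps; }

    // True once the initiator has run ahead by a full quantum.
    bool need_sync() const { return local_ps_ >= quantum_ps_; }

    // In a real platform this is where the thread would wait for its
    // local time offset so the rest of the system can catch up.
    void sync() {
        global_ps_ += local_ps_;
        local_ps_ = 0;
        ++sync_count_;
    }

    std::uint64_t local_ps() const { return local_ps_; }
    std::uint64_t global_ps() const { return global_ps_; }
    unsigned sync_count() const { return sync_count_; }

private:
    std::uint64_t quantum_ps_;
    std::uint64_t local_ps_ = 0;
    std::uint64_t global_ps_ = 0;
    unsigned sync_count_ = 0;
};

// Issue `count` transactions of fixed latency, synchronizing only on demand.
void run_transactions(LocalTimeKeeper& keeper, unsigned count, std::uint64_t latency_ps) {
    for (unsigned i = 0; i < count; ++i) {
        keeper.inc(latency_ps);
        if (keeper.need_sync()) keeper.sync();
    }
}
```

For 1000 transactions of 10 ps each, a 1000 ps quantum synchronizes 10 times and a 100 ps quantum synchronizes 100 times, while total simulated time is identical. The synchronization policy the checklist asks for is the set of rules for when an initiator must sync regardless of the quantum, for example before accessing a shared peripheral.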