Datatype Performance and Correctness
Choosing between C++ types, SystemC integer types, bit vectors, logic vectors, fixed-point types, and TLM byte arrays.
SystemC provides an extensive library of custom datatypes because hardware modeling requires precise bit widths, four-state logic, and fixed-point arithmetic. However, a common beginner trap is using the most "hardware-looking" type everywhere. This drastically reduces simulation performance and makes the C++ code cumbersome to read.
The IEEE 1666 LRM strictly defines these datatypes. Knowing when to use native C++ types versus SystemC types is a hallmark of an expert SystemC architect.
The LRM Datatype Categories
The datatypes available to a SystemC model fall into several groups. The SystemC-specific ones live in the sc_dt namespace:
- Native C++ Types (int, uint32_t, bool): Performance: Maximum. Use Case: Virtual Platform (TLM) internal state, counters, flags, memory arrays.
- Limited-Precision Fixed-Width Integers (sc_dt::sc_int<W>, sc_dt::sc_uint<W>): Performance: High (implemented using 64-bit native integers under the hood). Valid for $W \le 64$. Use Case: Register fields, exact small hardware width arithmetic.
- Arbitrary-Precision Integers (sc_dt::sc_bigint<W>, sc_dt::sc_biguint<W>): Performance: Slow (operates on arrays of words rather than a single machine integer). Intended for $W > 64$; smaller widths work but gain nothing over sc_int/sc_uint. Use Case: Cryptographic keys, very wide buses, wide memory payloads.
- Bit and Logic Vectors (sc_dt::sc_bv<W>, sc_dt::sc_lv<W>): Performance: Very Slow (uses proxy objects, stores bit arrays, resolves 4-state logic for sc_lv). Use Case: Pin-level RTL interfaces, unknown ('X') or high impedance ('Z') states.
- Fixed-Point Types (sc_dt::sc_fixed, sc_dt::sc_ufixed; available only when SC_INCLUDE_FX is defined before the SystemC header is included): Performance: Moderate to Slow (handles quantization and overflow). Use Case: DSP algorithms, AMS (Analog Mixed Signal) boundaries.
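To make the cost of four-state logic concrete, here is a minimal sketch in plain C++ with no SystemC dependency. The names Logic4, and4, and2, and check_logic_examples are hypothetical and exist only for this illustration. The sketch models one bit at a time and does not reflect how the library stores or processes sc_lv internally. It only shows that every four-state operation must account for Z and X, whereas a two-state AND on a native word is a single machine operation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical four-state bit with the same value set as sc_logic: 0, 1, Z, X.
enum class Logic4 : std::uint8_t { L0, L1, Z, X };

// Four-state AND: a 0 on either side forces 0, 1 & 1 is 1,
// and every other combination involving Z or X is unknown (X).
inline Logic4 and4(Logic4 a, Logic4 b) {
    if (a == Logic4::L0 || b == Logic4::L0) return Logic4::L0;
    if (a == Logic4::L1 && b == Logic4::L1) return Logic4::L1;
    return Logic4::X;
}

// Two-state AND on a whole 32-bit word.
inline std::uint32_t and2(std::uint32_t a, std::uint32_t b) { return a & b; }

// Usage: a few expected results, written as assertions.
inline void check_logic_examples() {
    assert(and4(Logic4::L1, Logic4::L1) == Logic4::L1);
    assert(and4(Logic4::L0, Logic4::X) == Logic4::L0);  // 0 dominates unknown
    assert(and4(Logic4::L1, Logic4::Z) == Logic4::X);   // floating input is unknown
    assert(and2(0xF0F0u, 0xFF00u) == 0xF000u);
}
```

If a model never needs to represent X or Z, none of this bookkeeping is required, which is the argument for keeping logic vectors at pin-level interfaces only.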
The Proxy Object Problem
A major performance pitfall in SystemC datatypes is the use of proxy classes for bit-selection ([]) and part-selection (range()).
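When you write reg.range(15, 8), SystemC does not return an integer. It returns a temporary proxy object that refers back to reg (for example sc_dt::sc_uint_subref for sc_uint, or sc_dt::sc_subref for sc_bv and sc_lv). Every nested or chained select creates another temporary and another conversion step that the compiler cannot always optimize away, so select-heavy expressions in hot code paths noticeably degrade simulation speed.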
Best Practice: Convert to native C++ types for complex arithmetic, then assign back to SystemC types only at the module boundaries.
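One way to follow this practice is to replace part-selects on native values with plain shifts and masks. The sketch below is plain C++ with no SystemC dependency. The helpers low_mask, extract_bits, and insert_bits are hypothetical names introduced for illustration and are not part of SystemC. They assume 63 >= hi >= lo.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `width` bits set; width must be in [1, 64].
inline std::uint64_t low_mask(unsigned width) {
    return width >= 64 ? ~std::uint64_t{0}
                       : (std::uint64_t{1} << width) - 1;
}

// Native counterpart of reading value.range(hi, lo): bits [hi:lo], right-aligned.
inline std::uint64_t extract_bits(std::uint64_t value, unsigned hi, unsigned lo) {
    return (value >> lo) & low_mask(hi - lo + 1);
}

// Native counterpart of value.range(hi, lo) = field: returns the updated word.
inline std::uint64_t insert_bits(std::uint64_t value, unsigned hi, unsigned lo,
                                 std::uint64_t field) {
    const std::uint64_t mask = low_mask(hi - lo + 1) << lo;
    return (value & ~mask) | ((field << lo) & mask);
}

// Usage: the same bit positions the example in this section works with.
inline void check_bitfield_examples() {
    assert(extract_bits(0xF00, 11, 8) == 0xF);
    assert(insert_bits(0, 31, 28, 0xF) == 0xF0000000u);
}
```

These helpers work on ordinary integers, so no proxy objects are created. The result can be assigned to an sc_uint or written to a port once, at the module boundary.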
Complete Example: Datatype Trade-offs
This complete sc_main example demonstrates how to correctly mix native C++ types with SystemC limited-precision integers, and how to use part-select proxies safely.
```cpp
#include <systemc>
#include <cstdint>
#include <iostream>

SC_MODULE(DatatypeDemo) {
    // Port using exact-width hardware type
    sc_core::sc_in<sc_dt::sc_uint<12>> address_in{"address_in"};

    // Internal state using fast native C++ type (best practice for VPs)
    uint32_t internal_memory[4096];

    // Hardware-accurate 32-bit control register
    sc_dt::sc_uint<32> control_reg;

    SC_CTOR(DatatypeDemo) {
        SC_METHOD(process_transaction);
        sensitive << address_in;
        dont_initialize();

        // Initialize memory
        for (int i = 0; i < 4096; i++) internal_memory[i] = 0;
        control_reg = 0;
    }

    void process_transaction() {
        // 1. Read from SystemC type to native C++ type (fast)
        uint32_t addr = address_in.read();

        // 2. Perform operations using native C++ (fast)
        if (addr < 4096) {
            internal_memory[addr] = 0xDEADBEEF;
        }

        // 3. Use SystemC proxy objects (range) sparingly.
        // Extract bits [11:8] as a 4-bit unsigned integer.
        sc_dt::sc_uint<4> page = address_in.read().range(11, 8);

        // Pack bits into the control register.
        // Keep selects shallow: one range() per statement, no chained selects.
        control_reg.range(3, 0) = page;
        control_reg.range(31, 28) = 0xF;

        // Convert to native integers before printing so that std::hex
        // formats them like any other unsigned value; restore std::dec after.
        std::cout << "@ " << sc_core::sc_time_stamp()
                  << " Addr: 0x" << std::hex << addr
                  << " Page: 0x" << page.to_uint()
                  << " Control Reg: 0x" << control_reg.to_uint()
                  << std::dec << "\n";
    }
};

// Testbench to drive the module
SC_MODULE(Testbench) {
    sc_core::sc_signal<sc_dt::sc_uint<12>> addr_sig{"addr_sig"};
    DatatypeDemo* demo;

    SC_CTOR(Testbench) {
        demo = new DatatypeDemo("demo_inst");
        demo->address_in(addr_sig);
        SC_THREAD(drive);
    }

    void drive() {
        wait(10, sc_core::SC_NS);
        addr_sig.write(0x0A4); // Write 12-bit value
        wait(10, sc_core::SC_NS);
        addr_sig.write(0xF00);
    }

    ~Testbench() {
        delete demo;
    }
};

int sc_main(int argc, char* argv[]) {
    Testbench tb("tb");
    std::cout << "Starting simulation...\n";
    sc_core::sc_start(50, sc_core::SC_NS);
    return 0;
}
```
Explanation of the Execution
When run, the output shows:
Starting simulation...
@ 10 ns Addr: 0xa4 Page: 0x0 Control Reg: 0xf0000000
@ 20 ns Addr: 0xf00 Page: 0xf Control Reg: 0xf000000f
Notice how address_in.read().range(11, 8) correctly extracts the top 4 bits of the 12-bit address. When driving 0xF00, the top nibble is F, which is packed into the lowest 4 bits of the 32-bit control_reg.
Using uint32_t for the internal_memory ensures that the simulation runs at native C++ speeds for the bulk of the data storage, while sc_dt::sc_uint is reserved for explicit hardware boundaries.
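At such a boundary, a native value that is wider than the hardware type loses its upper bits when it is assigned, for example when a uint32_t is written to an sc_uint<12>. The sketch below models that wrap-around in plain C++ so a model can check or apply it explicitly before crossing the boundary. The helper truncate_to_width is a hypothetical name used for illustration and is not a SystemC function.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low `width` bits (width in [1, 64]): the same wrap-around
// a W-bit unsigned register exhibits on assignment.
inline std::uint64_t truncate_to_width(std::uint64_t value, unsigned width) {
    return width >= 64 ? value
                       : value & ((std::uint64_t{1} << width) - 1);
}

// Usage: expected results for a 12-bit and a 32-bit destination.
inline void check_truncation_examples() {
    assert(truncate_to_width(0x1F00, 12) == 0xF00);      // bit 12 is dropped
    assert(truncate_to_width(4095 + 1, 12) == 0);        // 12-bit counter wraps
    assert(truncate_to_width(0xDEADBEEF, 32) == 0xDEADBEEF);
}
```

Comparing a value against its truncated form is a cheap way to detect, in native code, that a computation has outgrown the width of the port or register it is about to be written to.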