About DRAMsim3
DRAMsim3 models the timing parameters and memory controller behavior for several DRAM protocols such as DDR3, DDR4, LPDDR3, LPDDR4, GDDR5, GDDR6, HBM, HMC, and STT-MRAM. It is implemented in C++ as an object-oriented model that includes a parameterized DRAM bank model, DRAM controllers, command queues, and system-level interfaces to interact with a CPU simulator (GEM5, ZSim) or trace workloads. It is designed to be accurate, portable, and parallel.
If you use this simulator in your work, please consider citing:
[1] S. Li, Z. Yang, D. Reddy, A. Srivastava and B. Jacob, "DRAMsim3: a Cycle-accurate, Thermal-Capable DRAM Simulator," in IEEE Computer Architecture Letters.
See Related Work for more work done with this simulator.
Building and running the simulator
This simulator uses a CMake-based build system by default.
The advantages of a CMake-based build system are portability and dependency management.
CMake 3.0+ is required to build this simulator.
If CMake 3.0 is not available, we also supply a Makefile to build the most basic version of the simulator.
Building
Out-of-source builds with CMake are recommended, so that build files do not clutter the main directory.
# make directory for build
mkdir build
cd build
# cmake out of source build
# if co-simulation
cmake -D COSIM=1 ..
# else
cmake ..
# Build dramsim3 library and executables
make -j4
# Alternatively, build with thermal module enabled
cmake .. -DTHERMAL=1
The build process creates dramsim3main and other executables in the build directory.
By default, it also creates the libdramsim3.so shared library in the project root directory.
Running
# help
./build/dramsim3main -h
# Running random stream with a config file
./build/dramsim3main configs/DDR4_8Gb_x8_3200.ini --stream random -c 100000
# Running a trace file
./build/dramsim3main configs/DDR4_8Gb_x8_3200.ini -c 100000 -t sample_trace.txt
# Running with gem5
--mem-type=dramsim3 --dramsim3-ini=configs/DDR4_4Gb_x4_2133.ini
The output can be directed to another directory with the -o option, or the output directory can be set in the config file.
You can control the verbosity in the config file as well.
Output Visualization
scripts/plot_stats.py can visualize some of the output (requires matplotlib):
# generate histograms from overall output
python3 scripts/plot_stats.py dramsim3.json
# or
# generate time series for a variety of stats from epoch outputs
python3 scripts/plot_stats.py dramsim3epoch.json
Currently stats from all channels are squashed together for cleaner plotting.
Integration with other simulators
Gem5 integration: works with a forked Gem5 version; see the dramsim3 branch of https://github.com/umd-memsys/gem5 for reference.
SST integration: see http://git.ece.umd.edu/shangli/sst-elements/tree/dramsim3 for reference. We will try to merge it into the official SST repo.
ZSim integration: see http://git.ece.umd.edu/shangli/zsim/tree/master for reference.
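Integrations like these follow a tick-and-callback pattern: the front end submits requests, advances the memory model one cycle at a time, and learns that a request has finished through a callback it supplied. The sketch below illustrates only that pattern. ToyMemory, its fixed latency, and all of its method names are hypothetical stand-ins for illustration and are not the DRAMsim3 interface.

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory model; NOT the DRAMsim3 interface.
class ToyMemory {
public:
    using Callback = std::function<void(uint64_t addr)>;

    ToyMemory(uint64_t latency, Callback on_done)
        : latency_(latency), on_done_(std::move(on_done)) {}

    // Accept a request; it completes latency_ cycles after the current cycle.
    void AddTransaction(uint64_t addr) {
        pending_.push({clk_ + latency_, addr});
    }

    // Advance one cycle and fire the callback for every request that is due.
    void ClockTick() {
        ++clk_;
        while (!pending_.empty() && pending_.front().first <= clk_) {
            on_done_(pending_.front().second);
            pending_.pop();
        }
    }

private:
    uint64_t latency_;
    uint64_t clk_ = 0;
    Callback on_done_;
    // (completion cycle, address); FIFO order is valid because latency is fixed.
    std::queue<std::pair<uint64_t, uint64_t>> pending_;
};

// Front-end driver: issue two requests, then tick until both callbacks arrive.
std::vector<uint64_t> DriveExample() {
    std::vector<uint64_t> finished;
    ToyMemory mem(3, [&finished](uint64_t addr) { finished.push_back(addr); });
    mem.AddTransaction(0x100);
    mem.ClockTick();
    mem.AddTransaction(0x200);
    while (finished.size() < 2) mem.ClockTick();
    return finished;
}
```

In this toy model, DriveExample() returns the two addresses in the order they were issued, because every request has the same latency. A real memory model may complete requests out of order, so a front end should not rely on callback order.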
Simulator Design
Code Structure
├── configs # Configs of various protocols that describe timing constraints and power consumption.
├── ext #
├── scripts # Tools and utilities
├── src # DRAMsim3 source files
├── tests # Tests of each model, includes a short example trace
├── CMakeLists.txt
├── Makefile
├── LICENSE
└── README.md
├── src
bankstate.cc: Records and manages DRAM bank timings and states, modeled as a state machine.
channelstate.cc: Records and manages channel timings and states.
command_queue.cc: Maintains per-bank or per-rank FIFO queueing structures and determines which commands in the queues can be issued in the current cycle.
configuration.cc: Initializes and manages system and DRAM parameters, including protocol, DRAM timings, address mapping policy, and power parameters.
controller.cc: Maintains the per-channel controller, which manages a queue of pending memory transactions and issues the corresponding DRAM commands,
following the FR-FCFS policy.
cpu.cc: Implements 3 types of simple CPU:
1. Random: issues random CPU requests at full speed, so the entire parallelism of the DRAM protocol can be exploited without limits from address mapping and scheduling policies.
2. Stream: provides a streaming prototype that is able to provide enough buffer hits.
3. Trace-based: consumes workload traces and feeds the fetched transactions into the memory system.
dram_system.cc: Initializes a JEDEC or ideal DRAM system and registers the supplied callback function to let the front end driver know that a request is finished.
hmc.cc: Implements the HMC system and interface. HMC requests are translated to DRAM requests here, and a crossbar interconnect between the high-speed links and the memory controllers is modeled.
main.cc: Handles the main program loop, which reads in simulation arguments and DRAM configurations and ticks cycles forward.
memory_system.cc: A wrapper of dram_system and hmc.
refresh.cc: Raises refresh requests based on per-rank or per-bank refresh.
timing.cc: Initializes timing constraints.
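The FR-FCFS (first-ready, first-come-first-served) policy named in the controller.cc entry can be illustrated with a minimal sketch. It is a textbook simplification, not the code in controller.cc: the Transaction and Bank types, the PickNext function, and the open-page assumption are all hypothetical, and timing constraints are ignored entirely.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical, simplified transaction: only the fields the policy needs.
struct Transaction {
    uint64_t id;  // arrival order; smaller means older
    int bank;
    int row;
};

// Hypothetical per-bank state: the row held in the row buffer, or -1 if none.
struct Bank {
    int open_row = -1;
};

// FR-FCFS: prefer the oldest transaction that hits an open row ("first ready");
// if nothing hits, fall back to the oldest transaction overall (FCFS).
// The queue is kept in arrival order, so the first match is the oldest match.
bool PickNext(std::deque<Transaction>& queue, std::vector<Bank>& banks,
              Transaction* out) {
    if (queue.empty()) return false;
    std::size_t chosen = 0;  // FCFS fallback: the oldest entry
    for (std::size_t i = 0; i < queue.size(); ++i) {
        if (banks[queue[i].bank].open_row == queue[i].row) {
            chosen = i;  // oldest row hit
            break;
        }
    }
    *out = queue[chosen];
    queue.erase(queue.begin() + static_cast<std::ptrdiff_t>(chosen));
    banks[out->bank].open_row = out->row;  // assumed open-page: row stays open
    return true;
}

// Returns transaction ids in service order for a small fixed workload.
std::vector<uint64_t> RunExample() {
    std::vector<Bank> banks(2);
    banks[0].open_row = 5;
    std::deque<Transaction> queue = {
        {0, 0, 7},  // bank 0 has row 5 open: row miss
        {1, 1, 3},  // bank 1 has no open row: not a hit
        {2, 0, 5},  // bank 0 has row 5 open: row hit
    };
    std::vector<uint64_t> order;
    Transaction t;
    while (PickNext(queue, banks, &t)) order.push_back(t.id);
    return order;
}
```

With bank 0 holding row 5 open, RunExample() services the row-hit request (id 2) ahead of the two older requests, then falls back to arrival order for the remaining ones. A real controller would also check per-bank timing constraints before treating a command as ready.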
Experiments
Verilog Validation
First we generate a DRAM command trace.
There is a CMD_TRACE macro, which is disabled by default.
Use cmake .. -DCMD_TRACE=1 to enable the command trace output build;
a command trace file will then be generated whenever a simulation is performed.
Next, scripts/validation.py
helps generate a Verilog workbench for Micron's Verilog model
from the command trace file.
Currently DDR3, DDR4, and LPDDR configs are supported by this script.
Run
./scripts/validation.py DDR4.ini cmd.trace
to generate the Verilog workbench. Our workbench format is compatible with the ModelSim Verilog simulator; other Verilog simulators may require a slightly different format.
Related Work
[1] Li, S., Yang, Z., Reddy D., Srivastava, A. and Jacob, B., (2020) DRAMsim3: a Cycle-accurate, Thermal-Capable DRAM Simulator, IEEE Computer Architecture Letters.
[2] Jagasivamani, M., Walden, C., Singh, D., Kang, L., Li, S., Asnaashari, M., ... & Yeung, D. (2019). Analyzing the Monolithic Integration of a ReRAM-Based Main Memory Into a CPU's Die. IEEE Micro, 39(6), 64-72.
[3] Li, S., Reddy, D., & Jacob, B. (2018, October). A performance & power comparison of modern high-speed DRAM architectures. In Proceedings of the International Symposium on Memory Systems (pp. 341-353).
[4] Li, S., Verdejo, R. S., Radojković, P., & Jacob, B. (2019, September). Rethinking cycle accurate DRAM simulation. In Proceedings of the International Symposium on Memory Systems (pp. 184-191).
[5] Li, S., & Jacob, B. (2019, September). Statistical DRAM modeling. In Proceedings of the International Symposium on Memory Systems (pp. 521-530).
[6] Li, S. (2019). Scalable and Accurate Memory System Simulation (Doctoral dissertation).