A Buffer on Board memory system simulator
dramninjasUMD/BOBSim
BOBSim: A cycle accurate Buffer-On-Board Memory System Simulator
================================================================================
Elliott Cooper-Balis
Paul Rosenfeld
Bruce Jacob
University of Maryland
dramninjas [at] gmail [dot] com

1 About BOBSim
------------------------------------------------------------------
The design and implementation of the commodity memory architecture have
resulted in significant performance and capacity limitations. To circumvent
these limitations, designers and vendors have begun to place intermediate
logic between the CPU and DRAM. This additional logic has two functions: to
control the commodity DRAM and to communicate with the CPU over a fast and
narrow bus. The benefit provided by this logic is a reduction in pin-out to
the memory system and increased signal integrity to the DRAM, allowing faster
clock rates while maintaining capacity.

BOBSim is a cycle-based simulator written in C++ that encapsulates all aspects
of this "buffer-on-board" (BOB) memory system. Each of the major logical
portions of the design has a corresponding software object and associated
parameters that give total control over all aspects of the system's
configuration and behavior.

For more details about this architecture please see :
https://wiki.umd.edu/BOBSim/

2 Getting BOBSim
------------------------------------------------------------------
BOBSim is available on GitHub. If you have git installed you can clone our
repository by typing:

   $ git clone https://github.com/dramninjasUMD/BOBSim.git

3 Building BOBSim
------------------------------------------------------------------
To build an optimized standalone version of the simulator simply type:

   $ make

To build the BOBSim library, type:

   $ make libbobsim.so

4 Running BOBSim
------------------------------------------------------------------
BOBSim is run in two separate modes: stand-alone mode and full-system mode.
In stand-alone mode, a parameterizable random address stream is issued directly
to the memory system. In full-system mode, BOBSim is attached to a CPU
simulator and requests are generated from actual program execution.

Regardless of the mode being used, all parameters and configurations are set
within the Globals.h file (yes, regrettably, you must recompile when changing
parameters). Field names for parameters should correspond to portions of the
architecture described on the wiki (url).

To save some time, the makefile has a directive for selecting a particular
DRAM device; the devices are defined in Globals.h. The available devices are
DDR3-1066, DDR3-1333, and DDR3-1600.

STAND-ALONE MODE :
In this mode, an address stream generated in RandomStreamSim.c is issued
directly to the memory system. This address stream can be modified with the
parameters READ_WRITE_RATIO and PORT_UTILIZATION. The former sets the request
mix as the percentage of all requests that are reads; the latter sets the
request frequency, from 0 (no requests) to 1.0 (as fast as possible).

To run it :

   $ ./BOBSim -c X -n Y -q

command line arguments

   -c X : Dictates the number of CPU cycles to execute
   -n Y : Dictates the number of ports on the main BOB controller
   -q   : Quiet mode, turns off all output (except for epoch output shown below)

FULL-SYSTEM MODE :
As with DRAMSim2, BOBSim strives to make integration with other full-system
simulators as easy as possible. Several public functions are available to
enable this and are described on the BOBSim wiki :
https://wiki.umd.edu/BOBSim/index.php?title=Running_BOBsim

5 BOBSim Epoch Output
------------------------------------------------------------------
Example output from a one million cycle epoch execution.

The first portion shows general stats about the request mix, average system
bandwidth for that epoch, and latency statistics.

 ==================Epoch [1]=======================
 per epoch] reads : 149984 writes : 74451 (66.8274% Reads) : 224435
 lifetime ] reads : 149984 writes : 74451 total : 224435
 issued logic ops : 0
 returned logic responses : 0
 bandwidth : 42.8076 GB/sec
 current cycle : 1000000
 --------------------------
 full time mean : 248.938 ns
            std : 150.934 ns
            min : 43.4375 ns
            max : 1136.25 ns
 chan time mean : 202.682 ns
            std : 125.106
            min : 41.5625 ns
            max : 868.75 ns
 dram time mean : 56.9559 ns
            std : 40.3156 ns
            min : 32.5 ns
            max : 277.5 ns

The next portion shows the latency components (in nanoseconds) for read
requests sent to each channel in the system. This example shows 8 DRAM
channels.

 -- Per Channel Latency Components in nanoseconds (All from READs) :
       reqPort   reqLink   workQ     access    rrq        rspLink   rspPort   total
  0]   49.8418   1.5625    20.3763   57.0755   114.0523   8.0797    1.2500    252.2381
  1]   44.2388   1.5625    20.1603   57.8072   122.3065   8.0807    1.2500    255.4060
  2]   41.5492   1.5625    20.5893   58.0432   123.1006   8.0785    1.2500    254.1732
  3]   50.1635   1.5625    22.4312   61.3874   129.6183   8.0801    1.2500    274.4930
  4]   42.6759   1.5625    19.3439   54.6489   112.3736   8.0815    1.2500    239.9363
  5]   49.4876   1.5625    21.6093   58.9417   119.4585   8.0809    1.2500    260.3906
  6]   39.3225   1.5625    18.7533   55.4073   106.8524   8.0807    1.2500    231.2286
  7]   35.3873   1.5625    17.6407   52.1759   106.6302   8.0809    1.2500    222.7276

This portion displays the utilization and statistics of the ports on the main
BOB controller (which resides on the CPU).

 --- Port stats (per epoch) :
  0] request: 88.8854% idle  response: 85.0348% idle  rds:37462  wrts:18421  rtn:37413  tot:55883
  1] request: 88.766% idle   response: 84.9828% idle  rds:37576  wrts:18691  rtn:37543  tot:56267
  2] request: 88.7678% idle  response: 85.0892% idle  rds:37319  wrts:18751  rtn:37277  tot:56070
  3] request: 88.7866% idle  response: 84.8996% idle  rds:37782  wrts:18588  rtn:37751  tot:56370

 === BOB Print ===
 == Ports
  -- Port 0 - inputBufferAvg : 7.93731 (8)   outputBufferAvg : 0.149652 (0)
  -- Port 1 - inputBufferAvg : 7.93691 (8)   outputBufferAvg : 0.150172 (0)
  -- Port 2 - inputBufferAvg : 7.9372 (8)    outputBufferAvg : 0.149108 (0)
  -- Port 3 - inputBufferAvg : 7.93662 (8)   outputBufferAvg : 0.151004 (0)

Below is the total bandwidth generated on each request and response link bus,
including both packet overhead and data. The last line is the average over
all the links in the system.

 == Link Bandwidth (!! Includes packet overhead and request packets !!)
    Req Link(6.4 GB/s peak)   Rsp Link(9.6 GB/s peak)
    5.26287 GB/s              8.68574 GB/s
    5.22612 GB/s              8.69114 GB/s
    5.3185 GB/s               8.64727 GB/s
    5.18673 GB/s              8.53269 GB/s
    -----------
    5.24856                   8.63921 (avgs)

The section below consists of various statistics about each channel such as
total requests, queue depths, generated bandwidth, and bank stats.

 == Channel Usage (32GB/Chan == 256 GB total)
     reqs   workQAvg  workQMax  idleBanks  actBanks  preBanks  refBanks  (totalBanks)  BusIdle  BW(10.6667)  RRQMax(16)  RRQFull  lifetimeRequests
0]   28141  1.8382    16        24.9417    5.1901    1.2149    0.6533    32.0000       46.00    5.364        15(15)      418853   28141
1]   28240  1.8082    14        24.9090    5.2221    1.2197    0.6492    32.0000       45.79    5.385        15(2)       437252   28240
2]   28093  1.8314    16        24.8532    5.2843    1.2133    0.6492    32.0000       46.07    5.357        15(13)      451932   28093
3]   28157  1.9971    15        24.5384    5.5927    1.2156    0.6533    32.0000       45.97    5.367        15(15)      536573   28157
4]   28270  1.7377    15        25.1464    4.9791    1.2212    0.6533    32.0000       45.73    5.392        15(1)       354155   28270
5]   28186  1.9130    15        24.7201    5.4096    1.2172    0.6531    32.0000       45.90    5.374        15(5)       467118   28186
6]   27995  1.6948    15        25.1484    4.9896    1.2087    0.6533    32.0000       46.28    5.337        15(15)      374438   27995
7]   27476  1.5340    16        25.5402    4.6199    1.1866    0.6533    32.0000       47.26    5.239        15(14)      300199   27476
                                                                                          AVG : 5.35199

 == Requests seen at Channels
  -- Reads  : 150116
  -- Writes : 74442
            = 224558

This section shows power consumption of each DRAM channel and the power
necessary to operate the simple controllers in that particular system.

 == Channel Power
  -- Channel 0 -- DRAM Power : 9.17485 w
  -- Channel 1 -- DRAM Power : 9.19294 w
  -- Channel 2 -- DRAM Power : 9.16597 w
  -- Channel 3 -- DRAM Power : 9.18945 w
  -- Channel 4 -- DRAM Power : 9.19224 w
  -- Channel 5 -- DRAM Power : 9.1892 w
  -- Channel 6 -- DRAM Power : 9.14323 w
  -- Channel 7 -- DRAM Power : 9.0448 w
   Average Power       : 9.16159 w
   Total Power         : 73.2927 w
   SimpCont BG Power   : 28 w
   SimpCont Core Power : 28 w
   System Power        : 129.293 w

The last portion is a time check, used to ensure that the CPU and DRAM are at
the same point in time.

 == Time Check
    CPU Time : 312500ns
   DRAM Time : 312500ns

6 BOBSim Output Files
------------------------------------------------------------------
BOBSim generates 3 output files :

   BOBStats.txt - contains parameters used for simulation and CSV output of various data
   BOBPower.txt - contains CSV data of all power consumption (and components) within the system
   BOBSim.txt   - when compiling with pre-processor directive LOG_OUTPUT, all epoch output is printed in this file