EPL 370: Computer Architecture
Tutorial No. 7
Πέτρος Παναγή
Page 1
MARSSx86: Micro-ARchitectural and System Simulator for x86-based Systems
http://marss86.org/~marss86/index.php/Home
Page 2
Full System Simulator
MARSSx86 is a tool for cycle-accurate, full-system simulation of the x86-64 architecture, specifically multicore implementations.
Page 3
Emulator vs Simulator
Emulation refers to the ability of a computer program in an electronic device to emulate (imitate) another program or device.
A computer simulation (or "sim") is an attempt to model a real-life or hypothetical situation on a computer so that it can be studied to see how the system works. By changing variables in the simulation, predictions may be made about the behaviour of the system.
Example:
QEMU (emulation) does not maintain a 'clock'.
The MARSSx86 simulation engine keeps track of the number of executed cycles.
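The difference can be illustrated with a toy sketch. The one-register machine, its opcodes and the per-instruction latencies below are hypothetical and are not taken from QEMU or MARSSx86. The emulator only computes the result; the simulator computes the same result and also advances a cycle counter.

```python
# Toy program for a one-register machine: (opcode, operand) pairs.
PROGRAM = [("load", 5), ("add", 3), ("mul", 4)]

# Hypothetical per-instruction latencies in cycles (illustrative only).
LATENCY = {"load": 3, "add": 1, "mul": 4}


def step(acc, op, arg):
    """Apply one instruction to the accumulator and return the new value."""
    if op == "load":
        return arg
    if op == "add":
        return acc + arg
    if op == "mul":
        return acc * arg
    raise ValueError(f"unknown opcode: {op}")


def emulate(program):
    """Functional execution only: no notion of time is kept."""
    acc = 0
    for op, arg in program:
        acc = step(acc, op, arg)
    return acc


def simulate(program):
    """Same functional result, plus a running count of elapsed cycles."""
    acc, cycles = 0, 0
    for op, arg in program:
        acc = step(acc, op, arg)
        cycles += LATENCY[op]
    return acc, cycles


if __name__ == "__main__":
    print("emulated result:", emulate(PROGRAM))
    print("simulated (result, cycles):", simulate(PROGRAM))
```

Both functions return the same architectural result; only the simulator can answer "how long did it take?".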
Page 4
QEMU
QEMU is a generic and open source machine emulator and virtualizer that allows you to run a complete operating system as just another task on your desktop. It can be very useful for trying out different operating systems, testing software, and running applications that won't run on your desktop's native platform.
http://wiki.qemu.org/Main_Page
Page 5
Install of Marss_x86 on CentOS
Downloading, compiling, running and simulating SPEC CPU 2006 on CentOS release 6.5 without being root. (Part 1)
All the files are installed in the following folder:
cd /home/students/cs/SPEC2006/simulators/
Download SCons and untar it in a clean folder:
cd /home/students/cs/SPEC2006/simulators/images
wget http://prdownloads.sourceforge.net/scons/scons-2.3.4.tar.gz
tar -xvf scons-2.3.4.tar.gz
Download Marss x86 using git clone:
cd /home/students/cs/SPEC2006/simulators/marss_x86
git clone git://github.com/avadhpatel/marss.git
Page 6
Install of Marss_x86 on CentOS (Part 2)
Compile using (use c=n, where n is the number of cores):
cd /home/students/cs/SPEC2006/simulators/marss_x86
../scons/bin/scons -Q config=config/default.conf
To run QEMU with simulation, telnet and network capabilities:
qemu/qemu-system-x86_64 -curses -monitor telnet:127.0.0.1:1234,server,nowait -m 2048 -hda ../images/ubuntu-natty-SPEC2006-STD.qcow2 -net nic,model=ne2k_pci -net user -simconfig simconfig
In a new console, connect to QEMU's monitor console with:
telnet 127.0.0.1 1234
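The long QEMU command line above is easier to read and modify when assembled from its parts. A minimal sketch: the function name and its parameters are hypothetical, and only the options that appear on this slide are used.

```python
import shlex


def qemu_command(image, simconfig, memory_mb=2048, monitor_port=1234):
    """Return the argv list for the QEMU invocation shown on the slide."""
    return [
        "qemu/qemu-system-x86_64",
        "-curses",                                   # text-mode display
        "-monitor", f"telnet:127.0.0.1:{monitor_port},server,nowait",
        "-m", str(memory_mb),                        # guest RAM in MB
        "-hda", image,                               # guest disk image
        "-net", "nic,model=ne2k_pci",
        "-net", "user",
        "-simconfig", simconfig,                     # MARSSx86 run-time options
    ]


if __name__ == "__main__":
    argv = qemu_command("../images/ubuntu-natty-SPEC2006-STD.qcow2", "simconfig")
    # Print a copy-pastable shell command.
    print(shlex.join(argv))
```

Changing `monitor_port` here also changes the port to use in the telnet command.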
Page 7
Install of Marss_x86 on CentOS (Part 3)
Once in the guest (emulated) OS, you will have to fix apt-get in order to install new software such as gcc and wget.
Username: root
Password: root
Edit the package sources and replace natty with trusty:
vi /etc/apt/sources.list
Then install the gcc compiler, which is needed to compile the SPEC benchmarks:
apt-get install gcc
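The natty-to-trusty edit can also be expressed as a text substitution instead of a manual vi session. A sketch only: the function name is hypothetical and the two sample lines are illustrative, not copied from the actual image.

```python
import re


def retarget_sources(text, old="natty", new="trusty"):
    """Replace the release codename wherever it appears as a whole word,
    including as the prefix of a pocket name such as 'natty-updates'."""
    return re.sub(rf"\b{re.escape(old)}\b", new, text)


# Illustrative sources.list content (an assumption, not the real file).
SAMPLE = (
    "deb http://archive.ubuntu.com/ubuntu natty main\n"
    "deb http://archive.ubuntu.com/ubuntu natty-updates main\n"
)

if __name__ == "__main__":
    print(retarget_sources(SAMPLE), end="")
```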
Page 8
Install of Marss_x86 on CentOS (Part 4)
To copy SPEC2006 into the image:
mkdir SPEC2006
cd SPEC2006
scp -r [your username][email protected]:/home/students/cs/SPEC2006/SPEC2006DVD/ .
You can then follow the standard SPEC installation procedure:
./install -d ~/SPEC2006_MARSSx86
set SPEC = ~/SPEC2006_MARSSx86
. ./shrc
cd config
cp linux64-amd64-gcc42.cfg EPL370-configuration.cfg
vi EPL370-configuration.cfg (edit the compilers)
CC = /usr/bin/gcc
CXX = /usr/bin/g++
FC = /usr/bin/gfortran
Page 9
Install of Marss_x86 on CentOS (Part 5)
To compile 401.bzip2:
runspec --config=EPL370-configuration.cfg --action=build --tune=base 401
To run 401.bzip2:
./benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn ./benchspec/CPU2006/401.bzip2/data/ref/input/input.source
To simulate 401.bzip2:
~/start_sim; ./benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn ./benchspec/CPU2006/401.bzip2/data/ref/input/input.source; ~/stop_sim; ~/kill_sim;
Page 10
Create Checkpoints
To create checkpoints, see:
https://github.com/avadhpatel/marss/blob/master/util/create_checkpoints.py
To run benchmarks, try:
https://github.com/avadhpatel/marss/blob/master/util/run_bench.py
Page 11
PART B
Page 20
Machine Description File
http://marss86.org/~marss86/index.php/Machine_Configuration
(ISCA-2012 Tutorial-6 page: 94)
cat config/default.conf
machine:
  # Use run-time option '-machine [MACHINE_NAME]' to select
  single_core:
    description: Single Core configuration
    min_contexts: 1
    max_contexts: 1
    cores: # The order in which core is defined is used to assign
           # the cores in a machine
      - type: ooo
        name_prefix: ooo_
        option:
          threads: 1
Page 21
Marss simconfig file
# Sample Marss simconfig file
-machine single_core
# Logging options
-logfile results/test.log
-loglevel 4
# Start logging after 10 million cycles
# -startlog 10m
# Stats file
-stats results/test.stats
http://marss86.org/~marss86/index.php/Run-time_Configuration
Use like:
$ qemu/qemu-system-x86_64 -simconfig config_file <OTHER_QEMU_OPTIONS>
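When several experiments are run, each one needs its own log and stats file names. The sketch below renders a simconfig file from parameters; the function name and argument names are hypothetical, and only the options listed in the sample above are emitted.

```python
def make_simconfig(machine, log_path, stats_path, loglevel=4, startlog=None):
    """Render simconfig text using only the options shown in the sample."""
    lines = [
        "# Generated Marss simconfig file",
        f"-machine {machine}",
        f"-logfile {log_path}",
        f"-loglevel {loglevel}",
    ]
    if startlog is not None:
        # Optional: delay logging, e.g. "10m" for 10 million cycles.
        lines.append(f"-startlog {startlog}")
    lines.append(f"-stats {stats_path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_simconfig("single_core", "results/test.log", "results/test.stats"),
          end="")
```

The returned text can be written to a file and passed to QEMU with -simconfig.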
Page 22
Machine Description File
    caches:
      - type: l1_128K
        name_prefix: L1_I_
        insts: $NUMCORES # Per core L1-I cache
      - type: l1_128K
        name_prefix: L1_D_
        insts: $NUMCORES # Per core L1-D cache
      - type: l2_2M
        name_prefix: L2_
        insts: 1 # Shared L2 config
    memory:
      - type: dram_cont
        name_prefix: MEM_
        insts: 1 # Single DRAM controller
        option:
          latency: 50 # In nano seconds
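The `insts` field controls how many copies of each component are created. The sketch below expands the entries above for a given core count. It assumes that each instance is named by appending an index to `name_prefix` (for example `L1_I_0`); that naming rule is an assumption, not something stated on the slide.

```python
# Entries transcribed from the machine description above.
CACHES = [
    {"type": "l1_128K", "name_prefix": "L1_I_", "insts": "$NUMCORES"},
    {"type": "l1_128K", "name_prefix": "L1_D_", "insts": "$NUMCORES"},
    {"type": "l2_2M", "name_prefix": "L2_", "insts": 1},
]
MEMORY = [{"type": "dram_cont", "name_prefix": "MEM_", "insts": 1}]


def expand(entries, num_cores):
    """Return (instance_name, type) pairs, one per created instance.

    Assumption: instances are named name_prefix + index.
    """
    result = []
    for entry in entries:
        count = num_cores if entry["insts"] == "$NUMCORES" else int(entry["insts"])
        for index in range(count):
            result.append((f"{entry['name_prefix']}{index}", entry["type"]))
    return result


if __name__ == "__main__":
    for name, kind in expand(CACHES, 2) + expand(MEMORY, 2):
        print(name, kind)
```

With two cores this yields two private L1-I caches, two private L1-D caches, one shared L2 and one memory controller.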
Page 23
./matrix_serial_std.c.out 256 4 4
MATRIX A is ready.
MATRIX B is ready.
MATRIX C is ready.
Running Block MatMul Algorithm
Start Timer.
Stop Timer.
Elapsed Time: 0.17 Sec.
Performance counter stats for './matrix_serial_std.c.out 256 4 4':
     173.151622 task-clock                #    0.997 CPUs utilized
             19 context-switches          #    0.110 K/sec
              1 cpu-migrations            #    0.006 K/sec
            533 page-faults               #    0.003 M/sec
    496,692,252 cycles                    #    2.869 GHz                [50.15%]
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
    852,429,517 instructions              #    1.72  insns per cycle    [75.18%]
     19,995,826 branches                  #  115.482 M/sec              [75.19%]
        276,388 branch-misses             #    1.38% of all branches    [74.91%]
    0.173662186 seconds time elapsed
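The derived values in the right-hand column follow directly from the raw counters. A short sketch of the arithmetic, using the numbers from the output above (the variable names are chosen here for illustration):

```python
# Raw counters copied from the perf stat output above.
TASK_CLOCK_MS = 173.151622
CYCLES = 496_692_252
INSTRUCTIONS = 852_429_517
BRANCHES = 19_995_826
BRANCH_MISSES = 276_388

# Instructions retired per clock cycle.
ipc = INSTRUCTIONS / CYCLES
# Cycles per second of CPU time, expressed in GHz.
freq_ghz = CYCLES / (TASK_CLOCK_MS * 1e-3) / 1e9
# Percentage of branches that were mispredicted.
miss_rate = 100.0 * BRANCH_MISSES / BRANCHES

print(f"IPC: {ipc:.2f}")
print(f"Clock: {freq_ghz:.3f} GHz")
print(f"Branch miss rate: {miss_rate:.2f}%")
```

The same three ratios can be computed from the simulator's statistics to compare the simulated machine against the native run.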
Page 24
./matrix_serial_std.c.out 256 4 4
petrosp@103ws31:~/EPL370/FALL2014/Homeworks/HW3>perf stat -e instructions:u ./matrix_serial_std.c.out 256 4 4
Performance counter stats for './matrix_serial_std.c.out 256 4 4':
    852,154,159 instructions:u            #    0.00  insns per cycle
    0.172383581 seconds time elapsed
petrosp@103ws31:~/EPL370/FALL2014/Homeworks/HW3>perf stat -e instructions:k ./matrix_serial_std.c.out 256 4 4
Performance counter stats for './matrix_serial_std.c.out 256 4 4':
      2,203,700 instructions:k            #    0.00  insns per cycle
    0.168992136 seconds time elapsed
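The two runs above show how the instruction count splits between user and kernel mode. A small sketch of that split; note that the two counts come from separate runs, so their sum is only approximately comparable to the combined count of a single run.

```python
# Counts copied from the two perf stat runs above.
USER_INSTRUCTIONS = 852_154_159    # instructions:u
KERNEL_INSTRUCTIONS = 2_203_700    # instructions:k

total = USER_INSTRUCTIONS + KERNEL_INSTRUCTIONS
kernel_share = 100.0 * KERNEL_INSTRUCTIONS / total

print(f"Total instructions: {total:,}")
print(f"Kernel share: {kernel_share:.2f}%")
```

For this benchmark the kernel accounts for well under one percent of the executed instructions.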
Page 26
./start_sim; ./matrix_serial_std.c.out 256 4 4; ./stop_sim; ./kill_sim
Using Notepad++
http://notepad-plus-plus.org/
l1_128K
Page 27
./start_sim; ./matrix_serial_std.c.out 256 4 4; ./stop_sim; ./kill_sim
Page 28
Double the L1 Cache Size
../marss_x86/scons/bin/scons -Q config=config/default.conf
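If the line size and associativity stay fixed, doubling the capacity doubles the number of sets. A sketch of that arithmetic: the 64-byte line and 8-way associativity below are assumptions for illustration, so check the cache type definitions in the configuration file for the actual values.

```python
def num_sets(size_bytes, line_bytes, ways):
    """Number of sets in a set-associative cache: size / (line * ways)."""
    if size_bytes % (line_bytes * ways) != 0:
        raise ValueError("size must be a multiple of line_bytes * ways")
    return size_bytes // (line_bytes * ways)


LINE_BYTES = 64   # assumed line size
WAYS = 8          # assumed associativity

if __name__ == "__main__":
    for label, size in (("l1_128K", 128 * 1024), ("l1_256K", 256 * 1024)):
        print(label, num_sets(size, LINE_BYTES, WAYS), "sets")
```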
Page 29
./start_sim; ./matrix_serial_std.c.out 256 4 4; ./stop_sim; ./kill_sim
l1_256K
Page 30
./start_sim; ./matrix_serial_std.c.out 256 4 4; ./stop_sim; ./kill_sim