UC Berkeley: A Disk and Thermal Emulation Model for RAMP, Zhangxi Tan and David Patterson

Post on 20-Dec-2015


UC Berkeley

A Disk and Thermal Emulation Model for RAMP

Zhangxi Tan and David Patterson

Outline

• Introduction and retrospective overview

• Improvement since June 06

• Disk and temperature emulation

• Future work

June 06 status

• Internet in a box, Version 0
  – 3 Xilinx XUP boards ($299 × 3) with 12 processors
  – uClinux and research application (i3)

• Limitations
  – Software base is poor
    • No MMU, no fork, no full version of Linux
    • Every piece of software needs porting
  – Processor is too slow (100 MHz vs. 3 GHz)
  – No local storage per node

[Figure: network topology. A router links three boards (Board #1, Board #2, Board #3) over 100 Mbps Ethernet, with connections to other devices and to the external network; the boards carry 12 MB nodes in total. Addresses shown: 192.168.1.35, 192.168.1.36, 192.168.1.37; 192.168.2.2, 192.168.3.2, 192.168.4.2; 192.168.12.2, 192.168.13.2, 192.168.14.2; 192.168.22.2, 192.168.23.2, 192.168.24.2.]

Improvement

Jun 06 | Jan 07

Processor
MicroBlaze | LEON 3
32-bit RISC/microcontroller | 32-bit SPARC V8
No MMU | MMU/configurable TLB
Single-precision floating point | IEEE 754 floating point
Direct-mapped cache | Direct-mapped/set-associative cache

OS and Software
uClinux 2.4 (no protection, no fork) | Full Linux 2.6.18.1
Every piece of software needs porting | Runs the latest Debian GNU/Linux binaries directly (supports apt-get)

Others
No disk emulation | Emulates local disk with Ethernet-attached storage
Slow processor only | Emulates fast systems with “Time Dilation”
- | Emulates system temperature

Agenda

• Introduction and retrospective overview

• Improvement since June 06

• Disk and temperature emulation

• Future work

Disk and Thermal Emulation

• Local disk is an essential part of a datacenter
  – Local physical storage
  – Variable disk specifications (VMs only have a functional module)
  – In the context of real workloads

• Temperature is a critical issue in datacenters (DC)
  – Cooling, reliability
  – How the workload affects datacenter temperature is an interesting topic

Methodology

• HW emulator (FPGA): 32-bit Leon3 at 50 MHz, 90 MHz DDR memory, 8K L1 cache (4K instruction and 4K data)
  – Target system: Linux 2.6 kernel, 50 MHz to 2 GHz

• PC: storage, trace logger and model solver (offline or online)

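[Figure: block diagram. The FPGA side shows the target platform (32-bit Leon), the AoE kernel driver and the activity monitors; the PC side shows the AoE parser, storage, DiskSim (timing info) and the Mercury thermal emulator; the two sides are connected by Ethernet.]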

• Emulating an IDE disk with Ethernet-based network storage (ATA over Ethernet) + DiskSim
  – AoE: encapsulates IDE commands in Ethernet packets
  – DiskSim: widely used disk simulator (provides access timing based on the disk specification)
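To make the encapsulation idea concrete, here is a minimal sketch of how an ATA read request might be packed into an AoE Ethernet frame. The field layout follows the public AoE specification as best recalled (10-byte common header, 12-byte ATA section) and should be treated as illustrative; the function name, MAC addresses, tag and LBA are hypothetical and do not come from the slides.

```python
import struct

AOE_ETHERTYPE = 0x88A2   # EtherType registered for ATA over Ethernet
ATA_READ_SECTORS = 0x20  # ATA READ SECTORS command (28-bit LBA)


def build_aoe_read(dst_mac: bytes, src_mac: bytes, major: int, minor: int,
                   tag: int, lba: int, sector_count: int) -> bytes:
    """Build an AoE request frame asking a storage target to read sectors."""
    # Ethernet header: destination MAC, source MAC, EtherType.
    eth = dst_mac + src_mac + struct.pack("!H", AOE_ETHERTYPE)
    # Common AoE header: version 1 in the high nibble, no flags, no error,
    # shelf (major) and slot (minor) address, command 0 (issue ATA command),
    # and a tag the initiator uses to match the response.
    aoe = struct.pack("!BBHBBI", 1 << 4, 0, major, minor, 0, tag)
    # ATA section: AFlags, feature, sector count, ATA command,
    # six LBA bytes (low byte first), two reserved bytes.
    ata = struct.pack("!BBBB", 0, 0, sector_count, ATA_READ_SECTORS)
    ata += lba.to_bytes(6, "little") + b"\x00\x00"
    return eth + aoe + ata


# Hypothetical addresses and request values, for illustration only.
frame = build_aoe_read(b"\xff" * 6, b"\x02\x00\x00\x00\x00\x01",
                       major=0, minor=0, tag=1, lba=2048, sector_count=2)
print(len(frame), frame[12:14].hex())
```

A read request carries no data, so the frame is only headers; the response from the storage side would carry the sector payload after the same headers.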

• Thermal emulation is done by the Mercury suite (ASPLOS ’06)
  – Sample CPU/disk activity periodically and send it to a central emulator
  – The emulator takes the system configuration and predicts temperature based on Newton’s law of cooling
  – Disk state helps power estimation
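The sampling-plus-cooling-law loop can be sketched for a single component as follows. This is not Mercury's actual model, which tracks heat flow among several components and air regions; it is a one-node sketch, and the linear power model and every constant (idle and peak power, heat-transfer coefficient, heat capacity, time step) are assumed values chosen only for illustration.

```python
def utilization_to_power(util, idle_w, max_w):
    """Linear power model: idle power plus a utilization-scaled dynamic part."""
    return idle_w + util * (max_w - idle_w)


def step_temperature(temp_c, ambient_c, power_w, k_w_per_c,
                     capacity_j_per_c, dt_s):
    """Advance one time step with explicit Euler integration.

    Heat lost to the surroundings is proportional to the temperature
    difference (Newton's law of cooling); heat gained is the power drawn.
    """
    heat_in_j = power_w * dt_s
    heat_out_j = k_w_per_c * (temp_c - ambient_c) * dt_s
    return temp_c + (heat_in_j - heat_out_j) / capacity_j_per_c


def emulate(utilization_samples, ambient_c=25.0, idle_w=10.0, max_w=60.0,
            k_w_per_c=2.0, capacity_j_per_c=100.0, dt_s=1.0):
    """Turn a series of activity samples into a temperature trace."""
    temp_c = ambient_c
    trace = []
    for util in utilization_samples:
        power_w = utilization_to_power(util, idle_w, max_w)
        temp_c = step_temperature(temp_c, ambient_c, power_w,
                                  k_w_per_c, capacity_j_per_c, dt_s)
        trace.append(temp_c)
    return trace


busy = emulate([1.0] * 3000)  # fully loaded component
idle = emulate([0.0] * 3000)  # idle component
print(round(busy[-1], 3), round(idle[-1], 3))
```

With these assumed constants the trace settles where heat in equals heat out, that is, at ambient plus power divided by the heat-transfer coefficient.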

• Time dilation makes the “target” look faster
  – Reprogram the HW timer to make ‘jiffies’ longer in terms of wall-clock time
  – Slow down memory accordingly when speeding up the processor
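The timer arithmetic behind time dilation can be worked through in a few lines. The sketch assumes that one host cycle emulates one target cycle and that the kernel tick rate is 100 Hz; the tick rate and the function names are assumptions, while the 50 MHz host and 2 GHz target come from the methodology above.

```python
def time_dilation_factor(target_hz, host_hz):
    """How many times faster the target appears than the host really is."""
    return target_hz / host_hz


def host_cycles_per_tick(target_hz, tick_hz):
    """Timer period, in cycles, so that one tick spans 1/tick_hz of target time.

    Assumes one host cycle emulates one target cycle.
    """
    return target_hz // tick_hz


def wall_clock_tick_s(target_hz, host_hz, tick_hz):
    """Wall-clock duration of one kernel tick on the host."""
    return host_cycles_per_tick(target_hz, tick_hz) / host_hz


HOST_HZ = 50_000_000       # FPGA clock from the slides
TARGET_HZ = 2_000_000_000  # fastest emulated target from the slides
TICK_HZ = 100              # assumed kernel tick rate

print(time_dilation_factor(TARGET_HZ, HOST_HZ))
print(host_cycles_per_tick(TARGET_HZ, TICK_HZ))
print(wall_clock_tick_s(TARGET_HZ, HOST_HZ, TICK_HZ))
```

Under these assumptions each tick that the kernel counts as 10 ms of target time takes 0.4 s of wall-clock time, so the target sees itself as 40 times faster than the underlying hardware.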

Experiments

• Thermal emulation model (validated in Mercury)
  – Physical layout from Dell PowerEdge 2850
    • 3 GHz Xeon, 10K RPM SCSI

• Emulated disk model (validated disk model in DiskSim)
  – Seagate Cheetah 9LP
    • 10K RPM, 5 ms avg seek time

• Several programs run in the target system with various time dilation factors
  – Dhrystone: CPU-intensive benchmark
  – Postmark: a file system benchmark (disk intensive)
  – Unix command with pipe (both disk and CPU intensive)
    • cat alargefile | grep ‘a search pattern’ > searchresultfile
    • 100 MB file size

• Emulation output
  – Performance statistics
  – System temperature
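As a rough sanity check on the emulated disk's parameters, the average access time can be estimated from the spindle speed and seek time given above. This is a first-order textbook estimate (average seek plus half a rotation, ignoring transfer and controller time), not DiskSim's model; the function names are hypothetical.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one full revolution."""
    return 0.5 * 60_000.0 / rpm


def avg_access_ms(rpm, avg_seek_ms):
    """First-order access time: average seek plus average rotational latency."""
    return avg_seek_ms + avg_rotational_latency_ms(rpm)


# Parameters quoted on the slide for the emulated disk.
print(avg_rotational_latency_ms(10_000))
print(avg_access_ms(10_000, 5.0))
```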

Dhrystone result (w/o memory TD)

How close is this to a 3 GHz x86 (~8000 Dhrystone MIPS)? It depends on memory, cache, and CPI.
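For readers unfamiliar with the unit, Dhrystone MIPS is conventionally the Dhrystone loop rate divided by 1757, the score of the VAX 11/780 reference machine. The sketch below applies that convention to the ~8000 figure quoted above; the helper names are hypothetical.

```python
VAX_11_780_DHRYSTONES_PER_S = 1757  # reference score that defines 1 DMIPS


def dmips(dhrystones_per_s):
    """Convert a raw Dhrystone loop rate into Dhrystone MIPS."""
    return dhrystones_per_s / VAX_11_780_DHRYSTONES_PER_S


def dmips_per_mhz(dmips_value, clock_hz):
    """Normalize a DMIPS score by clock frequency."""
    return dmips_value / (clock_hz / 1_000_000)


# The slide's reference point: about 8000 DMIPS at 3 GHz.
print(round(dmips_per_mhz(8000, 3_000_000_000), 2))
```

Normalizing by clock frequency separates the effect of clock speed from the effect of memory, cache, and CPI when comparing a dilated target against a real machine.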

Dhrystone w. Memory TD

Keep the memory access latency constant
  – 90 MHz DDR DRAM with 200 ns latency in all targets (50 MHz to 2 GHz)
  – Latency is pessimistic, but reflects the trend
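Holding the latency at 200 ns of target time means the number of processor cycles a memory access costs grows with the target clock. A small sketch of that arithmetic, using the target frequencies that appear elsewhere in the deck; the function name is hypothetical.

```python
def memory_latency_cycles(latency_ns, target_hz):
    """Target processor cycles spent waiting on one memory access."""
    return latency_ns * target_hz // 1_000_000_000


for mhz in (50, 250, 500, 1000, 2000):
    cycles = memory_latency_cycles(200, mhz * 1_000_000)
    print(f"{mhz} MHz target: {cycles} cycles per access")
```

The stall grows from 10 cycles at 50 MHz to 400 cycles at 2 GHz, which is why Dhrystone stops scaling linearly once memory dilation is applied.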

Postmark file system benchmark

• Speed-up factor is larger than TDF (overhead)
• How close to a modern SATA disk? Twice the throughput when running the same benchmark.

Disk emulation performance

• Overhead analysis
  – <1.4 ms sending a packet (no zero-copy, VM)
  – Burst of requests (service time < 10 ms, including DiskSim), AoE protocol segmentation
  – A larger TDF offsets the overhead

• Overall emulated disk time is still a little longer than the simulated timing in DiskSim (~2.8 ms)
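One way to read "a larger TDF offsets the overhead" is that a fixed wall-clock cost shrinks by the dilation factor when seen from the target. The sketch below works that out under this reading; the helper names are hypothetical, and the 1.4 ms and 2.8 ms inputs are simply the figures quoted above used as example values.

```python
def target_time_ms(wall_ms, tdf):
    """Wall-clock time as perceived by a target dilated by factor tdf."""
    return wall_ms / tdf


def can_hide_overhead(simulated_ms, tdf, overhead_wall_ms):
    """True if the emulator has enough wall-clock budget to absorb its overhead.

    A request that should take simulated_ms of target time may use
    simulated_ms * tdf of wall-clock time before the target notices.
    """
    return simulated_ms * tdf >= overhead_wall_ms


for tdf in (1, 5, 10, 20, 40):
    print(tdf, round(target_time_ms(1.4, tdf), 4),
          can_hide_overhead(2.8, tdf, 1.4))
```

Under this reading, a 1.4 ms packet cost looks like 0.035 ms to a target running at a TDF of 40, which is small next to a mechanical disk's service time.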

Emulated disk R/W time in target

• Results are fairly deterministic across different TDFs

CPU Temperature Emulation

[Figure: CPU temperature plots, one panel per target clock: 50 MHz, 250 MHz, 500 MHz, 1 GHz, 2 GHz.]

• Calibration is needed to get correct absolute values
• Trend is accurate

Disk Temperature Emulation

[Figure: disk temperature plots, one panel per target clock: 50 MHz, 250 MHz, 500 MHz, 1 GHz, 2 GHz.]

Limitations and Conclusion

• Limitations
  – AoE limits the maximum number of R/W sectors to 2! (Ethernet packet limitation)
  – Naïve memory dilation (constant delay)

• Conclusion
  – Doing disk emulation in SW is pretty “lightweight”, if
    • Time dilation makes the SW disk fast enough
    • There is a separate network channel for disk emulation

• Future work
  – Better statistical time dilation model (CPI, distribution), still simple HW
  – Emulate a real-life disk controller (e.g. Intel ICH) with less overhead
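The two-sector limit noted under Limitations follows from frame-size arithmetic. The sketch below assumes a standard 1500-byte Ethernet payload and the AoE header sizes from the public specification (10-byte common header plus 12-byte ATA section); these sizes and the function names are assumptions, not values taken from the slides.

```python
AOE_HEADER_BYTES = 10  # common AoE header (assumed from the AoE spec)
ATA_HEADER_BYTES = 12  # ATA command section (assumed from the AoE spec)
SECTOR_BYTES = 512


def max_sectors_per_frame(mtu=1500):
    """Whole sectors that fit in one frame after the AoE headers."""
    return (mtu - AOE_HEADER_BYTES - ATA_HEADER_BYTES) // SECTOR_BYTES


def frames_for_request(sectors, mtu=1500):
    """Frames needed to move a request of the given size (ceiling division)."""
    per_frame = max_sectors_per_frame(mtu)
    return -(-sectors // per_frame)


print(max_sectors_per_frame())       # standard Ethernet payload
print(frames_for_request(8))         # a 4 KB request
print(max_sectors_per_frame(9000))   # jumbo frames, for comparison
```

Under these assumptions a 4 KB request is split into four frames, which is the protocol segmentation cost mentioned in the overhead analysis.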