using simulated hardware – virtualized software...
Post on 24-Feb-2018
Traditional Development Process
• Use hardware to test and debug software
  – Expensive
  – Unwieldy environment for debugging
  – Not available early enough
  – Often, not every developer can have a test bed

Virtutech Simics
• Virtualized Software Development
  – Simulates the system under development
  – Simulates the (immediate) environment
  – Runs entire software image unchanged
  – Same build chain as physical system
• Benefits over physical hardware
  – Customizable
  – Cheaper
  – Programmer-friendly
  – Scriptable & controllable
  – Available earlier
  – More flexible

Virtualizing the Hardware
[Figure: the complete production software (user program, middleware, DB, server, operating system, drivers, firmware) runs on virtual hardware. The example target is a telecom network with application, service, control, and connectivity layers; labeled nodes include MSC, SGSN, GGSN, HSS, GMSC/Transit, CSCF, MGCF, application/service capability servers, media gateways, RNC, RBS, BSC, BTS, and backbone switches/routers, linking GSM/EDGE, WCDMA, PSTN/ISDN, and Internet/intranets. Platform labels: TSP, AXD, CPP, AXE, WPP.]

Virtualizing the Hardware
[Figure: the complete production software (user program, middleware, DB, server, operating system, drivers, firmware) runs on virtual hardware consisting of CPUs, RAM, FLASH, ROM, ASIC, LCD, disk and disk controller, network, and PCI and I2C buses.]
• The software can’t tell the difference
• Identical build tool chain
• Runs binaries from real target

Abstraction Levels
[Figure: levels of abstraction, from most to least abstract]
• Service API (Java library)
• Operating System API Standard (POSIX)
• Operating system API (VxSim)
• HW/SW interface: functional instruction-set & transaction-level device behavior
• Cycle-accurate instruction-set
• Timing-correct cycle-level (SystemC)
• Implementation-level (VHDL/Verilog)
Annotations:
• API levels: too abstract to provide information on actual target behavior; not same binaries as target, additional build chain
• HW/SW interface: stable & narrow interface, enables fast execution
• Detailed levels: excessive detail gives very slow simulation

Simics Virtual Hardware
• Very high execution speed
  – 100s to 1000s of MIPS
  – JIT Technology ISS
• Timing approximated
  – Does not model caches, pipelines, buses, and device implementation details
  – Covers 80% to 95% of real-time systems development
• Complete system
  – Networks, multiple machines, multicore etc.
• Processors
  – Complete instruction set
  – Function identical to real
  – User and supervisor modes
  – Memory-management unit
• Devices
  – Function modeled
  – Transaction-level abstraction
• Networks
  – Packets/message-level

Simics Compared to...
Development cards
• Both run real binaries
• HW has real timing
• Simics offers convenient debug
• Simics often faster
• Simics available before hardware
• Fault-injection possible in Simics
• Simics better at networks
Host-compiled simulation
• Simics runs real binaries, not host
• Simics requires no special build and OS emulation layer
• Simics provides correct relative execution speed between nodes
• Both handle networks
• Host-compiled might be faster
Instruction-Set Simulators
• Simics provides the whole system
• Simics faster than traditional ISS
Cycle-accurate simulation
• More detailed timing than Simics
• Low-level timed interaction visible
• Simics much faster (10x to 100x)
• Models take more time to create

Handy Features of Simics
• Checkpointing
  – Store current state; pick up and continue later
  – Position workload once, use many times
  – Distribute a known system state for a software load
  – Package a bug for parallel investigation by many engineers
• Determinism
  – Same initial state gives same execution
  – Repeat the same execution any number of times
  – Investigate a problem time after time
  – Simplifies problem reproduction
• Visibility (insight without intrusion)
  – All state can be observed
  – All events can be traced and logged
• Controllability
  – Any part of machine or state can be changed
  – Fault injection an interesting special case
• Virtual time
  – Time is completely virtual
  – Global synchronization across all machines
  – Single-step code on one machine or processor with no “flooding”
• Configurability
  – Any parameter of system can be changed
• Sandboxing
  – Simulated machine completely isolated
  – Allows investigating “nasty code”
• Reverse debugging
  – Roll back execution to previous state
  – Reverse breakpoints
  – Investigate details of program errors
  – Reduces time to find hard bugs and development risk
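
Checkpointing and determinism can be illustrated with a toy model. The sketch below is not Simics code; `ToyMachine` and its methods are hypothetical names invented for this example. It assumes that all state, including the random generator, lives inside the machine, so a snapshot captures everything and the same starting state always gives the same execution.

```python
import copy
import random


class ToyMachine:
    """A tiny deterministic 'simulated machine' (illustrative only)."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # all randomness is part of machine state
        self.time = 0                   # virtual time, counted in steps
        self.counter = 0

    def step(self):
        self.counter += self.rng.randint(1, 6)
        self.time += 1

    def run(self, steps):
        for _ in range(steps):
            self.step()

    def checkpoint(self):
        # Capture the complete state, including the random generator.
        return copy.deepcopy(self.__dict__)

    def restore(self, snapshot):
        self.__dict__ = copy.deepcopy(snapshot)


machine = ToyMachine(seed=42)
machine.run(100)                 # position the workload once
saved = machine.checkpoint()

machine.run(50)
first_result = (machine.time, machine.counter)

machine.restore(saved)           # pick up from the checkpoint again
machine.run(50)
second_result = (machine.time, machine.counter)
```

Both continuations from the checkpoint end in the same state, which is the property that makes a packaged bug reproducible on another engineer's desk.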
Reverse Debugging
• Going forwards
• Back up and find out what happened
• For an entire system, including distributed and multiprocessor systems

Networked System Testing
[Figure: several simulated machines (simulated HW, OS, application) inside Simics are joined by the Simics network link simulation. The system under test can be connected to other nodes on the simulated network, to a behavior model, to a traffic generator, to a network listener and tester, and, through a real-network connection, to real-world nodes (real HW, OS, application) connected to the simulated network.]

Incomplete Systems/Scaffolding
• Virtual environments are very useful for incomplete systems
  – If a bootrom does not exist, load OS directly into memory and configure system state
  – To quickly get a prototype board, stub hardware with fixed values for registers
  – Use breakpoints and “cheat” to fake OS functionality not yet implemented
  – Model other network nodes by behavior, not as concrete hardware with real software
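
Stubbing hardware with fixed register values can be sketched in a few lines. This is a generic illustration rather than the Simics API: `StubDevice`, `MemoryMap`, the base address, and the register offsets are all assumptions made up for the example.

```python
class StubDevice:
    """Register bank that returns fixed values and ignores writes."""

    def __init__(self, fixed_registers, default=0):
        self.fixed_registers = dict(fixed_registers)
        self.default = default
        self.write_log = []  # kept so a test can see what the driver tried

    def read(self, offset):
        return self.fixed_registers.get(offset, self.default)

    def write(self, offset, value):
        self.write_log.append((offset, value))


class MemoryMap:
    """Routes physical addresses to devices, as seen by the processor."""

    def __init__(self):
        self.regions = []  # list of (base, size, device)

    def map(self, base, size, device):
        self.regions.append((base, size, device))

    def _find(self, address):
        for base, size, device in self.regions:
            if base <= address < base + size:
                return device, address - base
        raise LookupError(f"unmapped address {address:#x}")

    def read(self, address):
        device, offset = self._find(address)
        return device.read(offset)

    def write(self, address, value):
        device, offset = self._find(address)
        device.write(offset, value)


# Hypothetical board: the register at offset 0x04 always reads "ready".
bus = MemoryMap()
stub = StubDevice({0x00: 0x1234, 0x04: 0x1})
bus.map(0x4000_0000, 0x100, stub)

bus.write(0x4000_0008, 0xFF)       # accepted and logged, but has no effect
status = bus.read(0x4000_0004)
```

A driver polling the status register sees a device that is always ready, which is often enough to get software running before the real device model exists.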

The Multicore Revolution is Here!
• The massive move to parallel computers and multiple processor cores instead of single processors has been trumpeted before.
• This time it is for real. Why?
• More instruction-level parallelism hard to find
  – Very complex designs needed for small gain
• Clock frequency scaling is slowing drastically
  – Too much power and heat when pushing envelope
• Cannot communicate across chip fast enough
  – Better to design small local units with short paths
• Effective use of billions of transistors
  – Easier to reuse a basic unit many times
• Potential for very easy scaling
  – Just keep adding processors/cores for higher performance

Embedded Multicore
• Multiprocessor and multicore systems are the future for embedded systems
  – Dominant in server market since 1980s
  – Prevalent in SoC design since 2000
  – Standard on the desktop in 2006
• Now the only option for maximum performance

Vendor     Chip           Max #Cores  Arch           AMP/SMP marks
ARM        ARM11 MPCore   4           ARMv6          X X
Cavium     Octeon CN38    16          MIPS64         X
PA Semi    PA6T custom    8           PPC            X
Freescale  MPC8641D       2           PPC            X X
IBM        970MP          2           PPC64          X
IBM        Cell           9           PPC64, DSP
Raza       XLR 7-series   8           MIPS64         X
TI         OMAP2          3           ARM, C55, IVA
(An X marks AMP or SMP support. For rows with a single X the column is not recoverable from the transcript, and two further X marks cannot be assigned to a row.)

Software on Multicore is Hard
• Parallelism required to gain performance
  – Parallel hardware is “easy” to design
  – Parallel software is known to be hard to write
• Existing software assumes single-processor
  – Multitasking != multiprocessor-ready
  – Software breaks in new, interesting ways on multiprocessors
• True concurrency is fundamentally hard
  – Human minds have a hard time with concurrency
  – Especially in complex software systems
  – Some phenomena cannot occur on a single processor running multiple threads, only on true multiprocessors

Multiprocessors & Debug
• Limited visibility into hardware
  – Single debug port, multiple processors
  – High speed, concurrent execution
• Timing-sensitive
  – Small changes in timing alter system behavior radically
  – Hardware variations impact software behavior
• Indeterminism
  – Rerunning a program gives different results
  – Hard to reproduce bugs
• Heisenbugs
  – Inserting probes to trace behavior alters behavior
  – Bugs hide when they are being debugged
• Other cores keep running even if one core is stopped

Three Steps of Debugging
1. Provoking errors
  – Forcing the system to a state where things break
2. Reproducing errors
  – Recreating a provoked error reliably
3. Locating the source of errors
  – Investigating the program flow and data
  – Depends on success in reproduction
Simics helps with all three steps

Debugging Multicore... in Simics
1. Provoking errors
  – Vary configuration, processor speeds, latencies
  – Force corner cases to occur
2. Reproducing errors
  – Checkpoints & determinism make reproduction easy
  – No Heisenbugs
  – Error situations easy to package and distribute
3. Locating the source of errors
  – Reverse debugging a key tool
  – Global time synchronization & global stop
  – No probe effect from instrumentation and tracing
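
The first two steps can be illustrated with a toy scheduler, under the assumption that the simulator alone decides how the cores interleave. Nothing here is Simics code; `core_program`, `simulate`, `run_seeded`, and `run_forced` are hypothetical names. Two simulated cores perform a non-atomic increment on a shared counter, and the interleaving is either drawn from a seeded generator (reproducible) or forced explicitly (provoking a corner case).

```python
import random


def core_program(shared, increments):
    """Generator: each yield is a point where another core may run."""
    for _ in range(increments):
        value = shared["counter"]      # load
        yield
        shared["counter"] = value + 1  # store (non-atomic read-modify-write)
        yield


def simulate(pick, cores=2, increments=3):
    """Run all cores; pick(n) chooses which of n running cores steps next."""
    shared = {"counter": 0}
    running = [core_program(shared, increments) for _ in range(cores)]
    while running:
        core = running[pick(len(running))]
        try:
            next(core)
        except StopIteration:
            running.remove(core)
    return shared["counter"]


def run_seeded(seed, **kwargs):
    # Same seed, same interleaving, same result: reproduction is trivial.
    rng = random.Random(seed)
    return simulate(rng.randrange, **kwargs)


def run_forced(schedule, **kwargs):
    # Force a specific interleaving; the schedule must be long enough
    # to run every core to completion.
    choices = iter(schedule)
    return simulate(lambda n: next(choices) % n, **kwargs)


# Both cores load before either stores: one update is lost.
lost_update = run_forced([0, 1, 0, 1, 0, 0], increments=1)
# Core 0 runs to completion before core 1 starts: no update is lost.
serial = run_forced([0, 0, 0, 1, 1, 1], increments=1)
```

Sweeping `run_seeded` over a range of seeds is one way to search for failing interleavings; any seed that fails can be handed to a colleague and will fail the same way for them.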

Building a Model
• Prerequisite to obtaining benefits of virtualized software development
• How do we achieve this?
[Figure: the development hardware (boards with CPU, RAM, FLASH, DSP, Ethernet, and other devices on a backplane) is mirrored by a Simics model, which serves as the virtual development platform.]
Modeling Your System
[Figure: an architecture diagram (processors, devices, memory, flash, ASIC, network, interconnect) is mapped onto a Simics configuration.]
• Use Simics framework
• Reuse VT components
  – Large library available
• Adapt VT components
• Model custom parts
  – DML
  – C, C++
  – Python
• Device modeling by
  – Virtutech
  – Customer
  – Partner
  – Consultant

The DML Modeling Language
• Domain-specific for creating fast hardware device models
• Declarative style
• Fast compiled models
• Models binary redistributable
• Efficient coding
  – 5 times smaller than C
  – Quick start modeling
  – Iterative lazy development
  – Much faster than SystemC
• Modeling time:
  – Depends on model complexity
  – Hours to days to weeks
Example (excerpt):
bank b {
    register DMA_control size 4 @ 0x20 {
        field EN  [31]   "Enable DMA";
        field SWT [30]   "Software Trigger";
        field TS  [15:0] "Transfer size";
        method after_write(memop) {
            inline $do_dma_transfer();
        }
    }
    register DMA_source size 4 @ 0x24;
    register DMA_dest   size 4 @ 0x28;

    method do_dma_transfer() {
        if ($DMA_control.EN == 1) {
            local uint16 count = $DMA_control.TS;
            local uint8 local_buf[4];
            local exception_type_t result;

            while (count > 0) {
                // copy memory details elided...
                $DMA_source += 4;
                $DMA_dest += 4;
                count -= 1;
            }
            // clear SWT bit, update TS
            $DMA_control.SWT = 0;
            $DMA_control.TS = count;
            ...
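
For readers unfamiliar with DML, the behavior of the DMA register bank in the excerpt on this slide can be approximated in plain Python. This is a sketch, not a translation produced by any tool: the class name `DmaDevice` and the `bytearray` memory are assumptions, and the 4-byte copy inside the loop fills in the step the slide elides, so treat it as one plausible reading.

```python
class DmaDevice:
    """Transaction-level model of the DMA registers shown in the DML excerpt."""

    CONTROL, SOURCE, DEST = 0x20, 0x24, 0x28   # register offsets from the slide
    EN = 1 << 31                               # "Enable DMA"
    SWT = 1 << 30                              # "Software Trigger"
    TS_MASK = 0xFFFF                           # "Transfer size", bits 15:0

    def __init__(self, memory):
        self.memory = memory                   # bytearray standing in for target RAM
        self.regs = {self.CONTROL: 0, self.SOURCE: 0, self.DEST: 0}

    def read(self, offset):
        return self.regs[offset]

    def write(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF
        if offset == self.CONTROL:             # mirrors after_write on DMA_control
            self._do_dma_transfer()

    def _do_dma_transfer(self):
        control = self.regs[self.CONTROL]
        if not control & self.EN:
            return
        count = control & self.TS_MASK
        src, dst = self.regs[self.SOURCE], self.regs[self.DEST]
        while count > 0:
            # Assumed copy step: move one 4-byte word per iteration.
            self.memory[dst:dst + 4] = self.memory[src:src + 4]
            src += 4
            dst += 4
            count -= 1
        self.regs[self.SOURCE], self.regs[self.DEST] = src, dst
        # Clear the SWT bit and write the remaining count back to TS.
        self.regs[self.CONTROL] = (control & ~self.SWT & ~self.TS_MASK) | count


ram = bytearray(64)
ram[0:8] = b"ABCDEFGH"
dma = DmaDevice(ram)
dma.write(DmaDevice.SOURCE, 0)
dma.write(DmaDevice.DEST, 32)
dma.write(DmaDevice.CONTROL, DmaDevice.EN | DmaDevice.SWT | 2)  # copy two words
```

The model keeps only the state a driver can observe through the registers, which is the transaction-level style the slides describe.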

Case Study: SwitchCore
• Problem: SwitchCore needed to develop and test drivers and protocol stacks for their next-generation Xpeedium3 chips
• Challenges:
  – Silicon not yet available
  – SwitchCore customers need to evaluate performance and to develop their own software layers
  – Previously had used an internal simulator, but it was slow and expensive to maintain
• Solution: Model Xpeedium3 using Simics
• Benefits:
  – Internal software development (including offshore)
  – Customers can develop their own software using the same model
  – Reduced delay between prototype availability and production orders

SwitchCore Board Components
[Figure: block diagram of a board with a PPC 8548 CPU running OS, driver, and apps. Labeled parts: CPU, PLB, RAM, FLASH, EEPROM, UART/serial, eTSEC, PCIe and PCIe switch, I2C and I2C hub, MDIO, a custom link, the X3 chip with xMII, three PHYs, and Ethernet to the front panel and backplane.]

Case Study: Wind River/8641D
• Problem: Wind River needed to develop software for the Freescale MPC8641D dual-core PowerPC SoC
• Challenges:
  – No prototype silicon was available
  – Silicon schedule was slipping but customers still required Wind River support on schedule
  – 8641D is a dual-core chip, so this is not a straightforward port
• Solution:
  – Wind River’s engineering organization ported VxWorks using Virtutech Simics with the 8641D processor model
• Benefits:
  – Development could start ahead of silicon
  – Improved productivity, improved software quality, earlier availability of 8641D software to Wind River’s customers

Case Study: Ericsson
• Problem: Ericsson needed to test software on a large range of base-station configurations
• Challenges:
  – Hardware is expensive and takes 2-14 days to reconfigure before testing
  – Systems can have up to 66 boards and 700 processors
  – Test teams are geographically distributed
• Solution: Create Simics models for each board and handle all reconfiguration through scripts
• Benefits:
  – Enormous reduction in cost of capital equipment used for testing
  – Can reconfigure a system almost instantly
  – Can handle even a fully populated system

Virtualized Software Development
Hardware-based methodology compared with virtualized software development:
• Hardware-based: Application development has to get started with hand-built scaffolding
  Virtualized: Software development can start once basic architectural decisions have been made
• Hardware-based: Separate development methodologies for application and lower-level software
  Virtualized: Uniform development methodology; far more iterations, lower risk & higher quality
• Hardware-based: System integration unpredictable and always on the critical path
  Virtualized: System integration is quick and uncovers fewer problems: quality built in much earlier
• Hardware-based: Long delay between hardware availability and product shipment (revenue)
  Virtualized: Minimal delay between hardware availability and FCS (and $)

Driving Quality Sooner
[Figure: number of defects removed over time, across the software development, integration and test, and deployed phases, comparing “with hardware only” against “with virtualized software development”; the customer ship date is marked.]
• Development starts earlier
• More defects found during development phase
• Fewer defects found during integration
• Product ships earlier
• Higher quality

Remember! Fill in the evaluation form and be a part of making Øredev even better.
You will automatically be entered in the evening lottery.

Simics Modeling Level: Processor
• Instruction-set simulation (ISS)
• Goal is very high performance
• Complete and correct processor functionality
  – All instruction semantics bit-correct vs real machine
  – Supervisor-mode & user-mode
  – Runs the complete target instruction set
    • Including Altivec, SSE, 3dNow, VIS, etc. extensions
  – All accessible values represented
    • User-level registers
    • Supervisor-level registers
    • Model-specific registers, ASIs, debug registers, etc.
• Memory-management unit
• Timing abstracted
  – Add details if required
Simics Modeling Level: Devices
• Hardware modeled as a set of devices– Memory map of machine (as seen by processor)– At the programming register level
• Model the program-visible behavior– Configuration registers– Control register – Data transmitted & received
• Transaction-level modeling– Reads, writes, DMA transfers, network packets– For high-performance models
• ASICs & FPGAs– Model programming interface behavior– Not detailed implementation
• Detailed timing can be added if required

Simics Modeling Level: Networks
• Interfaced using “real” network devices
• Networks modeled at message level
  – Entire messages (packets, frames, ...) delivered as a unit
• Hardware addressing used
  – Ethernet MAC, 1553 Node IDs
  – Does not care about higher-level protocols
  – Ethernet allows IPv4, IPv6, TCP, UDP, SCP, ICMP, ...
• Any topology or addressing scheme
  – Broadcast, unicast, switched, point-to-point, etc.
• Perfect network by default
  – Introduce latencies
  – Introduce bandwidth limits
  – Introduce faults
  – Introduce arbitration
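
Message-level network modeling with optional latency and faults can be sketched as a small event queue. The `Link` class and its `attach`, `send`, and `advance` methods are hypothetical names for this illustration and do not correspond to the Simics network link interface; only unicast delivery is shown.

```python
import heapq
import itertools


class Link:
    """Message-level link: whole frames are delivered by hardware address."""

    def __init__(self, latency=0):
        self.latency = latency            # 0 models the "perfect" default network
        self.endpoints = {}               # hardware address -> receive callback
        self.pending = []                 # heap of (delivery_time, seq, dst, frame)
        self.seq = itertools.count()      # tie-breaker keeps ordering deterministic
        self.now = 0                      # virtual time
        self.fault = None                 # optional hook: frame -> frame or None

    def attach(self, address, receive):
        self.endpoints[address] = receive

    def send(self, dst, frame):
        if self.fault is not None:
            frame = self.fault(frame)
            if frame is None:             # the fault hook dropped the frame
                return
        entry = (self.now + self.latency, next(self.seq), dst, frame)
        heapq.heappush(self.pending, entry)

    def advance(self, until):
        """Move virtual time forward and deliver every frame that is due."""
        while self.pending and self.pending[0][0] <= until:
            when, _, dst, frame = heapq.heappop(self.pending)
            self.now = when
            receive = self.endpoints.get(dst)
            if receive is not None:       # frames to unknown addresses vanish
                receive(when, frame)
        self.now = until


node_b = "00:00:00:00:00:02"              # made-up MAC address
inbox = []
link = Link(latency=5)
link.attach(node_b, lambda when, frame: inbox.append((when, frame)))

link.send(node_b, b"hello")
link.advance(until=4)
early = list(inbox)                       # nothing has arrived yet

link.advance(until=10)                    # the frame arrives at time 5
link.fault = lambda frame: None           # from now on, drop every frame
link.send(node_b, b"lost")
link.advance(until=20)
```

Because the link never looks inside the frame, any higher-level protocol can run over it unchanged, which matches the point made on the slide.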

Simics Modeling: No Limits
• Boards/machines:
  – Single processor
  – Multiprocessor
  – Shared memory, local memories
• Backplane/interconnect:
  – Network (Ethernet, ATM, I2C, ATCA, custom links...)
  – Shared memory (PCI, PCIe, custom system...)
• System level:
  – Multiple boards and machines
  – Heterogeneous processors, boards, machines
• Networks:
  – Any number of networks
  – Mixing different network types
• Scalability:
  – Always allows 64-bit memory space
  – Simulation can be distributed

Modeling Your System
[Figure, built up over three slides: the target architecture (processors, devices, memory, flash, ASIC, network, interconnect) is mapped step by step onto a Simics configuration.]
1. Determine the components
  • System architecture docs
  • User guides
  • Component manuals
2. Reuse library components
  • Virtutech libraries
  • Processors
  • Devices
  • Interconnects & networks
  • System structure
3. Model unique components
  • And adapt existing
  • Virtutech
  • Customer
  • Consultant

The Modeling Process
• Determine level of abstraction
• Model devices at interfaces
  – PCI interface: read and write transactions
  – Memory map interface: read and write transactions
  – DMA transactions into/out of memory
  – Network: packets in and out
  – Interrupt lines: high or low
• Minimal device state to model behavior at interfaces
• Whatever documentation a software programmer needs is what we need

Mapping a System For Modeling
[Figure: PPC 440 GP block diagram with the blocks PPC440 core, UIC, DCR map, PLB map, DCR registers, DDR controller, DDR SDRAM, PCI bridge, Eth0, Eth1, external bus controller, GPIO, I2C, UART0, UART1, SRAM, FLASH, DMA, PLB arbiter, MAL, and clock/power/control. The legend distinguishes three categories: blocks modeled as Simics memories, devices modeled (mostly) as dummies, and devices where function needs to be modeled.]

Network Simulation with Simics: Node View
[Figure: several simulated machines (OS, application) inside Simics, joined by the Simics network link simulation.]
• Regular OS networking API for the applications
• OS talks to the network device, like on a real machine
• Simulated machine sends packets onto the simulated network

Network Instrumentation
[Figure: simulated machines (OS, application) inside Simics, joined by the Simics network link simulation, with a network instrumentation module attached.]
• Packets travel on the simulated network(s)
• Instrumentation at network core
• Instrumentation at network devices
• Packets can be inspected, killed, corrupted, delayed, bandwidth limited

Uses for Virtual Hardware
• Hardware replacement for test & development
  – CapEx savings
  – Capacity & capability increase: more systems available
• Early hardware availability
  – Operating system bring-up & development
• Performance tuning & debugging
  – Non-intrusive diagnosis of bottlenecks
  – Cache, TLB, disk, network profiling
• Fault injection
  – Stop & crash nodes, inject network faults
  – Repeatable & reliable
• Regression testing
  – Automation & perfect control over system
• Scalability testing
  – More hardware than in the real world

Reverse Debugging
• Stop & go back in time
  – Instead of rerunning program from start
  – No need to rerun and hope for bug to reoccur
  – Investigate exactly what happened this time
  – Breakpoints & watchpoints backwards in time
  – Very powerful for parallel programs
[Figure: execution timeline labeled “Go forward” and “Back up”; only some runs reproduce the right error.]

Reverse Debugging Techniques
• Trace-based
  – Record system execution
  – Special hardware support or simulator
  – Use as “tape recorder,” fixed execution observed
  – Hard to extend to multiprocessors
• Simulation-based
  – Record in simulator
  – Replay in same simulator
  – Can change state and continue execution
  – More powerful solution
[Figure: timelines labeled “Go forward” and “Back up”; the simulation-based timeline adds “Back up and go somewhere else”.]
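
One common way to build simulation-based reverse execution is to combine periodic checkpoints with deterministic replay: to go back, restore the nearest earlier checkpoint and run forward again. The sketch below shows that general technique on a toy machine; it is an assumption that this resembles how any particular product works, and `ReversibleMachine` and its methods are invented for the example.

```python
import copy


class ReversibleMachine:
    """Deterministic toy machine with periodic checkpoints."""

    def __init__(self, program, checkpoint_interval=4):
        self.program = program                    # list of (register, delta)
        self.state = {"pc": 0, "regs": {}}
        self.interval = checkpoint_interval
        self.checkpoints = {0: copy.deepcopy(self.state)}

    def step(self):
        reg, delta = self.program[self.state["pc"]]
        regs = self.state["regs"]
        regs[reg] = regs.get(reg, 0) + delta
        self.state["pc"] += 1
        if self.state["pc"] % self.interval == 0:
            self.checkpoints[self.state["pc"]] = copy.deepcopy(self.state)

    def run_to(self, pc):
        while self.state["pc"] < pc:
            self.step()

    def reverse_to(self, pc):
        # Restore the nearest checkpoint at or before pc, then replay forward.
        base = max(p for p in self.checkpoints if p <= pc)
        self.state = copy.deepcopy(self.checkpoints[base])
        self.run_to(pc)

    def reverse_until(self, predicate):
        """Reverse breakpoint: the latest earlier point where predicate holds."""
        target = self.state["pc"] - 1
        while target >= 0:
            self.reverse_to(target)
            if predicate(self.state):
                return target
            target -= 1
        return None


program = [("a", 1), ("a", 1), ("b", 5), ("a", 1), ("a", -10), ("b", 1)]
machine = ReversibleMachine(program)
machine.run_to(len(program))                      # register "a" ends up negative

# Back up to the last point where "a" was still non-negative.
last_good = machine.reverse_until(lambda s: s["regs"].get("a", 0) >= 0)
culprit = program[last_good]                      # the instruction executed next
```

After `reverse_until` returns, the machine is positioned just before the offending instruction, so its state can be inspected. If the state were modified at that point, checkpoints taken later than the current position would be stale and would need to be discarded before continuing.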