8/14/2019 Multiprocessor System on Chips
Multiprocessor System-On-Chips
-
Introduction
SoC: an integrated circuit that implements most or all of the functions of a complete electronic system
  A memory chip is not a system but a component
  Contains memory, an instruction-set processor (CPU), specialized logic, buses, and other digital functions
Generally tailored to the application rather than being a general-purpose chip
  Cost-effective
  Provides the necessary performance
-
Introduction
A new crisis in SoC design:
  Increases in functionality, reliability, and bandwidth
  Decreases in cost and power consumption
  high-intension silicon with Register-Transfer-Level (RTL) hardware design techniques
Pressure on chip designers:
  Productivity gap, growing cost, time-to-market
Challenges:
  Silicon density, design and verification tools and complexity, bug cost, software, complex standards
-
Introduction
New design methodology:
  Using pre-designed, pre-verified processor cores
    but relying on general-purpose processors alone is not feasible
  Designing custom RTL logic
    but it takes too long and is too rigid to change easily
Solution: configurable, extensible processors
  Firmware rather than RTL-defined hardware
-
The Limitations of Traditional ASIC Design
Conventional SoC design:
  Combining a standard microprocessor, memory, and RTL-built logic into an ASIC
  Philosophical descendants of earlier board-level designs
    One or two 32-bit buses (to save pins)
    Rigid partitioning between a microprocessor and logic blocks, because communications are assumed to be the bottleneck
The impact of SoC integration:
  Wide buses (128-bit, 256-bit)
  1 GB per second on an SoC using wider buses
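A rough check of the 1 GB per second figure can be sketched as follows. The slides do not state a bus clock, so the 62.5 MHz value and the one-transfer-per-cycle default below are assumptions chosen only to make the arithmetic concrete; `peak_bus_bandwidth` is a hypothetical helper name.

```python
def peak_bus_bandwidth(width_bits, clock_hz, transfers_per_cycle=1):
    """Peak bus bandwidth in bytes per second.

    Assumes every cycle carries a full-width transfer (no arbitration
    or wait-state overhead), so this is an upper bound.
    """
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * clock_hz * transfers_per_cycle


ASSUMED_CLOCK_HZ = 62_500_000  # hypothetical clock, not from the slides

narrow = peak_bus_bandwidth(32, ASSUMED_CLOCK_HZ)   # board-style 32-bit bus
wide = peak_bus_bandwidth(128, ASSUMED_CLOCK_HZ)    # on-chip 128-bit bus
print(narrow, wide)
```

At the same assumed clock, the 128-bit bus moves four times as many bytes per second as the 32-bit bus, which is why on-chip integration removes the pin-count motive for narrow buses.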
-
The Limitations of Traditional ASIC Design
The limitations of general-purpose processors:
  Support the most generic data types, for complete generality
  Silicon-intensive, deeply pipelined, superscalar
  IPC limits
Embedded systems:
  Critical functions need specific data types
  Cannot take full advantage of all capabilities
  Hard-wired circuits for data-intensive functions
-
Extensible Processors as an Alternative to RTL
Two important criteria:
  Must accelerate and simplify the creation of configurations
  Hardware descriptions, software development tools, verification aids
Configurable and extensible processors:
  Non-architectural processor configuration
  Fixed menu of processor architecture configurations
  User-modifiable processor RTL
  Processor extension using an instruction-set description language
  Fully automated processor synthesis
-
Extensible Processors as an Alternative to RTL
Design migration from hardwired state machines to firmware program control:
  Flexibility
  Software-based development
  Faster, more complete system modeling
  Unification of control and data
  Time-to-market
Not the right choice for:
  Small, fixed-state machines
  Simple data buffering
-
Toward Multiple-Processor SoCs
Complexity of SoC designs:
  Faster initial design
  Greater post-fabrication flexibility
Two trends:
  Combining of functions traditionally implemented
  Migration of functions with RTL into application-specific processors
Concerns:
  Interconnection of multiple processors
  Simulation of a system composed of application-specific processors
-
What Are MPSoCs?
In an MPSoC, SW design is an inherent part of the overall chip design
For chip designers:
  Either HW or SW can be used to solve a problem
  The choice depends on performance, power, and design time
For SW designers:
  SW will be shipped as part of a chip, so it must be extremely reliable
  It must meet many design constraints usually reserved for hardware (hard timing constraints, energy consumption, ...)
-
What Are MPSoCs?
Heterogeneous vs. symmetric multiprocessors; heterogeneous designs are:
  Harder to program
  Cheaper
  More energy-efficient
Challenges in MPSoC software design:
  The combination of high reliability, real-time performance, small memory footprint, and low-energy software
-
Why MPSoCs?
Typically, an MPSoC is a heterogeneous multiprocessor:
  Several different types of PEs (processing elements)
  Heterogeneously distributed memory system
  Heterogeneous interconnection network between the PEs and the memory systems
A shared-memory multiprocessor model is preferred because it makes life simpler for the programmer
-
Why MPSoCs?
Multiprocessor vs. uniprocessor:
  Enough performance for some applications
  The computational concurrency required to handle concurrent real-world events in real time (task-level parallelism)
Heterogeneous vs. symmetric:
  Perform real-time computations
  Be area-efficient
  Be energy-efficient
  Provide the proper I/O connections
-
Why MPSoCs?
Perform real-time computations:
  Real-time computing is much more than high-performance computing
  Predictable behavior of the hardware
  For predictable and high performance: a mechanism that is specialized to the needs of the application
    Specialized memory systems, application-specific instructions
-
Why MPSoCs?
Be area-efficient:
  A special-purpose PE may be much faster and smaller than a programmable processor
  If the system architect can predict some aspects of the memory behavior of the application, it is often possible to reflect those characteristics in the architecture
    Memory specialization / cache configuration
-
Why MPSoCs?
Be energy-efficient:
  Power-sensitive, whether due to
    Environmental considerations (heat dissipation)
    Or system requirements (battery power)
  Specialization saves power
    Stripping away features that are unnecessary for the application
-
Why MPSoCs?
Provide the proper I/O connections:
  The point of an SoC is to provide a complete system
  Specialized I/O
  Because of the variety of physical interfaces, it is difficult to create customizable I/O devices effectively
-
Challenges
Software development:
  High performance, real time, and low power
  Each MPSoC requires its own software development environment
Task-level behavior:
  Task-level parallelism is both easy to identify in SoC applications and important to exploit
  RTOSs provide scheduling mechanisms, but abstract the process
Networks-on-chips:
  Use packet networks to interconnect the processes in the SoC
-
Challenges
FPGAs:
  The FPGA logic can be used for custom logic that could not be designed before manufacturing
  A good complement to software-based customization
Security:
  Connect to the Internet
  Security becomes increasingly important
Networks of chips:
  Sensor networks
  Do not have total control over the system
-
Design Methodologies
Fast design time is very important:
  Tight time-to-market and time window constraints
Higher-level abstractions are needed on the HW and SW side
A key issue is the definition of a good system-level model that is capable of representing all those heterogeneous components along with local and global design constraints and metrics
-
Design Methodologies
Design steps:
  Design space exploration
    Hardware/software partitioning, selection of architectural platform and components
  Architecture design
    Design of components, hardware/software interface design
Consider strict requirements regarding time-to-market, system performance, power consumption, and production cost
-
Hardware Architecture
Which CPU do you use? What instruction set and cache should be used based on the application characteristics?
What set of processors do you use? How many processors do you need?
What interconnect and topology should be used? How much bandwidth is required? What quality-of-service characteristics are required of the network?
How should the memory system be organized? Where should memory be placed and how much memory should be provided for different tasks?
-
Software
Software architecture and design reuse viewpoint:
  Middleware, operating system, hardware abstraction layer
  APIs provide an abstraction of the underlying hardware architecture to upper layers of software
  The software architecture may enable several levels of software design reuse
Key challenges:
  Determining which abstraction of the MPSoC architecture is most suitable at each of the design steps
  Determining how to obtain application-specific optimization of the software architecture
-
Software
Optimization viewpoint:
  Cost and performance requirements
Two factors:
  Processor architecture
    Parallelism
    Application-specific
  Memory hierarchy
    Shared memory
    Distributed memory
Consider problems in a different context, with more design freedom in hardware architecture and with a new focus on energy consumption
-
Techniques for Designing Energy-Aware MPSoCs
-
Introduction
Power and energy consumption have become significant constraints
Reducing active power: voltage scaling
Reducing standby power
-
Reducing Active Energy
Multiple supply voltages:
  Decreasing the supply voltage decreases performance, since gate delay increases
  Effective in MPSoCs, since different types of processors require different performance
DVS (dynamic voltage scaling) combined with DFS (dynamic frequency scaling), the most popular of the techniques:
  Most embedded and mobile processors contain this feature
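The trade-off on this slide follows from the standard first-order CMOS relations, given here as textbook background rather than as formulas taken from the slides. The symbols are assumptions introduced for illustration: $\alpha$ is the switching activity, $C_{\text{eff}}$ the switched capacitance, $V_{DD}$ the supply voltage, $V_T$ the threshold voltage, $f$ the clock frequency, and $\gamma$ a technology-dependent exponent.

```latex
P_{\text{dyn}} = \alpha \, C_{\text{eff}} \, V_{DD}^{2} \, f,
\qquad
E_{\text{op}} \propto C_{\text{eff}} \, V_{DD}^{2},
\qquad
t_{\text{gate}} \propto \frac{V_{DD}}{\left(V_{DD} - V_T\right)^{\gamma}},
\quad 1 < \gamma \le 2
```

Lowering $V_{DD}$ cuts energy per operation quadratically but lengthens gate delay, so the clock frequency has to drop with it. This is why voltage scaling is paired with frequency scaling.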
-
DVS+DFS
As long as the supply voltage is increased before the clock rate is increased, the system only stalls while the PLL is relocking on the new clock rate
In a future MPSoC, each core would require its own converter and PLL:
  Requirement: cores are tolerant of periodic dropouts
  Complication: the PLL is an analog device, and noise is induced by digital switching
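The ordering rule above can be sketched as a small planner. This is a minimal illustration, not a driver for any real device: `OperatingPoint`, `transition_steps`, the action names, and the voltage and frequency values are all hypothetical. The scale-up order (voltage first) comes from the slide; the mirrored scale-down order (frequency first) is the usual convention and is an assumption here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    voltage_v: float
    freq_mhz: float


def transition_steps(current, target):
    """Return the ordered actions for moving between two operating points."""
    steps = []
    voltage_changes = target.voltage_v != current.voltage_v
    if target.freq_mhz > current.freq_mhz:
        # Speeding up: raise the supply first so the logic can meet the
        # faster clock, then relock the PLL (the only stall period).
        if voltage_changes:
            steps.append(("set_voltage", target.voltage_v))
        steps.append(("relock_pll", target.freq_mhz))
    elif target.freq_mhz < current.freq_mhz:
        # Slowing down (assumed mirror case): drop the clock first,
        # then lower the supply.
        steps.append(("relock_pll", target.freq_mhz))
        if voltage_changes:
            steps.append(("set_voltage", target.voltage_v))
    elif voltage_changes:
        steps.append(("set_voltage", target.voltage_v))
    return steps


low = OperatingPoint(voltage_v=0.9, freq_mhz=200.0)    # hypothetical values
high = OperatingPoint(voltage_v=1.2, freq_mhz=400.0)   # hypothetical values
print(transition_steps(low, high))
print(transition_steps(high, low))
```

Keeping the plan as data makes the ordering easy to check before anything touches hardware.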
-
Reducing Standby Energy
Increasing VT decreases subthreshold leakage current (pro) and increases gate delay (con)
Combining DVS, DFS, and variable VT is an effective approach
Sleep transistor:
  Gating the supply rail
  Switching off the supply to idle components
System SW can determine the optimal scheduling
  Can direct idle cores to switch off
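The dependence of leakage on VT noted at the top of this slide can be made concrete with the standard subthreshold current model. It is textbook background, not a formula from the slides, and the symbols are assumptions: $n$ is the subthreshold slope factor and $v_{th} = kT/q$ is the thermal voltage.

```latex
I_{\text{sub}} \propto
e^{\left(V_{GS} - V_T\right)/\left(n\, v_{th}\right)}
\left(1 - e^{-V_{DS}/v_{th}}\right),
\qquad v_{th} = \frac{kT}{q}
```

For an off transistor ($V_{GS} = 0$) the current scales as $e^{-V_T/(n\, v_{th})}$, so a modest increase in $V_T$ reduces leakage exponentially. A sleep transistor that gates the supply rail goes further and cuts the leakage path of the whole idle block.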
-
Requirement: ability to identify unused resources
Cache size is reduced dynamically to optimize
Cache block is supply-gated
Keeping the tag line active when deactivating a cache line
Dynamic voltage scaling
Drowsy cache
Leakage-biased bitline
-
Each processor has its own private cache
Pros:
  Low power per access, low latency, and good scalability
Cons:
  Duplication of data and instructions
  Complex cache coherence protocol
-
Combine the advantages of the two options!
CCC (crossbar-connected cache):
  The shared cache is divided into multiple banks using an N x M crossbar
Pros:
  Duplication problem is eliminated (logically single cache)
  A consistency mechanism isn't needed
  Scalable
  Useful in reducing energy consumption
-
CCC (cont.)
Cons:
  Concurrent accesses to the same bank cause processor stalls
Alleviation:
  More cache banks than the number of processors
  Deals with references to the same block
The energy benefits of CCC
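The bank-conflict behavior of a crossbar-connected cache can be illustrated with a toy model. The slides do not say how addresses map to banks or how conflicts are arbitrated, so the choices here are assumptions: block-interleaved mapping (block number modulo bank count), a 32-byte block, and lowest-processor-ID-wins arbitration. `bank_of` and `stalled_processors` are hypothetical names.

```python
from collections import defaultdict

BLOCK_BYTES = 32  # hypothetical cache block size


def bank_of(address, num_banks, block_bytes=BLOCK_BYTES):
    """Assumed mapping: consecutive blocks go to consecutive banks."""
    return (address // block_bytes) % num_banks


def stalled_processors(requests, num_banks):
    """Return the processors that stall in one cycle.

    requests maps processor ID -> address accessed this cycle.
    Each bank serves one request per cycle; the lowest processor ID
    wins (assumed arbitration) and the others stall.
    """
    by_bank = defaultdict(list)
    for proc in sorted(requests):
        by_bank[bank_of(requests[proc], num_banks)].append(proc)
    stalled = []
    for procs in by_bank.values():
        stalled.extend(procs[1:])
    return sorted(stalled)


# Four processors touching blocks 0, 4, 2 and 6 in the same cycle.
requests = {0: 0x000, 1: 0x080, 2: 0x040, 3: 0x0C0}
print(stalled_processors(requests, num_banks=4))
print(stalled_processors(requests, num_banks=8))
```

With four banks the blocks collide in pairs and two processors stall; with eight banks every request lands in its own bank. This simple model still stalls on simultaneous references to the same block, which the slide lists as a case to handle separately.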
-
Reducing Snoop Energy
In bus-based symmetric multiprocessors, all bus-side cache controllers snoop the bus
  Snoops occur when writes are issued to already cached blocks, and on cache misses
Unlike in a normal cache access, tag and data array accesses are separated
Energy optimizations include:
  Use of dedicated tag arrays for snoops
  Serialization of tag and data array accesses
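The benefit of serializing tag and data array accesses can be sketched with a simple energy count. The model assumes that a parallel lookup reads both arrays on every snoop, while a serialized lookup reads the data array only after a tag match. The energy units, snoop count, and hit rate are made-up illustration values, and `snoop_energy` is a hypothetical helper.

```python
def snoop_energy(num_snoops, snoop_hit_rate, e_tag, e_data, serialized):
    """Total snoop energy in arbitrary units under the assumed model."""
    if serialized:
        # Tag array is always probed; data array only on a tag hit.
        return num_snoops * (e_tag + snoop_hit_rate * e_data)
    # Parallel access: both arrays are read for every snoop.
    return num_snoops * (e_tag + e_data)


# Hypothetical numbers: 1000 snoops, 25% hit in this cache,
# tag probe costs 1 unit and data array read costs 4 units.
parallel = snoop_energy(1000, 0.25, 1.0, 4.0, serialized=False)
serial = snoop_energy(1000, 0.25, 1.0, 4.0, serialized=True)
print(parallel, serial)
```

Because most snoops miss in any given cache, skipping the data array on a tag miss removes most of the data-array energy, at the cost of extra latency on the snoops that do hit.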