FUSION APU AND TRENDS/CHALLENGES IN FUTURE SOC (PROCESSOR) DESIGN

Pankaj Singh. Acknowledgement: Denis Foley, Sr. Fellow, AMD. 9th International SoC Conference, 2nd & 3rd November 2011


Page 1: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

FUSION APU AND TRENDS/CHALLENGES IN FUTURE SOC (PROCESSOR) DESIGN

Pankaj Singh

Acknowledgement: Denis Foley, Sr. Fellow, AMD

9th International SoC Conference

2nd & 3rd November 2011

Page 2: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

2 | 9th Intl. SoC Conference| Nov 2nd,3rd, 2011

TODAY’S TOPICS

Trends:

– Three Eras of Processor Performance

– Evolution of Heterogeneous Computing

FSA and Open Standard:

– Why Fusion?

– Open Standard, OpenCL

Power, Performance

High Speed, Scalable Interconnect: NoCs

3-D Stacking

SoC Trends & Challenges

– Verification Effort

– IP Integration

– TLM, RTL Co-simulation challenges.

Page 3: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

TRENDS: THREE ERAS OF PROCESSOR PERFORMANCE

Single-Core Era

Chart: Single-thread Performance vs. Time ("we are here" marker, "?")

Enabled by: Moore’s Law, Voltage Scaling, MicroArchitecture

Constrained by: Power, Complexity

Multi-Core Era

Chart: Throughput Performance vs. Time (# of Processors) ("we are here" marker)

Enabled by: Moore’s Law, Desire for Throughput, 20 years of SMP arch

Constrained by: Power, Parallel SW availability, Scalability

Heterogeneous Systems Era

Chart: Targeted Application Performance vs. Time (Data-parallel exploitation) ("we are here" marker)

Enabled by: Moore’s Law, Abundant data parallelism, Power efficient GPUs

Currently constrained by: Programming models, Communication overheads

Page 4: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

TRENDS: EVOLUTION OF HETEROGENEOUS COMPUTING

Chart axis: Architecture Maturity & Programmer Accessibility (Poor to Excellent), plotted over three periods

Proprietary Drivers Era (2002 - 2008)

– Graphics & Proprietary Driver-based APIs

– “Adventurous” programmers

– Exploit early programmable “shader cores” in the GPU

– Make your program look like “graphics” to the GPU

– CUDA™, Brook+, etc

Standards Drivers Era (2009 - 2011)

– OpenCL™, DirectCompute Driver-based APIs

– Expert programmers

– C and C++ subsets

– Compute centric APIs, data types

– Multiple address spaces with explicit data movement

– Specialized work queue based structures

– Kernel mode dispatch

Architected Era (2012 - 2020)

– Fusion™ System Architecture GPU Peer Processor

– Mainstream programmers

– Full C++

– GPU as a co-processor

– Unified coherent address space

– Task parallel runtimes

– Nested Data Parallel programs

– User mode dispatch

– Pre-emption and context switching

More up-to-date information on FSA:

http://developer.amd.com/afds/pages/keynote.aspx#/Dev_AFDS_Reb_2

Page 5: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

FSA & OPEN STANDARD: ENTER FUSION

Block diagram: Dual Core CPU, Northbridge, DirectX® 11 GPU

FUSION APU (Accelerated Processing Unit)

Heterogeneous compute engine combining x86 compute and parallel processing capabilities of the GPU on a single die

Page 6: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

FSA & OPEN STANDARD: WHY FUSION?

Integrating CPUs, Northbridge and GPU enables:

– Unified Memory

– High-bandwidth, low latency access by GPU

– Saves on interface power and PHY area

– Shared Power Control and TDP envelope

Potential bandwidth bottleneck

Relatively long memory latency

Page 7: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

COMMITTED TO OPEN STANDARDS

AMD drives open and de-facto standards

– Compete on the best implementation

Open standards are the basis for large ecosystems

Open standards always win over time

– SW developers want their applications to run on multiple platforms from multiple hardware vendors

DirectX®

Page 8: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

OPENCL™ AND FSA

FSA is an optimized platform architecture for OpenCL™

– Not an alternative to OpenCL™

OpenCL™ on FSA will benefit from

– Avoidance of wasteful copies

– Low latency dispatch

– Improved memory model

– Shared pointers

FSA also exposes a lower level programming interface, for those that want the ultimate in control and performance

Optimized libraries may choose the lower level interface
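
The "avoidance of wasteful copies" and "shared pointers" points above can be illustrated with a small analogy. The sketch below is plain Python, not OpenCL or FSA code: copying a `bytearray` stands in for an explicit host-to-device transfer, and a `memoryview` stands in for a pointer shared through a unified address space. The function names are hypothetical and chosen only for this illustration.

```python
def process_with_copy(host_buffer: bytearray) -> bytes:
    """Model an explicit-copy flow: copy in, compute, copy the result out."""
    device_copy = bytearray(host_buffer)          # explicit copy "to the device"
    for i in range(len(device_copy)):
        device_copy[i] = (device_copy[i] * 2) % 256
    return bytes(device_copy)                     # explicit copy "back to the host"


def process_shared(host_buffer: bytearray) -> None:
    """Model a shared-pointer flow: the 'device' works on the same memory."""
    view = memoryview(host_buffer)                # no copy, same underlying storage
    for i in range(len(view)):
        view[i] = (view[i] * 2) % 256


data = bytearray([1, 2, 3, 4])
copied_result = process_with_copy(data)   # data itself is left unchanged
process_shared(data)                      # data is updated in place
```

In the copy-based flow two full buffer copies are made for one pass over the data; in the shared flow none are, which is the kind of saving the slide attributes to OpenCL™ on FSA.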

Page 9: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

POWER & PERFORMANCE

Page 10: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

POWER-THERMAL EFFECTS IN SYSTEMS ON CHIPS

Figure labels: Local failures! Part not working

Complex SoCs: High power density

Non-uniform power dissipation: Hotspots

Spatial gradients: Cause malfunctions

High on-chip temperatures cause malfunctions affecting reliability.

Power consumption depends on frequency

Setting frequencies to control power and temperature
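
The dependence of power on frequency can be made concrete with the standard CMOS dynamic power relation, P = alpha * C * V^2 * f. The sketch below is illustrative only: the activity factor, capacitance, operating points and power budget are made-up numbers, not AMD data, and `pick_frequency` is a hypothetical helper showing how a frequency could be chosen to stay within a power budget.

```python
def dynamic_power(alpha, capacitance_f, voltage_v, frequency_hz):
    """Dynamic switching power in watts: P = alpha * C * V^2 * f."""
    return alpha * capacitance_f * voltage_v ** 2 * frequency_hz


def pick_frequency(power_budget_w, operating_points, alpha, capacitance_f):
    """Return the highest (frequency_hz, voltage_v) point within the budget.

    Returns None when no operating point fits.
    """
    feasible = [
        (freq, volt)
        for freq, volt in operating_points
        if dynamic_power(alpha, capacitance_f, volt, freq) <= power_budget_w
    ]
    return max(feasible) if feasible else None


# Hypothetical operating points: (frequency in Hz, supply voltage in V).
POINTS = [(1e9, 0.9), (2e9, 1.0), (3e9, 1.2)]
ALPHA = 0.2          # assumed switching activity factor
CAPACITANCE = 10e-9  # assumed effective switched capacitance in farads

chosen = pick_frequency(5.0, POINTS, ALPHA, CAPACITANCE)
```

Because higher frequencies usually also need a higher supply voltage, power grows faster than linearly with frequency, which is why lowering the frequency is an effective lever for both power and temperature.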

Page 11: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

OPTIONS FOR POWER SAVINGS

Convergence of Performance and Low Power

– Notebook -> Netbook -> Tablet

– Tablet <- Smartphone

Page 12: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

PERFORMANCE AND POWER

Chart: APU Power vs. Use Case (Power plotted against Performance); use cases: S3, idle, Static Screen, MM07, Media Playback, Full Compute

Performance versus Power Efficiency

Power Management versus Power reduction

Performance & Thermal Design Power

Page 13: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

HIGH SPEED, SCALABLE INTERCONNECT

Page 14: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

NOC’S: FROM BUSES TO NETWORKS:

[Friedman Harel:10]

Note: This slide presents industry-specific information and does not relate to AMD NoC status.

Page 15: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

NOC CHALLENGES: CAD TOOLS

Capturing application traffic.

Which topology?

Mapping? Routes to use?

Fixing communication architecture: parameters.

Verification for correctness, performance.

Build models.

QoS under unreliable conditions.

Key to success: Automate & integrate the steps.

Mesh Topology: homogeneous systems, with regular tiles

Customized Topology: heterogeneous systems, with different cores & irregular FP

Software Services: Mapping, QoS, middleware...

Architecture: Packeting, buffering, flow control...

Physical Implementation: Synchronization, wires, power...

CAD Tools
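
The "Routes to use?" question for the mesh topology above can be sketched with dimension-ordered (XY) routing, a common choice for regular meshes. The slide does not name a routing algorithm, so XY routing is an assumption made for this example, and the function names are hypothetical.

```python
def xy_route(src, dst):
    """Return the tiles visited from src to dst, moving in X first, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def average_hops(width, height):
    """Average hop count over all ordered pairs of distinct tiles in a mesh."""
    tiles = [(x, y) for x in range(width) for y in range(height)]
    pairs = [(s, d) for s in tiles for d in tiles if s != d]
    return sum(len(xy_route(s, d)) - 1 for s, d in pairs) / len(pairs)


example_path = xy_route((0, 0), (2, 1))
mesh_2x2_hops = average_hops(2, 2)
```

A CAD flow of the kind the slide calls for would evaluate metrics like the average hop count against the captured application traffic before fixing the topology and its parameters.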

Page 16: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

NOC CHALLENGES: BEYOND GLOBAL SYNCHRONY

A complete spectrum of approaches to system timing exists [Mullins06-07]

Figure: spectrum from Synchronous to Delay Insensitive; Timing Assumptions range from Global to None; labels: Less Detection; Local Clocks, Interaction with data (becoming aperiodic)

Page 17: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

3-D STACKING

Page 18: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

3-D STACKING

Supporting heterogeneous computing: high density, high performance, high memory bandwidth requirement.

3-D NoCs option

Futuristic view: Integrating Bio-sensor

Note: This slide presents industry-specific information and does not relate to AMD 3-D stacking status.

Page 19: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

SOC TRENDS & CHALLENGES:

1. VERIFICATION EFFORT

2. IP INTEGRATION

3. TLM-RTL CO-SIMULATION

Page 20: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

WHAT’S NEW IN SOC DESIGN?

Larger and more complex chips with heavy use of pre-existing cores.

Heavy use of multi core processors and DSPs.

Complex Interconnect.

Shorter time to market and Smaller design teams.

… and software.

Leads to:

– Increased verification effort: Debugging is harder.

– Integration is more difficult.

– Need for scalable and high speed interconnect.

– SW / HW co-simulation is a major issue.

– Power-performance challenge.

– How do we treat the system software?

Page 21: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

VERIFICATION EFFORT

Debugging

– Seamless debug across H/W and software [especially SW]

Testbench Development:

– Several methodologies: VMM, OVM, UVM.

New developments [Unified strategy]

– UCIS, UVM TLM2.0

– Coverage trend: Address gaps in VHDL, SystemC coverage

Page 22: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

VERIFICATION EFFORT

Creating/Running Testcases:

– Directed & Random

– Run time improvement: Save-restore.

– Verification cycles per second instead of cycles per second: configuring the environment to dynamically select the relevant design/core.

Alternate options
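
The run-time benefit of save-restore listed on this slide can be estimated with simple arithmetic. The sketch below assumes every test in a regression shares a common initialization phase that can be simulated once, saved, and restored per test; the function name and all timing numbers are hypothetical.

```python
def regression_time(num_tests, init_s, test_s, restore_s=None):
    """Total regression time in seconds.

    Without save-restore, each test re-runs the common initialization.
    With save-restore, initialization runs once and each test pays only
    the snapshot restore cost plus its own unique portion.
    """
    if restore_s is None:
        return num_tests * (init_s + test_s)
    return init_s + num_tests * (restore_s + test_s)


# Hypothetical regression: 100 tests, 600 s common boot, 120 s unique part each.
baseline_s = regression_time(100, init_s=600, test_s=120)
with_restore_s = regression_time(100, init_s=600, test_s=120, restore_s=10)
```

With these assumed numbers the regression drops from 72000 s to 13600 s, and the saving grows with the share of each test that is spent in the common initialization.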

Page 23: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

VERIFICATION EFFORT: ALTERNATE OPTIONS

Emulation Focus Areas:

1. Tests/regressions with long run time

2. Corner-case bugs that may escape traditional verification

3. Replicating system-level scenarios

Ongoing Initiatives/Need:

1. Seamless support for assertions.

2. Improve portability between simulation & emulation.

3. Common model across TLM, HDL and emulation.

Page 24: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

IP INTEGRATION CHALLENGE

Integration of IP:

– Multiple IPs, various configurations, design languages

– IPs need to be in sync: macros, libraries.

– Complexity increases with mixed-language designs

Diagram: SystemC, SystemVerilog (SVLOG), Verilog, VHDL

Reasons for mixed-language designs:

– Unique Strengths of Languages

– Diversity of Design Teams

– Importing Existing IP

– Legacy Testbench Environment

Page 25: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

IP INTEGRATION CHALLENGE: COMPARISON OF CHOICES

Criterion             | Direct Instantiation | SV Bind Construct | SystemC Control/Observe | SCV-Connect() | SC-DPI
Source Code Available | Yes                  | Yes               | Yes                     | Yes           | Yes
One IP Compiled       | Yes                  | Yes               | Yes                     | Yes           | Yes
Both IP Compiled      | No                   | Yes               | No                      | No            | No
Performance           | ++++ (3)             | +++ (2)           | + (1)                   | + (1)         | +++++ (4)
Delta Delay           | Yes                  | Yes               | No                      | No            | No
Languages Supported   | SV, SC, VHDL         | SV, SC, VHDL      | SC + SV/VHDL            | SC + SV/VHDL  | SC + SV

Gap: No standardized automated methodology for integration.

Recommended Approach:

• Understand IP blocks: language, source code availability.

• Understand connection: 1-1, distributed, method port

• Option for optimized solution to quickly build a system

Page 26: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

10.10.2011

IP INTEGRATION CHALLENGE: GAPS WITH ANALOG IP INTEGRATION IN SOC

Table 1. Gaps with Analog IP Integration in SoC (each gap is followed by its root causes)

Testchip setup

- Testchip scenario is different

- Tester used for testchip differs

Inbuilt debug

- Incomplete inbuilt SoC test/debug capability or derisk option for basic functionality such as PLL clock

IP I/F verification

- Incomplete test setup

Review process

- No common detailed review process between IP and SoC team. Incorrect assumption based on past analog IP working silicon

IP Modelling

- Mismatch in version between IP simulation model and SPICE netlist

- Limitations of behavioral model to replicate actual analog IP functionality

- Timing issue

- DFT issue

EDA tools

- Gaps in analog and digital simulation environment

Page 27: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

IP INTEGRATION CHALLENGE

Verification Environment Bring-up

– Automated assertions for early checks.

– Review forces, tie-offs and relevant checkers from IP to SoC.

– Bottleneck for the SoC team to get started with verification: option to use a fake model for initial bring-up. Usage of system model.

– Super Block concept: pre-verified IP blocks at similar frequency & interface.

Requirement (diagram): hookup of IP Block1 and IP Block2 using ICU, with minimum manual effort and no bugs.

Current solution: in-house methodology and process. No clear solution from EDA vendors.

Page 28: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

TLM, RTL CO-SIMULATION

Traditional use of system-level models: Architecture profiling & Performance Analysis

Increasing demand for co-simulation: Tradeoff between Accuracy and Performance.

Open Challenges

Different levels of abstraction.

Need for improvement in integration methodology and testbench development

Seamless debug and coverage methodology.

Using System Level model for HDL generation

Legacy system model not written with conversion in mind.

Current limitation: Incomplete translation.

Lack of reliable Equivalence Check tool.

Need: Merge top-down (SystemC) and bottom-up (SystemVerilog) methodology/flow.

Gaps/Work to do: How to do power analysis
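
The accuracy-versus-performance tradeoff between abstraction levels can be illustrated with two toy models of the same memory. This is a minimal Python sketch, not SystemC TLM or RTL: `TlmMemory` completes each transaction in one call with no notion of time, while `CycleMemory` advances a clock for an assumed fixed latency per access. Class names, the latency value and the traffic pattern are all hypothetical.

```python
class TlmMemory:
    """Transaction-level model: one function call per read or write."""

    def __init__(self):
        self.storage = {}

    def transport(self, command, address, data=None):
        if command == "write":
            self.storage[address] = data
            return None
        return self.storage.get(address, 0)


class CycleMemory:
    """Cycle-stepped model: every access consumes a fixed number of ticks."""

    def __init__(self, latency_cycles=3):
        self.storage = {}
        self.latency_cycles = latency_cycles
        self.cycles = 0

    def tick(self):
        self.cycles += 1

    def access(self, command, address, data=None):
        for _ in range(self.latency_cycles):
            self.tick()                      # model the access latency
        if command == "write":
            self.storage[address] = data
            return None
        return self.storage.get(address, 0)


def run_traffic(access_fn):
    """Apply the same stimulus to either model and collect the read data."""
    access_fn("write", 0x10, 0xAB)
    access_fn("write", 0x14, 0xCD)
    return [access_fn("read", 0x10), access_fn("read", 0x14), access_fn("read", 0x18)]


tlm = TlmMemory()
cycle_model = CycleMemory(latency_cycles=3)
tlm_result = run_traffic(tlm.transport)
cycle_result = run_traffic(cycle_model.access)
```

Both models return the same data, so the functional behavior can be cross-checked, but only the cycle-stepped model yields timing information, and it does more work per access to get it. Driving both from one stimulus function also hints at why a common testbench across abstraction levels is listed as an open challenge.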

Page 29: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

THANK YOU!

Page 30: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

REFERENCES

[1] Wilson Research Group-MGC study blog 2011.

[2] AMD Coolchip2011 presentation. Denis Foley, AMD Sr. Fellow.

[3] Fusion Processors and HPC-2011, Chuck Moore, AMD Corporate Fellow & Technology Group CTO

[4] AMD Fusion Developer Summit 2011. Phil Rogers, AMD Corporate Fellow

[5] Fully Asynchronous framework for GALS network on chip. Friedman H

[6] Future of EE, NoC’s presentation. Dr. Srinivasan Murali

[7] Analog IP integration in SoC, IP reuse’09. Mixed language IP integration, DVCon 2010. Extending Functional coverage to SystemC, VHDL-IP’10. Pankaj S

Page 31: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

GLOSSARY

GPU: Graphics Processing Unit

APU: Accelerated Processing Unit

OpenCL: Open Computing Language

TDP: Thermal Design Power, a measure of a design infrastructure’s ability to cool a device

NoC: Network On Chip

TLM: Transaction Level Modeling

Turbo Core: AMD boost mechanism

QoS: Quality of Service

UVM: Universal Verification Methodology

UCIS: Unified Coverage Interoperability Standard

Page 32: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

BACKUP

Page 33: FUSION APU & TRENDS/ CHALLENGES IN FUTURE SoC DESIGN

Disclaimer

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.

The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

AMD makes no representations or warranties with respect to the contents hereof and assumes no responsibility for any inaccuracies, errors or omissions that appear in this information.

AMD specifically disclaims any implied warranties of merchantability or fitness for any particular purpose. In no event will AMD be liable to any person for any direct, indirect, special or other consequential damages arising from the use of any information contained herein, even if AMD is expressly advised of the possibility of such damages.

Trademark Attribution

AMD, the AMD Arrow logo, AMD Athlon, AMD Phenom, AMD Turion, AMD Radeon, and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. Microsoft, Windows and DirectX are registered trademarks of Microsoft Corporation in the United States and/or other jurisdictions. PCIe is a registered trademark of PCI-SIG. Other names used in this presentation are for identification purposes only and may be trademarks of their respective owners.

©2011 Advanced Micro Devices, Inc. All rights reserved.