Multi-core Processing The Past and The Future Amir Moghimi, ASIC Course, UT ECE

Post on 24-Dec-2015

The Past

• Instruction Level Parallelism (ILP) enhanced processors

• Wide Dynamic Execution [1] with techniques such as:
  • speculative execution (using branch prediction)
  • out-of-order execution (using register renaming and reservation stations)
  • superscalar execution (using multiple-issue instruction cache and reorder buffer)

• e.g. Intel P6 micro-architecture
  • Used in: Pentium® Pro, Pentium® II, and Pentium® III processors

ILP Limitations

• Window size limitation [2] due to:
  • 2450 comparisons for register dependency detection among 50 instructions in one clock cycle!
  • A branch instruction every 5 instructions on average

• Imperfect branch prediction

• Serial nature of the application with true data dependencies

• So, how to use this huge amount of silicon coming every year and a half?
  • Use multiple cores on a single die
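The 2450 figure for the window-size limitation can be reproduced with a short count. This is a sketch under a simplifying assumption: every instruction is register-register with two source registers, and each source is compared against the destination of every earlier instruction in the window, which gives n² - n comparisons for n instructions. The function name is illustrative only.

```python
def dependency_comparisons(window_size: int, sources_per_instr: int = 2) -> int:
    """Count register comparisons needed to check a window for dependences.

    Assumption: instruction i (0-based) compares each of its source
    registers against the destination of each of the i earlier instructions.
    """
    return sum(sources_per_instr * i for i in range(window_size))


if __name__ == "__main__":
    # The count grows quadratically with the window size.
    for n in (4, 50, 2000):
        print(n, dependency_comparisons(n))
```

For a window of 50 instructions this yields 2450, and the quadratic growth is why simply widening the window does not scale.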

Multi-core Basics

• A multi-core chip combines two or more independent processing cores on a single die (also known as a Chip Multi-Processor)

• Four main questions arise [3]:
  • How is the application developed?
  • How do the cores share data?
  • How do they physically communicate?
  • How scalable is the architecture?

Given Answers

• For parallel application development, use the thread concept formerly proposed for discrete multi-processor systems

Category              Choice                  # of Proc
Communication model   Message passing         8 to 2048
                      Shared address, NUMA    8 to 256
                      Shared address, UMA     2 to 64
Physical connection   Network                 8 to 256
                      Bus                     2 to 36

[3]
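The two communication models in the table can be contrasted with a small thread-based sketch. It is illustrative only: the function names are made up for this example, and both versions run inside one process, so the message-passing variant only mimics the style (workers exchange values through a queue instead of writing a shared variable). In CPython the global interpreter lock generally keeps these threads from executing in parallel, so the sketch shows the programming model rather than a speedup.

```python
import queue
import threading


def _run_all(worker, chunks):
    """Start one thread per chunk and wait for all of them to finish."""
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def sum_shared(data, num_threads):
    """Shared-address style: threads update one variable guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # serialize the update to the shared variable
            total += partial

    _run_all(worker, [data[i::num_threads] for i in range(num_threads)])
    return total


def sum_message_passing(data, num_threads):
    """Message-passing style: workers send partial sums, nothing is shared."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))  # send the result as a message

    _run_all(worker, [data[i::num_threads] for i in range(num_threads)])
    # Receive exactly one message per worker and combine them.
    return sum(mailbox.get() for _ in range(num_threads))


if __name__ == "__main__":
    numbers = list(range(1000))
    print(sum_shared(numbers, 4), sum_message_passing(numbers, 4))
```

In the shared-address version the lock is what keeps concurrent updates correct; in the message-passing version correctness comes from each worker owning its data and communicating only through the queue.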

Chip Level Multi-threading

• Implemented in superscalar processors before the introduction of multi-core chips

• Multi-threading methods:
  • Fine-grained
  • Coarse-grained
  • Simultaneous MT (SMT)

• e.g. Intel Hyper-Threading Technology
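The difference between fine-grained and coarse-grained multi-threading can be sketched with a toy single-issue scheduler. Everything here is an assumption made for illustration: each thread is a list of stall lengths (the number of extra cycles the thread must wait after issuing that instruction), thread switches cost nothing, and SMT is not modeled because it needs several issue slots per cycle.

```python
def schedule(streams, policy):
    """Return a per-cycle timeline of which thread issues ('-' means idle).

    streams: dict mapping a thread name to a list of stall lengths.
    policy:  "fine"   -> rotate to the next thread every cycle
             "coarse" -> stay on a thread until it hits a stall
    """
    if policy not in ("fine", "coarse"):
        raise ValueError("policy must be 'fine' or 'coarse'")
    names = sorted(streams)
    pending = {n: list(streams[n]) for n in names}
    ready_at = {n: 0 for n in names}  # first cycle each thread may issue
    timeline = []
    current = 0  # index of the thread that has priority this cycle
    cycle = 0
    while any(pending.values()):
        # Search for a runnable thread, starting with the priority thread.
        order = names[current:] + names[:current]
        chosen = next(
            (n for n in order if pending[n] and ready_at[n] <= cycle), None
        )
        if chosen is None:
            timeline.append("-")  # every remaining thread is stalled
        else:
            stall = pending[chosen].pop(0)
            ready_at[chosen] = cycle + 1 + stall
            timeline.append(chosen)
            idx = names.index(chosen)
            if policy == "fine" or stall:
                current = (idx + 1) % len(names)  # hand priority onward
            else:
                current = idx  # coarse-grained: keep running this thread
        cycle += 1
    return "".join(timeline)


if __name__ == "__main__":
    demo = {"A": [0, 2, 0], "B": [0, 0, 0]}
    print("fine:  ", schedule(demo, "fine"))
    print("coarse:", schedule(demo, "coarse"))
```

With the demo input both policies finish in six cycles because thread B fills the slots that thread A's stall would otherwise leave empty; thread A alone would need five cycles and thread B three.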

4-way Threading Processor [3]

[Figure: issue slots versus time for Thread A, Thread B, Thread C, and Thread D; panels: Coarse MT, Fine MT, SMT]

Now Multi-core Processing

• A simple look at a multi-core processor (IBM Xenon, used in the Microsoft Xbox 360)

• Simple but effective

[Figure: Core 0, Core 1, and Core 2, each with its own L1D and L1I caches, sharing a 1MB UL2]

[4]

A More Powerful Design

• STI (Sony, Toshiba, IBM) Cell (used in the PlayStation 3)

[8]

A Comparison

• Sun UltraSPARC T1

[5]

[Figure: UltraSPARC T1 block diagram: eight 4-way MT SPARC pipes connected through a crossbar to a 4-way banked L2, memory controllers, and shared I/O functions]

[3]

UltraSPARC T1 vs. Pentium EE

[5]

UltraSPARC T1 vs. Pentium EE

Performance Comparison running SPEC JBB 2000, TPC-C, TPC-W, and XML Test as server benchmarks and SPEC CPU2000 as the serial benchmark [5]

Pentium Extreme Edition Die Photo [5]

Now the Trend

• Intel will deliver a quad-core (4 full execution cores) processor in the first quarter of 2007 [1]

• “We forecast that more than 85 percent of our server processors and more than 70 percent of our mobile and desktop Pentium® family processor shipments will be multi-core–based by the end of 2006” [7]

• Intel plans to have 32 cores on a die by 2015 [7]

• But do not forget the high power density and memory bandwidth issues!

Thanks

• Any questions?

References

1. http://www.intel.com/technology/architecture/coremicro/index.htm

2. John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach 2nd Edition. Morgan Kaufmann, 1999.

3. PSU CSE 431, Mary Jane Irwin, Computer Architecture, Fall 2005, Lecture 28.

4. http://www-128.ibm.com/developerworks/power/library/pa-fpfxbox/?ca=dgr-lnxw09XBoxDesign

5. http://www.dns-gmbh.de/dnsgmbh/unternehmen/event-kalender/23b3b3ee6c4e487e6f4205fa03e783bc.0.0/Niagara_CMT.pdf

6. James Laudon: Performance/Watt: the new server focus. SIGARCH Computer Architecture News 33(4): 5-13 (2005)

7. http://www.intel.com/technology/computing/multi-core/index.htm

8. http://www.pcstats.com/articleimages/200502