Throughput-Oriented Architectures


Upload: nomy059

Post on 05-Dec-2014


Category:

Engineering



DESCRIPTION

Computer architecture article on throughput-oriented architectures.

TRANSCRIPT

Page 1: Throughput-oriented architectures

1

Throughput Oriented Architectures


Contents

• Throughput-oriented processors
• Hardware multithreading
• Many simple processing units
• SIMD execution
• GPUs
• NVIDIA GPU architecture
• Throughput-oriented programming
• Conclusion


Key Points:

• Throughput-oriented processors tackle problems where parallelism is abundant.

• Because of their design, programming throughput-oriented processors requires much more emphasis on parallelism and scalability than programming sequential processors.

• GPUs are the leading exemplars of modern throughput-oriented architectures.


Throughput-Oriented Architectures:

• Throughput and latency are two fundamental measures of processor performance: latency is the time taken to complete a single task, and throughput is the amount of work completed per unit of time.

• Traditional scalar microprocessors are latency-oriented architectures.

• Throughput-oriented processors are designed on the assumption that their workloads offer abundant parallelism.

• Throughput-oriented architectures rely on three key architectural features:

1. Emphasis on many simple processing cores
2. Extensive hardware multithreading
3. SIMD execution
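
The difference between the two measures can be illustrated with a toy model. The sketch below is not from the slides: the function `batch_time` and the core counts and per-task times are hypothetical, chosen only to show that many slow cores can finish a large batch of independent tasks sooner than one fast core, even though each individual task takes longer.

```python
import math


def batch_time(num_tasks, num_cores, time_per_task):
    """Time to finish num_tasks independent, equal-sized tasks.

    Each core runs one task at a time, so the batch needs
    ceil(num_tasks / num_cores) rounds of time_per_task each.
    """
    rounds = math.ceil(num_tasks / num_cores)
    return rounds * time_per_task


# Hypothetical machines (illustrative numbers only, not measurements).
latency_oriented = {"num_cores": 1, "time_per_task": 1.0}       # one fast core
throughput_oriented = {"num_cores": 32, "time_per_task": 4.0}   # many slow cores

for n in (1, 1024):
    fast = batch_time(n, **latency_oriented)
    wide = batch_time(n, **throughput_oriented)
    print(f"{n} tasks: one fast core {fast}, many slow cores {wide}")
```

Under these assumed numbers the fast core wins for a single task, while the many-core design wins by a wide margin once there are enough tasks to keep every core busy.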


Hardware Multithreading:

• A computation in which parallelism is abundant can be decomposed into a collection of concurrent sequential tasks that can execute in parallel across many threads.

• A thread executes the instruction stream corresponding to a single sequential task.

• Multithreading, whether in hardware or software, provides a way of tolerating latency: while one thread waits, another can run.

• Hardware multi-threading as a design strategy for improving aggregate performance on parallel workloads has a long history.
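
How multithreading tolerates latency can be sketched with a simple utilization model. This is an illustrative assumption, not something stated in the slides: each thread is taken to alternate a fixed number of compute cycles with a fixed number of stall cycles, and the core is assumed to switch between ready threads at no cost. The name `core_utilization` and the cycle counts are hypothetical.

```python
def core_utilization(num_threads, compute_cycles, stall_cycles):
    """Fraction of cycles one core spends doing useful work.

    Model (an assumption): every thread repeats compute_cycles of work
    followed by stall_cycles of waiting on memory, and the core can
    switch to another ready thread for free while one is stalled.
    """
    period = compute_cycles + stall_cycles
    return min(1.0, num_threads * compute_cycles / period)


# Hypothetical workload: 10 cycles of work per 290-cycle memory stall.
for threads in (1, 4, 16, 32):
    busy = core_utilization(threads, compute_cycles=10, stall_cycles=290)
    print(f"{threads:2d} threads -> core busy {busy:.0%} of the time")
```

In this model a single thread leaves the core idle most of the time, and utilization grows linearly with the thread count until the stalls are fully covered.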


Hardware Multithreading:

• The Tera, Sun Niagara, and NVIDIA GPU designs use multithreading to achieve high throughput.

• Simultaneous multithreading is used to improve the efficiency of superscalar sequential processors.

• HEP, Tera, and NVIDIA G20 show the characteristics of throughput-oriented processors.


Many simple processing units:

• High transistor density makes it feasible to place many simple processing units on a single chip.

• Throughput-oriented architectures achieve higher performance by using many simple processing units.

• Each unit executes instructions in the order they appear in the program.

• The savings in chip area allow more parallel processing units, which gives higher throughput on parallel workloads.


SIMD execution:

• Parallel processors often use some form of SIMD (single instruction, multiple data) execution to improve aggregate throughput.

• The two basic categories of SIMD machines are SIMD processor arrays and vector processors.

• SIMD processor arrays consist of many processing units and a single control unit.

• Vector processors augment a traditional scalar instruction set with vector instructions that operate on data vectors of fixed width.
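
The idea of vector instructions operating on fixed-width data vectors can be sketched in software. The example below is only an illustration under assumptions: `VECTOR_WIDTH`, `vector_add`, and `add_arrays` are hypothetical names, the width of 4 is arbitrary, and a plain Python function stands in for a hardware vector instruction.

```python
VECTOR_WIDTH = 4  # hypothetical fixed vector width, in elements


def vector_add(a, b):
    """Stand-in for one vector instruction: adds up to VECTOR_WIDTH pairs."""
    assert len(a) == len(b) <= VECTOR_WIDTH
    return [x + y for x, y in zip(a, b)]


def add_arrays(a, b):
    """Add two equal-length arrays by issuing one vector_add per chunk.

    Returns the element-wise sum and the number of vector instructions
    issued, which is ceil(len(a) / VECTOR_WIDTH).
    """
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    result = []
    instructions = 0
    for start in range(0, len(a), VECTOR_WIDTH):
        end = start + VECTOR_WIDTH
        result.extend(vector_add(a[start:end], b[start:end]))
        instructions += 1
    return result, instructions


total, issued = add_arrays(list(range(10)), [1] * 10)
print(total, "computed with", issued, "vector instructions")
```

A scalar loop would issue one add per element; here each instruction covers up to `VECTOR_WIDTH` elements, with the final chunk only partly filled.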


GPU:

• GPUs are similar to a computer's CPU. A GPU, however, is designed specifically for performing the complex mathematical and geometric calculations that are necessary for graphics rendering.


CPU and GPU:

• Difference between a CPU and a GPU:

• A CPU consists of a few cores optimized for sequential serial processing.

• A GPU consists of thousands of smaller, more efficient cores designed for handling multiple tasks concurrently.


CPU and GPU:


NVIDIA Fermi Graphics Processing Unit:

• Floating-point performance: 1000 GFLOPS

• On-chip scratchpad: 48 KB per SM

• Off-chip memory bandwidth: 100 GB/s
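
The Fermi figures quoted here imply a ratio between compute rate and memory bandwidth, which can be worked out directly. The sketch below uses the two numbers from the slide; the helper name `machine_balance` is hypothetical, and the 4-byte operand size is an assumption (single precision) that the slides do not state.

```python
def machine_balance(peak_gflops, bandwidth_gb_per_s):
    """Floating-point operations available per byte moved from off-chip memory."""
    return peak_gflops / bandwidth_gb_per_s


# Figures quoted on the slide for NVIDIA Fermi.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GB_PER_S = 100.0

# Assumption (not from the slides): 4-byte single-precision operands.
BYTES_PER_FLOAT = 4

balance = machine_balance(PEAK_GFLOPS, BANDWIDTH_GB_PER_S)
print("FLOPs per byte of off-chip traffic:", balance)
print("FLOPs per float loaded:", balance * BYTES_PER_FLOAT)
```

Under these numbers, a computation that performs fewer operations per byte fetched than this ratio would be limited by memory bandwidth rather than by arithmetic, which is one reason on-chip scratchpad memory matters.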


NVIDIA vs. Intel:


Performance per watt:


Microarchitecture of GPU


Reduction tree:
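
A reduction tree combines values pairwise, level by level, so that the combinations within one level are independent of each other. The sketch below is a sequential Python illustration of that pattern under assumptions, not the implementation shown on the original slide: `tree_reduce` is a hypothetical name, and the operator is assumed to be associative.

```python
import operator


def tree_reduce(values, op):
    """Reduce values with an associative op, one tree level per step.

    Returns (result, steps). Every pair combined within a step is
    independent of the others, so parallel hardware could evaluate a
    whole step at once; this sketch simply evaluates them in a loop.
    """
    data = list(values)
    if not data:
        raise ValueError("cannot reduce an empty sequence")
    steps = 0
    while len(data) > 1:
        # Combine neighbours (0,1), (2,3), ... and carry an odd leftover up.
        paired = [op(data[i], data[i + 1]) for i in range(0, len(data) - 1, 2)]
        if len(data) % 2:
            paired.append(data[-1])
        data = paired
        steps += 1
    return data[0], steps


result, steps = tree_reduce(range(1, 9), operator.add)
print("sum =", result, "in", steps, "steps")
```

For n inputs the loop runs about log2(n) times, compared with n - 1 dependent steps for a left-to-right sequential sum.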


Conclusion

• Throughput-oriented processors assume that parallelism is abundant rather than scarce, and their target is maximizing the total throughput of all tasks rather than minimizing the latency of a single task.

• A fully general-purpose chip cannot afford to aggressively trade single-thread performance for increased total performance.