Lecture 1: Parallel Computing Architectures
ece.uprm.edu/~wrivera/ICOM6025/Lecture1.pdf

Dr. Wilson Rivera
ICOM 6025: High Performance Computing
Electrical and Computer Engineering Department
University of Puerto Rico


Page 1:

Dr. Wilson Rivera

ICOM 6025: High Performance Computing
Electrical and Computer Engineering Department

University of Puerto Rico

Lecture 1: Parallel Computing Architectures

Page 2:

Outline

• Goal: Understand the fundamental concepts of parallel computing
  – HPC challenges
  – Flynn's Taxonomy
  – Memory Access Models
  – Multi-core Processors
  – Graphics Processor Units
  – Cluster Infrastructures
  – Cloud Infrastructures

Page 3:

Optimization of plasma heating systems for fusion experiments

Physics of high-temperature superconducting cuprates

Global simulation of CO2 dynamics

HPC Challenges

Fundamental instability of supernova shocks

Protein structure and function for cellulose-to-ethanol conversion

Next-generation combustion devices burning alternative fuels

Slide source: Thomas Zaharia

Page 4:

HPC Challenges

[Figure: Airbus CFD roadmap, 1980–2030. Available computational capacity grows from 1 Giga (10^9) through 1 Tera (10^12), 1 Peta (10^15), and 1 Exa (10^18) toward 1 Zetta (10^21) Flop/s, while capacity (number of overnight load cases run) grows from 10^2 to 10^6. Milestones: CFD-based LOADS & HQ; RANS low speed; RANS high speed; HS design data set; unsteady RANS; aero optimisation & CFD-CSM; full MDO; CFD-based noise simulation; LES; real-time CFD-based in-flight simulation. "Smart" use of HPC power: algorithms, data mining, knowledge. Capability achieved during one-night batch. Courtesy AIRBUS France.]

Page 5:

High Resolution Climate Modeling on NERSC-3 – P. Duffy, et al., LLNL

HPC Challenges

Page 6:

HPC Challenges


https://computation.llnl.gov/casc/projects/.../climate_2007F.pdf

Page 7:

Flynn's Taxonomy

[Figure: 2×2 matrix classifying architectures by number of instruction streams and data streams: SISD, SIMD, MISD, MIMD.]

Page 8:

Flynn's Taxonomy

• Single Instruction, Multiple Data (SIMD)
  – All processing units execute the same instruction at any given clock cycle
  – Best suited for problems with a high degree of regularity
    • Image processing
  – Good examples:
    • SSE (Streaming SIMD Extensions), SSE2
    • Intel MIC (Xeon Phi)
    • Graphics Processing Units (GPUs)
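As a purely conceptual illustration, the SIMD lockstep model can be sketched in plain Python. The `simd_apply` helper is hypothetical, not a real SIMD intrinsic; real vector hardware applies the instruction across all lanes in a single cycle.

```python
def simd_apply(instruction, lanes):
    """One instruction, many data: every lane executes the same op this 'cycle'."""
    return [instruction(x) for x in lanes]

# Brighten four pixels at once, the way a regular image filter would:
pixels = [10, 20, 30, 40]
brightened = simd_apply(lambda p: min(p + 50, 255), pixels)
print(brightened)  # [60, 70, 80, 90]
```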

Page 9:

Flynn's Taxonomy

• Multiple Instruction, Multiple Data (MIMD)
  – Every processing unit may be executing a different instruction stream and working with a different data stream
    • Clusters and multicore computers
    • In practice, MIMD architectures may also include SIMD execution sub-components
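A minimal MIMD-flavored sketch, assuming Python threads stand in for processing units: each thread runs a different instruction stream on a different data stream.

```python
import threading

results = {}

def sum_task(data):   # one instruction stream
    results["sum"] = sum(data)

def max_task(data):   # a different instruction stream
    results["max"] = max(data)

# Different instructions on different data, running concurrently (MIMD):
threads = [
    threading.Thread(target=sum_task, args=([1, 2, 3],)),
    threading.Thread(target=max_task, args=([7, 4, 9],)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results["sum"], results["max"])  # 6 9
```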

Page 10:

Memory Access Models

• Shared memory
• Distributed memory
• Hybrid distributed-shared memory

Page 11:

Shared Memory

[Figure: several CPUs, each with an L2 cache, connected through a bus interconnect to a single shared memory and I/O.]

Page 12:

Shared Memory

• Multiple processors can operate independently but share the same memory resources
  – Changes in a memory location effected by one processor are visible to all other processors
• Two main classes based upon memory access times:
  – Uniform Memory Access (UMA)
    • Symmetric Multi-Processors (SMPs)
  – Non-Uniform Memory Access (NUMA)
• Main disadvantage is the lack of scalability between memory and CPUs
  – Adding more CPUs geometrically increases traffic on the shared memory/CPU path
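The visibility of writes, and the coordination that shared access requires, can be sketched with ordinary Python threads sharing one variable. This is illustrative only; CPython threads model the programming interface, not true parallel hardware.

```python
import threading

counter = 0                  # a shared memory location
lock = threading.Lock()      # serializes access to the shared path

def work(n):
    global counter
    for _ in range(n):
        with lock:           # without the lock, increments can be lost
            counter += 1

threads = [threading.Thread(target=work, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000: every thread observed and updated the same memory
```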

Page 13:

Shared Memory

• The memory hierarchy tries to exploit locality
  – Cache hit: in-cache memory access (cheap)
  – Cache miss: non-cache memory access (expensive)
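The hit/miss distinction can be made concrete with a tiny, hypothetical LRU cache model; no real cache's policy or geometry is implied.

```python
from collections import OrderedDict

class LRUCache:
    """Toy cache model that counts cheap hits vs. expensive misses."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = self.misses = 0

    def access(self, addr):
        if addr in self.lines:
            self.hits += 1
            self.lines.move_to_end(addr)        # mark most recently used
        else:
            self.misses += 1
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)  # evict least recently used
            self.lines[addr] = True

cache = LRUCache(capacity=4)
# Good locality: the working set fits in the cache, so repeats all hit.
for addr in [0, 1, 2, 3, 0, 1, 2, 3]:
    cache.access(addr)
print(cache.hits, cache.misses)  # 4 4
```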

Page 14:

Distributed Memory

[Figure: several nodes, each with its own CPU, L2 cache, and local memory (M), connected through a network with I/O.]

Page 15:

Distributed Memory

• Processors have their own local memory
• When a processor needs access to data in another processor, it is usually the task of the programmer to explicitly define how and when data is communicated
• Examples: Cray XT4, clusters, cloud

Page 16:

Hybrid (Distributed-Shared) Memory

[Figure: several shared-memory nodes connected through a network.]

In practice we have hybrid memory access.

Page 17:

Parallel computing trends

• Multi-core processors
  – Instead of building processors with faster clock speeds, modern computer systems are being built using chips with an increasing number of processor cores
• Graphics Processor Units (GPUs)
  – General-purpose computing, in particular data-parallel high performance computing
• Dynamic approach to cluster computing provisioning
  – Instead of offering a fixed software environment, the application provides information to the scheduler about what type of resources it needs, and the nodes are automatically provisioned for the user at run time
    • Platform ISF Adaptive Cluster
    • Moab Adaptive Operating Environment
• Large-scale commodity computer data centers (cloud)
  – Amazon EC2, Eucalyptus, Google App Engine

Page 18:

Multi-cores and Moore’s Law

Circuit complexity doubles every 18 months

[Figure: Moore's law trend, with clock-frequency growth halted at the power wall (2004). Source: The National Academies Press, Washington, DC, 2011; Source: Intel.]

Page 19:

Power Wall

• The transition to multi-core processors is not a breakthrough in architecture; it is actually a result of the need to build power-efficient chips

Page 20:

Power Density Limits Serial Performance

Page 21:

Many-cores (Graphics Processor Units)

• Graphics Processor Units (GPUs)
  – Throughput-oriented devices designed to provide high aggregate performance for independent computations
    • Prioritize high-throughput processing of many parallel operations over low-latency execution of a single task
  – GPUs do not use independent instruction decoders
    • Instead, groups of processing units share an instruction decoder; this maximizes the number of arithmetic units per die area

Page 22:

Multi-Core vs. Many-Core

• Multi-core processors (minimize latency)
  – MIMD
  – Each core optimized for executing a single thread
  – Lots of big on-chip caches
  – Extremely sophisticated control
• Many-core processors (maximize throughput)
  – SIMD
  – Cores optimized for aggregate throughput
  – Lots of ALUs
  – Simpler control

Page 23:

CPUs: Latency-Oriented Design

• Large caches
  – Convert long-latency memory accesses to short-latency cache accesses
• Sophisticated control
  – Branch prediction for reduced branch latency
  – Data forwarding for reduced data latency
• Powerful ALU
  – Reduced operation latency

[Figure: CPU block diagram with a large cache, control logic, a few powerful ALUs, and DRAM.]

© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007–2012, SSL 2014, ECE408/CS483, University of Illinois, Urbana-Champaign

Page 24:

GPUs: Throughput-Oriented Design

• Small caches
  – To boost memory throughput
• Simple control
  – No branch prediction
  – No data forwarding
• Energy-efficient ALUs
  – Many, long-latency but heavily pipelined for high throughput
• Require a massive number of threads to tolerate latencies

[Figure: GPU block diagram with many simple cores, small caches, and DRAM.]

© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007–2012, SSL 2014, ECE408/CS483, University of Illinois, Urbana-Champaign
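Why GPUs need massive thread counts can be seen with back-of-the-envelope arithmetic. The latency and issue-width numbers below are assumed for illustration, not taken from any particular GPU.

```python
# If a memory access takes `memory_latency_cycles` cycles and the hardware can
# issue `ops_per_cycle` operations each cycle, then roughly
# latency x issue-width independent threads are needed so useful work is
# always available while loads are in flight.

memory_latency_cycles = 400   # assumed DRAM latency
ops_per_cycle = 32            # assumed issue width of one multiprocessor

threads_needed = memory_latency_cycles * ops_per_cycle
print(threads_needed)  # 12800
```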

Page 25:

Multi-Core vs. Many-Core

[Figure: peak GFLOP/s from 2002 to 2009 for NVIDIA GPUs (NV30, NV40, G70, G80, GT200, T12) versus Intel CPUs (3 GHz dual-core Pentium 4, 3 GHz Core2 Duo, 3 GHz quad-core Xeon, Westmere); GPU peak performance climbs past 1200 GFLOP/s while CPU peak performance remains far below.]

Page 26:

Intel® Xeon® Processor E7-8894 v4

• 24 cores
• 48 threads
• 2.40 GHz
• 14 nm
• 60 MB cache
• $8k (July 2017)

Page 27:

NVIDIA TITAN Xp

• 3840 cores
• 1.6 GHz
• Pascal architecture
• Peak = 12 TF/s
• $1.5k

Page 28:

Cluster Hardware Configuration

[Figure: a head node with local and external storage, connected through a switch to nodes 1 through n.]


© Wilson Rivera

Page 29:

Cluster Head Node

• Head node
  – Network interface cards (NICs): one connecting to the public network and the other connecting to the internal cluster network
  – Local storage is attached to the head node for administrative purposes such as accounting management and maintenance services

Page 30:

Cluster Interconnection Network

• The interconnection of the cluster depends upon both application and budget constraints
  – Small clusters typically have PC-based nodes connected through a Gigabit Ethernet network
  – Large-scale production clusters may be made of 1U or 2U servers or blade servers connected through either:
    • A Gigabit Ethernet network (server farm), or
    • A high performance computing network (high performance computing cluster):
      – InfiniBand
      – Quadrics
      – Myrinet
      – Omni-Path (Intel)

Page 31:

Cluster Storage

• Storage Area Network (SAN)
  – Storage devices appear as locally attached to the operating system
• Network Attached Storage (NAS)
  – Distributed file-based protocols:
    • Parallel Virtual File System (PVFS)
    • General Parallel File System (GPFS)
    • Hadoop Distributed File System (HDFS)
    • Lustre
    • CernVM-FS

Page 32:

Cluster Software

[Figure: cluster software stack: operating system; cluster infrastructure services; cluster tools and libraries; cluster resource manager with scheduler, monitor, and analyzer; communication, compiler, and optimization layers.]


© Wilson Rivera

Page 33:

Top500.org

Page 34:

History of Performance

Source: Exascale Computing and Big Data

Page 35:

Projected Performance

[Figure: Top500 projected performance development on a log scale from 100 Mflop/s to 100 Pflop/s, showing the Sum, N=1, and N=500 curves over time.]

Page 36:

#1 TAIHULIGHT @ CHINA

• June 2017
• National Supercomputing Center in Wuxi
• SW26010 processors developed by NRCPC
• 40,960 nodes
• 10,649,600 cores
• Peak = 125 PF/s
• Rmax = 93 PF/s
• 15,371 kW

Page 37:

Cloud Computing

• Cloud computing allows scaling on demand without building or provisioning a data center
  – Computing resources available on demand (self-service)
  – Charging only for resources utilized (pay-as-you-go)
• Worldwide revenue from public IT cloud services exceeded $21.5 billion in 2010
  – It will reach $72.9 billion in 2015
  – Compound annual growth rate (CAGR) of 27.6%

http://www.idc.com/prodserv/idc_cloud.jsp

Page 38:

Cloud versus Grid

• Grids
  – Sharing and coordination of distributed resources
  – Grid middleware: Globus, UNICORE, gLite
• Clouds
  – Leverage virtualization to maximize resource utilization
  – Cloud middleware: IaaS, PaaS, SaaS

Page 39:

Layered cloud model

From: K. Chen, Wright University

Page 40:

Cloud Layers

• Infrastructure as a Service (IaaS)
  – Flexible in terms of the applications to be hosted
  – Amazon EC2, RackSpace, Nimbus, Eucalyptus
• Platform as a Service (PaaS)
  – Application domain-specific platforms
  – Google App Engine, MS Azure, Heroku
• Software as a Service (SaaS)
  – Service domain-specific
  – Salesforce, NetSuite

Page 41:

Cloud Economics

• Pay by use instead of provisioning for peak

[Figure: resources vs. time for a static data center, whose fixed capacity sits above a varying demand curve (leaving unused resources), and for a data center in the cloud, whose capacity tracks demand.]

From: K. Chen, Wright University

Page 42:

Cloud Economics

• Setup:
  – A peak period needs 10 servers to process requests
  – Assume your service is going to run for 1 year
• Private cluster (one-time investment):
  – Servers: $1,500 × 10 = $15,000
  – Power/AC costs about $200/year/server => $2,000
  – Administrator: $50,000
• Public cloud:
  – Rush hours: 10 hours/day, needing 10 nodes/hour
  – Other hours: 14 hours/day, needing 2 nodes/hour
  – Total: 128 node-hours × $0.10/node-hour = $12.80/day
  – One-year cost: $12.80 × 365 = $4,672
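The slide's comparison can be reproduced with simple arithmetic; all prices below are the slide's assumptions, not current market rates.

```python
# Private cluster, year one (one-time hardware plus yearly costs):
servers = 1_500 * 10         # 10 servers at $1,500 each
power_ac = 200 * 10          # $200/year/server for 10 servers
admin = 50_000               # administrator salary
cluster_year_one = servers + power_ac + admin

# Public cloud:
rush = 10 * 10               # 10 hours/day at 10 nodes
off_peak = 14 * 2            # 14 hours/day at 2 nodes
node_hours_per_day = rush + off_peak           # 128 node-hours
daily = node_hours_per_day * 0.10              # $12.80/day
cloud_year = round(daily * 365)                # $4,672/year

print(cluster_year_one, cloud_year)  # 67000 4672
```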

Page 43:

Cloud Economics

• Amazon EC2 pricing, Google App Engine pricing, Hadoop sizing
• How much to rent a supercomputer?
  – 8-core VM with 30 GB of RAM (3.75 GB per core) at $1.16/hour
  – 600,000 cores => 75,000 VMs
  – 75,000 VMs × $1.16/hour ≈ $87,000/hour, about $2 million per day
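Spelling out the rental arithmetic, with the prices as given on the slide:

```python
cores = 600_000
cores_per_vm = 8
price_per_vm_hour = 1.16     # slide's assumed VM price

vms = cores // cores_per_vm          # 75,000 VMs
hourly = vms * price_per_vm_hour     # about $87,000/hour
daily = hourly * 24                  # about $2.1 million/day

print(vms, round(hourly), round(daily))
```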

Page 44:

Data Analytics Ecosystem

Source: Exascale Computing and Big Data

Page 45:

Summary

• Parallel computing infrastructure trends
  – Multi-core processors
    • A result of the need to build power-efficient chips
  – Graphics Processor Units
    • Throughput-oriented devices designed to provide high aggregate performance for independent computations
  – Cluster infrastructures
    • Head node; interconnection; storage; software
  – Cloud infrastructures
    • Physical resources; virtual resources; infrastructure services; application services

Page 46:

Scientific Computing Terminology

• HPC System: a "High Performance Computing" (HPC) computer; computers connected through a high-speed interconnect and configured for scientific computing.
• Interconnect: the wiring, chips, and software that connect computing components.
• Node (blade, sled, etc.): an independent computing unit of an HPC system, with its own operating system (OS) and memory. The physical cases of a node are often called blades and sleds.
• Chassis: nodes are often aggregated into a chassis (with a backplane) to share electrical power, cooling, and a local interconnect.

Page 47:

Terminology (continued)

• Chip or Die: self-contained circuits on a single medium of size ~20 mm × 20 mm, containing up to ~1 billion transistors.
• Socket: provides a connection between a chip and a motherboard.
• CPU (or processor): a Central Processing Unit, consisting of a chip or die (often called a processor).
• Core: modern CPUs contain multiple cores. A core is an execution unit that can execute one code's instructions independently while other cores execute a different code's instructions.
• Hyper-Threading: a single core can have additional circuitry that allows two or more instruction streams (threads) to proceed through it "simultaneously". Hyper-Threading is an Intel trademark for 2 threads; the Xeon Phi coprocessor supports 4 threads.