reconfigurable versus fixed versus hybrid architectures
DESCRIPTION
Reconfigurable Versus Fixed Versus Hybrid Architectures. John K. Antonio Oklahoma Supercomputing Symposium 2008 Norman, Oklahoma October 6, 2008. Overview. The (past) world of reconfigurable computing The (past) world of multi-core - PowerPoint PPT PresentationTRANSCRIPT
Computer Science, University of Oklahoma
Reconfigurable Versus Fixed Versus Hybrid Architectures
John K. Antonio
Oklahoma Supercomputing Symposium 2008
Norman, Oklahoma October 6, 2008
Computer Science, University of Oklahoma 2
Overview
• The (past) world of reconfigurable computing
• The (past) world of multi-core
• The (emerging) world of reconfigurable multi-core architectures
• Illustrative analysis
• Conclusions
Computer Science, University of Oklahoma 3
Drivers for reconfigurable computing
• Near performance of custom ASIC
• Near cost of commodity processor
• More flexible than custom ASIC
• “Programming” tools improving steadily
• Often used in embedded applications having high computational throughput requirements and strict SWAP constraints
SAR processing on a UAV
“Predator”
Jeffrey T. Muehring, “Optimal Configuration of a Parallel Embedded System for Synthetic Aperture Radar Processing,” MS Thesis, Texas Tech University, Dec. 1997.
5
A prototype hybrid system
PCI MOTHER BOARD
PENTIUM DRAM
WIL
D
ON
E
SR
AM
SR
AM
AN
NA
PO
LIS
WIL
DF
OR
CE
AN
NA
PO
LIS
WIL
DF
OR
CE
AN
NA
PO
LIS
WIL
DF
OR
CE
AN
NA
PO
LIS
WIL
DF
OR
CE
HARD DISK DRIVE
PCI MOTHER BOARD
PENTIUM DRAM
WIL
D
ON
E
SR
AM
SR
AM
AN
NA
PO
LIS
WIL
DF
OR
CE
AN
NA
PO
LIS
WIL
DF
OR
CE
AN
NA
PO
LIS
WIL
DF
OR
CE
AN
NA
PO
LIS
WIL
DF
OR
CE
HARD DISK DRIVE
CN
CN C
N
CN
CN C
N CN
CN
CN
CN
CN C
N CN
CN
DR
AM
AS
IC
RIN
-T
VME BACKPLANE
ME
RC
UR
Y R
AC
E M
CH
9U
BO
AR
D
HA
RD
DIS
K D
RIV
E
DR
AM
SP
AR
C
RO
UT
-T FO
RC
E S
PA
RC
5V
CN
Data SourcePC
AnnapolisFPGA
Subsystem(F)
Custom Interface
Cables
SPARC
MercuryDSP/GPP
Subsystem
Data SinkPC
AnnapolisFPGA
Subsystem(B)
Data SinkPC
Data SourcePC
Custom Interface Cables
MercuryDSP/GPP
Subsystem
AnnapolisFPGA
Subsystem(F)
AnnapolisFPGA
Subsystem(B)
SPARC
Data SinkPC
Data SourcePC
Custom Interface Cables
MercuryDSP/GPP
Subsystem
AnnapolisFPGA
Subsystem(F)
AnnapolisFPGA
Subsystem(B)
SPARC
A prototype hybrid system
Minimum Power Configurations
0.5 1 1.5 250
100
150
200
250
300
350
400
Resolution
Vel
oci
ty
1 1 2 2 1 1
1 1 2 1 2 1
XT Xr Xa YTYrYa
1 1 2 2 0 1
1 2 1 2 0 21 3 0 2 0 21 3 0 2 1 12 0 2 2 1 1
1 1 2 2 1 1
2 1 1 2 2 0
1 1 2 2 0 2
Jeffrey T. Muehring, “Optimal Configuration of a Parallel Embedded System for Synthetic Aperture Radar Processing,” MS Thesis, Texas Tech University, Dec. 1997.
Minimum Power
Jeffrey T. Muehring, “Optimal Configuration of a Parallel Embedded System for Synthetic Aperture Radar Processing,” MS Thesis, Texas Tech University, Dec. 1997.
Computer Science, University of Oklahoma 9
Overview
• The (past) world of reconfigurable computing
• The (past) world of multi-core
• The (emerging) world of reconfigurable multi-core architectures
• Illustrative analysis
• Conclusions
Computer Science, University of Oklahoma 10
Drivers for multi-core technology path
• Single-core path leading to increased cost, heat, and power consumption
• Single-core path widens the pocessor/memory speed gap
• Multi-core path transparent to many application domain developers
• Multi-core path can improve performance of threaded software
11
Typical multi-core architecture*
Dual Core Chip
L2 Cache
Core Core
Memory
Dual Core Chip
L2 Cache
Core Core
*L. Chai, Q. Gao, D.K. Panda, “Understanding the Impact of Multi-Core Architecture in Cluster
Computing: A Case Study with Intel Dual-Core System,” Seventh Int'l Symposium on Cluster Computing and the Grid (CCGrid), Rio de Janeiro - Brazil, May 2007.
Computer Science, University of Oklahoma 12
Overview
• The (past) world of reconfigurable computing
• The (past) world of multi-core
• The (emerging) world of reconfigurable multi-core architectures
• Illustrative analysis
• Conclusions
Computer Science, University of Oklahoma 13
Emerging drivers and requirements for multi-core architectures
• Scale to support massively data parallel (SPMD) applications
• Match coupling among cores with application granularity
• Power is a major challenge for large data centers and supercomputing facilities
Computer Science, University of Oklahoma 14
Hybrid architectural frameworkMulti-core Chip
Reconfigurable Logic
Core Core
L2Cache
L2Cache
MU
Core Core
L2Cache
L2Cache
Core
L2Cache
Core
L2Cache
MU MU MU MU MU
Reconfigurable logic
Computer Science, University of Oklahoma 15
Shared everything configurationMulti-core Chip
Core Core
L2Cache
L2Cache
MU
Core Core
L2Cache
L2Cache
Core
L2Cache
Core
L2Cache
MU MU MU MU MU
Interconnection Network
Interconnection Network
Reconfigurable logic
Computer Science, University of Oklahoma 16
Shared nothing configurationMulti-core Chip
Core Core
L2Cache
L2Cache
MU
Core Core
L2Cache
L2Cache
Core
L2Cache
Core
L2Cache
MU MU MU MU MU
Co-
Proc
Co-
Proc
Co-
Proc
Co-
Proc
Co-
Proc
Co-
Proc
Reconfigurable logic
Computer Science, University of Oklahoma 17
Hybrid configurationMulti-core Chip
Core Core
L2Cache
L2Cache
MU
Core Core
L2Cache
L2Cache
Core
L2Cache
Core
L2Cache
MU MU MU MU MU
Co-Proc Co-Proc Co-Proc Co-Proc Co-Proc Co-Proc
Interconnection Network
Reconfigurable logic
Computer Science, University of Oklahoma 18
Features of hybrid architecture
• Match core coupling and core processing capacity with application granularity– Fixed multiprocessor architecture not well
matched with all application granularities– Proposed reconfigurable multi-core architecture
can be configured to match core coupling with application granularity
Computer Science, University of Oklahoma 19
Mismatched SPMD execution
core 1
core 2
core 3
core 4
core c
time
Core coupling too loose relative to application granularity
Communication time Computation time
Computer Science, University of Oklahoma 20
core 1
core 2
core 3
core 4
core c
Matched SPMD execution
time
Core coupling tightened to match application granularity
Communication time Computation time
Computer Science, University of Oklahoma 21
Overview
• The (past) world of reconfigurable computing
• The (past) world of multi-core
• The (emerging) world of reconfigurable multi-core architectures
• Illustrative analysis
• Conclusions
Computer Science, University of Oklahoma 22
Illustrative Analysis
• Notation– Number of cores: c– Problem size: n– Sequential time complexity: – Parallel time complexity:
– Computational complexity:– Communication complexity: – Core coupling ratio:
),(),(),( ncgLncfKncTP
),( ncf),( ncg
LK /
)(nTS
Computer Science, University of Oklahoma 23
Example
Sequential Time:
Parallel Time:
Speedup:
The value of K: related to core processing capacity
The value of L: related to interconnection among cores
nnTS )(
cLcnKncTP log)/(),(
cLcnK
nS
log)/(
Computer Science, University of Oklahoma 24
K = 1.0, L = 1.0
Number of cores, c
Computer Science, University of Oklahoma 25
K = 1.5, L = 0.5
Number of cores, c
Computer Science, University of Oklahoma 26
K = 0.5, L = 1.5
Number of cores, c
Computer Science, University of Oklahoma 27
n = 1024
Number of cores, c
Computer Science, University of Oklahoma 28
Overview
• The (past) world of reconfigurable computing
• The (past) world of multi-core
• The (emerging) world of reconfigurable multi-core architectures
• Illustrative analysis
• Conclusions
Computer Science, University of Oklahoma 29
Conclusions
• Current multi-core approaches may not scale to support massive parallelism
• Hybrid reconfigurable multi-core approach enables trades between core coupling and core processing capacity
• More research needed in reconfigurable micro-architecture to support hybrid architectures