enabling realistic fine-grain voltage scaling with ...cbatten/pdfs/torng-rpdn-slides...enabling...
TRANSCRIPT
Enabling RealisticFine-Grain Voltage Scaling with
Reconfigurable Power Distribution Networks
Waclaw Godycki, Christopher Torng, Ivan BukreyevAlyssa Apsel, Christopher Batten
School of Electrical and Computer EngineeringCornell University
47th Int’l Symp. on Microarchitecture, Dec 2014
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Motivation: Integrated Voltage Regulation (IVR)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
On-Chip
Off-ChipDiscrete
Voltage Regulators
IntegratedVoltage Regulators
Key Benefit of IVR
I Reduced System Cost
Challenges of IVR
I Integrated energy-storage elementshave low energy densities
I Low switching speeds with highparasitic losses
A New Era of IVRI Energy storage elements have
slightly improved energy densitiesI Faster switches with low parasitic
losses
Cornell University Christopher Torng 2 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Motivation: Integrated Voltage Regulation (IVR)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
On-Chip
Off-ChipDiscrete
Voltage Regulators
IntegratedVoltage Regulators
Key Benefit of IVR
I Reduced System Cost
Challenges of IVR
I Integrated energy-storage elementshave low energy densities
I Low switching speeds with highparasitic losses
A New Era of IVRI Energy storage elements have
slightly improved energy densitiesI Faster switches with low parasitic
losses
Cornell University Christopher Torng 2 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Motivation: Integrated Voltage Regulation (IVR)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
On-Chip
Off-ChipDiscrete
Voltage Regulators
IntegratedVoltage Regulators
Key Benefit of IVR
I Reduced System Cost
Challenges of IVR
I Integrated energy-storage elementshave low energy densities
I Low switching speeds with highparasitic losses
A New Era of IVRI Energy storage elements have
slightly improved energy densitiesI Faster switches with low parasitic
losses
Cornell University Christopher Torng 2 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Motivation: Integrated Voltage Regulation (IVR)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
On-Chip
Off-ChipDiscrete
Voltage Regulators
IntegratedVoltage Regulators
VLLC
GraphicsEngine
Rin
g In
terc
onne
ct
SystemAgent
+PCIe
+DMI
VDDQDRAM Cntl
VCCIN
FIV
R
CPUs+
Cache
VCPU0
VRING
VGPU
VSA
VIOA
VCPU1……
Intel Haswell integrates the
voltage control loop circuitry on-die
with inductors in-package.
Intel HaswellFigure from D. Kanter, MPR'13
"Fully" Integrated Voltage Regulator (FIVR)
Cornell University Christopher Torng 3 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Motivation: Integrated Voltage Regulation (IVR)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
On-Chip
Off-ChipDiscrete
Voltage Regulators
IntegratedVoltage Regulators
VLLC
GraphicsEngine
Rin
g In
terc
onne
ct
SystemAgent
+PCIe
+DMI
VDDQDRAM Cntl
VCCIN
FIV
R
CPUs+
Cache
VCPU0
VRING
VGPU
VSA
VIOA
VCPU1……
Intel Haswell integrates the
voltage control loop circuitry on-die
with inductors in-package.
Intel HaswellFigure from D. Kanter, MPR'13
"Fully" Integrated Voltage Regulator (FIVR)
Potential for Fine-Grain Voltage Scaling
Cornell University Christopher Torng 3 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Motivation: Fine-Grain Voltage Scaling OpportunitiesC
ore
s
0
7
Knuth-Morris-Pratt String Search
Core Busy Core Waiting
Core
s
0
7
Core
s
0
7time
Breadth-First Search
Radix Sort
Cornell University Christopher Torng 4 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Motivation: Fine-Grain Voltage Scaling OpportunitiesC
ore
s
0
7
Knuth-Morris-Pratt String Search
Core Busy Core Waiting
Core
s
0
7
Core
s
0
7time
Breadth-First Search
Radix Sort
90μs
2.3μs
7μs
Cornell University Christopher Torng 4 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Enabling Realistic Fine-Grain Voltage Scaling
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Fine-GrainDVFS
Controller
PowerDistribution
Network
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 5 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Enabling Realistic Fine-Grain Voltage Scaling
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Fine-GrainDVFS
Controller
PowerDistribution
Network
Stats
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 5 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Enabling Realistic Fine-Grain Voltage Scaling
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Fine-GrainDVFS
Controller
PowerDistribution
Network
Stats
PowerModes
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 5 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Enabling Realistic Fine-Grain Voltage Scaling
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Fine-GrainDVFS
Controller
PowerDistribution
Network
Stats
PowerModes
Voltages
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 5 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Enabling Realistic Fine-Grain Voltage Scaling
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
PowerDistribution
Network
Stats
PowerModes
Voltages
FG-SYNC+
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 5 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Enabling Realistic Fine-Grain Voltage Scaling
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Stats
PowerModes
Voltages
FG-SYNC+RPDN
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 5 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Enabling Realistic Fine-Grain Voltage Scaling
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Stats
PowerModes
Voltages
FG-SYNC+RPDN
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 5 / 22
• Motivation • FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Enabling Realistic Fine-Grain Voltage Scaling
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Stats
PowerModes
Voltages
FG-SYNC+RPDN
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 5 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Fine-Grain Synchronization Controller (FG-SYNC+)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
PowerDistribution
NetworkFG-SYNC+
Stats
parallel_for(int i=0; i<N; n++){
< loop body >
}
Loop Start: Activity Hint -- busy
Loop Ends: Activity Hint -- waiting
Before Start: Work Left Hint
Cornell University Christopher Torng 6 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Fine-Grain Synchronization Controller (FG-SYNC+)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
PowerDistribution
NetworkFG-SYNC+
Stats
parallel_for(int i=0; i<N; n++){
< loop body >
}
Loop Start: Activity Hint -- busy
Loop Ends: Activity Hint -- waiting
Before Start: Work Left Hint
A A A AA A A wA A w wA w w w
N N N NS N N rS S r rX r r r
Activity Pattern DVFS Mode Pattern
PowerModes
( Active Waiting )( S, X Sprint modes )( N, r Nominal, rest )
Cornell University Christopher Torng 6 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Fine-Grain Synchronization Controller (FG-SYNC+)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
PowerDistribution
NetworkFG-SYNC+
Stats
parallel_for(int i=0; i<N; n++){
< loop body >
}
Loop Start: Activity Hint -- busy
Loop Ends: Activity Hint -- waiting
Before Start: Work Left Hint
A A A AA A A wA A w wA w w w
N N N NS N N rS S r rX r r r
Activity Pattern DVFS Mode Pattern
PowerModes
( Active Waiting )( S, X Sprint modes )( N, r Nominal, rest )
Voltages
Cornell University Christopher Torng 6 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Fine-Grain Synchronization Controller (FG-SYNC+)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
PowerDistribution
NetworkFG-SYNC+
Stats
parallel_for(int i=0; i<N; n++){
< loop body >
}
Loop Start: Activity Hint -- busy
Loop Ends: Activity Hint -- waiting
Before Start: Work Left Hint
A A A AA A A wA A w wA w w w
N N N NS N N rS S r rX r r r
Activity Pattern DVFS Mode Pattern
PowerModes
( Active Waiting )( S, X Sprint modes )( N, r Nominal, rest )
Voltages
Performance
En
erg
y E
ffic
ien
cy
isop
ower
Cornell University Christopher Torng 6 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Fine-Grain Synchronization Controller (FG-SYNC+)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
PowerDistribution
NetworkFG-SYNC+
Stats
parallel_for(int i=0; i<N; n++){
< loop body >
}
Loop Start: Activity Hint -- busy
Loop Ends: Activity Hint -- waiting
Before Start: Work Left Hint
A A A AA A A wA A w wA w w w
N N N NS N N rS S r rX r r r
Activity Pattern DVFS Mode Pattern
PowerModes
( Active Waiting )( S, X Sprint modes )( N, r Nominal, rest )
Voltages
Performance
En
erg
y E
ffic
ien
cy
isop
ower
Cornell University Christopher Torng 6 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Fine-Grain Synchronization Controller (FG-SYNC+)
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
PowerDistribution
NetworkFG-SYNC+
Stats
parallel_for(int i=0; i<N; n++){
< loop body >
}
Loop Start: Activity Hint -- busy
Loop Ends: Activity Hint -- waiting
Before Start: Work Left Hint
A A A AA A A wA A w wA w w w
N N N NS N N rS S r rX r r r
Activity Pattern DVFS Mode Pattern
PowerModes
( Active Waiting )( S, X Sprint modes )( N, r Nominal, rest )
Voltages
Performance
En
erg
y E
ffic
ien
cy
isop
ower
Core 1 Core 3Core 2Core 1Core 0
Cornell University Christopher Torng 6 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Ideal Fine-Grain Voltage Scaling: Breadth-First Search
time
Core 7
Rest = 0.7V Nominal = 1.0V Sprint = 1.15V Super-Sprint = 1.33V
Core 6
Core 5
Core 4
Core 3
Core 2
Core 1
Core 0
BusyWaiting
Cornell University Christopher Torng 7 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Ideal Fine-Grain Voltage Scaling: Breadth-First Search
time
Core 7
Rest = 0.7V Nominal = 1.0V Sprint = 1.15V Super-Sprint = 1.33V
Core 6
Core 5
Core 4
Core 3
Core 2
Core 1
Core 0
BusyWaiting
Cornell University Christopher Torng 7 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Ideal Fine-Grain Voltage Scaling: Breadth-First Search
time
Core 7
Rest = 0.7V Nominal = 1.0V Sprint = 1.15V Super-Sprint = 1.33V
Core 6
Core 5
Core 4
Core 3
Core 2
Core 1
Core 0
BusyWaiting
Cornell University Christopher Torng 7 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Ideal Fine-Grain Voltage Scaling: Breadth-First Search
time
Core 7
Rest = 0.7V Nominal = 1.0V Sprint = 1.15V Super-Sprint = 1.33V
Core 6
Core 5
Core 4
Core 3
Core 2
Core 1
Core 0
BusyWaiting
Execution
Time
Reduced
by 24%
Ideal FGVS
Level
Space
Time
Cornell University Christopher Torng 7 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Non-Ideal FGVS: Different Levels
2-Level Controllers
0.9 1.0 1.1 1.2 1.3 1.4 1.50.8
1.0
1.2
1.4
1.6
1.8
2.0
Speedup
Norm
aliz
ed
En
erg
y E
ffic
ien
cy
isopower
No FGVS
Cornell University Christopher Torng 8 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Non-Ideal FGVS: Different Levels
2-Level Controllers
0.9 1.0 1.1 1.2 1.3 1.4 1.50.8
1.0
1.2
1.4
1.6
1.8
2.0
Speedup
Norm
aliz
ed
En
erg
y E
ffic
ien
cy
isopower
No FGVS r / N
Cornell University Christopher Torng 8 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Non-Ideal FGVS: Different Levels
2-Level Controllers
0.9 1.0 1.1 1.2 1.3 1.4 1.50.8
1.0
1.2
1.4
1.6
1.8
2.0
Speedup
Norm
aliz
ed
En
erg
y E
ffic
ien
cy
isopower
No FGVS r / N
N / S
Cornell University Christopher Torng 8 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Non-Ideal FGVS: Different Levels
2-Level Controllers
0.9 1.0 1.1 1.2 1.3 1.4 1.50.8
1.0
1.2
1.4
1.6
1.8
2.0
Speedup
Norm
aliz
ed
En
erg
y E
ffic
ien
cy
isopower
No FGVS r / N
N / S N / X
Cornell University Christopher Torng 8 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Non-Ideal FGVS: Different Levels
2-Level Controllers
0.9 1.0 1.1 1.2 1.3 1.4 1.50.8
1.0
1.2
1.4
1.6
1.8
2.0
Speedup
Norm
aliz
ed
En
erg
y E
ffic
ien
cy
isopower
No FGVS r / N
N / S N / X
3-Level vs 4-Level Controllers
Speedup
isopower
Norm
aliz
ed
En
erg
y E
ffic
ien
cy
1.0 1.1 1.2 1.3 1.4 1.50.8
1.0
1.2
1.4
1.6
1.8
2.0
No FGVS r / N / S
r / N / X r / N / S / X
Increasing
sprinting
levels
Cornell University Christopher Torng 8 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
Non-Ideal FGVS: Space and Time
Different Numbers of Domains
No FGVS 2 Domains
4 Domains 8 Domains
1.0 1.1 1.2 1.3 1.4 1.50.8
1.0
1.2
1.4
1.6
1.8
2.0
Speedup
Norm
aliz
ed
En
erg
y E
ffic
ien
cy
isopower
Increasing
number of
domains
Different Response Times
Speedup
Norm
aliz
ed
En
erg
y E
ffic
ien
cy
0.4
0.8
1.0
1.2
1.4
1.6
1.8
2.0
No FGVS
100 ns
1000 ns
0 ns
isopower
Decreasing
response
time
0.6 0.8 1.0 1.2 1.4 1.6
Cornell University Christopher Torng 9 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
FG-SYNC+: Summary of FGVS Potential
Exploiting fine-grain voltage scaling requires:
I FGVS in Level: at least three levels and four levelsresults in additional benefits
I FGVS in Space: per-core voltage control
I FGVS in Time: voltage settling response times of100 ns or faster
How do we design a power distribution network thatcan enable fine-grain voltage scaling?
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 10 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
FG-SYNC+: Summary of FGVS Potential
Exploiting fine-grain voltage scaling requires:
I FGVS in Level: at least three levels and four levelsresults in additional benefits
I FGVS in Space: per-core voltage control
I FGVS in Time: voltage settling response times of100 ns or faster
How do we design a power distribution network thatcan enable fine-grain voltage scaling?
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 10 / 22
Motivation • FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN Methodology and Evaluation
FG-SYNC+: Summary of FGVS Potential
Exploiting fine-grain voltage scaling requires:
I FGVS in Level: at least three levels and four levelsresults in additional benefits
I FGVS in Space: per-core voltage control
I FGVS in Time: voltage settling response times of100 ns or faster
How do we design a power distribution network thatcan enable fine-grain voltage scaling?
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 10 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
Talk Outline
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Stats
PowerModes
Voltages
FG-SYNC+RPDN
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 11 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
PDN: Basic Regulators
The three primary types of step-down voltage regulators are linearregulators, inductor-based switching regulators (buck), and
capacitor-based switching regulators.
Cornell University Christopher Torng 12 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
A Simple On-Chip Power Distribution Network
SingleFixedVR
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL
Cornell University Christopher Torng 13 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
A Simple On-Chip Power Distribution Network
SingleFixedVR
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL
FlybackCap
Vout = Vin/2
Vin
time
S
Vin
S Load
P
P
Cornell University Christopher Torng 13 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
A Simple On-Chip Power Distribution Network
SingleFixedVR
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL
S
SFlybackCap
Vout = Vin/2
Vin
time
S
P
Vin
P
Load
Cornell University Christopher Torng 13 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
A Simple On-Chip Power Distribution Network
SingleFixedVR
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL
FlybackCap
Vout = Vin/2
S
Vin
S Load
P
P
Vin
time
S P
Cornell University Christopher Torng 13 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
A Simple On-Chip Power Distribution Network
SingleFixedVR
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL
FlybackCap
Vout = Vin/2
S
Vin
S Load
P
P
Vin
time
S P
Vin/2
S P
Cornell University Christopher Torng 13 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
A Simple On-Chip Power Distribution Network
SingleFixedVR
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL
FlybackCap
Vout = Vin/2
Vin
time
S
Vin
S Load
P
P
How to design sophisticated control circuitry?
How to use multiple phases to reduce ripple?
How to size the energy storage?
How to choose the switch-to-cap area ratio?
Cornell University Christopher Torng 13 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Multiple Adjustable Voltage Regulators
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
CTRL CTRL CTRL CTRL
Cornell University Christopher Torng 14 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Multiple Adjustable Voltage Regulators
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
CTRL CTRL CTRL CTRL
Vin
Load FlybackCap
Cornell University Christopher Torng 14 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Multiple Adjustable Voltage Regulators
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
CTRL CTRL CTRL CTRL
Vin
Load FlybackCap
How to design sophisticated control circuitry?
How to use multiple phases to reduce ripple?
How to size the energy storage?
How to choose the switch-to-cap area ratio?
Cornell University Christopher Torng 14 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Multiple Adjustable Voltage Regulators
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
CTRL CTRL CTRL CTRL
Vin
Load FlybackCap
How to design sophisticated control circuitry?
How to use multiple phases to reduce ripple?
How to size the energy storage?
How to choose the switch-to-cap area ratio?
Cornell University Christopher Torng 14 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Per-Core Regulator Sizing StudyP
ow
er
Effic
ien
cy
10 20 30 40 50 60 700
60%
65%
70%
75%
80%
85%
Output Power (mW)
Nominal
(1.0V)
Sprint
(1.15V)
Cornell University Christopher Torng 15 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Per-Core Regulator Sizing StudyP
ow
er
Effic
ien
cy
10 20 30 40 50 60 700
60%
65%
70%
75%
80%
85%
Output Power (mW)
Nominal
(1.0V)
Sprint
(1.15V)
Cell Area = 0.011 mm2
Cornell University Christopher Torng 15 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Per-Core Regulator Sizing StudyP
ow
er
Effic
ien
cy
10 20 30 40 50 60 700
60%
65%
70%
75%
80%
85%
Output Power (mW)
Nominal
(1.0V)
Sprint
(1.15V)
Cell Area = 0.011 mm2
4
Cells
Cornell University Christopher Torng 15 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Per-Core Regulator Sizing StudyP
ow
er
Effic
ien
cy
10 20 30 40 50 60 700
60%
65%
70%
75%
80%
85%
Output Power (mW)
Nominal
(1.0V)
Sprint
(1.15V)
Cell Area = 0.011 mm2
4
Cells
Super-Sprint
(1.33V)
Cornell University Christopher Torng 15 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Per-Core Regulator Sizing StudyP
ow
er
Effic
ien
cy
10 20 30 40 50 60 700
60%
65%
70%
75%
80%
85%
Output Power (mW)
Nominal
(1.0V)
Sprint
(1.15V)
Cell Area = 0.011 mm2
4
Cells
Super-Sprint
(1.33V)
5
Cells
6
Cells
7
Cells
Cornell University Christopher Torng 15 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Per-Core Regulator Sizing StudyP
ow
er
Effic
ien
cy
10 20 30 40 50 60 700
60%
65%
70%
75%
80%
85%
Output Power (mW)
Nominal
(1.0V)
Sprint
(1.15V)
Cell Area = 0.011 mm2
4
Cells
Super-Sprint
(1.33V)
5
Cells
6
Cells
7
Cells
Rest
(0.7V)
Cornell University Christopher Torng 15 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Per-Core Regulator Sizing StudyP
ow
er
Effic
ien
cy
10 20 30 40 50 60 700
60%
65%
70%
75%
80%
85%
Output Power (mW)
Nominal
(1.0V)
Sprint
(1.15V)
Cell Area = 0.011 mm2
4
Cells
Super-Sprint
(1.33V)
5
Cells
6
Cells
7
Cells
Rest
(0.7V)
3
Cells
2
Cells
1
Cell
Cornell University Christopher Torng 15 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Multiple Adjustable Voltage Regulators
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
MAVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Cornell University Christopher Torng 16 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Multiple Adjustable Voltage Regulators
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
MAVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Cornell University Christopher Torng 16 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
MAVR: Multiple Adjustable Voltage Regulators
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
MAVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR Key Observation
MAVR requires enough area for eachregulator to independentlysupport all power modes
Power limits mean FG-SYNC+ isdesigned such that only 1 or 2 cores
are ever super-sprinting at once
Cornell University Christopher Torng 16 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
RPDN: Reconfigurable Power Distribution Networks
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
MAVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR Key Observation
MAVR requires enough area for eachregulator to independentlysupport all power modes
Power limits mean FG-SYNC+ isdesigned such that only 1 or 2 cores
are ever super-sprinting at once
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
16:4 XBar4b 4b 4b 4b
RPDN
Cornell University Christopher Torng 17 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
RPDN: Reconfigurable Power Distribution Networks
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
MAVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR Key Observation
MAVR requires enough area for eachregulator to independentlysupport all power modes
Power limits mean FG-SYNC+ isdesigned such that only 1 or 2 cores
are ever super-sprinting at once
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
16:4 XBar4b 4b 4b 4b
RPDN
Core 2Core 0 Core 1 Core 3 Core 0 Core 2Core 1 Core 3
Cornell University Christopher Torng 17 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
RPDN: Reconfigurable Power Distribution Networks
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
MAVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR
Adjust-ableVR Key Observation
MAVR requires enough area for eachregulator to independentlysupport all power modes
Power limits mean FG-SYNC+ isdesigned such that only 1 or 2 cores
are ever super-sprinting at once
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
CTRL CTRL CTRL CTRL
16:4 XBar4b 4b 4b 4b
RPDN
Core 2Core 0 Core 1 Core 3 Core 0 Core 2Core 1 Core 3
40% Area Savings
Cornell University Christopher Torng 17 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
SPICE-Level Transient Response + Leakage Benefits
100 150 200 250 300 350 400
1.41.31.21.11.00.90.80.7
Time (ns)
Volta
ge (
V)
120 ns
150 ns
1.31.21.11.00.90.80.7
0.25 1.0 1.75 2.5 3.25Time (us)
Volta
ge (
V)
1390 ns 960 ns
2900 ns
MAVR Transient Response
RPDN Transient Response
Cornell University Christopher Torng 18 / 22
Motivation FGVS Architecture: FG-SYNC+ • FGVS Circuits: RPDN • Methodology and Evaluation
SPICE-Level Transient Response + Leakage Benefits
100 150 200 250 300 350 400
1.41.31.21.11.00.90.80.7
Time (ns)
Volta
ge (
V)
120 ns
150 ns
1.31.21.11.00.90.80.7
0.25 1.0 1.75 2.5 3.25Time (us)
Volta
ge (
V)
1390 ns 960 ns
2900 ns
MAVR Transient Response
RPDN Transient Response
0 5 10 15 20 25 30 35 40
60%
65%
70%
75%
80%Rest Nominal Sprint Super-Sprint
Output Power (mW)
4
cells 7
cells
Pow
er
Effic
ien
cy
1
cell55%
50%
Power Efficiency vs Output Power
in a Leakier Process
Cornell University Christopher Torng 18 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
Talk Outline
Core 0 Core 1 Core 2
CacheBank
CacheBank
CacheBank
On-Chip Interconnect
Core 3
CacheBank
Stats
PowerModes
Voltages
FG-SYNC+RPDN
FGVS Architecture: FG-SYNC+
Use lightweight software hints andlookup tables derived offline to enablefast multi-level voltage configuration
FGVS Circuits: RPDN
Enable sprinting cores to dynamicallyborrow energy storage from restingcores
Methodology and Evaluation
Architecture and Circuits Co-Design Approach
Cornell University Christopher Torng 19 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
Methodology
CrossCompiler
RISCISA Sim
RISC Binary
C++ MultithreadedApp Using Hints
Cycle-LevelSimulator
Cycle-Level Multicorewith DVFS Controller
Cornell University Christopher Torng 20 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
Methodology
CrossCompiler
RISCISA Sim
RISC Binary
C++ MultithreadedApp Using Hints
Cycle-LevelSimulator
Cycle-Level Multicorewith DVFS Controller
Verilog RTL of MulticoreRISC Processor
Gate-Level Model
VerilogSimulator
VerilogSimulator
SwitchingActivity
Layout
SynthesisP&R
Instruction-BasedEnergy Dictionary
Cornell University Christopher Torng 20 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
Methodology
CrossCompiler
RISCISA Sim
RISC Binary
C++ MultithreadedApp Using Hints
Cycle-LevelSimulator
Cycle-Level Multicorewith DVFS Controller
Verilog RTL of MulticoreRISC Processor
Gate-Level Model
VerilogSimulator
VerilogSimulator
SwitchingActivity
Layout
SynthesisP&R
Instruction-BasedEnergy Dictionary
SPICE CircuitModels of PDNs
SPICESimulator
DVFS ModeTransition
Times
Cornell University Christopher Torng 20 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
Methodology
CrossCompiler
RISCISA Sim
RISC Binary
C++ MultithreadedApp Using Hints
Cycle-LevelSimulator
Cycle-Level Multicorewith DVFS Controller
Verilog RTL of MulticoreRISC Processor
Gate-Level Model
VerilogSimulator
VerilogSimulator
SwitchingActivity
Layout
SynthesisP&R
Instruction-BasedEnergy Dictionary
SPICE CircuitModels of PDNs
SPICESimulator
DVFS ModeTransition
Times
Power ModelPer-Core Stats
Power Efficienciesfor each PDN
Cornell University Christopher Torng 20 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
Methodology
CrossCompiler
RISCISA Sim
RISC Binary
C++ MultithreadedApp Using Hints
Cycle-LevelSimulator
Cycle-Level Multicorewith DVFS Controller
Verilog RTL of MulticoreRISC Processor
Gate-Level Model
VerilogSimulator
VerilogSimulator
SwitchingActivity
Layout
SynthesisP&R
Instruction-BasedEnergy Dictionary
SPICE CircuitModels of PDNs
SPICESimulator
DVFS ModeTransition
Times
Power ModelPer-Core Stats
Power Efficienciesfor each PDN
SystemPerformanceand Energy
Results
Cornell University Christopher Torng 20 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
EvaluationN
orm
aliz
ed
En
erg
yE
ffic
ien
cy
Speedup
isopower
MAVR RPDNNo FGVS
0.6 0.8 1.0 1.2 1.40.9
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
BenchmarksI bfsI bilateralI ditherI kmeansI mriqI pbbs-drI pbbs-knnI pbbs-mmI rsortI splash2-fftI splash2-luI strsearchI viterbi
Cornell University Christopher Torng 21 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
EvaluationN
orm
aliz
ed
En
erg
yE
ffic
ien
cy
Speedup
isopower
MAVR RPDNNo FGVS
0.6 0.8 1.0 1.2 1.40.9
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
BenchmarksI bfsI bilateralI ditherI kmeansI mriqI pbbs-drI pbbs-knnI pbbs-mmI rsortI splash2-fftI splash2-luI strsearchI viterbi
Cornell University Christopher Torng 21 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
EvaluationN
orm
aliz
ed
En
erg
yE
ffic
ien
cy
Speedup
isopower
MAVR RPDNNo FGVS
0.6 0.8 1.0 1.2 1.40.9
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
BenchmarksI bfsI bilateralI ditherI kmeansI mriqI pbbs-drI pbbs-knnI pbbs-mmI rsortI splash2-fftI splash2-luI strsearchI viterbi
Cornell University Christopher Torng 21 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
EvaluationN
orm
aliz
ed
En
erg
yE
ffic
ien
cy
Speedup
isopower
MAVR RPDNNo FGVS
0.6 0.8 1.0 1.2 1.40.9
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
BenchmarksI bfsI bilateralI ditherI kmeansI mriqI pbbs-drI pbbs-knnI pbbs-mmI rsortI splash2-fftI splash2-luI strsearchI viterbi
Cornell University Christopher Torng 21 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
EvaluationN
orm
aliz
ed
En
erg
yE
ffic
ien
cy
Speedup
isopower
MAVR RPDNNo FGVS
0.6 0.8 1.0 1.2 1.40.9
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
2X Performance Boost with 40% Area Savings over MAVR
BenchmarksI bfsI bilateralI ditherI kmeansI mriqI pbbs-drI pbbs-knnI pbbs-mmI rsortI splash2-fftI splash2-luI strsearchI viterbi
Cornell University Christopher Torng 21 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN • Methodology and Evaluation •
EvaluationN
orm
aliz
ed
En
erg
yE
ffic
ien
cy
Speedup
isopower
MAVR RPDNNo FGVS
0.6 0.8 1.0 1.2 1.40.9
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
2X Performance Boost with 40% Area Savings over MAVR
10-50% Performance and 10-70% Energy Eff. over no FGVS
BenchmarksI bfsI bilateralI ditherI kmeansI mriqI pbbs-drI pbbs-knnI pbbs-mmI rsortI splash2-fftI splash2-luI strsearchI viterbi
Cornell University Christopher Torng 21 / 22
Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation
Take-Away Points
I Architecture and Mixed-Signal Circuit Co-Design can maximizethe system-level benefit of the emerging trend towards integratedvoltage regulation.
I Lightweight hints can provide an elegant solution to informinghardware of fine-grain activity imbalance.
I Reconfigurable Power Distribution Networks can enablerealistic FGVS by significantly reducing regulator area overheadand improving voltage-settling response times by an order ofmagnitude.
This work was supported in part by the National Science Foundation (NSF),a Spork Fellowship, and donations from Intel Corporation and Synopsys, Inc.
Cornell University Christopher Torng 22 / 22