SWAMIT TANNU
DOUG CARMEAN
MOINUDDIN QURESHI
MEMSYS-2017
CRYOGENIC DRAM BASED MEMORY SYSTEM FOR SCALABLE QUANTUM COMPUTERS:
A FEASIBILITY STUDY
Why Quantum Computers?
❖ Quantum computers provide large speedups for problems in materials science, machine learning, and medicine
Quantum Computers enable solutions to important problems
[Figure: execution time vs. problem size for a classical computer and a quantum computer. Example: molecule and material simulations, labeled "Billion Years" and "Days".]
Qubits: Background
Quantum computers use quantum bits (qubits) to encode information
[Figure: a classical bit and a quantum bit, each drawn on a sphere]
❖ State of a classical bit: 1 or 0 (two points on the sphere)
❖ State of a quantum bit: any point on the sphere
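To make the "any point on the sphere" picture concrete, the sketch below maps a point on the sphere (polar angle theta, azimuthal angle phi) to the two complex amplitudes of a qubit, using the standard parameterization cos(theta/2)|0> + e^(i*phi) sin(theta/2)|1>. The function names are illustrative and do not come from the talk.

```python
import cmath
import math


def qubit_from_sphere(theta, phi):
    """Return amplitudes (a0, a1) for the point (theta, phi) on the sphere."""
    a0 = complex(math.cos(theta / 2.0), 0.0)
    a1 = cmath.exp(1j * phi) * math.sin(theta / 2.0)
    return a0, a1


def probabilities(state):
    """Return the probabilities of measuring 0 and 1 for a qubit state."""
    a0, a1 = state
    return abs(a0) ** 2, abs(a1) ** 2


# The two poles correspond to the classical values; other points do not.
north = qubit_from_sphere(0.0, 0.0)              # behaves like classical 0
south = qubit_from_sphere(math.pi, 0.0)          # behaves like classical 1
equator = qubit_from_sphere(math.pi / 2.0, 0.0)  # equal mix of 0 and 1
```

A classical bit only ever sits at one of the two poles, while theta and phi can vary continuously for a qubit.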
Organization of Quantum Computer
Control Processor -- Interface between Qubits & Programmer
[Figure: a quantum computer made up of a control processor and qubits]
Qubits are fickle
Qubits are kept at extremely low temperature (~20mK)
❖ No quantization: a small change in state leads to errors
❖ Room temperature is too noisy to operate
[Figure: a classical bit with discrete states 1 and 0, next to a quantum bit]
Cryogenic Control Processor
[Figure, first configuration: control processor at 300K and qubits at 20mK, connected by metal wires. Large thermal gradient, thermal leakage.]
[Figure, second configuration: control processor at 4K and qubits at 20mK, connected by superconducting wires. Low leakage.]
A cryogenic control processor is essential for a scalable quantum computer (Ref: D. Carmean, ISCA’16 Keynote)
Memory for Quantum Computers
[Figure: a quantum computer made up of qubits, a control processor, and memory (program memory and data memory)]
❖ Program memory + data memory: stores the quantum executable, data, and ECC frames (~10s of GB)
❖ Memory must be kept at cryogenic temperature to avoid a large thermal gradient
❖ Josephson junction technology works at 4K but has limited memory density (only a few Mb)
Quantum computers require substantial memory capacity at cryogenic temperature
Does commodity DRAM work at cryogenic temperatures?
Goal: characterize DRAM at cryogenic temperatures to understand its functionality and error patterns
Why Memory Fails at Cryogenic Temperature?
[Diagram: Temperature, Carriers (e-), Threshold Voltage, Faults]
At low temperatures, carrier freeze-out can increase the threshold voltage, which worsens the error rate
Minimum Operational Temperature (MOT): minimum temperature for fault-free operation
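The MOT definition can be sketched as a small routine over a temperature sweep. This is a minimal illustration, assuming measurements are logged as (temperature in K, fault count) pairs and reading MOT as the lowest tested temperature with no faults at that point or at any warmer point. The function name and the sweep values are hypothetical, not measured data.

```python
def minimum_operational_temperature(sweep):
    """Return the MOT for a sweep of (temperature_K, fault_count) pairs.

    The MOT is taken as the lowest tested temperature with zero faults at
    that point and at every warmer tested point. Returns None if even the
    warmest point shows faults (or the sweep is empty).
    """
    mot = None
    # Walk from warm to cold and stop at the first temperature with faults.
    for temp_k, faults in sorted(sweep, key=lambda p: p[0], reverse=True):
        if faults > 0:
            break
        mot = temp_k
    return mot


# Hypothetical sweep, for illustration only.
example_sweep = [(300, 0), (200, 0), (150, 0), (120, 0), (100, 3), (90, 41)]
mot = minimum_operational_temperature(example_sweep)
```

Walking from warm to cold avoids reporting a colder fault-free point that sits below a temperature where faults already appeared.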
How to Test DRAM at Cryogenic Temperature?
❖ Conventional memory testing: Memtest86 running on a host, or dedicated memory testers
❖ Host machines and memory testers do not work at cryogenic temperatures
Need a mechanism to reduce DIMM temperature without affecting the tester
Isolated Cooling of DIMM
❖ Need a cryogenic coolant: liquid nitrogen (boils at 77K)
❖ Need isolated cooling of DIMMs: compact cryogenic heatsink
❖ DIMM is sandwiched between two heatsinks and can be cooled down to 80K
Compact heatsink with liquid nitrogen provides isolated cooling of a DIMM
Challenges: Thermal Shock & Ice Condensation
[Figure: temperature (K) vs. time while cooling from 300K to 80K, annotated with "Thermal Shock" and "Condensation"]
Limit the rate of cooling and use an isolation chamber to reduce condensation
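One way to read "limit the rate of cooling" is as a staircase of temperature setpoints from 300K down to 80K that never drops faster than a chosen rate. The sketch below generates such a schedule. The maximum rate and step interval are assumed placeholders, since the talk does not state the values that were used.

```python
def cooling_schedule(start_k, target_k, max_rate_k_per_min, step_min=1.0):
    """Return [(time_min, setpoint_K), ...] from start_k down to target_k.

    Consecutive setpoints never differ by more than
    max_rate_k_per_min * step_min, which bounds the cooling rate.
    """
    if max_rate_k_per_min <= 0 or step_min <= 0:
        raise ValueError("rate and step must be positive")
    if target_k > start_k:
        raise ValueError("target must not be warmer than start")

    max_drop = max_rate_k_per_min * step_min
    time_min, temp_k = 0.0, float(start_k)
    schedule = [(time_min, temp_k)]
    while temp_k > target_k:
        # Clamp the last step so the schedule ends exactly at the target.
        temp_k = max(float(target_k), temp_k - max_drop)
        time_min += step_min
        schedule.append((time_min, temp_k))
    return schedule


# Assumed rate of 5 K per minute, for illustration only.
schedule = cooling_schedule(300, 80, max_rate_k_per_min=5.0)
```

A slower assumed rate lengthens the schedule proportionally. Choosing the actual rate would depend on what the DIMM and heatsink tolerate.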
Experimental Methodology
❖ Verify memory functionality by using march tests
❖ Fault: a single-bit fault in a burst
❖ MOT: minimum temperature at which no faults are observed
Number of DIMMs: 55
Number of chips: 750
Number of vendors: 5
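The slides do not say which march test was run, so the sketch below uses March C-, a widely used variant, as a stand-in. It runs against a simulated bit-addressable memory with optional stuck-at faults. The `SimulatedMemory` class and its fault model are hypothetical helpers for illustration, and the sketch works per bit rather than per burst.

```python
class SimulatedMemory:
    """Bit-addressable memory model with optional stuck-at faults."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> forced value

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


def march_c_minus(mem, size):
    """Run March C- and return the sorted addresses that failed a read."""
    faulty = set()
    up = range(size)
    down = range(size - 1, -1, -1)

    def element(order, expected, new_value):
        # One march element: optional read-and-check, then optional write.
        for addr in order:
            if expected is not None and mem.read(addr) != expected:
                faulty.add(addr)
            if new_value is not None:
                mem.write(addr, new_value)

    element(up, None, 0)    # any order:  w0
    element(up, 0, 1)       # ascending:  r0, w1
    element(up, 1, 0)       # ascending:  r1, w0
    element(down, 0, 1)     # descending: r0, w1
    element(down, 1, 0)     # descending: r1, w0
    element(up, 0, None)    # any order:  r0
    return sorted(faulty)


healthy = march_c_minus(SimulatedMemory(16), 16)
one_fault = march_c_minus(SimulatedMemory(16, stuck_at={5: 1}), 16)
```

Running the same routine at each temperature step and recording whether the returned list is empty gives the per-temperature pass/fail data that an MOT measurement needs.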
Minimum Operational Temperature for DIMMs
[Figure: minimum operating temperature (K), axis from 70 to 170, plotted for the tested DIMMs (axis from 0 to 60). Annotations: 18%, 55%, 90%, 100%, and "DIMM Failure!"]
18% of DIMMs are functional below 90K
Chip Failures
92% of chips worked at cryogenic conditions: pick cryogenic-tolerant chips
[Figure: functional chips 92%, faulty chips 8%]
Min Operational Temperature Vs Chip Capacity
[Figure: minimum operating temperature (K), axis from 70 to 170, for chips grouped by capacity: 256 Mb, 512 Mb, 1 Gb, 2 Gb, 4 Gb]
MOT increases with capacity; chip capacity is correlated with technology node
Fault Granularity
Uncorrelated faults: conventional ECC can be effective for cryo DRAM
❖ Single-bit errors: uncorrelated faults
❖ Linear codes (SECDED, BCH) are still effective
Single-bit fault: 99.985%
Double-bit fault: 0.015%
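To show why a SECDED code suits a fault mix dominated by single-bit errors, here is a minimal sketch of an extended Hamming (8,4) code: it corrects any single-bit error and detects any double-bit error. This toy code is an assumption chosen for brevity. It is much smaller than the codes used on ECC DIMMs and is not the scheme evaluated in the talk.

```python
def secded_encode(data4):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (bit list)."""
    d = [(data4 >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0: overall parity; indices 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2
    return code


def secded_decode(code):
    """Return (status, data4); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions holding a 1
    overall = sum(code) % 2  # 0 when the overall parity is consistent

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treat as one flipped bit at index `syndrome`
        # (index 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped.
        return "uncorrectable", None

    data4 = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, data4


word = secded_encode(0b1011)
word[6] ^= 1  # inject a single-bit fault
result = secded_decode(word)
```

With 99.985% of observed faults being single-bit, almost every faulty burst falls in the "corrected" branch of a code like this, and the rare double-bit case is at least detected.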
Transient Vs Permanent Faults
Permanent faults: conventional sparing techniques can be used
❖ Repeated faulty addresses = permanent error
❖ Unique faulty address = transient error
DDR2: Transient 36%, Permanent 64%
DDR3: Transient 41%, Permanent 59%
DDR4: Transient 53%, Permanent 47%
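The classification rule on this slide (a repeated faulty address is permanent, a unique one is transient) can be sketched as a count over a log of faulty addresses collected across repeated test runs. The log format and function names are assumptions, and the addresses are made up for illustration. The shares are computed per unique address, which may differ from how the percentages on the slide were counted.

```python
from collections import Counter


def classify_faults(faulty_addresses):
    """Label each address 'permanent' if logged more than once, else 'transient'."""
    counts = Counter(faulty_addresses)
    return {
        addr: ("permanent" if n > 1 else "transient")
        for addr, n in counts.items()
    }


def fault_shares(faulty_addresses):
    """Return the fraction of unique faulty addresses in each class."""
    labels = classify_faults(faulty_addresses)
    total = len(labels)
    if total == 0:
        return {"permanent": 0.0, "transient": 0.0}
    permanent = sum(1 for label in labels.values() if label == "permanent")
    return {
        "permanent": permanent / total,
        "transient": (total - permanent) / total,
    }


# Hypothetical fault log across several runs, for illustration only.
log = [0x1A40, 0x2B00, 0x1A40, 0x3C10, 0x1A40, 0x4D20, 0x2B00]
labels = classify_faults(log)
shares = fault_shares(log)
```

Addresses labeled permanent are the natural candidates for sparing, while transient ones are left to ECC.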
Conclusion
❖ Quantum computers need dense memory at low temperature
❖ Does DRAM work at cryogenic temperatures?
❖ Experiments show most commodity DRAM chips work at 90K
❖ Error patterns are amenable to existing fault tolerance techniques
Questions?
Want to know more about quantum computers?
Please visit my paper presentation at MICRO 2017, in 2 weeks in Boston!