-
A First-Principles Approach to Opto-Electronic Link Optimization
Krishna Settaluri, Prof. Vladimir Stojanovic
E3S Seminar
Nov 17, 2016
For Internal E3S Use Only. These Slides May Contain Prepublication Data and/or Confidential Information.
-
Outline
• Motivation and importance of optical link communication
• Introduce end-to-end optical link model for optimization
• Fundamental theory-based analysis
• Optimization results
• Schematic-level results
• Next generation performance limits
-
The Wire Problem
Channel Loss =
Power for fixed data rate
-
The Need for High Bandwidth Links In Data Centers
Microsoft
Facebook
Oracle
Increasing demand for high-throughput links
Contributors: High Performance Computing, 5G Mobile Market
-
Silicon Photonics
Luxtera, Oracle, ST
• Tight integration of photonics + CMOS shows promise for next-generation optical interconnects
• Utilize existing CMOS process nodes for production
• Small interconnect capacitance
-
Photonic components: Microring Modulator, Photodiode, Waveguides, Grating Couplers
Waveguides: low-loss on-chip waveguides, ~2 dB/cm loss
Hochberg
Grating couplers: couple light from off-chip to on-chip; edge couplers promise sub-1 dB/coupler loss
[Plot: Transmission (dB), 0 to -16, vs. Wavelength (nm), 1295.6 to 1296.8; curves: Data, Model]
High optical bandwidth (~40GHz) allows fast ON/OFF modulation
High responsivity, ~0.8 A/W; Ge PDs show high BW (120 GHz) [Vivien]
Intel
-
Opportunity for photonic link optimization
What’s preventing better link performance?
What are the fundamental limits to the best-case performance for a given data rate?
Can we arrive in the vicinity of this global optimum quickly?
Photonic microdevice design
Macro-System Performance
-
Photonic Link
Idealized: Digital In → TX Macro (Electrical Data → E/O Conversion) → Channel → RX Macro (O/E Conversion → Electrical Data) → Digital Out
Realistic: Digital In → TX Macro (Electrical Digital Driver → Optical Modulator) → Optical Fiber → RX Macro (Photodiode → I/V Conversion → Gain Stage) → Digital Out
-
Optical Link Model
Current to Voltage Conversion
Transimpedance Amp (TIA)
Voltage Amplification
Rail-to-Rail Output Swing
Digital Input
Digital Output
1. Truly End-to-End: From Digital to Digital
2. Differs from previous link optimization literature
-
Importance of Link Optimization
• Energy split between TX and RX dictates the best link energy/bit
• Extreme 1: All energy in TX, none in RX
• Extreme 2: All energy in RX, none in TX
• Global optimum lies somewhere in the middle and is heavily data-rate dependent
• Noise and swing affect link performance
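The trade-off between the two extremes can be illustrated with a deliberately simple toy model. This is an assumption for illustration only, not the model used in the talk: suppose the TX energy per bit needed to satisfy the receiver falls inversely with the energy spent in the RX, E_tx = k / E_rx. The total then has an interior minimum at E_rx = sqrt(k). The constant k, the function names, and the sweep range are all hypothetical.

```python
import math

def total_energy(e_rx, k):
    """Toy link energy/bit: RX energy plus a TX energy that scales as k / e_rx."""
    return e_rx + k / e_rx

def best_split(k, points=2000, lo=0.01, hi=10.0):
    """Brute-force sweep of RX energy; returns (e_rx, e_total) at the minimum."""
    grid = [lo + i * (hi - lo) / (points - 1) for i in range(points)]
    e_rx = min(grid, key=lambda e: total_energy(e, k))
    return e_rx, total_energy(e_rx, k)

k = 1.0  # hypothetical coupling constant, arbitrary units
e_rx_opt, e_tot_opt = best_split(k)
print(f"optimum RX energy ~ {e_rx_opt:.3f}, total ~ {e_tot_opt:.3f} (arbitrary units)")
```

Both extremes are penalized in this toy: starving the RX makes the TX term blow up, while overspending in the RX raises the total directly.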
-
Optimization Approach
RX Macro: Transimpedance Amplifier (TIA) → N Voltage Amplifiers → M StrongArm Sense Amplifiers
All stages are defined by input FET width, which sets gm, Cox, and Id
RFB, N, and M are also optimization parameters
-
Optimization Flow
Inputs: Data Rate + Technology; Parameterized Topology (N, M, etc.)
Per Stage Bandwidth → Per Stage Gain → End-To-End Gain + Noise → Input Sensitivity → Link Energy/bit
Output: Best topology parameters for given data rate
-
Block-Level BW Requirement
[Diagram: cascade of stages, each with bandwidth fs]
Overall BW requirement: 0.7*fdata
Minimum required per-stage bandwidth depends on the stage count, N
fs is the per-stage BW; fdata is the data rate
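The per-stage expression on this slide did not survive extraction. A common first-order approximation, used here as an assumption rather than as the authors' formula, is that a cascade of n identical single-pole stages has an overall -3 dB bandwidth of fs*sqrt(2^(1/n) - 1). Setting that equal to 0.7*fdata gives a minimum fs that grows with the stage count. The function name and the example data rate are hypothetical.

```python
import math

def per_stage_bandwidth(f_data_hz, n_stages, bw_fraction=0.7):
    """Minimum per-stage -3 dB bandwidth fs for n identical single-pole stages.

    Assumes the cascade bandwidth is fs * sqrt(2**(1/n) - 1) and that the
    overall target is bw_fraction * f_data (0.7 * fdata on the slide).
    """
    if n_stages < 1:
        raise ValueError("n_stages must be >= 1")
    shrink = math.sqrt(2.0 ** (1.0 / n_stages) - 1.0)
    return bw_fraction * f_data_hz / shrink

# Example sweep at a 25 Gb/s data rate
for n in (1, 2, 4):
    fs = per_stage_bandwidth(25e9, n)
    print(f"n = {n}: fs >= {fs / 1e9:.1f} GHz")
```

Whether the stage count in this formula should include the TIA in addition to the N voltage amplifiers is not recoverable from the slide, so n_stages is left as a free parameter.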
-
Prerequisite Technology-Dependent Factors
Load amplifier with replica of itself
fa is technology and topology dependent:
• Common-Source: fa = 0.36*fT for 65nm CMOS (Miller effect)
• Cascode: fa = 0.4*fT for 65nm CMOS (no Miller cap)
β is technology and topology dependent:
• Common-Source: β = 0.29 for 65nm CMOS (Miller effect)
• Cascode: β = 0.4 for 65nm CMOS (no Miller cap)
-
Analog Front-End Chain
[Diagram: cascade of amplifier stages, each with bandwidth fs]
GBW of a stage can now be written in terms of technology metrics and width of input FET:
DC Gain (BW = fs):
Optimization parameters
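The GBW and DC-gain expressions on this slide were lost in extraction, and the sketch below does not attempt to restore them. It only illustrates the constant gain-bandwidth idea under a stated assumption: a stage loaded by a replica of itself has GBW = fa, so its DC gain at per-stage bandwidth fs is about fa / fs. The 0.36 and 0.4 factors come from the preceding slide; fT = 150 GHz is quoted later in the talk and is assumed here to describe the current 65nm platform. The function names, the 27 GHz example bandwidth, and the neglect of stage-to-stage width scaling are hypothetical simplifications.

```python
def stage_dc_gain(f_t_hz, f_s_hz, fa_over_ft=0.36):
    """DC gain of one self-loaded stage under a constant-GBW assumption."""
    f_a = fa_over_ft * f_t_hz  # self-loaded gain-bandwidth product
    return f_a / f_s_hz

def chain_gain(f_t_hz, f_s_hz, n_stages, fa_over_ft=0.36):
    """Gain of n identical cascaded stages (hypothetical helper)."""
    return stage_dc_gain(f_t_hz, f_s_hz, fa_over_ft) ** n_stages

g_cs = stage_dc_gain(150e9, 27e9)            # common-source, fa = 0.36*fT
g_cascode = stage_dc_gain(150e9, 27e9, 0.4)  # cascode, fa = 0.4*fT
print(f"common-source gain ~ {g_cs:.2f}, cascode gain ~ {g_cascode:.2f}")
```

The point of the sketch is the coupling it exposes: raising the required per-stage bandwidth lowers the gain each stage can provide, which is why the stage count N appears as an optimization variable.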
-
TIA Analysis
IPD
Vo
Gain
Bandwidth
A is the inverter gain, gm*ro
Back-calculate the RFB requirement from the per-stage bandwidth
IPD
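The gain and bandwidth expressions on this slide were lost in extraction. The sketch below uses the textbook first-order result for a shunt-feedback TIA built around an amplifier of gain A, which is an assumption about what the slide showed: transimpedance of about RFB*A/(1+A) and bandwidth of about (1+A)/(2*pi*RFB*Cin). Cin (photodiode plus input FET capacitance), the neglect of the amplifier's own output pole, and all numeric values are hypothetical.

```python
import math

def tia_transimpedance(r_fb, a):
    """Low-frequency transimpedance of a shunt-feedback TIA, in ohms."""
    return r_fb * a / (1.0 + a)

def tia_bandwidth(r_fb, a, c_in):
    """First-order -3 dB bandwidth set by the input pole, in Hz."""
    return (1.0 + a) / (2.0 * math.pi * r_fb * c_in)

def rfb_for_bandwidth(f_s, a, c_in):
    """Back-calculate the largest R_FB that still meets per-stage bandwidth f_s."""
    return (1.0 + a) / (2.0 * math.pi * f_s * c_in)

a = 10.0       # hypothetical inverter gain gm*ro
c_in = 20e-15  # hypothetical input capacitance, farads
f_s = 5e9      # hypothetical per-stage bandwidth, Hz
r_fb = rfb_for_bandwidth(f_s, a, c_in)
print(f"R_FB ~ {r_fb / 1e3:.1f} kOhm, R_T ~ {tia_transimpedance(r_fb, a) / 1e3:.1f} kOhm")
```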
-
StrongArm Sense Amplifier
[Schematic labels: Reset FETs, Latching FETs]
Integration Phase: sample the input and integrate the difference on the output cap
Regeneration Phase: cross-coupled inverters activate and regenerate the output to rail-to-rail
Latching Phase: fixed time after regeneration to latch data
-
StrongArm Sense Amplifier #2
[Schematic labels: CP, CQ, CX, CY; waveform markers (1), (2)]
Integration Phase: discharge CP,Q first, then CX,Y
Integration Gain (Vint,final / Vin,diff)
Regeneration Time Constant
Regeneration Time
Total Sampler Gain
-
Putting it all together… Swing Sensitivity
RX Macro: Transimpedance Amplifier (TIA) → N Voltage Amplifiers → StrongArm Sense Amplifiers
[Diagram labels: Rtot, GSA]
Input Sensitivity to Overcome Swing
-
Noise Sensitivity
Major Noise Sources
1. TIA Resistor Noise
2. TIA Transistor Noise
3. AFE Transistor Noise
4. Sampler Noise
Input referred resistor noise
Input referred TIA FET noise
Input referred AFE noise
Sampler Voltage Noise (approximation from Nuzzo’s work)
-
Total Rx Sensitivity
Total Link Energy Per Bit:
Path inefficiencies (couplers, laser efficiency, modulator loss)
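The energy-per-bit expression itself was lost in extraction. As a rough sketch of how path inefficiencies enter, under stated assumptions: the receiver sensitivity is treated as a required photocurrent, the optical power at the photodiode is that current divided by responsivity, and the laser wall-plug power is that optical power divided by the path transmission and the wall-plug efficiency. The default loss and efficiency numbers are the platform values quoted elsewhere in the talk; the coupler count of two, the neglect of extinction ratio, and the example sensitivity and circuit powers are hypothetical.

```python
def db_to_linear(loss_db):
    """Convert a loss in dB to a linear power transmission factor (<= 1)."""
    return 10.0 ** (-loss_db / 10.0)

def laser_energy_per_bit(i_sens_a, f_data_hz, responsivity=0.8,
                         coupler_loss_db=3.5, n_couplers=2,
                         modulator_il_db=5.0, wall_plug_eff=0.10):
    """Laser wall-plug energy per bit needed to deliver i_sens_a at the PD."""
    p_pd = i_sens_a / responsivity                       # optical power at PD, W
    path_loss_db = n_couplers * coupler_loss_db + modulator_il_db
    p_laser_optical = p_pd / db_to_linear(path_loss_db)  # optical power at laser
    p_laser_electrical = p_laser_optical / wall_plug_eff
    return p_laser_electrical / f_data_hz

def link_energy_per_bit(i_sens_a, f_data_hz, p_rx_w, p_tx_driver_w=0.0, **path):
    """Total energy/bit: RX and TX driver circuits plus the laser contribution."""
    e_circuits = (p_rx_w + p_tx_driver_w) / f_data_hz
    return e_circuits + laser_energy_per_bit(i_sens_a, f_data_hz, **path)

# Hypothetical operating point: 10 uA sensitivity, 1 mW RX power, 5 Gb/s
e_bit = link_energy_per_bit(10e-6, 5e9, p_rx_w=1e-3)
print(f"link energy ~ {e_bit * 1e12:.2f} pJ/bit")
```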
-
Link Parameters and Optimization Variables
-
Simulation Results – Restrict # of samplers to 1 (M=1)
65nm Heterogeneous Integration Platform
• 10% wall plug efficiency
• 3.5dB/coupler loss
• 0.8A/W responsivity
• 5dB Modulator IL
-
Simulation Results – Unrestricted # of samplers
-
Simulation-Inspired Points and Circuits
FET sizings and bias currents are output from simulation
Schematic design is based on these sizings and bias currents
Minimal size adjustment for proper DC operating points
65nm Heterogeneously Integrated CMOS
Low Data Rate Regime: 5Gbps RX; 5Gbps RX w/ CTLE
High Data Rate Regime: 25Gbps DDR/QDR; 25Gbps QDR w/ Switching
-
5Gbps RX
• Double Data Rate Receiver
• Pseudo-Differential Architecture
• Single gain stage following the TIA
-
5Gbps Results
As expected, front-end noise sensitivity dominates
Noise Sensitivity Breakdown: RFB, TIA FET, Other
50% of total noise from RFB
-
How to reduce RFB noise contribution?
Noise power from RFB is inversely proportional to RFB, but increasing RFB reduces BW
Solution: Keep RFB large, while retaining BW with CTLE
[Plot labels: TIA Bandwidth, Noise]
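The RFB trade-off follows from the thermal noise of a resistor, whose current noise density is 4kT/R. A minimal sketch, assuming a fixed noise bandwidth and room temperature (both hypothetical choices, as are the resistor values):

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def rfb_noise_current_rms(r_fb, noise_bw_hz, temp_k=300.0):
    """RMS thermal noise current of R_FB integrated over noise_bw_hz, in amps."""
    return math.sqrt(4.0 * K_B * temp_k * noise_bw_hz / r_fb)

i_small = rfb_noise_current_rms(2e3, 5e9)  # 2 kOhm feedback resistor
i_large = rfb_noise_current_rms(8e3, 5e9)  # 8 kOhm feedback resistor
print(f"2 kOhm: {i_small * 1e6:.2f} uA rms, 8 kOhm: {i_large * 1e6:.2f} uA rms")
```

Quadrupling RFB cuts the noise power by four and the RMS noise current by two. In a first-order TIA the same change would also cut the bandwidth by about four, which is the loss the CTLE is meant to recover.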
-
5 Gbps Rx With CTLE
[Diagram: TIA followed by CTLE; zero marked]
Increase RFB and compensate with follow-on zero
Result: Large RFB and High Bandwidth
-
5Gbps Rx With CTLE Results
Swing and Noise Sensitivities are Comparable
RFB contribution lowered
Noise Sensitivity Breakdown: RFB, TIA FET, Other
15% of total noise from RFB
-
25 Gbps DDR/QDR
High Data Rate Regime
M = 2 for DDR
M = 4 for QDR
-
25Gbps DDR/QDR Results
Going from M=2 to M=4 increases sampler gain exponentially.
Sampler swing requirement is more than an order of magnitude smaller.
Rx power constant; Tx power drastically reduced
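The exponential dependence comes from latch regeneration, which grows as exp(t/tau). A sketch under stated assumptions: each of the M interleaved samplers runs at fdata/M and spends a fixed fraction of its cycle regenerating. The time constant tau, the regeneration fraction, and the function name are hypothetical, not values from the talk.

```python
import math

def regeneration_gain(f_data_hz, m, tau_s, regen_fraction=0.5):
    """Regeneration gain exp(t_regen / tau) for one of m interleaved samplers."""
    t_cycle = m / f_data_hz              # each sampler runs at f_data / m
    t_regen = regen_fraction * t_cycle   # time available for regeneration
    return math.exp(t_regen / tau_s)

g_ddr = regeneration_gain(25e9, 2, tau_s=10e-12)  # M = 2
g_qdr = regeneration_gain(25e9, 4, tau_s=10e-12)  # M = 4
print(f"DDR gain ~ {g_ddr:.0f}, QDR gain ~ {g_qdr:.0f}, ratio ~ {g_qdr / g_ddr:.0f}")
```

With these illustrative numbers, doubling M doubles the regeneration time and multiplies the gain by exp(4), so the input swing the sampler needs shrinks by more than an order of magnitude.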
-
25Gbps QDR w/ Switching
Alleviate swing requirement by loading 1 sampler
AFE “sees” loading from 1 sampler only
Imperfect junction caps are not taken into account; this is an accurate assumption for a reasonable # of samplers
• Reduce load capacitance
• Increase AFE gain
• Reduce sampler noise
-
25Gbps QDR w/ Switching Results
Sampler noise contribution decreases
Overall sensitivity improved
-
Future Technology Performance?
Order of magnitude improvement with realistically better photonics
Performance dominated by “weakest link” technology
Improved photonics – CMOS dominant performance
Current Platform: Wall plug efficiency: 10%; Coupler Loss: 3.5dB; PD Responsivity: 0.8 A/W; Modulator IL: 5dB
Best Photonics: Wall plug efficiency: 30%; Coupler Loss: 1dB; PD Responsivity: 0.8 A/W; Modulator IL: 3dB
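A quick back-of-the-envelope check of the photonic path alone, using the two parameter sets listed on this slide. The number of couplers in the path is not stated, so two is assumed here; the function name is hypothetical. Responsivity is the same in both cases and drops out of the ratio.

```python
def path_efficiency(wall_plug_eff, coupler_loss_db, modulator_il_db, n_couplers=2):
    """Optical power delivered to the PD per watt of laser wall-plug power."""
    total_loss_db = n_couplers * coupler_loss_db + modulator_il_db
    return wall_plug_eff * 10.0 ** (-total_loss_db / 10.0)

current = path_efficiency(0.10, 3.5, 5.0)  # Current Platform
best = path_efficiency(0.30, 1.0, 3.0)     # Best Photonics
improvement = best / current
print(f"laser power reduction ~ {improvement:.1f}x")
```

Under the two-coupler assumption the laser power for a given receiver sensitivity drops by roughly 15x, which is in line with the order-of-magnitude claim on the slide.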
-
Future Technology Performance?
Current Platform | Best Photonics | Best Photonics + CMOS
Order of magnitude improvement with better CMOS
Improved photonics + CMOS – power struggle; equal split in performance
fT = 1THz; fT = 150GHz
-
Concluding remarks
Introduced fundamentals-motivated analysis of optical links
Validated analysis with schematic-level results
Presented next-generation performance metrics and their importance to link performance
Next step… Layout automation and generation to provide realistic performance numbers
-
Acknowledgments
• Work done in collaboration with Chris Keraly, Eli Yablonovitch
• Yue Lu and Prof Elad Alon for helpful discussions
• teamVlada for helpful discussions
• DARPA, E3S for funding and resources