Foundation Required for Novel Compute (FRANC) Dr. Daniel S. Green, Program Manager DARPA/MTO September 15, 2017 Distribution Statement “A” (Approved for Public Release, Distribution Unlimited)

  • Foundation Required for Novel Compute (FRANC)

    Dr. Daniel S. Green, Program Manager

    DARPA/MTO

    September 15, 2017

    Distribution Statement A (Approved for Public Release, Distribution Unlimited)

  • 2

    Electronics Resurgence Initiative: Materials

    ERI: creating an electronics capability that will provide a foundational contribution to national security

    Three thrust areas: Materials and Integration, Architectures, Designs

  • 3

    Today's Topic

    Foundation Required for Novel Compute (FRANC)

    FRANC: foundation required for assessing and establishing the proof of principle for beyond von Neumann computing architectures.

  • The Opportunity: Materials

    Materials have underpinned Moore's Law from the start and continue to present opportunities

    [Figure: device images labeled 1946 and 2012]

    Image sources: Sam Sattel, "What is a transistor: The World of Modern Electrons"; M. Veldhorst et al., "Experimental realization of superconducting quantum interference devices with topological insulator junctions," Applied Physics, Feb 2012.

  • ...and Integration

    At the same time, heterogeneous integration has advanced and allowed a faster, flexible mix of materials

    [Figure: 300 mm diameter Si CMOS wafer combining Si (45 nm), InP (TF5 HBT), and GaN (GaN20 HEMT). Source: DAHI Program]

  • 6

    The Challenge: Beyond von Neumann Computing

    Data for a 7 nm instantiation of a state-of-the-art machine learning accelerator (data from S. Mitra of Stanford):

    Workload                    Compute   Memory
    AlexNet (CNN)               15%       85%
    ResNet-152 (CNN)            20%       80%
    Neural Programmer (LSTM)    8%        92%

    The current von Neumann architecture spends more time moving data than processing it

    Accelerators don't help (enough) if they use the same architecture
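
    The compute/memory split on this slide can be turned into a rough Amdahl's-law estimate of why faster compute alone helps so little. This is a minimal sketch, assuming the smaller percentage for each workload is the compute share and the larger is the memory share of total execution cost, and that an accelerator speeds up only the compute share. The function name `overall_speedup` and the 10x factor are illustrative and do not come from the slides.

```python
def overall_speedup(compute_share: float, compute_speedup: float) -> float:
    """Amdahl's-law estimate when only the compute share gets faster."""
    if not 0.0 <= compute_share <= 1.0:
        raise ValueError("compute_share must be between 0 and 1")
    if compute_speedup <= 0.0:
        raise ValueError("compute_speedup must be positive")
    memory_share = 1.0 - compute_share  # left untouched by the accelerator
    return 1.0 / (memory_share + compute_share / compute_speedup)


# Compute shares as read from the slide (assumed to be compute, not memory).
COMPUTE_SHARE = {
    "AlexNet (CNN)": 0.15,
    "ResNet-152 (CNN)": 0.20,
    "Neural Programmer (LSTM)": 0.08,
}

for name, share in COMPUTE_SHARE.items():
    with_10x = overall_speedup(share, 10.0)
    ceiling = 1.0 / (1.0 - share)  # limit as compute becomes free
    print(f"{name}: 10x compute gives {with_10x:.2f}x overall, ceiling {ceiling:.2f}x")
```

    Under these assumptions even infinitely fast compute cannot reach 1.3x on any of the three workloads, which is the slide's point: the memory side has to change too.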

  • Rethinking Our Approach

    [Figure: diagram with labels Integration, Materials, Beyond von Neumann, Processing, Memory, ALU, L1 Cache, L2 Cache, DRAM. Image credit: Shutterstock.com]

  • 8

    The story so far

  • 9

    FRANC builds from a rich base

    Traditional programs currently funded (timeline axis 2016 to 2019, with a "Today" marker), plus many earlier efforts:

    Program   Focus                       Milestone
    N-ZERO    Ultra-low-power design      Kickoff, 11/2015
    CRAFT     Reduced design time         Kickoff, 4/2016
    CHIPS     Pseudolithic design         Approved, 6/2016
    JUMP      Broad university support    Approved, 8/22/2016
    L2M       In-field machine learning   Approved, 1/2017
    HIVE      Graph processing            Kickoff, 4/2017
    SSITH     Built-in security           BAA released, 4/2017

    FRANC aims to leverage, not repeat, its predecessors

  • 10

    STARNet research demonstrated benefits of beyond von Neumann topologies

    [Figure: Performance Comparison (higher is better)]

    Configurations compared:
    - Xeon+HMC: quad Xeon and Hybrid Memory Cube
    - NM Atom: 16 Intel Atom cores
    - External Acc: 16 external accelerators, SerDes-linked
    - NM Acc: near-memory accelerator

    Key result: near-memory processing provides dramatic performance and energy improvements

    Source: S.F. Yitbarek, et al., DATE 2016 (U. Mich, UCSD).

  • 11

    UPSIDE saw potential for unconventional computing

    UPSIDE Insight #1: exploit new materials and physics for fast, low-power computation.

    Front-end filtering (edge detection):
    1. Take an input image patch of 3x3 pixels.
    2. Map the pixels into coupled oscillators.
    3. The oscillators relax to a unique energy state.
    4. Compare the final energy against a library of learned features (E1, E2, E3, E4), e.g., best match: Ex = E3.
    5. Step and repeat to identify all edges (shown in red).
    6. Final result: filtered image.

    [Figure labels: magnetic coupling between STNOs; Ipixel, based on pixel intensity; Ibias]

    Video: https://www.youtube.com/watch?v=RVq56e9MzF4
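
    The matching and step-and-repeat stages of this pipeline can be sketched in software. This is only an illustration of the control flow, under stated assumptions: the slides do not say how the oscillator energy is computed, so `toy_energy` is a made-up stand-in for the hardware, and the library values, names, and stride of one pixel are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Patch = List[Sequence[float]]  # three rows of three pixel intensities


def best_match(energy: float, library: Dict[str, float]) -> str:
    """Return the learned feature whose stored energy is closest to `energy`."""
    return min(library, key=lambda name: abs(library[name] - energy))


def toy_energy(patch: Patch) -> float:
    """Stand-in for the oscillator array: left-to-right intensity change.

    Not a physical model; the real energy comes from the coupled oscillators.
    """
    return sum(abs(row[2] - row[0]) for row in patch)


def classify_patches(
    image: Sequence[Sequence[float]],
    patch_energy: Callable[[Patch], float],
    library: Dict[str, float],
) -> List[List[str]]:
    """Step a 3x3 window over the image and label each position."""
    rows, cols = len(image), len(image[0])
    labels = []
    for r in range(rows - 2):
        row_labels = []
        for c in range(cols - 2):
            patch = [row[c:c + 3] for row in image[r:r + 3]]
            row_labels.append(best_match(patch_energy(patch), library))
        labels.append(row_labels)
    return labels


# Hypothetical two-entry library: a flat region and a vertical edge.
LIBRARY = {"flat": 0.0, "edge": 6.0}
IMAGE = [[0, 0, 2, 2], [0, 0, 2, 2], [0, 0, 2, 2]]
print(classify_patches(IMAGE, toy_energy, LIBRARY))
```

    In the hardware described on the slide, the energy evaluation is what the oscillators do physically; only the library lookup and the window stepping resemble this code.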

  • 12

    ...and demonstrated benefits

    Memristor chip: using the physics of an emerging device for classification.

    Breakthrough device features:
    - 200 nm² cross-section (EUV lithography)
    - > 4 orders of magnitude on/off ratio @ 0.1 V
    - 10 years projected retention
    - Low switching voltage (~ +/-1.5 V)
    - CMOS compatible

    First demonstration: three-object pattern recognition implemented directly in a memristor crossbar array

    [Figure: 12x12 memristor array; single crossbar (Xbar)]

    While compelling demonstrations exist, a plan for general use is not clear
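
    A crossbar array can classify "directly in the device" because, in the idealized case, it computes a vector-matrix product in the analog domain: each cell passes a current equal to its conductance times the row voltage (Ohm's law), and each column wire sums those currents (Kirchhoff's current law). The sketch below shows only that general principle. It is not the demonstrated chip's method, which the slides do not describe, and the conductance values and the winner-take-all readout are assumptions.

```python
from typing import List, Sequence


def crossbar_currents(
    voltages: Sequence[float], conductances: Sequence[Sequence[float]]
) -> List[float]:
    """Ideal column currents: I_j = sum_i V_i * G[i][j]."""
    if len(voltages) != len(conductances):
        raise ValueError("one input voltage is needed per crossbar row")
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def classify(voltages: Sequence[float], conductances: Sequence[Sequence[float]]) -> int:
    """Pick the column (class) with the largest current; an assumed readout."""
    currents = crossbar_currents(voltages, conductances)
    return max(range(len(currents)), key=currents.__getitem__)


# Hypothetical 4-input, 3-class conductance matrix (arbitrary units).
G = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
print(classify([0.0, 1.0, 0.0, 1.0], G))
```

    Real arrays add wire resistance, device variation, and sneak paths, none of which this ideal model captures.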

  • 13

    CHIPS seeking a pathway for modularity

    CHIPS modularity aims to overcome the limitation of ASIC/monolithic solutions

    - Access to commercial IP: memory, SerDes, processors
    - Reusable function blocks: QR decomposition, waveforms, FFT
    - Big data movement: image processing, machine learning, high-speed chiplet networks

  • 14

    CHIPS developing interface standard

    Interface standard metrics (CHIPS Program):
    - Data rate: 10 Gbps
    - Energy efficiency: < 1 pJ/bit
    - Latency: < 5 ns
    - Bandwidth density: > 1000 Gbps/mm

    [Figure: CHIPS target compared with prior interfaces. Labels: ground-referenced, EMIB, co-ax, HBM, SerDes, differential, single-ended]

    Sources:
    1. 2016 JSSC, Dehlaghi
    2. 2013 JSSC, Poulton
    3. 2012 JSSC, Dickson
    4. 2013 JSSC, Mansuri
    5. 2016 ECTC, Mahajan

    CHIPS interface is one of many possible routes for efficient inter-die communications
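
    A little arithmetic shows what the interface metrics imply when taken together. This is a back-of-the-envelope sketch, assuming the 10 Gbps figure is a per-lane data rate and the bandwidth density is measured per millimetre of die edge in a single row of lanes; the slide states neither. The function names are illustrative.

```python
DATA_RATE_GBPS = 10.0                   # assumed to be per lane
ENERGY_PJ_PER_BIT = 1.0                 # upper bound from the slide
BANDWIDTH_DENSITY_GBPS_PER_MM = 1000.0  # lower bound from the slide


def lanes_per_mm(density_gbps_per_mm: float, lane_rate_gbps: float) -> float:
    """Lanes needed along each millimetre of die edge to reach the density."""
    return density_gbps_per_mm / lane_rate_gbps


def lane_pitch_um(density_gbps_per_mm: float, lane_rate_gbps: float) -> float:
    """Lane pitch in micrometres if all lanes sit in a single row."""
    return 1000.0 / lanes_per_mm(density_gbps_per_mm, lane_rate_gbps)


def link_power_w(bandwidth_gbps: float, energy_pj_per_bit: float) -> float:
    """Power drawn when moving `bandwidth_gbps` at the given energy per bit."""
    bits_per_second = bandwidth_gbps * 1e9
    joules_per_bit = energy_pj_per_bit * 1e-12
    return bits_per_second * joules_per_bit


print(lanes_per_mm(BANDWIDTH_DENSITY_GBPS_PER_MM, DATA_RATE_GBPS))
print(lane_pitch_um(BANDWIDTH_DENSITY_GBPS_PER_MM, DATA_RATE_GBPS))
print(link_power_w(BANDWIDTH_DENSITY_GBPS_PER_MM, ENERGY_PJ_PER_BIT))
```

    Under these assumptions the targets work out to about 100 lanes per millimetre (a pitch of roughly 10 micrometres) and about 1 W for every millimetre of edge running at full bandwidth.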

  • 15

    What is FRANC? Technical Areas

  • 16

    FRANC Technical Area 1 (TA-1): New topology circuit prototypes

    Realize circuit prototypes that demonstrate beyond von Neumann topologies
    - Leverage emerged materials and/or integration technologies
    - Integrate processing/memory to create revolutionary capabilities
    - Increase over SOA by >10x

    Image source: http://dclifecounseling.com/storage/apple%20oranges.jpg?__SQUARESPACE_CACHEVERSION=1450919665411

  • 17

    TA-1: Workload Matters

    Simulation study of execution time for varying workloads

    Architectures:
    - Conv-DDR3 = conventional processor, DDR3 memory
    - Conv-3D = conventional processor, 3D stacked DRAM
    - Base-NDP = Near Data Processing
    - NDP = Base-NDP plus communication and coherence support

    FRANC metrics are proposer defined, BUT:
    - should show the breadth of applicability
    - should be compared to relevant state of the art
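
    One way to report both breadth of applicability and a comparison against the state of the art is to give per-workload speedups together with their geometric mean and the worst case. The sketch below is an assumption about how a proposer might summarize results, not a FRANC requirement; the workload names and timings in the usage example are invented placeholders.

```python
import math
from typing import Dict, Tuple


def summarize_speedups(
    baseline_s: Dict[str, float], candidate_s: Dict[str, float]
) -> Tuple[Dict[str, float], float, str]:
    """Return per-workload speedups, their geometric mean, and the worst workload.

    Both arguments map workload name to execution time in seconds.
    """
    speedups = {name: baseline_s[name] / candidate_s[name] for name in baseline_s}
    log_sum = sum(math.log(value) for value in speedups.values())
    geomean = math.exp(log_sum / len(speedups))
    worst = min(speedups, key=speedups.get)
    return speedups, geomean, worst


# Placeholder timings, for illustration only.
baseline = {"workload_a": 10.0, "workload_b": 4.0}
candidate = {"workload_a": 1.0, "workload_b": 4.0}
per_workload, geomean, worst = summarize_speedups(baseline, candidate)
print(per_workload, round(geomean, 2), worst)
```

    The geometric mean keeps one favorable workload from dominating the summary, and naming the worst case shows where the approach does not help.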

  • 18

    Example #1 of Benchmarks: PNNL's PERFECT Suite

    The Pacific Northwest National Laboratory (PNNL) defined a suite of benchmarks that are intended to represent computing-constrained embedded applications.

    Benchmarks are specified as C code.

    Application Domain               Kernels
    PERFECT Application 1            Discrete Wavelet Transform; 2D Convolution; Histogram Equalization
    Space-Time Adaptive Processing   System Solver; Inner Product; Outer Product
    Synthetic Aperture Radar         Interpolation 1; Interpolation 2; Back Projection
    Wide Area Motion Imaging         Deblur / Debayer; Image Registration; Change Detection
    Kernels                          Sort; FFT 1D; FFT 2D

    Note: citation of this benchmark suite serves only as an example of a standard set of benchmarks. This suite is not endorsed by DARPA.

    See http://hpc.pnnl.gov/PERFECT/

  • 19

    Example #2 of Benchmarks: The SEAK Suite

    SEAK: Suite of Embedded Applications and Kernels

    The SEAK suite is intended to define computing problems that are of interest to DARPA and the DoD.

    Benchmarks are specified as black-box transformations. Specifically, the benchmark is defined as an input dataset and a correctness test for the output. Hardware and software are unconstrained.

    See http://hpc.pnl.gov/SEAK/files/SEAK-Specification.pdf
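
    The black-box idea (an input dataset plus a correctness test, with the implementation left unconstrained) can be sketched as a small harness. This is a hypothetical illustration and not the SEAK specification's format: the class, the `run` function, and the toy sorting benchmark are all invented here, and sorting is not claimed to be a SEAK kernel.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class BlackBoxBenchmark:
    """A benchmark defined only by its inputs and a test on the outputs."""
    name: str
    inputs: Sequence[float]
    is_correct: Callable[[Sequence[float], Sequence[float]], bool]


def run(
    benchmark: BlackBoxBenchmark,
    implementation: Callable[[Sequence[float]], Sequence[float]],
) -> bool:
    """Run any implementation and judge it solely by the correctness test."""
    outputs = implementation(benchmark.inputs)
    return benchmark.is_correct(benchmark.inputs, outputs)


def sorted_check(inputs: Sequence[float], outputs: Sequence[float]) -> bool:
    """Correctness test for the toy benchmark: output is the sorted input."""
    return list(outputs) == sorted(inputs)


TOY_SORT = BlackBoxBenchmark("toy-sort", (3.0, 1.0, 2.0), sorted_check)

print(run(TOY_SORT, sorted))  # an implementation that sorts
print(run(TOY_SORT, list))    # an implementation that only copies
```

    Because the harness never inspects how `implementation` works, the same benchmark can score conventional software, an accelerator driver, or a simulator of a new topology.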

    Application Domain    Kernels
    Acoustics             Automatic Speech Recognition
    Radio                 Synthetic Aperture Radar Image Formation; Synthetic Aperture Radar Target Detection; Space-Time Adaptive Processing Signal Formation; Space-Time Adaptive Processing Target Detection
    Image                 Image Registration; Multisensor Image Fusion; Face Detection; Text Image Classification; Natural Image Classification
    Hyperspectral Image   Signature Extraction; Target Detection

    Note: citation of this benchmark suite serves only as an example of a standard set of benchmarks. This suite is not endorsed by DARPA.

  • 20

    FRANC Technical Area 1b (TA-1b): Accelerators

    FRANC allows for circuit prototypes that encompass accelerators that leverage new materials or integration technology

    More modest effort than full TA-1 to demonstrate performance benefits of the accelerator approach

    Performance benefits must be quantified, e.g.:
    - What is improved? e.g., energy-delay or some other metric
    - What is enabled? e.g., low-power computation
    - In what computational domain does it excel? e.g., image processing
    - What is the overhead of using the approach? e.g., time to set up the processes
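
    Two of the quantities named on this slide, an energy-delay metric and the setup overhead, lend themselves to a short worked example. This is a minimal sketch under simple assumptions: the energy-delay product is taken as energy times delay, setup cost is paid once, and per-item times are constant. The function names and the numbers in the usage lines are illustrative only.

```python
import math
from typing import Optional


def energy_delay_product(energy_j: float, delay_s: float) -> float:
    """Energy-delay product in joule-seconds; lower is better."""
    return energy_j * delay_s


def break_even_items(
    setup_s: float, baseline_per_item_s: float, accel_per_item_s: float
) -> Optional[int]:
    """Smallest item count n for which setup + n * accel < n * baseline.

    Returns None when the accelerator is not faster per item, because the
    one-time setup cost can then never be recovered.
    """
    saving_per_item = baseline_per_item_s - accel_per_item_s
    if saving_per_item <= 0.0:
        return None
    return math.floor(setup_s / saving_per_item) + 1


# Illustrative numbers only.
print(energy_delay_product(energy_j=2.0, delay_s=3.0))
print(break_even_items(setup_s=1.0, baseline_per_item_s=0.5, accel_per_item_s=0.25))
```

    Reporting the break-even point alongside the headline metric answers the slide's overhead question directly: it states how much work must be batched before the accelerator pays for its own setup.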

  • 21

    FRANC Technical Area 2 (TA-2): Building Blocks

    New materials and/or integration technology that enables 2.5D and 3D integrated solutions beyond von Neumann
    - Component technologies supporting new computing topologies
    - Technologies with a fast path to commercialization; cost share required

    Identified interest areas:
    - Accelerating material discovery
    - Non-volatile memory (NVM)
    - Power management for ICs
    - Chip-scale photonic components

    TA-2 seeks to enable integrated solutions supporting beyond von Neumann computing topologies

  • 22

    Accelerating Materials Discovery and Development

    Example: materials discovery with machine learning

    [Figure panels: Design of Experiments; Combinatorial Processing; Machine Learning; State-of-the-art process development]

    Sources: Intel; Intermolecular, Inc.; L. Ward, et al., "A General-Purpose Machine Learning Framework for Predicting Properties of Inorganic Materials," Northwestern U.

  • 23

    New Physics for Non-Volatile Memory

    Examples: emerging opportunities for new devices
    - Ferroelectric tunnel junction [6]
    - Carbon memory [7]
    - Mott memory [8]
    - Molecular memory [9]

    [Figure labels: On, Off]

    References: [5] T.P. Ma, ERD Memory Workshop, 2014; [6] Wen et al., Nature, 2013; [7] Kreupl, ERD Memory Workshop, 2014; [8] X. Hong, ERD Memory Workshop, 2014; [9] V. Zhirnov, ERD Memory Workshop, 2014

  • 24

    Power Management for ICs

    Example: voltage regulators (VR) are central to efficient computing

    VR efficiency is challenged by high current and high dynamic range ... and transient switching!

    Better components (e.g., inductors) have been demonstrated

    Sources: http://www.electronicdesign.com/, http://www.cs.columbia.edu

  • 25

    Chip scale photonic components

    Example: photonics have been shown to provide performance

    [Figure labels: Multi-core CPU Chip; Intrachip Photonic Communication Layer; Optical Link (Fiber/Waveguide); Memory Module (DIMM); Multiple stacked DRAM chips; DRAM; PDRAM; Photonic Communication deep into DRAM]

    PDRAM: Photonic DRAM
    POEM = Photonically Optimized Embedded Microprocessor

    - CMOS Photonics: CMOS-foundry-compatible photonic technology for seamless, optimized communications within and between advanced multi-core CPU chip and memory chip
    - DRAM Photonics: DRAM-foundry-compatible photonic technology for photonic-RAM (PDRAM) to enable high-bandwidth, fast-access, low-power, and high-capacity DRAM
    - Optimization: new embedded processor architectures to optimize performance by fully leveraging technology development efforts

    Photonics building blocks for complete chip-scale photonic interconnects are not complete

  • 26

    FRANC timeline

    Track                             Phase 1              Phase 2            Phase 3
    New Topology Circuit Prototypes   Preliminary Design   Detailed Design    Functioning Prototype
    Building Blocks                   Component Spec       Component Design   Functioning Prototype

    [Timeline labels on each track: 6 Months; 24 Months; T&I Plans. Additional label: Down Selects]

    FRANC timeline is designed to accelerate getting projects underway
    - Initial phase is intended for a small team to complete preliminary analysis
    - Phases 2 and 3 are expected to contain the bulk of the proposed effort

  • 27

    Proposal Evaluation Criteria and Proposal Guidance

  • 28

    The following pages contain selected tips on how to satisfy the evaluation criteria

    Read the BAA for complete guidance on technical and cost proposals

    Specific proposal evaluation criteria include:

    1. Overall scientific and technical merit
    2. Potential contribution and relevance to the DARPA mission
    3. Impact on the overall electronics landscape
    4. Cost realism

    Proposal Evaluation Criteria

  • 29

    The proposed technical approach is innovative, feasible, achievable, and complete.

    A specific technical solution path is proposed along with arguments why the proposed approach is expected to be successful. Analysis and trades are presented explaining why the proposed approach was selected and why alternatives were not proposed.

    Task descriptions and technical elements are complete and in a logical sequence leading to an endpoint supporting FRANC goals. Proposed task elements have measurable milestones that will aid DARPA in tracking progress. Deliverables and cross-performer interfaces are clearly defined and support the FRANC Program structure.

    The proposal identifies major technical risks and includes planned risk mitigation efforts.

    1. Overall Scientific and Technical Merit

  • 30

    The potential contributions of the proposed effort are relevant to the national technology base. Specifically, DARPA's mission is to make pivotal early technology investments that create or prevent strategic surprise for U.S. National Security.

    Proposal tip: consider the perspective of the proposal reviewer who must describe how your proposal contributes to the DARPA mission

    2. Potential Contribution and Relevance to the DARPA Mission

  • 31

    The purpose of ERI is to provide sustainable performance scaling
    - Having a one-shot advantage is likely not compelling. Performance increases should outlive the program.
    - Providing new underlying technological capability is useful, e.g., ways to rapidly discover and develop new materials
    - Combining traditional and non-traditional approaches

    3. Impact on the Overall Electronics Landscape

  • 32

    The proposed staffing and schedule are consistent with the proposed tasking and technical milestones.

    The proposed costs are realistic for the technical and management approach and accurately reflect the technical goals and objectives of the solicitation. The proposed costs are consistent with the proposer's Statement of Work and reflect a sufficient understanding of the costs and level of effort needed to successfully accomplish the proposed technical approach. The costs for the prime proposer and proposed subcontractors are substantiated by the details provided in the proposal.

    The proposal identifies major cost and schedule risks and includes planned risk mitigation efforts.

    Ensure editable spreadsheets are delivered as part of the cost volume

    4. Cost Realism

  • 33

    Technologies close to commercialization are particularly sought, where DARPA can partner to bring new technologies to reality

    Significant cost share is encouraged to ensure commitment to develop the technology

    Cost share will be considered in evaluating technology that has potential for significant commercial impact.

    Cost share, if any, is included in the cover sheet of the proposal

    Cost share is required on TA-2

    A note about cost share

  • 34

    Proposal Tips 1

    a) Describe what is novel in your approach and why it will lead to significant, sustainable performance advantages in computing.
    - Provide a specific approach rather than a menu of potential approaches.
    - How does it benchmark against traditional approaches? Under what conditions does it excel? Where is it lacking?

    b) Describe the innovations required to realize the above approach and their feasibility.
    - Materials, methods, reproducibility, packaging, handling, storage, etc.
    - Define measurable milestones by program phase.

    c) Describe your scaling strategy to meet the program final goals and beyond.
    - Demonstrate quantitatively through modeling how the approach will meet and exceed the program goals through scaling.
    - Consider your approach through all phases and how the approach scales in performance throughout the program and beyond.

    d) Describe any partnering strategies needed to achieve success

  • 35

    Proposal Tips 2

    e) Provide sufficient technical detail, simulation data, etc. for us to make a technical conclusion

    f) Place your technology in context
    - No technology can solve all problems. Where will it make a difference?

    g) Provide separate costing for each phase
    - Each phase is a new funding option that may or may not be exercised. If you provide one overall cost, it is all or nothing, and likely nothing will be the result.
    - Provide costing by major subtasks in each phase. Sometimes specific tasks are valuable to fund separately. If they are not costed separately, they cannot be funded separately.

  • 36

    FAQ questions sent to [email protected]
    - DO NOT send proprietary information for FAQ.
    - All answers given will be published before the BAA due date. Don't ask if you don't want it published.

    Miscellaneous

    Important Dates

    BAA Release 13-Sept-2017

    Proposer Day 15-Sept-2017

    FAQ Deadline 23-Oct-2017

    Proposals Due 6-Nov-2017

    Program Kick-Off Early 2018

  • 37

    Contracting Time

  • www.darpa.mil
