SLAAC: Systems Level Applications of Adaptive Computing

DARPA/ITO Adaptive Computing Systems PI Meeting, Napa, California, April 13-14. Presented by: Bob Parker, Deputy Director, Information Sciences Institute


Page 1: SLAAC Systems Level Applications of Adaptive Computing

SLAAC: Systems Level Applications of Adaptive Computing

DARPA/ITO Adaptive Computing Systems PI Meeting

Napa, California

April 13-14

Presented by: Bob Parker, Deputy Director, Information Sciences Institute

Page 2: SLAAC Systems Level Applications of Adaptive Computing

04/19/23 2 Bob Parker, USC INFORMATION SCIENCES INSTITUTE

System Level Applications of Adaptive Computing

Team Members: USC/ISI (Lead), BYU, UCLA, Sandia National Labs

Significant reduction in power, weight, volume, and cost for several challenging DoD embedded applications
• SAR ATR
• Sonar Beamforming
• IR ATR
• Others

Utilizing Three Phases of Adaptive Computing Components
• Large Current Generation FPGAs
• Rapid Reconfigurable and/or Fine Grain FPGAs
• Hybrid FPGAs

Integrating Multiple Constituent Technologies
• Scalable Embedded Baseboard
• Gigabit/Sec Networking
• Modular Adaptive Compute Modules
• Smart Network Based Control Software
• Algorithm Mapping Tools

Developing Reference Platforms
• Flight Worthy Deployable System
• Low Cost Researchers Kit

[Figure: milestone timeline spanning ‘97, ‘98, ‘99, ‘00, ‘01. Milestones shown:]
• Lab Demo of an ACS implemented SAR ATR algorithm
• Embedded SAR ATR Demo of ACS HW (Clear, 1 Mpixel/s, 6TT); First Generation of Reference Platforms
• Embedded SAR ATR Demo (CC&D, 1 Mpixel/s, 6TT)
• Embedded SAR ATR Demo (CC&D, 10 Mpixel/s, 6TT)

Page 3: SLAAC Systems Level Applications of Adaptive Computing

SLAAC Objectives

Define a system-level open, distributed heterogeneous adaptive systems architecture

Design, develop and evolve scalable reference platforms implementing the adaptive systems architecture

Validate the approach by deploying reference platforms in multiple defense application domains
• SAR ATR
• Sonar Beamforming
• IR ATR
• Others

Page 4: SLAAC Systems Level Applications of Adaptive Computing

SLAAC Affiliates

[Figure: SLAAC affiliates diagram. Group labels: Applications; Challenge Problem Owners; SLAAC Developers; Component Developers. Entities shown: ACS Research Community; BYU; Sandia; UCLA; ISI; Sandia (SAR/ATR); NVL (IR ATR); NUWC (Sonar Beamforming); LANL (Ultra Wide-Band Coherent RF); LANL (Multi-dimensional Image Processing); Lockheed Martin (Electronic Countermeasures).]

Page 5: SLAAC Systems Level Applications of Adaptive Computing

SLAAC Architecture

[Figure: SLAAC architecture diagram. System-level labels: Host; Sensor; Network; Network Interface Processor; Control Processor; ACS Device; DSP Device. Board labels: Myricom L4 Orca Board; SLAAC1 Board; UCLA Board; Myricom L5 Baseboard. SLAAC1 block labels: X0; X1; X2; XP_LEFT; XP_RIGHT; XP_XBAR; X0_LEFT; X0_RIGHT; X0_XBAR; PROM; CLK; PMC BUS; PCI BUS.]

Page 6: SLAAC Systems Level Applications of Adaptive Computing

SLAAC Programming Model

• Single host program controls distributed system of nodes and channels
• System dynamically allocated at runtime
• Multiple hosts compete for nodes
• Channels stream data between host/nodes

[Figure: a Host connected through the Network to Nodes 1, 2, 3.]

Page 7: SLAAC Systems Level Applications of Adaptive Computing

Runtime System

[Figure: layer stack. Labels: Application; Runtime System; Messaging Layer; Network Layer.]

System Layer - High-level programming interface (e.g., ACS_Create_System (Node_list, Channel_list))

Node Layer - Hides device-specific information (e.g., FPGA configuration)

Control Layer - Node-independent communication commands (i.e., blocking and non-blocking message passing primitives)
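
The three layers can be pictured with a toy, in-memory model. This is only a sketch under stated assumptions: every class, method, and bitstream name below is hypothetical and is not the SLAAC API. It shows a single host program creating a system from a node list and a channel list, and the difference between a blocking and a non-blocking send.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Node layer: hides device-specific detail such as FPGA configuration."""
    name: str
    bitstream: Optional[str] = None
    inbox: deque = field(default_factory=deque)

    def configure(self, bitstream: str) -> None:
        # A real node layer would load the device here; we only record the name.
        self.bitstream = bitstream


class System:
    """System layer: one host program owns a set of nodes and channels."""

    def __init__(self, node_list, channel_list):
        self.nodes = {name: Node(name) for name in node_list}
        self.channels = set(channel_list)   # (source, destination) pairs
        self.pending = deque()              # posted but undelivered messages

    def _check(self, src, dst):
        if (src, dst) not in self.channels:
            raise ValueError(f"no channel from {src} to {dst}")

    # Control layer: node-independent message passing primitives.
    def send(self, src, dst, data):
        """Blocking send: the message is delivered before the call returns."""
        self._check(src, dst)
        self.nodes[dst].inbox.append(data)

    def isend(self, src, dst, data):
        """Non-blocking send: queue the message and return immediately."""
        self._check(src, dst)
        self.pending.append((dst, data))

    def wait_all(self):
        """Complete every outstanding non-blocking send."""
        while self.pending:
            dst, data = self.pending.popleft()
            self.nodes[dst].inbox.append(data)

    def recv(self, name):
        """Receive in this toy model: pop the oldest delivered message."""
        return self.nodes[name].inbox.popleft()


# Host program: build a pipeline host -> n1 -> n2 -> host.
system = System(["host", "n1", "n2"],
                [("host", "n1"), ("n1", "n2"), ("n2", "host")])
system.nodes["n1"].configure("stage1.bit")   # hypothetical bitstream name
system.send("host", "n1", "tile-0")          # delivered immediately
system.isend("host", "n1", "tile-1")         # queued until wait_all()
delivered_before_wait = len(system.nodes["n1"].inbox)
system.wait_all()
delivered_after_wait = len(system.nodes["n1"].inbox)
print(delivered_before_wait, delivered_after_wait)
```

In this model the non-blocking message only reaches the node once `wait_all()` runs, which is the behavior a host would rely on to overlap communication with other work.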

Page 8: SLAAC Systems Level Applications of Adaptive Computing

Remote Node Processing Alternatives

[Figure: two Host Node / Remote Node stacks, each pair joined by the Network. First alternative: host node runs Application, Runtime, Messaging; remote node runs Messaging, Runtime, FPGA. Second alternative: host node runs Application, Runtime, Messaging; remote node runs Messaging, Runtime, Application, Runtime, FPGA.]

•Less power required from compute node

•Less latency between application and low-level control

Page 9: SLAAC Systems Level Applications of Adaptive Computing

Runtime and Debugging

Interactive debugger
• all system layer C functions provided in command-line interface
• symbolic VHDL debugging support using readback
• single-step clock
• scriptable

SLAAC Runtime monitor
• system state
• hardware diagnostics

Other tools
• network traffic monitors (MPI based?)
• load balancing
• visualization tools

Page 10: SLAAC Systems Level Applications of Adaptive Computing

Runtime Status

Complete
• System Layer API specification
• Control Layer API specification, partially simulated

Scheduled
• May: VHDL simulation of SLAAC board
• June: Implementation of basic runtime system functions

Page 11: SLAAC Systems Level Applications of Adaptive Computing

Development Platform Path

[Figure: development platform path. Stage labels: Low Cost COTS Development Platform; SBC w/ External Network; SBC w/ Embedded Network; Fully Embedded Platform. Component labels: SBC; SLAAC1 PMC Board; Myrinet PMC Card; SLAAC Double-Wide PMC Card; Myrinet; P0/Myrinet; L5 Baseboard; SLAAC. Axis labels: Improved Compute Density; Improved Development Environment. Each stage carries the SLAAC Runtime System stack: System Layer, Node Layer, Control Layer.]

Page 12: SLAAC Systems Level Applications of Adaptive Computing

Hardware Platforms and Software Development

•Low risk development path

•Standards compliant (MPI, VxWorks)

•Recompile to change platforms

•GP programming environment at node level

•Bandwidth limited by MPI

•Custom network interface program (exploits GM)

•Direct network/compute connection

•Immature development environment

•SLAAC provides programming environment

•Maximum bandwidth

Hardware Platform          Node O.S.                No Node O.S.
Cluster of workstations    MPI, Linux or NT, PCI    GM, PCI
SBC w/ external network    MPI, VxWorks, PMC        MPI, PMC
SBC w/ embedded network    MPI, VxWorks, VME P0     GM, VME P0
Fully Embedded             ?                        GM, VME P0

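[Figure: software stacks for the two options, with arrows labeled Risk and Performance. Node O.S.: the host runs Application, Runtime, COTS OS; the node runs Application, Runtime, COTS OS. No Node O.S.: the host runs Application, Runtime, COTS OS; the node runs Application, Runtime, Custom NI.]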

Page 13: SLAAC Systems Level Applications of Adaptive Computing

BYU/UCLA Domain-Specific Compilers for ATR

BYU: Focus of Attention (FOA)
• Flow: Image Morphology “CYTO” code → Neighborhood Processor Generator → VHDL (Structural) → Synopsys Logic Synthesis → Xilinx Place & Route → FPGA
• Generator input: hand optimized neighborhood operators (Viewlogic Library)
• Map “CYTO” neighborhood operations to pre-defined FPGA blocks
• High packing density to enable single configuration

UCLA: Template Matching
• Flow: Templates → Correlator Generator → VHDL (Structural) → Synopsys Logic Synthesis → Xilinx Place & Route → FPGA, FPGA, FPGA, FPGA
• Optimize VHDL using template overlap
• Creates optimized template subset with minimum number of reconfigurations

[Figure: each flow carries an “Optimization Here” marker.]

Page 14: SLAAC Systems Level Applications of Adaptive Computing

The UCLA Testbench

• Mojave board can interface to the i960 development system for in-house testing (as shown), or with the Myricom LANai board.

[Figure: UCLA testbench. Labels: Host Processor; PCI Slot; Bus Connector; PCI Bus; External System Processor; Mojave Board; Static FPGA.]

Page 15: SLAAC Systems Level Applications of Adaptive Computing

SLAAC 1 Board

[Figure: SLAAC 1 board block diagram. Block labels: X0; X1; X2; XP_LEFT; XP_RIGHT; XP_XBAR; X0_LEFT; X0_RIGHT; X0_XBAR; PROM; CLK; PMC BUS; PCI BUS; Jumper block; 256Kx18 SRAM. Legend: FIFO Data (64 pins); FIFO Control (~16 pins); Clock, Configuration, Inhibit; External Memory Bus.]

Page 16: SLAAC Systems Level Applications of Adaptive Computing

Surveillance Challenge Problem - SAR / ATR

40,000 sqnm / day @ 1 ft. Resolution

System Parameter                                  Current   Challenge   Scale Factor
SAR Area Coverage Rate (sqnm / day @ 1 ft Res.)   1000      40,000*     40X (FOA, Indexer, Ident.)
Level / Difficulty of CC&D                        Low       High        100X (Indexer), 10X (Ident.)
Number of Target Classes                          6         30          5X (Indexer, Ident.)

*Corresponds to a data rate of 40 Megapixels / sec
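
The numeric scale factors in the challenge-problem figures above follow directly from the current and challenge values. The short check below reproduces them; the linear relation between area coverage and pixel rate is an assumption made here for illustration, not something the slide states.

```python
# Current and challenge values as read from the slide.
current = {"coverage_sqnm_per_day": 1_000, "target_classes": 6}
challenge = {"coverage_sqnm_per_day": 40_000, "target_classes": 30}
CHALLENGE_PIXEL_RATE = 40e6  # pixels/s, from the slide footnote

# Scale factor = challenge value / current value.
scale = {key: challenge[key] / current[key] for key in current}

# Assumption: pixel rate scales linearly with area coverage at fixed resolution.
current_pixel_rate = CHALLENGE_PIXEL_RATE / scale["coverage_sqnm_per_day"]

print(scale)
print(current_pixel_rate)
```

Under that assumption the current coverage rate corresponds to 1 Mpixel/s, which matches the 1 Mpixel/s figure quoted for the early demos elsewhere in the deck.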

Page 17: SLAAC Systems Level Applications of Adaptive Computing

Project Benefit Includes Improved Compute Density

(Scaled To Challenge Size; Assuming FOA, Indexer, and 1 Identifier)

[Figure: log-scale plot. X axis: Year, ‘91 to ‘01. Y axis: Number of 6U VME Chassis, 0.01 to 1,000,000 (VME Chassis = 3.5 cft, 80 lbs, 700 W, $400,000). Curves: Clear; CC&D. Annotations: Past STARLOS Systems; JSTARS ‘96 Demo; JSTARS ‘97 Demo; ‘98 Demo; ‘99 Demo Range; ‘01 Demo Range; DARPA EHPC Program (Early Two-Level Multicomputers + Algorithms); DARPA ACS Program; 5X Over Moore’s Law: ACS Large FPGAs with on-chip SRAM blocks + Algorithms; 10X Over Moore’s Law: ACS Hybrid Chips + Algorithms.]

Page 18: SLAAC Systems Level Applications of Adaptive Computing

ATR Flight Demo System

For: 1 Mpixel/sec with 6 target configurations (targets in-the-clear scenario)

Baseline 1996 System: Hardware Architecture
– Systolic: 3 algorithm modules
– SIMD: 1 algorithm module
– Early 2-level multiprocessors/DSP: 3 algorithm modules

1997 Flight Demo System: Hardware Architecture
– 2-level multiprocessor/DSP: 8 algorithm modules (1 additional algorithm module implemented over baseline system)

Metric                                        Baseline 1996 (5 VME chassis)   1997 Flight Demo (2 VME chassis)
Power (W)                                     1680                            453
Weight (lbs.)                                 354                             124
Volume (ft3)                                  17.5                            7
Power, Volume, Weight Product (W-ft3-lbs.)    10,407,600                      393,204

Ratio of products: 26.47

2-level multiprocessor/DSP configuration implements algorithms (with additional algorithm module) with better performance and significantly lower power, size, and weight versus baseline implementation
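
The power-volume-weight products and the 26.47 ratio quoted above can be reproduced from the three per-system figures. The sketch below is only an arithmetic check; the class and field names are illustrative, not from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtrSystem:
    name: str
    power_w: float
    volume_ft3: float
    weight_lb: float

    @property
    def pvw_product(self) -> float:
        """Power x volume x weight, in W-ft3-lbs."""
        return self.power_w * self.volume_ft3 * self.weight_lb


baseline_1996 = AtrSystem("Baseline 1996 (5 VME chassis)", 1680, 17.5, 354)
flight_1997 = AtrSystem("1997 flight demo (2 VME chassis)", 453, 7, 124)

ratio = baseline_1996.pvw_product / flight_1997.pvw_product
print(baseline_1996.pvw_product, flight_1997.pvw_product, round(ratio, 2))
```

The volumes are also consistent with the 3.5 cft per VME chassis quoted on the compute-density chart: 5 chassis give 17.5 ft3 and 2 chassis give 7 ft3.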

Page 19: SLAAC Systems Level Applications of Adaptive Computing

Common SAR/ATR - DARPA ACS & EHPC FY97 Testbed

[Figure: testbed diagram. Section labels: Laboratory Development Element; Real-Time Deployable Element (Common ATR Model Year 1); Next Generation Embeddable HPC Technologies. Component labels: MIMD Intel Paragon; SIMD CPP DAP; Systolic Datacube/Custom; Workstations SUN / SGI / PC; MIMD SGI; Systolic CYTO; SIMD CNAPS; RAID (Data); HIPPI; Myrinet; Joint STARS; SBC; MIMD Multicomputers PowerPC HPSC. Architecture classes: MIMD; SIMD; Systolic.]

Page 20: SLAAC Systems Level Applications of Adaptive Computing

JSTARS ATR Processor

PowerPC Multicomputer
• 13 Commercial Motorola VMEbus CPU boards
– 200 MHz 603e PowerPC per board
– 5.2 GFLOPS Peak
• Commercial Myrinet High Speed Communications
– 1.28 Gbits/sec full duplex
– Cross point topology

SHARC Multicomputer
• 4 Sanders HPSC processor boards
– 8 33 MHz Analog Devices SHARC DSP processors per board
– 3.2 GFLOPS Peak
• Myrinet High Speed Communications
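
The peak figures above are consistent with a simple product of board count, processors per board, clock rate, and floating-point operations per cycle. The operations-per-cycle values below (2 for the 603e, 3 for the SHARC) are assumptions chosen because they reproduce the slide's numbers; the slide itself does not state them.

```python
def peak_gflops(boards, cpus_per_board, clock_hz, flops_per_cycle):
    """Peak rate in GFLOPS for a homogeneous set of processors."""
    return boards * cpus_per_board * clock_hz * flops_per_cycle / 1e9


# 13 boards, one 200 MHz 603e each; 2 flops/cycle is an assumption.
powerpc_peak = peak_gflops(13, 1, 200e6, 2)

# 4 boards, eight 33 MHz SHARCs each; 3 flops/cycle is an assumption.
sharc_peak = peak_gflops(4, 8, 33e6, 3)

print(round(powerpc_peak, 1), round(sharc_peak, 1))
```

With a nominal 33 MHz clock the SHARC total comes to about 3.17 GFLOPS, which rounds to the 3.2 GFLOPS on the slide.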

Page 21: SLAAC Systems Level Applications of Adaptive Computing

TMD/RTM Real-Time ATR

• Delivered 6/97
• FOA, SLD, MPM, MSE, CRM, LPM, & PGA
• Supported 5 real-time ESAR/ATR airborne flight exercises
– 2 Engineering check-out flights
– 3 Phase III evaluation flights

Features
• 1 Mpixel/sec, 6 Configurations, targets in-the-clear scenario
• Large scale dynamic range capability
• Modular, Scalable, Plug & Play Architecture
• 2 VMEbus Chassis ATR System
• Heterogeneous Two-Level Multicomputer, COTS PowerPC and Sanders SHARC

DARPA SAR ATR EHPC Testbed Experiments in Action

RTM ATR Advanced Technology Demonstration

This work performed under the sponsorship of the Air Force Aeronautical Systems Center and the Air Force Research Laboratory (formerly Rome Laboratory)

Page 22: SLAAC Systems Level Applications of Adaptive Computing

Joint STARS SAR/ATR Transition

Description
• Jointly managed, USAF ASC/FBXT and AFRL/IF.
• Provided JSTARS with a real-time ATR capability.
• Leveraged prior Service & DARPA investments.
• Sandia developed the ATR System, Northrop Grumman developed the ESAR system and led the integration of both systems onto the aircraft.
• ATR system enables image analysts to identify threats in real-time by prescreening large amounts of data for potential targets.

Accomplishments
• Developed Airborne Real-time SAR / ATR System
• Demonstrated initial system at Pentagon (Sep 96)
• All COTS system implementation (Apr 97)
• Full system integrated on T3 aircraft (Aug 97)
• Engineering / integration flights completed with fully operational system (Sep 97)
• Three real-time demonstration flights (Oct 97)
• Operationally significant Pd/FAR performance

Page 23: SLAAC Systems Level Applications of Adaptive Computing

BYU - FOA and Indexing

[Figure: SAR ATR processing chain. Labels: SAR Image; Sensor Preprocessor; Coarse Data; Fine Data; Detection; Focus of Attention; Indexer; Identification; Location, Angle Estimate; Target ID; Confidence.]

Superquant FOA
• 1 pass adaptive threshold technique
• Produces ROI blocks for indexing
• >7.8 Gbops/second/image
• 1 Mpixel/second, FY98
• 40 Mpixel/second, FY01

CC&D indexing
• Algorithm definition in process

Page 24: SLAAC Systems Level Applications of Adaptive Computing

BYU - SAR ATR Status

Non-Adaptive FOA
• Wildforce PCI platform
• 3 months to retarget to SLAAC board

Compilation Strategies
• Current approach
– “VHDL synthesis from scratch”
– Traditional tool flow
• Planned approach - July 1999
– “Gate Array” approach
– Fixed chip floorplan, regular arrays
– 30x speedup, compile < 1 hour
– ~10% efficiency loss

Page 25: SLAAC Systems Level Applications of Adaptive Computing

Sonar Beamforming with 96 Element TB-23

Goals:
• First RT matched field algorithm deployment
• 1000x: 51 Beams to 51000 Beams
• Ranging + “look forward” capability
• Demonstrate adaptation among algorithms at sea
• Validate FPGAs for signal processing

Computation:
• 2 stage (coarse and fine)
• 16 Gop/sec, 2.5 GB memory, 80 MB/sec I/O

Approach:
• Use k-ω + matched field algorithms
• Leverage ACS to provide coarse grain RTR
– Environmental adaptability
– Multiple resolution processing

Page 26: SLAAC Systems Level Applications of Adaptive Computing

[Figure: “Stealth Advancements: Broadband Quieting Comparison”. Y axis: Noise Levels. X axis: years 1960 to 2010. Labels: VICTOR 1; ALFA; VICTOR III; AKULA; IMPROVED VICTOR III; IMPROVED AKULA; SEVERODVINSK; 594; 637; 688; 688I; SSN-21; NSSN; Lead ship keel laid Dec 93.]

Credit: NUWC

Page 27: SLAAC Systems Level Applications of Adaptive Computing

Sonar Status

Algorithm identified by NUWC 4/3/98
• validation in process

BYU mapping of NUWC algorithm underway
• for Wildforce board
• map to SLAAC boards when available

Sonar module generation
• Operational generators include:

– pipelined multipliers & CORDIC units

– C/C++ programs generating VHDL code

– generators used in Wildforce mapping
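
For readers unfamiliar with CORDIC, the following is a minimal floating-point sketch of the rotation-mode algorithm that such a hardware unit implements with shifts and adds. It is an illustration only, not the BYU generator or its output; the function name and iteration count are assumptions, and a hardware unit would use fixed-point arithmetic with one pipeline stage per iteration.

```python
import math


def cordic_sin_cos(theta, iterations=16):
    """Rotation-mode CORDIC; valid for roughly |theta| <= pi/2."""
    # Elementary rotation angles atan(2^-i); a hardware unit stores these in a table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector; pre-divide by the total gain.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # rotate toward zero residual angle
        shift = 2.0 ** -i                # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return y, x                          # (sin(theta), cos(theta))


sin_half, cos_half = cordic_sin_cos(0.5)
print(sin_half, cos_half)
```

Each iteration adds roughly one bit of accuracy, so 16 iterations give results within about 1e-4 of `math.sin` and `math.cos`.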

Page 28: SLAAC Systems Level Applications of Adaptive Computing

Timeline - Baseline + Option

[Figure: timeline across FY98, FY99, FY00, FY01. Phase labels: Feasibility Study; 1st Mapping.]

Tasks
• 1 BYU prelim. mapping to ACS; 2 Submarine I/F specified and I/F construction begun
• 1 Advanced algor. dev. on RRP; 2 Sonar module generators
• 1 Advanced algor. dev.; 2 Sonar subcompilers
• Top level compiler

Milestones
• NUWC specifies algorithm
• 1st SLAAC ACS board avail.
• BYU end-to-end lab demo complete and delivered to ISI
• Advanced SLAAC ACS boards avail.
• ISI delivers demo system to NUWC for testing
• SEA TEST (summer 2000)

Page 29: SLAAC Systems Level Applications of Adaptive Computing

SLAAC Conclusions

• Great early success in deployed capability
• Interesting runtime tradeoffs
• Significant risk reduction through COTS standards
• Promising simulation results - headed for hardware
• Adaptive systems are hard - but getting easier

http://www.east.isi.edu/SLAAC