Progress Report (4/14/04)
Virtual Prototyping of Advanced Space System Architectures based on RapidIO
Sponsor: Honeywell Space Systems, Clearwater, FL
Principal Investigator: Dr. Alan D. George
OPS Graduate Assistants: David Bueno, Ian Troxel
RA Graduate Assistants: Chris Conger, Adam Leko
Modeling and Simulation (MS) Group
HCS Research Laboratory, ECE Department
University of Florida

Presentation Outline
Project Motivation and Goals
Project Task Outline
– Literature Review
– RIO Component Modeling
– GMTI Modeling
– Systems Modeling
– Test Plan
Conclusions
Future Collaboration Possibilities

Project Motivation and Goals
Determine the optimal means by which to develop RapidIO for space systems
– Perform RIO switch, board and system tradeoff studies
– Identify limitations of space-based RIO design
– Determine design feasibility using SBR case study
– Provide assistance for Honeywell proposal efforts
Lay the groundwork for future Honeywell system prototyping

Project Tasks
Literature Review
– RIO spec, RIO components, SBR, misc.
RIO Component Modeling
– Layers, endpoints, switches, processors, etc.
– GMTI traffic models, memory boards, backplanes
RIO System Modeling
– Proposed systems models, test plan
Simulation Experiments
Data Analysis and Report

Literature Review: Overview
Scope
– Encompass major issues surrounding RapidIO-based systems
– Goal: find previous work related to issues in this project
Overview of results
– RapidIO information and specifications
– RapidIO implementation spec sheets
– General information on space-based radar (GMTI and SAR)
– Switch issues
  Multicasting
  Load balancing
  Shared-memory switch issues (buffer management)
Deliverable available on Honeywell HCS page:
– http://www.hcs.ufl.edu/~leko/Honeywell/

RapidIO Specifications, Extensions
Protocol Issues, Extensions
Not specified or not supported by RapidIO protocol
Main topics considered thus far include:
– Dynamic load balancing
  Papers regarding dynamic load balancing schemes in other protocols, e.g. InfiniBand
– RapidFabric Protocol Extensions
  End-to-end flow control, support for thousands of nodes
  Data streaming and traffic management
  “Next Generation Physical Layer”
– Multicast algorithms
  ACM paper discusses multicasting in switch-based parallel systems
  Dr. Sarp Oral’s dissertation
  Other papers discuss general multicast issues, deadlock avoidance
  RapidFabric specs also address multicast
– System-level failure recovery
RapidIO Specs
All information by RapidIO Trade Association
– MM, MP, GSM logical layers
– Common transport layer
– Parallel, serial physical layer
– Error management extensions
Specifications are extremely detailed and describe all requirements for each layer and feature
Used as main reference for model design
Motorola technical whitepaper
– Brief but thorough discussion of all specifications above
– Also a good reference; much shorter than the official specs, with many diagrams
– [RIO Spec] techwhitepaper_rev3

Literature Review: Switch Issues
Multicasting
– Switches being considered are store-and-forward; no deadlock possible
– Modify switch internal routing table/Command and Status Registers to support multi-valued destinations
– May also use RapidIO’s built-in “Multicast event” if no payload is needed
Buffer management
– Static thresholds for different traffic classes good, but will require tuning for load
– Dynamic thresholds better and require only slightly more logic
– Pushout best, but requires lots of additional logic (and silicon)
Load balancing
– Simple method: static load balancing based on routing tables
– Methods proposed for use with InfiniBand may be adapted for RapidIO with extensions to protocol
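The multi-valued routing table idea from the multicasting bullets can be sketched as follows. This is only an illustration: the table layout, the port numbers, and the destination ID 0xF0 used for a multicast group are all hypothetical and are not taken from the RapidIO specification or from the switch model described in these slides.

```python
from typing import Dict, List

# Hypothetical routing table: destination ID -> list of output ports.
# A unicast entry has one port; a multicast entry has several.
RoutingTable = Dict[int, List[int]]


def output_ports(table: RoutingTable, dest_id: int) -> List[int]:
    """Return every output port that a packet for dest_id is copied to."""
    if dest_id not in table:
        raise ValueError(f"no route for destination ID {dest_id}")
    return list(table[dest_id])


# Illustrative contents only (assumed values).
example_table: RoutingTable = {
    0x01: [2],        # unicast to the endpoint behind port 2
    0x02: [3],        # unicast to the endpoint behind port 3
    0xF0: [2, 3, 5],  # multi-valued entry acting as a multicast group
}

unicast = output_ports(example_table, 0x01)
multicast = output_ports(example_table, 0xF0)
```

A store-and-forward switch would hold the packet in its memory until a copy has been sent on each returned port.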

Literature Review: Space-Based Radar (GMTI)
Found solid background information on GMTI (Ground Moving Target Indicator) and STAP (Space-Time Adaptive Processing)
Several different GMTI algorithm variants
– Pre-Doppler
– Post-Doppler
– Post-Doppler PRI-staggered
Different partitioning methods proposed
– Parallel pipelined
– Staggered iterations
Interesting note: Honeywell baseline spec for input data size is much larger than those used in existing literature

RapidIO Component Modeling: Overview
RapidIO packet formats
– Message-passing requests and responses
– Flow control symbols
– Packet control symbols
RapidIO endpoint model
– Message-passing logical layer
  Expandable to include I/O logical layer with minimal effort
– Common transport layer
– Parallel physical layer
RapidIO central memory switch model
– Common transport layer
– Parallel physical layer
GMTI traffic models
– Memory board model sources/sinks traffic
Statistics gathering
– RIO request stats (average latency and BW)
– RIO response stats (average latency and BW)
– Additional statistics components to be developed as necessary

RapidIO Component Modeling: Endpoint Model
Key Features
– Message-Passing Logical Layer
– Common Transport Layer
– Parallel Physical Layer
  Receiver-controlled flow control
  Transmitter-controlled flow control
  Error detection and recovery
  Priority scheme for buffer management
Key Adjustable Parameters
– Packet assembly delay
– Packet disassembly delay
– Clock frequency
– Link width
– Input queue length
– Output queue length
– Four priority thresholds
  Determine the maximum number of packets that may be in a buffer to still accept a packet of a given priority
  Example: If threshold for priority 0 packets is 4, incoming priority 0 packets will be rejected if there are 5 or more packets currently in the input buffer
– Number of device ID bytes in packet (affects packet size and max number of devices in system)
– Buffer memory copy delay per byte
Note: As there is no “link model,” parameters such as clock frequency and link width are incorporated into the endpoint model.
[Figure: High-level Endpoint Model]
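The priority-threshold rule can be sketched as a single acceptance check. The function name and the threshold values for priorities 1 to 3 are assumptions made for illustration. Only the priority 0 case (threshold 4, rejected at 5 or more buffered packets) comes from the slide.

```python
from typing import Sequence


def accepts(occupancy: int, priority: int,
            thresholds: Sequence[int], queue_length: int) -> bool:
    """Decide whether an input buffer accepts an incoming packet.

    occupancy    -- packets currently in the input buffer
    priority     -- priority of the incoming packet (0 to 3)
    thresholds   -- per-priority maximum occupancy at which a packet
                    of that priority is still accepted
    queue_length -- total number of buffer slots
    """
    if occupancy >= queue_length:
        return False  # buffer is full, nothing can be accepted
    return occupancy <= thresholds[priority]


# Assumed thresholds for priorities 0..3; only the first value (4)
# is taken from the slide's example.
example_thresholds = (4, 5, 6, 7)
example_queue_length = 8
```

With these assumed values, a priority 0 packet is accepted when 4 packets are buffered and rejected when 5 are buffered, while a priority 1 packet is still accepted at 5.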

RapidIO Component Modeling: Endpoint Verification Tests
2-node BW/latency results shown in the charts
2 test cases: single packet, continuous packet stream
Single-packet test
– Average latency, BW for all possible packet sizes
– Theoretical BW limit: 4 Gbps
– Latency determined by: transmit time + packet disassembly time
– Error detection and correction
  Insert packet CRC error, control symbol parity bit error, ack error
  Observe/verify that link partners correct error
  Also insert error in control symbol related to error recovery, verify behavior
Packet-stream test (256-byte packets)
– Around 3.5 Gbps data generation rate, link becomes saturated
– Once saturated, average latency increases with time
[Chart: Single Packet BW (Rx Controlled). Actual BW (Rx) and Eff. BW (Rx) in Gbps vs. packet size (32, 64, 128, 256)]
[Chart: Stream Test. Average request latency (ns) vs. data generation rate (Gbps), for 1 CPI, 2 CPI, and 3 CPI]
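The single-packet figures can be reproduced with simple arithmetic. The sketch below assumes a double-data-rate parallel link whose peak rate is clock × 2 × width. The slides do not say which clock and width combination gives the 4 Gbps limit, so both combinations shown are assumptions, as are the function names and the 100 ns disassembly delay.

```python
def peak_link_bandwidth_gbps(clock_mhz: float, width_bits: int,
                             ddr: bool = True) -> float:
    """Peak raw link rate in Gbps for a parallel link."""
    transfers_per_second = clock_mhz * 1e6 * (2 if ddr else 1)
    return transfers_per_second * width_bits / 1e9


def single_packet_latency_ns(packet_bytes: int, bandwidth_gbps: float,
                             disassembly_ns: float) -> float:
    """Latency = transmit time + packet disassembly time (from the slide).

    1 Gbps equals 1 bit per nanosecond, so bits / Gbps gives ns.
    packet_bytes is the full size on the wire (headers included).
    """
    transmit_ns = packet_bytes * 8 / bandwidth_gbps
    return transmit_ns + disassembly_ns


# Two assumed configurations that both reach the 4 Gbps limit.
bw_250mhz_8bit = peak_link_bandwidth_gbps(250, 8)
bw_125mhz_16bit = peak_link_bandwidth_gbps(125, 16)

# 256-byte packet at 4 Gbps with an assumed 100 ns disassembly delay.
latency_256 = single_packet_latency_ns(256, 4.0, 100.0)
```

At 4 Gbps a 256-byte packet takes 512 ns to transmit, so the disassembly delay is the only other term in the single-packet latency.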

RapidIO Component Modeling: Central-Memory Switch Model
Key Features
– Selectable cut-through or store-and-forward routing
– Non-blocking architecture
– Routes packets based solely on destination ID (read from a routing table file) as per RIO spec
– RIO Common Transport Layer
– RIO Parallel Physical Layer
Key Adjustable Parameters
– Cut-through/store-and-forward behavior
– Average central-memory read latency
– Average central-memory write latency
– Central-memory size
– Link width, clock frequency, and other RapidIO physical layer parameters
– Static priority threshold scheme based on free memory in switch

RapidIO Component Modeling: Switch/Small-System Verification
Verified N-hop latency for N switches (endpoint-to-endpoint latency)
– Cut-through: Latency = AT + XT + N × WL + DT
– Store-and-forward: Latency = AT + XT + N × (WL + XT) + DT
– Passed packets through multiple switches, observed timestamps at various points, and compared them with expected values
Error correction, flow control verification (Rx/Tx)
– Error tests similar to 2-node, except using switch port as partner
– Flow control verified by using various window sizes, observing link partner chatter
– Adjusted endpoint and switch priority thresholds, verified correct acceptance/denial
All-to-one, one-to-all delivery verification
– Using system shown to right, packets from generator sent to all nodes round-robin
– All nodes configured to receive packets and redirect to single memory sink
– Even with saturated links, each node eventually receives expected packets
AT – packet assembly delay
XT – transmission time
WL – switch memory write latency
DT – packet disassembly delay
N – number of hops between endpoints
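The two N-hop latency formulas translate directly into code. In this sketch the function names and the sample delays are assumptions for illustration. The 72 ns write latency matches the baseline switch memory latency quoted later in the deck; the other values are made up.

```python
def cut_through_latency(at: float, xt: float, wl: float,
                        dt: float, n: int) -> float:
    """Latency = AT + XT + N x WL + DT."""
    return at + xt + n * wl + dt


def store_and_forward_latency(at: float, xt: float, wl: float,
                              dt: float, n: int) -> float:
    """Latency = AT + XT + N x (WL + XT) + DT."""
    return at + xt + n * (wl + xt) + dt


# Assumed sample values in nanoseconds.
AT, XT, WL, DT = 100.0, 512.0, 72.0, 100.0
HOPS = 2

ct = cut_through_latency(AT, XT, WL, DT, HOPS)
sf = store_and_forward_latency(AT, XT, WL, DT, HOPS)
penalty = sf - ct  # equals HOPS * XT by construction
```

The difference between the two modes is exactly N × XT: store-and-forward pays one extra full transmission time per hop.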

GMTI Modeling: Algorithm Description
Global memory board sends groups of RapidIO packets to processing boards
Each processing board has 4 processors
GMTI algorithm used is a post-Doppler variant with 4 stages:
– Pulse compression
– Doppler processing
– Weight computation/Beamforming (STAP)
– CFAR
After processing, packets containing detection information get sent back to main memory

GMTI Modeling: Algorithm Flow Overview
[Diagram: input data cube, processing boards 1 to 9, global memory]

GMTI Modeling: Processing Board Overview
[Diagram labels: Pulse compression, Doppler, STAP, CFAR; Processing element; CPU (×4); $ (×4); RIO interface (×4); RIO switch]

GMTI Modeling: Data Cube Generator
Key features
– Can control amount of data generated, as well as rate
– Evenly spreads out packets over entire CPI
  Diagram to right shows effect of increasing generation rate while keeping data size constant
  Blocks represent packets, dashed lines represent CPI
  Must be careful to not saturate → balance data size/CPI
– No endpoint, only generates data
Legend
Blue oval – signal new CPI, calculate number of packets to create
Red square – loop over number of packets
Purple square – time delay between each packet
Green square – create a packet, fill out necessary fields
Pink oval – create last packet, may be smaller size
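The generator flow in the legend (compute the packet count at the start of a CPI, loop with a fixed delay between packets, and emit a possibly smaller final packet) can be sketched as below. The function name, the return format, and the sample sizes are assumptions. The actual model is a block diagram in a simulation tool, not Python.

```python
import math
from typing import List, Tuple


def packet_schedule(data_bytes: int, packet_bytes: int,
                    cpi_ns: float) -> List[Tuple[float, int]]:
    """Spread one CPI's worth of data evenly over the CPI.

    Returns a list of (send_time_ns, size_bytes) pairs.
    """
    if data_bytes <= 0:
        return []
    # New CPI: calculate the number of packets to create.
    count = math.ceil(data_bytes / packet_bytes)
    # Fixed time delay between consecutive packets.
    gap_ns = cpi_ns / count
    schedule = []
    remaining = data_bytes
    for i in range(count):
        # The last packet may be smaller than the others.
        size = min(packet_bytes, remaining)
        schedule.append((i * gap_ns, size))
        remaining -= size
    return schedule


# Assumed sample: 1000 bytes in 256-byte packets over a 1000 ns CPI.
example_schedule = packet_schedule(1000, 256, 1000.0)
```

Shortening the CPI while keeping the data size constant shrinks the gap between packets, which is the saturation risk the slide warns about.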

GMTI Modeling: Global Memory Board
Key features
– May serve as sensor data source, global memory endpoint (port), or both
– One endpoint and one generator per model
– Using multiple instances in a system represents:
  Multiple sensors each generating part of the complete data cube
  Multiple ports to the same global memory
– 3 simple blocks in yellow circle give significant flexibility in controlling data traffic
Key adjustable parameters
– Ranges, channels, pulses
– CPI
– Size per element, packet size
– Packet group size (message length)
– Destination ID(s) to send to
– Memory delay per byte
– All RapidIO endpoint parameters
Diagram legend: BLUE – data cube generator; YELLOW – traffic shaper; RED – RapidIO Endpoint
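The adjustable parameters above determine how much traffic the memory board offers per CPI. The sketch below assumes the data cube size is the product of ranges, channels, pulses, and size per element. The slides list these parameters but do not state this formula, and the function names and sample values are hypothetical.

```python
import math


def cube_bytes(ranges: int, channels: int, pulses: int,
               bytes_per_element: int) -> int:
    """Assumed data cube size: one element per (range, channel, pulse)."""
    return ranges * channels * pulses * bytes_per_element


def packets_per_cpi(total_bytes: int, packet_bytes: int) -> int:
    """Number of packets needed to carry the cube, rounding up."""
    return math.ceil(total_bytes / packet_bytes)


def offered_load_gbps(total_bytes: int, cpi_ms: float) -> float:
    """Average data rate needed to send the cube within one CPI."""
    return total_bytes * 8 / (cpi_ms * 1e-3) / 1e9


# Hypothetical small cube, not the Honeywell baseline.
size = cube_bytes(ranges=1000, channels=4, pulses=64, bytes_per_element=8)
packets = packets_per_cpi(size, packet_bytes=256)
load = offered_load_gbps(size, cpi_ms=256.0)
```

Comparing the offered load against the link rate gives a quick check on whether a chosen data size and CPI would saturate the memory board's link.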

GMTI Modeling: Baseline Board/Backplane Models
Baseline Processor Board Model
– Four processors
  Compute node ASIC + RIO endpoint
– One 8-port switch
– One link to each endpoint
– Four links available for backplane connection
Baseline Backplane Model
– Four 8-port switches
  Minimal number for a symmetric configuration
– Two links to each processor board
  “Wastes” 2 links per board
– One link to memory board
– Many additional configurations to explore

GMTI Modeling: System Model
Baseline configuration
– 125 MHz DDR RIO links
– Receiver-controlled flow control
– Two-link trunk between each switch
– Static, store-and-forward routing
– 10 kB switch central-memory size
– 72 ns average read/write latency per packet for switch memory
– Baseline GMTI algorithm
  Pulses = 126
  Ranges = 120000
  Beams = 6
  CPI = 256 ms

Additional Proposed System Model #1
Generic Backplane Model
Double links for up to 12 boards
Routing tables and switch setup independent of number of boards actually used, or application
Two levels of switches
– Each 1st-level switch has two links to level 2
– 2 free links per switch, currently forms ring
– Each 2nd-level switch is connected to all 1st level, as well as double global memory link (red circles, bottom)
Uniform, dual-link access to all other boards
– Logical neighbors one less hop with ring
N-Board Configuration (N = 1 … 12)
Diagram to left shows system with 6 boards inserted
All 4 global-memory links are used (bottom), assuming 4 ports/endpoints to memory; input data cube from global memory also
System purposefully built to be application independent, reusable
– Attempt to maintain performance with increased versatility
– GMTI may be pipelined or staggered
Easy to simulate applications with different numbers of processors
Each board also has two free links, could be used

Additional Proposed System Model #2
Blue circles – GMTI algorithm tasks
Red circles – Global memory ports
Custom System Configuration – Pipelined Algorithm
– Currently assumes # boards as specified in GMTI spreadsheet
  Changing # procs/boards may require remaking routing tables, switch layout
– For tasks with > 1 board allocated, form star topology among boards for increased intra-task communication
  If not necessary/worthwhile, use extra board links for double connections to switches
  All switches have free links for small architectural adjustments
– Algorithm will be pipelined
– Not all global memory ports must be used; currently have switch support for up to 4 input, 3 output

Proposed Experiments
Motivation
– Maximize performance
  Minimize the time needed for one CPI of the GMTI algorithm (must be < 256 ms for baseline configuration)
– Minimize cost/power
  Number of boards/processors/switches
Independent variables
– System configuration
– Routing tables
– Store-and-forward vs. cut-through routing
– Flow control method (tx-controlled vs. rx-controlled)
– Link rate (125 MHz vs. 250 MHz)
– Priority thresholds
– Endpoint queue lengths
– Switch central memory size
Initial experiments will examine the effects of varying these one at a time
Experiments to follow will seek correlations between the parameters
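The one-at-a-time plan for the initial experiments can be sketched as a small configuration generator. The dictionary keys and the helper name are hypothetical. The baseline values (store-and-forward, rx-controlled, 125 MHz) and the alternatives come from the slides, and only three of the listed variables are shown because the slides give no candidate values for the others.

```python
from typing import Any, Dict, Iterator, List


def one_at_a_time(baseline: Dict[str, Any],
                  alternatives: Dict[str, List[Any]]
                  ) -> Iterator[Dict[str, Any]]:
    """Yield the baseline, then configs that change exactly one variable."""
    yield dict(baseline)
    for name, values in alternatives.items():
        for value in values:
            if value == baseline[name]:
                continue  # already covered by the baseline run
            config = dict(baseline)
            config[name] = value
            yield config


# Key names are assumptions; values are taken from the slides.
baseline_config = {
    "routing": "store-and-forward",
    "flow_control": "rx-controlled",
    "link_rate_mhz": 125,
}
alternative_values = {
    "routing": ["store-and-forward", "cut-through"],
    "flow_control": ["rx-controlled", "tx-controlled"],
    "link_rate_mhz": [125, 250],
}

runs = list(one_at_a_time(baseline_config, alternative_values))
```

This yields the baseline plus one run per changed variable. The follow-up experiments on correlations between parameters would instead need combinations of changes, for example via `itertools.product`.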

Additional Features to Explore
Multicast and flow control extensions for RIO
– Spec to be released this year
– Part of RapidFabric extensions
Dynamic load balancing
– RIO spec technically allows only static load balancing
  Must accomplish using clever routing tables
– Possible to extend the spec to allow some dynamic load balancing
  Literature search on this topic revealed several promising directions
  Primary challenge is RIO packet delivery ordering rules
– Experiments can be conducted to determine optimal method for dynamic load balancing, if Honeywell’s applications warrant
SBR algorithm alternatives
– Pipelining of GMTI algorithm
– Staggering of GMTI algorithm
– SAR (our initial focus directed at GMTI)
RapidIO I/O Logical Layer
– Remote reads/writes instead of message passing
– Can be easily added to the models if deemed important by Honeywell for their applications
[Figure: RapidIO Architecture Layers Highlighting RapidFabric Extensions. Diagram courtesy of: RapidFabric RapidIO Extensions Whitepaper, March 2004.]

Project Timetable
Literature Review [complete, yet ongoing]
– RIO spec, RIO components, SBR, misc.
RIO Component Modeling [complete]
– Layers, endpoints, switches, processors, etc.
– GMTI traffic models, memory boards, backplanes
RIO System Modeling [in progress]
– Proposed systems models, test plan
Simulation Experiments [to begin May 10th]
Data Analysis and Report [to begin June 20th]

Conclusions
Completed literature review
– Will include future topics as required
RIO components and systems are well underway
GMTI case study is well underway
Experiments identified
Foundation for future integrated payload prototyping is well underway

Future Collaboration Possibilities
Expanding the current project
– Jeremy Ramos’s suggestions for ST-8
  Examine latency sensitivity in RIO systems with data and control packets
  Examine RIO’s ability to throttle flow control for buffered pipeline processing with systems that have limited software support (i.e. RC devices)
– Include other aspects of UF’s Fast and Accurate Simulation Environment (FASE)
  Algorithm profiling, system tradeoff analysis
New directions
– Wireless RC interconnects proposal
– ST-8 / Integrated Payload middleware study
  Green Hills and other RTOS providers (evaluation / collaboration)
  Monitoring and management of RC components (light version of UF’s CARMA)
  Algorithm / system prototyping
– Other possibilities?