chrec f3: target tracking rafael garcia 11/26/08
CHREC F3: Target Tracking
Rafael Garcia
11/26/08
F3 Goals, Motivations, & Challenges
Goals
  Develop applications & design strategies for scalable architectures from case-study
  Analyze & examine available multi-FPGA platforms and tools for scalable system design
Motivations
  Meet performance requirements in HPC/HPEC scenarios by mapping across multiple FPGAs
  Exploit multi-FPGA platforms to develop larger, complex designs and algorithms
  Increase understanding of performance prediction, power, and usability for scalable apps
Challenges
  Perform multilevel algorithm partitioning, analysis, and optimization for multi-FPGA systems
  Determine influence of application characteristics on selection of platforms, tools and languages
[Figure: F3 Insights: Formulation, Translation, Design, Execution]
Kalman Filter Overview
  Traditional Kalman filters estimate the state of a dynamic system in a noisy environment
  Commonly used in target prediction and can be extended to multiple dimensions, targets, and models
  Excellent target tracker when an accurate model is known
  Useful even if an accurate model is not known
Current Architecture
  4 tightly coupled FPGAs mapped to 4 quadrants
  System is driven by two global clocks
    100 MHz inter-FPGA communication links
    50 MHz data-processing clock
  2-step processing cycle returns results at 25 MSa/s
  Inter-FPGA communication occurs when a target crosses a quadrant boundary
    Current state of the target is passed along
  Non-pipelined design
    2-step cycle where one step depends on the previous one and the other step depends on pseudo-sensor data from the host CPU
    Low frequency and lack of pipeline registers are expected to lower power consumption
    2-cycle design simplifies the communication network
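The relation between the data-processing clock and the result rate can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes that each of the two processing steps takes one cycle of the 50 MHz clock, which the slide implies but does not state, and the names are illustrative.

```python
# Figures from the slide; names are illustrative.
DATA_CLOCK_HZ = 50_000_000   # 50 MHz data-processing clock
STEPS_PER_RESULT = 2         # 2-step processing cycle (assumed: 1 clock per step)

def result_rate(clock_hz: int, steps_per_result: int) -> float:
    """Results per second for a non-pipelined design that needs
    `steps_per_result` clock cycles to produce each result."""
    return clock_hz / steps_per_result

rate = result_rate(DATA_CLOCK_HZ, STEPS_PER_RESULT)
print(f"{rate / 1e6:.0f} MSa/s")
```

Under that assumption the result rate matches the 25 MSa/s quoted on the slide.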
Current Architecture
  Continuously receiving pseudo-sensor data and returning condensed information
  Limited to a single target per quadrant
  Set sensor sampling rate of 25 MSa/s

  Resource                     M4K RAMs   DSPs   ALUTs
  Stratix II: EP2S180F1020C3   1%         15%    2%
Simplified Algorithm
  Assumes steady-state operation
    Target must closely follow given movement model for accurate results
    Allows for precomputed covariance and Kalman-gain terms
  Model tracks four parameters
    Horizontal position
    Vertical position
    Horizontal velocity
    Vertical velocity
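A steady-state filter with precomputed gains can be sketched as below. The slides do not give the motion model or the gain values, so this sketch assumes a constant-velocity model with independent horizontal and vertical axes, for which a fixed-gain Kalman filter takes the well-known alpha-beta form. `ALPHA`, `BETA`, `DT`, `AxisState`, `step_axis` and `step_target` are all hypothetical names and values.

```python
from dataclasses import dataclass

# Hypothetical fixed gains and sample period, for illustration only.
# In a steady-state design they would be computed offline and hardcoded.
ALPHA = 0.5   # position gain (assumed)
BETA = 0.1    # velocity gain per sample, already divided by DT (assumed)
DT = 1.0      # time between samples (assumed)

@dataclass
class AxisState:
    pos: float = 0.0
    vel: float = 0.0

def step_axis(s: AxisState, measured_pos: float) -> AxisState:
    """One predict/correct step for one axis using fixed gains."""
    pred_pos = s.pos + DT * s.vel        # predict with constant velocity
    residual = measured_pos - pred_pos   # measurement minus prediction
    return AxisState(pos=pred_pos + ALPHA * residual,
                     vel=s.vel + BETA * residual)

def step_target(x: AxisState, y: AxisState, zx: float, zy: float):
    """Update all four parameters: horizontal/vertical position and velocity."""
    return step_axis(x, zx), step_axis(y, zy)

# Usage: a noise-free target moving by (2, -1) per sample.
x, y = AxisState(), AxisState()
for k in range(1, 201):
    x, y = step_target(x, y, 2.0 * k, -1.0 * k)
print(x, y)
```

Because the gains are constants, each update needs only multiplications and additions, and the filter is accurate only while the target follows the assumed motion model.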
Algorithm Changes
  Remove the hardcoded terms, increasing prediction accuracy during non-steady-state situations
  Modify model to include Z-axis parameters for airborne targets
New Module Types
  Sensor          Target     Precision   Resource   Kernel
  Low Power       Slow       Fixed       Low        Kalman Filter
  Fast Sampling   Fast       Fixed       Low        Kalman Filter
  Multi-Scale     Airborne   Floating    High       MKS
  High-Noise      Noisy      Floating    Medium     Kalman Filter
  Selective       Multiple   Floating    High       Feature Selection
RCML Representation
  BCast
  Start/Initialize
  for each C value in PredictionVector
    Time-Update (“Predict”)
    i=4
      Update error covariance
      Next-state prediction
  Gather
  Report Current Results
  for each D value in MeasurementVector
    Measurement-Update (“Correct”)
    i=4
      Compute Kalman gain
      Correct prediction
      Update error covariance
  Time-Step Advance
  Generate Sensor Readings
  Data Set: PredictionVector
    Element Type: fixed Position, fixed Acceleration
    Num Elements := 4
  Data Set: MeasurementVector
    Element Type: fixed Position, fixed Acceleration
    Num Elements := 4
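The predict/correct loop in the representation above can be sketched in Python for a single scalar state. This is a simplified illustration, not the RCML model itself: it assumes a state that stays constant between samples, and the noise variances `Q` and `R`, the initial values and all function names are hypothetical. Unlike a steady-state design, the gain and error covariance are recomputed at every step.

```python
import random

Q = 1e-4   # process-noise variance (assumed)
R = 0.25   # measurement-noise variance (assumed)

def time_update(x, p):
    """Predict: next-state prediction and error-covariance update.
    The assumed model is 'state stays constant between samples'."""
    return x, p + Q

def measurement_update(x, p, z):
    """Correct: compute Kalman gain, correct prediction, update covariance."""
    k = p / (p + R)
    x = x + k * (z - x)
    p = (1.0 - k) * p
    return x, p, k

def run(true_value, steps, seed=0):
    rng = random.Random(seed)
    x, p, k = 0.0, 1.0, 0.0                        # Start/Initialize
    for _ in range(steps):                         # Time-Step Advance
        z = true_value + rng.gauss(0.0, R ** 0.5)  # Generate Sensor Readings
        x, p = time_update(x, p)
        x, p, k = measurement_update(x, p, z)
    return x, p, k

estimate, variance, gain = run(true_value=5.0, steps=500)
print(estimate, variance, gain)
```

The gain computation needs a division, which is one reason a design with precomputed gains is simpler to build in hardware.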
Kalman Filter
  Estimates state of a dynamic system in a noisy environment
    In this case, the ‘dynamic system’ is a moving target
  Commonly used in target prediction and can be extended to multiple dimensions, targets, and models
  Assumes sensor noise is white Gaussian noise
  Requires a pre-programmed model describing the target’s motion
  Works in a continuous 2-cycle loop
  Developed in 1960 by Rudolf E. Kalman (a UF professor from 1971-1992!)
Kalman Filter can be viewed as a simple black box
  An input stream of samples measuring a target’s position is contaminated with noisy samples
  The output is a stream of samples with most of the noisy samples filtered out
[Figure: Kalman filter as a black box. Accurate samples and noisy samples enter the Kalman filter; mostly accurate samples come out. Kalman system models shown as examples: -9.8 m/s, NE wind at 23 mph, follows road.]
Reasons for sensor noise
  Battery Power
    variable battery voltage
    voltage regulators cost money, draw power, and are not perfect
  Sensors
    low quality sensors
      cost-cutting for mass production sometimes requires cheap sensors
    incorrectly deployed sensors
      bad orientation, obstructed sensor
  Environment
    environmental conditions
      rain, dust, night-time tracking, snow
  Multiple targets
    misinterpreted samples from neighboring targets during multiple-target tracking
    Sensor processing stage must ensure proper target isolation
  Wireless signal
    bad data from neighboring sensors due to a weak wireless signal
Kalman Filter example
PR Virtual Architecture with Kalman Filters
  Sensor records samples
  Image processing step extracts specific features
    Target size, vertical position, horizontal position, target bearing, elevation, etc.
  Kalman filters filter out sensor noise
  Results are sent to a central location to be displayed
[Figure: communication architecture on a VLX25: Sensor Interface, Switch 1 through Switch 5, five module interfaces, five Kalman filter modules, Display Interface]
FPGA and PR benefits for the Kalman Filter
  FPGA amenable features
    Low memory requirements
      Simple filter with streaming inputs and outputs
    Can be implemented using only logic and MAC units
      Requires only multiplication and addition
      No complex time-consuming operations such as division, square-root, differentiation, etc.
    Low bandwidth requirements
      Filter receives/produces a stream of coordinates, not a stream of images
  PR amenable features
    Optimum resource usage
      The right filter type for the right job
    Swapping modules does not halt execution
      Active filters are never disturbed
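The point about needing only multiplication and addition can be illustrated with a fixed-point multiply-accumulate (MAC) sketch. The Q16.16 number format, the gain value and every name below are assumptions made for illustration; the slides do not specify a word format. Python integers stand in for hardware registers, and the float conversions exist only to make the example readable.

```python
FRAC_BITS = 16           # Q16.16 fixed-point format (assumed)
ONE = 1 << FRAC_BITS

def to_fixed(v: float) -> int:
    """Host-side helper: convert a float to Q16.16."""
    return int(round(v * ONE))

def to_float(v: int) -> float:
    """Host-side helper: convert Q16.16 back to a float for display."""
    return v / ONE

def fx_mul(a: int, b: int) -> int:
    """Fixed-point multiply: integer product, then shift back to Q16.16."""
    return (a * b) >> FRAC_BITS

def mac(acc: int, a: int, b: int) -> int:
    """Multiply-accumulate: acc + a * b in fixed point."""
    return acc + fx_mul(a, b)

GAIN = to_fixed(0.25)    # hypothetical precomputed gain

def correct(pred: int, meas: int) -> int:
    """est = pred + GAIN * (meas - pred): one subtraction and one MAC."""
    return mac(pred, GAIN, meas - pred)

est = correct(to_fixed(10.0), to_fixed(12.0))
print(to_float(est))
```

With a precomputed gain, the correction step maps onto a subtractor and a single MAC unit, with no divider or square-root logic.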
Experimental FPGA Power Measurements
GiDEL Host Specifications
  Dual Xeon 3.00 GHz processors (Pentium 4 era)
  2 GB RAM
  Single 500 GB hard drive
  CD drive
  600 W max power supply (Kappa clone)
ProcStar II Power Characteristics
  Main board supply rated at 7.6 A at 3.3 V
  7.6 A × 3.3 V = 25.08 W maximum power available to:
    Stratix II EP2S180 FPGA (4x)
    2 GB SODIMM DDR memory (2x) (only 1 used for tests)
    64 MB SRAM memory (8x)
    Miscellaneous oscillators, peripherals, controllers, etc.
  This means roughly 5 W max available to each FPGA
Test Design Characteristics
  Kalman tracking filters
    Heavy multiplier usage, no block RAMs, minimal logic usage (w/ dedicated multipliers)
  In all cases, design runs at 33 MHz
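The board power budget above can be reproduced with a short calculation. The supply ratings and the 5 W figure come from the slide; the reading that "roughly 5 W" leaves the remainder for memories and other components is an interpretation, and the variable names are illustrative.

```python
SUPPLY_CURRENT_A = 7.6    # main board supply rating
SUPPLY_VOLTAGE_V = 3.3
NUM_FPGAS = 4
PER_FPGA_BUDGET_W = 5.0   # "roughly 5 W" figure from the slide

board_max_w = SUPPLY_CURRENT_A * SUPPLY_VOLTAGE_V

# Upper bound per FPGA if memories and peripherals drew nothing.
even_split_w = board_max_w / NUM_FPGAS

# Power left for memories, oscillators, etc. if each FPGA used 5 W.
left_for_other_parts_w = board_max_w - NUM_FPGAS * PER_FPGA_BUDGET_W

print(f"board maximum:        {board_max_w:.2f} W")
print(f"even split per FPGA:  {even_split_w:.2f} W")
print(f"left for other parts: {left_for_other_parts_w:.2f} W")
```

An even four-way split gives about 6.27 W, so the 5 W estimate corresponds to setting aside about 5 W of the 25.08 W for the rest of the board.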
Experimental Setup
Methodology
  GiDEL host system measured without FPGA board
    P3 Kill-A-Watt AC power meter used for measurements
      0.2% documented accuracy
      Accurate to within 1 Watt
    7 different test cases with varying power utilization
  GiDEL host system measured with FPGA board
    Same 7 test cases were used (without loading an FPGA design)
    This provides minimum power-use baseline for ProcStar II
  GiDEL board is loaded with a computationally intensive FPGA design
    CPU is kept idle
    Power consumption under regular design is measured (@ 33 MHz)
      2% logic use (per FPGA)
      15% multiplier use (per FPGA)
      1 filter instance per FPGA
    Power consumption under maximum-multiplier-use design is measured (@ 33 MHz)
      4% logic use
      88% multiplier use
      7 filter instances per FPGA
    Power consumption under maximum-logic-use design is measured (@ 33 MHz)
      77% logic use
      0% multiplier use
      34 filter instances per FPGA
Results: Baseline ProcStar II
  Test Cases                                   Without ProcStar II   With ProcStar II
  1. Server off (not standby)                  8 W                   8 W
  2. Idle                                      127 W                 137 W
  3. Idle with CDROM spinning                  131 W                 141 W
  4. Full HDD load (defrag)                    132 W                 143 W
  5. Full CPU load (1 thread)                  188 W                 198 W
  6. Full CPU load (4 threads)                 255 W                 257 W
  7. Full CPU/HDD load (3 threads, defrag)     258 W                 264 W

  Threads are simple while(1) loops
  Although only 2 cores are present, 4 threads were used to bypass Hyper-threading and OS scheduling
  HDD load is an exception since defrag requires its own thread to be effective
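The extra power drawn with the unloaded ProcStar II installed follows directly from the baseline readings. The sketch below subtracts the two columns for each test case; the data structure and names are illustrative, and the values are the readings from the slide.

```python
# (test case, watts without ProcStar II, watts with ProcStar II)
READINGS = [
    ("Server off (not standby)", 8, 8),
    ("Idle", 127, 137),
    ("Idle with CDROM spinning", 131, 141),
    ("Full HDD load (defrag)", 132, 143),
    ("Full CPU load (1 thread)", 188, 198),
    ("Full CPU load (4 threads)", 255, 257),
    ("Full CPU/HDD load (3 threads, defrag)", 258, 264),
]

# Difference between the two columns for each test case.
overhead = {name: with_w - without_w for name, without_w, with_w in READINGS}

for name, delta in overhead.items():
    print(f"{name:40s} +{delta} W")
```

Since the meter is accurate only to within 1 W, a difference of a watt or two between test cases is within measurement error.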
Results: Kalman Filters on ProcStar II
  Power estimates
    12.5% toggle rate assumed @ 33 MHz
  Experimental numbers below assume FPGAs consume all power (i.e., ProcStar II memories, glue logic, etc. consume 0 W)
  Design 1
    140 W total power
    ~3.25 W per FPGA
    15% mult., 2% logic
    1 filter instance, high Fmax
  Design 2
    140 W total power
    ~3.25 W per FPGA
    88% mult., 4% logic
    7 filter instances, high Fmax
  Design 3
    152 W total power
    ~6.25 W per FPGA
    0% mult., 77% logic
    34 filter instances, low Fmax
Results: Kalman Filter in ProcStar II
*Measured power is derived by subtracting baseline power consumption on ProcStar II board from measured power consumption and dividing by 4
Power consumed by other board components is not accounted for, so actual FPGA power consumption is lower
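The per-FPGA derivation can be sketched as follows. The slides do not say which reading served as the baseline. This sketch assumes the 127 W idle reading taken without the board, because that value reproduces the reported ~3.25 W and ~6.25 W figures; the 137 W idle reading with the board installed would instead give 0.75 W and 3.75 W. Names are illustrative.

```python
HOST_IDLE_W = 127   # assumed baseline: idle host without the ProcStar II
NUM_FPGAS = 4

# Total wall power reported for each design.
TOTALS_W = {"Design 1": 140, "Design 2": 140, "Design 3": 152}

def per_fpga_power(total_w, baseline_w=HOST_IDLE_W, n=NUM_FPGAS):
    """(measured - baseline) / number of FPGAs, with all remaining
    power attributed to the FPGAs."""
    return (total_w - baseline_w) / n

per_fpga = {name: per_fpga_power(w) for name, w in TOTALS_W.items()}
print(per_fpga)
```

Because memories and glue logic are counted as 0 W, these values are upper bounds on what each FPGA actually draws.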
Questions?