Evolutionary Feature Extraction for SAR Air to Ground Moving Target Recognition – a Statistical Approach. Evolving Hardware. Dr. Janusz Starzyk, Ohio University

--11

Evolutionary Feature Extraction for SAR Air to Ground Moving Target Recognition – a Statistical Approach

Evolving Hardware

Dr. Janusz Starzyk, Ohio University

--22

Neural Network Data Classification

Concept of “Logic Brain”

Random learning data generation

Multiple space classification of data

Feature function extraction

Dynamic selectivity strategy

Training procedure for data identification

FPGA implementation for fast training process

--33


Concept of “Logic Brain”

• Threshold setup converts analog to digital world

• “Logic Brain” is possible based on artificial neural network

Random learning data generation

• Gaussian distribution random multiple dimension data generation

• Half data sets prepared for learning procedure

• Another half used later for training procedure
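The data generation step can be illustrated with a minimal sketch. It assumes independent Gaussian coordinates with a common mean and standard deviation; the function names, the sample count, and the dimension are hypothetical choices, not values from the slides.

```python
import random

def make_gaussian_data(n_samples, n_dims, mean=0.0, std=1.0, seed=0):
    """Draw n_samples points whose coordinates are independent Gaussians
    (an assumption; the slides do not give the distribution parameters)."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, std) for _ in range(n_dims)]
            for _ in range(n_samples)]

def split_in_half(data):
    """Return (first_half, second_half) of the data set."""
    mid = len(data) // 2
    return data[:mid], data[mid:]

data = make_gaussian_data(n_samples=100, n_dims=4)
learning_set, training_set = split_in_half(data)
print(len(learning_set), len(training_set))  # 50 50
```

The fixed seed only makes the sketch repeatable; any seed gives the same split sizes.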

Abdulqadir Alaqeeli, and Jing Pang

--44


Multiple space classification of data

• Each space can be represented by a set of minimum base vectors

Feature function extraction and dynamic selecting strategy

• Conditional entropy extracts information in each subspace

• Different combinations of base vectors compose the redundant sets of new subspace

– expansion strategy

• Minimum function selection

– shrinking strategy

--55


FPGA implementation for fast training process

• Learning results are saved on board

• Testing data sets are generated on board and sent through the artificial neural network generated on board to test the successful data classification rate

• The results are displayed on board

Promising application

• Especially useful for feature extraction of large data sets

• Catastrophic circuit fault detection

--66

Information Index: Background

• A priori class probabilities are known

• Entropy measure based on conditional probabilities

$$
H(X;Y) = P_{1w}\log(P_{1w}) + P_{2w}\log(P_{2w}) + P_{12w}\log(P_{12w}) + P_{21w}\log(P_{21w}) - P_{1}\log(P_{1}) - P_{2}\log(P_{2})
$$

[Figure: scatter plot of two groups of samples, drawn as X and O markers and labeled Class A and Class B]

--77


• P1 and P2 are a priori class probabilities

• P1w and P2w are conditional probabilities of correct classification for each class

• P12w and P21w are conditional probabilities of misclassification given a test signal

• P1w , P2w, P12w and P21w are calculated using Bayesian estimates of their probability density functions
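As a rough illustration, the entropy measure H(X;Y) from the background slide can be evaluated directly once the six probabilities are known. This sketch assumes base-2 logarithms, the convention 0·log(0) = 0, and that P1w + P12w = P1 and P2w + P21w = P2; none of these is stated on the slides, and the function names are hypothetical.

```python
import math

def plogp(p, base=2.0):
    """Return p*log(p), using the convention 0*log(0) = 0 (an assumption)."""
    return 0.0 if p == 0.0 else p * math.log(p, base)

def information_index(p1, p2, p1w, p2w, p12w, p21w, base=2.0):
    """Evaluate H(X;Y) term by term, exactly as written on the slide."""
    classified = (plogp(p1w, base) + plogp(p2w, base)
                  + plogp(p12w, base) + plogp(p21w, base))
    priors = plogp(p1, base) + plogp(p2, base)
    return classified - priors

# Perfectly separated classes: no misclassification mass.
print(information_index(0.5, 0.5, 0.5, 0.5, 0.0, 0.0))
# Fully overlapping classes: every decision is a coin flip.
print(information_index(0.5, 0.5, 0.25, 0.25, 0.25, 0.25))
```

Under these assumptions the expression is 0 for perfectly separated classes and -1 for two equally likely classes that overlap completely.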

--88


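• Probability density functions of P1w, P2w, P12w, P21w

$$
\begin{aligned}
\mathrm{pdf}_{P_{1w}} &= \frac{\mathrm{pdf}_1 \cdot \mathrm{pdf}_1 \cdot P_1}{\mathrm{pdf}_1 \cdot P_1 + \mathrm{pdf}_2 \cdot P_2} \\
\mathrm{pdf}_{P_{2w}} &= \frac{\mathrm{pdf}_2 \cdot \mathrm{pdf}_2 \cdot P_2}{\mathrm{pdf}_1 \cdot P_1 + \mathrm{pdf}_2 \cdot P_2} \\
\mathrm{pdf}_{P_{12w}} &= \frac{\mathrm{pdf}_1 \cdot \mathrm{pdf}_2 \cdot P_2}{\mathrm{pdf}_1 \cdot P_1 + \mathrm{pdf}_2 \cdot P_2} \\
\mathrm{pdf}_{P_{21w}} &= \frac{\mathrm{pdf}_2 \cdot \mathrm{pdf}_1 \cdot P_1}{\mathrm{pdf}_1 \cdot P_1 + \mathrm{pdf}_2 \cdot P_2}
\end{aligned}
$$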

--99

Direct Integration

$$
\sum_{i}^{m} \mathrm{pdf} \cdot S_i
$$

For N dimensions, $m^N$ grid points are needed to estimate the pdf.

[Figure: two plots (horizontal axis 0 to 500, vertical axis 0 to 1.2) comparing a uniform grid, where $S_i = S_k$, with a nonuniform grid, where $S_i < S_k$]
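Direct integration on a uniform grid can be sketched in one dimension as follows. The two Gaussian class pdfs, their means, the equal priors, and the integration limits are all illustrative assumptions, and the midpoint rule is one possible choice of grid point per cell.

```python
import math

def gauss_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def direct_integral(f, lo, hi, m):
    """Approximate the integral of f on [lo, hi] as a sum of f(x_i) * S_i
    over m equal cells (uniform grid), sampling each cell at its midpoint."""
    cell = (hi - lo) / m
    return sum(f(lo + (i + 0.5) * cell) * cell for i in range(m))

# Hypothetical two-class problem with equal priors.
P1, P2 = 0.5, 0.5
pdf1 = lambda x: gauss_pdf(x, 0.0, 1.0)
pdf2 = lambda x: gauss_pdf(x, 2.0, 1.0)

def w1(x):
    """Weight pdf1*P1 / (pdf1*P1 + pdf2*P2) at point x."""
    a, b = pdf1(x) * P1, pdf2(x) * P2
    return a / (a + b)

p1w = P1 * direct_integral(lambda x: pdf1(x) * w1(x), -10.0, 12.0, 2000)
p12w = P1 * direct_integral(lambda x: pdf1(x) * (1.0 - w1(x)), -10.0, 12.0, 2000)
print(p1w, p12w, p1w + p12w)
```

With m cells per axis, the same scheme in N dimensions needs m to the power N evaluations, which is what makes the grid approach expensive.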

--1010

Monte Carlo Integration

[Figure: curves pdf1 and pdf2 with the weight W(Xi) marked at a sample point xi; horizontal axis -100 to 700, vertical axis 0 to 1.5]

$$
P_{1w} = P_1 \sum_{i=1}^{m} W(X_i) / m
$$

Xi generated with pdf1

--1111

Information Index: probability density functions

[Figure: pdfs of the dominating feature for BMP2 and BTR60, with the region P2w marked; horizontal axis -0.15 to 0.25, vertical axis 0 to 1]

--1212

Information Index: weighted pdfs

[Figure: misclassification pdfs of the dominating feature for BMP2 and BTR60, with the region P2w marked; horizontal axis: feature value (-0.15 to 0.25); vertical axis: relative density (0 to 1)]

--1313

Information Index: Monte Carlo Integration

• To integrate the probability density function

– generate random points xi with pdf1

– weight generated points according to

$$
w_1(x_i) = \frac{\mathrm{pdf}_1 \cdot P_1}{\mathrm{pdf}_1 \cdot P_1 + \mathrm{pdf}_2 \cdot P_2}
$$

– estimate the conditional probability P1w using

$$
P_{1w} = P_1 \sum_{i=1}^{m} w_1(x_i) / m
$$
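The Monte Carlo steps above can be sketched for a one-dimensional case. The Gaussian class pdfs, their parameters, the equal priors, and the function names are assumptions made for illustration only.

```python
import math
import random

def gauss_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def estimate_p1w(m, p1=0.5, p2=0.5, mean1=0.0, mean2=2.0, std=1.0, seed=1):
    """Estimate P1w = P1 * sum(w1(x_i)) / m with points x_i drawn from pdf1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        x = rng.gauss(mean1, std)            # x_i generated with pdf1
        a = gauss_pdf(x, mean1, std) * p1    # pdf1 * P1
        b = gauss_pdf(x, mean2, std) * p2    # pdf2 * P2
        total += a / (a + b)                 # weight w1(x_i)
    return p1 * total / m

for m in (100, 1000, 20000):
    print(m, estimate_p1w(m))
```

Unlike the grid approach, the number of points m does not have to grow exponentially with the dimension; only the sampling step and the pdf evaluations change.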

--1414

Information Index and Probability of Misclassification

[Figure: conditional probability P12w (vertical axis, 0 to 0.14) plotted against information (horizontal axis, 0 to 1)]

--1515

Standard Deviation of Information in MC Simulation

[Figure: surface plot of log2 of the information error (-20 to 0) against average information level (0.2 to 1) and log2 of the number of the MC points (0 to 15)]

--1616

Normalized Standard Deviation of Information

[Figure: surface plot of log2 of the normalized information error (-15 to 5) against average information level (0.2 to 1) and log2 of the number of the MC points (0 to 15)]

--1717

Information Index: Status

• MIIFS was generalized to continuous distributions

• N-dimensional information index was developed

• Efficient N-dimensional integration was used

• Information error analysis was performed

• Information index can be used with non-Gaussian distributions

• For small training sets and a low information index, the information error is larger than the information

--1818

Optimum Transformation: Background

• Principal Component Analysis (PCA) based on Mahalanobis distance suffers from scaling

• PCA assumes Gaussian distributions and estimates covariance matrices and mean values

• PCA is sensitive to outliers

• Wavelets provide compact data representation and improve recognition

• Improvement shows no statistically significant difference in recognition for different wavelets

• Need for a specialized transformation

--1919

Optimum Transformation: Haar Wavelet

• Example

$$
H_i = \frac{a_{2i} + a_{2i+1}}{2}, \quad i = 0 \ldots (N/2) - 1 \quad \text{(average)}
$$

$$
H_{N/2+i} = a_{2i} - a_{2i+1}, \quad i = 0 \ldots (N/2) - 1 \quad \text{(difference)}
$$

                    a0     a1     a2     a3     a4     a5     a6     a7

Input signal        0.0    0.5    1.0    0.5    0.0   -0.5   -1.0   -0.5

Haar coefficients   0.25   0.75  -0.25  -0.75  -0.5    0.5    0.5   -0.5

--2020


• Repeat average and difference log2(n) times

Level 0 (input signal)   0      0.5    1      0.5    0     -0.5   -1     -0.5

Level 1                  0.25   0.75  -0.25  -0.75  -0.5    0.5    0.5   -0.5

Level 2                  0.5   -0.5   -0.5    0.5    0      0     -1      1

Level 3                  0      1      0     -1      0      0      0     -2
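The level values in this example are reproduced when the average and difference step is applied at each level to every block separately, with the block length halving each time. That block-wise reading is inferred from the numbers, not stated on the slide; the sketch below follows it, with hypothetical function names and an input length that must be a power of two.

```python
def haar_step(block):
    """One pass: averages of adjacent pairs first, then their differences."""
    half = len(block) // 2
    avg = [(block[2 * i] + block[2 * i + 1]) / 2 for i in range(half)]
    diff = [block[2 * i] - block[2 * i + 1] for i in range(half)]
    return avg + diff

def haar_levels(signal):
    """Return [level 0, level 1, ...]. At each level the step is applied to
    every block separately, and the block length halves for the next level."""
    levels = [list(signal)]
    size = len(signal)
    while size >= 2:
        prev = levels[-1]
        nxt = []
        for start in range(0, len(prev), size):
            nxt += haar_step(prev[start:start + size])
        levels.append(nxt)
        size //= 2
    return levels

signal = [0, 0.5, 1, 0.5, 0, -0.5, -1, -0.5]
for number, level in enumerate(haar_levels(signal)):
    print("Level", number, level)
```

For the 8-sample input this produces levels 0 to 3, that is, log2(8) = 3 repetitions of the step.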

--2121


• Waveform interpretation

--2222


• Matrix interpretation

• b=W*a where W is

$$
W = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
.5 & .5 & 0 & 0 \\
0 & 0 & .5 & .5 \\
1 & -1 & 0 & 0 \\
0 & 0 & 1 & -1 \\
.25 & .25 & .25 & .25 \\
.5 & .5 & -.5 & -.5 \\
.5 & -.5 & .5 & -.5 \\
1 & -1 & -1 & 1
\end{pmatrix}
$$
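One way to check the matrix interpretation is to build W column by column from unit vectors and compare W*a with the stacked levels of a. This is a sketch under assumptions: the signal length is a power of two, level 0 (the input itself) is stacked on top, and the step is applied block by block at each level; the function names are hypothetical.

```python
def haar_step(block):
    """Averages of adjacent pairs first, then their differences."""
    half = len(block) // 2
    avg = [(block[2 * i] + block[2 * i + 1]) / 2 for i in range(half)]
    diff = [block[2 * i] - block[2 * i + 1] for i in range(half)]
    return avg + diff

def all_levels(signal):
    """Concatenate level 0 (the input) with every average/difference level."""
    level, out, size = list(signal), list(signal), len(signal)
    while size >= 2:
        level = [v for s in range(0, len(level), size)
                 for v in haar_step(level[s:s + size])]
        out += level
        size //= 2
    return out

def haar_matrix(n):
    """Build W so that b = W * a; column j is the response to unit vector e_j."""
    columns = [all_levels([1.0 if i == j else 0.0 for i in range(n)])
               for j in range(n)]
    return [list(row) for row in zip(*columns)]

def matvec(W, a):
    return [sum(w * x for w, x in zip(row, a)) for row in W]

W = haar_matrix(4)
for row in W:
    print(row)
```

Because every step is linear, the transform of any signal equals the same combination of the unit-vector responses, which is why the column-by-column construction is valid.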

--2323


• Matrix interpretation for the class of signals B=W*A

• where A is (n x m) input signal matrix

• Selection of n best coefficients performed using the information index

Bs1=S1*W*A

• where S1 is (n x n*log2(n)) selection matrix

--2424

Optimum Transformation: Evolutionary Iterations

• Iterating on the selected result: Bs2 = S2*W*Bs1

• where S2 is a selection matrix, or Bs2 = S2*W*S1*W*A

• after k iterations: Bsk = Sk*W* ... S2*W*S1*W*A

• So the optimized transformation matrix T = Sk*W* ... S2*W*S1*W can be obtained from the Haar wavelet
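The iteration T = Sk*W* ... S2*W*S1*W can be sketched as repeated "transform, score, keep the n best rows". Several parts are assumptions: W here stacks levels 1 to log2(n) only, so that it has n*log2(n) rows; the score is a simple class-separation ratio standing in for the information index, which this sketch does not implement; and the two classes are synthetic signals invented for the example.

```python
import random
import statistics

def haar_step(block):
    half = len(block) // 2
    avg = [(block[2 * i] + block[2 * i + 1]) / 2 for i in range(half)]
    diff = [block[2 * i] - block[2 * i + 1] for i in range(half)]
    return avg + diff

def haar_rows(n):
    """Rows of W for levels 1..log2(n): an (n*log2(n)) x n matrix."""
    cols = []
    for j in range(n):
        level, out, size = [1.0 if i == j else 0.0 for i in range(n)], [], n
        while size >= 2:
            level = [v for s in range(0, n, size)
                     for v in haar_step(level[s:s + size])]
            out += level
            size //= 2
        cols.append(out)
    return [list(r) for r in zip(*cols)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def separation(row, class_a, class_b):
    """Stand-in score (NOT the information index): mean gap over spread."""
    fa = [sum(r * x for r, x in zip(row, s)) for s in class_a]
    fb = [sum(r * x for r, x in zip(row, s)) for s in class_b]
    spread = statistics.pstdev(fa) + statistics.pstdev(fb) + 1e-12
    return abs(statistics.mean(fa) - statistics.mean(fb)) / spread

def evolve(class_a, class_b, n, iterations):
    W = haar_rows(n)
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        candidates = matmul(W, T)   # W applied to the current transformation
        ranked = sorted(candidates, reverse=True,
                        key=lambda r: separation(r, class_a, class_b))
        T = ranked[:n]              # keeping n rows acts as the selection S_k
    return T

# Synthetic two-class data (hypothetical): two mean shapes plus Gaussian noise.
rng = random.Random(0)
shape_a = [0, 0.5, 1, 0.5, 0, -0.5, -1, -0.5]
shape_b = [0, -0.5, -1, -0.5, 0, 0.5, 1, 0.5]
class_a = [[v + rng.gauss(0, 0.3) for v in shape_a] for _ in range(20)]
class_b = [[v + rng.gauss(0, 0.3) for v in shape_b] for _ in range(20)]
T = evolve(class_a, class_b, n=8, iterations=3)
print(len(T), len(T[0]))  # 8 8
```

Picking n rows of W*T is the same as multiplying by a 0/1 selection matrix with n rows, so the result stays an n x n matrix at every iteration.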

--2525


• Learning with the evolved features

[Figure: Selected coefficients in two class ATR problem; horizontal axis: most selective coefficients in decreasing order (0 to 140); vertical axis: information value (0 to 0.8)]

--2626


• Waveform interpretation of T rows

[Figure: four waveform panels, each with horizontal axis 0 to 150 and vertical axis -2 to 2]

--2727


• Mean values and the evolved transformation

[Figure: Original signals and the evolved transformation; horizontal axis: bin index (0 to 140); vertical axis: signal value (-1.5 to 1.5)]

--2828

Two Class Training

• Training on HRR signals: 17° depression angle profiles of BMP2 and BTR60

[Figure: Information for BTR60 and BMP2; horizontal axis: number of features (0 to 15); vertical axis: information and error (0 to 1)]

--2929

Wavelet-Based Reconfigurable FPGA for Classification

[Diagram: a window over time t takes m 8-bit samples (Sample # 1 to Sample # m), which feed the Haar-Wavelet Transform; outputs 1 to k feed the neural network (N.N.), which recognizes the input signal]

Note: k m

--3030

Block Diagram of The Parallel Architecture

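[Diagram: 8 input samples (0 to 7) pass through one row of registers and three rows of average and difference units to produce 8 output samples. Rows, in order: R R R R R R R R; A A A A D D D D; A A D D D D A A; A D A D D A D A. The first average unit is annotated (0+1)/2.]

D = DIFFERENCE with registered out

A = AVERAGE with registered out

R = REGISTER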

--3131

Simplified Block Diagram of The Serial Architecture

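[Diagram: serial architecture built from registers (R), registered average units (A), and registered difference units (D). Samples are handled in pairs, giving (0+1)/2 and (0-1), then (2+3)/2 and (2-3). Two color-coded data paths are used: first the blue, second the green.]

R: register using CLBs

R: register using IOBs

A: registered average

D: registered difference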

--3232

RAM-Based Wavelet

[Diagram: five RAM16x8 blocks, each with WA and RA ports, alternating with four processing elements (PE), each with its own Control block; signals DataIn, Start, and Done]

--3333

The Processing Element

[Diagram: the processing element. Two registers hold the consecutive 8-bit data words Dt and Dt+1, which feed an (A + B)/2 unit and an A - B unit (5 CLBs each); a block labeled M leads to the 8-bit output Dout.]

Note: A = Dt and B = Dt+1

--3434

Results: For One Iteration of Haar Wavelet

• For 8 samples:

– Parallel arch.: 120 CLBs, 128 IOBs, 58 ns.

– Serial arch.: 98 CLBs*, 72 IOBs, 148 ns*.

Parallel arch. wins for larger numbers of samples.

• For 16 samples:

– Parallel arch.: 320 CLBs, 256 IOBs, 233 ns.

– RAM-based arch.: 136 CLBs, 16 IOBs, ~ 1s.

RAM-based arch. wins since 1s is not so slow.

* These values increase very fast as the number of samples increases, and the delay becomes much higher.

--3535

Reconfigurable Haar-Wavelet-Based Architecture

[Diagram: blocks ROM#1 to ROM#3, RAM#1 to RAM#5, a temporary RAM, four processing elements (PE), and controllers for selecting coefficients; signals Data In, Feedback, and Selected Coefficients]

--3636

--3737

Test Results

• Testing on HRR signals: 15° depression angle profiles of BMP2 and BTR60

• With 15 features selected, correct classification is 69.3% for BMP2 data and 82.6% for BTR60

• Comparable results in the SHARP confusion matrix are 56.7% for BMP2 data and 67% for BTR60

--3838

Problem Issues

• BTR60 signals with 17° and 15° depression angles do not have compatible statistical distributions

[Figure: plot with both axes ranging from -0.4 to 0.5]

--3939

Problem Issues

• BMP2 and BTR60 signal distributions are not Gaussian

[Figure: Two dimensional projection of transformed two class HRR data; horizontal axis -0.5 to 2.5, vertical axis -2 to 2.5]

--4040

Work Completed

• Information index and its properties

• Multidimensional MC integration

• Information as a measure of learning quality

• Information error

• Wavelets and their effect on pattern recognition

• Haar wavelet as a linear matrix operator

• Evolution of the Haar wavelet

• Statistical support for classification

--4141

Recommendations and Future Work

• Training data must represent a statistical sample of all signals, not a hand-picked subset

• Probability density functions will be approximated using parametric or NN approach

• Information measure will be extended to k-class problems

• Training and testing will be performed on 12-class data

• Dynamic clustering will prepare decision tree structure

• Hybrid, evolutionary classifier will be developed
