Post on 20-Dec-2015
A Plea for Adaptive Data Analysis
An Introduction to HHT
Norden E. Huang
Research Center for Adaptive Data Analysis
National Central University
Ever since the advent of computers, there has been an explosion of data.
The situation has changed from a thirst for data to drinking from a fire hydrant.
We are drowning in data, but thirsty for knowledge!
Henri Poincaré
Science is built up of facts*,
as a house is built of stones;
but an accumulation of facts is no more a science
than a heap of stones is a house.
* Here facts are indeed data.
Scientific Activities
Collecting, analyzing, synthesizing, and theorizing are the core of scientific activities.
A theory without data to support it is just a hypothesis.
Therefore, data analysis is a key link in this continuous loop.
Different Paradigms I: Mathematics vs. Science/Engineering
• Mathematicians
• Absolute proofs
• Logical consistency
• Mathematical rigor
• Scientists/Engineers
• Agreement with observations
• Physical meaning
• Working Approximations
Different Paradigms II: Mathematics vs. Science/Engineering
• Mathematicians
• Idealized Spaces
• Perfect world in which everything is known
• Inconsistency between the different spaces and the real world
• Scientists/Engineers
• Real Space
• Real world in which knowledge is incomplete and limited
• Constancy in the real world within allowable approximation
Rigor vs. Reality
As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.
Albert Einstein
Data Processing and Data Analysis
• Processing [proces < L. Processus < pp of Procedere = Proceed: pro- forward + cedere, to go] : A particular method of doing something.
• Data Processing >>>> Mathematically meaningful parameters
• Analysis [Gr. ana, up, throughout + lysis, a loosing] : A separating of any whole into its parts, especially with an examination of the parts to find out their nature, proportion, function, interrelationship etc.
• Data Analysis >>>> Physical understandings
Traditional Data Analysis
All traditional ‘data analysis’ methods are either developed by mathematicians or established according to their rigorous rules. They are really ‘data processing’ methods.
In pursuit of mathematical rigor and certainty, however, we are forced to idealize, and thereby deviate from, reality.
Traditional Data Analysis
As a result, we are forced to live in a pseudo-real world, in which all processes are
Linear and Stationary
Available ‘Data Analysis’ Methods for Nonstationary (but Linear) time series
• Spectrogram
• Wavelet Analysis
• Wigner-Ville Distributions
• Empirical Orthogonal Functions aka Singular Spectral Analysis
• Moving means
• Successive differentiations
Available ‘Data Analysis’ Methods for Nonlinear (but Stationary and Deterministic) time series
• Phase space method
• Delay reconstruction and embedding
• Poincaré surface of section
• Self-similarity, attractor geometry & fractals
• Nonlinear Prediction
• Lyapunov Exponents for stability
Typical Apologia
• Assuming the process is stationary ….
• Assuming the process is locally stationary ….
• As the nonlinearity is weak, we can use perturbation approach ….
We can assume all we want, but reality cannot be bent by the assumptions.
Motivations for alternatives: Problems for Traditional Methods
• Physical processes are mostly nonstationary
• Physical processes are mostly nonlinear
• Data from observations are invariably too short
• Physical processes are mostly non-repeatable
An ensemble mean is impossible, and a temporal mean might not be meaningful for lack of stationarity and ergodicity. Traditional methods are inadequate.
The Job of a Scientist
The job of a scientist is to listen carefully to nature, not to tell nature how to behave.
Richard Feynman
To listen is to use adaptive methods and let the data sing, not to force the data to fit preconceived modes.
Characteristics of Data from Nonlinear Processes
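$$\frac{d^2 x}{dt^2} + x + \varepsilon x^3 = \gamma \cos \omega t$$
$$\Rightarrow \quad \frac{d^2 x}{dt^2} + x \left( 1 + \varepsilon x^2 \right) = \gamma \cos \omega t$$
⇒ Spring with position-dependent constant, intra-wave frequency modulation; therefore, we need instantaneous frequency.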
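Hilbert Transform: Definition
For any $x(t) \in L^p$,
$$y(t) = \frac{1}{\pi} \, P \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau} \, d\tau ,$$
then x(t) and y(t) form the analytic pair:
$$z(t) = x(t) + i \, y(t) = a(t) \, e^{i \theta(t)} ,$$
where
$$a(t) = \left( x^2 + y^2 \right)^{1/2} \quad \text{and} \quad \theta(t) = \tan^{-1} \frac{y(t)}{x(t)} .$$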
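For sampled data, the analytic pair defined above can be approximated with an FFT-based discrete Hilbert transform. The following is a minimal sketch in Python with NumPy, not the author's implementation; the function name `analytic_signal` and the test signal are assumptions made for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Return z = x + i*y, where y is a discrete Hilbert transform of x.

    FFT construction: keep the mean, double the positive frequencies,
    zero the negative ones. Assumes x is real and uniformly sampled;
    records that are not periodic will show distortion near the ends.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist term for even lengths
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

# Assumed example: a cosine with exactly 5 cycles in the record.
t = np.arange(1000) / 1000.0
x = np.cos(2 * np.pi * 5 * t)
z = analytic_signal(x)
a = np.abs(z)                       # a(t) = (x^2 + y^2)^(1/2)
theta = np.unwrap(np.angle(z))      # theta(t) = arctan(y / x), unwrapped
```

For this input the imaginary part is the matching sine, the amplitude is constant, and the phase grows linearly in time.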
Empirical Mode Decomposition: Sifting to get one IMF component
$$\begin{aligned}
x(t) - m_1 &= h_1 , \\
h_1 - m_2 &= h_2 , \\
&\;\;\vdots \\
h_{k-1} - m_k &= h_k \\
\Rightarrow \quad h_k &= c_1 .
\end{aligned}$$
The Stoppage Criteria
The Cauchy type criterion: sifting stops when SD is smaller than a pre-set value, where
$$SD = \frac{\sum_{t=0}^{T} \left| h_{k-1}(t) - h_k(t) \right|^2}{\sum_{t=0}^{T} h_{k-1}^2(t)}$$
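The SD test can be sketched in a few lines. This is an illustrative Python sketch only: the function name `sd_criterion`, the sample arrays, and the threshold of 0.2 are assumptions, since the slide only says "a pre-set value".

```python
import numpy as np

def sd_criterion(h_prev, h_curr):
    """Cauchy-type SD between successive sifting results h_{k-1} and h_k."""
    h_prev = np.asarray(h_prev, dtype=float)
    h_curr = np.asarray(h_curr, dtype=float)
    return np.sum((h_prev - h_curr) ** 2) / np.sum(h_prev ** 2)

# Made-up sifting results, two of four samples changed by 0.1.
h_prev = np.array([1.0, -2.0, 2.0, -1.0])
h_curr = np.array([1.1, -1.9, 2.0, -1.0])

sd = sd_criterion(h_prev, h_curr)    # 0.02 / 10
stop_sifting = sd < 0.2              # assumed threshold
```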
Definition of the Intrinsic Mode Function (IMF)
Any function having the same numbers of zero crossings and extrema, and also having symmetric envelopes defined by local maxima and minima respectively, is defined as an Intrinsic Mode Function (IMF).
All IMFs enjoy good Hilbert transforms:
$$c(t) = a(t) \, e^{i \theta(t)}$$
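The first condition of the IMF definition, equal numbers of zero crossings and extrema, is easy to check on sampled data. The sketch below is only a partial check under assumed helper names; it does not test envelope symmetry, and on a finite record the two counts can differ by one depending on where the record is cut.

```python
import numpy as np

def count_zero_crossings(c):
    """Number of sign changes between consecutive nonzero samples."""
    s = np.sign(c)
    s = s[s != 0]
    return int(np.sum(s[:-1] * s[1:] < 0))

def count_extrema(c):
    """Number of interior local maxima and minima (slope sign changes)."""
    d = np.diff(c)
    d = d[d != 0]                     # collapse flat stretches
    return int(np.sum(d[:-1] * d[1:] < 0))

t = np.arange(1000) / 1000.0
wave = np.sin(2 * np.pi * 5 * t + 0.3)   # oscillates about zero
riding = wave + 2.0                      # same wave riding on an offset

n_zero, n_ext = count_zero_crossings(wave), count_extrema(wave)
n_zero_r, n_ext_r = count_zero_crossings(riding), count_extrema(riding)
```

The zero-mean wave has matching counts; the riding wave has the same extrema but no zero crossings, so it fails the first condition.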
Empirical Mode Decomposition: Sifting to get all the IMF components
$$\begin{aligned}
x(t) - c_1 &= r_1 , \\
r_1 - c_2 &= r_2 , \\
&\;\;\vdots \\
r_{n-1} - c_n &= r_n \\
\Rightarrow \quad x(t) &= \sum_{j=1}^{n} c_j + r_n .
\end{aligned}$$
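The two sifting loops can be sketched together as follows. This is a simplified Python sketch, not the author's code. It assumes cubic-spline envelopes, a fixed number of sifting passes in place of the SD test, and a crude end treatment that pins both envelopes to the end samples. All function names and parameter values are illustrative.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def local_extrema(h):
    """Indices of interior local maxima and minima."""
    d = np.diff(h)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def envelope_mean(h, maxima, minima):
    """Mean m of the upper and lower spline envelopes."""
    n = np.arange(h.size)
    up_idx = np.concatenate(([0], maxima, [h.size - 1]))   # assumed end treatment
    lo_idx = np.concatenate(([0], minima, [h.size - 1]))
    upper = CubicSpline(up_idx, h[up_idx])(n)
    lower = CubicSpline(lo_idx, h[lo_idx])(n)
    return 0.5 * (upper + lower)

def sift(x, n_sift=10):
    """Inner loop: h_k = h_{k-1} - m_k, repeated a fixed number of times."""
    h = x.copy()
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if maxima.size < 2 or minima.size < 2:
            break
        h = h - envelope_mean(h, maxima, minima)
    return h

def emd(x, max_imfs=8, n_sift=10):
    """Outer loop: r_j = r_{j-1} - c_j until the residue has too few extrema."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if maxima.size < 2 or minima.size < 2:
            break
        c = sift(residue, n_sift)
        imfs.append(c)
        residue = residue - c
    return np.array(imfs), residue

# Assumed test signal: fast wave + slow wave + linear trend.
t = np.linspace(0.0, 1.0, 1000)
x = np.sin(2 * np.pi * 40 * t) + 0.5 * np.sin(2 * np.pi * 5 * t) + t
imfs, residue = emd(x)
reconstruction = imfs.sum(axis=0) + residue
```

By construction the components and the residue add back to the data, which is the completeness expressed by the last line of the equations above.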
Definition of Instantaneous Frequency
The Fourier transform of the Intrinsic Mode Function, c(t), gives
$$W(\omega) = \int a(t) \, e^{i \left( \theta(t) - \omega t \right)} \, dt .$$
By the stationary phase approximation we have
$$\frac{d\theta(t)}{dt} = \omega .$$
This is defined as the instantaneous frequency.
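Numerically, the definition amounts to differentiating the unwrapped phase. The sketch below assumes the analytic signal z(t) is already available and, for illustration only, builds it directly as a complex exponential with a known frequency law. The function name and the chirp parameters are assumptions.

```python
import numpy as np

def instantaneous_frequency(z, t):
    """Return d(theta)/dt / (2*pi) in cycles per unit time."""
    theta = np.unwrap(np.angle(z))
    omega = np.gradient(theta, t, edge_order=2)
    return omega / (2 * np.pi)

t = np.arange(1000) / 1000.0
true_freq = 5.0 + 20.0 * t                                # assumed frequency law
z = np.exp(1j * 2 * np.pi * (5.0 * t + 10.0 * t ** 2))    # phase = integral of 2*pi*f
freq = instantaneous_frequency(z, t)
```

The recovered frequency follows the assumed linear law at every sample, without any windowing.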
Instantaneous Frequency
Velocity = distance / time; mean velocity
Newton ⇒ $v = \dfrac{dx}{dt}$
Frequency = 1 / period; mean frequency
HHT defines the phase function ⇒ $\omega = \dfrac{d\theta}{dt}$
So that both v and ω can appear in differential equations.
The combination of Hilbert Spectral Analysis and
Empirical Mode Decomposition is designated as
HHT
(HHT vs. FFT)
Jean-Baptiste-Joseph Fourier
1807 “On the Propagation of Heat in Solid Bodies”
1812 Grand Prize of Paris Institute
“Théorie analytique de la chaleur”
‘... the manner in which the author arrives at these equations is not exempt of difficulties and that his analysis to integrate them still leaves something to be desired on the score of generality and even rigor.’
1817 Elected to Académie des Sciences
1822 Appointed as Secretary of Math Section
paper published
Fourier’s work is a great mathematical poem.
Lord Kelvin
Comparison between FFT and HHT
1. FFT:
$$x(t) = \sum_j a_j \, e^{i \omega_j t} .$$
2. HHT:
$$x(t) = \sum_j a_j(t) \, e^{i \int_t \omega_j(\tau) \, d\tau} .$$
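A chirp shows the practical difference between the two expansions. Taking the real part of a single HHT-type term with a = 1 and a time-varying frequency, the sketch below counts how many constant-frequency Fourier bins are needed to hold 99% of the one-sided spectral power. The frequency law, sampling rate, and the 99% cut-off are assumptions chosen for illustration.

```python
import numpy as np

fs = 1000.0
t = np.arange(1000) / fs
f_inst = 5.0 + 20.0 * t                        # assumed frequency law, in Hz
theta = 2 * np.pi * np.cumsum(f_inst) / fs     # running integral of 2*pi*f
x = np.cos(theta)                              # real part of one term a(t) e^{i theta(t)}

power = np.abs(np.fft.rfft(x)) ** 2
ranked = np.sort(power)[::-1]                  # largest bins first
cumulative = np.cumsum(ranked) / ranked.sum()
n_fourier = int(np.searchsorted(cumulative, 0.99)) + 1
```

Here `n_fourier` is well above one: the Fourier form spreads a single frequency-modulated wave over many fixed-frequency terms.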
Orthogonality Check
• Pair-wise %: 0.0003, 0.0001, 0.0215, 0.0117, 0.0022, 0.0031, 0.0026, 0.0083, 0.0042, 0.0369, 0.0400
• Overall %: 0.0452
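The slide reports the orthogonality values but not the formula behind them. One plausible way to compute pair-wise and overall indices is sketched below. The definitions and function names are assumptions: a normalized cross product for each pair, and the sum of all cross terms relative to the signal energy overall.

```python
import numpy as np

def pairwise_index(cj, ck):
    """Assumed pair-wise index: sum(cj*ck) / sum(cj^2 + ck^2)."""
    return np.sum(cj * ck) / np.sum(cj ** 2 + ck ** 2)

def overall_index(components):
    """Assumed overall index: all cross terms (j != k) over the energy of the sum."""
    comps = np.asarray(components, dtype=float)
    x = comps.sum(axis=0)
    cross = np.sum(x ** 2) - np.sum(comps ** 2)
    return cross / np.sum(x ** 2)

t = np.arange(1000) / 1000.0
c1 = np.sin(2 * np.pi * 5 * t)
c2 = np.sin(2 * np.pi * 20 * t)

io_pair = pairwise_index(c1, c2)       # near zero for orthogonal components
io_all = overall_index([c1, c2])
io_same = pairwise_index(c1, c1)       # identical components give 0.5
```

Multiplying either index by 100 gives a percentage comparable in form to the values listed above.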
Properties of EMD Basis
The adaptive basis, based on and derived from the data by the empirical method, satisfies nearly all the traditional requirements for a basis
a posteriori:
Complete
Convergent
Orthogonal
Unique
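Duffing Type Wave: Perturbation Expansion
For $\varepsilon \ll 1$, we can have
$$\begin{aligned}
x(t) &= \cos \left( \omega t + \varepsilon \sin 2 \omega t \right) \\
&= \cos \omega t \, \cos \left( \varepsilon \sin 2 \omega t \right) - \sin \omega t \, \sin \left( \varepsilon \sin 2 \omega t \right) \\
&= \cos \omega t - \varepsilon \sin \omega t \, \sin 2 \omega t + \dots \\
&= \left( 1 - \frac{\varepsilon}{2} \right) \cos \omega t + \frac{\varepsilon}{2} \cos 3 \omega t + \dots
\end{aligned}$$
This is very similar to the solution of the Duffing equation.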
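The expansion can be checked numerically. The sketch below compares the frequency-modulated wave with its two-term harmonic approximation for assumed values ε = 0.1 and ω = 2π. The neglected terms are of order ε², so the maximum error should stay below ε².

```python
import numpy as np

eps = 0.1                 # assumed small parameter
omega = 2 * np.pi         # assumed angular frequency
t = np.linspace(0.0, 2.0, 2001)

exact = np.cos(omega * t + eps * np.sin(2 * omega * t))
two_term = (1 - eps / 2) * np.cos(omega * t) + (eps / 2) * np.cos(3 * omega * t)
max_error = np.max(np.abs(exact - two_term))
```

One intra-wave modulated term on the left is matched by a fundamental plus a third harmonic on the right, which is why a Fourier view of such data shows harmonics.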
Duffing Equation
$$\frac{d^2 x}{dt^2} + x + \varepsilon x^3 = \gamma \cos \omega t .$$
Solved with ode23tb for t = 0 to 200 with
ε = 1
γ = 0.1
ω = 0.04 Hz
Initial condition: [x(0), x'(0)] = [1, 1]
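The slide refers to ode23tb, which is a MATLAB solver. As a stand-in, the sketch below integrates the same kind of equation in Python with a plain fixed-step fourth-order Runge-Kutta scheme, which is a different method. The plus signs in the equation, the reading of 0.04 Hz as a cyclic frequency, the step size, and all names are assumptions.

```python
import numpy as np

EPS, GAMMA, F_HZ = 1.0, 0.1, 0.04      # values read from the slide; signs assumed
OMEGA = 2 * np.pi * F_HZ               # 0.04 Hz treated as cycles per second

def duffing_rhs(t, state):
    """First-order form of x'' + x + EPS*x^3 = GAMMA*cos(OMEGA*t)."""
    x, v = state
    return np.array([v, -x - EPS * x ** 3 + GAMMA * np.cos(OMEGA * t)])

def rk4(rhs, state0, t):
    """Classical fixed-step Runge-Kutta integration over the grid t."""
    states = np.empty((t.size, len(state0)))
    states[0] = state0
    for i in range(t.size - 1):
        dt = t[i + 1] - t[i]
        k1 = rhs(t[i], states[i])
        k2 = rhs(t[i] + dt / 2, states[i] + dt / 2 * k1)
        k3 = rhs(t[i] + dt / 2, states[i] + dt / 2 * k2)
        k4 = rhs(t[i] + dt, states[i] + dt * k3)
        states[i + 1] = states[i] + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return states

t = np.linspace(0.0, 200.0, 20001)
solution = rk4(duffing_rhs, np.array([1.0, 1.0]), t)
x = solution[:, 0]                      # displacement record for later analysis
```

The resulting `x` is the kind of record that would then be passed to the decomposition and Hilbert analysis.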
What This Means
• Instantaneous Frequency offers a totally different view for nonlinear data: instantaneous frequency with no need for harmonics and unlimited by uncertainty.
• Adaptive basis is indispensable for nonstationary and nonlinear data analysis
• HHT establishes a new paradigm of data analysis
Comparisons
                Fourier                      Wavelet                        Hilbert
Basis           a priori                     a priori                       Adaptive
Frequency       Integral transform: global   Integral transform: regional   Differentiation: local
Presentation    Energy-frequency             Energy-time-frequency          Energy-time-frequency
Nonlinear       no                           no                             yes
Non-stationary  no                           yes                            yes
Uncertainty     yes                          yes                            no
Harmonics       yes                          yes                            no
Conclusion
The adaptive method is the only scientifically meaningful way to analyze data.
It is the only way to find out the underlying physical processes; therefore, it is indispensable in scientific research.
It is physical, direct, and simple.
Current Applications
• Non-destructive Evaluation for Structural Health Monitoring – (DOT, NSWC, and DFRC/NASA, KSC/NASA Shuttle)
• Vibration, speech, and acoustic signal analyses – (FBI, MIT, and DARPA)
• Earthquake Engineering – (DOT)
• Bio-medical applications – (Harvard, UCSD, Johns Hopkins)
• Global Primary Productivity Evolution map from LandSat data – (NASA Goddard, NOAA)
• Cosmological Gravitational Wave – (NASA Goddard)
• Financial market data analysis – (NCU)
• Geophysical and Climate studies – (COLA, NASA, NCU)
Outstanding Mathematical Problems
1. Adaptive data analysis methodology in general
2. Nonlinear system identification methods
3. Prediction problem for nonstationary processes (end effects)
4. Optimization problem (the best IMF selection and uniqueness. Is there a unique solution?)
5. Spline problem (best spline implementation of HHT, convergence and 2-D)
6. Approximation problem (Hilbert transform and quadrature)
History of HHT
1998: The Empirical Mode Decomposition Method and the Hilbert Spectrum for Non-stationary Time Series Analysis, Proc. Roy. Soc. London, A454, 903-995. The invention of the basic method of EMD, and Hilbert transform for determining the Instantaneous Frequency and energy.
1999: A New View of Nonlinear Water Waves – The Hilbert Spectrum, Ann. Rev. Fluid Mech. 31, 417-457. Introduction of the intermittence in decomposition.
2003: A Confidence Limit for the Empirical Mode Decomposition and the Hilbert Spectral Analysis, Proc. Roy. Soc. London, A459, 2317-2345. Establishment of a confidence limit without the ergodic assumption.
2004: A Study of the Characteristics of White Noise Using the Empirical Mode Decomposition Method, Proc. Roy. Soc. London, A460, 1597-1611. Defined statistical significance and predictability.
2007: On the trend, detrending, and variability of nonlinear and nonstationary time series. Proc. Natl. Acad. Sci., 104, 14,889-14,894. The correct adaptive trend determination method.
2008: On Ensemble Empirical Mode Decomposition. Advances in Adaptive Data Analysis (in press)
2008: On Instantaneous Frequency. Advances in Adaptive Data Analysis (accepted)
Advances in Adaptive Data Analysis: Theory and Applications
A new journal to be published by World Scientific
Under the joint Co-Editors-in-Chief
Norden E. Huang, RCADA NCU
Thomas Yizhao Hou, CALTECH
To be launched in March 2008
Milankovitch Time scales: Temperature Data from Vostok Ice Core
Data length 3311 points covering 422,766 Years BP
Milutin Milankovitch 1879-1958
How the Sun Affects Climate: Solar and Milankovitch Cycles
A Truly Nonlinear World