CHAPTER 7
FEATURES EXTRACTION USING DISCRETE
WAVELET TRANSFORM (DWT) AND FAST
FOURIER TRANSFORM (FFT)
7.1 FEATURE EXTRACTION
Once the ultrasonic test signals, acquired as digitized data, have been
preprocessed, features must be determined from the signal using digital
signal processing techniques. This process is called ‘feature
extraction’. Feature extraction is a special form of dimensionality
reduction: it reduces the amount of data needed to describe a large data
set accurately.
Since not all features that can be extracted from ultrasonic signals
for a given classification problem need to be used, due to their redundancy, a
further process is needed for redundancy reduction by retaining only an
informative subset of them. This stage of processing is called ‘feature
selection’.
7.1.1 Need for Feature Extraction
When the input data to an algorithm are too large to be processed
and are suspected to be highly redundant (much data, but not much
information), the input data are transformed into a reduced
representation set of features (also called a feature vector). Transforming
the input data into this set of features is called feature extraction.
Extracting discriminatory features thus shortens the data vector by
eliminating redundancy in the signal and compressing the relevant
information into a feature vector of significantly lower dimension.
An ultrasonic oscillogram is a graphical representation whose pattern
varies with the type of flaw. These oscillograms can therefore be
described by pertinent features that allow the defects to be classified.
Once the pertinent features have been extracted, it is normally useful to
develop a recognition (identification) procedure for the detected defect
type.
7.2 FEATURES EXTRACTION USING DWT
The discrete wavelet transform is used to extract characteristics from a
signal on various scales by successive high-pass and low-pass filtering.
The wavelet coefficients are the approximation and detail coefficients
obtained at the successive levels.
The basic feature extraction procedure consists of:
1. Decomposing the signal using DWT into N levels using
filtering and decimation to obtain the approximation and
detail coefficients
2. Extracting the features from the DWT coefficients
The features extracted from the Discrete wavelet transform (DWT)
coefficients of ultrasonic test signals are considered useful features for input
into classifiers due to their effective time–frequency representation of non-
stationary signals.
7.2.1 Feature Extraction Algorithm
Initially, it is verified that the length of the digitized flaw data is a
power of 2, so that the signal can be decomposed effectively.
The various steps involved in the feature extraction algorithm are as
follows:
Step 1: The ultrasonic flaw data are decomposed into four detail subbands
using the Discrete Wavelet Transform (DWT). Each level of decomposition
yields high-frequency detail band coefficients and low-frequency
approximation band coefficients.
Step 2: The approximation coefficients are further decomposed using DWT
to extract localized information in the detail subband of the next level.
In this work, four levels of decomposition are performed using the
biorthogonal wavelet bior4.4.
Four level approximation and detail coefficients of six classes of
defect are graphically represented in Appendix 1 as Figures A1.1 to A1.6.
Step 3: The detail band coefficients of all four levels are taken for
further analysis and processing.
Step 4: The frequency vector (in radians/sample) is extracted for the four
detail subbands using the periodogram function in MATLAB.
Step 5: The features are computed either with built-in MATLAB functions or
by implementing the formulae. They are mean, variance, mean of energy,
maximum amplitude, minimum amplitude, maximum energy, minimum energy,
average frequency, mid frequency, maximum frequency, minimum frequency and
half point of the function.
The M-file program for four-level signal decomposition and
feature extraction using DWT is provided in Appendix 2.
Step 6: Finally, the extracted features for the six classes of defects are
tabulated and analyzed for classification.
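The decomposition in Steps 1 to 3 can be sketched as follows. This is a minimal illustration, not the M-file of Appendix 2: it is written in Python with the standard library only, and it uses the Haar wavelet as a simple stand-in for bior4.4 (an assumption made to keep the filters short). The function names and the synthetic test signal are hypothetical.

```python
import math


def is_power_of_two(n):
    """Return True when n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def haar_step(signal):
    """One DWT level: low-pass/high-pass filtering followed by decimation by 2.

    Haar filters are used here as a stand-in for bior4.4 (assumption).
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def detail_subbands(signal, levels=4):
    """Return [D1, ..., D_levels] by repeatedly decomposing the approximation."""
    if not is_power_of_two(len(signal)) or len(signal) < 2 ** levels:
        raise ValueError("length must be a power of two and at least 2**levels")
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return details


# Synthetic stand-in for a digitized ultrasonic signal (not thesis data).
signal = [
    math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 20 * t / 64)
    for t in range(64)
]
bands = detail_subbands(signal, levels=4)
band_lengths = [len(d) for d in bands]
```

Each level halves the number of coefficients, which is why the data length is checked to be a power of 2 before decomposition.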
7.2.2 Extracted Features
In this work, twelve features are extracted from the Discrete
wavelet transform (DWT) coefficients of ultrasonic test signals obtained from
the six classes of defect. The extracted features from the signal are as below:
1. Mean: It is the average value of the samples.
$$ m = \frac{1}{n} \sum_{i=1}^{n} x_i $$
2. Variance: The variance is defined as the sum of the squared
distances of each term in the distribution from the mean,
divided by the number of terms in the distribution less one.
$$ \sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - m)^2 $$
3. Mean of the energy: It is the average value of the energy.
$$ m_e = \frac{1}{n} \sum_{i=1}^{n} x_i^2 $$
where x = sequence, m = mean, n = number of samples
4. Maximum Amplitude: It is the peak value of amplitude of the
signal.
5. Minimum Amplitude: It is the lowest value of amplitude of
the signal.
6. Maximum Energy: It is the highest energy value obtained
from the signal.
7. Minimum Energy: It is the lowest energy value obtained from
the signal.
8. Average Frequency:
$$ f_{avg} = \frac{\sum_{i=1}^{n} f_i \, p_i}{\sum_{i=1}^{n} p_i} $$
where p = power spectral density, f = frequency vector
9. Mid Frequency: It is the frequency value which is obtained
when the power spectral density is at the maximum value.
10. Maximum frequency: It is the maximum frequency value of
the energy in the spectrum.
11. Minimum frequency: It is the minimum frequency value of the
energy in the spectrum.
12. Half Point of the energy (HaPo): It is a particularly useful
feature because it represents the frequency that divides the spectrum
into two parts of equal area.
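The definitions above can be sketched in code as follows. This is an illustrative Python sketch using only the standard library, not the MATLAB program of Appendix 2. It assumes that "energy" means the squared coefficient value, and it replaces MATLAB's periodogram with a direct DFT using simplified scaling, so absolute power values will differ. Maximum and minimum frequency (features 10 and 11) are left out because the definitions above do not pin down a single formula. All function and key names are hypothetical.

```python
import cmath
import math


def time_features(x):
    """Features 1-7: statistics of the coefficient values and their squares."""
    n = len(x)
    mean = sum(x) / n
    energy = [v * v for v in x]  # assumption: energy = squared coefficient
    return {
        "mean": mean,
        "variance": sum((v - mean) ** 2 for v in x) / (n - 1),
        "mean_energy": sum(energy) / n,
        "max_amplitude": max(x),
        "min_amplitude": min(x),
        "max_energy": max(energy),
        "min_energy": min(energy),
    }


def periodogram(x):
    """One-sided power estimate at frequencies 0..pi rad/sample.

    Direct DFT with simplified scaling (|X_k|^2 / n); this is a stand-in
    for MATLAB's periodogram, not a reimplementation of it.
    """
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(2 * math.pi * k / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def spectral_features(x):
    """Features 8, 9 and 12: average frequency, mid frequency, half point."""
    freqs, power = periodogram(x)
    total = sum(power)  # assumes the signal is not identically zero
    average_frequency = sum(f * p for f, p in zip(freqs, power)) / total
    mid_frequency = freqs[power.index(max(power))]
    cumulative = 0.0
    half_point = freqs[-1]
    for f, p in zip(freqs, power):
        cumulative += p
        if cumulative >= total / 2:
            half_point = f
            break
    return {
        "average_frequency": average_frequency,
        "mid_frequency": mid_frequency,
        "half_point": half_point,
    }


# Synthetic test tone at pi/4 rad/sample (not thesis data).
x = [math.cos(2 * math.pi * 8 * t / 64) for t in range(64)]
tf = time_features(x)
sf = spectral_features(x)
```

For a pure tone, the average frequency, mid frequency and half point all coincide with the tone frequency, which makes it a convenient sanity check before applying the functions to real sub-band coefficients.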
7.2.3 Feature Extraction Results
The features extracted from the ultrasonic flaw signal for crack in
each of the four detail sub bands are shown in Table 7.1, where D1, D2, D3
and D4 represent the 1st, 2nd, 3rd and 4th level detail bands respectively.
The extracted features for each of the other five classes of defect in
each of the four detail sub bands are shown in Appendix 3 as Tables A 3.1
to A 3.5.
Table 7.1 Extracted Features for crack in four detail sub bands
Sl.  Features          Four-level detail sub bands of wavelet coefficients
No.                    D1           D2            D3            D4
                       (π/2 - π)    (π/4 - π/2)   (π/8 - π/4)   (π/16 - π/8)
1.   Mean              0.2331       0.0191        0.5567        -0.4617
2.   Variance          6.6855       68.5707       1045.37       389.1079
3.   Mean of energy    6.7366       68.5041       1043.6377     387.8011
4.   Max. amp          25.7044      63.4938       205.2167      82.1198
5.   Min. amp          -23.7403     -86.6635      -316.108      -193.904
6.   Max. energy       660.717      7510.559      99924.09      37598.6
7.   Min. energy       0.0000       0.0000        0.0003        0.0000
8.   Avg. frequency    2.6154       1.7991        1.8754        1.7094
9.   Mid frequency     2.8777       2.0433        2.1967        1.4481
10.  Max. frequency    0.0031       0.0123        0.0245        0.0736
11.  Min. frequency    0.3283       2.3746        1.4849        2.4544
12.  Half pt.          1.4205       2.0064        2.5525        1.988
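As a quick consistency check on the crack values in Table 7.1, the short Python sketch below compares the tabulated maximum energy with the square of the largest-magnitude amplitude in each sub band. It assumes that energy is the squared coefficient value, which the text does not state explicitly; the variable names are hypothetical.

```python
# Values copied from Table 7.1 (crack signal, detail sub bands D1 to D4).
max_amp = [25.7044, 63.4938, 205.2167, 82.1198]
min_amp = [-23.7403, -86.6635, -316.108, -193.904]
max_energy = [660.717, 7510.559, 99924.09, 37598.6]

rel_errors = []
for band, (hi, lo, e) in enumerate(zip(max_amp, min_amp, max_energy), start=1):
    # Assumption: energy is the squared coefficient, so the maximum energy
    # should be the square of the largest-magnitude amplitude.
    peak_sq = max(abs(hi), abs(lo)) ** 2
    rel_err = abs(peak_sq - e) / e
    rel_errors.append(rel_err)
    print(f"D{band}: peak^2 = {peak_sq:.3f}, tabulated = {e}, rel. diff = {rel_err:.1e}")
```

Under that assumption the two quantities agree to within rounding in all four sub bands, so maximum energy carries the same information as the larger of the two amplitude extremes.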
7.3 FEATURE ANALYSIS OF DWT FEATURES
7.3.1 Feature values for Six Classes of Defect
For each of the twelve extracted features, the average value for each
class of defect is determined and tabulated. The average values of the
twelve features obtained from the DWT coefficients of the crack signal in
the 1st level detail sub band are shown in Table 7.2. Each average is
calculated over the 30 signals obtained from crack. The average values of
all features for the other classes of defect are provided in
Appendix 3 as Tables A 3.6 to A 3.10.
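The averaging over the signals of one class can be sketched as below. This is a minimal Python illustration with hypothetical names and made-up numbers; it is not the thesis data, and only two features and three signals are shown for brevity.

```python
def average_features(feature_rows):
    """Average each named feature over a list of per-signal feature dicts."""
    names = feature_rows[0].keys()
    count = len(feature_rows)
    return {name: sum(row[name] for row in feature_rows) / count for name in names}


# Hypothetical feature values for three signals of one class (not thesis data).
crack_rows = [
    {"mean": 0.2, "variance": 6.0},
    {"mean": 0.4, "variance": 7.0},
    {"mean": 0.3, "variance": 8.0},
]
crack_average = average_features(crack_rows)
```

In the setting described above, `feature_rows` would hold thirty dictionaries per class, each with all twelve features.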
7.3.2 Selection of Features
As the relationship between ultrasonic signal characteristics and
flaw classes is not straightforward, the extraction of features plays a critical
role in classification accuracy and this becomes the important basis of
decision-making for classification.
The variation of the twelve features across the classes of defect is
analysed: for each defect, the average values of all the features are
determined and plotted. The average feature values for the six classes of
defect in the first level sub band are plotted with the defects on the
x axis and the average values on the y axis. The variation in the feature
values for the six classes of defect is shown in Figures 7.1(a) to
7.1(k).
Figure 7.1(a) Variation of mean for six classes of defect
Figure 7.1(b) Variation of variance for six classes of defect
Figure 7.1(c) Variation of mean of energy for six classes of defect
Figure 7.1(d) Variation of maximum amplitude for six classes of defect
Figure 7.1(e) Variation of minimum amplitude for six classes of defect
Figure 7.1(f) Variation of maximum energy for six classes of defect
Figure 7.1(g) Variation of average frequency for six classes of defect
Figure 7.1(h) Variation of mid frequency for six classes of defect
Figure 7.1(i) Variation of maximum frequency for six classes of defect
Figure 7.1(j) Variation of minimum frequency for six classes of defect
Figure 7.1(k) Variation of half point of energy for six classes of defect
In the feature analysis, the variation of the twelve features across
the classes of defect is analysed using the average feature values
determined for each defect. Comparing the graphical results shows that,
of the twelve extracted features, only eight give reliable information
and good discrimination between the flaws.
They are:
1. Mean
2. Variance
3. Maximum amplitude
4. Minimum amplitude
5. Maximum energy
6. Average frequency
7. Minimum frequency
8. Half point of the function
7.3.3 Inputs to ANN and SVM
The eight selected features, which discriminate well between material
defects, are considered the main parameters, and they are therefore used
together as the input to the ANN and SVM for the classification of
defects. Based on the feature analysis, the other four features (mean of
energy, minimum energy, mid frequency and maximum frequency) are
neglected for the following reasons:
Mean of energy : The feature values are almost the same as the variance
Minimum energy : The feature values are zero for all six classes of defect
Mid frequency : It gives close values for all six classes of defect
Max. frequency : It gives similar values for all six classes of defect
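The near-equality of mean of energy and variance follows from their definitions. Writing m for the mean, σ² for the variance (with divisor n − 1) and m_e for the mean of energy (with divisor n), expanding the squared deviations gives:

$$ \sum_{i=1}^{n} (x_i - m)^2 = \sum_{i=1}^{n} x_i^2 - n m^2 \quad\Longrightarrow\quad m_e = \frac{n-1}{n}\,\sigma^2 + m^2 $$

When the mean is close to zero and n is large, m_e ≈ σ², so the two features carry the same information. As an illustration with the D4 crack values of Table 7.1, and assuming n = 256 coefficients in that sub band (a value inferred from the tabulated numbers, not stated in the text):

$$ m_e \approx \frac{255}{256}(389.1079) + (-0.4617)^2 \approx 387.588 + 0.213 = 387.801 $$

which agrees with the tabulated mean of energy of 387.8011.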
The selected features extracted from each ultrasonic signal are used
as the input to the ANN and SVM. The input must be representative of its
respective ultrasonic oscillogram. Here, the input to the ANN and SVM is
an eight-component vector. The 4th level detail DWT coefficients
representation of a defect signal (left) and its respective ANN input
(right) are shown in Figures 7.2(a) to 7.2(f).
In Figures 7.2(a) to 7.2(f), the ANN input lists, in order: mean of the
samples, variance, maximum amplitude, minimum amplitude, maximum energy,
average frequency, minimum frequency, half point.

Figure 7.2(a) DWT coefficients representation of a crack signal (left) and
its respective input of the ANN (right)
ANN input: 0.0103, 262.7837, 124.6363, -127.5, 16256.28, 1.8955, 3.1416, 1.8408

Figure 7.2(b) DWT coefficients representation of a porosity signal (left)
and its respective input of the ANN (right)
ANN input: -1.254798.6577 (mean and variance), 147.7117, -204.863, 41968.95, 1.5825, 3.0925, 2.1844

Figure 7.2(c) DWT coefficients representation of a lack of fusion signal
(left) and its respective input of the ANN (right)
ANN input: 0.2553, 481.575, 194.0681, -169.083, 37662.45, 1.7432, 2.5771, 1.4972

Figure 7.2(d) DWT coefficients representation of a lack of penetration
signal (left) and its respective input of the ANN (right)
ANN input: -1.4214, 274.8054, 94.3939, -191.454, 36654.63, 1.6356, 3.1416, 1.7426

Figure 7.2(e) DWT coefficients representation of a tungsten inclusion
signal (left) and its respective input of the ANN (right)
ANN input: -0.4579, 862.8073, 207.6305, -162.666, 43110.44, 1.2853, 2.3011, 1.5953

Figure 7.2(f) DWT coefficients representation of a non defect signal (left)
and its respective input of the ANN (right)
ANN input: -0.0574, 592.2208, 173.8241, -188.833, 35657.82, 1.8367, 1.2026, 1.939
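Assembling the eight-component input vector from a full set of extracted features can be sketched as follows. This Python sketch is illustrative only: the dictionary keys and function name are hypothetical, the eight selected values are the crack values of Figure 7.2(a), and the four neglected features are given placeholder values that are not taken from the thesis.

```python
# Order of the eight DWT features selected in Section 7.3.2.
SELECTED_DWT_FEATURES = [
    "mean", "variance", "max_amplitude", "min_amplitude",
    "max_energy", "average_frequency", "min_frequency", "half_point",
]


def build_input_vector(features, selected=SELECTED_DWT_FEATURES):
    """Pick the selected features, in a fixed order, from a feature dict."""
    return [features[name] for name in selected]


crack_features = {
    # Selected features: crack values from Figure 7.2(a).
    "mean": 0.0103,
    "variance": 262.7837,
    "max_amplitude": 124.6363,
    "min_amplitude": -127.5,
    "max_energy": 16256.28,
    "average_frequency": 1.8955,
    "min_frequency": 3.1416,
    "half_point": 1.8408,
    # Neglected features: hypothetical placeholder values (not thesis data).
    "mean_energy": 262.0,
    "min_energy": 0.0,
    "mid_frequency": 2.0,
    "max_frequency": 0.07,
}
crack_input = build_input_vector(crack_features)
```

Keeping the feature order fixed matters because the classifier sees only positions in the vector, not feature names.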
7.4 FEATURES EXTRACTION USING FFT
7.4.1 Feature Extraction Algorithm
The various steps involved in the feature extraction algorithm are as
follows:
Step 1: The ultrasonic flaw data are down sampled in four stages. The
down sampled data have a different sample length at each stage. The
four-stage down sampled signals for the six classes of flaws are
graphically represented in Appendix 4 as Figures A 4.1 to A 4.6.
Step 2: Fast Fourier Transform is applied to the four stages of down
sampled signals.
Step 3: The frequency vector (in radians/sample) is extracted for the
four-stage down sampled signals using the periodogram function in MATLAB.
Step 4: The features are computed using built-in MATLAB functions and by
implementing the formulae. The extracted features are mean, variance, mean
of energy, maximum amplitude, minimum amplitude, maximum energy, minimum
energy, average frequency, mid frequency, maximum frequency, minimum
frequency and half point of the function.
The M-file program for four-stage down sampling and feature extraction
after applying FFT is shown in Appendix 5.
Step 5: Finally, the extracted features for the six classes of defects are
tabulated and analyzed for classification.
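Steps 1 and 2 can be sketched as below. This Python sketch is not the M-file of Appendix 5. It assumes that each stage keeps every second sample of the previous stage (the down sampling factor is not stated in the text), and it uses a direct DFT in place of an FFT routine; the two give the same coefficients, the FFT is simply faster. Function names and the synthetic signal are hypothetical.

```python
import cmath
import math


def downsample(x, factor=2):
    """Keep every factor-th sample (assumed factor of 2 per stage)."""
    return x[::factor]


def dft(x):
    """Direct DFT; an FFT computes the same coefficients more efficiently."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def staged_spectra(x, stages=4):
    """Return the DFT of the signal after each successive down sampling stage."""
    spectra = []
    current = list(x)
    for _ in range(stages):
        current = downsample(current, 2)
        spectra.append(dft(current))
    return spectra


# Synthetic stand-in for a digitized ultrasonic signal (not thesis data).
signal = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
spectra = staged_spectra(signal, stages=4)
spectrum_lengths = [len(s) for s in spectra]
```

With a factor of 2 per stage, the sample length halves at every stage, which matches the statement that the down sampled data have different sample lengths.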
7.4.2 Extracted Features
Twelve features are extracted from each signal of the six classes
of defects. The extracted features are listed below:
1 Mean: It is the average value of the samples.
$$ m = \frac{1}{n} \sum_{i=1}^{n} x_i $$
2 Variance: The variance is defined as the sum of the squared
distances of each term in the distribution from the mean,
divided by the number of terms in the distribution less one.
$$ \sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - m)^2 $$
3 Mean of the energy: It is the average value of the energy.
$$ m_e = \frac{1}{n} \sum_{i=1}^{n} x_i^2 $$
where x = sequence, m = mean, n = number of samples
4 Maximum Amplitude: It is the peak value of amplitude of the
signal.
5 Minimum Amplitude: It is the lowest value of amplitude of
the signal.
6 Maximum Energy: It is the highest energy value obtained
from the signal.
7 Minimum Energy: It is the lowest energy value obtained from
the signal.
8 Average Frequency:
$$ f_{avg} = \frac{\sum_{i=1}^{n} f_i \, p_i}{\sum_{i=1}^{n} p_i} $$
where p = power spectral density, f = frequency vector
9 Mid Frequency: It is the frequency value which is obtained
when the power spectral density is at the maximum value.
10 Maximum frequency: It is the maximum frequency value of
the energy in the spectrum.
11 Minimum frequency: It is the minimum frequency value of the
energy in the spectrum.
12 Half Point of the energy (HaPo): It is a very valuable variable
as it represents the frequency that divides up the spectrum into
two parts of same area.
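For a sampled spectrum, one possible discrete reading of the half point is the lowest frequency at which the cumulative power reaches half of the total power. This formulation is an assumption; the text defines the half point only in words. With p the power spectral density and f the frequency vector, both of length n:

$$ f_{HaPo} = \min \left\{ f_k \; : \; \sum_{i=1}^{k} p_i \ge \frac{1}{2} \sum_{i=1}^{n} p_i \right\} $$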
7.4.3 Feature Extraction Results
The features extracted from the ultrasonic flaw signal for crack in
each of the four down sampling stages are shown in Table 7.3, where S1,
S2, S3 and S4 represent the 1st, 2nd, 3rd and 4th stage of down sampling
respectively. The extracted features for each of the other five classes of
defect in each of the four down sampling stages are shown in Appendix 6 as
Tables A 6.1 to A 6.5.
Table 7.3 Extracted Features for crack in four stage down sampling
Sl.  Features          Four-stage down sampling
No.                    S1           S2           S3           S4
1.   Mean              127          127          127          127
2.   Variance          66579581     33376311     16710952     8381750
3.   Mean of energy    65780248     32970438     16504812     8282812
4.   Max. amplitude    519237       259935       130044       65142
5.   Min. amplitude    2.1927       -5.9101      -22.7786     10
6.   Max. energy       269607062    67566204     16911441     4243480
7.   Min. energy       4.6023       -335.1817    502.0688     100
8.   Avg. frequency    3.1725       3.1721       3.1732       3.1799
9.   Mid frequency     6.1666       6.1666       6.1666       6.0746
10.  Max. frequency    5.8429       5.8414       5.8414       5.7678
11.  Min. frequency    5.889        5.8905       5.8905       5.7923
12.  Half point        4.4409       4.4393       4.4363       4.4301
7.5 FEATURE ANALYSIS OF FFT FEATURES
7.5.1 Feature Values for Six Classes of Defect
For each of the twelve extracted features, the average value for each
class of defect is determined and tabulated. The average values of the
twelve features obtained from the FFT coefficients of the crack signal in
the 1st stage of down sampling are shown in Table 7.4. Each average is
calculated over the thirty signals obtained from crack. The average values
of all features for the other classes of defect are provided in
Appendix 6 as Tables A 6.6 to A 6.10.
7.5.2 Selection of features
The average feature values for the six classes of defect in the first
stage down sampled signal are plotted with the defects on the x axis and
the average values on the y axis. The variation in the feature values for
the six classes of defects is shown graphically in Figures 7.3(a) to 7.3(l).
Figure 7.3(a) Variation of mean for six classes of defect
Figure 7.3(b) Variation of variance for six classes of defect
Figure 7.3(c) Variation of mean of energy for six classes of defect
Figure 7.3(d) Variation of maximum amplitude for six classes of defect
Figure 7.3(e) Variation of minimum amplitude for six classes of defect
Figure 7.3(f) Variation of maximum energy for six classes of defect
Figure 7.3(g) Variation of minimum energy for six classes of defect
Figure 7.3(h) Variation of average frequency for six classes of defect
Figure 7.3(i) Variation of mid frequency for six classes of defect
Figure 7.3(j) Variation of maximum frequency for six classes of defect
Figure 7.3(k) Variation of minimum frequency for six classes of defect
Figure 7.3(l) Variation of half point for six classes of defect
In the feature analysis, the variation of the twelve features across
the classes of defect is analysed using the average feature values
determined for each defect. Comparing the graphical results shows that,
of the twelve extracted features, only eight give reliable information
and good discrimination between the flaws. They are:
1. Variance
2. Mean of Energy
3. Maximum Amplitude
4. Minimum Amplitude
5. Minimum Energy
6. Mid Frequency
7. Maximum Frequency
8. Minimum Frequency
7.5.3 Inputs to ANN and SVM
The eight selected features discriminate well between material defects
and are considered the main parameters influencing the classification of
defects; these eight features are therefore used together as the input to
the ANN and SVM. Based on the feature analysis, the other four features
(mean, maximum energy, average frequency and half point) are neglected
because they give similar values for all six classes of defect.
The selected features extracted from each ultrasonic signal are used
as the input to the ANN and SVM. Here, the input to the ANN and SVM is
an eight-component vector.
7.6 SUMMARY
The feature extraction procedure and the features extracted from the
ultrasonic flaw signals using the Discrete Wavelet Transform (DWT) and
the Fast Fourier Transform (FFT) are described in this chapter. The
extracted features for each of the six classes of defect in each of the
four level sub bands are tabulated. The selection of features based on
feature analysis is also described. Lastly, the critical features which
give the best classification results are presented.