selectionof relevant features for - university of cretehy578/2017/markaki-ii.pdf · systolic heart...
TRANSCRIPT
Selection of
Relevant
Features for
Audio
Classification
tasks
Maria
Markaki
Feature
Extraction
from Sound
Signals
Feature
Selection for
Classification
Speech Dis-
crimination
on Broadcast
news
Pathological
Voice Quality
Assessment
Systolic Heart
Murmur
Classification
Selection of Relevant Features for Audio Classification Tasks
Maria Markaki
21 October 2011
Outline
1 Feature Extraction from Sound Signals
    Modulation Frequency Analysis
2 Feature Selection for Classification
    Feature Selection based on MI
    Redundancy Reduction using HOSVD
3 Speech Discrimination on Broadcast News
4 Pathological Voice Quality Assessment
5 Systolic Heart Murmur Classification
Non-stationary Signal Analysis
The analysis of human speech was the main motivation for the development of time-frequency analysis in the 1940s
Time-frequency representations depict simultaneous measurements of the acoustic energy in both the time and frequency domains
The main method was, and still is, the short-time Fourier transform, whose squared magnitude is the spectrogram
Similar to a Fourier analyser, our auditory system maps the one-dimensional sound waveform to a time-frequency representation through the cochlea
During later auditory stages, spectrum analysis occurs: fast and slow modulation patterns are detected by arrays of filters centred at different frequencies
Principle of Modulation Spectra
In Equations
Short-time Fourier transform:
\[ X_k(m) = \sum_{n=-\infty}^{\infty} h(mM - n)\, x(n)\, W_K^{kn}, \]
where $k = 0, \ldots, K-1$, $W_K = e^{-j(2\pi/K)}$, and $h(n)$ is the acoustic frequency analysis window.
Subband envelope detection & frequency analysis:
\[ X_l(k, i) = \sum_{m=-\infty}^{\infty} g(lL - m)\, |X_k(m)|\, W_I^{im}, \]
where $i = 0, \ldots, I-1$ and $g(m)$ is the modulation frequency analysis window [1]
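The two transforms can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in the talk: the function names, the choice of Hann windows, and the restriction to a single modulation frame (l fixed) are assumptions made here for brevity.

```python
import numpy as np

def stft_magnitude(x, h, M, K):
    """Subband envelopes |X_k(m)|: window h, hop size M, K-point DFT.

    Returns an array of shape (K, n_frames). Assumes len(h) <= K.
    """
    n_frames = 1 + (len(x) - len(h)) // M
    frames = np.stack([x[m * M : m * M + len(h)] * h for m in range(n_frames)])
    return np.abs(np.fft.fft(frames, n=K, axis=1)).T

def modulation_spectrum(env, g, I):
    """|X_l(k, i)| for one modulation frame: window g applied to the first
    len(g) envelope samples of every subband, followed by an I-point DFT."""
    segment = env[:, : len(g)] * g
    return np.abs(np.fft.fft(segment, n=I, axis=1))

# Hypothetical test signal: 1 kHz carrier, amplitude-modulated at 50 Hz.
fs = 8000
t = np.arange(fs) / fs
x = (1 + 0.5 * np.cos(2 * np.pi * 50 * t)) * np.cos(2 * np.pi * 1000 * t)

env = stft_magnitude(x, np.hanning(64), M=8, K=64)   # subband rate fs / M = 1000 Hz
ms = modulation_spectrum(env, np.hanning(500), I=500)  # 2 Hz per modulation bin
```

With these settings the carrier falls in acoustic bin k = 8 (125 Hz per bin) and the 50 Hz modulation in bin i = 25, so the energy away from the DC modulation bin concentrates around that point of the joint plane.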
Example
Joint acoustic / modulation frequency representations and their
combination with cepstrum represent a simple interpretation of the
computational auditory model [1].
Figure: Modulation spectrogram of the sustained vowel /AH/ by a normal speaker (horizontal axis: modulation frequency, 0 to 500 Hz; vertical axis: acoustic frequency in kHz; side plots: energy and pitch energy). The two side plots present the slices intersecting at the point of maximum energy; its coordinates coincide with the fundamental frequency and the first formant of /AH/ (∼ 590 Hz).
Parameters
Tapered windows $h(n)$ and $g(m)$:
    reduced sidelobes of frequency estimates
Length of the analysis window $h(n)$:
    trade-off between resolution in the acoustic and modulation frequency axes
Overlap between successive windows:
    upper limit of the subband sampling rate during modulation transform
Modulation spectral energy in the joint acoustic / modulation frequency plane:
    a 2D matrix $|X_l(k, i)| \in \mathbb{R}^{K \times I}$
    N training matrices: a 3D tensor $\mathcal{A} \in \mathbb{R}^{K \times I \times N}$
Curse of Dimensionality
Classification algorithms detect and exploit complex patterns in data during training, validation and testing
High-dimensional features pose challenging problems to learning algorithms:
- high computational cost and storage volumes for the representation of signals
- difficult exclusion of accidental, unstable patterns which lead to over-fitting of the training system: the generalization error and the number of training examples required for achieving a given error level both increase with data dimension
In order to obtain a low-dimensional representation of the signals suitable for classification, we can employ:
- feature selection techniques
- dimensionality reduction
Maximal Statistical Dependency
Minimal classification error ≃ maximal statistical dependency of the target class c on the data distribution
Max-Dependency criterion → a set S of m features {x_i} which jointly have the largest dependency on the target class: max D(S, c)
Statistical dependency of variables is measured by mutual information (MI):
\[ D(S, c) = I(\{x_i, i = 1, \ldots, m\}; c) = \int \cdots \int p(x_1, \ldots, x_m, c) \log \frac{p(x_1, \ldots, x_m, c)}{p(x_1, \ldots, x_m)\, p(c)} \, dx_1 \cdots dx_m \, dc \]
Requires multivariate densities for MI estimation, which is hard to implement
Feature Selection based on MI
Shannon’s MI between two variables measures the amount of relevant and redundant information:
- in the supervised learning framework, feature $x_j$ is regarded as relevant if it provides information about a target $c$
- redundancy between features $x_j$ and $x_i$ is defined as the amount of information variable $x_j$ holds about variable $x_i$
Max-Relevance criterion:
\[ \max D(S, c), \quad D = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c) \]
- features selected might depend on each other ⇒ add a minimal redundancy condition [2]:
\[ \min R(S), \quad R = \frac{1}{|S|^2} \sum_{x_i, x_j \in S} I(x_i; x_j) \]
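As a rough illustration of the two quantities D and R, the sketch below estimates pairwise MI with a simple histogram (plug-in) estimator. The estimator, the bin count, and all function names are assumptions made for this example; the slides do not state which MI estimator was used.

```python
import numpy as np

def mutual_information(a, b, bins=8):
    """Plug-in MI estimate (in nats) from a 2-D histogram of a and b."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of b, shape (1, bins)
    nz = p_ab > 0                           # skip empty cells (0 log 0 = 0)
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])))

def relevance(X, c, S):
    """D = (1/|S|) * sum over x_i in S of I(x_i; c); X has one column per feature."""
    return float(np.mean([mutual_information(X[:, i], c) for i in S]))

def redundancy(X, S):
    """R = (1/|S|^2) * sum over x_i, x_j in S of I(x_i; x_j)."""
    return float(np.mean([mutual_information(X[:, i], X[:, j]) for i in S for j in S]))

# Hypothetical data: feature 0 follows the class label, feature 1 is pure noise.
rng = np.random.default_rng(0)
c = rng.integers(0, 2, 2000).astype(float)
X = np.column_stack([c + 0.3 * rng.standard_normal(2000), rng.standard_normal(2000)])
```

Here `relevance(X, c, [0])` comes out much larger than `relevance(X, c, [1])`, which is the ordering the Max-Relevance criterion relies on.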
Max-Relevance-Min-Redundancy Criterion (mRMR)
Incremental algorithm: selects the m-th feature from the set {X − S_{m−1}} of features not yet selected:
\[ \max_{x_j \in X - S_{m-1}} \left[ I(x_j; c) - \frac{1}{m-1} \sum_{x_i \in S_{m-1}} I(x_j; x_i) \right] \]
- low computational complexity of the incremental search method
- equivalent to Max-Dependency for first-order incremental feature selection [2]
Still, heuristics are necessary during training for discovering the optimal relation between relevance and redundancy
Idea: reduce feature redundancy first so that multivariate probability densities almost equal the product of marginal densities
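The incremental mRMR search can be sketched as a greedy loop over precomputed MI values. The function name and the input layout (a relevance vector `rel` with I(x_j; c) and a symmetric matrix `red` with I(x_j; x_i)) are assumptions for this example, and the toy numbers are invented purely to show the behaviour.

```python
import numpy as np

def mrmr_select(rel, red, n_select):
    """Greedy mRMR: start from the most relevant feature, then repeatedly add
    the feature maximising relevance minus mean redundancy with the selected set."""
    rel = np.asarray(rel, dtype=float)
    red = np.asarray(red, dtype=float)
    selected = [int(np.argmax(rel))]
    while len(selected) < n_select:
        candidates = [j for j in range(len(rel)) if j not in selected]
        scores = [rel[j] - red[j, selected].mean() for j in candidates]
        selected.append(candidates[int(np.argmax(scores))])
    return selected

# Toy values: features 0 and 1 are both relevant but strongly redundant,
# feature 2 is less relevant but independent of the other two.
rel = [0.90, 0.85, 0.50]
red = [[1.0, 0.8, 0.0],
       [0.8, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
chosen = mrmr_select(rel, red, 2)
```

In this toy case the second pick is feature 2 rather than feature 1, because the redundancy penalty outweighs the small relevance advantage of feature 1.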
Higher Order SVD
Higher Order Singular Value Decomposition (HOSVD) is a generalization of SVD to tensors [3]
SVD first proposed for the Wigner distribution
Real signals contain noise spread out over all the terms of the decomposition, whereas signals are well represented by the first few terms
Truncation of the series after the first few terms significantly reduces noise while retaining most of the signal
The signal representations can be approximated in a lower-dimensional space, producing a compact feature set suitable for classification
3rd-order Singular Value Decomposition
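A generalization of SVD to tensors:
\[ \mathcal{A} = \mathcal{S} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \]
where:
- $U^{(n)} = [U^{(n)}_1, \ldots, U^{(n)}_{I_n}]$ is the matrix of left singular vectors of the matrix unfolding $A_{(n)}$
- $\mathcal{S} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ has all-orthogonal subtensors with ordered Frobenius norms:
\[ \|\mathcal{S}_{i_n=1}\| \ge \|\mathcal{S}_{i_n=2}\| \ge \ldots \ge \|\mathcal{S}_{i_n=I_n}\| \ge 0 \]
- $\|\mathcal{S}_{i_n=i}\| \equiv \sigma^{(n)}_i$ are the n-mode singular values of $\mathcal{A}$ ≡ the singular values of the matrix unfolding $A_{(n)}$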
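A compact NumPy sketch of the decomposition is given below, following the construction in [3]: one SVD per matrix unfolding, then the core tensor obtained by multiplying with the transposed factor matrices. The helper names and the random test tensor are assumptions for illustration only.

```python
import numpy as np

def unfold(A, mode):
    """Mode-n unfolding: the mode-n fibres of A become the columns of a matrix."""
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)

def mode_product(T, U, mode):
    """n-mode product T x_n U, with U of shape (J, I_n)."""
    return np.moveaxis(np.tensordot(U, T, axes=(1, mode)), 0, mode)

def hosvd(A):
    """Return the core tensor S, the factor matrices U^(n) and the n-mode singular values."""
    Us, sigmas = [], []
    for n in range(A.ndim):
        U, s, _ = np.linalg.svd(unfold(A, n), full_matrices=False)
        Us.append(U)
        sigmas.append(s)
    S = A
    for n, U in enumerate(Us):
        S = mode_product(S, U.T, n)   # S = A x_1 U1^T x_2 U2^T x_3 U3^T
    return S, Us, sigmas

# Hypothetical small tensor standing in for the K x I x N training tensor.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 6))
S, Us, sigmas = hosvd(A)

# Reconstruction A = S x_1 U1 x_2 U2 x_3 U3.
A_rec = S
for n, U in enumerate(Us):
    A_rec = mode_product(A_rec, U, n)
```

The Frobenius norms of the slices `S[i]` match `sigmas[0]`, which is the ordering property stated above for the first mode.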
“Rank” of the Matrix Unfolding
Ordering of the n-mode singular values $\sigma^{(n)}_{i_n}$ implies that the “energy” of tensor $\mathcal{A}$ is concentrated in the singular vectors $U^{(n)}_i$ with the lowest values of $i$ in every subspace
Based on the data accuracy, we define a threshold τ and retain the singular vectors with $\sigma^{(n)}_{i_n}$ exceeding it
Figure: Singular value contribution versus singular value index (up to 20) for the frequency subspace and the modulation-frequency subspace.
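The thresholding step can be sketched as below. The slides do not define "contribution" precisely; this example assumes it is each singular value divided by the sum of all singular values in that mode, and the function name and numbers are hypothetical.

```python
import numpy as np

def retained_rank(sigma, tau):
    """Number of leading singular vectors whose contribution exceeds tau.

    Assumption: contribution_i = sigma_i / sum(sigma), with sigma sorted
    in decreasing order as returned by an SVD.
    """
    sigma = np.asarray(sigma, dtype=float)
    contribution = sigma / sigma.sum()
    return int(np.sum(contribution > tau))

# Toy singular values for one subspace.
sigma = [7.0, 2.0, 0.9, 0.1]
rank = retained_rank(sigma, tau=0.05)
```

Because the singular values are sorted, counting the entries above τ is the same as keeping the first `rank` singular vectors of that mode.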
Maximum Contribution Criterion
Dimensionality of the embedding can be selected through traditional model selection methods such as cross-validation
Dimensionality reduction can preserve information from all the original input variables, promoting generalization
Still, purely unsupervised techniques might throw away low-variance dimensions which are highly predictive for a classification task
Goal: to combine both unsupervised and supervised techniques to gain the benefit of both approaches
Optimal “Independent” Features
Project $|X_l(k, i)|$ onto the basis vectors contributing more than τ to the “energy” of each subspace:
- $U^{(1)}_i$, $i = 1, \ldots, i_1$ in the acoustic frequency space
- $U^{(2)}_i$, $i = 1, \ldots, i_2$ in the modulation frequency space
Select the most relevant “independent” features
Figure: Redundancy of original (red triangles) and “independent” features after applying HOSVD (yellow triangles). Horizontal axis: extrapolated MI; vertical axis: P.D.F. of MI values (logarithmic scale). Legend: "Redundancy: packed features", "Redundancy: original features".
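The projection step amounts to multiplying a modulation spectrum by the truncated factor matrices on both sides. The sketch below assumes orthonormal bases `U1` (acoustic frequency) and `U2` (modulation frequency); here they are random orthonormal matrices standing in for the HOSVD factors, and the function name is hypothetical.

```python
import numpy as np

def project_features(X, U1, U2, r1, r2):
    """Project a K x I modulation spectrum onto the first r1 acoustic and
    r2 modulation basis vectors: Z = U1[:, :r1]^T  X  U2[:, :r2]."""
    return U1[:, :r1].T @ X @ U2[:, :r2]

# Stand-in data: a 6 x 8 "modulation spectrum" and random orthonormal bases.
rng = np.random.default_rng(0)
X = rng.random((6, 8))
U1, _ = np.linalg.qr(rng.standard_normal((6, 6)))
U2, _ = np.linalg.qr(rng.standard_normal((8, 8)))

Z_full = project_features(X, U1, U2, 6, 8)   # full rank: only a change of basis
Z = project_features(X, U1, U2, 3, 2)        # truncated: 3 x 2 = 6 features
features = Z.ravel()                          # candidates for relevance ranking
```

The flattened entries of `Z` are the "independent" features from which the most relevant ones are then chosen.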
Maximum Relevance Criterion applied after HOSVD
Figure: We select the most relevant projections of features among those contributing more than a threshold, through a cross-validation procedure. SVM classifier equal error rate (vertical axis) as a function of feature number (horizontal axis) using mRMR and MaxRel features for speech/nonspeech discrimination on broadcast news.
Speech Discrimination based on Modulation Spectra
The discrimination of speech and non-speech is the first processing step before speaker segmentation and recognition, or speech transcription
We design a content-based speech discrimination algorithm which exploits long-term information inherent in the modulation spectrum
The system is built upon a segment-based SVM classifier
Detection experiments on Greek and U.S. English broadcast news data suggest that the system provides complementary information to state-of-the-art mel-cepstral features
Relevance of Features
Figure: Relevance of the original and compressed modulation spectral features: mutual information (MI) between the speech / non-speech class variable and (left) the acoustic and modulation frequencies (65 × 125 dimensions) and (right) the first 25 singular vectors in each subspace. Left panel axes: modulation frequency (Hz) and acoustic frequency (Hz); right panel axes: modulation frequency SVs and acoustic frequency SVs.
Maximum Relevance vs Maximum Contribution Criterion
Figure: SVM classifier equal error rate (EER) as a function of the number of features selected in terms of maximum relevance or maximum contribution.
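For readers unfamiliar with the metric, the EER is the operating point at which the false alarm rate equals the miss rate. The sketch below is one simple way to approximate it from classifier scores by sweeping the decision threshold; the function name and the toy scores are assumptions, and the evaluation code behind the figure is not shown in the slides.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate EER: sweep thresholds over the observed scores and return
    the mean of false-alarm and miss rates where they are closest."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    thresholds = np.unique(scores)
    # False alarm: a negative (label 0) scored at or above the threshold.
    far = np.array([np.mean(scores[labels == 0] >= t) for t in thresholds])
    # Miss: a positive (label 1) scored below the threshold.
    frr = np.array([np.mean(scores[labels == 1] < t) for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2)

# Toy examples: perfectly separated scores, then overlapping scores.
eer_separable = equal_error_rate([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
eer_overlap = equal_error_rate([0.1, 0.6, 0.4, 0.9], [0, 0, 1, 1])
```

With only a handful of scores the crossing point is coarse; with many test segments the threshold sweep gives a finer approximation.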
Approximate Representations with Optimal Performance
Figure: (Left) Rank-(13, 12) approximation of the modulation spectrum for 500 ms of a speech signal. (Right) 21-feature approximation for the same speech signal. Energy at modulations corresponding to pitch (∼ 120 Hz) and syllabic and phonetic rates (< 40 Hz) remains prominent. Axes in both panels: modulation frequency (Hz, 0 to 250) and acoustic frequency (Hz, 0 to 8000).
Pathological Voice Quality Assessment
Objectively evaluate the degree of voice alterations in a non-invasive manner, using acoustic analysis
- assist the perceptual evaluation of dysphonic voice quality used by the clinicians
Identify acoustic measures that highly correlate with pathological voice qualities
Modulation frequency analysis for voice pathology detection and classification
Relevant Features without Normalization
Figure: Relevance (MI) between modulation spectral features and pathologic voice class without normalization in MEEI (left) and in PdA (right). Axes in both panels: modulation frequency (Hz, 0 to 500) and acoustic frequency (Hz).
Normalization of modulation spectra
The distribution of envelope amplitudes of voiced speech has a strong exponential component
We calculate modulation spectra using a log transformation of the amplitude values $|X_k(m)|$ and subtracting their mean log amplitude before windowing:
\[ \hat{X}_k(m) = \log |X_k(m)| - \overline{\log |X_k(m)|} \qquad (1) \]
where $\overline{\log |X_k(m)|}$ denotes the average of $\log |X_k(m)|$ over $m$
- analogous to the cepstral mean subtraction approach, which compensates for convolutional noise in MFCC features
Next, we normalize every acoustic frequency subband with the marginal of the modulation frequency representation (Sukittanon et al. 2004):
\[ X_{l,\mathrm{sub}}(k, i) = \frac{X_l(k, i)}{\sum_i X_l(k, i)} \qquad (2) \]
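Both normalization steps are short array operations. In the sketch below the array layout (subbands along the first axis), the function names, and the small constant added before the logarithm are assumptions made for this example.

```python
import numpy as np

def log_mean_normalize(env, eps=1e-12):
    """Eq. (1): log envelope of each subband minus its average over time m.

    env has shape (K, n_frames); eps guards against log(0) and is an
    assumption of this sketch, not part of the equation.
    """
    log_env = np.log(env + eps)
    return log_env - log_env.mean(axis=1, keepdims=True)

def subband_normalize(mod_spec):
    """Eq. (2): divide every acoustic subband (row k) by its sum over the
    modulation bins i. Assumes no row sums to zero."""
    return mod_spec / mod_spec.sum(axis=1, keepdims=True)

# Toy check: a fixed gain per subband (a simple channel effect) is removed by Eq. (1).
rng = np.random.default_rng(0)
env = rng.uniform(0.5, 2.0, size=(4, 50))
gain = np.array([0.5, 1.0, 2.0, 4.0])[:, None]
a = log_mean_normalize(env)
b = log_mean_normalize(env * gain)
n = subband_normalize(rng.random((4, 10)) + 0.1)
```

A multiplicative gain becomes an additive constant in the log domain, so subtracting the mean cancels it; this is the same mechanism as cepstral mean subtraction.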
Relevant Features after Normalization
Figure: Relevance (MI) between modulation spectral features and pathologic voice class after normalization in MEEI (left) and in PdA (right). Axes in both panels: modulation frequency (Hz, 0 to 500) and acoustic frequency (Hz).
Performance of MFCC and mRMS features in MEEI
Figure: Detection Error Trade-off (DET) curve using mRMS features, MFCC and their fusion (concatenation of feature vectors) in MEEI. Axes: false alarm probability (in %) and miss probability (in %).
Performance of MFCC and mRMS features in PdA
Figure: DET curve using mRMS features, MFCC and the concatenated feature vector in PdA. Axes: false alarm probability (in %) and miss probability (in %).
Cross-database performance of MFCC and mRMS features
Figure: DET curve using mRMS features, MFCC and the concatenated feature vector when training is performed in PdA and testing in MEEI. Axes: false alarm probability (in %) and miss probability (in %).
Results: Classification of Pathologies in MEEI
Classify: vocal fold polyp, adductor spasmodic dysphonia, keratosis leukoplakia, and vocal nodules

           mRMS                               FD-GA
           DCFopt (%)      AUC (%)    m       DR (%)
Pol/Add    88.33 ± 2.64    95.74      60      82.5
Pol/Ker    86.11 ± 5.52    93.61      80      81.8
Pol/Mod    91.25 ± 3.13    95.03      20      87.5

where FD-GA stands for Fisher distance and Genetic Algorithms (Hosseini et al. 2008)
Systolic Heart Murmur Classification
Classic heart auscultation using a stethoscope
- the most common method to screen the health of the cardiovascular system
- simple, fast, with minimal cost
Detection of pathological heart sounds (murmurs or additional sounds)
- indication of structural abnormalities of the cardiovascular system
A significant percentage of children presents some innocent functional murmurs
- accurate discrimination between pathological and innocent murmurs is a skill that can take years to acquire and refine
Automatic Preprocessing of PCG recordings
Figure: Phonocardiographic signal (PCG, solid) and envelope of the electrocardiographic signal (ECG, dashed) of a 10-year-old with an innocent early to midsystolic murmur, plotted against time (sec). A 400 ms segment at the beginning of a heart cycle is highlighted, including S1, the systolic murmur (SM) and S2.
Reassigned spectrogram
Figure: Energy (relative sound intensity in dB) of the reassigned spectrogram [4] of the PCG (shown in previous slide) with innocent early to midsystolic murmur - first 400 ms of one heart cycle.
Children PCG Database
Figure: Mean values for the energy (relative sound intensity in dB) of the reassigned spectra of the PCG from 25 subjects with (left) innocent systolic murmurs, (right) pathological systolic murmurs - 3 recordings with 5 consecutive heart cycles per recording.
Visualization of Useful Information
Figure: Relevance - estimated as mutual information - of the reassigned spectral features of the PCG (the first 400 ms of the heart cycle) for discrimination of abnormal murmurs.
System Performance
Figure: Average ROC curves (sensitivity versus 1 − specificity) of 25 cross-validation runs using SVM based on one heart cycle (red dashed line) or five heart cycle segments (blue solid line). The best classification score for one recording corresponds to a sensitivity of 92.11% and a specificity of 89.82% (blue square).
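As a reminder of how the two reported figures are defined, the sketch below computes sensitivity and specificity from binary decisions. The label convention (1 for pathological, 0 for innocent murmur), the function name, and the toy decisions are assumptions for this example and are unrelated to the data behind the figure.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).

    Assumed labels: 1 = pathological murmur, 0 = innocent murmur.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return float(tp / (tp + fn)), float(tn / (tn + fp))

# Toy decisions: 4 pathological and 5 innocent recordings.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Each point on a ROC curve corresponds to one such pair, obtained at a different decision threshold of the classifier.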
Comparison of System Performance to General Doctors
Figure: Sensitivity and specificity (in %) of the automatic diagnosis system (green bars) compared to general doctors (yellow bars).
Summary of Contributions
Adaptation of the maximum dependency criterion for feature selection in two steps:
1 redundancy reduction through HOSVD
2 selection of the most relevant independent features through cross-validation
Application of the Max-Dep criterion to speech discrimination and pathological voice quality assessment based on modulation spectra
Application of the Max-Dep criterion to heart murmur classification based on reassigned spectra
Normalization of modulation frequency features for cross-database experiments on voice quality assessment
Future Work
Apply the algorithm using more elaborate representations for various signal classification tasks
Experiment with recent feature selection techniques, e.g., based on Markov Blanket theory
Comparison of the heart murmur classification system to a state-of-the-art method on the same data
Classification of a sequence of spectra, as in video, adding an extra dimension of time before HOSVD
References
[1] L. Atlas and S.A. Shamma. Joint acoustic and modulation frequency. EURASIP Journal on Applied Signal Processing, 7:668–675, 2003.
[2] H. Peng, F. Long, and C. Ding. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell., 27:1226–1238, 2005.
[3] L. De Lathauwer, B. De Moor, and J. Vandewalle. A multilinear singular value decomposition. SIAM J. Matrix Anal. Appl., 21:1253–1278, 2000.
[4] F. Auger and P. Flandrin. Improving the readability of time-frequency and time-scale representations by the reassignment method. IEEE Trans. Signal Process., 43(5):1068–1089, 1995.
Conference Publications
1 “Speech - Nonspeech Discrimination using the Information Bottleneck Method and Spectro-Temporal Modulation Index”, Markaki M., Wohlmayr M. and Stylianou Y., InterSpeech ICSLP, 2007
2 “Discrimination of Speech from nonspeech in broadcast news based on modulation frequency features”, Markaki M. and Stylianou Y., ISCA, 2008
3 “Dimensionality Reduction of Modulation Frequency Features for Speech Discrimination”, Markaki M. and Stylianou Y., InterSpeech, 2008
4 “Singing Voice Detection using Modulation Frequency Features”, Markaki M., Holzapfel A. and Stylianou Y., ISCA, 2008
5 “Evaluation of Modulation Frequency Features for Speaker Verification and Identification”, Markaki M. and Stylianou Y., EUSIPCO, 2009
Conference Publications
6 “Using Modulation Spectra for Voice Pathology Detection and Classification”, Markaki M. and Stylianou Y., IEEE EMBC, 2009
7 “Normalized Modulation Spectral Features for Cross-Database Voice Pathology Detection”, Markaki M. and Stylianou Y., InterSpeech, 2009
8 “Modulation Spectral Features for Objective Voice Quality Assessment: the Breathiness case”, Markaki M. and Stylianou Y., MAVEBA, 2009
9 “Modulation Spectral Features for Objective Voice Quality Assessment”, Markaki M. and Stylianou Y., IEEE ISCCSP, 2010
10 “Dysphonia Detection based on Modulation Spectral Features and Cepstral Coefficients”, Markaki M., Stylianou Y., Arias-Londono J.D. and Godino-Llorente J.I., ICASSP, 2010
Journal & Book Publications
1 “Extraction of Speech-Relevant Information from Modulation Spectrograms”, Markaki M., Wohlmayr M. and Stylianou Y., Progress in Nonlinear Speech Processing, Springer, pp. 78 - 88, 2007
2 “Discrimination of Speech from Nonspeech in Broadcast News Based on Modulation Frequency Features”, Markaki M. and Stylianou Y., Speech Communication, 2010
3 “On combining information from Modulation Spectra and Mel-Frequency Cepstral Coefficients for automatic detection of pathological voices”, Arias-Londono J.D., Godino-Llorente J.I., Markaki M. and Stylianou Y., Logopedics Phoniatrics Vocology, 2010
4 “Voice Pathology Detection and Discrimination Based on Modulation Spectral Features”, Markaki M. and Stylianou Y., IEEE Transactions on Speech and Audio Processing, 2011
THANK YOU
for your attention