Download - Topic 6 - GitHub Pages
![Page 1: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/1.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Topic 6
The Digital Fourier Transform
![Page 2: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/2.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Why bother? • The ear processes sound by decomposing it into sine
waves at different frequencies.• An algorithm that does the same would be a step
towards machines that hears things as humans do.• So…how do we do this by machine?
0 10 20 30 40 50 60 70 80 90 100-2.5
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
2.5
220 Hz
660 Hz
1100 HzPeripheral Auditory system
Cochlea,Auditory nerve
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
![Page 3: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/3.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Jean Baptiste Joseph FourierHe was a French mathematician and physicist who lived from 1768-1830.
He presented a paper in 1807 to the Institut de France claiming any continuous signal with finite period could be represented as the sum of an infinite series of properly chosen sinusoidal wave functions at different frequencies.
We now call this the Fourier Series.
Could this be the way to decompose sounds into different frequencies, like the ear does?
![Page 4: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/4.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
FactoidAmong the reviewers of Fourier’s paper were two famous mathematicians
– Joseph Louis Lagrange (1736-1813)– Pierre Simon de Laplace (1749-1827)
Lagrange said sine waves could not perfectly represent signals with discontinuous slopes, like square waves. (He was, technically, right)
Thus, the Institut de France did not publish Fourier’s work until 15 years later, after Lagrange died.
![Page 5: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/5.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
The Fourier Series
[ ]01
( ) cos( ) sin( )n n n nn
x t A A t B tw w¥
=
= + +å
Given a periodic function x(t)
An INFINITE series of sine and cosine functions reproduces x(t)
0 2 /n n n Tw w p= =
the period
IntegersmtxmTtx Î"=+ )()(
Where frequency ωn is an integer multiple n of fundamental frequency ω0
![Page 6: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/6.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Aside: Two defs for “frequency”
• The frequency of a periodic function can be defined in two ways.
f =ω / 2π =1/TFrequency
Angular Frequency Period
![Page 7: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/7.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Fourier Series
01
cos( )( )
sin( )n n
n n n
A tx t A
B tww
¥
=
é ù= + ê ú+ë û
åThe original signal
Quite a few!
The frequency of this component
Amplitude: the contribution of this component
![Page 8: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/8.jpg)
PROBLEM 1The Fourier Series has an infinite number of sinusoids in it.
It isn’t practical to calculate an infinite number of things, in the general case.
We need to frame the problem as a finite one, so we can actually solve the general case.
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
![Page 9: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/9.jpg)
PROBLEM 2
A function representable by a Fourier series is perfectly periodic. Therefore, it goes on infinitely.
Real world audio is not infinite in length…and things are only locally periodic.
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
![Page 10: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/10.jpg)
Solution: Sample, Window, Hope
STEP 1: Take a “snapshot” of the signal by sampling it a finite number of times within a brief time window
STEP 2: Pretend what you saw in that window goes on forever
Voila! Now you have a “periodic” signal represented by a finite number of points.
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
![Page 11: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/11.jpg)
Image from The Scientist and Engineer's Guide to Digital Signal Processing by Steven Smith)
Discrete Fourier Transform• Represents a finite sequence of complex values as a
finite number of discrete sinusoidal functions.• This finite sequence of samples can be perfectly
reproduced by the finite set of sinusoids.
[ ]X k
DFT0 N-1
0 N-1
0 N-1
0 N-1
Real portion
Imaginary portion
N/2
N/2
Real portion
Imaginary portion
[ ]x nTime domain Frequency domain
IDFT
![Page 12: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/12.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Digital Sampling
0123456
-4-3-2-1AM
PLIT
UD
E
TIME
sample intervalquantization increment
An analog signal is sampled into sequence of discrete sample points, x[n]
![Page 13: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/13.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Windowingx[n] is windowed by function w[n](multiply the ith value of x by the ith value of w)
x[n] w[n] z[n]
x =
Example: windowing x[n] with a rectangular window
![Page 14: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/14.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Only what’s in the window
z[n]
Do the DFT only on the values in the window
x[n] w[n]
x =
Ignore Ignore
![Page 15: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/15.jpg)
Windowing can lead to problemsThe original signal The sample window
What the Fourier Transform “imagines” the signal looks like, based on what was in the window
![Page 16: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/16.jpg)
Window size
• Your window should be longer than the period of the function you want to analyze
• At a sample rate of 8000 Hz, what is the minimum window size that can capture the lowest sound you can hear?
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
![Page 17: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/17.jpg)
Window shape
• Making the window “small” at the edges reduces weirdness.
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
GOODBAD
![Page 18: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/18.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Why window shape matters
• Don’t forget that a DFT assumes the signal in the window is periodic
• The boundary conditions mess things up…unless you manage to have a window whose length is an exact integer multiple of the period of your signal
• Making the edges of the window less prominent helps suppress undesirable artifacts
![Page 19: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/19.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Some famous windows• Rectangular
• Hann (Julius von Hann)
• Bartlett
[ ] 1w n =
[ ] 20.5 1 cos1nw n
Npæ öæ ö= - ç ÷ç ÷-è øè ø
Note: we assume w[n] = 0 outside some range [0,N]
sample
sample
sample
ampl
itude
ampl
itude
ampl
itude
÷÷ø
öççè
æ ---
--
=21
21
12][ NnNN
nw
![Page 20: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/20.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
The DFT and IDFT formulae
21
021
0
[ ] [ ]
1[ ] [ ]
iN knN
niN kn
N
k
X k x n e
x n X k eN
p
p
- -
=
-
=
=
=
å
å
DiscreteFourierTransform
InverseDiscreteFourierTransform
What’s differentbetween them
![Page 21: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/21.jpg)
Some points
• If you have n samples in your window, you will get n frequencies of analysis from your FFT.
• If the samples were grabbed at regular times, then the analysis frequencies will be spaced regularly in the frequency domain.
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
![Page 22: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/22.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Fundamental Frequencyof a Signal
• The lowest frequency a sine or cosine can have and still fit into one period of the function.
• Often called “F zero”• Don’t confuse this for the DC offset.• Don’t confuse the fundamental frequency of a
signal with the fundamental frequency of analysis of an FFT
0 0 / 2 1/f Tw p= =
![Page 23: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/23.jpg)
Fundamental Frequency of Analysis
• The lowest frequency component a Fourier transform can analyze meaningfully
• All the frequencies of analysis are integer multiples of the fundamental frequency of analysis
• This is also the spacing between frequencies of analysis.
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
fanalysis =SN Number of
samples in my window
Sample rate
![Page 24: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/24.jpg)
About Frequencies of Analysis
• A recording with N points produces a DFT with N points.
• The energy in the DFT is symmetric around two “pivot points”: the DC offset (the 0 frequency) and ½ the sample frequency (S).
• Let’s look at an example.
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
![Page 25: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/25.jpg)
The numbers in an 8 point FFT
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
-3 -i -1-i 0 1+i 2 1-i 0 -1+i
-0.13 -0.98 0.38 -0.27 -0.13 -0.27 -0.63 -0.98
0 1 2 3 4 5 6 7Array Index
8 point signal
FFT of signal
DC offset Nyquistfrequency
Complex conjugates
Better index 0 1 2 3 4 (-4) -3 -2 -1
Frequencyof FFT 0 S/N 2S/N 3S/N 4S/N -3S/N -2S/N -S/N
0 125Hz 250Hz 375Hz 500Hz -375Hz -250Hz -125HzIf S=1000 Hzand N = 8
![Page 26: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/26.jpg)
Another view of the same data
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Complex conjugates
Nyquistfrequency
![Page 27: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/27.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Complex Numbers
Ay
( )2 2
cossin
x Ay A
A x y
==
= +
( ) cos sin z x iyA iq q
º +
º +
x
imag
inar
y
real
θ
![Page 28: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/28.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Euler’s Formula
• Useful for relating polar coordinates to rectangular coordinates
cos sinThus...
i
i
e i
z Ae
q
q
q q= +
=AMPLITUDE
PHASE
![Page 29: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/29.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Multiplying Complex Numbers
• POLAR notation EASIER for this
( ) ( )
1 2
1 2
1 1 2 2
3 1 2
i i
i
z Ae z A e
z A A e
q q
q q+
= =
=
![Page 30: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/30.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Multiplying Complex Numbers
• Cartesian works as follows…
( )1 1 1 2 2 2
3 1 2 1 2 1 2 2 1
z x iy z x iyz x x y y i x y x y= + = +
= - + +
![Page 31: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/31.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Complex Conjugate
• the complex conjugate of a complex number is given by changing the sign of the imaginary part.
z a ibz a ib= += -
A complex number
Its complex conjugate
![Page 32: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/32.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
DFT in Cartesian Coordinates
21
0
1
0
[ ] [ ]
2 2[ ] cos sin
iN knN
n
N
n
X k x n e
kn knx n iN N
p
p p
- -
=
-
=
=
æ öæ ö æ ö= - + -ç ÷ ç ÷ç ÷è ø è øè ø
å
å
I got this using Euler’s formula.
In Cartesian coordinates, the fact that these are complex values is more obvious.
![Page 33: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/33.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
The Inverse DFT
21
0
1
0
1[ ] [ ]
1 2 2[ ] cos sin
iN knN
k
N
k
x n X k eN
kn knX k iN N N
p
p p
-
=
-
=
=
æ öæ ö æ ö= +ç ÷ ç ÷ç ÷è ø è øè ø
å
å
complex number
![Page 34: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/34.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Computational complexity
• How many operations does this take for each frequency?
• How many operations total?
1
0
2 2[ ] [ ] cos sinN
n
kn knX k x n iN Np p-
=
æ öæ ö æ ö= - + -ç ÷ ç ÷ç ÷è ø è øè ø
å
![Page 35: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/35.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
The FFT
• Fast Fourier Transform– A much, much faster way to do the DFT– Introduced by Carl F. Gauss in 1805 (ish)– Rediscovered by Cooley & Tukey in 1965– Big O notation for this is O(N log N)– Matlab functions fft and ifft are standard– REQUIRES the window size be a power of 2
![Page 36: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/36.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Short time Fourier Transform
•Break signal into windows•Apply windowing function•Calculate fft of each window
![Page 37: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/37.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Inverse Short time Fourier Transform
Reverse direction of arrows! Do your inverse FFT.Don’t have to apply window function again.Overlap & add!
![Page 38: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/38.jpg)
Real spectrogram
time
frequ
ency
• The Short-Time Fourier Transform (STFT) is a succession of local Fourier Transforms (FT)
STFT
STFT
Zafar Rafii, Winter 2014 38
+ j*Imaginary spectrogram
time
frequ
ency
frame i frame i
Time signal
timewindow i
![Page 39: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/39.jpg)
Imaginary spectrum
frequency
Real spectrum
frequency
• If we used a window of N samples, the FT has N values, from 0 to N-1; e.g., if N = 8…
Time signal
timewindow i
FT
STFT
Zafar Rafii, Winter 2014 39
+ j*window i frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
N frequency values N frequency values
![Page 40: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/40.jpg)
Imaginary spectrum
frequency
Real spectrum
frequency
• Frequency index 0 is the DC component; it is always real. It is an offset from 0. That’s all.
Time signal
timewindow i
FT
STFT
Zafar Rafii, Winter 2014 40
+ j*window i frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
![Page 41: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/41.jpg)
Imaginary spectrum
frequency
Real spectrum
frequency
• Frequency indices from 1 to floor(N/2) are the “unique” complex values (a + j*b)
Time signal
timewindow i
FT
STFT
Zafar Rafii, Winter 2014 41
+ j*window i frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
![Page 42: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/42.jpg)
Imaginary spectrum
frequency
Real spectrum
frequency
• Frequency indices from floor(N/2) to N-1 are the “mirrored” complex conjugates (a - j*b)
Time signal
timewindow i
FT
STFT
Zafar Rafii, Winter 2014 42
+ j*window i frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
![Page 43: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/43.jpg)
Imaginary spectrum
frequency
Real spectrum
frequency
• If N is even, there is a pivot component at frequency index N/2; it is always real!
Time signal
timewindow i
FT
STFT
Zafar Rafii, Winter 2014 43
+ j*window i frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
![Page 44: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/44.jpg)
Real spectrogram
time
frequ
ency
• Summary of the frequency indices and values in the STFT (in colors!)
STFT
Zafar Rafii, Winter 2014 44
Imaginary spectrogram
time
frequ
ency
Frequency 0 = DC component (always real)
Frequency 1 to floor(N/2) = “unique” complex values
Frequency N/2 = “pivot” component (always real)
Frequency floor(N/2) to N-1 = “mirrored” complex conjugates
N frequency values = frequency 0 to N-1 + j*
![Page 45: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/45.jpg)
• The (magnitude) spectrogram is the magnitude (absolute value) of the STFT
Magnitude spectrogram
time
frequ
ency
Real spectrogram
time
frequ
ency
Spectrogram
Zafar Rafii, Winter 2014 45
Imaginary spectrogram
time
frequ
ency
abs
frame iframe i frame i
+ j*
![Page 46: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/46.jpg)
• For a complex number 𝑎 + 𝑗 ∗ 𝑏, the absolute value is 𝑎 + 𝑗 ∗ 𝑏 = 𝑎! + 𝑏!
abs
Spectrogram
Zafar Rafii, Winter 2014 46
Imaginary spectrum
frequency
Real spectrum
frequency
+ j*frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
Magnitude spectrum
frequency
=
frame i
1.4 1.4 1.4 1.43 0 2 0
![Page 47: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/47.jpg)
1.4 1.4 1.4
• All the N frequency values (frequency indices from 0 to N-1) are real and positive (abs!)
abs
Spectrogram
Zafar Rafii, Winter 2014 47
Imaginary spectrum
frequency
Real spectrum
frequency
+ j*frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
Magnitude spectrum
frequency
0 1 2 3 4 5 6 7= 1.43 0 2 0
frame i
N frequency values
![Page 48: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/48.jpg)
1.4 1.4 1.4
• Frequency indices from 0 to floor(N/2) are the unique frequency values (with DC and pivot)
abs
Spectrogram
Zafar Rafii, Winter 2014 48
Imaginary spectrum
frequency
Real spectrum
frequency
+ j*frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
Magnitude spectrum
frequency
0 1 2 3 4 5 6 7= 1.43 0 2 0
frame i
![Page 49: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/49.jpg)
1.4 1.4 1.4
• Frequency indices from floor(N/2)+1 to N-1 are the mirrored frequency values
abs
Spectrogram
Zafar Rafii, Winter 2014 49
Imaginary spectrum
frequency
Real spectrum
frequency
+ j*frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
Magnitude spectrum
frequency
0 1 2 3 4 5 6 7= 1.43 0 2 0
frame i
![Page 50: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/50.jpg)
1.4 1.4 1.4
• Since they are redundant, we can discard the frequency values from floor(N/2)+1 to N-1
abs
Spectrogram
Zafar Rafii, Winter 2014 50
Imaginary spectrum
frequency
Real spectrum
frequency
+ j*frame i frame i
-3 -1 0 1 2 1 0 -1 0 -1 0 1 0 -1 0 1+ j*0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
Magnitude spectrum
frequency
0 1 2 3 4 5 6 7= 1.43 0 2 0
frame i
floor(N/2)+1 unique frequency values
![Page 51: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/51.jpg)
• The spectrogram has therefore floor(N/2)+1 unique frequency values(with DC and pivot)
Magnitude spectrogram
time
frequ
ency
Real spectrogram
time
frequ
ency
Spectrogram
Zafar Rafii, Winter 2014 51
Imaginary spectrogram
time
frequ
ency
frame iframe i frame i
abs+ j*
![Page 52: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/52.jpg)
Spectrogram
• Why the magnitude spectrogram?– Easy to visualize (compare with the STFT)– Magnitude information more important– Human ear less sensitive to phase
Zafar Rafii, Winter 2014 52
Time signal
time
Magnitude spectrogram
time
frequ
ency +
0
![Page 53: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/53.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
The Spectrogramspectrogram(y,256,128,256,fs,'yaxis');
•A series of short term DFTs•Typically just displays the magnitudes in Xof the frequencies up to ½ sampe rate
•There is a spectrogram function in matlab
![Page 54: Topic 6 - GitHub Pages](https://reader031.vdocuments.net/reader031/viewer/2022021623/620afc5220085c7f3268d585/html5/thumbnails/54.jpg)
Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
The Spectrogram
•A series of short term DFTs•Typically just displays the magnitudes in Xof the frequencies up to ½ sampe rate
•There is a spectrogram function in matlab
spectrogram(y,1024,512,1024,fs,'yaxis');