automatic pitch tracking

Automatic Pitch Tracking September 18, 2014

Upload: cynthia-nolan

Post on 31-Dec-2015



TRANSCRIPT

Page 1: Automatic Pitch Tracking

Automatic Pitch Tracking

September 18, 2014

Page 2: Automatic Pitch Tracking

The Digitization of Pitch

• Praat can give us a representation of speech that looks like:

• The blue line represents the fundamental frequency (F0) of the speaker’s voice.

• Also known as a pitch track

• How can we automatically “track” F0 in a sample of speech?

Page 3: Automatic Pitch Tracking

Pitch Tracking

• Voicing:

• Air flow through vocal folds

• Rapid opening and closing due to Bernoulli Effect

• Each cycle sends an acoustic shockwave through the vocal tract

• …which takes the form of a complex wave.

• The rate at which the vocal folds open and close becomes the fundamental frequency (F0) of a voiced sound.

Page 4: Automatic Pitch Tracking

Voicing Bars

Page 5: Automatic Pitch Tracking

Voicing Bars

Individual glottal pulses

Page 6: Automatic Pitch Tracking

Voicing = Complex Wave

• Note: voicing is not perfectly periodic.

• …always some random variation from one cycle to the next.

• How can we measure the fundamental frequency of a complex wave?

Page 7: Automatic Pitch Tracking

• The basic idea: figure out the period between successive cycles of the complex wave.

• Fundamental frequency = 1 / period

duration = ???
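As a quick worked example of the relationship above, here is a minimal sketch that turns a measured period into a frequency. The function name `period_to_f0` and the 5 ms period are illustrative assumptions, not values from the slides.

```python
def period_to_f0(period_seconds):
    """Return the fundamental frequency in Hz for a period given in seconds."""
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_seconds

# Hypothetical example: one glottal cycle lasting 5 ms.
f0 = period_to_f0(0.005)
print(f0)
```

A 5 ms cycle corresponds to an F0 of about 200 Hz.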

Page 8: Automatic Pitch Tracking

Measuring F0

• To figure out where one cycle ends and the next begins…

• The basic idea is to find how well successive “chunks” of a waveform match up with each other.

• One period = the length of the chunk that matches up best with the next chunk.

• Automatic Pitch Tracking parameters to think about:

1. Window size (i.e., chunk size)

2. Step size

3. Frequency range (= period range)
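Since a digital waveform is a list of samples, the three parameters above end up being expressed as sample counts. The sketch below shows one way to do the conversion. The 16000 Hz sampling rate, the 10 ms step, and the function name are assumptions made for illustration; the 0.04 s window and the 75 to 600 Hz range are the values these slides quote for Praat.

```python
import math

def tracking_parameters_in_samples(sample_rate, window_s, step_s, f0_min, f0_max):
    """Convert window size, step size and frequency range into sample counts."""
    return {
        "window": round(window_s * sample_rate),
        "step": round(step_s * sample_rate),
        # The highest F0 has the shortest period; the lowest F0 has the longest.
        "min_period": math.ceil(sample_rate / f0_max),
        "max_period": math.floor(sample_rate / f0_min),
    }

params = tracking_parameters_in_samples(
    sample_rate=16000, window_s=0.04, step_s=0.01, f0_min=75, f0_max=600
)
print(params)
```

Rounding the shortest period up and the longest period down keeps every candidate period inside the requested frequency range.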

Page 9: Automatic Pitch Tracking

Window (Chunk) Size

Here’s an example of a small window

Page 10: Automatic Pitch Tracking

Window (Chunk) Size

Here’s an example of a large(r) window

Page 11: Automatic Pitch Tracking

Initial window of the waveform is compared to another window (of the same duration) at a later point in the waveform

Page 12: Automatic Pitch Tracking

Matching

The waveforms in the two windows are compared to see how well they match up.

Correlation = measure of how well the two windows match

???

Page 13: Automatic Pitch Tracking

Autocorrelation

• The measure of correlation =

• Sum of the point-by-point products of the two chunks.

• The technical name for this is autocorrelation…

• because two parts of the same wave are being matched up against each other.

• (“auto” = self)

Page 14: Automatic Pitch Tracking

Autocorrelation Example

• Ex: consider window x, with n samples…

• What’s its correlation with window y?

• (Note: window y must also have n samples)

• x1 = first sample of window x

• x2 = second sample of window x

• …

• xn = nth (final) sample of window x

• y1 = first sample of window y, etc.

• Correlation (R) = x1*y1 + x2*y2 + … + xn*yn

• The larger R is, the better the correlation.
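The formula for R can be written as a few lines of code. This is a minimal sketch of the sum-of-products measure described above; the function name `correlation` and the three-sample windows are made up for illustration.

```python
def correlation(x, y):
    """Sum of the point-by-point products of two equal-length windows."""
    if len(x) != len(y):
        raise ValueError("windows must have the same number of samples")
    return sum(xi * yi for xi, yi in zip(x, y))

# Hypothetical three-sample windows, just to show the call.
r = correlation([1.0, -1.0, 0.5], [1.0, -1.0, 0.5])
print(r)
```

Here the two windows are identical, so every product is positive and R is as large as it can be for these sample values.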

Page 15: Automatic Pitch Tracking

By the Numbers

Sample      1     2     3     4     5     6
x          .8    .3   -.2   -.5    .4    .8
y         -.3   -.1    .1    .3    .1   -.1
product  -.24  -.03  -.02  -.15   .04  -.08

Sum of products = -.48

• These two chunks are poorly correlated with each other.

Page 16: Automatic Pitch Tracking

By the Numbers, part 2

Sample      1     2     3     4     5     6
x          .8    .3   -.2   -.5    .4    .8
z          .7    .4   -.1   -.4    .1    .4
product   .56   .12   .02   .2    .04   .32

Sum of products = 1.26

• These two chunks are well correlated with each other.

(or at least better than the previous pair)

• Note: matching peaks count for more than matches close to 0.
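The two "By the Numbers" examples can be checked with a short script. The sample values are copied from the slides; the helper name `correlation` is an assumption.

```python
def correlation(x, y):
    """Sum of the point-by-point products of two equal-length windows."""
    return sum(xi * yi for xi, yi in zip(x, y))

x = [0.8, 0.3, -0.2, -0.5, 0.4, 0.8]
y = [-0.3, -0.1, 0.1, 0.3, 0.1, -0.1]
z = [0.7, 0.4, -0.1, -0.4, 0.1, 0.4]

r_xy = correlation(x, y)   # x against the poorly matching window
r_xz = correlation(x, z)   # x against the well matching window
print(round(r_xy, 2), round(r_xz, 2))
```

Rounded to two decimals, the sums agree with the slides: -.48 for x and y, 1.26 for x and z.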

Page 17: Automatic Pitch Tracking

Back to (Digital) Reality

The waveforms in the two windows are compared to see how well they match up.

Correlation = measure of how well the two windows match

???

These two windows are poorly correlated

Page 18: Automatic Pitch Tracking

Next: the pitch tracking algorithm moves further down the waveform and grabs a new window

Page 19: Automatic Pitch Tracking

The distance the algorithm moves forward in the waveform is called the step size

“step”

Page 20: Automatic Pitch Tracking

Matching, again

The next window gets compared to the original.

???

Page 21: Automatic Pitch Tracking

Matching, again

The next window gets compared to the original.

???

These two windows are also poorly correlated

Page 22: Automatic Pitch Tracking

The algorithm keeps chugging and, eventually…

another “step”

Page 23: Automatic Pitch Tracking

Matching, again

The best match is found.

???

These two windows are highly correlated

Page 24: Automatic Pitch Tracking

The fundamental period can be determined by calculating the length of time between the start of window 1 and the start of (well correlated) window 2.

period
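The search walked through on the last few slides can be sketched as a loop over candidate distances (lags) between window 1 and window 2. This is a simplified illustration, not Praat's actual implementation: it tries every whole-sample lag, and the function name, the 8000 Hz sampling rate, and the synthetic 125 Hz sine wave are all assumptions.

```python
import math

def best_period(signal, start, window, min_lag, max_lag):
    """Return the lag (in samples) whose window best matches the window at `start`."""
    reference = signal[start:start + window]
    best_lag, best_score = None, -math.inf
    for lag in range(min_lag, max_lag + 1):
        candidate = signal[start + lag:start + lag + window]
        if len(candidate) < window:
            break  # ran off the end of the signal
        score = sum(a * b for a, b in zip(reference, candidate))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

sample_rate = 8000   # assumed sampling rate in Hz
true_f0 = 125.0      # synthetic test tone: period of 64 samples
signal = [math.sin(2 * math.pi * true_f0 * n / sample_rate) for n in range(2000)]

# Lags 14 to 106 correspond roughly to 600 Hz down to 75 Hz at this sampling rate.
lag = best_period(signal, start=0, window=192, min_lag=14, max_lag=106)
period_s = lag / sample_rate
print(lag, 1.0 / period_s)
```

The best match falls at a lag of 64 samples, the distance between the starts of the two windows, which gives back an F0 of about 125 Hz.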

Page 25: Automatic Pitch Tracking

Mopping up

period

• Frequency is 1 / period

• Q: How many possible periods does the algorithm need to check?

• Frequency range (default in Praat: 75 to 600 Hz)
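One way to answer the "how many periods" question is to count the whole-sample periods that fall inside the frequency range. The sketch below assumes a 44100 Hz recording and checks only whole-sample lags; a real tracker may also interpolate between samples. The function name `candidate_lags` is hypothetical.

```python
import math

def candidate_lags(sample_rate, f0_min=75.0, f0_max=600.0):
    """Return the whole-sample periods that lie within the frequency range."""
    shortest = math.ceil(sample_rate / f0_max)    # period of the highest F0
    longest = math.floor(sample_rate / f0_min)    # period of the lowest F0
    return range(shortest, longest + 1)

lags = candidate_lags(sample_rate=44100)
print(lags.start, lags[-1], len(lags))
```

Under these assumptions the algorithm checks periods from 74 to 588 samples, which is 515 candidates per window. Narrowing the frequency range shrinks that count directly.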

Page 26: Automatic Pitch Tracking

Moving on

• Another comparison window is selected and the whole process starts over again.

Page 27: Automatic Pitch Tracking

[Figure: pitch track for the utterance “Uhm, I would like a flight to Seattle from Albuquerque.” Vertical axis: F0 (Hz), marked at 200, 300 and 400. Horizontal axis: Time (s), marked at 1 to 4.]

• The algorithm ultimately spits out a pitch track.

• This one shows you the F0 value at each step.
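Putting the pieces together, a pitch track is just the per-window search repeated at every step. The sketch below is a simplified illustration under several assumptions: an 8000 Hz sampling rate, a steady synthetic 125 Hz tone instead of real speech, whole-sample lags only, and no decision about whether a frame is voiced. The function names are hypothetical.

```python
import math

def best_period(signal, start, window, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest sum of products."""
    reference = signal[start:start + window]
    scores = {}
    for lag in range(min_lag, max_lag + 1):
        candidate = signal[start + lag:start + lag + window]
        scores[lag] = sum(a * b for a, b in zip(reference, candidate))
    return max(scores, key=scores.get)

def pitch_track(signal, sample_rate, window, step, f0_min=75.0, f0_max=600.0):
    """Return a list of (time in seconds, F0 in Hz), one entry per step."""
    min_lag = math.ceil(sample_rate / f0_max)
    max_lag = math.floor(sample_rate / f0_min)
    track = []
    start = 0
    # Stop once the longest candidate window would run past the end.
    while start + max_lag + window <= len(signal):
        lag = best_period(signal, start, window, min_lag, max_lag)
        track.append((start / sample_rate, sample_rate / lag))
        start += step
    return track

sample_rate = 8000
signal = [math.sin(2 * math.pi * 125.0 * n / sample_rate) for n in range(2000)]

# window = 320 samples (0.04 s), step = 80 samples (10 ms)
track = pitch_track(signal, sample_rate, window=320, step=80)
for time_s, f0 in track[:3]:
    print(time_s, f0)
```

Each entry pairs a time with the F0 estimated there, which is the information the plotted pitch track displays.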

Thanks to Chilin Shih for making these materials available

Page 28: Automatic Pitch Tracking

Pitch Tracking in Praat

• Play with F0 range.

• Create Pitch Object.

• Also go To Manipulation…Pitch.

• Also check out:

Page 29: Automatic Pitch Tracking

Summing Up

• Pitch tracking uses three parameters

1. Window size

• Ensures reliability

• In Praat, the window size is always three times the longest possible period.

• E.g.: 3 × 1/75 = .04 sec.

2. Step size

• For temporal precision

3. Frequency range

• Reduces computational load
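The window-size arithmetic above can be written as a one-line function. This is a sketch of the rule as stated on this slide (three times the longest possible period); the function name and the second example value of 100 Hz are assumptions.

```python
def window_duration(f0_min, periods=3):
    """Window length in seconds: `periods` times the longest period, 1 / f0_min."""
    return periods / f0_min

default_window = window_duration(75)    # 3 x 1/75, the slide's example
higher_floor = window_duration(100)     # hypothetical higher pitch floor
print(default_window, higher_floor)
```

Raising the bottom of the frequency range shortens the window, so the floor setting affects reliability as well as the number of periods checked.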

Page 30: Automatic Pitch Tracking

Deep Thought Questions

• What might happen if:

• The shortest period checked is longer than the fundamental period?

• AND two fundamental periods fit inside a window?

• Potential Problem #1: Pitch Halving

• The pitch tracker thinks the fundamental period is twice as long as it is in reality.

• It estimates F0 to be half of its actual value
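Pitch halving can be reproduced with a synthetic tone. In the sketch below, the true period is 20 samples (400 Hz at an assumed 8000 Hz sampling rate), but the search only checks periods of 30 to 50 samples, so the best match it can find is two cycles long. The signal, the search limits, and the function name are all illustrative assumptions.

```python
import math

def best_period(signal, start, window, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest sum of products."""
    reference = signal[start:start + window]
    scores = {}
    for lag in range(min_lag, max_lag + 1):
        candidate = signal[start + lag:start + lag + window]
        scores[lag] = sum(a * b for a, b in zip(reference, candidate))
    return max(scores, key=scores.get)

sample_rate = 8000     # assumed sampling rate in Hz
true_period = 20       # samples, i.e. a 400 Hz tone
signal = [math.sin(2 * math.pi * n / true_period) for n in range(400)]

# The search starts at 30 samples, so the true period (20) is never checked.
lag = best_period(signal, start=0, window=60, min_lag=30, max_lag=50)
estimated_f0 = sample_rate / lag
print(lag, estimated_f0)
```

The tracker settles on a 40-sample period and reports 200 Hz, half of the true 400 Hz.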

Page 31: Automatic Pitch Tracking

Pitch Halving

pitch is halved

Check out normal file in Praat.

Page 32: Automatic Pitch Tracking

More Deep Thoughts

• What might happen if:

• The shortest period checked is less than half of the fundamental period?

• AND the second half of the fundamental cycle is very similar to the first?

• Potential Problem #2: Pitch doubling

• The pitch tracker thinks the fundamental period is half as long as it actually is.

• It estimates the F0 to be twice as high as it is in reality.
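The conditions for pitch doubling can be illustrated with a synthetic wave whose second half-cycle looks very much like its first. The sketch below builds such a wave from a weak fundamental plus a strong second harmonic (an assumption chosen for illustration, not taken from the slides) and compares the match at half the true period with the match at the full period.

```python
import math

def correlation_at_lag(signal, start, window, lag):
    """Sum of products between the window at `start` and the window `lag` later."""
    x = signal[start:start + window]
    y = signal[start + lag:start + lag + window]
    return sum(a * b for a, b in zip(x, y))

period = 80   # true period in samples (100 Hz at an assumed 8000 Hz rate)
signal = [
    0.2 * math.sin(2 * math.pi * n / period)          # weak fundamental
    + 1.0 * math.sin(2 * math.pi * 2 * n / period)    # strong second harmonic
    for n in range(800)
]

window = 3 * period
score_half = correlation_at_lag(signal, 0, window, period // 2)
score_full = correlation_at_lag(signal, 0, window, period)
ratio = score_half / score_full
print(ratio)
```

The half-period match scores roughly 92% as high as the true-period match. In this idealized wave the true period still wins, but with the cycle-to-cycle variation found in real voicing the margin can easily disappear, and the tracker then reports twice the real F0.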

Page 33: Automatic Pitch Tracking

Pitch Doubling

pitch is doubled

Page 34: Automatic Pitch Tracking

Microperturbations

• Another problem:

• Speech waveforms are partly shaped by the type of segment being produced.

• Pitch tracking can become erratic at the juncture of two segments.

• In particular:

• voiced to voiceless segments

• sonorants to obstruents

• These discontinuities in F0 are known as microperturbations.

• Also: transitions between modal and creaky voicing tend to be problematic.