primitive probabilistic auditory scene analysis -...

77
Primitive probabilistic auditory scene analysis Richard E. Turner ([email protected]) Computational and Biological Learning Lab, University of Cambridge Maneesh Sahani ([email protected]) Gatsby Computational Neuroscience Unit, UCL, London

Upload: nguyenkhuong

Post on 14-May-2018

230 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Primitive probabilistic auditory scene analysis

Richard E. Turner ([email protected])

Computational and Biological Learning Lab, University of Cambridge

Maneesh Sahani ([email protected])

Gatsby Computational Neuroscience Unit, UCL, London

Page 2: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Motivation

object

parts of objects

structural primitives

sensoryinput

Page 3: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Motivation

object

parts of objects

structural primitives

sensoryinput

trees

bark

edges

photoreceptoractivities

Page 4: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Motivation

object

parts of objects

structural primitives

sensoryinput

trees

bark

edges

photoreceptoractivities

bird song

motif

AM tones and noise

auditory filteractivities

Page 5: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Auditory Scene Analysis

object

parts of objects

structural primitives

sensoryinput

trees

bark

edges

photoreceptoractivities

bird song

motif

AM tones and noise

auditory filteractivities

superschema−based

grouping

schema−basedgrouping

primitivegrouping

Page 6: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generative model: Probabilistic Auditory Scene Synthesis

trees

bark

edges

photoreceptoractivities

bird song

motif

AM tones and noise

auditory filteractivities

superschema−based

grouping

schema−basedgrouping

primitivegrouping

p(source)

p(parts|source)

p(primitive|part)

p(sound|primitives)

Page 7: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Recognition model: Probabilistic Auditory Scene Analysis

bird song

motif

AM tones and noise

auditory filteractivities

superschema−based

grouping

schema−basedgrouping

primitivegrouping

p(source)

p(parts|source)

p(primitive|part)

p(sound|primitives)

p(source|sound)

p(part|sound)

p(primitive|sound)

sound waveform

Page 8: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Primitive Probabilistic Auditory Scene Analysis

bird song

motif

AM tones and noise

auditory filteractivities

superschema−based

grouping

schema−basedgrouping

primitivegrouping

p(source)

p(parts|source)

p(primitive|part)

p(sound|primitives)

p(source|sound)

p(part|sound)

p(primitive|sound)

sound waveform

Page 9: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Primitive Probabilistic Auditory Scene Analysis

Part 1: Probabilistic generative model: primitive auditory scenesynthesis

What are the important low-level statistics of natural sounds?

Part 2: Probabilistic recognition model: primitive auditoryscene analysis

Provocative computational theory: Grouping rules arise frominferences based on the statistics of natural sounds.

Page 10: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Primitive Probabilistic Auditory Scene Analysis

Part 1: Probabilistic generative model: primitive auditory scenesynthesis

What are the important low-level statistics of natural sounds?

Part 2: Probabilistic recognition model: primitive auditoryscene analysis

Provocative computational theory: Grouping rules arise frominferences based on the statistics of natural sounds.

Page 11: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Primitive Probabilistic Auditory Scene Analysis

Part 1: Probabilistic generative model: primitive auditory scenesynthesis

What are the important low-level statistics of natural sounds?

Part 2: Probabilistic recognition model: primitive auditoryscene analysis

Provocative computational theory: Grouping rules arise frominferences based on the statistics of natural sounds.

Page 12: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Heuristic Analysis: Fire sound

y

Page 13: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Heuristic Analysis: Fire sound

y

−202

4.1

KH

z

−202

2.4

KH

z

−202

1 K

Hz

0.5 0.52 0.54 0.56 0.58 0.6 0.62−2

02

time /s

0.4

KH

z

filter

Page 14: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Heuristic Analysis: Fire sound

y

−202

4.1

KH

z

−202

2.4

KH

z

−202

1 K

Hz

0.5 0.52 0.54 0.56 0.58 0.6 0.62−2

02

time /s

0.4

KH

z

filter and demodulate

Page 15: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Heuristic Analysis: Fire sound

Page 16: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Heuristic Analysis: Fire sound

Page 17: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Heuristic Analysis: Water

Page 18: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Heuristic Analysis: Rain

Page 19: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Heuristic Analysis: Speech

Page 20: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Summary

sound → filter bank → demodulate → transformed envelopes

• Important statistics include

– energy in sub-bands (power-spectrum)– patterns of co-modulation (local power-sharing)– time-scale of the modulation– depth of the modulation (sparsity)

• Josh’s texture synthesis work tomorrow

• Formulate a probabilistic model to capture these statistics:

sound ← modulate carriers ← modulators ← transformed envelopes

Structural Primitives = Co-modulated narrow-band processes

Page 21: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Summary

sound → filter bank → demodulate → transformed envelopes

• Important statistics include

– energy in sub-bands (power-spectrum)– patterns of co-modulation (local power-sharing)– time-scale of the modulation– depth of the modulation (sparsity)

• Josh’s texture synthesis work tomorrow

• Formulate a probabilistic model to capture these statistics:

sound ← modulate carriers ← modulators ← transformed envelopes

Structural Primitives = Co-modulated narrow-band processes

Page 22: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Summary

sound → filter bank → demodulate → transformed envelopes

• Important statistics include

– energy in sub-bands (power-spectrum)– patterns of co-modulation (local power-sharing)– time-scale of the modulation– depth of the modulation (sparsity)

• Josh’s texture synthesis work tomorrow

• Formulate a probabilistic model to capture these statistics:

sound ← modulate carriers ← modulators ← transformed envelopes

Structural Primitives = Co-modulated narrow-band processes

Page 23: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Model Schematic

0 0.05 0.1

0

time

y t

0 0.05 0.1

0

time0 0.05 0.1

0

time

0

a 1,t

0

a D,t

0c 1,t

0c D,t

=

...

...

+ ... +

× ×

= =

Page 24: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Model Schematic

0 0.05 0.1

0

time

y t

0 0.05 0.1

0

time0 0.05 0.1

0

time

0

a 1,t

0

a D,t

0c 1,t

0c D,t

0x 1,t

0x D,t

=

...

...

+ ... +

× ×

= =

...

Page 25: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Probabilistic Model

• Transformed envelopes: slow and +/–

p(xe,1:T |Γe,1:T,1:T ) = Norm (xe,1:T ; 0,Γe,1:T,1:T )

Γe,t−t′ = σ2e exp

(− 1

2τ2e

(t− t′)2)

• Envelopes: positive mixture of transformed envelopes with controllable sparsity

ad,t = a

(E∑e=1

gd,exe,t + µd

), e.g. a(z) = log(1 + exp(z)).

• Carriers: band-limited Gaussian noise

p(cd,t|cd,t−1:t−2, θ) = Norm

(cd,t;

2∑t′=1

λd,t′cd,t−t′, σ2d

)

yt =D∑d=1

cd,tad,t + σyεt

Page 26: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Probabilistic Model

• Transformed envelopes: slow and +/–

p(xe,1:T |Γe,1:T,1:T ) = Norm (xe,1:T ; 0,Γe,1:T,1:T )

Γe,t−t′ = σ2e exp

(− 1

2τ2e

(t− t′)2)

• Envelopes: positive mixture of transformed envelopes with controllable sparsity

ad,t = a

(E∑e=1

gd,exe,t + µd

), e.g. a(z) = log(1 + exp(z)).

• Carriers: band-limited Gaussian noise

p(cd,t|cd,t−1:t−2, θ) = Norm

(cd,t;

2∑t′=1

λd,t′cd,t−t′, σ2d

)

yt =D∑d=1

cd,tad,t + σyεt

Page 27: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Probabilistic Model

• Transformed envelopes: slow and +/–

p(xe,1:T |Γe,1:T,1:T ) = Norm (xe,1:T ; 0,Γe,1:T,1:T )

Γe,t−t′ = σ2e exp

(− 1

2τ2e

(t− t′)2)

• Envelopes: positive mixture of transformed envelopes with controllable sparsity

ad,t = a

(E∑e=1

gd,exe,t + µd

), e.g. a(z) = log(1 + exp(z)).

• Carriers: band-limited Gaussian noise

p(cd,t|cd,t−1:t−2, θ) = Norm

(cd,t;

2∑t′=1

λd,t′cd,t−t′, σ2d

)

yt =D∑d=1

cd,tad,t + σyεt

Page 28: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Probabilistic Model

• Transformed envelopes: slow and +/–

p(xe,1:T |Γe,1:T,1:T ) = Norm (xe,1:T ; 0,Γe,1:T,1:T )

Γe,t−t′ = σ2e exp

(− 1

2τ2e

(t− t′)2)

• Envelopes: positive mixture of transformed envelopes with controllable sparsity

ad,t = a

(E∑e=1

gd,exe,t + µd

), e.g. a(z) = log(1 + exp(z)).

• Carriers: band-limited Gaussian noise

p(cd,t|cd,t−1:t−2, θ) = Norm

(cd,t;

2∑t′=1

λd,t′cd,t−t′, σ2d

)

yt =D∑d=1

cd,tad,t + σyεt

Page 29: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

What do the parameters control?

λd control the centre frequency and bandwidth of each sub-band

σ2d control the power in each sub-band

τe controls time-scale of modulators

µd and σ2e control the modulation depth, skew and the sparsity

gd,e control the patterns of modulation across sub-bands

0 0.1 0.2 0.3 0.40

0.1

0.2

0.3

0.4

0.5

centre frequency

band

wid

th

Page 30: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Intuitions for role of model parameters: Generation

Page 31: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing sparsity

Page 32: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing sparsity

Page 33: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing sparsity

Page 34: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing sparsity

Page 35: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing sparsity

Page 36: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing sparsity

Page 37: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing co-modulation

Page 38: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing co-modulation

Page 39: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing co-modulation

Page 40: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing co-modulation

Page 41: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing co-modulation

Page 42: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing co-modulation

Page 43: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Increasing co-modulation

Page 44: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Decreasing time-scale

Page 45: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Decreasing time-scale

Page 46: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Decreasing time-scale

Page 47: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Decreasing time-scale

Page 48: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Decreasing time-scale

Page 49: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Decreasing time-scale

Page 50: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Generation: Decreasing time-scale

Page 51: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Inference and Learning

• Key Observation: Fix envelopes, posterior over carriers is Gaussian.

• Leads to MAP estimation of the envelopes (like HMMs, ICA, NMF)

XMAP = arg maxX

p(X|Y)

p(X|Y) =1Zp(X,Y) =

1Z

∫dCp(X,Y,C) =

1Zp(X)

∫dCp(Y|A,C)p(C)

• Compute integral efficiently using Kalman Smoothing

• Leads to gradient based optimisation for transformed amplitudes

• Learning: approximate Maximum Likelihood θ = arg maxθ p(Y|θ)

• See Turner and Sahani, ICASSP 2010.

Page 52: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Inference and Learning

• Key Observation: Fix envelopes, posterior over carriers is Gaussian.

• Leads to MAP estimation of the envelopes (like HMMs, ICA, NMF)

XMAP = arg maxX

p(X|Y)

p(X|Y) =1Zp(X,Y) =

1Z

∫dCp(X,Y,C) =

1Zp(X)

∫dCp(Y|A,C)p(C)

• Compute integral efficiently using Kalman Smoothing

• Leads to gradient based optimisation for transformed amplitudes

• Learning: approximate Maximum Likelihood θ = arg maxθ p(Y|θ)

• See Turner and Sahani, ICASSP 2010.

Page 53: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Synthetic Sounds

Stream Stream Stream Wind Wind Fire Fire Rain Twigs Speech

Orig Orig Orig Orig Orig Orig Orig Orig Orig Orig

Model Model Model Model Model Model Model Model Model Model

Page 54: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Primitive Probabilistic Auditory Scene Analysis

Part 1: Probabilistic generative model: primitive auditory scenesynthesis

What are the important low-level statistics of natural sounds?

Part 2: Probabilistic recognition model: primitive auditoryscene analysis

Provocative computational theory: Grouping rules arise frominferences based on the statistics of natural sounds.

Page 55: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Introduction to the psychophysics

• Probe model with stimuli used to study psychophysics of auditory sceneanalysis

• Fix carriers to be ‘auditory filter’ like

• Assume perceptual access to:

– instantaneous frequency of the carriers (not the phase)– modulators

How does the model ‘hear’ these stimuli?

Page 56: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Continuity Illusion

yt = a1,tc1,t + a2,tc2,t

−5

0

5

sign

al

0

100

freq

uenc

y /H

z

Page 57: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Continuity Illusion

yt = a1,tc1,t + a2,tc2,t

−5

0

5

sign

al

0

100

freq

uenc

y /H

z

Page 58: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Continuity Illusion

yt = a1,tc1,t + a2,tc2,t

−5

0

5

sign

al

0

100

freq

uenc

y /H

z

−1

0

1

ytone

0 0.5 1 1.5

−5

0

5

ynois

e

time /s

Page 59: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Common Amplitude Modulation

yt = (a1,t + a3,t)c1,t + (a2,t + a3,t)c2,t

0

sign

al

0 0.5 1 1.5 2 2.5 3 3.5 4

100

150

200

time /s

freq

uenc

y /H

z

Page 60: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Common Amplitude Modulation

yt = (a1,t + a3,t)c1,t + (a2,t + a3,t)c2,t

0

sign

al

0 0.5 1 1.5 2 2.5 3 3.5 4

100

150

200

time /s

freq

uenc

y /H

z

Page 61: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Common Amplitude Modulation

yt = (a1,t + a3,t)c1,t + (a2,t + a3,t)c2,t

0

sign

al

0 0.5 1 1.5 2 2.5 3 3.5 4

100

150

200

time /s

freq

uenc

y /H

z

Page 62: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Old Plus New Heuristic

yt = a1,t(c1,t + c2,t + c3,t) + a2,t(c2,t + c4,t) + a3,tc2,t + a4,t(c1,t + c3,t)

−2

0

2

0

200

400

freq

uenc

y /H

z

Page 63: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Old Plus New Heuristic

yt = a1,t(c1,t + c2,t + c3,t) + a2,t(c2,t + c4,t) + a3,tc2,t + a4,t(c1,t + c3,t)

−2

0

2

0

200

400

freq

uenc

y /H

z

Page 64: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Old Plus New Heuristic

yt = a1,t(c1,t + c2,t + c3,t) + a2,t(c2,t + c4,t) + a3,tc2,t + a4,t(c1,t + c3,t)

−2

0

2

0

400

freq

uenc

y /H

z

−202

a 1

−202

a 2

−101

a 3

0 0.5 1 1.5 2 2.5 3 3.5 4−2

02

a 4

time /s

Page 65: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Comodulation Masking Release

yt = a1,tc1,t + a2,tc2,t

−4−2

024

signal 1

80

100

120

freq

uenc

y /H

z

signal 2

Page 66: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Comodulation Masking Release

yt = a1,tc1,t + a2,tc2,t

−4−2

024

signal 1

80

100

120

freq

uenc

y /H

z

signal 2

Page 67: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Comodulation Masking Release

yt = a1,tc1,t + a2,tc2,t

−4−2

024

signal 1

80

100

120

freq

uenc

y /H

z

signal 2

−4−2

024

ynois

e

0 0.5 1 1.5 2−2

0

2

ytone

time /s0 0.5 1 1.5 2

−2

0

2

time /s

Page 68: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Good Continuation

yt = a1,tc1,t + a2,tc2,t

0

sign

al 1

0

50

100

freq

uenc

y /H

z

0

sign

al 2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 80

120

160

time /s

freq

uenc

y /H

z

Page 69: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Good Continuation

yt = a1,tc1,t + a2,tc2,t

0

sign

al 1

0

50

100

freq

uenc

y /H

z

0

sign

al 2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 80

120

160

time /s

freq

uenc

y /H

z

Page 70: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Good Continuation

yt = a1,tc1,t + a2,tc2,t

0

sign

al 1

0

50

100

freq

uenc

y /H

z

0

sign

al 2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5160

240

320

time /s

freq

uenc

y /H

z

Page 71: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Proximity

yt = a1,tc1,t + a2,tc2,t

−1

0

1

sign

al 1

0

100

freq

uenc

y /H

z

−1

0

1

sign

al 2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0

100

time /s

freq

uenc

y /H

z

Page 72: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Proximity

yt = a1,tc1,t + a2,tc2,t

−1

0

1

sign

al 1

0

100

freq

uenc

y /H

z

−1

0

1

sign

al 2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0

100

time /s

freq

uenc

y /H

z

Page 73: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Proximity

yt = a1,tc1,t + a2,tc2,t

−1

0

1

sign

al 1

0

100

freq

uenc

y /H

z

−1

0

1

sign

al 2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0

100

time /s

freq

uenc

y /H

z

Page 74: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Conclusions

• Developed a model for natural sounds comprising quickly varying carriers andslowly varying modulators

• Inspired by natural sound statistics

• Inference replicates many of the characteristics of auditory scene analysis

Future work

• Train on a large corpus of natural sounds: quantitatively predictpsychophysical results

• Trading between grouping principles learned from natural sounds

• Psychophysics: prediction of grouping on new stimuli

• Experimental stimulus generation: samples from the model are controlled,but natural sounding

Page 75: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

• Model extensions: Hierarchical (cascades of modulators), asymmetries, moregeneral filter-shapes

Page 76: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Additional Slides

Page 77: Primitive probabilistic auditory scene analysis - UCLturner/PublicTalks/MachineAudition2010/MacAu...Primitive probabilistic auditory scene analysis Richard E. Turner (ret26@cam.ac.uk)

Inference with fixed amplitudes, ad,t = 1

700 800 900 1000 1100 1200 1300 1400 1500 1600frequency /Hz

carr

ier

spec

tra

filte

r 3

filte

r 2

filte

r 1

5.35 5.36 5.37 5.38 5.39 5.4time /s

y

5.35 5.36 5.37 5.38 5.39 5.4time /s