2017 muellermeinard lecturemusicprocessing musicstructure

135
Music Processing Meinard Müller Lecture Music Structure Analysis International Audio Laboratories Erlangen [email protected]

Upload: others

Post on 01-Jan-2022

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Processing

Meinard Müller

Lecture

Music Structure Analysis

International Audio Laboratories [email protected]

Page 2: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Book: Fundamentals of Music Processing

Meinard MüllerFundamentals of Music ProcessingAudio, Analysis, Algorithms, Applications483 p., 249 illus., hardcoverISBN: 978-3-319-21944-8Springer, 2015

Accompanying website: www.music-processing.de

Page 3: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Book: Fundamentals of Music Processing

Meinard MüllerFundamentals of Music ProcessingAudio, Analysis, Algorithms, Applications483 p., 249 illus., hardcoverISBN: 978-3-319-21944-8Springer, 2015

Accompanying website: www.music-processing.de

Page 4: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Book: Fundamentals of Music Processing

Meinard MüllerFundamentals of Music ProcessingAudio, Analysis, Algorithms, Applications483 p., 249 illus., hardcoverISBN: 978-3-319-21944-8Springer, 2015

Accompanying website: www.music-processing.de

Page 5: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Chapter 4: Music Structure Analysis

In Chapter 4, we address a central and well-researched area within MIR knownas music structure analysis. Given a music recording, the objective is toidentify important structural elements and to temporally segment the recordingaccording to these elements. Within this scenario, we discuss fundamentalsegmentation principles based on repetitions, homogeneity, and novelty—principles that also apply to other types of multimedia beyond music. As animportant technical tool, we study in detail the concept of self-similaritymatrices and discuss their structural properties. Finally, we briefly touch thetopic of evaluation, introducing the notions of precision, recall, and F-measure.

4.1 General Principles4.2 Self-Similarity Matrices4.3 Audio Thumbnailing4.4 Novelty-Based Segmentation4.5 Evaluation4.6 Further Notes

Page 6: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Structure AnalysisExample: Zager & Evans “In The Year 2525”

Time (seconds)

Page 7: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Structure Analysis

Time (seconds)

Example: Zager & Evans “In The Year 2525”

Page 8: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Structure Analysis

V1 V2 V3 V4 V5 V6 V7 V8 OBI

Example: Zager & Evans “In The Year 2525”

Page 9: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Structure AnalysisExample: Brahms Hungarian Dance No. 5 (Ormandy)

Time (seconds)

A1 A2 A3B1 B2 B3 B4C

Page 10: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Structure Analysis

Time (seconds)

Example: Folk Song Field Recording (Nederlandse Liederenbank)

Page 11: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Example: Weber, Song (No. 4) from “Der Freischütz”

0 50 100 150 200

…...

Kleiber

Time (seconds)

.. ....

Music Structure Analysis

0 50 100 150 200

Introduction Stanzas Dialogues

20 40 60 80 100 12020 40 60 80 100 120

Ackermann

Time (seconds)

Page 12: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Structure Analysis

Stanzas of a folk song

Intro, verse, chorus, bridge, outro sections of a pop song

Exposition, development, recapitulation, coda of a sonata

Musical form ABACADA … of a rondo

General goal: Divide an audio recording into temporal segments corresponding to musical parts and group these segments into musically meaningful categories.

Examples:

Page 13: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Structure Analysis

Homogeneity:

Novelty:

Repetition:

General goal: Divide an audio recording into temporal segments corresponding to musical parts and group these segments into musically meaningful categories.

Challenge: There are many different principles for creating relationships that form the basis for the musical structure.

Consistency in tempo, instrumentation, key, …

Sudden changes, surprising elements …

Repeating themes, motives, rhythmic patterns,…

Page 14: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Music Structure Analysis

Novelty Homogeneity Repetition

Page 15: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Overview

Introduction

Feature Representations

Self-Similarity Matrices

Audio Thumbnailing

Novelty-based Segmentation

Thanks:

Clausen, Ewert, Kurth, Grohganz, …

Dannenberg, Goto Grosche, Jiang Paulus, Klapuri Peeters, Kaiser, … Serra, Gómez, … Smith, Fujinaga, … Wiering, … Wand, Sunkel,

Jansen …

Page 16: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Overview

Introduction

Feature Representations

Self-Similarity Matrices

Audio Thumbnailing

Novelty-based Segmentation

Thanks:

Clausen, Ewert, Kurth, Grohganz, …

Dannenberg, Goto Grosche, Jiang Paulus, Klapuri Peeters, Kaiser, … Serra, Gómez, … Smith, Fujinaga, … Wiering, … Wand, Sunkel,

Jansen …

Page 17: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

General goal: Convert an audio recording into a mid-level representation that captures certain musical properties while supressing other properties.

Timbre / Instrumentation

Tempo / Rhythm

Pitch / Harmony

Page 18: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

General goal: Convert an audio recording into a mid-level representation that captures certain musical properties while supressing other properties.

Timbre / Instrumentation

Tempo / Rhythm

Pitch / Harmony

Page 19: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Waveform

Time (seconds)

Am

plitu

de

Page 20: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature RepresentationFr

eque

ncy

(Hz)

Inte

nsity

(dB

)

Inte

nsity

(dB

)

Freq

uenc

y (H

z)

Time (seconds)

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Spectrogram

Page 21: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature RepresentationFr

eque

ncy

(Hz)

Inte

nsity

(dB

)

Inte

nsity

(dB

)

Freq

uenc

y (H

z)

Time (seconds)

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Spectrogram

Page 22: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

C4: 261 HzC5: 523 Hz

C6: 1046 Hz

C7: 2093 Hz

C8: 4186 Hz

C3: 131 Hz

Inte

nsity

(dB

)

Time (seconds)

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Spectrogram

Page 23: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

C4: 261 Hz

C5: 523 Hz

C6: 1046 Hz

C7: 2093 Hz

C8: 4186 Hz

C3: 131 Hz Inte

nsity

(dB

)

Time (seconds)

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Log-frequency spectrogram

Page 24: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

Pitc

h (M

IDI n

ote

num

ber)

Inte

nsity

(dB

)

Time (seconds)

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Log-frequency spectrogram

Page 25: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

Chroma C

Inte

nsity

(dB

)

Pitc

h (M

IDI n

ote

num

ber)

Time (seconds)

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Log-frequency spectrogram

Page 26: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

Chroma C#

Inte

nsity

(dB

)

Pitc

h (M

IDI n

ote

num

ber)

Time (seconds)

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Log-frequency spectrogram

Page 27: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

C124

C236

C348

C460

C572

C684

C796

C8108

Example: Chromatic scale

Chroma representation

Inte

nsity

(dB

)

Time (seconds)

Chr

oma

Page 28: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature RepresentationExample: Brahms Hungarian Dance No. 5 (Ormandy)

Time (seconds)

A1 A2 A3B1 B2 B3 B4C

Page 29: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

A1 A2 A3B1 B2 B3 B4C

Feature extractionChroma (Harmony)

Example: Brahms Hungarian Dance No. 5 (Ormandy)

Time (seconds)

Page 30: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

A1 A2 A3B1 B2 B3 B4C

Feature extractionChroma (Harmony)

Example: Brahms Hungarian Dance No. 5 (Ormandy)

G minor G minor

D

GBb

Time (seconds)

Page 31: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Feature Representation

A1 A2 A3B1 B2 B3 B4C

Feature extractionChroma (Harmony)

Example: Brahms Hungarian Dance No. 5 (Ormandy)

G minor G major G minor

D

GBb

D

GB

Time (seconds)

Page 32: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Overview

Introduction

Feature Representations

Self-Similarity Matrices

Audio Thumbnailing

Novelty-based Segmentation

Page 33: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)

General idea: Compare each element of the feature sequence with each other element of the feature sequence based on a suitable similarity measure.

→ Quadratic self-similarity matrix

Page 34: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 35: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 36: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 37: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 38: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 39: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 40: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

G major

G m

ajor

Page 41: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 42: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 43: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 44: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 45: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Slower

Fast

er

Page 46: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Fast

er

Slower

Page 47: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Idealized SSM

Page 48: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Self-Similarity Matrix (SSM)Example: Brahms Hungarian Dance No. 5 (Ormandy)

Idealized SSM

Blocks: Homogeneity

Paths: Repetition

Corners: Novelty

Page 49: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Feature smoothing Coarsening

Time (samples)

Tim

e (s

ampl

es)

Block Enhancement

Page 50: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Block Enhancement

Feature smoothing Coarsening

Time (samples)

Tim

e (s

ampl

es)

Page 51: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Feature smoothing Coarsening

Time (samples)

Tim

e (s

ampl

es)

Block Enhancement

Page 52: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementChallenge: Presence of musical variations

Idea: Enhancement of path structure

Fragmented paths and gaps

Paths of poor quality

Regions of constant (high) similarity

Curved paths

Page 53: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementShostakovich Waltz 2, Jazz Suite No. 2 (Chailly)

SSM

Page 54: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementShostakovich Waltz 2, Jazz Suite No. 2 (Chailly)

SSM

Page 55: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementShostakovich Waltz 2, Jazz Suite No. 2 (Chailly)

SSM

Page 56: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementShostakovich Waltz 2, Jazz Suite No. 2 (Chailly)

Enhanced SSMFiltering along main diagonal

Page 57: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Idea: Usage of contextual information (Foote 1999)

smoothing effect

Comparison of entire sequences = length of sequences = enhanced SSM

Page 58: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

SSM

Page 59: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Filtering along main diagonalEnhanced SSM with

Page 60: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Filtering along 8 different directions and minimizingEnhanced SSM with

Page 61: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Idea: Smoothing along various directionsand minimizing over all directions

Tempo changes of -50 to +50 percent

Page 62: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Time (samples)

Tim

e (s

ampl

es)

Path Enhancement

Page 63: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Time (samples)

Tim

e (s

ampl

es)

Path Enhancement

Diagonal smoothing

Page 64: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Time (samples)

Tim

e (s

ampl

es)

Path Enhancement

Diagonal smoothing Multiple filtering

Page 65: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Time (samples)

Tim

e (s

ampl

es)

Path Enhancement

Diagonal smoothing Multiple filtering Thresholding (relative) Scaling & penalty

Page 66: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Time (samples)

Tim

e (s

ampl

es)

Further Processing

Path extraction

Page 67: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Time (samples)

Tim

e (s

ampl

es)

Further Processing

Path extraction Pairwise relations

100 200 300 400

1

Time (samples)

234567

Page 68: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

Time (samples)

Tim

e (s

ampl

es)

Further Processing

Path extraction Pairwise relations Grouping (transitivity)

100 200 300 400

1

Time (samples)

234567

Page 69: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

100 200 300 400Time (samples)

SSM Enhancement

Time (samples)

Tim

e (s

ampl

es)

Further Processing

Path extraction Pairwise relations Grouping (transitivity)

100 200 300 400

1

Time (samples)

234567

Page 70: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM Enhancement

V1 V2 V3 V4 V5 V6 V7 V8 OBI

Example: Zager & Evans “In The Year 2525”

Page 71: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementExample: Zager & Evans “In The Year 2525”

Page 72: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementExample: Zager & Evans “In The Year 2525”Missing relations because of transposed sections

Page 73: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementExample: Zager & Evans “In The Year 2525”Idea: Cyclic shift of one of the chroma sequences

One semitone up

Page 74: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementExample: Zager & Evans “In The Year 2525”Idea: Cyclic shift of one of the chroma sequences

Two semitones up

Page 75: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementExample: Zager & Evans “In The Year 2525”Idea: Overlay Transposition-invariant SSM& Maximize

Page 76: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

SSM EnhancementExample: Zager & Evans “In The Year 2525”Note: Order of enhancement steps important!

Maximization Smoothing & Maximization

Page 77: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Similarity Matrix Toolbox

Meinard Müller, Nanzhu Jiang, Harald GrohganzSM Toolbox: MATLAB Implementations for Computing and Enhancing Similarity Matrices

http://www.audiolabs-erlangen.de/resources/MIR/SMtoolbox/

Page 78: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Overview

Introduction

Feature Representations

Self-Similarity Matrices

Audio Thumbnailing

Novelty-based Segmentation

Thanks:

Jiang, Grosche Peeters Cooper, Foote Goto Levy, Sandler Mauch Sapp

Page 79: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Audio Thumbnailing

A1 A2 A3B1 B2 B3 B4C

Example: Brahms Hungarian Dance No. 5 (Ormandy)

General goal: Determine the most representative section(“Thumbnail”) of a given music recording.

V1 V2 V3 V4 V5 V6 V7 V8 OBI

Example: Zager & Evans “In The Year 2525”

Thumbnail is often assumed to be the most repetitive segment

Page 80: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Audio Thumbnailing

Two steps Paths of poor quality (fragmented, gaps) Block-like structures Curved paths

1. Path extraction

2. Grouping Noisy relations(missing, distorted, overlapping)

Transitivity computation difficult

Both steps are problematic!

Main idea: Do both, path extraction and grouping, jointly

One optimization scheme for both steps Stabilizing effect Efficient

Page 81: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Audio Thumbnailing

Main idea: Do both path extraction and grouping jointly

For each audio segment we define a fitness value

This fitness value expresses “how well” the segmentexplains the entire audio recording

The segment with the highest fitness value isconsidered to be the thumbnail

As main technical concept we introduce the notion of a path family

Page 82: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

0 50 100 150 2000

20

40

60

80

100

120

140

160

180

200

−2

−1.5

−1

−0.5

0

0.5

1

Fitness Measure

Enhanced SSM

Page 83: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Consider a fixed segment

Path over segment

Page 84: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Consider a fixed segment

Path over segment Induced segment Score is high

Path over segment

Page 85: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Path over segment

Consider a fixed segment

Path over segment Induced segment Score is high

A second path over segment Induced segment Score is not so high

Page 86: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Path over segment

Consider a fixed segment

Path over segment Induced segment Score is high

A second path over segment Induced segment Score is not so high

A third path over segment Induced segment Score is very low

Page 87: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Path family

Consider a fixed segment

A path family over a segmentis a family of paths such thatthe induced segments do not overlap.

Page 88: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Path family

This is not a path family!

Consider a fixed segment

A path family over a segmentis a family of paths such thatthe induced segments do not overlap.

Page 89: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Path family

This is a path family!

Consider a fixed segment

A path family over a segmentis a family of paths such thatthe induced segments do not overlap.

(Even though not a good one)

Page 90: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Optimal path family

Consider a fixed segment

Page 91: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Optimal path family

Consider a fixed segment

Consider over the segmentthe optimal path family,i.e., the path family havingmaximal overall score.

Call this value:Score(segment)

Note: This optimal path family can be computedusing dynamic programming.

Page 92: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Optimal path family

Consider a fixed segment

Consider over the segmentthe optimal path family,i.e., the path family havingmaximal overall score.

Call this value:Score(segment)

Furthermore consider theamount covered by theinduced segments.

Call this value:Coverage(segment)

Page 93: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Fitness

Consider a fixed segment

P := R :=

Score(segment)Coverage(segment)

Page 94: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Fitness

Consider a fixed segment

Self-explanation are trivial!

P := R :=

Score(segment)Coverage(segment)

Page 95: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Fitness

Consider a fixed segment

Self-explanation are trivial!

Subtract length of segment

P := R :=

Score(segment)Coverage(segment)

- length(segment) - length(segment)

Page 96: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Normalize( )

Fitness Measure

Fitness

Consider a fixed segment

Self-explanation are trivial!

Subtract length of segment

Normalization

P := R :=

Score(segment)Coverage(segment)

- length(segment)- length(segment)

]1,0[]1,0[

Normalize( )

Page 97: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness Measure

Fitness

Consider a fixed segment

F := 2 • P • R / (P + R)Fitness(segment)

Normalize( ) Normalize( )

P := R :=

Score(segment)Coverage(segment)

- length(segment)- length(segment)

]1,0[]1,0[

Page 98: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Thumbnail

Segment center

Seg

men

t len

gth

Fitness Scape Plot

Segment length

Segment center

Fitness

Page 99: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Thumbnail

Segment center

Fitness Scape Plot

Fitness(segment)

Segment length

Segment center

Fitness

Seg

men

t len

gth

Page 100: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Thumbnail

Segment center

Fitness Scape PlotFitness

Seg

men

t len

gth

Page 101: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Thumbnail

Segment center

Fitness Scape Plot

Note: Self-explanations are ignored → fitness is zero

Fitness

Seg

men

t len

gth

Page 102: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Thumbnail

Segment center

Fitness Scape Plot

Thumbnail := segment having the highest fitness

Fitness

Seg

men

t len

gth

Page 103: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

ThumbnailFitness Scape Plot

Example: Brahms Hungarian Dance No. 5 (Ormandy)

Fitness

A1 A2 A3B1 B2 B3 B4C

Page 104: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness

ThumbnailFitness Scape Plot

Example: Brahms Hungarian Dance No. 5 (Ormandy)

A1 A2 A3B1 B2 B3 B4C

Page 105: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness

ThumbnailFitness Scape Plot

Example: Brahms Hungarian Dance No. 5 (Ormandy)

A1 A2 A3B1 B2 B3 B4C

Page 106: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness

ThumbnailFitness Scape Plot

Example: Brahms Hungarian Dance No. 5 (Ormandy)

A1 A2 A3B1 B2 B3 B4C

Page 107: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Scape Plot

Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 108: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Scape Plot

Coloring accordingto clustering result(grouping)

Example: Brahms Hungarian Dance No. 5 (Ormandy)

Page 109: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Scape Plot

Example: Brahms Hungarian Dance No. 5 (Ormandy)

Coloring accordingto clustering result(grouping)

A1 A2 A3B1 B2 B3 B4C

Page 110: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

ThumbnailFitness Scape Plot

Example: Zager & Evans “In The Year 2525”

Fitness

V1 V2 V3 V4 V5 V6 V7 V8 OBI

Page 111: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Fitness

ThumbnailFitness Scape Plot

Example: Zager & Evans “In The Year 2525”

V1 V2 V3 V4 V5 V6 V7 V8 OBI

Page 112: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Overview

Introduction

Feature Representations

Self-Similarity Matrices

Audio Thumbnailing

Novelty-based Segmentation

Thanks:

Foote Serra, Grosche, Arcos Goto Tzanetakis, Cook

Page 113: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Find instances where musicalchanges occur.

Find transition between subsequent musical parts.

General goals: Idea (Foote):

Use checkerboard-like kernelfunction to detect corner pointson main diagonal of SSM.

Page 114: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Idea (Foote):

Use checkerboard-like kernelfunction to detect corner pointson main diagonal of SSM.

Page 115: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Idea (Foote):

Use checkerboard-like kernelfunction to detect corner pointson main diagonal of SSM.

Page 116: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Idea (Foote):

Use checkerboard-like kernelfunction to detect corner pointson main diagonal of SSM.

Page 117: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Idea (Foote):

Use checkerboard-like kernelfunction to detect corner pointson main diagonal of SSM.

Page 118: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Idea (Foote):

Use checkerboard-like kernelfunction to detect corner pointson main diagonal of SSM.

Novelty function using

Page 119: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Idea (Foote):

Use checkerboard-like kernelfunction to detect corner pointson main diagonal of SSM.

Novelty function using

Novelty function using

Page 120: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Idea: Find instances where

structural changes occur.

Combine global and localaspects within a unifying framework

Structure features

Page 121: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Enhanced SSM

Structure features

Page 122: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Enhanced SSM Time-lag SSM

Structure features

Page 123: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Enhanced SSM Time-lag SSM Cyclic time-lag SSM

Structure features

Page 124: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Enhanced SSM Time-lag SSM Cyclic time-lag SSM Columns as features

Structure features

Page 125: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based SegmentationExample: Chopin Mazurka Op. 24, No. 1

SSM

Time-lag SSM

Page 126: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based SegmentationExample: Chopin Mazurka Op. 24, No. 1

SSM

Time-lag SSM

Page 127: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based SegmentationExample: Chopin Mazurka Op. 24, No. 1

SSM

Time-lag SSM

Page 128: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Novelty-based Segmentation

Structure-based novelty function

Example: Chopin Mazurka Op. 24, No. 1

SSM

Time-lag SSM

Page 129: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Structure Analysis

Conclusions

Page 130: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Representations

Structure Analysis

AudioMIDIScore

Conclusions

Page 131: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Representations

Musical Aspects

Structure Analysis

TimbreTempoHarmony

AudioMIDIScore

Conclusions

Page 132: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Representations

Segmentation Principles

Musical Aspects

Structure Analysis

HomogeneityNoveltyRepetition

TimbreTempoHarmony

AudioMIDIScore

Conclusions

Page 133: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Temporal and Hierarchical Context

Representations

Segmentation Principles

Musical Aspects

Structure Analysis

HomogeneityNoveltyRepetition

TimbreTempoHarmony

AudioMIDIScore

Conclusions

Page 134: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Conclusions

Combined Approaches

Hierarchical Approaches

Evaluation

Explaining Structure

MIREX SALAMI-Project

Smith, Chew

Page 135: 2017 MuellerMeinard LectureMusicProcessing MusicStructure

Links SM Toolbox (MATLAB)

http://www.audiolabs-erlangen.de/resources/MIR/SMtoolbox/

MSAF: Music Structure Analysis Framework (Python)https://github.com/urinieto/msaf

SALAMI Annotation Datahttp://ddmal.music.mcgill.ca/research/salami/annotations

LibROSA (Python)https://librosa.github.io/librosa/

Evaluation: mir_eval (Python)https://craffel.github.io/mir_eval/

Deep Learning: Boundary DetectionJan Schlüter (PhD thesis)