Educational Software using
Audio to Score Alignment
Antoine Gomas supervised by
Dr. Tim Collins & Pr. Corinne Mailhes
7th of September, 2007
2
Agenda
Introduction
  Objectives
  Review & Innovation
Work
  Dynamic Time Warping
  Hidden Markov Models
  Interface
Conclusion
4
Project objectives
Implement a monophonic audio to score alignment algorithm
Evaluate characteristics of the performance
Design a learning interface to help music students improve their performance
5
Review (1)
Previous work
  Algorithms already exist
  Similar to Spoken Language Processing
  Application: musicology
  Professional recordings
6
Review (2)
Previous work (continued)
  Dynamic Time Warping
    Few parameters
    Heavy
    Low flexibility
  Hidden Markov Models
    Very flexible
    Large number of parameters (training)
7
Review (3)
Innovation
  Apply to educational software
  Requires modifications & new functionalities
    Cope with errors
    Detect errors
10
DTW (2) Structure
Feature extraction
Distance matrix
Find optimal path
[Diagram: Signal -> Feature vectors; Score -> Instrument model -> Feature vectors; both sets of feature vectors -> DTW -> Aligned sequence]
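The three steps above (feature extraction, distance matrix, optimal path) can be sketched as follows. This is a minimal illustration and not the project's implementation: it assumes each frame and each score event has already been reduced to a single scalar feature (here a pitch in semitones), and the name `dtw_align` is hypothetical.

```python
import math

def dtw_align(perf, score):
    """Align two sequences of scalar features with dynamic time warping.

    perf and score are non-empty lists of numbers (assumed: one pitch value
    per frame / per score event). Returns (total_cost, path), where path is
    a list of (perf_index, score_index) pairs.
    """
    n, m = len(perf), len(score)
    # Distance matrix: absolute difference between features.
    dist = [[abs(p - s) for s in score] for p in perf]

    # Accumulated cost, padded with an infinite border row and column.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = dist[i - 1][j - 1] + best_prev

    # Backtrack from the last cell to recover the optimal path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(acc[i - 1][j - 1], i - 1, j - 1),
                      (acc[i - 1][j], i - 1, j),
                      (acc[i][j - 1], i, j - 1)]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n][m], path

cost, path = dtw_align([60, 60, 62, 62, 62, 64], [60, 62, 64])
```

In this toy case the three score notes are stretched over six frames at zero cost, which is the behaviour behind DTW's tolerance to rhythm errors when the pitches are correct.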
12
DTW (4) Results
~95% notes aligned on “good” performances
Rhythm errors
  Very high tolerance
  Provided pitches are correct
Pitch errors
  Tuning errors: no problem
  Note errors: OK
Good results, but limitations
13
DTW (5) Limitations
Impossible to recover from severe student mistakes
Self-correction not perfect
14
HMM (1) Why?
Expected
  Lower computing requirements
  Flexibility to recover from student’s errors
And also
  Use state-of-the-art techniques
  Find connections with SLP
15
HMM (2) Application to ASA
HMM                 ASA
Observed symbols    Recording frames
State trellis       Score representation
Emission matrix     Instrument model
Decoded sequence    Performance image
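The mapping above can be illustrated with a small Viterbi decoder. This is only a sketch under stated assumptions: one state per score note visited strictly left to right, one pitch estimate per recording frame, and a Gaussian likelihood standing in for the instrument model. The function name `viterbi_align` and the parameters `stay_prob` and `sigma` are hypothetical, not taken from the project.

```python
import math

def viterbi_align(frames, score_pitches, stay_prob=0.7, sigma=0.5):
    """Return the most likely score-note index for each recording frame.

    Assumes at least as many frames as notes, a start on the first note
    and an end on the last note.
    """
    n_states = len(score_pitches)
    log_stay = math.log(stay_prob)
    log_next = math.log(1.0 - stay_prob)

    def log_emit(state, obs):
        # Stand-in instrument model: unnormalised Gaussian log-likelihood.
        return -0.5 * ((obs - score_pitches[state]) / sigma) ** 2

    delta = [-math.inf] * n_states
    delta[0] = log_emit(0, frames[0])
    backptr = []

    for obs in frames[1:]:
        new_delta = [-math.inf] * n_states
        ptr = [0] * n_states
        for s in range(n_states):
            stay = delta[s] + log_stay
            move = delta[s - 1] + log_next if s > 0 else -math.inf
            if move > stay:
                new_delta[s], ptr[s] = move + log_emit(s, obs), s - 1
            else:
                new_delta[s], ptr[s] = stay + log_emit(s, obs), s
        delta = new_delta
        backptr.append(ptr)

    # Backtrack from the last note to obtain the decoded sequence.
    state = n_states - 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

decoded = viterbi_align([60.1, 59.9, 62.2, 61.8, 64.0], [60, 62, 64])
```

The decoded sequence plays the role of the "performance image": it says which score note each frame was attributed to, from which note onsets and durations can be read off.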
16
HMM (3) Flexibility
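[Diagrams: three note-level state topologies. Each state is a note labelled with a pair (D, P), e.g. Note 1 (D1, P1).
  1. Main path Note 1 to Note 5 with transitions p12, p23, 1, 1, plus an extra state Note 6 (D6, P6) with transitions 1-p12 and 1.
  2. Main path Note 1 to Note 5 with transitions p12, p23, 1, 1, plus alternative states Note 7 (D’3, P3) and Note 8 (D’4, P4) with transitions 1-p23, 1, 1.
  3. Alternative state Note 6 (D2, P’2) with transitions 1-p12, p63, 1-p63 and 1-p23.]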
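The idea of adding a parallel state for an anticipated error can be sketched with a transition table. This is an illustrative assumption about the topology, not a reconstruction of the slide's exact diagrams: the state numbers, the probability 0.9 and the helper names `linear_chain`, `add_alternative` and `paths` are all hypothetical.

```python
def linear_chain(n_notes):
    """Strict note-to-note chain: state i goes to i + 1 with probability 1."""
    return {i: {i + 1: 1.0} for i in range(1, n_notes)}

def add_alternative(transitions, before, after, alt_state, p_expected):
    """Add a parallel state between `before` and `after`.

    The expected successor of `before` keeps probability p_expected; the
    alternative state receives 1 - p_expected and rejoins the chain at `after`.
    Assumes `before` currently has exactly one successor.
    """
    expected = next(iter(transitions[before]))
    transitions[before] = {expected: p_expected, alt_state: 1.0 - p_expected}
    transitions[alt_state] = {after: 1.0}
    return transitions

def paths(transitions, start, end):
    """Enumerate every state path from start to end with its probability."""
    if start == end:
        return [([end], 1.0)]
    found = []
    for nxt, p in transitions.get(start, {}).items():
        for tail, q in paths(transitions, nxt, end):
            found.append(([start] + tail, p * q))
    return found

# Five-note chain where state 6 replaces note 2 (e.g. a wrong-pitch variant).
chain = add_alternative(linear_chain(5), before=1, after=3, alt_state=6,
                        p_expected=0.9)
for states, prob in paths(chain, 1, 5):
    print(states, round(prob, 3))
```

Every state's outgoing probabilities still sum to 1, so the decoder can follow either the expected note or the anticipated error without any other change.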
17
HMM (4) Results
100% on rhythmic recordings
Good on melodic recordings
Rhythm errors
  Good tolerance, though inferior to DTW
Pitch errors
  No data
Severe mistakes
  Fine when anticipated
Self correction
  More robust than DTW
Tempo estimation not critical
18
HMM (5) Extensions
Pitch
Other note topologies
Improve speed
  Local algorithm
  Language
Waiting state
19
ITS & Interface (1)
Intelligent Tutoring Systems
Knowledge models
  Domain model (DM)
  Learner model (LM)
Open Learner Model
[Diagrams: DM, LM and teaching strategies, shown for the overlay and perturbation learner models]
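The two learner-model styles named above can be sketched as a small data structure. In an overlay model the learner's knowledge is treated as a subset of the domain model; a perturbation model additionally records misconceptions that lie outside it. The class name, field names and skill labels below are hypothetical and only illustrate the distinction.

```python
from dataclasses import dataclass, field

@dataclass
class LearnerModel:
    domain: frozenset                                  # skills in the domain model (DM)
    mastered: set = field(default_factory=set)         # overlay part: subset of DM
    misconceptions: set = field(default_factory=set)   # perturbation part: outside DM

    def record_success(self, skill):
        # Overlay entries must belong to the domain model.
        if skill not in self.domain:
            raise ValueError(f"unknown skill: {skill}")
        self.mastered.add(skill)

    def record_error(self, error):
        # Misconceptions are not part of the domain model, so no check here.
        self.misconceptions.add(error)

    def missing(self):
        """Domain skills not yet mastered by the learner."""
        return set(self.domain) - self.mastered

lm = LearnerModel(frozenset({"pitch", "rhythm", "tempo"}))
lm.record_success("pitch")
lm.record_error("rushes long notes")
```

A teaching strategy could then choose exercises from `lm.missing()` and feedback from `lm.misconceptions`.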
21
Conclusion
DTW not suitable for education
Promising HMM results
  Works without pitch
  Additional paths for anticipated errors
Still room for improvements
  Pitch
  Computation efficiency
Coherent ground together with IF design