TRANSCRIPT
Emotionally-Controlled Music Synthesis
António Pedro Oliveira, Amílcar Cardoso
University of Coimbra, Portugal
12/12/2008
Outline
- Introduction
- Computational Model
- Feature Extraction
- Regression Models
- Conclusion
Introduction
Music is accepted as a language of emotional expression.
To control this expression automatically, we are developing a computational model that establishes relations between emotions and musical features.
Emotions are defined in 2 dimensions:
- Valence: degree of happiness (from very sad to very happy music)
- Arousal: degree of activation (from very relaxing to very activating music)
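As a minimal illustration of this two-dimensional representation, an emotional target could be expressed as a point in the valence-arousal plane. The class and value ranges below are hypothetical, not taken from the slides:

```python
from dataclasses import dataclass

@dataclass
class EmotionTarget:
    """A point in the 2-D affective plane (hypothetical representation)."""
    valence: float  # degree of happiness: -1.0 (very sad) .. 1.0 (very happy)
    arousal: float  # degree of activation: -1.0 (very relaxing) .. 1.0 (very activating)

# e.g. music that should sound happy but calm
target = EmotionTarget(valence=0.8, arousal=-0.4)
```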
Computational Model
- Use a database of MIDI music labelled with symbolic and audio features
- Model relations between emotions and music features with regression models
- Use these models to control the affective content of synthesized music
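Read as a pipeline, these three steps could be sketched as follows. Every name and signature here is hypothetical, for illustration only, not the authors' implementation:

```python
from typing import Callable, Dict, List, Tuple

FeatureVector = Dict[str, float]                    # e.g. {"loudness": 0.7, "tempo": 0.5}
LabelledPiece = Tuple[FeatureVector, float, float]  # (features, valence, arousal)

def train_regressors(database: List[LabelledPiece], fit: Callable):
    """Step 2: fit one regression model per affective dimension."""
    X = [features for features, _, _ in database]
    valence_model = fit(X, [v for _, v, _ in database])
    arousal_model = fit(X, [a for _, _, a in database])
    return valence_model, arousal_model

def select_features(target_valence: float, target_arousal: float,
                    candidates: List[FeatureVector], valence_model, arousal_model):
    """Step 3 (hypothetical strategy): pick the candidate feature set whose
    predicted emotion is closest to the target, to drive synthesis."""
    def distance(f: FeatureVector) -> float:
        return (abs(valence_model.predict(f) - target_valence)
                + abs(arousal_model.predict(f) - target_arousal))
    return min(candidates, key=distance)
```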
Computational Model – Experiments
- 96 MIDI pieces of film music, each lasting between 20 and 90 seconds
- 80 listeners labelled each affective dimension online with integer values between 0 and 10
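One plausible way to aggregate the 0-10 listener ratings into a single label per piece is shown below; the averaging and rescaling scheme is an assumption, not stated on the slide, and the rating values are synthetic:

```python
import statistics

# Hypothetical ratings for one piece (80 listeners in the study; truncated here).
valence_ratings = [7, 8, 6, 9, 7]   # synthetic example, not real data
arousal_ratings = [3, 2, 4, 3, 2]

# One common choice is the mean rating, rescaled from 0..10 to [0, 1].
valence_label = statistics.mean(valence_ratings) / 10.0
arousal_label = statistics.mean(arousal_ratings) / 10.0
print(valence_label, arousal_label)  # 0.74 0.28
```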
Feature Extraction
Build a music database of MIDI pieces labelled with symbolic and audio features
Feature Extraction – Correlation between audio features and valence
- Sharpness – ratio of high to bass frequencies
- Loudness – total energy
- Flatness – spectral distribution of energy
- Dissonance – perceptual interference of sinusoids
Feature Extraction – Correlation between audio features and arousal
- Similarity – temporal spectral correlation of energy distribution by frequency bands
- Dissonance – perceptual interference of sinusoids
- Sharpness – ratio of high to bass frequencies
- Energy – total energy
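As a rough illustration of how two of the audio features listed above might be computed (loudness as total spectral energy, sharpness as a high-to-low band energy ratio), here is a NumPy sketch. The 1.5 kHz band split and the exact formulas are assumptions, not the authors' definitions:

```python
import numpy as np

def loudness_and_sharpness(signal: np.ndarray, sr: int, split_hz: float = 1500.0):
    """Crude spectral loudness/sharpness estimates (illustrative formulas only)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2          # power spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    loudness = spectrum.sum()                            # "total energy"
    low = spectrum[freqs < split_hz].sum()
    high = spectrum[freqs >= split_hz].sum()
    sharpness = high / (low + 1e-12)                     # high/bass energy ratio
    return loudness, sharpness

# Usage: one second of a 440 Hz tone has almost no energy above 1.5 kHz,
# so its sharpness estimate is near zero.
sr = 22050
t = np.arange(sr) / sr
print(loudness_and_sharpness(np.sin(2 * np.pi * 440 * t), sr))
```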
Feature Extraction – Correlation between audio and symbolic features
Bridge the gap between the audio and symbolic domains:
- Spectral similarity vs. note duration, inter-onset interval
- Spectral dissonance vs. prevalence of percussion instruments
Regression Models
- Establish weighted relations between emotions and musical features
- Use non-linear regression models
- Model with both symbolic and audio features
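The slides do not name a specific regressor. As one hedged example, a non-linear model such as scikit-learn's SVR with an RBF kernel could map feature vectors to a valence score; the data below is synthetic, for illustration only:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((96, 8))   # 96 pieces x 8 musical features (synthetic)
y = rng.random(96)        # valence labels in [0, 1] (synthetic)

# RBF-kernel SVR is one common non-linear regression choice; the authors'
# actual model family is not specified on this slide.
model = SVR(kernel="rbf", C=1.0).fit(X, y)
predicted_valence = model.predict(X[:1])
```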
Regression Models – Correlation between models and valence
- Best hybrid (audio and symbolic features) non-linear regression model – 84%
- Best symbolic linear regression model – 75%
- Best audio non-linear regression model – 61%
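These percentages read as correlation coefficients between model predictions and listener labels. Such a score could be computed as below, assuming Pearson correlation, which the slide does not state explicitly:

```python
import numpy as np

def correlation_pct(predicted: np.ndarray, labelled: np.ndarray) -> float:
    """Pearson correlation between model output and listener labels, as a %."""
    r = np.corrcoef(predicted, labelled)[0, 1]
    return 100.0 * r

# e.g. correlation_pct(model.predict(X_test), y_test) -> 84.0 for the best
# hybrid valence model (value from the slide, not computed here).
```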
Regression Models – Best audio and symbolic features for valence
Regression Models – Correlation between models and arousal
- Best hybrid (audio and symbolic features) non-linear regression model – 90%
- Best symbolic linear regression model – 84%
- Best audio non-linear regression model – 75%
Regression Models – Best audio and symbolic features for arousal
Conclusion
- Hybrid non-linear regression models outperformed symbolic linear regression models
- Non-linear models seem more appropriate than linear models
- Using features from both the audio and symbolic domains is more appropriate than using features from only one domain
- Timbre/sound can be used to control/influence the emotional expression