Audio-based Emotion Recognition for Advanced Information Retrieval in Judicial Domain
ICT4JUSTICE 2008 – Thessaloniki, October 24
G. Arosio, E. Fersini, E. Messina, F. Archetti
Dipartimento di Informatica, Sistemistica e Comunicazione
Università degli Studi di Milano-Bicocca
Affective Computing
Learning the emotional state of a human being
Learning from:
Vocal signals
Facial expressions
Biometric signals
Multimodal sources
Applications
Games (personal robots)
Call centers
Automotive
JUMAS: Emotion Recognition in Judicial Domain for Semantic Retrieval
JUMAS Project
Current Scenario
Audio Stream, Video Stream
Analog / Digital Acquisition
Automatic Recording
Audio&Video Document
Manual Transcription
Manual Information Extraction
Manual Retrieval
Future Scenario
Audio&Video Stream
Digital Acquisition
Audio&Video Document
Automatic Audio Transcription
Automatic Audio&Video Annotation
Automatic Information Extraction
Automatic Semantic Retrieval
Emotion Annotation
PRESIDENT: Is anyone here for XXXXXX? Is no one here for XXXXXX? Because according to my records she was defended by attorney YYYYYY.
PRESIDENT: So XXXXXX is defended by attorney YYYYYY and by attorney ZZZZZZ.
PROSECUTOR: May we call Mr. KKKKKK to testify?
PRESIDENT: The prosecution calls KKKKKK...
PROSECUTOR: Mr. KKKKKK, do you confirm that you heard Ms. XXXXXX making arrangements for a transfer of funds abroad?
WITNESS: No... uhm... actually I have never personally met Ms. XXXXXX.
Emotion Recognition
Output: XML Searchable Tags
Emotion tags attached to the utterances: Anger, Neutral, Fear
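A minimal sketch of what searchable emotion tags could look like, using Python's standard `xml.etree.ElementTree`. The element and attribute names (`hearing`, `utterance`, `speaker`, `emotion`) are assumptions made for illustration, not the JUMAS schema, and the emotion labels on the sample utterances are illustrative because the slide does not preserve which tag belongs to which utterance.

```python
import xml.etree.ElementTree as ET


def annotate(utterances):
    """Build an XML tree from (speaker, emotion, text) triples."""
    root = ET.Element("hearing")  # hypothetical root element name
    for speaker, emotion, text in utterances:
        node = ET.SubElement(root, "utterance", speaker=speaker, emotion=emotion)
        node.text = text
    return root


def find_by_emotion(root, emotion):
    """Return the utterance elements tagged with the given emotion."""
    return root.findall(f".//utterance[@emotion='{emotion}']")


# Illustrative labels only; the tag-to-utterance mapping is an assumption.
sample = [
    ("PRESIDENT", "Neutral", "The prosecution calls KKKKKK."),
    ("WITNESS", "Fear", "No... uhm... I have never personally met Ms. XXXXXX."),
]
root = annotate(sample)
xml_text = ET.tostring(root, encoding="unicode")
hits = find_by_emotion(root, "Fear")
```

Storing the emotion as an attribute keeps the transcript text intact and lets a retrieval layer filter utterances with a plain XPath query.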
Challenges:
What features are able to describe and discriminate different emotional states?
Which kind of environment influences emotional state recognition?
Which kinds of learning models produce the best performance?
Emotion Recognition from vocal signatures
Step 1 – Vocal Signature Acquisition
Italian DB: 391 samples
Sentences from movies
5 emotional states: Anger, Happiness, Sadness, Neutral, Fear
German DB: 531 samples
Acted sentences: emotion on request
7 emotional states: Anger, Fear, Happiness, Sadness, Neutral, Disgust, Boredom
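The slides do not list the features that make up a vocal signature. As a rough sketch under that caveat, the code below summarizes a signal with two common frame-level descriptors, short-time energy and zero-crossing rate. The choice of features, the frame size and the hop are all assumptions, and a synthetic tone stands in for a real recording.

```python
import math
from statistics import mean, pstdev


def frames(signal, size, hop):
    """Split a sample list into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def vocal_signature(signal, size=400, hop=160):
    """Summarize a signal as [mean energy, std energy, mean ZCR, std ZCR]."""
    fs = frames(signal, size, hop)
    e = [energy(f) for f in fs]
    z = [zero_crossing_rate(f) for f in fs]
    return [mean(e), pstdev(e), mean(z), pstdev(z)]


# Synthetic 1-second, 200 Hz tone at 16 kHz standing in for a real recording.
rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 200 * t / rate) for t in range(rate)]
signature = vocal_signature(tone)
```

Each sample in a database would be reduced to one such fixed-length vector before being passed to a learning model.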
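Preliminary Experimental Results
[Bar chart: results on a 0 to 100 scale for Naive Bayes, K-Nearest Neighbor and Support Vector Machines, with series for the Italian DB and the German DB. Bar values in extraction order: 46.8, 38.3, 48.3, 42.2, 56.5, 59.3]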
Flat Models
Learning Models are biased by:
Language
Gender
Neutral emotional state
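One way to make such biases visible is to break accuracy down by subgroup. The sketch below does this for a list of predictions; the record layout (`gender`, `true`, `pred`) is hypothetical and the toy records are invented for illustration, not results from the slides.

```python
from collections import defaultdict


def accuracy_by_group(records, key):
    """Accuracy per value of `key` over dicts holding 'true' and 'pred' labels."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[key]] += 1
        hits[r[key]] += r["true"] == r["pred"]
    return {g: hits[g] / totals[g] for g in totals}


# Toy predictions, invented for illustration only.
records = [
    {"gender": "F", "true": "Anger", "pred": "Anger"},
    {"gender": "F", "true": "Fear", "pred": "Neutral"},
    {"gender": "M", "true": "Neutral", "pred": "Neutral"},
    {"gender": "M", "true": "Sadness", "pred": "Sadness"},
]
by_gender = accuracy_by_group(records, "gender")
by_emotion = accuracy_by_group(records, "true")
```

Grouping by the true emotion shows which states tend to be absorbed by Neutral; grouping by gender or language shows whether one subgroup is recognized less reliably.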
Hierarchical Classification:
Multi-Layer Support Vector Machines
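The slides name multi-layer SVMs, but this excerpt does not show how the layers are arranged. The following is a minimal sketch of the general idea of hierarchical classification, assuming a hypothetical two-layer split (a coarse group first, then the emotion within that group). A nearest-centroid classifier stands in for the SVM so the example runs with the standard library only; the `calm` and `aroused` groups and the toy feature vectors are invented for illustration.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(c) / len(points) for c in zip(*points)]


class NearestCentroid:
    """Stand-in classifier (not an SVM): predicts the label of the closest centroid."""

    def fit(self, X, y):
        by_label = {}
        for x, label in zip(X, y):
            by_label.setdefault(label, []).append(x)
        self.centroids = {label: centroid(pts) for label, pts in by_label.items()}
        return self

    def predict(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda label: dist(self.centroids[label]))


class TwoLayerClassifier:
    """Layer 1 picks a coarse group; layer 2 picks the emotion within that group."""

    def __init__(self, groups, make_model=NearestCentroid):
        self.groups = groups          # emotion -> coarse group (hypothetical grouping)
        self.make_model = make_model  # factory for the per-layer classifier

    def fit(self, X, y):
        self.top = self.make_model().fit(X, [self.groups[label] for label in y])
        self.leaf = {}
        for g in set(self.groups.values()):
            rows = [(x, label) for x, label in zip(X, y) if self.groups[label] == g]
            if rows:
                xs, labels = zip(*rows)
                self.leaf[g] = self.make_model().fit(xs, labels)
        return self

    def predict(self, x):
        return self.leaf[self.top.predict(x)].predict(x)


# Toy 2-D feature vectors and a hypothetical grouping, for illustration only.
groups = {"Neutral": "calm", "Sadness": "calm", "Anger": "aroused", "Fear": "aroused"}
X = [[0, 0], [0, 1], [5, 5], [5, 6]]
y = ["Neutral", "Sadness", "Anger", "Fear"]
model = TwoLayerClassifier(groups).fit(X, y)
first = model.predict([0.1, 0.1])
second = model.predict([5, 5.9])
```

Because each layer only has to separate a few classes, a model factory returning an SVM could be passed as `make_model` without changing the routing logic.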