![Page 1: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/1.jpg)
COSMIR and the OpenMIC Challenge
the talk previously known as
A Plan for Sustainable Music IR Evaluation
Brian McFee* Eric Humphrey Julián Urbano**
![Page 2: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/2.jpg)
solving problems with science
![Page 3: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/3.jpg)
Hypothesis (model)
Experiment (evaluation)
Progress depends on reproducibility
![Page 4: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/4.jpg)
![Page 5: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/5.jpg)
We’ve known this for a while
● Many years of MIREX!
● Lots of participation
● It’s been great for the community
![Page 6: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/6.jpg)
![Page 7: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/7.jpg)
![Page 8: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/8.jpg)
MIREX (cartoon form)
Scientists
Code
MIREX machines (and task captains)
Data (private)
Results
![Page 9: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/9.jpg)
Evaluating the evaluation model
We would not be where we are today without MIREX.
But this paradigm faces an uphill battle :’o(
![Page 12: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/12.jpg)
Costs of doing business
● Computer time
● Human labor
● Data collection
Annual sunk costs (proportional to participants)
Best ! for $
*arrows are probably not to scale
The worst thing that could happen is growth!
![Page 13: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/13.jpg)
Limited feedback in the lifecycle
Hypothesis (model)
Experiment (evaluation)
● Performance metrics (always)
● System outputs (sometimes)
● Input data (almost never)
![Page 14: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/14.jpg)
Stale data implies bias
https://frinkiac.com/caption/S07E24/252468
https://frinkiac.com/caption/S07E24/288671
![Page 15: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/15.jpg)
The current model is unsustainable
● Inefficient distribution of labor
● Limited feedback
● Inherent and unchecked bias
![Page 17: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/17.jpg)
What is a sustainable model?
● Kaggle is a data science evaluation community (sound familiar?)
● How it works:
○ Download data
○ Upload predictions
○ Observe results
● The user base is huge
○ 536,000 registered users
○ 4,000 forum posts per month
○ 3,500 competition submissions per day (!!!)
Distributed computation.
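The download / upload / observe workflow keeps computation on the participant's side: organizers only compare uploaded predictions against held-back labels. A minimal sketch of that scoring step, assuming a hypothetical submission format (a mapping from track ID to predicted label) and plain accuracy as the measure; neither is specified in the talk:

```python
def score_submission(predictions, reference):
    """Return accuracy of predictions over the tracks in the reference.

    Both arguments map track IDs to labels (a hypothetical format).
    Tracks missing from the submission count as errors.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    correct = sum(
        1 for track_id, label in reference.items()
        if predictions.get(track_id) == label
    )
    return correct / len(reference)


# Held-back labels stay on the organizers' side.
reference = {"t1": "guitar", "t2": "piano", "t3": "violin", "t4": "piano"}
# Participants run their own systems and upload only the outputs.
submission = {"t1": "guitar", "t2": "piano", "t3": "cello"}

print(score_submission(submission, reference))  # 0.5
```

Because only predictions are uploaded, the organizers' compute cost per submission is a dictionary comparison rather than a full run of the participant's system.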
![Page 18: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/18.jpg)
Academic Instances
Related / Prior work in Multimedia
• DCASE – Detection and Classification of Acoustic Scenes and Events
• NIST – Iris Challenge Evaluation, SRE i-Vector Challenge, Face Recognition Grand Challenge, TRECVid, ...
• MediaEval – Emotion in Music, Querying Musical Scores, Soundtrack Selection
![Page 19: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/19.jpg)
Open content
● Participants need unfettered access to audio content
● Without input data, error analysis is impossible
● Creative Commons-licensed music is plentiful on the internet!
○ Jamendo: 500K tracks
○ FMA: 90K tracks
![Page 20: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/20.jpg)
The distributed model is sustainable
● Distributed computation
● Open data means clear feedback
● Efficient allocation of human effort
![Page 21: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/21.jpg)
But what about annotation?
![Page 23: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/23.jpg)
Incremental evaluation
● Performance estimates +/- degree of uncertainty
● Annotate the most informative examples first
○ Beats: [Holzapfel et al., TASLP 2012]
○ Similarity: [Urbano and Schedl, IJMIR 2013]
○ Chords: [Humphrey & Bello, ISMIR 2015]
○ Structure: [Nieto, PhD thesis 2015]
● Which tracks do we annotate for evaluation?
○ None, at first!
This is already common practice in MIR.
Let’s standardize it!
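Reporting "performance estimates +/- degree of uncertainty" can be illustrated with a percentile bootstrap over the tracks annotated so far. This is a sketch under assumptions: the bootstrap is one common way to get an interval, not necessarily the method used in the cited works, and the per-track scores below are made up for illustration.

```python
import random
import statistics


def bootstrap_interval(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Mean score with a percentile-bootstrap (1 - alpha) interval."""
    if not scores:
        raise ValueError("need at least one annotated track")
    rng = random.Random(seed)
    n = len(scores)
    # Resample tracks with replacement and record the mean each time.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(scores), lo, hi


# Illustrative per-track scores for one system on the annotated subset.
few = [0.9, 0.4, 0.7, 0.8, 0.5]
many = few * 8  # pretend 40 tracks have been annotated

for scores in (few, many):
    mean, lo, hi = bootstrap_interval(scores)
    print(f"n={len(scores):2d}  mean={mean:.2f}  interval=[{lo:.2f}, {hi:.2f}]")
```

The interval narrows as more tracks are annotated, which is what lets evaluation start with few (or no) annotations and tighten over time.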
![Page 24: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/24.jpg)
The evaluation loop
1. Collect CC-licensed music
2. Define tasks
3. ($) Release annotated development set
4. Collect system outputs
5. ($) Annotate informative examples
6. Report scores
7. Retire and release old data
Human costs ($) directly produce data
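Steps 4 and 5 of the loop can be sketched as follows: once system outputs are collected, spend the annotation budget on the tracks where systems disagree most. Disagreement with the majority output is an assumed heuristic for "informative", chosen here for simplicity; the function names and data layout are hypothetical.

```python
from collections import Counter


def disagreement(labels):
    """Fraction of systems that differ from the majority output."""
    counts = Counter(labels)
    majority = counts.most_common(1)[0][1]
    return 1.0 - majority / len(labels)


def pick_tracks_to_annotate(system_outputs, budget):
    """Return up to `budget` track IDs, most contested first.

    system_outputs maps track ID -> list of labels, one per system.
    Ties are broken by track ID so the ordering is deterministic.
    """
    ranked = sorted(
        system_outputs,
        key=lambda track_id: (-disagreement(system_outputs[track_id]), track_id),
    )
    return ranked[:budget]


outputs = {
    "t1": ["guitar", "guitar", "guitar"],   # all systems agree
    "t2": ["piano", "organ", "piano"],      # one dissenter
    "t3": ["flute", "violin", "clarinet"],  # no agreement at all
}
print(pick_tracks_to_annotate(outputs, budget=2))  # ['t3', 't2']
```

Tracks on which every system agrees tell us little about which system is better, so they are annotated last.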
![Page 27: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/27.jpg)
What are the drawbacks here?
● Loss of algorithmic transparency
○ Linking to source makes results verifiable and replicable!
● Potential for cheating?
○ What’s the incentive for cheating?
○ Can we make it impractical?
○ Even if people do cheat, we still get the annotations.
● CC/PD music isn’t “real” enough
○ For which tasks?
![Page 28: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/28.jpg)
Okay ... so now what?
![Page 29: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/29.jpg)
![Page 30: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/30.jpg)
![Page 31: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/31.jpg)
The Open-MIC Challenge 2017
Open Music Instrument Classification Challenge
● Task 1 (classification): what instruments are played in this audio file?
● Task 2 (retrieval): in which audio files (from this corpus) is this instrument played?
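The two tasks are two views of the same track-by-instrument predictions. A minimal sketch, assuming a hypothetical system that outputs a presence score in [0, 1] per instrument per track; the 0.5 threshold and the instrument names are illustrative, not part of the challenge definition:

```python
def instruments_in(track_scores, threshold=0.5):
    """Task 1: instruments predicted present in one track."""
    return sorted(name for name, s in track_scores.items() if s >= threshold)


def tracks_with(instrument, corpus_scores):
    """Task 2: track IDs ranked by predicted presence of one instrument."""
    return sorted(
        corpus_scores,
        key=lambda track_id: (-corpus_scores[track_id].get(instrument, 0.0), track_id),
    )


corpus = {
    "t1": {"guitar": 0.9, "voice": 0.7, "piano": 0.1},
    "t2": {"guitar": 0.2, "voice": 0.1, "piano": 0.8},
    "t3": {"guitar": 0.6, "voice": 0.3, "piano": 0.4},
}
print(instruments_in(corpus["t1"]))   # ['guitar', 'voice']
print(tracks_with("guitar", corpus))  # ['t1', 't3', 't2']
```

Task 1 reads one row of the score table and thresholds it; Task 2 reads one column and ranks it.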
![Page 32: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/32.jpg)
![Page 33: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/33.jpg)
The Open-MIC Challenge 2017
https://cosmir.github.io/open-mic/
● Complements what is currently covered in MIREX
● Objective and simple (?) task for annotators
● A large, well-annotated data set would be valuable for the community
● 20+ collaborators from various groups across academia and industry
![Page 34: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/34.jpg)
Content Management System Backend
● Manages audio content and user annotations
● Flask web application will hide endpoints behind user authentication
● Working with industry to secure support for Google Cloud Platform (App Engine, Datastore)
● Designed to scale with participation / users
![Page 35: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/35.jpg)
Web Annotator
● Requests audio and instrument taxonomy from CMS backend
● Provides audio playback, instrument tag selection, annotation submission
● Authenticates users for annotator attribution & data quality
● Provides statistics and feedback on individual / overall annotation effort
![Page 36: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/36.jpg)
Web Annotator – SONYC
![Page 37: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/37.jpg)
Web Annotator – Gracenote
![Page 38: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/38.jpg)
Data and Task Definition
● Audio content: green light from Jamendo!
● Instrument taxonomy: converging to ≈25 classes for now
● Currently iterating on task definition, measures, annotation task design, etc.
![Page 39: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/39.jpg)
Roadmap (Tentative)
[Timeline chart, Sept 2016 to Oct 2017. Rows: Backend; Web Annotator; Ingest Audio; Annotation; Task Definition; Instrument Taxonomy; Evaluation Machinery; Build / release development set. Milestones: Challenge Open; ISMIR.]
![Page 40: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/40.jpg)
Want to get involved?
● https://cosmir.github.io … https://cosmir.github.io/open-mic/contribute.html
● Watch the GitHub repo, read / comment on issues, submit PRs: https://github.com/cosmir/open-mic
● Ask questions on [email protected]
● Participate in the Open-MIC challenge (coming Summer 2017)
● Talk to us / me! { [email protected] / [email protected] / [email protected] }
![Page 41: MediaEval 2016 - COSMIR and the OpenMIC Challenge: A Plan for Sustainable Music Information Retrieval (MIR) Evaluation](https://reader031.vdocuments.net/reader031/viewer/2022030305/587366a91a28abe7648b6fb9/html5/thumbnails/41.jpg)
Thanks!