MediaEval 2016 - Placing Task Overview
PLACING TASK 2016
Bart Thomee (Google, San Bruno) Olivier Van Laere (Blueshift Labs, San Francisco)
Claudia Hauff (TU Delft, Netherlands) Jaeyoung Choi (ICSI, Berkeley / TU Delft, Netherlands)
Oct. 20th, 2016 Hilversum, Netherlands
TASK DESCRIPTION
• Given a video or a photo, how accurately can it be placed on a map, e.g., by giving longitude and latitude coordinates or by selecting a neighborhood?
TASK OVERVIEW
• Two sub-tasks
• locales-based subtask, mobility-based subtask (2015)
• estimation-based sub-task
• verification-based sub-task
• Organizer baselines provided
• Live leaderboard
ESTIMATION-BASED SUBTASK
• participants are given a hierarchy of places across the world
• place hierarchy from YFCC100M Places expansion pack
• Country - State - City - Neighborhood
• for each photo/video, participants can:
• estimate the GPS-coordinate
• choose a node (i.e. a place) from a hierarchy in which they most confidently believe it was taken
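As an illustration of choosing a node from the hierarchy, here is a minimal sketch. It assumes a system has already attached a confidence score to every place; the `Place` class, the `confidence` field, the `min_confidence` cut-off, and the example places are all hypothetical and not part of the task definition.

```python
from dataclasses import dataclass, field


@dataclass
class Place:
    """One node of a Country - State - City - Neighborhood hierarchy."""
    name: str
    level: str
    confidence: float = 0.0  # hypothetical score produced by some estimator
    children: list = field(default_factory=list)


def most_specific_confident_place(root, min_confidence=0.5):
    """Walk down from the root, following the most confident child,
    and return the deepest node whose confidence reaches the cut-off.
    Returns None if even the root is below the cut-off."""
    best = None
    node = root
    while node is not None and node.confidence >= min_confidence:
        best = node
        node = max(node.children, key=lambda c: c.confidence, default=None)
    return best


# Illustrative hierarchy with made-up confidence values.
montmartre = Place("Montmartre", "neighborhood", 0.3)
marais = Place("Le Marais", "neighborhood", 0.2)
paris = Place("Paris", "city", 0.7, [montmartre, marais])
region = Place("Ile-de-France", "state", 0.8, [paris])
france = Place("France", "country", 0.9, [region])

chosen = most_specific_confident_place(france)
print(chosen.name, chosen.level)
```

With these made-up scores the walk stops at the city level, because neither neighborhood reaches the cut-off.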
VERIFICATION-BASED SUBTASK
• Verify whether or not the media item was really captured in the given place
• Requires a notion of confidence
Paris, France?
Yes / No
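One simple way to turn a notion of confidence into a yes/no answer is to threshold it. The sketch below assumes a system outputs a confidence in [0, 1] that the item was captured in the given place; the function names, the 0.5 threshold, and the sample values are illustrative assumptions, not the organizers' method.

```python
def verify(confidence, threshold=0.5):
    """Answer yes (True) when the confidence reaches the threshold."""
    return confidence >= threshold


def classification_accuracy(decisions, ground_truth):
    """Percentage of yes/no decisions that match the ground truth."""
    correct = sum(d == g for d, g in zip(decisions, ground_truth))
    return 100.0 * correct / len(ground_truth)


# Made-up confidences for four (item, place) pairs and their true answers.
confidences = [0.9, 0.2, 0.6, 0.4]
ground_truth = [True, False, False, False]

decisions = [verify(c) for c in confidences]
print(classification_accuracy(decisions, ground_truth))  # 75.0
```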
ORGANIZER BASELINES
• Two open-source baselines are provided
• Estimation-based sub-task
• Verification-based sub-task
http://bit.ly/2dnggcg
LIVE LEADERBOARD
• Participants can submit runs and see how they rank relative to others
• Evaluated on a dev set (part of test set)
TASK DATASET
            #Photos      #Videos
Training    4,991,679    24,955
Testing     1,497,464    29,934
• Drawn from Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset
• photos and videos that are successfully reverse geocoded
PRECOMPUTED FEATURES
• textual metadata
• as included in YFCC100M
• visual features
• LIRE, GIST, SIFT
• CNN Codes (HybridNet, VGG)
• audio features
• MFCC, Pitch (Kaldi, SAcC)
TASK EVALUATION
• estimation-based sub-task
• geographic distance between ground truth coordinate and the predicted coordinate or place from the hierarchy
• verification-based sub-task
• classification accuracy is measured
• Karney’s formula is used to calculate distance between the ground truth and the estimated location
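The estimation-based evaluation can be sketched as follows. This is an approximation under stated assumptions: a haversine great-circle distance on a spherical Earth is swapped in for Karney's geodesic formula (which works on the ellipsoid), and the score is assumed to be the percentage of items placed within each distance threshold used in the result tables (10m to 1000km). The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation
THRESHOLDS_KM = [0.01, 0.1, 1, 10, 100, 1000]  # 10m ... 1000km


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (stand-in for Karney's formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_at_thresholds(truth, predicted, thresholds_km=THRESHOLDS_KM):
    """Percentage of items predicted within each distance threshold.
    truth and predicted are parallel lists of (lat, lon) pairs."""
    distances = [haversine_km(t[0], t[1], p[0], p[1])
                 for t, p in zip(truth, predicted)]
    return {thr: 100.0 * sum(d <= thr for d in distances) / len(distances)
            for thr in thresholds_km}


# Two made-up items: one placed exactly, one off by one degree of longitude.
truth = [(0.0, 0.0), (0.0, 0.0)]
predicted = [(0.0, 0.0), (0.0, 1.0)]
scores = accuracy_at_thresholds(truth, predicted)
```

For an official-style score, the haversine function would be replaced by an implementation of Karney's algorithm on the WGS84 ellipsoid.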
RUNS
• run1 - Only the provided textual metadata
• run2 - Only the provided visual & aural features
• run3 - Only the provided textual metadata as well as the visual & aural features
• run4 & 5 - Everything is allowed (e.g., gazetteers, dictionaries, Web corpora)
• Except for crawling the exact items contained in the test set
PARTICIPANT STATISTICS
                  estimation-based subtask      verification-based subtask
                  run1  run2  run3  run4/5      run1  run2  run3  run4/5
CERTH/CEA LIST    O     O     O     O
RECOD             O     O     O     O
CSUA              O     O     O     O           O     O     O     O
Two ‘veterans’ and one new participant
RESULT - RUN1 TEXTUAL METADATA ONLY
          CERTH/CEALIST     RECOD             CSUA
          Photo    Video    Photo    Video    Photo    Video
10m       0.59     0.55     0.59     0.45     0.27     0.27
100m      6.42     6.86     6.07     5.74     2.88     3.03
1km       24.55    22.73    21.06    18.69    14.13    13.5
10km      43.32    40.6     38       33.57    35.28    33.48
100km     51.26    48.24    46.23    41.56    50.28    47.6
1000km    64.06    60.84    59.69    54.51    64.17    60.06
[Figure: bar chart of the run1 results per team (photo and video) at all six thresholds, 10m to 1000km; axis 0 to 70]
[Figure: the same chart restricted to the 10m, 100m, and 1km thresholds; axis 0 to 30]
RESULT - RUN2 VISUAL & AUDIO ONLY
          CERTH/CEALIST     RECOD             CSUA
          Photo    Video    Photo    Video    Photo    Video
10m       0.08     0        0.09     0        0        0
100m      1.84     0.06     0.87     0.03     0        0
1km       5.62     0.5      2.36     0.15     0.42     0.14
10km      8.16     2.48     4.47     1.15     2.13     0.81
100km     10.21    4.97     5.88     2.46     4        1.77
1000km    26.31    22.1     21.46    13.54    22.97    6.95
[Figure: bar chart of the run2 results per team (photo and video) at all six thresholds, 10m to 1000km; axis 0 to 30]
[Figure: the same chart restricted to the 10m, 100m, and 1km thresholds; axis 0 to 6]
RESULT - RUN3 TEXT + VISUAL + AUDIO
Table 1
Estimation  CERTH/CEALIST     RECOD             CSUA
            Photo    Video    Photo    Video    Photo    Video
10m         0.56     0.55     0.56     0.51     0.27     0.27
100m        6.58     6.86     5.97     5.82     2.89     3.03
1km         25.03    22.73    20.83    18.46    14.13    13.5
10km        43.73    40.6     37.72    33.38    35.26    33.48
100km       51.69    48.24    46.04    41.2     50.25    47.6
1000km      64.58    60.84    59.89    54.77    64.03    60.08
[Figure: bar chart of the run3 results per team (photo and video) at all six thresholds, 10m to 1000km; axis 0 to 70]
[Figure: the same chart restricted to the 10m, 100m, and 1km thresholds; axis 0 to 30]
RUN 4 & 5 USE ANYTHING
Estimation  CERTH/CEALIST - run4  CERTH/CEALIST - run5  RECOD             CSUA
            Photo    Video        Photo    Video        Photo    Video    Photo    Video
10m         0.7      0.72         0.72     0.71         0.71     0.37     0.27     0.27
100m        7.96     8.27         8.27     8.19         8.19     4.03     2.94     3.36
1km         27.82    28.54        28.54    26.16        26.16    13.51    13.24    13.29
10km        46.52    46.45        46.45    43.62        43.62    25.76    33.02    32.61
100km       53.96    53.5         53.5     50.44        50.44    33.02    51.14    49.35
1000km      66.11    65.32        65.32    61.93        61.93    47.67    64.58    61.18
[Figure: bar chart of the run 4 & 5 results per team (photo and video) at all six thresholds, 10m to 1000km; axis 0 to 70]
[Figure: the run 4 & 5 chart restricted to the 10m, 100m, and 1km thresholds; axis 0 to 30]
VERIFICATION TASK - PHOTO
[Figure: bar chart of run1, run2, run3, run4, and the baseline at the neighborhood, city, state, and country levels; axis 0 to 90]
VERIFICATION TASK - VIDEO
[Figure: bar chart of run1, run2, run3, run4, and the baseline at the neighborhood, city, state, and country levels; axis 0 to 90]
WHAT WE LEARNED
• Participants’ systems
• No visible trend; many different approaches
• language model, similarity search, genetic programming, etc.
• Fusion: heuristic, confidence-based, ranking fusion
• External data (gazetteer) helps. More data helps!
• Photos were better geo-located than videos (but not always)
THANK YOU!!!