MediaEval 2016: A Hybrid Approach for Verifying Multimedia Use on Twitter


Upload: multimediaeval

Post on 09-Jan-2017



A HYBRID APPROACH FOR VERIFYING MULTIMEDIA USE ON TWITTER

Quoc-Tin Phan, Alessandro Budroni, Cecilia Pasquini, Francesco G. B. De Natale

Department of Information Engineering and Computer Science, University of Trento, Italy

INTRODUCTION

Can you recognize? They are “FAKE”.

EXISTING APPROACHES

THE PROPOSED METHOD

Schema of the proposed method.

MULTIMEDIA ASSESSMENT

1. Search by keywords: online web search using relevant keywords associated with the event.

2. Search by image/video: Google reverse image search and comment retrieval from YouTube.

3. Forensic feature extraction: Non-Aligned Double JPEG Compression, Block Artifact Grid, and Error Level Analysis. We look for the blocks most likely to have been modified and extract statistical features such as min, max, mean, and variance.

4. Textual feature extraction:
4.1 Extract the most relevant terms from the results of 1. as a bag-of-words.
4.2 From the results of 2., calculate the term frequency of the bag-of-words from 4.1.
4.3 Calculate the term frequency of bags of negative, positive and “fake” words.
4.4 Concatenate the features from 4.2 and 4.3 to form the textual features.

5. Textual features together with forensic features are fed to Classifier 1.
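Steps 3 and 4 above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the word lists, the function names, and the use of raw term counts as a stand-in for "most relevant terms" are all assumptions, and the per-block forensic scores are taken as given rather than computed from an image.

```python
import re
import statistics
from collections import Counter

# Hypothetical lexicons; the poster does not list the words it uses.
NEGATIVE = ["wrong", "bad", "scam"]
POSITIVE = ["true", "confirmed", "real"]
FAKE = ["fake", "hoax", "photoshop"]


def tokenize(text):
    """Lowercase a string and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def top_terms(keyword_results, k=3):
    """Step 4.1 (assumed proxy): the k most frequent terms in the keyword-search results."""
    counts = Counter(t for doc in keyword_results for t in tokenize(doc))
    return [term for term, _ in counts.most_common(k)]


def term_frequency(tokens, vocabulary):
    """Relative frequency of each vocabulary word among the tokens."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty results
    return [counts[word] / total for word in vocabulary]


def textual_features(keyword_results, media_results, k=3):
    """Steps 4.2 to 4.4: event-term and lexicon frequencies, concatenated."""
    vocabulary = top_terms(keyword_results, k)
    tokens = [t for doc in media_results for t in tokenize(doc)]
    event_tf = term_frequency(tokens, vocabulary)
    lexicon_tf = term_frequency(tokens, NEGATIVE + POSITIVE + FAKE)
    return event_tf + lexicon_tf


def block_statistics(block_scores):
    """Step 3 summary: min, max, mean, and variance of per-block forensic scores."""
    return [
        min(block_scores),
        max(block_scores),
        statistics.fmean(block_scores),
        statistics.pvariance(block_scores),
    ]
```

Under these assumptions, the input to Classifier 1 would be the concatenation `textual_features(...) + block_statistics(...)`, computed once per forensic detector.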

RESULTS AND DISCUSSIONS

Multimedia Signal Processing and Understanding Lab, University of Trento, Italy

✘ Not useful with short text and multiple languages.

✘ Does not take multimedia content into account.

[Figure labels: Hurricane Sandy; sharing; fake topic; real topic]

Unreliable information about events and news shared over Online Social Networks might have negative consequences for the community.

INPUT: A TWEET comprising <text, images/video>

OUTPUT: REAL / FAKE

SYNTHETIC MANIPULATION

Text-based

Multimedia-Forensic-based

User-based

✘ Sensitive to subsequent modifications and compression.

[Schema labels. Inputs: Multimedia, Event, Post, User. Multimedia assessment: forensic feature extraction, search by image/video, search by keywords, and textual feature extraction produce forensic features and textual features, which are concatenated and fed to Classifier 1. Tweet credibility assessment: post-based and user-based feature extraction produce post-based and user-based features, which are concatenated and fed to Classifier 2. Score fusion of the two classifier outputs gives the final decision.]

TWEET CREDIBILITY ASSESSMENT

1. Post-based feature extraction: features reflecting the credibility of a tweet post are extracted, e.g. whether the tweet contains “?” or “!”, and the number of negative sentiment words.

2. User-based feature extraction: features reflecting the credibility of a user are extracted, e.g. the number of followers the user has, and whether the user is verified by Twitter.

3. Post-based features together with user-based features are fed to Classifier 2.
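The feature extraction in steps 1 and 2 could look something like the sketch below. It covers only the example features named on the poster. The negative-word lexicon, the function names, and the choice to encode “?” and “!” as two separate flags are assumptions made for illustration.

```python
import re

# Hypothetical negative-sentiment lexicon; the poster does not specify one.
NEGATIVE_WORDS = {"fake", "hoax", "scam", "lie"}


def post_features(text):
    """Post-based features: '?' flag, '!' flag, count of negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [
        int("?" in text),
        int("!" in text),
        sum(token in NEGATIVE_WORDS for token in tokens),
    ]


def user_features(followers, verified):
    """User-based features: follower count and verified flag."""
    return [followers, int(verified)]


def tweet_features(text, followers, verified):
    """Concatenated feature vector, as would be passed to Classifier 2."""
    return post_features(text) + user_features(followers, verified)
```

For example, `tweet_features("Is this a hoax?!", 120, False)` yields a five-element vector: two punctuation flags, one negative-word count, the follower count, and the verified flag.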

WRONG CONTEXT

SCORE FUSION

With the assumption that a tweet sharing fake images or videos is likely to be fake, higher weight is assigned to the output from Classifier 1, lower weight is assigned to the output from Classifier 2.
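A weighted fusion of this kind might be sketched as follows. Several details are assumptions rather than facts from the poster: that both classifiers output a probability of the tweet being fake, that the decision threshold is 0.5, and that the 0.8 : 0.2 ratio reported for the main-task runs gives 0.8 to Classifier 1. The `fallback_to_tweet` flag is a hypothetical name for the difference between answering UNKNOWN and falling back to the second tier when online search fails.

```python
def fuse_scores(p_media, p_tweet, w_media=0.8, w_tweet=0.2,
                threshold=0.5, fallback_to_tweet=False):
    """Combine the two classifier outputs into a REAL / FAKE / UNKNOWN label.

    p_media: assumed fake-probability from Classifier 1, or None when the
             online search failed and no multimedia score is available.
    p_tweet: assumed fake-probability from Classifier 2.
    """
    if p_media is None:
        if not fallback_to_tweet:
            return "UNKNOWN"
        # Fall back to the tweet credibility tier alone.
        score = p_tweet
    else:
        score = w_media * p_media + w_tweet * p_tweet
    return "FAKE" if score >= threshold else "REAL"
```

With the default weights, a confident multimedia score dominates: `fuse_scores(0.9, 0.1)` gives a fused score of 0.74 and so a FAKE label, even though the tweet-level classifier leans the other way.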

In the sub-task, we submitted RUN 1 applying only forensic features, and RUN 2 applying both textual features and forensic features.

         Recall   Precision   F1-score
RUN 1    0.5      0.48        0.49
RUN 2    0.93     0.49        0.64

Our method gains recall if we take into account textual features acquired from online text search and reverse image search. This approach effectively reduces the false negative rate.

In the main task, we submitted three RUNs:
i) RUN 1: applied only the second classification tier;
ii) RUN 2: applied two-tier classification with a 0.8 : 0.2 fusion strategy, answering UNKNOWN for cases affected by online search errors;
iii) RUN 3: same as RUN 2, but using the output of classification tier 2 instead of answering UNKNOWN.

         Recall   Precision   F1-score
RUN 1    0.55     0.71        0.62
RUN 2    0.94     0.81        0.87
RUN 3    0.94     0.74        0.83

The proposed method is subject to online search errors, which occur for videos NOT hosted on YouTube.
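As a quick consistency check, the reported F1-scores can be recomputed from the reported recall and precision using the standard harmonic-mean definition. The dictionary keys below are illustrative labels, not names used on the poster.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported F1) as listed in the results tables.
reported = {
    "two-run table, RUN 1": (0.5, 0.48, 0.49),
    "two-run table, RUN 2": (0.93, 0.49, 0.64),
    "three-run table, RUN 1": (0.55, 0.71, 0.62),
    "three-run table, RUN 2": (0.94, 0.81, 0.87),
    "three-run table, RUN 3": (0.94, 0.74, 0.83),
}

for name, (recall, precision, f1) in reported.items():
    print(name, round(f1_score(precision, recall), 2), f1)
```

Each recomputed value, rounded to two decimals, agrees with the reported F1-score.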
