Page 1: MediaEval 2016: A Hybrid Approach for Verifying Multimedia Use on Twitter

A HYBRID APPROACH FOR VERIFYING MULTIMEDIA USE ON TWITTER

Quoc-Tin Phan, Alessandro Budroni, Cecilia Pasquini, Francesco G. B. De Natale

Department of Information Engineering and Computer Science – University of Trento, Italy

INTRODUCTION

Can you recognize? They are “FAKE”.

EXISTING APPROACHES

THE PROPOSED METHOD

Schema of the proposed method.

MULTIMEDIA ASSESSMENT

1. Search by keywords: online web search using relevant keywords associated with the event.

2. Search by image/video: Google reverse image search and comment retrieval from YouTube.

3. Forensic feature extraction: Non-Aligned Double JPEG Compression, Block Artifact Grid, and Error Level Analysis. We look for the blocks most likely to have been modified and extract statistical features such as min, max, mean, and variance.

4. Textual feature extraction:
4.1 Extract the most relevant terms from the results of 1. as a bag-of-words.
4.2 From the results of 2., calculate the term frequency of the bag-of-words from 4.1.
4.3 Calculate the term frequency of the bags of negative, positive, and “fake” words.
4.4 Concatenate the features from 4.2 and 4.3 to form the textual features.

5. Textual features together with forensic features are fed to Classifier 1.
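
The feature construction in steps 3 to 5 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the per-block probability map format, the number of blocks `k`, and the tokenizer are all hypothetical, and the poster does not say which text step 4.3 is computed on (the sketch assumes the text retrieved in step 2).

```python
import re
from collections import Counter
from statistics import mean, pvariance


def block_stats(prob_map, k=3):
    """Summarize the k most suspicious blocks of one forensic map.

    prob_map is a hypothetical 2-D list of per-block tampering
    probabilities, as a forensic detector might produce.
    Returns [min, max, mean, variance] over the k highest values.
    """
    values = sorted((p for row in prob_map for p in row), reverse=True)
    top = values[:k]
    return [min(top), max(top), mean(top), pvariance(top)]


def term_frequencies(text, vocabulary):
    """Relative frequency of each vocabulary term in the text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[term] / total for term in vocabulary]


def multimedia_features(prob_maps, retrieved_text, event_terms, cue_words):
    """Concatenate textual and forensic features for Classifier 1.

    event_terms stands in for the bag-of-words of step 4.1 and
    cue_words for the negative/positive/"fake" word bags of step 4.3
    (both are assumed to be given here).
    """
    forensic = [s for m in prob_maps for s in block_stats(m)]
    textual = (term_frequencies(retrieved_text, event_terms)
               + term_frequencies(retrieved_text, cue_words))
    return textual + forensic
```

Each forensic map contributes four values, so with three detectors (one map each) the forensic part of the vector would have twelve entries in this sketch.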

RESULTS AND DISCUSSIONS

Multimedia Signal Processing and Understanding Lab, University of Trento, Italy

✘ Not useful with short text and multiple languages.

✘ Does not take multimedia content into account.

[Figure labels: Hurricane Sandy; sharing; fake topic; real topic.]

Unreliable information about events and news shared over Online Social Networks might have negative consequences for the community.

INPUT: GIVEN A TWEET comprising <text, images/video>

OUTPUT: REAL/FAKE

SYNTHETIC MANIPULATION

Text-based

Multimedia-Forensic-based

User-based

✘ Sensitive to subsequent modifications and compression.

[Figure: schema of the proposed method. Inputs: Multimedia, Event, Post, User. Multimedia assessment: Search by keywords and Search by image/video feed Textual feature extraction; Forensic feature extraction yields forensic features; textual and forensic features are concatenated and passed to Classifier 1. Tweet credibility assessment: Post-based feature extraction and User-based feature extraction yield features that are concatenated and passed to Classifier 2. Score fusion of the two classifier outputs gives the Final decision.]

TWEET CREDIBILITY ASSESSMENT

1. Post-based feature extraction: useful features reflecting the credibility of a tweet post are extracted, e.g. whether the tweet contains “?” or “!”, and the number of negative sentiment words.

2. User-based feature extraction: useful features reflecting the credibility of a user are extracted, e.g. the number of followers the user has and whether the user is verified by Twitter.

3. Post-based features together with user-based features are fed to Classifier 2.
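
The example features named above can be sketched as follows. This is a minimal illustration, not the full feature set used on the poster: the function names are hypothetical, and `NEGATIVE_WORDS` is a tiny stand-in for a real sentiment lexicon.

```python
import re

# Hypothetical stand-in for a negative-sentiment lexicon.
NEGATIVE_WORDS = {"bad", "terrible", "awful", "horrible"}


def post_features(text):
    """Post-based features: has '?', has '!', count of negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [
        int("?" in text),
        int("!" in text),
        sum(token in NEGATIVE_WORDS for token in tokens),
    ]


def user_features(followers, verified):
    """User-based features: follower count and verified flag."""
    return [followers, int(verified)]


def credibility_features(text, followers, verified):
    """Concatenate post-based and user-based features for Classifier 2."""
    return post_features(text) + user_features(followers, verified)
```

For instance, `credibility_features("Is this real? Terrible!", 120, False)` returns `[1, 1, 1, 120, 0]`.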

WRONG CONTEXT

SCORE FUSION

Under the assumption that a tweet sharing fake images or videos is likely to be fake, a higher weight is assigned to the output of Classifier 1 and a lower weight to the output of Classifier 2.

In the sub-task, we submitted RUN 1, applying only forensic features, and RUN 2, applying both textual and forensic features.

In the main task, we submitted three RUNs: i) RUN 1 applied only the second classification tier; ii) RUN 2 applied two-tier classification with a 0.8 : 0.2 fusion strategy and answered UNKNOWN for cases that suffered from online search errors; iii) RUN 3 was the same as RUN 2 but used the output of classification tier 2 instead of UNKNOWN.
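
The fusion rule and the three main-task run strategies can be sketched as below. Several details are assumptions rather than statements from the poster: both classifiers are taken to output a probability that the tweet is fake, the fused score is thresholded at 0.5, and a failed online search is represented by `None`. The function names are hypothetical.

```python
def fuse(p1, p2, w1=0.8, w2=0.2):
    """Weighted score fusion; 0.8 : 0.2 is the ratio reported for RUN 2."""
    return w1 * p1 + w2 * p2


def decide(p1, p2, run, threshold=0.5):
    """Return 'FAKE', 'REAL', or 'UNKNOWN' for one tweet.

    p1: Classifier 1 fake-probability, or None if online search failed.
    p2: Classifier 2 fake-probability.
    run: 1, 2, or 3, following the run descriptions above.
    """
    if run == 1:
        score = p2                    # second tier only
    elif p1 is None:
        if run == 2:
            return "UNKNOWN"          # RUN 2 abstains on search errors
        score = p2                    # RUN 3 falls back to tier 2
    else:
        score = fuse(p1, p2)          # two-tier fusion
    return "FAKE" if score >= threshold else "REAL"
```

With `p1 = 0.9` and `p2 = 0.2`, the fused score is about 0.76, so RUN 2 and RUN 3 answer FAKE while RUN 1, which sees only `p2`, answers REAL.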

The proposed method is subject to online search errors, which occur for videos NOT hosted on YouTube.

        Recall   Precision   F1-score
RUN 1   0.5      0.48        0.49
RUN 2   0.93     0.49        0.64

Recall improves when we also take into account textual features acquired from online text search and reverse image search. This approach effectively reduces the false negative rate.

        Recall   Precision   F1-score
RUN 1   0.55     0.71        0.62
RUN 2   0.94     0.81        0.87
RUN 3   0.94     0.74        0.83
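
As a quick consistency check, the F1-scores in both result tables can be recomputed from the reported recall and precision, since F1 is their harmonic mean. The snippet below is only an illustration; the dictionary keys are labels chosen here, not names from the poster.

```python
def f1_score(recall, precision):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (recall, precision) pairs as reported in the two tables.
reported = {
    "first table, RUN 1": (0.5, 0.48),
    "first table, RUN 2": (0.93, 0.49),
    "second table, RUN 1": (0.55, 0.71),
    "second table, RUN 2": (0.94, 0.81),
    "second table, RUN 3": (0.94, 0.74),
}

for name, (recall, precision) in reported.items():
    print(name, round(f1_score(recall, precision), 2))
```

The recomputed values (0.49, 0.64, 0.62, 0.87, 0.83) match the F1-score columns.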
