demo - precog research groupprecog.iiitd.edu.in/publications_files/sonal_defense_slides.pdf ·...


Upload: buinhan

Post on 25-Jan-2019


TRANSCRIPT

Page 1: Demo - Precog Research Groupprecog.iiitd.edu.in/Publications_files/Sonal_Defense_Slides.pdf · Current State of Art Content based image retrieval (CBIR)-Colour, texture, size,

cerc.iiitd.ac.in

Demo

Page 2:

Image Retrieval for Improved Law and Order

Search, Analyse, Predict Image Spread on Twitter

Sonal Goel (MT14026)

Advisor: Dr. Ponnurangam Kumaraguru

Cybersecurity Education and Research Centre (CERC)

Page 3:

Thesis committee

Dr. AV Subramanyam, IIIT-Delhi

Dr. Samarth Bharadwaj, IBM-IRL

Dr. Ponnurangam Kumaraguru (Chair), IIIT-Delhi

Page 4:

Recent events

Page 5:

Baba Ram Rahim posing as Lord Vishnu

Page 6:

Doctored picture of the PM

Page 7:

Social media impact

How news spread before and after the arrival of social media

Before              After
Limited reach       Exponential reach
Time lapse: high    Time lapse: ~nil
Localised issues    Globalised/nationalised issues

Page 8:

Problem

Disturbing Events

Social Media

a. Religious

b. Caste

c. Communal

d. Political

e. ....

Law & Order

Page 9:

Research Aim

A real-time image search system that lets security analysts monitor the spread of an image and analyse both the users spreading it and the sentiments being propagated.

To predict the spread of an image

Real Time Image Search

OSINT

Page 10:

Current State of the Art

Content based image retrieval (CBIR)

- Colour, texture, size, shape: a combination of colour and shape is a more robust feature than either alone

- Identify the keypoints using SIFT, SURF, or ORB: Rublee et al. showed that ORB is two orders of magnitude faster than SIFT, while performing as well in many situations

• Jain, Anil K., and Aditya Vailaya. "Image retrieval using color and shape." Pattern Recognition 29.8 (1996).

• Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. ORB: an efficient alternative to SIFT or SURF. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 2564–2571. IEEE, 2011.

Page 11:

Contd..

Real-time image based search systems:

- Google Reverse Image

- TinEye

Absence of real-time image retrieval for micro-blogging sites to aid security analysts

• https://www.tineye.com/

• https://images.google.com/

Page 12:

Contributions

The system is currently being used by a Government security agency to analyse image spread on micro-blogging sites.

The system is robust in retrieving modified images.

High-level image features, such as the presence of a face, are better predictors of retweet count than low-level image features, such as the colour intensity of the image.

Page 13:

AIM

Image Retrieval System

Predicting the Spread of an image

Page 14:

Architecture diagram

[Figure: architecture diagram. Labels: Text, Image, Keywords, Text Mining, REST API (collects tweets containing images), Image DB, Image Comparison Methodology, Similar images]

Page 15:

[Figure: image with numbered callouts 1 to 5]

Page 16:

Details of data collected

Event name                Total images    Similar set    Dissimilar set
RamRahim                  411             99             312
Kulkarni                  1912            348            1564
ShaniShingnapur           183             67             116
KejriwalInsultsHanuman    666             278            388
CharlieHebdo              570             114            456

Page 17:

Challenges in finding similar images

Images can be cropped

Scaled images

Changed colour, brightness

Text Added

Images stitched with other images

Rotated

Page 18:

Image features to look for similarity

Colour distribution: 3D colour histogram

Keypoint descriptors

a. DAISY

b. ORB (Oriented FAST and Rotated BRIEF)

c. Improved ORB (ORB + RANSAC)

RANSAC (Random Sample Consensus)
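The colour-distribution feature listed on this slide can be sketched roughly as follows. This is a minimal pure-Python illustration, not the implementation used in the thesis: the bin count (8 per channel), the chi-squared distance, and the function names `colour_histogram_3d` and `chi_squared_distance` are all assumptions made for the example.

```python
def colour_histogram_3d(pixels, bins=8):
    """Normalised 3D colour histogram keyed by (r_bin, g_bin, b_bin).

    `pixels` is an iterable of (r, g, b) tuples with channel values in 0..255.
    The bin count is an assumption; the slides do not state it.
    """
    width = 256 / bins
    hist = {}
    count = 0
    for r, g, b in pixels:
        key = (int(r // width), int(g // width), int(b // width))
        hist[key] = hist.get(key, 0) + 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    # Divide by the pixel count so images of different sizes are comparable.
    return {key: value / count for key, value in hist.items()}


def chi_squared_distance(h1, h2):
    """Chi-squared distance between two normalised histograms (0 = identical, 1 = disjoint)."""
    total = 0.0
    for key in set(h1) | set(h2):
        a = h1.get(key, 0.0)
        b = h2.get(key, 0.0)
        total += (a - b) ** 2 / (a + b)  # a + b > 0 for every key in the union
    return 0.5 * total
```

Two images would then be treated as similar when the distance between their histograms falls below a chosen cut-off. A histogram ignores where colours appear in the image, which is one reason keypoint descriptors are considered alongside it.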

Page 19:

Histogram

[Figure: accuracy vs. distance between histograms of two images]

• Different accuracy at different distances for every event

• Avg variance at 3 points (0.2, 0.4, 0.5) is 104.24
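One plausible reading of the "average variance" figure on this slide is: at each chosen threshold, take the variance of accuracy across events, then average those variances over the thresholds. The sketch below follows that reading. The interpretation itself, the use of population variance, and the function name `average_variance` are assumptions; the slides do not define the calculation.

```python
from statistics import mean, pvariance


def average_variance(accuracy_by_event, thresholds):
    """Average, over thresholds, of the across-event variance in accuracy.

    `accuracy_by_event` maps event name -> {threshold: accuracy in percent}.
    Population variance is an assumption; sample variance would also be defensible.
    """
    per_threshold = []
    for t in thresholds:
        values = [acc[t] for acc in accuracy_by_event.values()]
        per_threshold.append(pvariance(values))
    return mean(per_threshold)
```

Under this reading, a low value means a single threshold behaves consistently across events, which is the property a deployed system needs.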

Page 20:

DAISY features

[Figure: distance plots for similar images and for dissimilar images]

Charlie Hebdo: 570; Similar: 114; Dissimilar: 456

Page 21:

Histogram presentation

Page 22:

ORB

[Figure: accuracy plot]

• At a distance of 29, accuracy is above 85% for all events

• Avg variance at 3 points (29, 32, 35) is 17.6
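ORB descriptors are binary strings, so they are normally compared with the Hamming distance. The sketch below shows how a distance cut-off such as the 29 quoted on this slide could be applied. The aggregation rule (mean of nearest-neighbour distances) and the helper names are assumptions; the slides do not say how per-descriptor distances are combined into one score per image pair.

```python
def hamming(d1: bytes, d2: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    if len(d1) != len(d2):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(d1, d2))


def mean_nearest_distance(desc_a, desc_b):
    """For each descriptor in desc_a, find its closest descriptor in desc_b; return the mean distance."""
    if not desc_a or not desc_b:
        raise ValueError("both images need at least one descriptor")
    nearest = [min(hamming(a, b) for b in desc_b) for a in desc_a]
    return sum(nearest) / len(nearest)


def is_similar(desc_a, desc_b, threshold=29):
    """Hypothetical decision rule: similar when the mean nearest distance is within the threshold."""
    return mean_nearest_distance(desc_a, desc_b) <= threshold
```

In practice the descriptors would come from a keypoint library rather than being built by hand; the 32-byte length used in testing matches the usual 256-bit ORB descriptor.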

Page 23:

ORB + RANSAC

[Figure: accuracy vs. true match ratio by RANSAC]

• Avg variance at 3 points (0.33, 0.35, 0.37) is ~ 6.2
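The "true match ratio" can be understood as the share of putative keypoint matches that survive RANSAC as inliers. The sketch below illustrates the idea with a deliberately simplified motion model (pure translation) in plain Python. The translation-only model, the tolerance, the iteration count, and the function name are assumptions for illustration; a real pipeline would more likely fit a homography, for example with OpenCV's `cv2.findHomography` using the `cv2.RANSAC` method.

```python
import random


def ransac_true_match_ratio(matches, tolerance=3.0, iterations=100, seed=0):
    """Fraction of matches consistent with the best translation found by RANSAC.

    `matches` is a list of ((x1, y1), (x2, y2)) keypoint correspondences.
    """
    if not matches:
        return 0.0
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = 0
    for _ in range(iterations):
        # Hypothesise a translation from one randomly sampled match.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Count the matches that agree with this hypothesis.
        inliers = sum(
            1
            for (ax, ay), (bx, by) in matches
            if abs(ax + dx - bx) <= tolerance and abs(ay + dy - by) <= tolerance
        )
        best_inliers = max(best_inliers, inliers)
    return best_inliers / len(matches)
```

A pair of images would then be declared similar when the ratio exceeds a cut-off in the range examined on this slide (around 0.35).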

Page 24:

Comparing FPR and TPR

AIM: To minimise FPR and achieve the best TPR.

Compare TPR of ORB and Improved ORB at FPR = 0

Event                     ORB (TPR)    Improved ORB (TPR)
BabaRamRahim              0.6          0.7
Kulkarni Ink              0.62         0.78
ShaniShingnapur           0.97         1.0
KejriwalInsultsHanuman    0.90         1.0
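The comparison above fixes the false positive rate at zero and reports the true positive rate that remains. A minimal sketch of that calculation is given below, assuming each image pair has a similarity score where higher means more similar (for example a true match ratio) and a ground-truth label. The function name and the score convention are assumptions.

```python
def tpr_at_zero_fpr(scores, labels):
    """TPR at the loosest threshold that still rejects every negative pair.

    `scores`: similarity score per image pair (higher = more similar).
    `labels`: truthy for genuinely similar pairs, falsy otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives:
        raise ValueError("need at least one positive pair")
    # To keep FPR at 0, the threshold must sit above the best-scoring negative.
    strictest = max(negatives, default=float("-inf"))
    accepted = sum(1 for s in positives if s > strictest)
    return accepted / len(positives)
```

For a distance-based score, where lower means more similar, the scores can be negated before calling the function.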

Page 25:

Comparing accuracy with different input images

Page 26:

Images stitched and scaled

[Figure: accuracy vs. true match ratio]

Scaling factors for the above images: Cyan: (9.6*3.87), Magenta: (2.08*1.25), Yellow: (1.9*1.2), Red: (2.1*1.23), Green: (1.7*1.3), Blue: (2.7*2.3)

Page 27:

Image with text and scaled

[Figure: accuracy vs. true match ratio]

Scaling factors for the above images: Red: (1.66*1.7), Green: (7.45*5.25), Cyan: (1.2*1.08), Blue: original, size (600*815)

Page 28:

Cropped, text, stitched images

[Figure: accuracy vs. true match ratio]

Page 29:

Images having less colour

[Figure: accuracy vs. true match ratio]

Scaling factors: Green: (2.02*1.54), Red: (1.04*1.28), Cyan: (3.9*2.4), Magenta: (2.0*1.13), Blue: not scaled, Black: original

Page 30:

Comparing proposed system with Google reverse image search

Event: Kanhaiya Kumar

Proposed System: Total images: 892 (Most similar: 42; Moderately similar: 36; Least similar: 814)

Google Image Search: Total images: 48

Page 31:

[Figure: screenshots of the Proposed System and Google Reverse Image]

Page 32:

AIM

Image Retrieval System

Predicting the Spread

Page 33:

Data collection

Event                     Total tweets    Unique tweets
kulkarni                  1912            404
BabaRamRahim              420             117
KejriwalInsultsHanuman    1400            665
CharlieHebdo              1079            312
ShaniShingnapur           1230            183
RohithVemulla             3104            359

Page 34:

Features analysed

Image features      Tweet features    User features
Mean Red            Tweet length      Status Count
Mean Green          Sentiment         Followers Count
Mean Blue           Hashtag Count     Friends Count
Presence of face    Media Count       Follower_Followee Ratio
                    Mention Count     Verified
                    Tweet Age         Favourites Count
                                      Account Age

• Ethem F Can, Hüseyin Oktay, and R Manmatha. Predicting retweet count using visual cues. International Conference on Information & Knowledge Management, pages 1481–1484. ACM, 2013.

• Bob van de Velde, Albert Meijer, and Vincent Homburg. Police message diffusion on Twitter: analysing the reach of social media communications. Behaviour & Information Technology, 2015.
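A few of the features on this slide can be computed directly from pixel values, tweet text and user metadata. The sketch below is an illustrative subset only: the field names (`followers_count`, `friends_count`, `verified`), the counting of `#` and `@` characters as hashtag and mention counts, and the zero-friends guard are all assumptions. Sentiment, media count, face detection and the age features are left out because they need external tools or timestamps.

```python
def mean_rgb(pixels):
    """Mean red, green and blue values over an iterable of (r, g, b) tuples."""
    totals = [0, 0, 0]
    count = 0
    for r, g, b in pixels:
        totals[0] += r
        totals[1] += g
        totals[2] += b
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return tuple(t / count for t in totals)


def build_features(pixels, has_face, tweet_text, user):
    """Assemble a partial feature dictionary for one tweet (hypothetical layout)."""
    red, green, blue = mean_rgb(pixels)
    friends = user["friends_count"]
    return {
        "mean_red": red,
        "mean_green": green,
        "mean_blue": blue,
        "face_present": int(has_face),  # supplied by an external face detector
        "tweet_length": len(tweet_text),
        "hashtag_count": tweet_text.count("#"),  # crude character count
        "mention_count": tweet_text.count("@"),  # crude character count
        "followers_count": user["followers_count"],
        "friends_count": friends,
        # Guard against division by zero for accounts that follow nobody.
        "follower_followee_ratio": user["followers_count"] / max(friends, 1),
        "verified": int(user["verified"]),
    }
```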

Page 35:

Spearman correlation: Features & Retweet count

Features                      Correlation
Media count *                 -0.683
Mention count *               0.603
Favourites count * (log)      0.433
Sentiment *                   0.367
Tweet length *                0.363
Verified (binary) *           0.311
Hashtag count *               0.267
Tweet age *                   -0.259
Account age *                 0.109
Friends count * (log)         0.077
Face presence (binary) *      -0.066
Status count * (log)          -0.059
Follower-Followee ratio *     -0.053
Follower count *              0.052

• Ethem F Can, Hüseyin Oktay, and R Manmatha. Predicting retweet count using visual cues. International Conference on Information & Knowledge Management, pages 1481–1484. ACM, 2013.
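Spearman correlation is the Pearson correlation computed on ranks, so it measures how monotonically a feature moves with retweet count rather than how linearly. The pure-Python sketch below shows the calculation with tied values given their average rank. The function names are illustrative, and the slides do not say which tool produced the figures above.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks matter, a log transform of a feature leaves its Spearman correlation unchanged.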

Page 36:

Regression results

Model (10-fold cross-validation)    RMSE    MAE
Linear Regression                   1.67    1.37
SVR (C = 8, gamma = 2)              2.41    2.16
Random Forest (#trees = 60)         1.10    0.72

• Mean retweet count: 2.731

• Data count: 2040
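The two error measures in the table, and the fold structure of a 10-fold cross-validation, can be sketched as below. The interleaved fold assignment and the function names are assumptions; the slides do not describe how the folds were drawn or which library was used.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def k_fold_indices(n, k=10):
    """Yield (train_indices, test_indices) for k interleaved folds over n samples."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

RMSE penalises large errors more heavily than MAE and is never smaller than MAE on the same predictions, which matches the ordering in every row of the table.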

Page 37:

Conclusions

Improved ORB gives the best results, with accuracy above 90%

Accuracy drops if the input image is highly cropped or scaled (factor > ~3.5), or if the modifications are made to images with more colours

Lower-level image features such as mean red, green and blue do not correlate well with retweet count

Of the three models, Random Forest gives the best results

Page 38:

Acknowledgements

Department of Electronics and Information Technology (Deity), for funding the work

Niharika Sachdeva, PhD, IIIT-Delhi

Committee Members

Precog & CERC members, family and friends

Page 39:

Bibliography

Ethem F Can, Hüseyin Oktay, and R Manmatha. Predicting retweet count using visual cues. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, pages 1481–1484. ACM, 2013.

Maximilian Jenders, Gjergji Kasneci, and Felix Naumann. Analyzing and predicting viral tweets. In Proceedings of the 22nd International Conference on World Wide Web Companion, pages 657–664. International World Wide Web Conferences Steering Committee, 2013.

Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. ORB: an efficient alternative to SIFT or SURF. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 2564–2571. IEEE, 2011.

Bob van de Velde, Albert Meijer, and Vincent Homburg. Police message diffusion on Twitter: analysing the reach of social media communications. Behaviour & Information Technology, 2015.

Page 40:

Bibliography (I)

Lei Yu, Zhixin Yu, and Yan Gong. An improved ORB algorithm of extracting and matching. 2015.

Hacker Factor. The hacker factor. http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html, 2013.

Adrian Rosebrock. How-to: Python compare two images. http://www.pyimagesearch.com/2014/09/15/python-compare-two-images/, 2014.

Jain, Anil K., and Aditya Vailaya. "Image retrieval using color and shape." Pattern Recognition 29.8 (1996).