SC7 Hangout 2: Integrating Social Sensing for Security

INTEGRATING SOCIAL SENSING FOR SECURITY. 2nd BDE Hangout “Big Data in Secure societies”, 21 April 2016. George Giannakopoulos, George Kiomourtzis, NCSR “Demokritos”

Upload: bigdataeurope

Post on 09-Apr-2017


Page 1: SC7 Hangout 2: Integrating Social Sensing for Security

INTEGRATING SOCIAL SENSING FOR SECURITY

2nd BDE Hangout “Big Data in Secure societies” 21 April 2016

George Giannakopoulos, George Kiomourtzis

NCSR “Demokritos”

Page 2

BigDataEurope Pilot

Big Data Platform

Remote Sensing (Change Detection Workflow)

• query

• download

• pre-processing

• change detection

Social Sensing (Event Detection Workflow)

• monitor news

• cluster into events

• filter relevant events

• extract AoI (area of interest)

Page 3

Social media

◎ We know the following about user-generated content (social media, blogs, etc.):

o Volume: Amounts to terabytes per day

o Variety: Structure and formats; level of editing (curated/free text); languages; length

o Velocity: Real-time streams and requirements

o Veracity: Credibility and verification

o Value: Usable in risk management, brand monitoring, event detection

Page 4

What can media say?

◎ News reporting and social media:

o Report events

o Share

o Discuss events

◎ ...but

o People use free text

o There exists minimal annotation

o Reports are difficult to confirm

Page 5

Event detection workflow

◎ Listen and monitor news and social media sources

◎ Identify events

i. Compare documents

ii. Form clusters

iii. Determine importance

◎ Enrich and store

i. Extract meta-data and geo-location information

ii. Update semantic infrastructures

◎ Combine with satellite data to inform user

Page 6

Listen to media

● Define sources

a. News feeds (RSS)

b. Selected social media accounts (trust)

c. Generic streams (e.g. Twitter)

d. Keyword-based search

● Periodically check, or...

● ... consume a stream
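The "periodically check" option can be sketched as a polling step that parses an RSS feed and keeps only items it has not seen before. This is a minimal sketch, not the pilot's code: `SAMPLE_FEED`, `parse_items` and `poll_once` are hypothetical names, and the feed is an inline string standing in for a real HTTP response.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content standing in for a real HTTP response.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><guid>a1</guid><title>Oil falls on failed output freeze</title></item>
<item><guid>a2</guid><title>Baby rescued after 6 hours under quake rubble</title></item>
</channel></rss>"""


def parse_items(rss_xml):
    """Return (guid, title) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        # Fall back to <link> when a feed omits <guid>.
        guid = item.findtext("guid") or item.findtext("link") or ""
        title = (item.findtext("title") or "").strip()
        items.append((guid, title))
    return items


def poll_once(rss_xml, seen):
    """Return only the items not seen in earlier polls, and remember them."""
    fresh = []
    for guid, title in parse_items(rss_xml):
        if guid and guid not in seen:
            seen.add(guid)
            fresh.append((guid, title))
    return fresh


seen_ids = set()
first = poll_once(SAMPLE_FEED, seen_ids)   # both items are new
second = poll_once(SAMPLE_FEED, seen_ids)  # nothing new on the second poll
```

In a real deployment `poll_once` would be called on a timer per source; consuming a stream replaces the timer with a long-lived connection but can reuse the same de-duplication step.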

Page 7

Identify events

● Form pairs of news texts

● For each pair

○ Compare texts

○ If similarity above threshold

■ Consider related

● Form clusters, based on related pairs

○ Each cluster identifies an event

○ If cluster has a specific support

■ Keep as important
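The steps above can be sketched as pairwise comparison followed by grouping of related pairs. In this sketch the similarity is a plain word-overlap (Jaccard) score, used only as a stand-in, and clusters are the connected components of the "related" pairs. The names `jaccard` and `cluster_events`, the threshold of 0.5, and the reading of "support" as the number of texts in a cluster are all assumptions for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; a stand-in for the real comparison."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster_events(texts, threshold=0.5, min_support=2, similarity=jaccard):
    """Group texts whose pairwise similarity exceeds `threshold`.

    Related pairs are merged with union-find, so a cluster is a connected
    component of the "related" relation. Only clusters with at least
    `min_support` members are returned (as lists of indices into `texts`).
    """
    parent = list(range(len(texts)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(texts)), 2):
        if similarity(texts[i], texts[j]) > threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(texts)):
        clusters.setdefault(find(i), []).append(i)
    return [members for members in clusters.values() if len(members) >= min_support]
```

Comparing every pair is quadratic in the number of texts, which is acceptable for a sketch but is one reason a production system would batch or distribute this step.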

Page 8

Enrich and store events

● For every social media item

○ Compare to cluster

○ If above threshold

■ Attach item to event

● For every cluster

○ Get metadata (date, location)

○ Extract location names

○ Request geo-location data

● Store meta-data in semantic infrastructure

○ Keep the links to sources
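The first step, attaching social media items to events, can be sketched as follows. This is an illustration under assumptions: each event is represented by a single text, an item is attached to its best-matching event only, and `token_containment`, `assign_posts` and the threshold of 0.3 are hypothetical choices not taken from the slides.

```python
def token_containment(post, event_text):
    """Fraction of the post's words that also appear in the event text."""
    post_tokens = set(post.lower().split())
    event_tokens = set(event_text.lower().split())
    if not post_tokens:
        return 0.0
    return len(post_tokens & event_tokens) / len(post_tokens)


def assign_posts(posts, events, threshold=0.3, similarity=token_containment):
    """Map each event id to the ids of the posts attached to it.

    `posts` and `events` are dicts of id -> text. A post is attached to the
    most similar event, and only if that similarity exceeds `threshold`.
    """
    assignments = {}
    for post_id, post_text in posts.items():
        best_event, best_score = None, threshold
        for event_id, event_text in events.items():
            score = similarity(post_text, event_text)
            if score > best_score:
                best_event, best_score = event_id, score
        if best_event is not None:
            assignments.setdefault(best_event, []).append(post_id)
    return assignments
```

Posts that match no event above the threshold are simply left unattached, which keeps unrelated chatter out of the stored event record.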

Page 9

Many sources, many articles (example)

Baby rescued after 6 hours under quake rubble (CNN)

Oil falls on failed output freeze; Dow above 18,000 (CNBC, Reuters)

GLOBAL MARKETS - Shares follow oil down after Doha disappointment (Reuters)

U.S.-Philippines enhance military alliance, China isn't happy (CNN)

U.S. warily eyes New Year's threats in cities abroad (IBT, Reuters)

Page 10

Clustering (example)

Baby rescued after 6 hours under quake rubble (CNN)

Oil falls on failed output freeze; Dow above 18,000 (CNBC, Reuters)

GLOBAL MARKETS - Shares follow oil down after Doha disappointment (Reuters)

Similarity: 0.2

Similarity: 0.8 (in same cluster)

● N-gram graphs

● Markov Clustering

● Transitive closure
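To make the first item concrete, here is a simplified sketch of an n-gram graph comparison. A text becomes a graph whose nodes are character n-grams and whose weighted edges link n-grams that occur close to each other; two texts are then compared through their shared edges. The directed edges, the window size, and the particular "value similarity" formula below are simplifying assumptions, and the exact operators used in the pilot are not given in the slides.

```python
from collections import Counter


def ngram_graph(text, n=3, window=3):
    """Build an n-gram graph as a Counter of (gram, later_gram) -> weight.

    Each character n-gram is linked to the n-grams that start within the
    next `window` positions; the weight counts how often that happens.
    """
    text = text.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    edges = Counter()
    for i, gram in enumerate(grams):
        for neighbour in grams[i + 1:i + 1 + window]:
            edges[(gram, neighbour)] += 1
    return edges


def value_similarity(g1, g2):
    """Score in [0, 1]: shared edges weighted by how close their weights are,
    normalised by the size of the larger graph."""
    if not g1 or not g2:
        return 0.0
    shared = g1.keys() & g2.keys()
    total = sum(min(g1[e], g2[e]) / max(g1[e], g2[e]) for e in shared)
    return total / max(len(g1), len(g2))
```

A score from `value_similarity` can be thresholded to decide which pairs are related; Markov Clustering or a transitive closure over the related pairs then turns those pairs into clusters.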

Page 11

Identify Events (example)

Match

No match

Page 12

Enrich and store events (example)

News feed items in cluster: 5

Title: Shares follow oil down after Doha disappointment

ID: 5-88affec1f2d6a28ea9e332087a0978bc-14685

Description:

The plunge in crude oil prices took a large slice out of commodity currencies, pushing the dollar almost 1 percent higher against its Canadian counterpart to C$1.2926 CAD=D4.

Locations extracted:

[Brazil, Britain, Europe, Hong Kong, Iran, Japan, London, Saudi Arabia, Washington]

Related Tweets (IDs):

[722035074401705984]

Locations extracted:

[Brazil: [[[35.31,25.3],[35.31,19.25],[41.09,19.25],[41.09,25.3],[35.31,25.3]]],

Britain: <polygon>...]

Strabon
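Geospatial RDF stores such as Strabon work with geometries expressed as Well-Known Text (WKT). As an illustration, a closed coordinate ring like the one in the example can be turned into a WKT polygon string as sketched below. The helper name `ring_to_wkt` is hypothetical, the axis order of the coordinates is not stated in the slides and is passed through unchanged, and whether the pilot serialises polygons this way is an assumption.

```python
def ring_to_wkt(ring):
    """Serialise one closed ring of [x, y] pairs as a WKT POLYGON string.

    The ring must repeat its first point as its last point, as WKT requires.
    Coordinates are written in the order given; no axis swapping is done.
    """
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("a polygon ring needs at least 4 points and must be closed")
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON (({coords}))"


# The ring from the example above, copied as-is.
example_ring = [[35.31, 25.3], [35.31, 19.25], [41.09, 19.25], [41.09, 25.3], [35.31, 25.3]]
example_wkt = ring_to_wkt(example_ring)
```

Keeping the event identifier and source links next to the geometry is what later allows a location-based query to lead back to the original articles and tweets.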


Page 13

Summary

● We listen to what news and social media say

● We analyze and enrich

● We update the semantic infrastructure in real-time

● We support

○ Discovery of interesting events

○ Location-based focus and, thus...

○ ...Validation with satellite data

Page 14

Conclusion and future work

● (Social) Media data as a security resource

● Big data infrastructure

● Semantic enrichment in real-time

Next steps:

● Multi-threaded to distributed

● Fine-tune location extraction

● Fine-tune clustering

● Fine-tune post (tweet) assignment

Page 15

Thank you

George Giannakopoulos, George Kiomourtzis

E-mail: [email protected]

Icons from flaticon.com