a large-scale analysis of regional tendency of twitter...

29
A Large-scale Analysis of Regional Tendency of Twitter Photos Using Only Image Features Tetsuya Nagano, Takumi Ege, Wataru Shimoda and Keiji Yanai Department of Informatics, The University of Electro- Communications, Tokyo 2017 UEC Tokyo.

Upload: others

Post on 18-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

A Large-scale Analysis of Regional Tendency of Twitter Photos Using

Only Image Features

Tetsuya Nagano, Takumi Ege, Wataru Shimoda and Keiji Yanai Department of Informatics, The University of Electro-

Communications, Tokyo

ⓒ 2017 UEC Tokyo.

Page 2: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Introduction

• Many people to post photos on open SNSs such as Twitter and Instagram

Page 3: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Introduction

• Flickr

– Textual information attached to photos for photo sharing

• Twitter, Instagram

– Do not tend to represent the contents of photos directly

– Not for search but for explaining additional information

Page 4: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Introduction

• Analyze the differences on idea of the privacy issue at SNSs depends on culture and history

Page 5: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Related Works

• Typical works on Twitter photo,

– Event detection

– Using both text analysis and image analysis

• Our work

– We analyze a million-scale of Twitter photos without using any textual information

– We aim to detect the differences of regional tendency of posted photos to Twitter

Page 6: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Overview of the Proposed Method

• (1) Gather a million-scale of geotagged photos from the Twitter stream

• (2) Extract image features

• (3) Carry out clustering of them

• (4) Classify only large clusters

• (5) Compare the ratios of five categories between eight regions over the world

Page 7: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Detail of the Method

• We are continuously gathering geotagged photo tweets from the Twitter stream

• Two million geotagged Twitter photos which we had gathered for half years in 2016.

MessagesLocation

Image

LocationImage

Page 8: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Image Classification

• AlexNet (CaffeNet)

– Extract CNN features of 1000 images per one minutes

• FC6 layer of CaffeNet is 4096

• Principal Component Analysis (PCA)

– 4096-d features into 128-d featuresInput

3x256x256

Layer1Conv

96x24x24

Layer2Conv

256x12x12

Layer3Conv

512x12x12

Layer4Conv

1024x12x12

Layer5Conv

1024x6x6

Layer6full

3072x1x1

Layer7full

4096x1x1

Layer8full

1000x1x1

Output

1000x1x1

Page 9: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Clustering

• K-means clustering.

– k = 100

• A small pre-liminary experiments with 1000 Twitter photos

– To examine the difference on clustering results between the cases with and without PCA-based feature compression

Page 10: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Clusteling results (4096-d)

Page 11: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Clusteling results (128-d)

Page 12: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

5 Experiments

• Dataset

– Collected from January to June in 2016

– 2,161,000 geotagged Twitter images

• Feature

– CNN(128-d compressed by PCA)

• K-means

– K-means with one-tenth images

• Assigned rest of images into the nearest clusters

– K=100

Page 13: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

5.3 How to Analyze

• East Asia, North America, South America, Europe, Africa, Middle East, South Asia and South-East Asia, Oceania

Page 14: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

5.3 How to Analyze

• pre-selected photo genres.

– “people”

– “building”

– “document”

– “scene”

– “food”

Page 15: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Analysis of Regional Tendency of Photo Genres

1st Europe183,250枚

2rd South east Asia145,088枚

3rd North America124,141枚

Page 16: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Page 17: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

• East Asia

– No people photos

– Many building and food photos

– The total ratio of building and food photos were more than 70%

People

Food

Landscape

Document

Building

Page 18: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

• North America

– The ratio of people and building were high more than 60%.

People

Food

Landscape

Document

Building

Page 19: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

• South America

– People photos are the most popular genre (67%)

People

Food

Landscape

Document

Building

Page 20: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Analysis of Regional Tendency of Photo Genres

• Europe

– The number of posted photos was the most large

– The genres were well balanced.

• Africa

– Almost no building, scene and food photos were posted

– People photos occupied 70%.

Page 21: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

• Middle East

– Although the number of posts were fewer than Europe, all the five genres were balanced as well

People

Food

Landscape

Document

Building

Page 22: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Analysis of Regional Tendency of Photo Genres

• South Asia

– More than half of the photos were document photos.

– This tendency was not

observed in other regions

• SouthEast Asia

– People photos are the most and in addition food photos was the second most

ss

Page 23: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Sourth America:People

Page 24: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Africa:People

Page 25: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

East Asia:Food

Page 26: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Sourth East Asia:Food

Page 27: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Discussion

• Tendency

– East Asia and East-South Asia,

• Food photos are relatively high

– South America, South Asia and East-South Asia

• people photos are exceptionally high.

– Europe and MiddleEast

• well balanced.

• East Asia enjoy posting food photos

• South America, South Asia and EastSouth Asia like to post people photos without caring privacy issue.

Page 28: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.

Conclusions

• Analyzed the differences of the tendency of the photo genres of posted photos.

– Gathered geotagged Twitter photos

– Extracted CNN features

– Carried out clustering

• Future works

– making typical genres more fine-grained

• “people photo” -> “selfy” and“group photo”

Page 29: A Large-scale Analysis of Regional Tendency of Twitter ...img.cs.uec.ac.jp/pub/conf18/190328nagano_2_ppt.pdf · Overview of the Proposed Method •(1) Gather a million-scale of geotagged

ⓒ 2017 UEC Tokyo.