Big Data, Big Tourism: Tourism and Mechanics

https://www.slideshare.net/sirmmo/big-data-big-tourism

Upload: marco-montanari

Post on 29-Jan-2018

What are «Big Data»?

• Excel gets stuck working on a dataset? => «medium» data

• Stata/R struggle with a dataset? => «big» data

Where do we get the data?

• Tourists
  • Have sensors
  • Are sensors
  • Are actors

• Attractions
  • Are sensors
  • Are actors

• Hotels, restaurants
  • Are sensors
  • Have sensors

Can we access the data?

• Tourists
  • Have sensors
  • Are sensors
  • Are actors

• Attractions
  • Are sensors
  • Are actors

• Hotels, restaurants
  • Are sensors
  • Have sensors

Private Sector / Government

Open(able/ish) Data

Almost always

Ok so who owns that data?

• Government
  • Bureaucracy-driven data
  • Incoherent
  • Inconsistent
  • Irregular production

• Private Sector
  • Deeply integrated with user experience
  • Very «behavioral», and as such very «real»
  • Very business-oriented metrics

Scraping

• Time consuming

• Power consuming

• Illegal (up to a certain point)

• Unavoidable (up to a certain point)

Scraping

• It relies on the fact that (most of) the web is based on HTML
  • And HTML is text
  • And JavaScript is text
  • And CSS is text

• Everything can be read before the render…

• Or after the render
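
As an illustration of "HTML is text", here is a minimal sketch that reads a page source before any rendering, using only Python's standard `html.parser` module. The page content, the class names (`hotel`, `name`, `price`) and the helper `parse_hotels` are all assumptions made up for this example; a real site would have its own markup. Reading a page after the render would instead need a browser driver or a headless browser, which this sketch does not cover.

```python
from html.parser import HTMLParser

# Hypothetical page source, as it would arrive over the network before rendering.
PAGE = """
<html><body>
  <div class="hotel"><span class="name">Hotel Alpha</span><span class="price">89</span></div>
  <div class="hotel"><span class="name">Hotel Beta</span><span class="price">120</span></div>
</body></html>
"""


class HotelParser(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.current = None  # class of the span currently being read, if any
        self.rows = []       # one dict per <div class="hotel">

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "hotel":
            self.rows.append({})          # start a new record
        elif tag == "span" and cls in ("name", "price"):
            self.current = cls            # remember which field the text belongs to

    def handle_data(self, data):
        if self.current and self.rows:
            self.rows[-1][self.current] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.current = None


def parse_hotels(html_text):
    """Return a list of {"name": ..., "price": ...} dicts found in the text."""
    parser = HotelParser()
    parser.feed(html_text)
    return parser.rows


hotels = parse_hotels(PAGE)
print(hotels)
```

The parser never executes JavaScript or applies CSS; it only walks the text, which is why pages that fill in their content after loading appear empty to this approach.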

Tools

• Not easy for «complex» sites
  • Some cases come up

• Some tools help
  • Knowledge of an XML query language or CSS may be required

• Some tools are very advanced
  • Selenium browser driver
  • «headless» browsers

• Chrome
  • https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd?hl=en
  • https://chrome.google.com/webstore/detail/web-scraper/jnhgnonknehpejjnehehllkliplmbmhn?hl=en
  • https://chrome.google.com/webstore/detail/advanced-web-scraper/gpolcofcjjiooogejfbaamdgmgfehgff

• Firefox
  • https://addons.mozilla.org/en-US/firefox/addon/datascraper/

• Web
  • https://www.import.io/
  • https://scrapinghub.com/portia/
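
To give an idea of what an XML query looks like, here is a small sketch that assumes the query language meant is XPath. It uses Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset and needs well-formed XML. The snippet, its class names and its links are invented for the example; real pages are rarely well-formed and usually need a more tolerant parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment of a results page.
SNIPPET = """
<ul id="results">
  <li class="hotel"><a href="/h/1">Hotel Alpha</a></li>
  <li class="ad"><a href="/promo">Sponsored</a></li>
  <li class="hotel"><a href="/h/2">Hotel Beta</a></li>
</ul>
"""

root = ET.fromstring(SNIPPET)

# XPath-style query: every <a> directly inside an <li> whose class is "hotel".
# The "ad" entry is skipped because its class attribute does not match.
links = [(a.text, a.get("href")) for a in root.findall(".//li[@class='hotel']/a")]

print(links)
```

The same selection written as a CSS selector would be `li.hotel > a`; most scraping tools accept one or both notations.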

Cases and issues of scraping

• Booking.com
  • Amazing website
  • Easy navigation for the user

• Issues
  • They know!!!
  • The website gets a complete structural overhaul every 6-9 months
  • They tend to hate scrapers
  • The webpage is empty at the beginning

Cases and issues of scraping

• AirBnB
  • Nice navigation
  • Full overhaul every 3 months

• Issues
  • The page really tracks what kind of user is accessing it
  • Only 13 pages are visible
  • For the major areas, they are randomly generated every day

Cases and issues of scraping

• Weather
  • Many sources
  • Many formats

• Issues
  • Normalization of vocabulary
    • Bad weather == Rain == Rainy == Cloud Icon == ???
  • Normalization of ranges
  • Normalization of numbers
  • Normalization of periodicity
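
The vocabulary and number issues above can be sketched as a small normalization step. Everything here is an assumption for illustration: the mapping table `CANONICAL`, the record fields (`condition`, `temp`, `unit`) and the decision to treat "bad weather" as rain are choices an analyst would have to make per source, not facts about any real weather feed.

```python
# Hypothetical mapping from source-specific labels to one shared vocabulary.
# Whether "bad weather" should really mean "rain" is exactly the open question
# raised above; it is decided arbitrarily here.
CANONICAL = {
    "rain": "rain", "rainy": "rain", "bad weather": "rain",
    "sun": "clear", "sunny": "clear", "clear": "clear",
    "cloud": "cloudy", "cloudy": "cloudy", "overcast": "cloudy",
}


def normalize_condition(label):
    """Map a free-text label to the shared vocabulary, or "unknown"."""
    return CANONICAL.get(label.strip().lower(), "unknown")


def to_celsius(value, unit):
    """Convert a temperature given in C, F or K to degrees Celsius."""
    value = float(value)
    unit = unit.strip().upper()
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0
    if unit == "K":
        return value - 273.15
    raise ValueError(f"unsupported unit: {unit!r}")


def normalize(record):
    """Turn one source-specific record into the common shape."""
    return {
        "condition": normalize_condition(record["condition"]),
        "temp_c": round(to_celsius(record["temp"], record["unit"]), 1),
    }


sample = normalize({"condition": " Rainy ", "temp": "68", "unit": "F"})
print(sample)
```

Labels that are not in the table, such as an icon name, come back as "unknown" rather than being guessed, so they can be reviewed by hand.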

Apps

• Questionnaire to get the user to explicitly give data

• Information-driven application to track user data

• Gamification and/or information platform to process data and give it back

Explicit data

• Relies on the user’s knowing actions

• Requires real, willing acceptance to share information

• Stops at political correctness

• Implies (almost always) anonymity

• Questionnaire

• In-place review

• In-place comment

• Bureaucracy

Behavioral data

• Almost always true

• Difficult to get

• Easily contextualizable

• Interactive

• Interconnected

• Application

• Platform

• Social Media integration

• Gamification

• Social Media involvement

Cool, so what can be done?

Getting Data

• Municipalities are setting up open wireless networks
  • Users can be tracked
  • Services can be offered (and instrumented)

• Museums can track users within their premises

• Social Media interactions

Using Data

• Analysis of context of specific behaviours

• Automated storytelling for city visits

• Pricing methodologies

• Destination brand analysis

Big and Big-ish Data Tools

• The problem is computational power

• Lots of work on AI
  • Classification
  • Generation
  • Machine Learning
  • Correlations

• Data Warehouses
  • Mondrian - http://community.pentaho.com/projects/mondrian/

• Big Data DBs
  • Cassandra - http://cassandra.apache.org/
  • Hadoop - http://hadoop.apache.org/

• Big Data Search
  • BigQuery - https://cloud.google.com/bigquery/
  • GraphQL - http://graphql.org/

• Big Data AI/ML
  • TensorFlow - https://www.tensorflow.org/
  • SciPy - https://www.scipy.org/
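
As a tiny example of the "correlations" item, the sketch below computes a Pearson correlation coefficient in plain Python. The function name `pearson` and the two series are made up for illustration and are not real measurements; at scale, the same calculation would be delegated to one of the tools listed above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented illustrative numbers: visitors per day and average room price.
daily_visitors = [120, 150, 90, 200, 170]
avg_room_price = [80, 95, 70, 130, 110]

r = pearson(daily_visitors, avg_room_price)
print(round(r, 3))
```

A value near 1 or -1 only says the two series move together; it does not say which one drives the other.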

A few open questions

• Impact of crowdfunding on tourism-bound projects

• Impact of meta-search-engines on pricing

• Impact (or lack thereof) of destination information websites on user decisions

• How can the user be «vetted» in order to tailor the tourist experience around her?
  • Would such a vetting process affect customer return decisions?

Thanks! Questions?

• ingmmo.com, https://medium.com/@ingmmo
• sirmmo
• http://it.linkedin.com/in/montanarim/
• https://www.facebook.com/marco.montanari
• marco.montanari
