Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017

Little Big Data

Analyze your own data with the Elastic Stack

inovex Meetup Köln, 18.09.2017

Fun: Marathon, Triathlon, Lactate, Garmin

Work (can be fun, too): Search, Elastic Stack, Crawling, Geo Data

Wolfgang Schoch, Data Management & Analytics, inovex

You create a lot of data every day

Some of it on purpose

So many devices

So many possibilities

• Vendor Apps

• Third-Party Apps

Not enough?

Reclaim your data!

Just 10 clicks to create a cool biking tour

Thanks to billions of tracks uploaded by millions of users.

Motivation

• Explore your data

• Discover new patterns

• Combine data from different sources

• Customize. Everything.

• Most important: Why not?

Example: Garmin Forerunner 920xt

• GPS Multisport Watch

• Integrates with the Garmin Connect online portal and mobile app

• Natively writes .fit files

• Other formats (TCX, GPX) available via Garmin Connect

Mission

• Pull the whole activity history from Garmin Connect

• Normalize and transform it as needed

• Load it into Elasticsearch

• Visualize the hard work in Kibana

• Gain new insights

Why Elasticsearch?

• The Swiss Army knife for search and analytics

• My everyday technology

• Easy to use

Export activities as CSV

CSV Export

Fixed Columns

Fixed Length

Export Single Activity

• Original is binary .fit

• GPX is XML, contains only route

• TCX is XML, contains everything

TCX export works, but…

• There is no bulk export (I created my own with some Python magic)

• It can be a huge amount of data (it contains GPS data)

• The X in TCX stands for XML, and nobody likes XML (except machines)
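Reading a TCX file does not have to hurt, though. The following is a minimal sketch using only Python's standard library, not the script used for the talk. It assumes the TrainingCenterDatabase v2 namespace and flattens each trackpoint into a dict; the output key names and the sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of Garmin's TrainingCenterDatabase v2 schema (assumed here).
NS = {"tcx": "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"}

# Tiny hand-written sample; a real export contains thousands of trackpoints.
SAMPLE_TCX = """<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
  <Activities><Activity Sport="Running"><Lap><Track>
    <Trackpoint>
      <Time>2017-09-18T17:00:00Z</Time>
      <Position>
        <LatitudeDegrees>50.94</LatitudeDegrees>
        <LongitudeDegrees>6.96</LongitudeDegrees>
      </Position>
      <HeartRateBpm><Value>140</Value></HeartRateBpm>
    </Trackpoint>
    <Trackpoint>
      <Time>2017-09-18T17:00:01Z</Time>
    </Trackpoint>
  </Track></Lap></Activity></Activities>
</TrainingCenterDatabase>"""


def _value(node, path, cast=float):
    """Return the converted text of a child element, or None if it is absent."""
    child = node.find(path, NS)
    return cast(child.text) if child is not None and child.text else None


def parse_trackpoints(tcx_text):
    """Flatten every <Trackpoint> into a dict; missing fields become None."""
    root = ET.fromstring(tcx_text)
    return [
        {
            "time": _value(tp, "tcx:Time", str),
            "lat": _value(tp, "tcx:Position/tcx:LatitudeDegrees"),
            "lon": _value(tp, "tcx:Position/tcx:LongitudeDegrees"),
            "heart_rate": _value(tp, "tcx:HeartRateBpm/tcx:Value", int),
        }
        for tp in root.iterfind(".//tcx:Trackpoint", NS)
    ]


points = parse_trackpoints(SAMPLE_TCX)
```

Trackpoints without a GPS fix or heart-rate reading simply yield None for those fields, so later steps can decide whether to drop or keep them.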

I ❤ XML

Why not use the REST API?

• Did what every expert should do: googled for help and advice

• Found a couple of semi-official APIs from Garmin

• Endpoint for bulk export of activities

Normalize & Transform

• JSON is a good starting point

• Different units for attributes like speed or cadence per activity type

• Select relevant attributes
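The per-activity unit handling could look roughly like this. It is a sketch under assumptions: the field names (`type`, `avg_speed_ms`, `distance_m`) and the idea that raw speed arrives in metres per second are hypothetical and not taken from the Garmin export.

```python
def normalize(activity):
    """Keep a few relevant attributes and convert speed per activity type."""
    kind = activity["type"]
    speed_ms = activity.get("avg_speed_ms")  # hypothetical field, metres/second
    doc = {
        "type": kind,
        "distance_km": round(activity["distance_m"] / 1000, 2),
    }
    if not speed_ms:
        # No usable speed (missing or zero): skip the derived fields.
        return doc
    if kind == "running":
        # Runners usually think in pace: minutes per kilometre.
        doc["pace_min_per_km"] = round(1000 / speed_ms / 60, 2)
    elif kind == "cycling":
        doc["speed_kmh"] = round(speed_ms * 3.6, 1)
    elif kind == "swimming":
        # Swimmers usually think in minutes per 100 metres.
        doc["pace_min_per_100m"] = round(100 / speed_ms / 60, 2)
    return doc


run = normalize({"type": "running", "avg_speed_ms": 3.0, "distance_m": 10000})
ride = normalize({"type": "cycling", "avg_speed_ms": 8.0, "distance_m": 42500})
```

Storing one clearly named field per unit (instead of a single overloaded `speed` field) keeps later Kibana visualizations from mixing paces and speeds.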

Load into Elasticsearch

• Straightforward via the _bulk API

• Dynamic mapping is a good starting point

• Define a minimal template, e.g. to cover GeoPoints
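As a rough illustration of these two points, the sketch below builds a `_bulk` request body (newline-delimited JSON: one action line plus one source line per document, with a trailing newline) and a minimal template that only pins the geo field. The index name `activities`, the field names, and the sample document are assumptions. The template uses the layout of recent Elasticsearch versions; older versions, including those current at the time of the talk, expect a different shape.

```python
import json

# Minimal template: everything else is left to dynamic mapping.
# Index pattern and field name are hypothetical.
TEMPLATE = {
    "index_patterns": ["activities*"],
    "template": {
        "mappings": {
            "properties": {
                "start_location": {"type": "geo_point"},
            }
        }
    },
}


def bulk_body(index, docs, id_field="activity_id"):
    """Serialize docs into the NDJSON payload expected by the _bulk API."""
    lines = []
    for doc in docs:
        # Action line: index the following document under a stable id.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc[id_field]}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {
        "activity_id": "a1",
        "type": "running",
        "start_location": {"lat": 50.94, "lon": 6.96},
    }
]
body = bulk_body("activities", docs)
```

The resulting string would be sent as the request body to the `_bulk` endpoint with the content type `application/x-ndjson`; using the activity id as `_id` makes repeated imports overwrite documents instead of duplicating them.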

Kibana Dashboard

Add another data source

• Discover relations between different data

• Gain insights

• This is where the fun begins

Example: Weight Data from Withings

• Export CSV from vendor portal

• Import with Logstash into Elasticsearch

• Play with Timelion and Visual Builder

Learnings

• Everyone produces a lot of interesting data every single day

• It is always interesting to take a close look at it

• Start with a small amount of data, avoid confusion

• Try to find answers to questions like "Is there a relation between X and Y?"
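One simple way to put a number on such a question is a correlation coefficient. The sketch below computes Pearson's r in plain Python; the two series (weekly running distance and weekly average weight) are invented sample values, and a strong correlation on a handful of points says nothing about cause and effect.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented sample values, one entry per week.
weekly_km = [20, 35, 42, 50, 61]
weight_kg = [78.4, 78.0, 77.5, 77.1, 76.3]
r = pearson(weekly_km, weight_kg)
```

A value of r near -1 or +1 suggests a strong linear relation, a value near 0 suggests none; it is only a starting point for a closer look in Kibana.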

Next Steps

• Use Spark / Zeppelin to analyze and preprocess data

• Try to build heat maps of my preferred GPS tracks
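The core of a heat map is counting how often each map cell is visited. A minimal sketch of that idea, assuming trackpoints are available as `(lat, lon)` tuples and using a hypothetical cell size of 0.001 degrees:

```python
from collections import Counter
from math import floor


def heat_cells(points, cell_deg=0.001):
    """Count trackpoints per grid cell; cells are cell_deg degrees wide."""
    counts = Counter()
    for lat, lon in points:
        # Integer cell indices; nearby points share the same key.
        counts[(floor(lat / cell_deg), floor(lon / cell_deg))] += 1
    return counts


# Invented sample points: two close together, one further north.
cells = heat_cells([(50.9401, 6.9601), (50.9404, 6.9603), (50.9501, 6.9601)])
hottest = cells.most_common(1)[0]
```

Cells with high counts are the stretches run or ridden most often; the cell size trades detail against noise.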

Questions?