The Weather of the Century Part 3: Visualization

A. Jesse Jiryu Davis, Senior Python Engineer, MongoDB (#MongoDBWorld)

Upload: mongodb

Post on 29-Aug-2014


Category: Technology



DESCRIPTION

MongoDB natively supports geospatial indexing and querying, and it integrates easily with open source visualization tools. In this presentation, learn high-performance techniques for querying and retrieving geospatial data, and how to create a rich visual representation of global weather data using Python, Monary, and Matplotlib.

TRANSCRIPT

Page 1: The Weather of the Century Part 3: Visualization

A. Jesse Jiryu Davis

#MongoDBWorld

The Weather of the Century Part III: Visualization

Senior Python Engineer, MongoDB

Page 2: The Weather of the Century Part 3: Visualization

Serious MongoDB Talk

Page 3: The Weather of the Century Part 3: Visualization

Serious MongoDB Talk

Database

Page 4: The Weather of the Century Part 3: Visualization

Serious MongoDB Talk

Page 5: The Weather of the Century Part 3: Visualization

This Talk

Page 6: The Weather of the Century Part 3: Visualization

Where’s the data from?

Page 7: The Weather of the Century Part 3: Visualization

Where’s the data from?

Page 8: The Weather of the Century Part 3: Visualization

How Much Is There?

Page 9: The Weather of the Century Part 3: Visualization

Deployment

Page 10: The Weather of the Century Part 3: Visualization

Visualization

Page 11: The Weather of the Century Part 3: Visualization

Visualization Pipeline

MongoDB -> PyMongo -> Python dicts -> NumPy -> SciPy -> Matplotlib

Page 12: The Weather of the Century Part 3: Visualization

import numpy
import pymongo

data = []
db = pymongo.MongoClient().my_database

for doc in db.collection.find(query):
    data.append((
        doc['position']['coordinates'][0],
        doc['position']['coordinates'][1],
        doc['airTemperature']['value']))

arrays = numpy.array(data)

Page 13: The Weather of the Century Part 3: Visualization

# NumPy column access syntax.
lons = arrays[:, 0]
lats = arrays[:, 1]
temps = arrays[:, 2]
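The row-to-column step can be tried without a running server. This is a minimal sketch, assuming two made-up documents shaped like the talk's weather records; the `docs` list and its values are hypothetical stand-ins for what the PyMongo cursor would yield.

```python
import numpy

# Hypothetical sample documents (not real query results), shaped like
# the weather records in the talk.
docs = [
    {"position": {"type": "Point", "coordinates": [-94.6, 39.117]},
     "airTemperature": {"value": 45, "quality": "1"}},
    {"position": {"type": "Point", "coordinates": [-73.9, 40.7]},
     "airTemperature": {"value": 38, "quality": "1"}},
]

# One (lon, lat, temp) row per document, as in the PyMongo loop.
data = [
    (doc["position"]["coordinates"][0],
     doc["position"]["coordinates"][1],
     doc["airTemperature"]["value"])
    for doc in docs
]
arrays = numpy.array(data)  # 2-D array with shape (n_docs, 3)

# Slice out each column as a 1-D array.
lons = arrays[:, 0]
lats = arrays[:, 1]
temps = arrays[:, 2]
```

Each slice is a view on one column of the 2-D array, so the three names share memory with `arrays` rather than copying it.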

Page 14: The Weather of the Century Part 3: Visualization
Page 15: The Weather of the Century Part 3: Visualization

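import numpy
from scipy.interpolate import griddata
from matplotlib import pyplot

# Regular 1-degree grid: xs are longitudes, ys are latitudes.
xs = numpy.linspace(-180, 180, 361)
ys = numpy.linspace(-90, 90, 181)
grid_x, grid_y = numpy.meshgrid(xs, ys)

# scipy.interpolate.griddata takes (points, values, grid points).
zs = griddata((lons, lats), temps,
              (grid_x, grid_y),
              method='linear')

pyplot.contour(xs, ys, zs)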

Magic!

Also magic!
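To see what the gridding step produces, here is a small sketch on synthetic data. The station positions are random and the temperature field is invented (a linear function of latitude), so none of the numbers come from the weather dataset. It uses `scipy.interpolate.griddata` with point coordinates passed as a `(lons, lats)` tuple.

```python
import numpy
from scipy.interpolate import griddata

# Synthetic "stations": random positions with a made-up temperature
# that varies linearly with latitude (illustration only).
rng = numpy.random.default_rng(0)
lons = rng.uniform(-180, 180, 500)
lats = rng.uniform(-90, 90, 500)
temps = 15.0 - 0.25 * lats

# Regular 1-degree grid covering the globe.
xs = numpy.linspace(-180, 180, 361)
ys = numpy.linspace(-90, 90, 181)
grid_x, grid_y = numpy.meshgrid(xs, ys)

# Linear interpolation from scattered stations onto the grid.
zs = griddata((lons, lats), temps, (grid_x, grid_y), method='linear')

# Grid cells outside the convex hull of the stations come back as NaN.
inside = ~numpy.isnan(zs)
```

Because the invented field is linear, the linear method reproduces it exactly wherever `inside` is true, which makes this a convenient sanity check before switching to real station data.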

Page 16: The Weather of the Century Part 3: Visualization
Page 17: The Weather of the Century Part 3: Visualization

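import numpy
from scipy.interpolate import griddata
from matplotlib import pyplot

# Regular 1-degree grid: xs are longitudes, ys are latitudes.
xs = numpy.linspace(-180, 180, 361)
ys = numpy.linspace(-90, 90, 181)
grid_x, grid_y = numpy.meshgrid(xs, ys)

# scipy.interpolate.griddata takes (points, values, grid points).
zs = griddata((lons, lats), temps,
              (grid_x, grid_y),
              method='linear')

pyplot.contour(xs, ys, zs)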

Page 18: The Weather of the Century Part 3: Visualization

Triangulation

Page 19: The Weather of the Century Part 3: Visualization

Triangulation

Page 20: The Weather of the Century Part 3: Visualization

What temperature?

Triangulation
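The triangulation step can be sketched with `scipy.spatial.Delaunay`, which builds the kind of triangle mesh that linear scattered-data interpolation works on. The four station positions and the target point below are hypothetical, chosen only so the result is easy to inspect.

```python
import numpy
from scipy.spatial import Delaunay

# Hypothetical station positions (x, y); not taken from the dataset.
stations = numpy.array([
    [0.0, 0.0],
    [4.0, 0.0],
    [0.0, 4.0],
    [5.0, 5.0],
])

tri = Delaunay(stations)
# Each row of tri.simplices holds the indices of three stations
# that form one triangle.

# Which triangle contains the point whose temperature we want?
target = numpy.array([[1.0, 1.0]])
triangle_index = tri.find_simplex(target)[0]   # -1 means "outside the mesh"
corner_indices = tri.simplices[triangle_index]

# A point far from every station falls outside the triangulation.
outside_index = tri.find_simplex(numpy.array([[100.0, 100.0]]))[0]
```

Once the enclosing triangle is known, the three stations at `corner_indices` are the ones whose temperatures get blended for the target point.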

Page 21: The Weather of the Century Part 3: Visualization

Barycentric Interpolation

What temperature?

Temperatures at the triangle's corners: 53, 48, 54

Weighted average: 51.1
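The weighted average can be worked through in a few lines. This is a sketch under stated assumptions: the corner temperatures 53, 48 and 54 are from the slide, but the corner positions and the target point are hypothetical, so the result differs from the slide's 51.1, which depends on the actual geometry. The helper `barycentric_weights` is illustrative, not part of any library.

```python
import numpy


def barycentric_weights(p, a, b, c):
    """Return weights (wa, wb, wc) that sum to 1 and satisfy
    p = wa*a + wb*b + wc*c for a point p and triangle corners a, b, c."""
    # Solve the 2x2 system [b - a, c - a] @ [wb, wc] = p - a.
    m = numpy.column_stack((b - a, c - a))
    wb, wc = numpy.linalg.solve(m, p - a)
    return 1.0 - wb - wc, wb, wc


# Hypothetical corner positions; temperatures are the slide's values.
a = numpy.array([0.0, 0.0])
b = numpy.array([4.0, 0.0])
c = numpy.array([0.0, 4.0])
corner_temps = numpy.array([53.0, 48.0, 54.0])

# Hypothetical point inside the triangle.
p = numpy.array([1.0, 1.0])
weights = numpy.array(barycentric_weights(p, a, b, c))

# Interpolated temperature: the weighted average of the corner values.
estimate = weights @ corner_temps
```

With these made-up positions the weights are 0.5, 0.25 and 0.25, giving an estimate of 52.0. The nearer the point lies to a corner, the larger that corner's weight.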

Page 22: The Weather of the Century Part 3: Visualization

Interpolation

51.1

Page 23: The Weather of the Century Part 3: Visualization

Interpolation

Page 24: The Weather of the Century Part 3: Visualization

Interpolation

Page 25: The Weather of the Century Part 3: Visualization

Contours

Page 26: The Weather of the Century Part 3: Visualization

Contours
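A contour plot draws one line per chosen value of the gridded field. The sketch below uses an invented temperature field (warm at the equator, cold at the poles) instead of interpolated station data, and selects the off-screen "Agg" backend so that it runs without a display. The levels are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required

import numpy
from matplotlib import pyplot

xs = numpy.linspace(-180, 180, 361)   # longitudes
ys = numpy.linspace(-90, 90, 181)     # latitudes
grid_x, grid_y = numpy.meshgrid(xs, ys)

# Made-up field: 30 degrees at the equator, falling to 0 at the poles.
zs = 30.0 * numpy.cos(numpy.radians(grid_y))

# One contour line for each requested temperature level.
contours = pyplot.contour(xs, ys, zs, levels=[5, 10, 15, 20, 25])
```

Calling `pyplot.savefig` or `pyplot.show` afterwards would write or display the figure. With this field, every contour is a horizontal line at a fixed latitude.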

Page 27: The Weather of the Century Part 3: Visualization
Page 28: The Weather of the Century Part 3: Visualization
Page 29: The Weather of the Century Part 3: Visualization
Page 30: The Weather of the Century Part 3: Visualization
Page 31: The Weather of the Century Part 3: Visualization
Page 32: The Weather of the Century Part 3: Visualization
Page 33: The Weather of the Century Part 3: Visualization

import numpy
import pymongo

data = []
db = pymongo.MongoClient().my_database

for doc in db.collection.find(query):
    data.append((
        doc['position']['coordinates'][0],
        doc['position']['coordinates'][1],
        doc['airTemperature']['value']))

arrays = numpy.array(data)

Not terrifically fast

Page 34: The Weather of the Century Part 3: Visualization

Analyzing large datasets

• Querying: 109k documents per second

• (On localhost)

• Can we go faster?

• Enter “Monary”

Page 35: The Weather of the Century Part 3: Visualization

MongoDB -> PyMongo -> Python dicts -> NumPy -> Matplotlib

MongoDB -> Monary -> NumPy -> Matplotlib

Monary by David Beach

Page 36: The Weather of the Century Part 3: Visualization

import monary

connection = monary.Monary()

arrays = connection.query(
    db='my_database',
    coll='collection',
    query=query,
    fields=['lon', 'lat', 'temp'],
    types=[
        'float32', 'float32', 'float32'])
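The practical difference from the PyMongo loop is the shape of the result. The sketch below does not call Monary. It assumes the query returns a list with one array per requested field, in the order given by `fields`, and simulates that with hand-made `float32` arrays. The values are hypothetical.

```python
import numpy

# Simulated stand-in for a Monary result (assumption: one array per
# field, ordered like fields=['lon', 'lat', 'temp']).
arrays = [
    numpy.array([-94.6, -73.9], dtype='float32'),   # lon
    numpy.array([39.117, 40.7], dtype='float32'),   # lat
    numpy.array([45.0, 38.0], dtype='float32'),     # temp
]

# Column-oriented result: unpack directly, with no per-document loop
# and no intermediate Python dicts.
lons, lats, temps = arrays
```

The columns arrive ready for gridding and plotting, which is the part of the pipeline the dict-based approach spends its time on.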

Page 37: The Weather of the Century Part 3: Visualization

Monary

• PyMongo: 109k documents per second

• Monary: 817k documents per second

Page 38: The Weather of the Century Part 3: Visualization

Visualization

Page 39: The Weather of the Century Part 3: Visualization

{
  ts: ISODate("1991-01-01T00:00:00Z"),
  position: {
    type: "Point",
    coordinates: [
      -94.6,
      39.117
    ]
  },
  airTemperature: {
    value: 45,
    quality: "1"
  }
}

Original Schema

Page 40: The Weather of the Century Part 3: Visualization

{
  ts: ISODate("1991-01-01T00:00:00Z"),
  lon: -94.6,
  lat: 39.117,
  temp: 45
}

Target Schema
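Moving from the original schema to the flat one is a small per-document transformation. This is a sketch, not the talk's actual migration code: `flatten` is a hypothetical helper, and it relies on GeoJSON points storing coordinates as [longitude, latitude].

```python
from datetime import datetime


def flatten(doc):
    """Convert one original-schema document to the flat target schema."""
    # GeoJSON order is [longitude, latitude].
    lon, lat = doc["position"]["coordinates"]
    return {
        "ts": doc["ts"],
        "lon": lon,
        "lat": lat,
        "temp": doc["airTemperature"]["value"],
    }


# The sample document from the original schema.
original = {
    "ts": datetime(1991, 1, 1),
    "position": {"type": "Point", "coordinates": [-94.6, 39.117]},
    "airTemperature": {"value": 45, "quality": "1"},
}

flat = flatten(original)
```

Top-level numeric fields such as `lon`, `lat` and `temp` are what the Monary query on the earlier slide asks for by name.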

Page 41: The Weather of the Century Part 3: Visualization

Future of Monary

• Author: David Beach

• Interns: Kyle Suarez, Matt Cotter

• Mentors: A. Jesse Jiryu Davis, Jason Carey

Page 42: The Weather of the Century Part 3: Visualization

Future of Monary

• Subdocuments: "airTemperature.value"

• Aggregation cursor

• Packaging

• Bugfixes

• Python 3

Page 43: The Weather of the Century Part 3: Visualization

Thanks

• Monary

• NumPy

• SciPy

• Matplotlib

Page 44: The Weather of the Century Part 3: Visualization

Thanks

Page 45: The Weather of the Century Part 3: Visualization

Thank you

#MongoDBWorld

A. Jesse Jiryu Davis, Senior Python Engineer, MongoDB