Weather of the Century: Visualization

A. Jesse Jiryu Davis The Weather of the Century: Visualization Senior Python Engineer, MongoDB @jessejiryudavis

Upload: mongodb

Post on 19-May-2015


Category:

Data & Analytics



DESCRIPTION

MongoDB natively supports geospatial indexing and querying, and it integrates easily with open source visualization tools. In this webinar, learn high-performance techniques for querying and retrieving geospatial data, and how to create a rich visual representation of global weather data using Python, Monary, and Matplotlib.

TRANSCRIPT

Page 1: Weather of the Century: Visualization

A. Jesse Jiryu Davis

The Weather of the Century: Visualization

Senior Python Engineer, MongoDB

@jessejiryudavis

Page 2: Weather of the Century: Visualization

Serious MongoDB Talk

Page 3: Weather of the Century: Visualization

Serious MongoDB Talk

Database

Page 4: Weather of the Century: Visualization

Serious MongoDB Talk

Page 5: Weather of the Century: Visualization

This Talk

Page 6: Weather of the Century: Visualization

Where’s the data from?

Page 7: Weather of the Century: Visualization

Where’s the data from?

Page 8: Weather of the Century: Visualization

How Much Is There?

Page 9: Weather of the Century: Visualization

Visualization

Page 10: Weather of the Century: Visualization

Visualization Pipeline

MongoDB → PyMongo → Python dicts → NumPy → Matplotlib

SciPy

Page 11: Weather of the Century: Visualization

{
  ts: ISODate("1991-01-01T00:00:00Z"),
  position: {
    type: "Point",
    coordinates: [ -94.6, 39.117 ]
  },
  airTemperature: { value: 45, quality: "1" }
}

GeoJSON

Page 12: Weather of the Century: Visualization

import numpy
import pymongo

data = []
db = pymongo.MongoClient().my_database

# Collect (longitude, latitude, temperature) from each document.
for doc in db.collection.find(query):
    data.append((
        doc['position']['coordinates'][0],
        doc['position']['coordinates'][1],
        doc['airTemperature']['value']))

arrays = numpy.array(data)
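The slides never show the `query` that is passed to `find()`. As a rough sketch only, a filter on the `ts` and `position` fields of the sample document might be built like this. The helper name `build_query`, the time window, and the bounding polygon are all hypothetical; only the field names and the standard `$gte`, `$lt`, and `$geoWithin` operators are taken as given.

```python
from datetime import datetime, timezone


def build_query(start, end, polygon=None):
    """Build a MongoDB filter document for weather observations.

    Hypothetical helper: selects documents whose `ts` falls in
    [start, end) and, optionally, whose GeoJSON `position` lies
    inside a polygon given as a closed ring of [lon, lat] pairs.
    """
    query = {"ts": {"$gte": start, "$lt": end}}
    if polygon is not None:
        query["position"] = {
            "$geoWithin": {
                "$geometry": {"type": "Polygon", "coordinates": [polygon]}
            }
        }
    return query


# Example values, assumed for illustration: one hour of observations
# inside a rough box around the sample station at [-94.6, 39.117].
start = datetime(1991, 1, 1, 0, tzinfo=timezone.utc)
end = datetime(1991, 1, 1, 1, tzinfo=timezone.utc)
box = [[-100, 35], [-90, 35], [-90, 45], [-100, 45], [-100, 35]]
query = build_query(start, end, box)
```

The resulting dict could then be passed as `query` in the loop above. Whether the talk filtered by region at all is not stated, so the polygon argument is optional here.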

Page 13: Weather of the Century: Visualization

# NumPy column access syntax.
lons = arrays[:, 0]
lats = arrays[:, 1]
temps = arrays[:, 2]

Page 14: Weather of the Century: Visualization
Page 15: Weather of the Century: Visualization

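from scipy.interpolate import griddata
from matplotlib import pyplot

xs = numpy.linspace(-180, 180, 361)
ys = numpy.linspace(-90, 90, 181)
# griddata takes the (x, y) sample points, their values,
# and a 2-D grid of target coordinates.
grid_x, grid_y = numpy.meshgrid(xs, ys)
zs = griddata((lons, lats), temps, (grid_x, grid_y), method='linear')

pyplot.contour(xs, ys, zs)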

Magic!!

Also magic!!

Page 16: Weather of the Century: Visualization
Page 17: Weather of the Century: Visualization

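from scipy.interpolate import griddata
from matplotlib import pyplot

xs = numpy.linspace(-180, 180, 361)
ys = numpy.linspace(-90, 90, 181)
# griddata takes the (x, y) sample points, their values,
# and a 2-D grid of target coordinates.
grid_x, grid_y = numpy.meshgrid(xs, ys)
zs = griddata((lons, lats), temps, (grid_x, grid_y), method='linear')

pyplot.contour(xs, ys, zs)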

Page 18: Weather of the Century: Visualization

Triangulation

Page 19: Weather of the Century: Visualization

Triangulation

Page 20: Weather of the Century: Visualization

What temperature?

Triangulation

Page 21: Weather of the Century: Visualization

Barycentric Interpolation

What temperature?

53, 48, 54

Weighted Average: 51.1
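The weighted average on this slide can be sketched in plain Python. The slide gives only the three station temperatures (53, 48, 54) and the result (51.1); the triangle coordinates and the query point below are hypothetical, chosen so that the barycentric weights come out to 0.5, 0.4, and 0.1 and reproduce that number.

```python
def barycentric_weights(p, a, b, c):
    """Return the barycentric weights of point p in triangle (a, b, c).

    Each argument is an (x, y) pair. The weights sum to 1 and are all
    non-negative when p lies inside the triangle.
    """
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    wc = 1.0 - wa - wb
    return wa, wb, wc


def interpolate(p, vertices, values):
    """Weighted average of the vertex values at point p."""
    weights = barycentric_weights(p, *vertices)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical station positions; temperatures are from the slide.
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
temps = [53, 48, 54]

# Hypothetical query point with weights (0.5, 0.4, 0.1).
estimate = interpolate((0.4, 0.1), stations, temps)
print(round(estimate, 1))  # 51.1
```

A point closer to one station gets a larger weight for that station, so the estimate moves smoothly between the three measured temperatures.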

Page 22: Weather of the Century: Visualization

Interpolation

51.1

Page 23: Weather of the Century: Visualization

Interpolation

Page 24: Weather of the Century: Visualization

Interpolation

Page 25: Weather of the Century: Visualization

Contours

Page 26: Weather of the Century: Visualization

Contours

Page 27: Weather of the Century: Visualization
Page 28: Weather of the Century: Visualization
Page 29: Weather of the Century: Visualization
Page 30: Weather of the Century: Visualization
Page 31: Weather of the Century: Visualization
Page 32: Weather of the Century: Visualization
Page 33: Weather of the Century: Visualization

import numpy
import pymongo

data = []
db = pymongo.MongoClient().my_database

# Collect (longitude, latitude, temperature) from each document.
for doc in db.collection.find(query):
    data.append((
        doc['position']['coordinates'][0],
        doc['position']['coordinates'][1],
        doc['airTemperature']['value']))

arrays = numpy.array(data)

Not terrifically fast

Page 34: Weather of the Century: Visualization

Analyzing large datasets

• Querying: 109k documents per second

• (On localhost)

• Can we go faster?

• Enter “Monary”
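The slides do not show how the documents-per-second figure was measured. One simple way to take such a measurement, sketched here under that caveat, is to time a full pass over any iterable of documents. The helper name `docs_per_second` is hypothetical, and a plain Python range stands in for a database cursor so the snippet runs without a server.

```python
import time


def docs_per_second(cursor):
    """Iterate over `cursor` once and return (count, documents per second).

    `cursor` can be any iterable, for example the result of a PyMongo
    `find()` call. Only iteration time is measured.
    """
    start = time.perf_counter()
    count = sum(1 for _ in cursor)
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


# Stand-in for a real cursor (assumption: no database is available here).
count, rate = docs_per_second(iter(range(100000)))
print(count, "documents,", int(rate), "per second")
```

Numbers from a stand-in iterable say nothing about MongoDB itself; the point is only the shape of the measurement.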

Page 35: Weather of the Century: Visualization

MongoDB → PyMongo → Python dicts → NumPy → Matplotlib

MongoDB → Monary → NumPy → Matplotlib

Monary, by David Beach

Page 36: Weather of the Century: Visualization

import monary

monary_connection = monary.Monary()

# Each field is read straight into its own typed NumPy array.
arrays = monary_connection.query(
    db='my_database',
    coll='collection',
    query=query,
    fields=[
        'position.coordinates.0',
        'position.coordinates.1',
        'airTemperature.value'],
    types=[
        'float32',
        'float32',
        'float32'])

Page 37: Weather of the Century: Visualization

Monary

• PyMongo: 109k documents per second

• Monary: 817k documents per second
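The two rates above imply a speedup of roughly 7.5x. The arithmetic, plus a load-time estimate for a collection size that is assumed purely for illustration (the slides do not state one here), looks like this:

```python
# Throughput figures quoted on the slide (documents per second).
PYMONGO_RATE = 109_000
MONARY_RATE = 817_000

speedup = MONARY_RATE / PYMONGO_RATE
print(round(speedup, 1))  # 7.5


def seconds_to_load(n_docs, rate):
    """Estimated time to read n_docs at a constant documents-per-second rate."""
    return n_docs / rate


# Hypothetical collection size, not a figure from the talk.
n_docs = 10_000_000
pymongo_seconds = seconds_to_load(n_docs, PYMONGO_RATE)
monary_seconds = seconds_to_load(n_docs, MONARY_RATE)
```

This assumes throughput stays constant as the collection grows, which may not hold once the data no longer fits in memory.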

Page 38: Weather of the Century: Visualization

Visualization

Page 39: Weather of the Century: Visualization

• Author: David Beach

• Interns: Kyle Suarez, Matt Cotter

• Mentors: A. Jesse Jiryu Davis, Jason Carey

Monary

Page 40: Weather of the Century: Visualization

Recent features:

• Easy installation

• Nested field access

• Aggregation

• Python 3

Monary

Page 41: Weather of the Century: Visualization

Future:

• Insert, update, remove

• SSL and authentication mechanisms

• parallelCollectionScan

Monary

Page 42: Weather of the Century: Visualization

Thanks

• Monary

• NumPy

• SciPy

• Matplotlib

Page 43: Weather of the Century: Visualization

Thanks

Page 44: Weather of the Century: Visualization

Thank you

#MongoDBWorld

A. Jesse Jiryu Davis
Senior Python Engineer, MongoDB