We Made a Few Trillion New Pixels Yesterday. Here’s Why.
10JUN11 Terry Busch, DIA
This briefing is classified UNCLASSIFIED
Directorate for Analysis
DIA manages a large GIS enterprise
Over the years it became a large IT dinosaur
• Complicated
• Heavy
• Expensive
• Proprietary
“I love deadlines. I like the whooshing sound they make as they fly by.” -- Douglas Adams
And then we discovered Open Source Technology
And as a by-product we became
• Lightweight
• Less Expensive
• Interoperable
• Faster (yeah, that’s right).
“Bill Gates is a very rich man today... and do you want to know why? The answer is one word: versions.” -- Dave Barry
INTELLIGENCE DATA, PARAMETRIC DATA, TECHNICAL DATA, INFRASTRUCTURE DATA, OPERATIONAL DATA
DOCTRINE, STRATEGY, TACTICS
One nice piece of technology we found was Tile Cache (Map Layers)
“Technology does not run an enterprise, relationships do.” -- Patricia Fripp
• Simple caching tool for map chips
• Very, very fast (at home speeds)
• The client actually makes its own cache?
• Wait, wait. What can we do on the server?
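To make "map chips" concrete: a tile cache stores fixed-size image tiles addressed by zoom level, column, and row. The sketch below assumes the Web Mercator XYZ scheme used by OpenStreetMap-style tile servers; the briefing does not say which scheme was used, and the function names and the on-disk path layout are hypothetical.

```python
import math

def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) tile containing a point (XYZ / Web Mercator, assumed)."""
    n = 2 ** zoom  # tiles per axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the east or south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def tile_path(layer, zoom, x, y):
    """Hypothetical cache layout: one image file per tile."""
    return f"{layer}/{zoom}/{x}/{y}.png"

# Port-au-Prince, Haiti (approximate coordinates), at zoom 10
x, y = lonlat_to_tile(-72.34, 18.54, 10)
print(tile_path("scanned_maps", 10, x, y))
```

Because the address is computed from the coordinates alone, serving a cached tile is a file lookup rather than a render, which is where the speed comes from.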
Tile Cache seemed a panacea
“Birds do it, bees do it, even educated fleas do it. Let's do it, let's fall in love.” -- Cole Porter
• We don’t have Google or Bing or MapQuest
• Map Services had always been – “slow to load” (my fault)
• Our map services never quite hit the tipping point.
• Free (Cost Neutral)
Light Bulb!
“An idea that is not dangerous is unworthy of being called an idea at all” -- Oscar Wilde
So What if We?
• Cached the entire world?
• I mean everything we had:
  • Scanned Maps
  • Imagery Maps
  • Elevation
  • That sink over there?
• It’s just disk!
Stop the Presses
“Your heart is my pinata” –Chuck Palahniuk
“Love is a battlefield” – Pat Benatar
A few problems:
• Scripting for tile caches is kinda slow.
• So, basically, for our datasets…
• It would take 2 years to process on our best servers!
• Ugh
The Holy Grail
“Oh, wicked, bad, naughty Zoot! She has been setting alight to our beacon, which, I just remembered, is grail-shaped. It's not the first time we've had this problem” -- Dingo
Aha! What if we went to the cloud:
• Amazon EC2
• Instead of scripting on a few servers…how about a few thousand servers?
• Cost = Peanuts
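The appeal of a few thousand servers is plain division: tiles can be rendered independently, so elapsed time falls roughly in proportion to the number of workers. A back-of-the-envelope sketch, treating the 2-year estimate as one unit of baseline capacity and assuming near-linear scaling; the `efficiency` factor is a hypothetical knob, not a figure from the briefing.

```python
def wall_clock_hours(serial_hours, workers, efficiency=1.0):
    """Estimate elapsed hours when the work is split across workers."""
    if workers < 1 or not 0 < efficiency <= 1:
        raise ValueError("workers must be >= 1 and efficiency in (0, 1]")
    return serial_hours / (workers * efficiency)

SERIAL_HOURS = 2 * 365 * 24  # the 2-year estimate, in hours

for workers in (1, 10, 100, 1000):
    print(workers, round(wall_clock_hours(SERIAL_HOURS, workers), 1))
```

Under these assumptions, a one-day turnaround needs on the order of several hundred times the baseline capacity; real runs lose some of that to startup, I/O, and failed workers.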
You want the truth?
2-year core processing effort reduced to 24 hours.
3+ TB of raw data became a 30 TB tile cache at 18 predefined levels.
Map Layers: Image Maps, Military Charts, Elevation Data, and much much more.
Above: A scanned digital map of Haiti. Using tile cache we scanned every map and image map in our archives.
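One reason a cache can dwarf its source data: in a quadtree tile pyramid, each level holds four times as many tiles as the level above it. The sketch below assumes one tile at level 0, 256 × 256-pixel tiles, and levels numbered 0 to 17; the briefing gives only the level count, so all three are assumptions.

```python
TILE_SIZE = 256  # pixels per tile edge (assumed)

def tiles_at_level(level):
    """Tiles needed for full-globe coverage at one pyramid level."""
    return 4 ** level

def pyramid_totals(levels):
    """Return (total_tiles, total_pixels) for levels 0 .. levels - 1."""
    total_tiles = sum(tiles_at_level(z) for z in range(levels))
    return total_tiles, total_tiles * TILE_SIZE * TILE_SIZE

tiles, pixels = pyramid_totals(18)
print(f"{tiles:,} tiles, {pixels:.3e} pixels")
```

This is an upper bound for a single layer covering the whole globe at every level. Real layers usually cover only part of the world at the deepest levels, so actual totals are far smaller.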
Handling the Truth
Initial Effort: Created 1.75 Trillion new pixels (2009).
At one point in processing we harnessed about 5000 servers simultaneously.
Can process tile cache layers at 2, 4, or 8 million pixels per second.
Above: OpenStreetMap (OSM) tile cache services are amongst our most popular.
So Now:
Got rid of all the wires and databases and junk.
2-3 Million service requests per month.
Stopped spending my GIS money on servers and expensive hardware.
Above: Satellite Imagery tile cache blended with some sort of vector layer.
Oh, and…
Scalable (just add disk)
Portable (if you have data bricks)
Updatable (just re-cache)
Version proof
Above: Imagery-elevation blends have become as popular in our world as they have in public.
Cloud Computing is not error free
Had to write error handling
Read and react in real-time.
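The briefing does not describe the error handling itself. As one common pattern for unreliable cloud workers, here is a minimal retry-with-backoff sketch; the function names, attempt count, and delays are all hypothetical.

```python
import time

def run_with_retries(task, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task(); on failure wait base_delay * 2**attempt, then retry.

    Re-raises the last exception once max_attempts is exhausted so the
    caller can requeue the work on another machine.
    """
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)

# Stand-in task that fails twice before succeeding.
calls = {"n": 0}

def flaky_batch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("worker lost")
    return "batch done"

print(run_with_retries(flaky_batch, sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the example testable without real waiting; in production the default `time.sleep` would apply.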
Oh, and the bandwidth I/O problem…
Q. How does one move 30 TB off Amazon?
A. Data Bricks
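The case for shipping disks is arithmetic. A sketch with hypothetical link speeds (the briefing does not state the available bandwidth), using decimal terabytes and assuming the link is fully saturated the whole time:

```python
def transfer_days(terabytes, megabits_per_second):
    """Days needed to move a payload over a sustained network link."""
    bits = terabytes * 1e12 * 8  # decimal TB -> bits
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 86_400

for mbps in (100, 1000):
    print(f"{mbps} Mbit/s: {transfer_days(30, mbps):.1f} days")
```

At a sustained 100 Mbit/s, 30 TB takes roughly four weeks; a box of disks in the mail compares well.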
“It’s hard enough to find an error in your code when you’re looking for it; It’s even harder when you’ve assumed your code is error-free.” – Steve McConnell
But wait! There’s more…
We had all this data sitting in the cloud….
What if we solved some other problems?
Saved some people some time?
Created app ready surfaces for analysis?
“Without promotion something terrible happens... Nothing!” -- PT Barnum
G++ Base Layers for GIS
With the data left hanging around we created:
• Slope
• Aspect
• Hillshade
• Terrain Ruggedness
• Relevation
For the whole world…
Above: Examples of surface area of the world cached, with a chip from the Slope and Aspect global layers.
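For readers who have only ever pressed the slope button: slope and aspect are local derivatives of the elevation grid. The sketch below uses Horn's 3 × 3 finite-difference weights, a common choice in GIS software; the briefing does not say which method produced these layers, so treat the method and the function name as assumptions.

```python
import math

def slope_aspect(window, cell_size):
    """Slope and aspect (degrees) for the centre of a 3x3 elevation window.

    window is [[a, b, c], [d, e, f], [g, h, i]] with row 0 to the north.
    """
    (a, b, c), (d, _, f), (g, h, i) = window
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)  # rise to the east
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)  # rise to the south
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None  # flat cell: aspect is undefined
    # Downslope direction as a compass bearing (0 = north, 90 = east).
    aspect = math.degrees(math.atan2(-dz_dx, dz_dy)) % 360
    return slope, aspect

# A plane rising one unit per cell toward the east faces west.
print(slope_aspect([[0, 1, 2], [0, 1, 2], [0, 1, 2]], cell_size=1.0))
```

Computing this once for every cell and caching the result as tiles is what removes the wait at analysis time.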
Can We?
Eliminate the status bar wait (get coffee GIS)
Get rid of the slope button
Prepare ourselves for a web-driven GIS world
Let our analysts….be analysts!!
Terrain Ruggedness Index: A key index for understanding how humans interact with topography for site selection and mobility.
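The briefing does not define the index. One widely used definition (Riley et al.) takes the square root of the summed squared elevation differences between a cell and its eight neighbours; the sketch below assumes that definition, and other tools use variants such as the mean absolute difference.

```python
import math

def terrain_ruggedness(window):
    """TRI for the centre of a 3x3 window: root of the summed squared
    elevation differences between the centre and its 8 neighbours."""
    centre = window[1][1]
    total = sum(
        (window[r][c] - centre) ** 2
        for r in range(3)
        for c in range(3)
        if (r, c) != (1, 1)
    )
    return math.sqrt(total)

flat = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
pit = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
print(terrain_ruggedness(flat), terrain_ruggedness(pit))
```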
That’s Levitation, Holmes!
• So, now, if you want a travel cost model, we’ve got your base.
• Want a viewshed, line-of-sight? Done.
• Need site help for your weekend home or establishing your vineyard? Check.
• GP Ready.
Relevation: Foundational layer for identifying landform categories (e.g. ridges, valleys, side slopes…). Exceptionally useful for suitability modeling and routing.
Cloud GIS – MrGeo
Map Reduce Geo
Distributed or Cloud GIS
Helping us with that crowd-source mass-data scaling problem we face in the future now.
Gives us the tools we need for our crowd sourced very busy data future.
Somewhat Open Source
MrGeo: Examples of cloud based processing as elevation data for the world is calculated into elevation derived layers.
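MrGeo's own API is not shown in the briefing, so what follows is only a generic illustration of the map/reduce pattern applied to tiled elevation data, in plain Python. Each tile is summarised independently (map), then the summaries are merged pairwise (reduce). The tile dictionary and function names are invented for the example.

```python
from functools import reduce

def map_tile(tile):
    """Map step: summarise one tile's elevations independently."""
    values = [v for row in tile for v in row]
    return {"min": min(values), "max": max(values),
            "sum": sum(values), "count": len(values)}

def merge(a, b):
    """Reduce step: combine two summaries; associative, so order is free."""
    return {
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
        "sum": a["sum"] + b["sum"],
        "count": a["count"] + b["count"],
    }

# Toy elevation tiles keyed by (row, column).
tiles = {
    (0, 0): [[10, 12], [11, 15]],
    (0, 1): [[3, 4], [8, 9]],
    (1, 0): [[20, 22], [21, 30]],
}

summary = reduce(merge, map(map_tile, tiles.values()))
print(summary["min"], summary["max"], summary["sum"] / summary["count"])
```

Because the map step touches one tile at a time and the merge is associative, the same logic can be spread across as many machines as there are tiles.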
That was all so 2009
• More Hybrid Layers for the Cache (More OSM Please).
• Hydrology layer for the whole world
• Global Change Detection
• World Remoteness
• Give me some ideas here people!
“There's a fine line between fishing and just standing on the shore like an idiot.” - Steven Wright