TRANSCRIPT
National Aeronautics and Space Administration
www.nasa.gov
DeepSAT: A Deep Learning Framework
for Satellite Image Classification
Sangram Ganguly
NASA Ames Research Center/ Bay Area Environmental Research Institute
Contributions from:
Gayaka Shreekant, Subodh Kalia, Ramakrishna Nemani, Andrew Michaelis,
Thomas Vandal, Supratik Mukhopadhyay
DeepSAT 2
NASA EARTH EXCHANGE (NEX).
NEX is a virtual collaborative that brings scientists together in a knowledge-based social network and provides the necessary tools, computing power, and access to big data to accelerate research and innovation and to provide transparency.
OVERVIEW
VISION: To provide “science as a service” to the Earth science community addressing global environmental challenges
GOAL: To improve efficiency and expand the scope of NASA Earth science technology, research and applications programs
NEX Provides a Complete Work Environment – “Science As A Service”
COLLABORATION: Over 400 members
CENTRALIZED DATA REPOSITORY: Over 2.3 PB of data and growing
COMPUTING: Scalable, diverse, secure/reliable
KNOWLEDGE: Workflows, machine images, model codes, reusable software
DeepSAT 4
Science @ NEX
Global Vegetation Biomass at 100m resolution
by blending data from 4 different satellites
High resolution climate projections
for climate impact studies
High resolution monthly global data for monitoring
forests, crops and water resources
Mapping fallowed area in California
during drought
Machine learning and Data mining moving towards more data-driven approaches
DeepSAT 5
[Figure: monitoring models, physics-based models, and machine-learning models plotted against data volume.]
DeepSAT 6
NEX AI Lab.
DeepSAT 7
High Resolution Satellite/Airborne Image Classification.
Multiple Conditions
Shadows/no-shadows, BRDF effects, cloud cover, aerosols, mixed-pixel effects, presence/absence of atmospheric correction, view-angle effects, sensor altitude, etc.
Multiple Classes
Roads, impervious surfaces, grasses/shrubs, trees, water, rooftops, etc.
Multiple Sensors
WorldView, IKONOS, high-resolution airborne, Landsat, Sentinel-1/2, etc.
Multiple Applications
Tree cover, carbon sequestration, water-extent mapping, solar efficiency, road-network monitoring, construction monitoring, habitat monitoring, climate modeling, etc.
NASA Carbon Monitoring System (CMS)- funded activity
DeepSAT 8
High Resolution Tree Cover Classification.
Tree cover delineation is a hard problem:
Quality of data is affected by data acquisition, pre-processing, and filtering.
Significant inter-class overlaps make it hard to distinguish between classes.
Accuracy of present algorithms is low, and there is a pressing need to create high-resolution land cover maps.
Need to harness strong discriminative features and an efficient learning algorithm.
DeepSAT 9
Lots of big images!
Landsat Thematic Mapper, 1984-2012: monthly composites of biophysical products such as LAI
Focus on: land cover changes, migration of ecosystems, high-altitude ecosystems, forest mortality
330,000 NAIP scenes; 65 terabytes of images; 7000 x 7000 image matrix per scene
Big Data – Need for Big Computation
Images fed in parallel to cores on the HPC system
Current end-to-end processing time (California, with 11,000 scenes): 48 hours
Total wall time for processing the continental U.S.: 2.64 million hours
Total size of feature vectors extracted: 2.8 petabytes
One epoch (2010 – 2012)
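The compute figures above imply a per-scene cost that can be checked directly; a back-of-envelope sketch, assuming (our assumption, not the slide's) that wall time scales linearly with scene count:

```python
# Back-of-envelope check on the processing numbers quoted above,
# assuming wall time scales linearly with scene count.
conus_scenes = 330_000       # NAIP scenes for the continental U.S.
conus_wall_hours = 2.64e6    # total wall time quoted above

hours_per_scene = conus_wall_hours / conus_scenes
print(hours_per_scene)       # -> 8.0 core-hours per NAIP scene
```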
DeepSAT 10
NAIP Processing Architecture.
NEX – NASA Earth Exchange High Performance Computing (HPC)
[Diagram: the INPUT IMAGE is distributed from storage across three NEX HPC modules (M1, M2, M3); the per-module predictions are combined by VOTING, the TRAINING DATASET is UPDATED, and the classified OUTPUT IMAGE is produced.]
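The voting step combines per-module predictions into one label per pixel. A minimal sketch of simple majority voting (the function name and class labels are illustrative, not from the slide):

```python
from collections import Counter

def majority_vote(labels):
    # Return the most common class label among the module predictions.
    return Counter(labels).most_common(1)[0][0]

# Three HPC modules disagree on one pixel; the majority wins.
print(majority_vote(["tree", "tree", "non-tree"]))  # -> tree
```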
DeepSAT 11
NAIP Processing Architecture on AWS.
• Configure a base set of AWS services to build the processing pipeline
• Process ~15,000 scenes (~5000 x 5000 pixels per scene)
• Leveraged Spot Instances (70% savings) and managed services
• Spin up, process, and tear down in one week
• More than just computing…
DeepSAT 12
Learning Module – Training Phase.
Training data
EXTRACT FEATURE VECTORS (CCM, DCT, NDVI, EVI)
INITIALIZE WEIGHTS OF THE NEURAL NETWORK USING A DEEP BELIEF NETWORK
TAKE A SUB-SAMPLE OF THE FEATURE VECTORS, APPEND TRAINING CLASS LABELS, AND FEED TO THE ANN
TRAIN THE ANN WITH BACKPROPAGATION AND STOCHASTIC GRADIENT DESCENT
Trained Neural Network
EACH LAYER IN THE DBN IS AN RBM, TRAINED USING CONTRASTIVE DIVERGENCE WITH REPEATED GIBBS SAMPLING
From Unsupervised Pre-training to Supervised Learning
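The contrastive-divergence (CD-1) update mentioned above can be sketched on a deliberately tiny RBM; the layer sizes, learning rate, and toy data below are illustrative only, not the DeepSAT configuration:

```python
import math, random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy RBM: 4 visible units, 2 hidden units (illustrative sizes only).
n_vis, n_hid = 4, 2
W = [[random.gauss(0, 0.1) for _ in range(n_hid)] for _ in range(n_vis)]
b_vis = [0.0] * n_vis
b_hid = [0.0] * n_hid

def hidden_probs(v):
    # P(h_j = 1 | v) for each hidden unit.
    return [sigmoid(b_hid[j] + sum(v[i] * W[i][j] for i in range(n_vis)))
            for j in range(n_hid)]

def visible_probs(h):
    # P(v_i = 1 | h) for each visible unit.
    return [sigmoid(b_vis[i] + sum(h[j] * W[i][j] for j in range(n_hid)))
            for i in range(n_vis)]

def cd1_update(v0, lr=0.1):
    # Positive phase: hidden activations driven by the data.
    h0 = hidden_probs(v0)
    # One Gibbs step: sample hidden, reconstruct visible, re-infer hidden.
    h0_sample = [1.0 if random.random() < p else 0.0 for p in h0]
    v1 = visible_probs(h0_sample)
    h1 = hidden_probs(v1)
    # CD-1 update: positive statistics minus reconstruction statistics.
    for i in range(n_vis):
        for j in range(n_hid):
            W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        b_vis[i] += lr * (v0[i] - v1[i])
    for j in range(n_hid):
        b_hid[j] += lr * (h0[j] - h1[j])
    # Squared reconstruction error as a rough progress signal.
    return sum((a - b) ** 2 for a, b in zip(v0, v1))

# Two binary vectors standing in for the feature sub-samples above.
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
errs = [sum(cd1_update(v) for v in data) for _ in range(200)]
```

In the full pipeline, the weights learned this way initialize one ANN layer; stacking such RBMs gives the DBN used for pre-training before supervised backpropagation.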
Learning Module – Testing/Prediction Phase.
NAIP Tile
EXTRACT FEATURE VECTORS (CCM, DCT, NDVI, EVI)
NORMALIZE DATA AND FEED TO THE TRAINED NEURAL NETWORK
PREDICT CLASS AND GENERATE LABELS -> CLASS MASK
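Two of the spectral features in the extraction step have simple closed forms; a per-pixel sketch (the reflectance values are made up, and the EVI coefficients are the standard MODIS ones, not taken from the slide):

```python
def ndvi(nir, red):
    # Normalized Difference Vegetation Index.
    return (nir - red) / (nir + red)

def evi(nir, red, blue, G=2.5, C1=6.0, C2=7.5, L=1.0):
    # Enhanced Vegetation Index with the standard MODIS coefficients.
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)

# A healthy-vegetation pixel: high NIR, low red reflectance.
print(round(ndvi(0.5, 0.1), 3))  # -> 0.667
```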
Basu, S., Ganguly, S., Nemani, R. R., Mukhopadhyay, S., et al., "A Semiautomated Probabilistic Framework for Tree-Cover Delineation From 1-m NAIP Imagery Using a High-Performance Computing Architecture," IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 10, pp. 5690-5708, Oct. 2015. doi: 10.1109/TGRS.2015.2428197.
DeepSAT
Experimental Results.
                          Densely forested  Fragmented forests  Urban areas  Overall
Total samples                       12,000              12,000       12,000   36,000
Tree samples                         6,000               6,000        6,000   18,000
Non-tree samples                     6,000               6,000        6,000   18,000
True Positive Rate (%)               85.87               88.26        73.65    82.59
False Positive Rate (%)               2.21                0.99         1.98     1.73
Total scenes processed = 11095 for the whole of California
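The rates in the table follow the usual confusion-matrix definitions; a minimal sketch (the counts below are made up for illustration, not from the California run):

```python
def rates(tp, fp, tn, fn):
    # True-positive rate (recall) and false-positive rate
    # computed from confusion-matrix counts.
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr

# Illustrative counts for a tree / non-tree classifier.
print(rates(90, 2, 98, 10))  # -> (0.9, 0.02)
```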
DeepSAT
California Tree Cover Mosaic.
San Francisco Bay Area
DeepSAT
DeepSat – A Learning Framework for Satellite Imagery
MODELS: Feature-enhanced DBN, CNN, Stacked Autoencoder
OUR DATA:
SAT-4 – 500,000 image patches, 4 land cover types (barren, tree, grass, all other)
SAT-6 – 405,000 image patches, 6 land cover types (barren, tree, grass, road, building, water bodies)
RESULTS (classifier accuracy, %):
SAT-4: Feature-enhanced DBN 97.946, CNN 86.827, Stacked Autoencoder 79.978
SAT-6: Feature-enhanced DBN 93.916, CNN 79.063, Stacked Autoencoder 78.430
Saikat Basu, Sangram Ganguly, Supratik Mukhopadhyay, Robert DiBiano, Manohar Karki, and Ramakrishna Nemani, "DeepSat – A Learning Framework for Satellite Imagery," ACM SIGSPATIAL 2015.
Saikat Basu, Manohar Karki, Sangram Ganguly, Robert DiBiano, Supratik Mukhopadhyay, and Ramakrishna Nemani, "Learning Sparse Feature Representations using Probabilistic Quadtrees and Deep Belief Nets," European Symposium on Artificial Neural Networks (ESANN) 2015.
CNN: Convolutional Neural Network
SATNet – the satellite imagery training database & model zoo.
Model zoo: pre-trained models for different satellites (e.g., Landsat, Sentinel-1/2, WorldView, etc.).
Inspired by ImageNet, we are building a large database of labeled satellite/aerial imagery. Labeled data are generated by experts using a GUI interface and cover different land cover classes – trees, barren lands, shrubs, rooftops, water bodies, and much more. The goal is to create a dataset with 5,000,000 labeled samples.
Current state of the art – deeper architectures (SegNet).
DeepSAT
Original NAIP data (RGB)
Forest mask
NAIP forest scene
Training Data collection
Training data generation.
DeepSAT
Phase 1 (17 US states – AL, AR, AZ, CO, CT, DE, FL, GA, IA, ID, IL, IN, KS, KY, LA, MA, MD):
Selection of 10 random NAIP forest scenes for each US state
Dividing each scene into 200 patches (dimension: 600 x 600)
Random selection of 60 patches
Forest mask generation from each patch
Total number of forest masks generated = 1020
Phase 2 (8 US states – CA, ME, MI, MN, MO, MS, MT, NC; in progress)
Sample training data – forest/ non-forest.
DeepSAT
Roadmap.
DeepSAT
Phase I (started & nearly complete):
• Generate training data – 60 patches of 600 x 600 for 50 states
• Establish baseline land cover classification accuracy using CNN-based segmentation architectures such as SegNet and FCN
• Scale results to all 50 states (~330,000 tiles of 7000 x 8000 pixels)
Phase II (in progress):
• Extend SegNet & FCN architectures to fuse information from multiple bands and other data
Phase III:
• Develop techniques to combine large amounts of unlabeled data with supervised techniques to further improve classification (e.g., a model learnt from a DBN as a replacement for the fully connected layer in a CNN)
SegNet Sample Result.
DeepSAT
Input image Ground truth 100,000 iterations
6000x6000 NAIP Tile
6000x6000 Prediction
THE WHOLE TILE
Statistics (overall):
Total samples: 14,565
True Positive Rate (%): 83.29
False Positive Rate (%): 4.61
Accuracy (%): 92.87
Time to predict a tile: ~2 minutes
Super Resolution CNNs to Downscale Climate Models.
DeepSAT
Global climate models (GCMs) run at low resolutions (>100 km), but climate change impacts are local.
Downscaling problem: learn a mapping from low-resolution GCM precipitation (and high-resolution topography) to high-resolution observed precipitation.
Proposed method: apply a Super-Resolution CNN (SRCNN) [Dong 2014] to downscaling.
1. Dong, Chao, et al., "Learning a Deep Convolutional Network for Image Super-Resolution," European Conference on Computer Vision, Springer International Publishing, 2014.
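SRCNN-style downscaling first interpolates the coarse field onto the fine target grid and then lets the CNN learn a correction on top. A 1-D sketch of just that interpolation step (the function and values are illustrative, not the PRISM pipeline):

```python
def upsample_linear(coarse, factor):
    # Interpolate a coarse 1-D series onto a grid `factor` times finer --
    # the pre-processing step before an SRCNN learns the residual detail.
    fine = []
    for i in range(len(coarse) - 1):
        for k in range(factor):
            t = k / factor
            fine.append((1 - t) * coarse[i] + t * coarse[i + 1])
    fine.append(coarse[-1])
    return fine

print(upsample_linear([0.0, 3.0], 3))  # -> [0.0, 1.0, 2.0, 3.0]
```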
SRCNN Architecture.
DeepSAT
SRCNN Training.
DeepSAT
Data:
- Precipitation from PRISM at 4 km, upscaled to 16 km and 50 km, interpolated to a 16 km grid
- Training years: 1981-2000
- Testing years: 2001-2015
Mapping: 50 km -> 16 km
“Sub-images”:
- Crop 51x51 patches with stride 30
- Count: >10 million in the training set
Computing: TensorFlow + 4 GPUs on the NVIDIA DevBox
Figure: https://www.tensorflow.org/versions/r0.11/tutorials/deep_cnn/index.html
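The "51x51 patches with stride 30" figure determines how many sub-images a field yields; a small sketch (the 621-pixel grid size below is hypothetical, chosen only for illustration):

```python
def patches_per_axis(size, patch=51, stride=30):
    # Number of valid crop positions along one axis of the field.
    return (size - patch) // stride + 1

# A hypothetical 621 x 621 field cropped into 51x51 patches at stride 30:
print(patches_per_axis(621) ** 2)  # -> 400 patches per field
```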
Daily RMSE over the Continental United States.
DeepSAT
RMSE: BCSD = 2.23, SRCNN = 1.57
* BCSD is a widely used statistical downscaling method for global climate models
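The comparison above is daily root-mean-square error between downscaled and observed precipitation; a minimal sketch of the metric (the sample values are made up):

```python
import math

def rmse(pred, obs):
    # Root-mean-square error over paired daily values.
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Toy daily precipitation series (mm/day), purely illustrative.
print(round(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]), 3))  # -> 1.155
```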
Massively Leveraging NVIDIA.
DeepSAT
All our CNN models are trained on an NVIDIA DevBox cluster.
The Pleiades GPU cluster is used to train models with multiple sets of train/test datasets – both DBN and CNN architectures.
DGX-1-like systems are important!
Model size: It has recently been shown that deep neural networks such as the 152-layer ResNet architecture outperform shallower networks such as VGG-16. The increase in layers leads to a tremendous increase in the number of parameters, so the mathematical operations needed, such as gradients and nonlinear functions of the inputs, also increase. These networks require ~15 billion FLOPs, and this number will keep growing as networks increase in size and complexity. Model complexity and input data size are limited by GPU memory; with 128 GB of aggregate GPU memory, more complicated models can be used for experimentation. Increasing the input size also has desirable effects: gradient descent will have less noise, and, in the context of CNNs, bigger images can provide more context for classification and hence improve classification/segmentation accuracy.
Training time: With the P100's improvements over the last generation (the Maxwell architecture) and a significant increase in the number of cores, training time will be much shorter. Faster training means more experimentation and faster innovation. The Pascal architecture has been shown to perform 5-7x faster than the last generation on some deep learning benchmarks.
Going Forward.
DeepSAT
• NEX-AI’s core focus is blending physical models with state-of-the-art machine learning frameworks to address NASA’s mission objectives
• NEX-AI currently focuses on a number of problems related to satellite image classification, climate downscaling, and large-scale anomaly detection
• DeepSAT will provide the current modeling frameworks along with access to training data for NEX users
• NEX-AI will collaborate with industry-leading experts in testing newer AI frameworks and in defining hard problems in the land-climate-atmosphere continuum that can “possibly” be solved by clever ensemble learning models
Contact Email: [email protected]