TRANSCRIPT
National Aeronautics and Space Administration
www.nasa.gov
DeepSAT: A Deep Learning Framework
for Satellite Image Classification
Sangram Ganguly
NASA Ames Research Center/ Bay Area Environmental Research Institute
Contributions from:
Gayaka Shreekant, Subodh Kalia, Ramakrishna Nemani, Andrew Michaelis,
Thomas Vandal, Supratik Mukhopadhyay
DeepSAT 2
NASA EARTH EXCHANGE (NEX).
NEX is a virtual collaborative that brings scientists together in a knowledge-based social network and provides the necessary tools, computing power, and access to big data to accelerate research and innovation and to provide transparency.
OVERVIEW
VISION: To provide “science as a service” to the Earth science community addressing global environmental challenges
GOAL: To improve efficiency and expand the scope of NASA Earth science technology, research and applications programs
NEX Provides a Complete Work Environment – “Science As A Service”
COLLABORATION: Over 400 members
CENTRALIZED DATA REPOSITORY: Over 2.3 PB of data and growing
COMPUTING: Scalable, diverse, secure/reliable
KNOWLEDGE: Workflows, machine images, model codes, reusable software
DeepSAT 4
Science @ NEX
Global Vegetation Biomass at 100m resolution
by blending data from 4 different satellites
High resolution climate projections
for climate impact studies
High resolution monthly global data for monitoring
forests, crops and water resources
Mapping fallowed area in California
during drought
Machine learning and Data mining moving towards more data-driven approaches
DeepSAT 5
[Figure: monitoring models, physics-based models, and machine-learning models plotted against data volume.]
DeepSAT 6
NEX AI Lab.
DeepSAT 7
High Resolution Satellite/Airborne Image Classification.
Multiple Conditions
Shadows/no-shadows, BRDF effects, cloud cover, aerosols, mixed-pixel effects, presence/absence of atmospheric correction, view-angle effects, sensor altitude, etc.
Multiple Classes
Roads, impervious surfaces, grasses/shrubs, trees, water, rooftops, etc.
Multiple Sensors
WorldView, IKONOS, high-resolution airborne, Landsat, Sentinel-1/2, etc.
Multiple Applications
Tree cover, carbon sequestration, water-extent mapping, solar efficiency, road-network monitoring, construction monitoring, habitat monitoring, climate modeling, etc.
NASA Carbon Monitoring System (CMS)- funded activity
DeepSAT 8
High Resolution Tree Cover Classification.
Tree cover delineation is a hard problem:
Quality of data is affected by data acquisition, pre-processing, and filtering.
Significant inter-class overlaps make it hard to distinguish between classes.
Accuracy of present algorithms is low, and there is a pressing need to create high-resolution land cover maps.
Need to harness strong discriminative features and an efficient learning algorithm.
DeepSAT 9
Lots of big images!
Landsat Thematic Mapper, 1984-2012: monthly composites of biophysical products such as LAI
Focus on: land cover changes, migration of ecosystems, high-altitude ecosystems, forest mortality
330,000 NAIP scenes; 65 terabytes of images; 7000 x 7000 image matrix per scene
Big Data – Need for Big Computation
Images fed in parallel to cores on the HPC system
Current end-to-end processing time (California, with 11,000 scenes): 48 hours
Total wall time for processing the continental U.S.: 2.64 million hours
Total size of feature vectors extracted: 2.8 petabytes
One epoch (2010 – 2012)
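The compute figures above imply a per-scene cost that can be checked directly; a back-of-envelope sketch, assuming (our assumption, not the slide's) that wall time scales linearly with scene count:

```python
# Back-of-envelope check on the processing numbers quoted above,
# assuming wall time scales linearly with scene count.
conus_scenes = 330_000       # NAIP scenes for the continental U.S.
conus_wall_hours = 2.64e6    # total wall time quoted above

hours_per_scene = conus_wall_hours / conus_scenes
print(hours_per_scene)       # -> 8.0 core-hours per NAIP scene
```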
DeepSAT 10
NAIP Processing Architecture.
NEX – NASA Earth Exchange High Performance Computing (HPC)
[Diagram: the INPUT IMAGE is distributed from storage across three NEX HPC modules (M1, M2, M3); the per-module predictions are combined by VOTING, the TRAINING DATASET is UPDATED, and the classified OUTPUT IMAGE is produced.]
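The voting step combines per-module predictions into one label per pixel. A minimal sketch of simple majority voting (the function name and class labels are illustrative, not from the slide):

```python
from collections import Counter

def majority_vote(labels):
    # Return the most common class label among the module predictions.
    return Counter(labels).most_common(1)[0][0]

# Three HPC modules disagree on one pixel; the majority wins.
print(majority_vote(["tree", "tree", "non-tree"]))  # -> tree
```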
DeepSAT 11
NAIP Processing Architecture on AWS.
• Configure a base set of AWS services to build the processing pipeline
• Process ~15,000 scenes (~5000 x 5000 pixels per scene)
• Leveraged Spot Instances (70% savings) and managed services
• Spin up, process, and tear down in one week
• More than just computing…
DeepSAT 12
Learning Module – Training Phase.
Training data
EXTRACT FEATURE VECTORS (CCM, DCT, NDVI, EVI)
INITIALIZE WEIGHTS OF THE NEURAL NETWORK USING A DEEP BELIEF NETWORK
TAKE A SUB-SAMPLE OF THE FEATURE VECTORS, APPEND TRAINING CLASS LABELS, AND FEED TO THE ANN
TRAIN THE ANN WITH BACKPROPAGATION AND STOCHASTIC GRADIENT DESCENT
Trained Neural Network
EACH LAYER IN THE DBN IS AN RBM, TRAINED USING CONTRASTIVE DIVERGENCE WITH REPEATED GIBBS SAMPLING
From Unsupervised Pre-training to Supervised Learning
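The contrastive-divergence (CD-1) update mentioned above can be sketched on a deliberately tiny RBM; the layer sizes, learning rate, and toy data below are illustrative only, not the DeepSAT configuration:

```python
import math, random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy RBM: 4 visible units, 2 hidden units (illustrative sizes only).
n_vis, n_hid = 4, 2
W = [[random.gauss(0, 0.1) for _ in range(n_hid)] for _ in range(n_vis)]
b_vis = [0.0] * n_vis
b_hid = [0.0] * n_hid

def hidden_probs(v):
    # P(h_j = 1 | v) for each hidden unit.
    return [sigmoid(b_hid[j] + sum(v[i] * W[i][j] for i in range(n_vis)))
            for j in range(n_hid)]

def visible_probs(h):
    # P(v_i = 1 | h) for each visible unit.
    return [sigmoid(b_vis[i] + sum(h[j] * W[i][j] for j in range(n_hid)))
            for i in range(n_vis)]

def cd1_update(v0, lr=0.1):
    # Positive phase: hidden activations driven by the data.
    h0 = hidden_probs(v0)
    # One Gibbs step: sample hidden, reconstruct visible, re-infer hidden.
    h0_sample = [1.0 if random.random() < p else 0.0 for p in h0]
    v1 = visible_probs(h0_sample)
    h1 = hidden_probs(v1)
    # CD-1 update: positive statistics minus reconstruction statistics.
    for i in range(n_vis):
        for j in range(n_hid):
            W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        b_vis[i] += lr * (v0[i] - v1[i])
    for j in range(n_hid):
        b_hid[j] += lr * (h0[j] - h1[j])
    # Squared reconstruction error as a rough progress signal.
    return sum((a - b) ** 2 for a, b in zip(v0, v1))

# Two binary vectors standing in for the feature sub-samples above.
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
errs = [sum(cd1_update(v) for v in data) for _ in range(200)]
```

In the full pipeline, the weights learned this way initialize one ANN layer; stacking such RBMs gives the DBN used for pre-training before supervised backpropagation.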
Learning Module – Testing/Prediction Phase.
NAIP Tile
EXTRACT FEATURE VECTORS (CCM, DCT, NDVI, EVI)
NORMALIZE DATA AND FEED TO THE TRAINED NEURAL NETWORK
PREDICT CLASS AND GENERATE LABELS -> CLASS MASK
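Two of the spectral features in the extraction step have simple closed forms; a per-pixel sketch (the reflectance values are made up, and the EVI coefficients are the standard MODIS ones, not taken from the slide):

```python
def ndvi(nir, red):
    # Normalized Difference Vegetation Index.
    return (nir - red) / (nir + red)

def evi(nir, red, blue, G=2.5, C1=6.0, C2=7.5, L=1.0):
    # Enhanced Vegetation Index with the standard MODIS coefficients.
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)

# A healthy-vegetation pixel: high NIR, low red reflectance.
print(round(ndvi(0.5, 0.1), 3))  # -> 0.667
```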
Basu, S., Ganguly, S., Nemani, R. R., Mukhopadhyay, S., et al., "A Semiautomated Probabilistic Framework for Tree-Cover Delineation From 1-m NAIP Imagery Using a High-Performance Computing Architecture," IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 10, pp. 5690-5708, Oct. 2015. doi: 10.1109/TGRS.2015.2428197.
DeepSAT
Experimental Results.
                          Densely forested  Fragmented forests  Urban areas  Overall
Total samples                       12,000              12,000       12,000   36,000
Tree samples                         6,000               6,000        6,000   18,000
Non-tree samples                     6,000               6,000        6,000   18,000
True Positive Rate (%)               85.87               88.26        73.65    82.59
False Positive Rate (%)               2.21                0.99         1.98     1.73
Total scenes processed = 11095 for the whole of California
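The rates in the table follow the usual confusion-matrix definitions; a minimal sketch (the counts below are made up for illustration, not from the California run):

```python
def rates(tp, fp, tn, fn):
    # True-positive rate (recall) and false-positive rate
    # computed from confusion-matrix counts.
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr

# Illustrative counts for a tree / non-tree classifier.
print(rates(90, 2, 98, 10))  # -> (0.9, 0.02)
```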
DeepSAT
California Tree Cover Mosaic.
San Francisco Bay Area
DeepSAT
DeepSat – A Learning Framework for Satellite Imagery
MODELS: Feature-enhanced DBN, CNN, Stacked Autoencoder
OUR DATA:
SAT-4 – 500,000 image patches, 4 land cover types (barren, tree, grass, all other)
SAT-6 – 405,000 image patches, 6 land cover types (barren, tree, grass, road, building, water bodies)
RESULTS (classifier accuracy, %):
SAT-4: Feature-enhanced DBN 97.946, CNN 86.827, Stacked Autoencoder 79.978
SAT-6: Feature-enhanced DBN 93.916, CNN 79.063, Stacked Autoencoder 78.430
Saikat Basu, Sangram Ganguly, Supratik Mukhopadhyay, Robert DiBiano, Manohar Karki, and Ramakrishna Nemani, "DeepSat – A Learning Framework for Satellite Imagery," ACM SIGSPATIAL 2015.
Saikat Basu, Manohar Karki, Sangram Ganguly, Robert DiBiano, Supratik Mukhopadhyay, and Ramakrishna Nemani, "Learning Sparse Feature Representations using Probabilistic Quadtrees and Deep Belief Nets," European Symposium on Artificial Neural Networks (ESANN) 2015.
CNN: Convolutional Neural Network
SATNet – the satellite imagery training database & model zoo.
Model zoo: pre-trained models for different satellites (e.g., Landsat, Sentinel-1/2, WorldView, etc.).
Inspired by ImageNet, we are building a large database of labeled satellite/aerial imagery. Labeled data are generated by experts using a GUI interface and cover different land cover classes – trees, barren lands, shrubs, rooftops, water bodies, and much more. The goal is to create a dataset with 5,000,000 labeled samples.
Current state of the art – deeper architectures (SegNet).
DeepSAT
Original NAIP data (RGB)
Forest mask
NAIP forest scene
Training Data collection
Training data generation.
DeepSAT
Phase 1 (17 US states – AL, AR, AZ, CO, CT, DE, FL, GA, IA, ID, IL, IN, KS, KY, LA, MA, MD):
Selection of 10 random NAIP forest scenes for each US state
Dividing each scene into 200 patches (dimension: 600 x 600)
Random selection of 60 patches
Forest mask generation from each patch
Total number of forest masks generated = 1020
Phase 2 (8 US states – CA, ME, MI, MN, MO, MS, MT, NC; in progress)
Sample training data – forest/ non-forest.
DeepSAT
Roadmap.
DeepSAT
Phase I (started & nearly complete):
• Generate training data – 60 patches of 600 x 600 for 50 states
• Establish baseline land cover classification accuracy using CNN-based segmentation architectures such as SegNet and FCN
• Scale results to all 50 states (~330,000 tiles of 7000 x 8000 pixels)
Phase II (in progress):
• Extend SegNet & FCN architectures to fuse information from multiple bands and other data
Phase III:
• Develop techniques to combine large amounts of unlabeled data with supervised techniques to further improve classification (e.g., a model learnt from a DBN as a replacement for the fully connected layer in a CNN)
SegNet Sample Result.
DeepSAT
Input image Ground truth 100,000 iterations
6000x6000 NAIP Tile
6000x6000 Prediction
THE WHOLE TILE
Statistics (overall):
Total samples: 14,565
True Positive Rate (%): 83.29
False Positive Rate (%): 4.61
Accuracy (%): 92.87
Time to predict a tile: ~2 minutes
Super Resolution CNNs to Downscale Climate Models.
DeepSAT
Global climate models (GCMs) run at low resolutions (>100 km), but climate change impacts are local.
Downscaling problem: learn a mapping from low-resolution GCM precipitation (and high-resolution topography) to high-resolution observed precipitation.
Proposed method: apply a Super-Resolution CNN (SRCNN) [Dong 2014] to downscaling.
1. Dong, Chao, et al., "Learning a Deep Convolutional Network for Image Super-Resolution," European Conference on Computer Vision, Springer International Publishing, 2014.
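SRCNN-style downscaling first interpolates the coarse field onto the fine target grid and then lets the CNN learn a correction on top. A 1-D sketch of just that interpolation step (the function and values are illustrative, not the PRISM pipeline):

```python
def upsample_linear(coarse, factor):
    # Interpolate a coarse 1-D series onto a grid `factor` times finer --
    # the pre-processing step before an SRCNN learns the residual detail.
    fine = []
    for i in range(len(coarse) - 1):
        for k in range(factor):
            t = k / factor
            fine.append((1 - t) * coarse[i] + t * coarse[i + 1])
    fine.append(coarse[-1])
    return fine

print(upsample_linear([0.0, 3.0], 3))  # -> [0.0, 1.0, 2.0, 3.0]
```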
SRCNN Architecture.
DeepSAT
SRCNN Training.
DeepSAT
Data:
- Precipitation from PRISM at 4 km, upscaled to 16 km and 50 km, interpolated to a 16 km grid
- Training years: 1981-2000
- Testing years: 2001-2015
Mapping: 50 km -> 16 km
“Sub-images”:
- Crop 51x51 patches with stride 30
- Count: >10 million in the training set
Computing: TensorFlow + 4 GPUs on the NVIDIA DevBox
Figure: https://www.tensorflow.org/versions/r0.11/tutorials/deep_cnn/index.html
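The "51x51 patches with stride 30" figure determines how many sub-images a field yields; a small sketch (the 621-pixel grid size below is hypothetical, chosen only for illustration):

```python
def patches_per_axis(size, patch=51, stride=30):
    # Number of valid crop positions along one axis of the field.
    return (size - patch) // stride + 1

# A hypothetical 621 x 621 field cropped into 51x51 patches at stride 30:
print(patches_per_axis(621) ** 2)  # -> 400 patches per field
```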
Daily RMSE over the Continental United States.
DeepSAT
RMSE: BCSD = 2.23, SRCNN = 1.57
* BCSD is a widely used statistical downscaling method for global climate models
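The comparison above is daily root-mean-square error between downscaled and observed precipitation; a minimal sketch of the metric (the sample values are made up):

```python
import math

def rmse(pred, obs):
    # Root-mean-square error over paired daily values.
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Toy daily precipitation series (mm/day), purely illustrative.
print(round(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]), 3))  # -> 1.155
```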
Massively Leveraging NVIDIA.
DeepSAT
All our CNN models are trained on an NVIDIA DevBox cluster.
The Pleiades GPU cluster is used to train models with multiple sets of train/test datasets – both DBN and CNN architectures.
DGX-1-like systems are important!
Model size: It has recently been shown that deep neural networks such as the 152-layer ResNet architecture outperform shallower networks such as VGG-16. The increase in layers leads to a tremendous increase in the number of parameters, so the mathematical operations needed, such as gradients and nonlinear functions of the inputs, also increase. These networks require ~15 billion FLOPs, and this number will keep growing as networks increase in size and complexity. Model complexity and input data size are limited by GPU memory; with 128 GB of aggregate GPU memory, more complicated models can be used for experimentation. Increasing the input size also has desirable effects: gradient descent will have less noise, and, in the context of CNNs, bigger images can provide more context for classification and hence improve classification/segmentation accuracy.
Training time: With the P100's improvements over the last generation (the Maxwell architecture) and a significant increase in the number of cores, training time will be much shorter. Faster training means more experimentation and faster innovation. The Pascal architecture has been shown to perform 5-7x faster than the last generation on some deep learning benchmarks.
Going Forward.
DeepSAT
• NEX-AI’s core focus is blending physical models with state-of-the-art machine learning frameworks to address NASA’s mission objectives
• NEX-AI currently focuses on a number of problems related to satellite image classification, climate downscaling, and large-scale anomaly detection
• DeepSAT will provide the current modeling frameworks along with access to training data for NEX users
• NEX-AI will collaborate with industry-leading experts in testing newer AI frameworks and in defining hard problems in the land-climate-atmosphere continuum that can “possibly” be solved by clever ensemble learning models
Contact Email: [email protected]