cyberinfrastructure (geoteci) lab dr. jianting zhang ...jzhang/geoteci_intro.pdfbig geospatial data...

9
Geospatial Technologies and Environmental CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang Department of Computer Science The City College of New York http://www-cs.ccny.cuny.edu/~jzhang/ Affiliated Institutions Collaborating Institutions Students: Simin You (Ph.D. 2009 -), Siyu Liao (Ph.D. 2014-), Costin Vicoveanu (Undergraduate, 2014-) Bharat Rosanlall (Undergraduate, 2014), Jay Yao (MS-thesis, 2011-2012), Chandrashekar Singh (MS 2013), Agniva Banerjee (MS, 2012), Roger King (MS, 2012), Wahyu Nugroho (MS, 2011), Xiao Quan Cen Feng (MS 2011), Chetram Dasrat (Undergraduate, 2008)

Upload: others

Post on 07-Jun-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

Geospatial Technologies and Environmental

CyberInfrastructure (GeoTECI) Lab

Dr. Jianting Zhang

Department of Computer Science

The City College of New York

http://www-cs.ccny.cuny.edu/~jzhang/

Affiliated Institutions

Collaborating Institutions

Students: Simin You (Ph.D. 2009 -), Siyu Liao (Ph.D. 2014-), Costin Vicoveanu (Undergraduate, 2014-)

Bharat Rosanlall (Undergraduate, 2014), Jay Yao (MS-thesis, 2011-2012), Chandrashekar Singh (MS

2013), Agniva Banerjee (MS, 2012), Roger King (MS, 2012), Wahyu Nugroho (MS, 2011), Xiao Quan

Cen Feng (MS 2011), Chetram Dasrat (Undergraduate, 2008)

Page 2: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

Computational

Geometry

Spatial Databases:

data modeling,

indexing, query

processing

Scientific Data/Information Visualization

Statistics/Machine learning

Geographical Information System

GIS

Environmental

Modeling

Air quality

Remote

Sensing

Hydrology

Ecology

Social-

Economic

Modeling Urban planning

Transportation

Census/Taxation

Social Studies

Computer

Graphics

Image Processing/Computer Vision

Page 3: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

GeographyGIS ApplicationsRemote Sensing

Ecological Informatics

Computer Science •Spatial Databases

•Data Mining

Environmental sciences Computer Science

Page 4: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

Big Geospatial Data Challenges

– Event Locations, trajectories and O-D data• E.g., Taxi trip records (GPS traces or O-D locations)

• 0.5 million in NYC (medallion taxi cab only) and 1.2 million in Beijing per day

• From O-D locations to trajectories to frequent patterns

– Satellite: e.g., from GOES to GOES-R (2015/2016) [$11B] • http://www.goes-r.gov/downloads/GOES-R-Tri-10-06-09_v7.pdf

• Spectral (3X)*spatial (4X)* temporal (5X)=60X

• 2km*2km*5min*16bands(360*60)*(180*60)*(12*24)*16~ 1+ trillion pixels per day

• Derived thematic data products (vector)

– http://www.goes-r.gov/products/baseline.html

– http://www.goes-r.gov/products/option2.html

– Species distributions• E.g. 400+ million occurrence records (GBIF)

• E.g. 717,057 polygons and 78,929,697 vertices for 4148 birds distribution data (NatureServe)

Page 5: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

Cloud computing+MapReduce+Hadoop

A

B

C

Thread Block

CPU Host (CMP)

Core

Local Cache

Shared Cache

DRAM

HDD SSD

GPU

SIMD

PCI-E

Ring Bus

Local Cache

Core ... Core

Core Core

GDRAM GDRAM

Core Core Core... ...

MIC

PCI-E

T0 T1

T2 T3

4-Threads

In-Order

16 Intel Sandy Bridge CPU cores+ 128GB RAM + 8TB disk + GTX TITAN + Xeon Phi 3120A ~ $9,994

Page 6: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

ASCI Red: 1997 First 1 Teraflops

(sustained) system with 9298 Intel Pentium

II Xeon processors (in 72 Cabinets)

•Feb. 2013

•7.1 billion transistors (551mm²)

•2,688 processors •4.5 TFLOPS SP and 1.3 TFLOPS DP

•Max bandwidth 288.4 GB/s

•PCI-E peripheral device

•250 W (17.98 GFLOPS/W -SP)

• Suggested retail price: $999

What can we do today using a device that is more

powerful than ASCI Red 16 years ago?

Page 7: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

GeoTECI@CCNY

•HIGHEST-DB

•HIgh-performance GrapHics units based Engine for Spatial-Temporal data

•Spatial and Spatiotemporal indexing, query processing and optimization

•Trajectory data management on GPUs

•Segmentation/simplification/compression/Aggregation/Warehousing

•Map matching with road networks

•Data mining (moving cluster, convoy, swarm...)

… when yellow cabs,

green cabs and MTA

buses meet with multi-

core CPUs, GPUs and

MICs in NYC …

$449,845/4yr (08/01/2013-07/31/2017)

Page 8: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

GeoTECI@CCNY

High-resolution Satellite ImageryT

In-situ Observation Sensor Data

T

Global and Regional Climate Model Outputs

Data AssimilationZonal Statistics

TEcological, environmental and administrative zones

T

T

V

Temporal Trends

High-End Computing Facility

A

B

C

Thread Block

ROIs

… when GOES-R satellites, extratropical

cyclones and hummingbirds meet with TITAN …

Page 9: CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang ...jzhang/geoteci_intro.pdfBig Geospatial Data Challenges –Event Locations, trajectories and O-D data •E.g., Taxi trip records

GeoTECI@CCNYCCNY Computer Science LAN

Microway

Dual 8-core

128GB memory

Nvidia GTX Titan

Intel Xeon Phi 3120A

8 TB storage

DIY

*2

SGI Octane III

Dual Quadcore

48GB memory

Nvidia C2050*2

8 TB storage

Dual-core

8GB memory

Nvidia GTX Titan

3 TB storage

Dell T5400

Dual Quadcore

16GB memory

Nvidia Quadro 6000

1.5 TB storage

Lenovo T400s

Dell T7500

Dual 6-core

24 GB memory

Nvidia Quadro 6000

Dell T7500

Dual 6-core

24 GB memory

Nvidia GTX 480

Dual Quadcore

16GB memory

Nvidia FX3700*2

Dell T5400

DIY

Quadcore (Haswell)

16 GB memory

AMD/ATI 7970

Quadcore

8 GB memory

Nvidia Quadro 5000m

HP 8740w

HP 8740w

CUNY HPCC

KVM

“Brawny” GPU cluster

“Wimmy” GPU cluster

Web Server/

Linux App Server Windows

App Server

...building a highly-configurable experimental computing

environment for innovative BigData technologies…