Big Data, Cloud, and the NOAA CRADA at The Climate Corporation

Upload: valliappa-lakshmanan

Post on 14-Apr-2017

TRANSCRIPT

Page 1: Big data, Cloud, and the NOAA CRADA at The Climate Corporation

© 2015 The Climate Corporation All Rights Reserved

Big Data, Cloud and the NOAA CRADA at the Climate Corporation

Page 2: Big data, Cloud, and the NOAA CRADA at The Climate Corporation

The Climate Corporation (TCC) provides decision-making tools for farmers

Weather is both a feature that we provide to our users and an input to our agronomic models

http://www.climate.com/

Page 3: Big data, Cloud, and the NOAA CRADA at The Climate Corporation

Big Data refers to computational limitations of the analysis that you are performing

http://goo.gl/NB5hDO

Page 4: Big data, Cloud, and the NOAA CRADA at The Climate Corporation

The Cloud is remote computational infrastructure, typically something you rent

https://cloud.google.com/

https://aws.amazon.com/

Elastic Compute Cloud (EC2)

Simple Storage Service (S3)

Page 5: Big data, Cloud, and the NOAA CRADA at The Climate Corporation

The old way to do large-scale analysis at TCC

NOAA NCDC Radar Archive

Simple Storage Service (S3): TCC S3 bucket

Elastic Compute Cloud (EC2)

1. Request

2. Launch

3. Download

4. Upload

5. Validate

6. Launch

7. Compute

Page 6: Big data, Cloud, and the NOAA CRADA at The Climate Corporation

The NOAA CRADA puts NEXRAD data on S3

Simple Storage Service (S3): Amazon public bucket

Elastic Compute Cloud (EC2)

1. Launch

2. Compute
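
As a rough illustration of the two-step workflow above, an EC2 instance could list radar volumes directly from the public bucket before computing on them. This is only a sketch: the slides do not name the bucket or its layout, so the bucket name `noaa-nexrad-level2`, the `YYYY/MM/DD/SITE/` key layout, and the helper names `nexrad_prefix` and `list_volume_keys` are all assumptions made for the example. It uses `boto3` with unsigned (anonymous) requests.

```python
from datetime import date
from typing import List

# Assumption: the slides do not give the bucket name or key layout.
NEXRAD_BUCKET = "noaa-nexrad-level2"


def nexrad_prefix(day: date, site: str) -> str:
    """Build the key prefix for one radar site on one day.

    Assumes keys are laid out as YYYY/MM/DD/SITE/...
    """
    return f"{day:%Y/%m/%d}/{site.upper()}/"


def list_volume_keys(day: date, site: str, bucket: str = NEXRAD_BUCKET) -> List[str]:
    """List object keys for one site-day using anonymous S3 access."""
    # Imported lazily so nexrad_prefix works without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    paginator = s3.get_paginator("list_objects_v2")
    keys: List[str] = []
    for page in paginator.paginate(Bucket=bucket, Prefix=nexrad_prefix(day, site)):
        # "Contents" is absent when a page has no matching objects.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

Calling `list_volume_keys(date(2015, 5, 6), "KTLX")` from an instance in the same region would return the keys to process, with no request to NCDC and no separate download, upload, or validation pass.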

Page 7: Big data, Cloud, and the NOAA CRADA at The Climate Corporation

Everyone wins

TCC projects are several weeks shorter

TCC evaluations of new methods happen on larger datasets

We don’t pay Amazon for the S3 bucket to store NEXRAD data

Instead, we pay Amazon for the EC2 instances to process the larger dataset

NOAA data is used more widely, but without overwhelming NCDC

TCC/AWS found a long-standing problem in the NOAA archive, improving data quality

Opportunity around the other two components of the cloud