SC5 Hangout2 pilot 1 description
SC5 1st Pilot Hangout
BASIC AIM
To demonstrate what can be achieved through the BDE platform in:
o Managing large volumes of climate / weather numerical data
o Ingestion / exporting of data
o Analytics potential
o Data lineage
Downscaling
Downscaling of climatic and / or meteorological data:
o Essential first step for any further analysis, assessment or processing in climate and related domains
BDE SC5 Pilot I - Architecture
[Architecture diagram. Component labels:]
o Cassandra: metadata & data lineage
o Hive/Hadoop: raw data & analytics
o WRF Model
o Institutional resource connectors
o NetCDF
o Interfaces and visualisation
o SC5 Pilot
Current status
Operations
o Data ingestion (NetCDF files)
    Both manually, for bootstrapping, as well as after downscaling
o Data export (NetCDF files)
    Selection of variables / time slices
o Start and monitor WRF-based downscaling on institutional resources
    If requested results already exist, they are retrieved
    If not, WRF is started
o Maintain data lineage records on BDE platform
    Monitoring and further analysis
    Subset of W3C PROV, http://www.w3.org/TR/prov-overview
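The "retrieve if it exists, otherwise start WRF" operation, together with the lineage record kept for each run, can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory `results` and `lineage` stores, the `run_wrf` placeholder and the record field names are hypothetical stand-ins (loosely modelled on W3C PROV terms), not the pilot's actual Cassandra schema or WRF connector.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical in-memory stand-ins for the pilot's stores.
results = {}   # request key -> name of the downscaled NetCDF file
lineage = []   # PROV-style lineage records, one per WRF run


def request_key(params):
    """Build a stable key for a downscaling request, independent of key order."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def run_wrf(params):
    """Placeholder for starting WRF on institutional resources (assumption)."""
    return f"wrf_out_{request_key(params)}.nc"


def get_downscaled(params, runner=run_wrf):
    """Return (output, started_new_run) for a downscaling request."""
    key = request_key(params)
    if key in results:
        # Requested results already exist: retrieve them, do not rerun WRF.
        return results[key], False

    started = datetime.now(timezone.utc).isoformat()
    output = runner(params)
    ended = datetime.now(timezone.utc).isoformat()

    results[key] = output
    # Field names loosely follow PROV (entity, activity, used); illustrative only.
    lineage.append({
        "entity": output,
        "activity": "wrf-downscaling",
        "used": params,
        "startedAtTime": started,
        "endedAtTime": ended,
    })
    return output, True
```

Calling `get_downscaled` twice with the same parameters starts the runner only once. The lineage record is kept separately from the result itself, so it can still describe data that are later removed from the store.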
Current status
o Support basic analytics on BDE
    Hive queries
o Console-based UI
    Python/Jupyter interface for demonstration
Sample analytics
Climate-change indices / analytics (indicative)
o Number of summer days, frost days
o Tropical nights
o Monthly minimum value of daily maximum temperature
o Precipitation-based statistics
o Etc.
Analytics for other applications
o Comfort indices (temperature – humidity)
o Risk for forest fires (wind speed – temperature – humidity)
o Atmospheric pollution (wind speed – vertical gradient of temperature – heat fluxes)
o Etc.
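A few of the climate-change indices listed above can be sketched in plain Python. The slides do not state thresholds, so the values below are assumptions taken from the commonly used ETCCDI definitions (summer day: daily maximum above 25 °C; frost day: daily minimum below 0 °C; tropical night: daily minimum above 20 °C). The function names are illustrative and temperatures are assumed to be in degrees Celsius. In the pilot itself such analytics would run as Hive queries rather than in local Python.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def summer_days(tmax: Iterable[float]) -> int:
    """Count days whose daily maximum temperature exceeds 25 °C (assumed threshold)."""
    return sum(1 for t in tmax if t > 25.0)


def frost_days(tmin: Iterable[float]) -> int:
    """Count days whose daily minimum temperature is below 0 °C (assumed threshold)."""
    return sum(1 for t in tmin if t < 0.0)


def tropical_nights(tmin: Iterable[float]) -> int:
    """Count days whose daily minimum temperature exceeds 20 °C (assumed threshold)."""
    return sum(1 for t in tmin if t > 20.0)


def monthly_min_of_tmax(
    records: Iterable[Tuple[date, float]]
) -> Dict[Tuple[int, int], float]:
    """Monthly minimum value of daily maximum temperature, keyed by (year, month)."""
    by_month = defaultdict(list)
    for day, tmax in records:
        by_month[(day.year, day.month)].append(tmax)
    return {month: min(values) for month, values in by_month.items()}
```

Each function takes a daily series for a single grid point. Applying them over a whole downscaled grid is a matter of grouping by location first.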
Further pilot development
Investigation regarding transparent climate NetCDF transformation tailored to the WRF model, using the BDE integrator (esp. Spark)
Testing and further development regarding data lineage and downscaling parameterisation and execution
Expected added value
Scalability and ease in managing large data sets
Efficient use of institutional resources in performing downscaling computations
o Avoiding calculating products when not needed
Data lineage
o either for existing data in the database, or for data that are not present anymore
o reproducibility
Hands-on
The Jupyter notebook is accessible at:
o https://143.233.226.108
(please bypass the warnings)