CUAHSI Conference on Hydroinformatics July 29 - 31, 2019 Brigham Young University Provo, Utah Hydroinformatics for scientific knowledge, informed policy, and effective response


CUAHSI Conference on Hydroinformatics
July 29 - 31, 2019

Brigham Young University
Provo, Utah

Hydroinformatics for scientific knowledge, informed policy, and effective response

TABLE OF CONTENTS

AGENDA

BYU CAMPUS DIRECTIONS

KEYNOTE SPEAKERS

PLENARY LIGHTNING TALKS

SESSIONS

WORKSHOPS

TOWN HALL

POSTER PRESENTATIONS

ACKNOWLEDGEMENTS

CUAHSI would like to acknowledge the contributions, support, and assistance from the following organizations:

Conference on Hydroinformatics Program Committee
Daniel P. Ames (Brigham Young University)
Sara Larsen (Upper Colorado River Commission)
Emilio Mayorga (University of Washington)
Lauren Patterson (Duke University)
Jon Pollak (CUAHSI)
Jordan Read (U.S. Geological Survey)
Dwane Young (U.S. Environmental Protection Agency)

Conference Session Chairs, Speakers and Workshop Organizers

Brigham Young University

CUAHSI Member Institutions

National Science Foundation

AGENDA

Monday, July 29

7:00 - 8:30am Registration and Breakfast Engineering Building

(2nd Fl)

8:45 - 9:00am Welcome Address

Speaker: Jerad Bales (CUAHSI)

Engineering Building

(Room 204/206)

9:00 - 10:00am Keynote: Incubating progress in hydrologic big data deep learning as a community

Speaker: Chaopeng Shen (Pennsylvania State University)

Engineering Building

(Room 204/206)

10:00 - 10:30am Refreshment Break Engineering Building

(2nd Fl)

10:30 - 11:00am Plenary Lightning Talks

Leveraging water quality monitoring data through the water quality portal

Speaker: Dwane Young (U.S. Environmental Protection Agency)

HydroQuality: Upload and download quality data

Speaker: Chao Chen (Boise State University)

Model and code sharing via CUAHSI hosted MATLAB online

Speaker: Lisa Kempler (MathWorks)

HydroShare: An overview of new functionality developed in support of collaborative reproducible research

Speaker: David Tarboton (Utah State University)

HydroLearn: Facilitating the development, adaptation and sharing of active-learning resources in hydrology education

Speaker: Emad Habib (University of Louisiana at Lafayette)

Engineering Building

(Room 204/206)

11:00am - 12:00pm Keynote: Enabling global scale water analysis with cloud technologies

Speaker: Tyler Erickson (Google)

Engineering Building

(Room 204/206)


12:00 - 1:30pm Lunch Wilkinson Student Center

(3rd Fl - Room 3250/3252)

Monday Afternoon Concurrent Sessions / Workshops

1:30 - 2:45pm Unveiling new innovations in advanced cyberinfrastructure to support a community hydrologic modeling ecosystem
Conveners: Jonathan L. Goodall (University of Virginia); Anthony M. Castronova (CUAHSI); Christina Bandaragoda (University of Washington)

StreamPULSE: a platform for modeling river and stream metabolism on a global scale
Speaker: Michael Vlah (Duke University)

Design and implementation of cyberinfrastructure to support a cloud-based, community hydrologic modeling ecosystem
Speaker: Young-Don Choi (University of Virginia)

Model and code sharing via CUAHSI hosted MATLAB online
Speaker: Lisa Kempler (MathWorks)

Temporal evapotranspiration aggregation method: An application for calculating evapotranspiration metrics, exploring the modifiable areal unit problem, and shortening the time to science
Speaker: James Matthew Coll (University of Kansas)

A roadmap for Earthdata remote sensing for hydroinformatics
Speakers: Michael Gangl (NASA Physical Oceanography Distributed Active Archive Center); Catalina Oaida (Raytheon); Lewis McGibbney (NASA JPL); Jessica Hausman (NASA JPL)

Engineering Building

(3rd Fl - Room 321)

Workshop: The Western States Water Council Water Data Exchange (WaDE) - Hands-on use cases for insights into water rights and use in the Western United States
Instructors: Adel Abdallah, Western States Water Council and Sara Larsen, Upper Colorado River Commission

Engineering Building

(2nd Fl - Room 221)

Workshop: HydroQuality: Upload and download quality data
Instructors: Chao Chen (Boise State University); Connor Scully-Allison (University of Arizona); Chase Carthen (University of Nevada, Reno); Rui Wu (East Carolina University)

Clyde Building

(2nd Fl - Room 234)

2:45 - 3:00pm Refreshment Break Engineering Building

(2nd Fl)


Monday Afternoon Concurrent Sessions / Workshops

3:00 - 4:30pm Continental scale community hydrologic modeling cyberinfrastructure, knowledge representations and data management I
Convener: David Tarboton (Utah State University)

Optimal access to NASA water cycle data
Speaker: Richard Strub (NASA Goddard Earth Sciences Data and Information Services Center)

Hydrologic observation, model, and theory congruence on evapotranspiration variance: diagnosis of continental scale land surface models
Speaker: Ruijie Zeng (Utah State University)

Global monitoring of fresh water at high spatial and temporal resolutions. Assessing stream and lake hydrological/physical features within a machine learning framework
Speaker: Giuseppe Amatulli (Yale University)

NWM-driven hydrodynamic simulations to resolve complex flow dynamics in low gradient watersheds
Speaker: Haitham Saad (University of Louisiana)

A novel multi-scale data fusion framework for massive datasets
Speaker: Dhruva Kathuria (Texas A&M University)

Engineering Building

(3rd Fl - Room 321)

Synergies between mechanistic and machine learning models
Convener: Jordan Read (U.S. Geological Survey)

Physics guided machine learning: A new paradigm for modeling dynamical systems
Speaker: Vipin Kumar (University of Minnesota)

Application of a convolutional neural network to the identification of karst features
Speaker: Scott Haag (Drexel University)

GLADD: A new Global Lake Dynamics Database created using machine learning and satellite data
Speaker: Ankush Khandelwal (University of Minnesota)

Clowder: Open source data sharing leveraging active curation and applied machine learning
Speaker: Bing Zhang (University of Illinois)

Engineering Building

(3rd Fl - Room 325)

Workshop: Developing open source water resources web applications using Tethys platform
Instructor: Dan Ames (Brigham Young University)

Engineering Building

(2nd Fl - Room 221)

4:30 - 5:00pm Afternoon Break

5:00 - 6:30pm Poster Session Wilkinson Student Center (3rd Fl - Room 3280/3290)


Tuesday, July 30

7:00 - 8:45am Registration and Breakfast Engineering Building

(2nd Fl)

8:45 - 9:00am Welcome Engineering Building

(Room 204/206)

9:00 - 10:00am Keynote: Contemporary challenges in optical remote sensing for hydroenvironmental change detection

Speaker: Ni-Bin Chang, University of Central Florida

Engineering Building

(Room 204/206)

10:00 - 10:15am Refreshment Break Engineering Building

(2nd Fl)

10:15 - 11:00am State of CUAHSI Community Cyberinfrastructure

Speaker: Anthony Castronova, CUAHSI

Engineering Building

(Room 204/206)

11:00am - 12:00pm Keynote: Open water data and western state water agencies - notes from the field

Speaker: Sara Larsen, Upper Colorado River Commission

Engineering Building

(Room 204/206)

12:00 - 1:30pm Lunch Wilkinson Student Center

(3rd Fl - Room 3250/3252)


Tuesday Afternoon Concurrent Sessions / Workshops

1:30 - 3:30pm Applied Hydroinformatics
Convener: Kyle Onda (Duke University)

Cyberinfrastructure for intelligent water supply: Measuring water use, conservation, and socio-demographic differences using an inexpensive, high frequency metering system
Speaker: Jeffery S. Horsburgh (Utah State University)

A citizen science approach to streamflow and temperature forecasting
Speaker: Pedro Mauricio Avellaneda-Lopez (Indiana University)

Characterization of water resources using an online groundwater level mapping tool
Speaker: Norm Jones (Brigham Young University)

Using random forest models to predict streamflow metrics in ungauged watersheds and to estimate hydrologic alteration in urban streams
Speaker: Charles Stillwell (U.S. Geological Survey)

Hydrological Event Detection & Analysis (HEDA) tool for streamflow water quality time series
Speaker: Scott Hamshaw (University of Vermont)

Regional flood forecasting applications for the Dominican Republic
Speaker: Jason Biesinger (Brigham Young University)

Engineering Building

(3rd Fl - Room 321)

Continental scale community hydrologic modeling cyberinfrastructure, knowledge representations and data management II
Convener: Jerad Bales (CUAHSI)

The National Hydrologic Model: an infrastructure for collaboration in the hydrologic community

Speaker: Steven Markstrom (U.S. Geological Survey)

Cyberinfrastructure needs for continental-domain hydrological modeling

Speaker: Martyn P. Clark (University of Saskatchewan at Canmore)

Incorporating river geometry in large scale hydrologic and hydrodynamic models

Speaker: Sayan Dey (Purdue University)

Fast summarizing algorithm for polygonal statistics over a regular grid

Speaker: Scott Haag (Drexel University)

Engineering Building

(3rd Fl - Room 325)

Workshop: Facilitating the development, adaptation, and sharing of active-learning resources in hydrology education
Instructors: Emad Habib, University of Louisiana; Melissa Gallagher, University of Louisiana; David Tarboton, Utah State University; Dan Ames, Brigham Young University

Engineering Building

(2nd Fl - Room 221)

1:30 - 5:00pm Workshop: Use GDAL, PKTOOLS and GRASS for massive raster operations in hydrology
Instructor: Giuseppe Amatulli, Yale University

Clyde Building

(2nd Fl - Room 234)

3:30 - 4:00pm Refreshment Break Engineering Building

(2nd Fl)


Tuesday Afternoon Concurrent Sessions / Workshops

4:00 - 6:00pm Advancing cyberinfrastructure for sharing, managing and publishing geoscience data and models in support of transparent, trustworthy and reproducible research
Convener: Daniel Ames (Brigham Young University)

Interoperable Watersheds Network - Open source data publishing

Speaker: Dwane Young (U.S. Environmental Protection Agency)

HydroGlobe - a platform enabling the integration of earth observations in hydrologic models

Speaker: Venkatesh Merwade (Purdue University)

HydroShare: An overview of new functionality developed in support of collaborative reproducible research

Speaker: David Tarboton (Utah State University)

User-defined metadata schemas for hydrological models

Speaker: Jeffrey M. Sadler (University of Virginia)

Simplifying GIS and HIS web service deployment using HydroShare

Speaker: Ken Lippold (Brigham Young University)

Engineering Building

(3rd Fl - Room 321)

Advances in cyberinfrastructure for hydrologic modeling
Convener: Martin Seul (CUAHSI)

Running research models in the cloud for hands-on education

Speaker: Bart Nijssen (University of Washington)

Hyper-resolution flood modeling and mapping using a computationally-efficient distributed modeling approach

Speaker: Siddharth Saksena (Purdue University)

Enabling modeling frameworks with surrogate modeling capabilities

Speaker: Francesco Serafin (Colorado State University)

Addressing challenges for mapping irrigated fields in subhumid temperate U.S. systems by integrating remote sensing and hydroclimatic data

Speaker: Tianfang Xu (Utah State University)

Investigating climate change impacts on water quality using climate stress testing with data-based analysis (machine learning)

Speaker: Khanh Thi Nhu Nguyen (University of Massachusetts Amherst)

Validating the historical simulation and the forecast skill for the streamflow prediction tool. Case study: Nepal, Bangladesh and Colombia

Speaker: Jorge-Luis Sanchez (Brigham Young University)

Engineering Building

(3rd Fl - Room 325)

Workshop: CUAHSI compute services for working with data in the cloud
Instructor: Anthony Castronova, CUAHSI

Engineering Building

(2nd Fl - Room 221)


Wednesday, July 31

7:30 - 8:45am Breakfast Engineering Building

(2nd Fl)

8:45 - 9:00am Welcome Engineering Building

(Room 204/206)

9:00 - 10:00am Keynote: The role of data driven decisions in managing a large water project

Speaker: Gene Shawcroft (Central Utah Water Conservancy District)

Engineering Building

(Room 204/206)

10:00 - 10:15am Refreshment Break Engineering Building

(2nd Fl)

Wednesday Morning Concurrent Sessions / Town Hall / Workshops

10:15 - 11:45am Advances in hydroinformatics and virtual reality
Convener: Ivo Arrey (University of Venda)

Model visualization and virtual reality

Speaker: Ivo Arrey (University of Venda)

Interactive and real-time flood inundation mapping on client-side web systems

Speaker: Ibrahim Demir (University of Iowa)

4D digital watershed: Advanced bedrock-to-canopy characterization for watershed functions

Speaker: Haruko Wainwright (Lawrence Berkeley National Laboratory)

A map algebra approach to analyzing time series of rasters

Speaker: Xingong Li (University of Kansas)

Engineering Building

(3rd Fl - Room 321)

CUAHSI Water Data Services Town Hall
Convener: David Tarboton, Utah State University

Engineering Building

(3rd Fl - Room 325)

Workshop: Discovering and using water quality sampling data from water quality portal
Instructor: Dwane Young (U.S. Environmental Protection Agency)

Engineering Building

(2nd Fl - Room 221)

Workshop: Using NHDPlus value added attributes to create useful analytical tools
Instructor: Alan Rea, U.S. Geological Survey

Clyde Building

(2nd Fl - Room 234)


11:45am - 12:00pm Break Engineering Building

(2nd Fl)

12:00 - 1:00pm Closing Keynote: Transforming science in the 21st century: NSF big ideas and a vision for a national cyberinfrastructure ecosystem

Speaker: Manish Parashar, National Science Foundation

Engineering Building

(Room 204/206)

Enter CUAHSI’s Raffle!

Enter for a chance to win the following:
• One free registration to a training workshop or short course of your choosing (for graduate students and post-docs); or
• One free registration to the CUAHSI Conference on Hydroinformatics (for professionals)

You’re already two-thirds of the way there for an entry! The last step is to complete the feedback survey upon completion of the 2019 CUAHSI Conference on Hydroinformatics. To access the feedback survey, visit: http://www.cvent.com/d/nbqxvf/7E

Winners will be chosen at 10:00 a.m. on Thursday, August 15 and notified by e-mail.

Terms and Conditions: This contest is open to all attendees of the 2019 CUAHSI Conference on Hydroinformatics with the exclusion of CUAHSI employees, officers, and Board of Directors. Must be affiliated with a U.S. university. Limitations apply with use of free registration to a training workshop or short course. Some training workshops or short courses have an application process; in that case, you must be accepted in order to use the free registration. Registration to the CUAHSI Conference on Hydroinformatics will be awarded as a check mailed to your current mailing address.


BYU CAMPUS DIRECTIONS

Note: Guest Parking is near the Daniel Wells ROTC Building. Below is the highlighted walking route to the Engineering Building. It is directly south of the Clyde Building.

KEYNOTE SPEAKERS

Contemporary challenges in optical remote sensing for hydroenvironmental change detection
Ni-Bin Chang, University of Central Florida

Contemporary challenges in optical remote sensing for water quality change detection through various types of satellite earth observations include: 1) complexity in data merging and image fusion for higher spatial and temporal resolution, 2) cross-mission data merging and cloudy image reconstruction with the aid of machine learning and big data analytics, 3) intelligent feature extraction of different environmental quality images utilizing machine learning and big data analytics techniques, and 4) design of integrated decision support systems for smart city initiatives. The recent regime shift of machine learning techniques from regular learning to deep learning to fast learning has triggered a renewed interest in remote sensing image processing for better earth observations. These environmental change detection issues mainly include, but are not limited to, water and air quality monitoring under climate change impact. This presentation will focus on these forefronts and challenges using case studies of Lake Nicaragua and Lake Erie as study sites for water quality monitoring based on both multispectral and hyperspectral remote sensing imagery. Monitoring chlorophyll-a, total phosphorus and total nitrogen concentrations will be discussed for lake eutrophication assessment. The latest advances in Integrated Data Fusion and Mining (IDFM) by fusing images collected from two satellites will be first introduced. To recover the missing information caused by cloud contamination, SMart Information Reconstruction (SMIR) and Spectral Information Adaptation and Synthesis Scheme (SIASS) will be presented interactively for merging cross-mission consistent ocean color reflectance observations and reconstructing the missing pixels with the aid of big data analytics. A decision support system that integrates IDFM, SIASS, and SMIR, known as Cross-mission Data Merging with Image Reconstruction and Mining (CDMIM), will be described with applications to mapping the water quality conditions in Lake Nicaragua. An extended discussion of air quality monitoring in complex urban regions with a similar philosophy will be covered at the end of my presentation.

Ni-Bin Chang is Professor of Environmental Systems Engineering, having held this post in the US since 2002. He received his B.S. degree in Civil Engineering from the National Chiao-Tung University in Taiwan in 1983, and M.S. and PhD degrees in Environmental Systems Engineering from Cornell University in the US in 1989 and 1991, respectively. He is Director of the Stormwater Management Academy and Professor with the Department of Civil, Environmental, and Construction Engineering at the University of Central Florida in the United States. His research lies at the intersection between “Environmental Sustainability”, “Water Resources in Changing Environment”, and “Resilient Infrastructure Systems”. Beginning with the research in the early 1990s through today, these investigations have provided the research focus for a large and diverse scientific community. From August 2012 to August 2014, Professor Chang served as Program Director of the Hydrologic Sciences Program and Cyber-Innovated Sustainability Science and Engineering Program at the National Science Foundation. He has received over thirty-eight awards/honors since 1987 nationally and internationally, including the Outstanding Achievement Award from the American Society of Civil Engineers in 2010, the Fulbright Scholar Award from the Department of State and German-American Fulbright Council in 2012, the Bridging the Gaps Award from the Engineering and Physical Sciences Research Council in the United Kingdom in 2012, the Distinguished Visiting Fellowship from the Royal Academy of Engineering in the United Kingdom in 2014, and the Blaise Pascal Medal from the European Academy of Sciences in 2016, with the citation “for his outstanding contribution in Environmental Sustainability, Green Engineering, and Systems Analysis”. He is a Fellow of the American Society of Civil Engineers (FASCE), the Institute of Electrical and Electronics Engineers (FIEEE), the International Society of Optics and Photonics (FSPIE), the American Association for the Advancement of Science (FAAAS), the Royal Society of Chemistry in the United Kingdom (FRSC), the National Academy of Inventors (FNAI) and the European Academy of Sciences (FEASc).

Enabling global scale water analysis with cloud technologies
Tyler Erickson, Google

Dr. Tyler A. Erickson is a Developer Advocate and manages the Earth Outreach Developer Relations team at Google. In this role, he fosters collaborations with researchers from academia, NGOs, and governmental organizations seeking to capitalize on Earth Engine’s capabilities for geospatial analyses that involve immense satellite and model-based datasets. Dr. Erickson leads the development of Earth Engine’s core efforts in water and climate, guides the evolution of Earth Engine to support these scientific domains, and leads support efforts for the Earth Engine Python API. A snow hydrologist by training, he has degrees in civil & environmental engineering and geography from Colorado State University, California Institute of Technology, Stanford, and the University of Colorado at Boulder. Tyler is a longtime Python programmer and open source contributor, particularly with the OSGeo, NumFocus, and Jupyter projects.


Open water data and western state water agencies - Notes from the field
Sara Larsen, Upper Colorado River Commission

Sara recently became the Deputy Director at the Upper Colorado River Commission, but her prior position was with the Western States Water Council (WSWC). At the Council, she worked with state water administrators appointed by 18 western governors to address water data and science-related challenges in the West. Sara also oversaw the Water Data Exchange (WaDE) Program, a data-sharing platform initiated by the WSWC, its member states, and the Western Federal Agency Support Team (WestFAST) in support of state-curated “open data” exchange and publication. Sara will speak on her experience with WSWC member state agencies, and their collective efforts to adopt and comply with new open data expectations. She will also talk about the challenges many agencies face regarding resources for data programs and rapidly changing technologies that support data processes and publication. Before joining the WSWC, Sara was a research engineer at Los Alamos National Laboratory in the Decision Applications Division. She also worked for the State of Utah Division of Water Resources in their Hydrology and Computer Modeling and GIS sections. Ms. Larsen received a BS in Geography/GIS and an MS in Civil and Environmental Engineering with a Water Resources emphasis from the University of Utah. She is also a licensed professional engineer in the State of Utah and is active in many civic and professional groups and boards.

Transforming science in the 21st century: NSF big ideas and a vision for a national cyberinfrastructure ecosystem
Manish Parashar, National Science Foundation

The NSF Big Ideas represent long-term research investments aimed at catalyzing new, cross disciplinary and convergent research at the frontiers of science and engineering that has the potential for transforming science and society in the 21st century. Research and advanced research infrastructure investments by the Office of Advanced Cyberinfrastructure (OAC) are playing a key role across these big ideas and are central to realizing their envisioned impact. This talk will provide an overview of the NSF Big Ideas and will highlight strategic directions, priorities, programs and investments at OAC to support their realization.

Manish is Office Director of the Office of Advanced Cyberinfrastructure at NSF. He joins NSF from Rutgers, The State University of New Jersey, where he is currently a Distinguished Professor and the founding Director of the Rutgers Discovery Informatics Institute. His research interests are in the broad areas of Parallel and Distributed Computing and Computational and Data-Enabled Science and Engineering. Manish is a Fellow of AAAS, a Fellow of IEEE/IEEE Computer Society, and an ACM Distinguished Scientist.


The role of data driven decisions in managing a large water project
Gene Shawcroft, Central Utah Water Conservancy District

Created in 1964, the Central Utah Water Conservancy District covers all or parts of 8 Utah counties, and over 60% of the State’s population lives within the District boundaries. The District operates and maintains the large Federal Central Utah Project with roughly 2 MAF of storage. In addition, it owns, operates, and maintains 3 surface water treatment plants and a $400M water development project. The annual budget is $200M and just over 100 great employees work across the District boundaries. Gene has the honor of being the General Manager for the District.

Gene grew up on a farm in Southern Colorado and enjoyed working and playing in the water. After graduating from Brigham Young University with a B.S. and M.S. in Civil Engineering, Gene joined the State of Utah working for the Division of Water Resources. He joined Central Utah Water Conservancy District in 1991 and worked as a Project Engineer, Assistant General Manager, and Deputy General Manager, and was appointed General Manager in January of 2015. The District operates and maintains the massive Federal Central Utah Project, a large local Central Water Project and various water treatment plants and related facilities and works diligently to conserve our precious water resources.

Gene is a licensed Professional Engineer in Utah and is active in various professional groups and serves on several governing boards in the water industry. He is a frequent lecturer on water related topics to professional, academic and civic groups. Gene enjoys his work, outdoor activities and his family. Gene is married to Janeen and between them they have 9 children and 20 grandchildren.


Incubating progress in hydrologic big data deep learning as a community
Chaopeng Shen, Pennsylvania State University

Recently, deep learning (DL) has emerged as a revolutionary and versatile tool transforming industry applications and generating new and improved capabilities for scientific discovery. In this talk I discuss recent applications of hydrologic DL from both our group and the community, including time series modeling of soil moisture and streamflow, surrogate modeling, information retrieval, and knowledge discovery. Taking a broad, transdisciplinary view, I argue that DL-based methods open up new research avenues toward knowledge discovery. However, hydrology presents many challenges for DL methods, such as data limitations, heterogeneity and co-evolution, and the general inexperience of the hydrologic field with DL. The roadmap toward DL-powered scientific advances will require a coordinated effort from a large community involving scientists and citizens. Integrating process-based models with DL models will help alleviate data limitations. I outline several key steps by which the community, together, can help incubate progress in hydrologic DL, including open competitions, open models and datasets, and revamped educational programs. Competitions could serve as the organizing events. The area of hydrologic DL presents numerous research opportunities that could, in turn, stimulate advances in machine learning as well.

Chaopeng Shen is an Associate Professor in Civil Engineering at The Pennsylvania State University. He received the Ph.D. degree in environmental engineering from Michigan State University, East Lansing, MI, USA, in 2009. He developed the Process-based Adaptive Watershed Simulator, a hydrologic model. He was a Post-Doctoral Research Associate with the Lawrence Berkeley National Laboratory, Berkeley, CA, USA, from 2011 to 2012. His recent efforts have focused on harnessing big data and machine learning opportunities to advance hydrologic prediction and understanding. His research interests also include interactions between water and ecosystems, floodplain systems, scaling issues, process-based hydrologic modeling, multiscale modeling, and hydrologic data mining. He is currently an Associate Editor of Water Resources Research.


Plenary Lightning Talks

This session will highlight new community resources in an abbreviated format. Each speaker will have five minutes to present their new technology.

Leveraging water quality monitoring data through the water quality portal
Dwane Young (U.S. Environmental Protection Agency)

HydroQuality: Upload and download quality data
Chao Chen (Boise State University)

Model and code sharing via CUAHSI hosted MATLAB online
Lisa Kempler (MathWorks)

HydroShare: An overview of new functionality developed in support of collaborative reproducible research
David Tarboton (Utah State University)

HydroLearn: Facilitating the development, adaptation and sharing of active-learning resources in hydrology education
Emad Habib (University of Louisiana at Lafayette)

Advances in cyberinfrastructure for hydrologic modeling
Convener: Martin Seul (CUAHSI)

Advances in computational resources continue to improve hydrologic modeling capabilities by creating new data sources, developing new modeling frameworks, and offering increased data storage and analysis capacity with cloud resources. This session will include presentations that highlight cyberinfrastructure enabling the hydrologic modeling community to advance the state of water predictions in the United States and around the world.

Running research models in the cloud for hands-on education
Speaker: Bart Nijssen (University of Washington)
Co-Authors: Andrew Bennett (University of Washington); Joseph J. Hamman (National Center for Atmospheric Research)

It can be challenging to provide students with access to research hydrological models for online and in-class projects and assignments. Some models require libraries and compilers that are not readily available in all computing environments. Students may have limited experience in managing computing environments other than the one they are already familiar with. At the same time, it can be difficult for an instructor to reproduce and troubleshoot errors if students use different computing setups. Over the last year, we have used cloud-based computing environments provided by the Pangeo project to run the SUMMA hydrological model from within Jupyter notebooks to expose students to research-grade hydrological models. We have done this as part of an online snow course taught through CUAHSI’s Virtual University, as part of Waterhackweek, and as part of an Advanced Hydrology course at the University of Washington. In some of these exercises, we also used HydroShare to share model setups. Using this cyberinfrastructure has allowed us to provide a scalable, controlled environment to all students, while removing some of the hurdles that can make it difficult for students to run these research models. We will discuss our setup and our experiences with using this cyberinfrastructure for education and for model tutorials.

Hyper-resolution flood modeling and mapping using a computationally-efficient distributed modeling approach
Speaker: Siddharth Saksena (Purdue University)
Co-Authors: Venkatesh Merwade (Purdue University); Peter Singhofen (Purdue University); Sayan Dey (Purdue University)

Due to the increased frequency of high-magnitude floods in the US, there is a need to develop large-scale flood modeling and alert systems that can disseminate accurate information on flood hydrodynamics and inundation extent at a very fine spatial resolution. The objective of this presentation is to describe a physically-based but computationally-efficient approach for large-scale (area > 10,000 km²) flood modeling of unprecedented events using a distributed model called ICPR (Interconnected Channel and Pond Routing). Specifically, the presentation will highlight the following: (i) incorporation of natural and engineered systems for which data typically do not exist or are difficult to get; (ii) the role of a flexible computational mesh structure in the model’s performance; and (iii) accuracy in simulating the flood hydrodynamics for use in emergency operations. Application of this approach at multiple spatial scales, ranging from a few hundred square kilometers to a hundred thousand square kilometers, will be discussed.

SESSIONS

Enabling modeling frameworks with surrogate modeling capabilities
Speaker: Francesco Serafin (Colorado State University)
Co-Authors: Olaf David (Colorado State University); Jack Carlson (Colorado State University); Timothy Green (USDA Agricultural Research Service); Kipka Holm (USDA Agricultural Research Service); Andre Dozier (Colorado State University); Charles Ehlschlaeger (U.S. Army Corps of Engineers)

Applications of conceptual and physically-based environmental models pervade both research and planning/consulting environments. However, due to their complexity, data resolution requirements, large number of parameters, platform dependencies, and other factors, the suitability of these models “out of the box” for field and consulting applications often becomes problematic. Operating an entire system requires dedicated knowledge, extensive setup, and sometimes significant computational time. Conversely, questions from field applications require quick and “accurate enough” answers. Consequently, attempts to use physically-based research models in the field cause problems with timely delivery of the models themselves, IT deployment infrastructure management, model usability for the field personnel, performance expectations, data provisioning for field use, and field user training. The use of web services could alleviate some of the implications for model users but ultimately shifts the responsibility and workload to the service hosting environment. For widespread frequent use, service delivery organizations need models to compute results quickly with limited setup and reduced data entry, taking advantage of existing organization-wide data resources. To bridge the existing gap, this contribution aims to reduce the complexity of applying research models by streamlining data and parameter setup, reducing runtime, and improving model infrastructure efficiency. The solution involves a machine learning (ML)-based surrogate modeling approach that aims to capture the intrinsic knowledge of a conceptual/physical model in an ensemble system of artificial neural networks (ANNs). The approach developed streamlines the transition from research to field by enabling a modeling framework to interact with ML libraries to derive model surrogates for any modeling solution.
The work extended the Cloud Services Integration Platform (CSIP) / Object Modeling System (OMS) framework and infrastructure to harvest data and derive the surrogate model at the modeling framework level. The effort applied NeuroEvolution of Augmenting Topologies (NEAT) techniques in an ensemble application, combined with ANN uncertainty analysis. Testing the solution involved developing ANN surrogates for a sheet and rill erosion model and a daily streamflow model and evaluating their suitability for planning/consulting purposes. Results will be presented.

Addressing challenges for mapping irrigated fields in subhumid temperate U.S. systems by integrating remote sensing and hydroclimatic data
Speaker: Tianfang Xu (Utah State University)
Co-Authors: Jillian M. Deines (Stanford University); Anthony D. Kendall (Michigan State University); David W. Hyndman (Michigan State University)

High-resolution mapping of irrigated fields is important to better estimate water and nutrient fluxes in the landscape, food production, and local to regional climate; however, this remains a challenge in humid to subhumid regions. In these regions, irrigation has been expanding into what was traditionally largely rainfed agriculture, driven by trends in climate, prices, technology, and practice. Remote sensing of irrigated areas can be difficult in these regions because rainfed areas have similar characteristics. We present methods to address this challenge and enhance the contrast between neighboring rainfed and irrigated areas, including weather-sensitive scene selection, applying recently developed composite indices, and calculating spatial anomalies. The methods are demonstrated in southwestern Michigan, where groundwater is the main source of irrigation water for row crops (primarily corn and soybeans). Integrating remote sensing imagery and various hydrometeorological data products in Google Earth Engine, a cloud-based geospatial analysis platform, we create annual, 30 m resolution maps of irrigated corn and soybeans from 2001 to 2016 using a machine learning method (random forest). The irrigation maps reasonably capture the spatio-temporal pattern of irrigation, with accuracies that exceed those of available products.

Investigating climate change impacts on water quality using climate stress testing with data-based analysis (machine learning)
Speaker: Khanh Thi Nhu Nguyen (University of Massachusetts Amherst)
Co-Authors: Umit Taner (Deltares); Hassaan Khan (Stanford University); Casey Brown (University of Massachusetts Amherst)

Many public water utilities with surface water supplies are faced with managing the uncertain effects of future climate change on water quality. Water quality constituents, such as turbidity and total organic carbon, are potentially affected by the volume, intensity, and timing of precipitation. Temperature may also play a role in reservoir water quality. In this analysis, we attempt to derive empirical relationships between climate variables and water quality that can be used to conduct a climate stress test of the system performance. Previous studies have used data-driven methods to link turbidity and total organic carbon (TOC) to hydro-climatic variables. However, they used only one method and one metric to assess the prediction ability of their model. In this study, we use a variety of methods (machine learning, parametric, and non-parametric models) together with a combination of three metrics to assess the models. This helps explain the true underlying drivers of turbidity and TOC generation in the water system. In particular, we simulate turbidity and TOC using two machine learning models: k-nearest neighbors (KNN) and an artificial neural network (ANN). We also test local polynomial regression (LOESS), a non-parametric model, and the generalized additive model (GAM), a parametric method. The predictors are 31 combinations of precipitation, air temperature, water temperature, antecedent dry days, and season. To avoid overfitting, we apply a 250-fold cross validation for turbidity and a leave-one-out cross validation for TOC. We select the best model by a new metric, called the performance factor, which is the average of the correlation factor, the coefficient of determination (R-squared), and the Nash-Sutcliffe model efficiency coefficient. We find that KNN is the best model for predicting turbidity, whereas TOC is predicted best by LOESS. The models are then used within a climate stress test framework, driven by a climate/weather generator, to evaluate turbidity and TOC responses under climate change. The results are used to identify the specific climate changes that are problematic for water management.
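The performance factor described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, the "correlation factor" is assumed to be the Pearson correlation coefficient, and R-squared is assumed to be its square, since the abstract does not state how each term was computed.

```python
import numpy as np

def performance_factor(observed, simulated):
    """Average of correlation, R-squared, and Nash-Sutcliffe efficiency.

    Assumptions (not stated in the abstract): the correlation term is the
    Pearson coefficient and R-squared is its square.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    # Pearson correlation between observations and simulations
    r = np.corrcoef(obs, sim)[0, 1]
    # Coefficient of determination, taken here as r squared
    r2 = r ** 2
    # Nash-Sutcliffe efficiency: 1 minus error variance over observed variance
    nse = 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)
    return (r + r2 + nse) / 3.0
```

A perfect simulation gives a value of 1 for all three terms, so candidate models can be ranked by how close their performance factor is to 1.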

Validating the historical simulation and the forecast skill for the streamflow prediction tool. Case study: Nepal, Bangladesh and Colombia
Speaker: Jorge-Luis Sanchez (Brigham Young University)
Co-Author: Jim Nelson (Brigham Young University)

Streamflow Prediction Tool (SFPT) is a web application that visualizes the results of a high-resolution, global-scale hydrological forecast based on the ensemble forecasts and global historical runoff generated by the European Centre for Medium-Range Weather Forecasts (ECMWF) using the Routing Application for Parallel computatIon of Discharge (RAPID). We used observed data in Nepal, Bangladesh, Dominican Republic and Colombia combined with the Hydrostats Python package to validate the 35-year (1980-2014) historical simulation carried out by SFPT. Additionally, using the Continuous Ranked Probability Score (CRPS), we defined the forecast skill of SFPT at different forecast days in Nepal, Bangladesh, Dominican Republic and Colombia. Some problems encountered in the project were that station locations are not exact, streams in SFPT are not always displayed accurately, and simulated basin areas are not the same as the hydrometric stations’ basin areas. We found that as the area of the basin gets bigger, the historic simulation becomes more accurate. Similarly, as the forecast days advance, the forecast becomes less accurate as measured by the CRPS. On the other hand, the skill score did not present any tendency to improve or get worse as the forecast days advance.
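For readers unfamiliar with the CRPS, a minimal sketch of the standard empirical estimator for an ensemble forecast is shown below. It is not taken from the Hydrostats package or from SFPT; the function names are hypothetical, and the skill score shown is one common definition (relative to a reference forecast) that the abstract does not confirm.

```python
import numpy as np

def crps_ensemble(members, observation):
    """Empirical CRPS: E|X - y| - 0.5 * E|X - X'| over ensemble members X."""
    x = np.asarray(members, dtype=float)
    # Mean absolute error of the members against the observation
    error_term = np.mean(np.abs(x - observation))
    # Mean absolute difference between all pairs of members (ensemble spread)
    spread_term = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return error_term - spread_term

def crps_skill_score(crps_forecast, crps_reference):
    """Assumed definition: 1 is perfect, 0 matches the reference forecast."""
    return 1.0 - crps_forecast / crps_reference
```

A lower CRPS indicates a better forecast, and for a single-member ensemble the score reduces to the absolute error.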

Advancing cyberinfrastructure for sharing, managing and publishing geoscience data and models in support of transparent, trustworthy and reproducible research
Convener: Daniel Ames (Brigham Young University)

There is increasing emphasis on open data that is findable, accessible, interoperable and reusable (FAIR). Journals are also increasingly requiring data in support of the papers they publish to be openly accessible. A number of communities and organizations have established data repositories to address this need. This session welcomes contributions from researchers, repositories, and journal editors and publishers with innovative ideas and solutions for geoscience data and model management. Contributions that span technical, institutional, social and policy considerations are welcomed.

Interoperable watersheds network - Open source data publishing
Speaker: Dwane Young (U.S. Environmental Protection Agency)

The Interoperable Watersheds Network (IWN), created in partnership between U.S. EPA, USGS, NJ DEP, the NJ Meadowlands Commission, Clermont County (Ohio), CUAHSI, and numerous other partners, is a national data sharing platform that seamlessly links continuously monitored sensor data from multiple agencies into one searchable location. The IWN allows water quality managers to better evaluate the health of local water resources by providing them with near real-time access to watershed-level monitoring data. The IWN was a demonstration project to test open data standards and their utility for sharing continuous data. The IWN uses Open Geospatial Consortium (OGC) Sensor Observation Service (SOS) 2.0 and WaterML2 standards as the foundation for a distributed sensor data sharing network. Data owners have published their continuous sensor data and related metadata through “data appliances” (running open-source and off-the-shelf commercial products). Metadata are harvested into a centralized catalog that provides a REST Service Application Program Interface (API), where users can discover data by querying for specific parameters or using spatial boundaries. The demonstration project was successful, enabling access to over 15,000 sensors nationwide from 8 data providers, including state, local, and federal agencies. EPA has published a lessons learned document outlining the results of this project and an additional recommendations document around data standards for continuous monitoring data. Through continuing partnerships, EPA has also released a “data appliance”, available through Amazon Web Services, which allows any partner to publish data to the IWN. Through open source software and system functionality based around web services and data standards, the IWN has created a pathway for data providers to easily set up a connection to publish their continuously monitored data to the web.

HydroGlobe - a platform enabling the integration of earth observations in hydrologic models
Speaker: Venkatesh Merwade (Purdue University)
Co-Authors: Adnan Rajib (U.S. Environmental Protection Agency); I Luk Kim (Purdue University); Jaewoo Shin (Purdue University); Jack Smith (Marshall University); Lan Zhao (Purdue University); Carol Song (Purdue University)

Most hydrologic models are evaluated by their ability to match streamflow observations that are typically available at the watershed outlet and a few other discrete locations. Despite the importance of other hydrologic fluxes such as soil moisture and evapotranspiration, hydrologic models are not calibrated against these fluxes due to the unavailability of observed data. Earth observations obtained from satellite and remote sensors provide a unique opportunity to fill this data void in hydrologic modeling. However, unmanageable data storage requirements, disparate formats, and differing spatial resolutions among sources of earth observations hinder their use in hydrologic models. To address this issue, an open access, online platform named HydroGlobe has been developed that can deliver ready-to-use data from different earth observation sources for use in hydrologic models. HydroGlobe can provide spatially-averaged time series of earth observations by using the following inputs: (i) data source, (ii) temporal extent in the form of start/end date, and (iii) geographic units (e.g., grid cell or sub-basin boundary) and extent in the form of a GIS shapefile. This presentation will demonstrate the application of HydroGlobe and its interoperability with an online model sharing platform, SWATShare, for calibration of Soil and Water Assessment Tool (SWAT) models using earth observations.
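The spatial averaging step described above can be illustrated with a small sketch. This is not HydroGlobe code: the function name is hypothetical, and the geographic unit is assumed to have already been rasterized from the shapefile into a boolean mask on the same grid as the earth observation product.

```python
import numpy as np

def spatial_average_series(raster_stack, unit_mask):
    """Average each time step of a (time, rows, cols) stack over one unit.

    unit_mask is a (rows, cols) boolean array marking cells inside the
    geographic unit (assumed to be rasterized beforehand). Missing cells
    (NaN) are ignored in the average.
    """
    stack = np.asarray(raster_stack, dtype=float)
    mask = np.asarray(unit_mask, dtype=bool)
    # Boolean indexing keeps the time axis and flattens the masked cells
    cells_in_unit = stack[:, mask]
    return np.nanmean(cells_in_unit, axis=1)
```

The result is one value per time step, which is the kind of series a sub-basin-scale model such as SWAT could be compared against during calibration.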

HydroShare: An overview of new functionality developed in support of collaborative reproducible research
Speaker: David Tarboton (Utah State University)
Co-Authors: Ray Idaszak (RENCI, University of North Carolina at Chapel Hill); Jeffery S. Horsburgh (Utah State University); Daniel P. Ames (Brigham Young University); Jonathan L. Goodall (University of Virginia); Alva Couch (Tufts University); Pabitra Dash (Utah State University); Hong Yi (RENCI, University of North Carolina at Chapel Hill); Christina Bandaragoda (University of Washington); Anthony Castronova (CUAHSI); Martyn Clark (University of Saskatchewan); Richard Hooper (Tufts University); Shaowen Wang (University of Illinois); Maurier Ramirez (Utah State University); Jeffrey Sadler (University of Virginia); Mohamed Morsy (University of Virginia); Scott Black (Utah State University); Dandong Yin (University of Illinois); Tanu Malik (DePaul University); Liza Brazil (CUAHSI)

HydroShare is a domain-specific data and model repository operated by the Consortium of Universities for the Advancement of Hydrologic Science Inc. (CUAHSI) to advance hydrologic science by enabling researchers to more easily share the data, model, and workflow products that result from their research and that are used to create, and support the reproducibility of, the results reported in scientific publications. HydroShare comprises two sets of functionality: (1) a repository for users to share and publish data and models, collectively referred to as resources, in a variety of formats, and (2) web application tools that can act on content in HydroShare for computational and visual analysis. Together these serve as a platform for collaboration and computation that integrates data storage, organization, discovery, and analysis and that allows researchers to employ services beyond their desktops to make data storage and manipulation more reliable and scalable, while improving their ability to collaborate and reproduce results. This presentation will describe ongoing enhancements to HydroShare and some of the challenges being faced in its design and ongoing development. Content storage is being consolidated into a single primary resource type that may hold multiple content aggregation types. This better supports storage of the diverse data involved in hydrologic data and model studies in a single shareable unit. Reproducible and easy-to-use computational functionality is being advanced using JupyterHub as a gateway to XSEDE and other high performance compute resources. This presentation will describe the progress made and challenges being addressed in managing the storage and use of HydroShare resources from JupyterHub, and in using containers to enable simple and scalable access to these resources.

User-defined metadata schemas for hydrological models
Speaker: Jeffrey M. Sadler (University of Virginia)
Co-Authors: Jonathan L. Goodall (University of Virginia); Mohamed M. Morsy (Dewberry); Jeffery S. Horsburgh (Utah State University); David G. Tarboton (Utah State University)

Water resources engineers and hydrologic scientists rely on a wide range of hydrologic models to address specific challenges. The input file(s) (i.e., the model instance) for one piece of hydrologic modeling software (i.e., the model program) can be substantially different compared to another. Therefore, the metadata schemas used to describe model instances of any two model programs may likewise vary substantially. Recently, the support for storing and describing hydrologic model programs and instances has increased with the development of web-based model repositories (e.g., HydroShare, Community Surface Dynamics Modeling System (CSDMS)). However, the support of custom metadata schemas for describing the model instances executed by an arbitrary model program is still lacking. For example, in HydroShare, providing support for an instance of a specific model program (e.g., MODFLOW) has been done by the HydroShare developers. This is a bottleneck because a few developers cannot provide support for the dozens of model programs used by the hydrologic science community. The aim of this research, therefore, is to enable a shift in that responsibility - from the repository and tool developers to the modeling community. The main contribution for achieving this goal is the development of a machine-readable file specification used to describe the metadata schema for the model instances of a model program. This file specification gives those who are most knowledgeable and invested in describing the model instances (i.e., the model users and/or developers) the ability to specify the metadata fields used to describe model instances of a model program in a standardized, machine-readable way. The metadata schema can then be leveraged in an automated way by model and data repositories to provide model instance contributors with descriptive metadata fields specific to their model program. 
In this way, the modeling community, instead of the web-based repository developers, can provide and maintain the metadata schemas. Furthermore, with this approach, support can be provided for a theoretically indefinite variety of hydrologic models with little cost to the repository developers. This presentation will describe the metadata schema specification file format, demonstrate its use for the MODFLOW model as an example, and show how the web-based repository, HydroShare, can support and leverage this new approach for community-contributed metadata schemas for hydrologic models.
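To make the idea of a machine-readable metadata schema concrete, the sketch below shows how a repository might check a contributor's metadata against a community-supplied schema. It does not reproduce the file specification from this work: the schema layout, the field names, and the function name are all hypothetical and chosen only for illustration.

```python
# Hypothetical schema for one model program; field names are illustrative
# only and are not taken from the actual specification or from MODFLOW.
EXAMPLE_SCHEMA = {
    "model_program": "MODFLOW",
    "fields": {
        "grid_type": {"type": "string", "required": True},
        "stress_period_count": {"type": "integer", "required": True},
        "notes": {"type": "string", "required": False},
    },
}

# Map schema type names to Python types (an assumption of this sketch)
TYPE_MAP = {"string": str, "integer": int, "number": float}

def validate_instance_metadata(metadata, schema):
    """Return a list of problems found in a model instance's metadata."""
    errors = []
    for name, rule in schema["fields"].items():
        if name not in metadata:
            if rule["required"]:
                errors.append(f"missing required field: {name}")
            continue
        if not isinstance(metadata[name], TYPE_MAP[rule["type"]]):
            errors.append(f"wrong type for field: {name}")
    # Flag fields the schema does not define
    for name in metadata:
        if name not in schema["fields"]:
            errors.append(f"unknown field: {name}")
    return errors
```

Because the schema is plain data, it could be stored as a JSON or similar file and maintained by the modeling community, with the repository only needing a generic validator like the one above.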

Simplifying GIS and HIS web service deployment using HydroShare
Speaker: Ken Lippold (Brigham Young University)
Co-Author: Daniel P. Ames (Brigham Young University)

HydroShare is a system being developed by the Consortium of Universities for the Advancement of Hydrologic Science Inc. (CUAHSI) with the goal of facilitating the dissemination, visualization, and publishing of hydrologic data and models. HydroShare is a powerful tool for finding and sharing many types of hydrologic data, including geospatial and time series data, but there is currently no framework in place for exposing this data via web-based data services. As a result, HydroShare’s ability to interact with other systems that use this data is greatly limited. This presentation will describe a framework for providing web-based data services through HydroShare using both the Open Geospatial Consortium’s Geographic Information System (GIS) web services standards and CUAHSI’s Hydrologic Information System (HIS). A prototype web services system has been built using Django and GeoServer to implement this framework and help HydroShare provide more robust and interactive GIS and HIS data services.

Advances in hydroinformatics and virtual reality
Convener: Ivo Arrey (University of Venda)

This session will look at developing trends in which hydrologic models are increasingly combined with virtual reality applications and internet technology to improve the modeler's interactive experience.

Interactive and real-time flood inundation mapping on client-side web systems
Speaker: Ibrahim Demir (University of Iowa)
Co-Author: Anson Hu (University of Iowa)

Flood inundation maps are critical for disaster preparedness, response and recovery efforts. Accurate and detailed inundation map generation requires time and resources that may not be available to many small communities. Changes in terrain and flood protection infrastructure require updates to the inundation maps. We have created a real-time flood inundation map system on the web using the Height Above Nearest Drainage (HAND) method. The framework doesn’t require any server-side GIS or database processing. The framework allows users to select an area on the elevation map and generates the corresponding inundation map on the client side. The platform allows users to modify the elevation map, add or remove levees or reservoirs, and change modeling parameters for the inundation map generation. The platform generates inundation maps comparable to existing FEMA-approved maps.
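The core of HAND-based mapping is simple enough to sketch. The example below assumes a HAND raster has already been derived from the elevation map (each cell holding its height above the nearest drainage cell) and applies a single water stage to it. It is written in Python for illustration only; the system described here runs client-side in the browser, and the function name is hypothetical.

```python
import numpy as np

def hand_inundation_depth(hand, stage):
    """Return water depth per cell for a given stage above the drainage.

    hand:  2-D array of height above nearest drainage, in meters
    stage: water level above the drainage, in meters
    A cell is flooded when its HAND value does not exceed the stage;
    its depth is the difference. Dry and no-data (NaN) cells get depth 0.
    """
    hand = np.asarray(hand, dtype=float)
    return np.where(hand <= stage, stage - hand, 0.0)
```

Because this is a per-cell comparison with no routing, it can be recomputed quickly whenever a user edits the terrain or changes the stage, which is consistent with the interactive use described above.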

4D digital watershed: Advanced bedrock-to-canopy characterization for watershed functions
Speaker: Haruko Wainwright (Lawrence Berkeley National Laboratory)
Co-Authors: Nicola Falco (Lawrence Berkeley National Laboratory); Baptiste Dafflon (Lawrence Berkeley National Laboratory); Sebastian Uhlemann (Lawrence Berkeley National Laboratory); Ken Williams (Lawrence Berkeley National Laboratory); Susan Hubbard (Lawrence Berkeley National Laboratory)

Predictive understanding of watershed function and dynamics is often hindered by the heterogeneous and multiscale fabric of watersheds. In particular, ecohydrology and biogeochemical cycling involve complex hydrological-biogeochemical interactions occurring from bedrock to canopy, including geology, plants, microorganisms, organic matter, minerals, dissolved constituents, and migrating fluids. Recently, there have been significant advances in remote sensing to capture spatiotemporal “patterns” of plants, topography and subsurface. However, it remains a significant challenge to connect these patterns to watershed processes and functioning such as carbon and nutrient exports, water resources and quality. In this study, we develop a novel watershed-characterization methodology to quantify complex watershed systems across scales, using advanced sensing, inversion, and machine learning approaches. By explicitly bridging information derived from “on the ground” observations and remote sensing data, we catalyze the development of the fundamental scientific linkages among interacting processes in the watershed. We integrate multiscale, multitype datasets of surface geophysics (e.g., electrical, seismic), airborne electromagnetic survey, airborne LiDAR, airborne snow survey and satellite/UAV images collected over the East River Watershed (near Crested Butte, CO, USA). Specifically, we have (1) identified the co-variability among geology, geomorphology, and vegetation (i.e., plant functional types and their dynamics) and the key drivers that regulate the sensitivity to drought, and (2) developed the watershed functioning zonation concept, in which we perform hierarchical cluster analyses to categorize hillslopes (functioning units within the watershed) into several representative zones that have distinct characteristics of those co-varied properties, water quality and nitrogen exports.

A map algebra approach to analyzing time series of rasters
Speaker: Xingong Li (University of Kansas)
Co-Authors: David Tarboton (Utah State University); Mike Hodgson (University of South Carolina); Shaowen Wang (University of Illinois at Urbana-Champaign)

Time series of rasters (TSORs) are important and widely available spatiotemporal data for monitoring, modeling and predicting hydrological systems. Processing and analyzing TSORs is cumbersome, if not impossible, with traditional GIS data analysis frameworks and tools, which were developed, conceptually, for handling individual snapshot grids and, computationally, for limited geographical extents and resolutions. We propose a spatiotemporal map algebra framework based on a new understanding of the nature of map algebra and the categorization of map algebra operations. We think map algebra provides a computational instrument that performs geographical convolutions using locally defined neighborhoods, which can reveal global emergent patterns and forms. Under this perspective, the original zonal operations are considered focal operations in which the neighborhoods are defined in a zone raster. We will further extend this new map algebra framework for analyzing TSORs, where both the iterations and neighborhoods can be defined in space and time. We will also discuss the key challenges, strategies, and possible solutions for implementing the proposed framework and operations on a parallelized cyberinfrastructure for analyzing very large TSORs.
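As a small illustration of a focal operation whose neighborhood extends in both space and time, the sketch below computes a focal mean over a box-shaped space-time window. It is not the proposed framework: the function name and the box neighborhood are assumptions, it requires NumPy 1.20 or later for `sliding_window_view`, and it runs in memory on a single machine rather than on parallel cyberinfrastructure.

```python
import numpy as np

def focal_mean_spacetime(tsor, half_t=1, half_y=1, half_x=1):
    """Focal mean over a (2*half_t+1, 2*half_y+1, 2*half_x+1) neighborhood.

    tsor is a (time, rows, cols) array. Edges are padded with NaN so that
    border cells average only over the neighbors that actually exist.
    """
    data = np.asarray(tsor, dtype=float)
    padded = np.pad(
        data,
        ((half_t, half_t), (half_y, half_y), (half_x, half_x)),
        constant_values=np.nan,
    )
    window = (2 * half_t + 1, 2 * half_y + 1, 2 * half_x + 1)
    # One window per output cell; the last three axes hold the neighborhood
    views = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.nanmean(views, axis=(-3, -2, -1))
```

Setting `half_t=0` recovers a conventional spatial focal mean applied to each snapshot independently, which shows how the space-time version generalizes the traditional operation.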

Applied Hydroinformatics
Convener: Kyle Onda (Duke University)

This session will highlight methods and resources for sharing and analyzing data that advance environmental monitoring and decision-making grounded in data. Presentations will describe successful applications of hydroinformatics for citizen science, international water challenges, prediction of storm events, and more.

Cyberinfrastructure for intelligent water supply: measuring water use, conservation, and socio-demographic differences using an inexpensive, high frequency metering system
Speaker: Jeffery S. Horsburgh (Utah State University)
Co-Authors: Joseph Brewer (Utah State University); Paul Consalvo (Utah State University); Nicole Vause (Utah State University); Travis Whitfield (Utah State University); Amy Carmellini (Utah State University); Daniel Henshaw (Utah State University); Josh Tracy (Utah State University)

We present an inexpensive, open source, water metering system for measuring water use quantity and behavior at high temporal frequency. We have demonstrated this technology in multiple water metering case studies, including observing water use within two high-traffic, public restrooms at Utah State University (USU) before and after installing high efficiency, automatic faucets and toilet flush valves. For this case study, we also integrated an inexpensive sensor to count user traffic. Sensing restroom visits and water use events allowed us to identify fixture malfunctions, average water use per person, variability in use by fixtures (faucets versus urinals and toilets), variability in use by fixtures compared to manufacturer specifications, gender differences in use, and the difference in use after retrofit of the restrooms with high efficiency fixtures. Additional case study applications to which we have applied this system include investigating differences in water use of residential populations on USU’s campus with varying sociodemographics, investigating the effectiveness of dual flush toilets, and observing water use in residential homes. In this presentation, we describe both the inexpensive hardware we have used for collecting data along with results for each of our case study applications. Inexpensive metering systems like the one we have demonstrated can help institutions remotely measure and record water use trends and behavior, identify leaks and fixture malfunctions, and schedule fixture maintenance or upgrades, all of which can ultimately help them meet goals for sustainable water use.

A citizen science approach to streamflow and temperature forecasting
Speaker: Pedro Mauricio Avellaneda-Lopez (Indiana University)
Co-Authors: Darren L. Ficklin (Indiana University Bloomington); Christopher Lowry (University at Buffalo); Jason H. Knouft (Saint Louis University); Damon Hall (University of Missouri)

Participation of the general public (citizen science) and do-it-yourself sensor systems have transformed hydrologic research. Such transformation touches conventional monitoring and modeling approaches in ungauged basins, where the application of complex hydrological models is limited. Do-it-yourself sensor systems and crowd-sourced data have the potential to bridge that data gap; however, only a few studies have exploited this capability. This study explores the potential for real-time crowd-sourced data to improve complex computational hydrologic models. We selected the Boyne River basin in northern Michigan as a case study to demonstrate crowd-sourced data assimilation in distributed hydrological models. We utilized CrowdHydrology, a citizen science network that collects hydrologic data throughout the United States, to obtain local stream stage and stream temperature measurements. CrowdHydrology provides an infrastructure for citizen scientists to voluntarily send a text message with the current water stage height and stream temperature to a server located at the University at Buffalo. On a weekly basis, our approach retrieves CrowdHydrology observations along with weather data from a nearby weather station. Stream stage citizen science observations are processed to obtain streamflow discharge based on field-derived stage-discharge relationships at four locations along the river. This database of observations was used as input to a Soil and Water Assessment Tool (SWAT) model of the Boyne River basin. Within this framework, the hydrologic model is re-calibrated on a bi-weekly schedule using the Ensemble Kalman Filter. Each newly calibrated model provides a more accurate estimate of a 7-day forecast of streamflow and stream temperature throughout the river basin. This novel approach can potentially benefit small communities by providing information on local water resources derived from complex hydrological models.
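The stage-to-discharge step described above can be sketched as follows. This is a minimal illustration that assumes a power-law rating curve, Q = a(h - h0)^b, which is a common form but is not confirmed by the abstract. The function name and all coefficients are hypothetical.

```python
import numpy as np

def stage_to_discharge(stage_m, a, b, h0):
    """Convert stage (m) to discharge (m^3/s) with a power-law rating curve.

    Assumed form (not from the abstract): Q = a * (h - h0)**b,
    where h0 is the stage at which flow is zero.
    Stages at or below h0 are mapped to zero discharge.
    """
    stage = np.asarray(stage_m, dtype=float)
    depth = np.clip(stage - h0, 0.0, None)
    return a * depth ** b

# Hypothetical coefficients for one gauge; real values would come from
# field-derived stage-discharge measurements at that location.
readings = [0.10, 0.35, 0.60]
flows = stage_to_discharge(readings, a=4.0, b=2.0, h0=0.10)
print(flows)
```

In practice each of the four locations would carry its own fitted coefficients, and readings outside the surveyed stage range would need to be flagged rather than extrapolated.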

Characterization of water resources using an online groundwater level mapping tool
Speaker: Norm Jones (Brigham Young University)
Co-Authors: Steve Evans (Brigham Young University); Gus Williams (Brigham Young University)

Groundwater is one of the most challenging water resources to characterize, quantify, and monitor on a regional basis. To overcome these challenges, we have developed the Groundwater Level Mapping Tool, an open-source, Python-based web application to enable visualization and quantification of groundwater resources throughout a region. This application includes tools to extrapolate and interpolate time series observations of groundwater levels in monitoring wells through multi-linear regression, using correlated data from other wells and from satellite observations. The app also performs spatial interpolation using GSLIB Kriging code. Combining the results of spatial and temporal interpolation, the app enables the user to calculate changes in aquifer storage, and to produce and view maps and animations of groundwater levels over time. This presentation will detail the development of these analysis tools and will demonstrate their application in aquifers located throughout southern Utah, Texas, Colombia, and the Dominican Republic.
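The multi-linear regression step described above can be illustrated with a short sketch. It is not the tool's actual code: the function name, the choice of predictors, and the sample values are all hypothetical, and the fit is ordinary least squares via NumPy.

```python
import numpy as np

def fill_gaps_with_regression(target, predictors):
    """Fill NaN gaps in `target` using a least-squares fit on `predictors`.

    target:     1-D array of water levels, NaN where observations are missing
    predictors: 2-D array (n_times, n_predictors) with no missing values,
                e.g. a correlated well and a satellite-derived series
    """
    target = np.asarray(target, dtype=float)
    X = np.column_stack([np.ones(len(target)),
                         np.asarray(predictors, dtype=float)])
    observed = ~np.isnan(target)
    # Fit intercept and slopes only on time steps where the target was measured.
    coef, *_ = np.linalg.lstsq(X[observed], target[observed], rcond=None)
    filled = target.copy()
    filled[~observed] = X[~observed] @ coef
    return filled

# Hypothetical data: the target well is missing its third observation.
other_well = [10.0, 11.0, 12.0, 13.0, 14.0]
satellite = [1.0, 0.0, 1.0, 0.0, 1.0]
target_well = [8.0, 7.5, np.nan, 8.5, 10.0]
filled = fill_gaps_with_regression(target_well,
                                   np.column_stack([other_well, satellite]))
print(filled)
```

The same fitted coefficients can extend a record beyond its observed period, provided the predictor series cover that period.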

Using random forest models to predict streamflow metrics in ungauged watersheds and to estimate hydrologic alteration in urban streams
Speaker: Charles Stillwell (U.S. Geological Survey)
Co-Authors: Natalie Nelson (North Carolina State University); Bill Hunt (North Carolina State University)

Despite the prevalence of U.S. Geological Survey streamgages, most streams across the United States lack flow data. Watershed managers and planners need practical and interpretable hydrologic predictions throughout the entire stream network, especially in watersheds experiencing urban growth and hydrologic alteration. Although reference watersheds can be used to estimate hydrologic responses in unmonitored streams, the complex relationships between watershed attributes (slope, soil characteristics, climate drivers, land use, etc.) and hydrologic responses often limit predictive capabilities. Machine learning approaches, such as random forest models, can identify non-linear relationships between predictor and response variables, which can then be used for predictive and explanatory purposes. Random forest regression models were developed for North and South Carolina to achieve two objectives: (1) predict annual flow metrics (total streamflow volume, baseflow index, and the frequency of stormflow pulses) in unmonitored streams, and (2) conduct explanatory data analyses to determine relationships between watershed attributes and hydrologic responses. The model demonstrated high Nash-Sutcliffe Efficiency (NSE) values for all three hydrologic responses: annual streamflow volume (NSE = 0.878), annual baseflow index (NSE = 0.884), and number of stormflow pulses per year (NSE = 0.779). To test the effect of urban land use on hydrologic response, a scenario analysis was conducted by replacing urban land uses with low-density lands and measuring the resultant differences in predicted responses. We discuss successes and shortcomings of the scenario analysis and potential applications of random forest regression models for hydrologic prediction. This research demonstrated that random forest models (1) can inform a wide array of watershed management and planning objectives and (2) may substantially improve streamflow predictions in ungauged watersheds.
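For readers unfamiliar with the Nash-Sutcliffe Efficiency reported above, the following sketch computes it from its standard definition: one minus the ratio of the squared prediction error to the variance of the observations about their mean. The sample values are hypothetical and unrelated to the study's data.

```python
import numpy as np

def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect match; 0.0 means the model is no better than
    predicting the mean of the observations.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Hypothetical observed and predicted values of an annual flow metric.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [1.1, 1.9, 3.2, 3.8, 5.0]
nse = nash_sutcliffe_efficiency(obs, sim)
print(round(nse, 3))
```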

Hydrological Event Detection & Analysis (HEDA) tool for streamflow water quality time series
Speaker: Scott Hamshaw (University of Vermont)
Co-Author: Ali Javed (University of Vermont)

When analyzing time series of streamflow and associated water quality sensor data such as turbidity, researchers and managers often are interested in isolating storm events since that is when behavior is dynamic and physical processes can be inferred. In this presentation, we will present preliminary development of a web-based data analysis tool that can be used for hydrological event detection and analysis (HEDA). Currently, few options exist for detecting, delineating, and analyzing hydrological events that don’t require a programming environment such as R or MATLAB. We will present a first look at a web-based tool capable of interfacing with time series stored on the CUAHSI HydroServer and USGS NWIS databases and then performing subsequent analysis. We will demonstrate the event-based analysis that can be obtained from the HEDA tool as well as encourage feedback from potential users of the tool.

Regional flood forecasting applications for the Dominican Republic
Speaker: Jason Biesinger (Brigham Young University)
Co-Author: Jim Nelson (Brigham Young University)

Flooding can be a disastrous event where lives are at stake and cleanup is extremely costly. This is especially true for those areas of the world that are affected by tropical storms and hurricanes. To help mitigate flooding, we have developed a dashboard, called the Hydroviewer, that displays predicted stream runoff using the Streamflow Prediction Tool and displays Flash Flood Guidance, among other capabilities. The Hydroviewer app is customizable to different regions, making the Streamflow Prediction Tool accessible locally. We will be demonstrating the use of the Hydroviewer Hispaniola, which is customized for the Dominican Republic. For the Dominican Republic we have also been incorporating other global forecasting models, such as the GFS and WRF models, and displaying their results in the dashboard.


Continental scale community hydrologic modeling cyberinfrastructure, knowledge representations and data management
Convener: David Tarboton (Utah State University)

A number of continental scale hydrologic models have emerged over the last few years, including the US based NOAA National Water Model, Continental Parflow, USGS National Hydrologic Model, and others, with even more under development. These are used for operational forecasting, water resources studies, and research. The increasing complexity of these models poses a number of cyberinfrastructure and data management challenges. Also, as these models advance to incorporate representations of different processes, they increasingly become holders of knowledge. They represent the encoding of research results in a platform that can be used for decision making and as a starting point for other research. This session welcomes contributions from model developers, model users, researchers, and decision makers who are in any way involved with the cyberinfrastructure aspects of these models and the management of the data they use and provide.

Session I
Optimal access to NASA water cycle data
Speaker: Richard Strub (NASA Goddard Earth Sciences Data and Information Services Center)
Co-Authors: Bill Teng (NASA Goddard Earth Sciences Data and Information Services Center), Hualan Rui (NASA Goddard Earth Sciences Data and Information Services Center), Carlee Loeser (NASA Goddard Earth Sciences Data and Information Services Center), Jim Acker (NASA Goddard Earth Sciences Data and Information Services Center), Mahabaleshwa Hegde (NASA Goddard Earth Sciences Data and Information Services Center) and Bruce Vollmer (NASA Goddard Earth Sciences Data and Information Services Center)

A “Digital Divide” in data representation exists between the preferred way of data access by the hydrology community (i.e., as time series of discrete spatial objects) and the common way of data archival by NASA earth science data centers (i.e., as continuous spatial fields, one file per time step). This Divide has been an obstacle between hydrology data users (e.g., CUAHSI HIS, HydroShare) and the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC). The GES DISC (one of 12 NASA Earth Observing System (EOS) data centers) processes, archives, documents, and distributes data from Earth science missions and related projects, including hydrologic land surface data. The latter are part of the GES DISC Water & Energy Cycle data holdings. Of the many related data services available to users, the NASA Giovanni (Geospatial Interactive Online Visualization and Analysis Infrastructure) is the best known and most used (cited in more than 2000 peer-reviewed research publications). Giovanni provides a relatively simple way for researchers to conduct exploratory investigations with a variety of NASA Earth observation data and related data sets. Among Giovanni’s suite of plotting options, the time series is probably of most interest to hydrology data users (and is the second most popular among users in general). However, for optimal access to GES DISC data along the time dimension, the data as archived must be reorganized in a way that is optimal for that mode of access. Given the importance of bridging the Digital Divide, the GES DISC has (1) developed “Data Rods,” a set of REST endpoints for long time series; (2) improved the performance of Giovanni’s time series plotting option; and (3) assisted the University of Texas-Austin in developing and supporting the Data Rods Explorer (DRE), a HydroShare app that combines data from the first two sources.
As part of NASA data centers’ overall transitioning to the cloud, the GES DISC has been investigating “Giovanni in the cloud.” Though still under development, the prototype Giovanni Cloud-Optimized Data Store (CODS) has already demonstrated a significant performance increase in time series capabilities—5-10 times faster than the current Data Rods endpoints. The GES DISC aims to continually explore and implement appropriate technologies to improve its data services, in response to user needs of the hydrology community.
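The difference between the two data layouts described above can be shown with a small sketch. It is an illustration of the general idea only, not of how Data Rods or the Cloud-Optimized Data Store are implemented. The grid sizes, values, and function names are hypothetical.

```python
import numpy as np

# Hypothetical archive layout: one 2-D grid (lat x lon) per time step,
# standing in for "one file per time step".
n_times, n_lat, n_lon = 6, 3, 4
grids = [np.full((n_lat, n_lon), float(t)) + np.arange(n_lon)
         for t in range(n_times)]

def series_from_grids(grids, i, j):
    """Archive-style access: a point time series touches every time step."""
    return np.array([g[i, j] for g in grids])

# Time-series-oriented layout: stack once, then move time to the last axis
# so that each cell's full series is contiguous in memory.
cube = np.ascontiguousarray(np.moveaxis(np.stack(grids), 0, -1))

def series_from_cube(cube, i, j):
    """Reorganized access: one contiguous read per point."""
    return cube[i, j, :]

print(series_from_grids(grids, 1, 2))
print(series_from_cube(cube, 1, 2))
```

The reorganization is paid for once, after which every point query avoids opening each time step separately.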

Hydrologic observation, model, and theory congruence on evapotranspiration variance: Diagnosis of continental scale land surface models
Speaker: Ruijie Zeng (Utah State University)

This study reconciles the state-of-the-art observations and simulations of evapotranspiration (ET) temporal variability through a diagnostic framework composed of an observation-model-theory triplet. Specifically, a confirmed theoretical tool, Evapotranspiration Temporal VARiance Decomposition (ETVARD), is used as a benchmark to estimate ET monthly variance across the contiguous United States (CONUS) with inputs including hydroclimatic observations, Gravity Recovery and Climate Experiment (GRACE)-based terrestrial water storage, four observation-based products (remote sensing from University of Washington, MODIS16, GLEAM, and FLUXNET-MTE), and four operational land surface models (MOSAIC, NOAH, NOAH-MP, and VIC).


Global monitoring of fresh water at high spatial and temporal resolutions. Assessing stream and lake hydrological/physical features within a machine learning framework
Speaker: Giuseppe Amatulli (Yale University)

Lakes and rivers process a significant quantity of freshwater, which is among the most vulnerable resources in nature. The physico-chemical characteristics of each watershed or stream are the result of complex interactions among several environmental variables. These variables, such as rainfall, evapotranspiration, soil infiltration and retention, geomorphology, land use, and snow cover, regulate discharge regimes and stream profiles. Current knowledge of stream flow trends in rural areas and developing countries is limited and fragmented, and small streams are often not represented. A full geo-analysis is needed to capture these stream features. Freshwater quantification at high spatial resolution is therefore essential to this aim, and is also the first step towards a comprehensive assessment of the global water cycle. The project captures the multi-dimensional aspects of the flow regimes (monthly discharge) and models hydraulics worldwide, using a broad range of 90 m geo-datasets and gauging station data in a machine learning framework. This work will be the most comprehensive hydrological model based on a high dimensional data-driven approach able to assess stream network location, width, depth, and water flow. The overarching goal of this research is to revolutionize our understanding of the fundamental principles that govern freshwater discharge regimes worldwide.

NWM-driven hydrodynamic simulations to resolve complex flow dynamics in low gradient watersheds
Speaker: Haitham Saad (University of Louisiana)
Co-Authors: Emad Habib (University of Louisiana), Robert Miller (University of Louisiana)

Streamflow monitoring is usually limited to main rivers, with the majority of the tributaries and low-order streams being underserved. The US National Oceanic and Atmospheric Administration (NOAA) developed the National Water Model (NWM) to produce a reanalysis dataset that contains hydrologic simulations from a 25-year retrospective simulation (January 1993 through December 2017). This provided streamflow time series at more than 2.7 million stream locations during periods when no other source of streamflow hydrographs existed. The current study presents an effort that capitalizes on the NWM Reanalysis dataset to drive a detailed hydrodynamic model and obtain a better understanding of complex flow dynamics in low-gradient, coastally influenced watersheds. The study site, the Vermilion River basin, is a medium-sized, 2013 mi² basin located in south Louisiana. This river is characterized by a very flat gradient, a typical slope of 1:10,000, and frequently reported major floods. During these floods, unfavorable conditions of significant backwater effects and reverse flows are usually observed. Given the low-gradient nature of the area and the interdependency of its natural and human-made infrastructure, such conditions usually lead to severe, widespread, and prolonged flooding. This study conducted a multi-approach modeling analysis using the NWM Reanalysis dataset to drive 1-D and 2-D simulations in the Hydrologic Engineering Center’s River Analysis System (HEC-RAS). The study also reports on a pilot-scale effort to develop an operational forecasting system for flood warnings and inundation mapping by forcing the hydrodynamic simulations in real time with operational outputs from the NWM.

A novel multi-scale data fusion framework for massive datasets
Speaker: Dhruva Kathuria (Texas A&M University)
Co-Authors: Binayak Mohanty (Texas A&M University) and Matthias Katzfuss (Texas A&M University)

The global burgeoning of environmental remote sensing datasets in the past decade holds significant potential for improving our understanding of multi-scale hydrological dynamics. The primary issues that hinder the fusion of different data platforms are (1) the massive size of datasets on a continental scale, (2) the different spatial resolutions of the data platforms, (3) inherent spatial variability in environmental variables caused by atmospheric and land surface controls, and (4) measurement errors caused by imperfect retrievals of remote sensing platforms. We present a novel data fusion scheme which takes all the above factors into account using a spatial hierarchical model (SHM). An SHM enables coherent integration of data, science, and uncertainties to make optimal predictions at unobserved locations using an underlying non-stationary geostatistical model. The applicability of the hierarchical approach, however, is severely limited by huge datasets. To account for the massive size of the datasets at a continental scale, we propose a novel extension of a likelihood approximation in a multi-scale, multi-platform setting. The applicability of the framework is demonstrated by fusing in situ soil moisture observations from SCAN and USCRN with satellite-derived soil moisture products from SMOS and SMAP for the contiguous United States (CONUS).


Session II
Convener: Jerad Bales (CUAHSI)

The National Hydrologic Model: an infrastructure for collaboration in the hydrologic community
Speaker: Steven Markstrom (U.S. Geological Survey)

The National Hydrologic Model (NHM) was developed to support coordinated, comprehensive, and consistent hydrologic modeling at multiple scales for the conterminous United States. The NHM development has been driven for the past decade by specific applications to meet stakeholder needs for accessible, adaptable surface water models that address local hydrologic modeling needs. NHM-based applications provide information to scientists, water resource managers, and the public to support advanced scientific inquiry and effective decision-making. The NHM infrastructure supports the execution of the Monthly Water Balance Model (NHM-MWBM) and the daily Precipitation Runoff Modeling System (NHM-PRMS). The NHM-PRMS balances all components of the water budget and can include simulation of stream temperature. Complete local models can be subset from the NHM-PRMS, then adapted and applied with local expertise to address stakeholder needs, providing nationally-consistent, locally informed, stakeholder relevant results. The NHM infrastructure provides an opportunity for collaboration in the hydrologic community.

Cyberinfrastructure needs for continental-domain hydrological modeling
Speaker: Bart Nijssen (University of Washington)
Co-Authors: Martyn P. Clark (University of Saskatchewan at Canmore) and Andrew Wood (NCAR)

In this presentation we will highlight some key cyberinfrastructure challenges in continental-domain hydrological modeling. We will cover (1) the development of multi-scale continental-domain instantiations from the same geospatial framework, introducing novel approaches for adaptive nests; (2) the application of methods in large-domain parameter estimation, focusing on the development of model workflows and computational infrastructure; and (3) the parallelization of continental-domain models, with a focus on the hierarchical/hybrid spatial decomposition strategies that are necessary to efficiently process connected river networks. We will summarize recent progress on each of these topics as well as outstanding research challenges.

Incorporating river geometry in large scale hydrologic and hydrodynamic models
Speaker: Sayan Dey (Purdue University)
Co-Authors: Venkatesh Merwade (Purdue University) and Siddharth Saksena (Purdue University)

Accurate representation of river geometry, including river centerline, banks, and bathymetry, is critical in simulating river hydrodynamics. However, most regional and continental scale models do not incorporate complete river geometry in simulating river processes due to the unavailability of such data in the public domain. Even the most commonly used Flowlines in the National Hydrography Dataset (NHD) have a poor spatial correspondence with the latest Lidar-based Digital Elevation Models (DEMs). Additionally, manual digitization of river centerlines or other geometric features using aerial photography is time consuming and impractical for large watersheds (> 1000 sq. km). This presentation will describe an automated framework to estimate river centerline, banks, and river bathymetry for large networks for use in integrated hydrologic-hydrodynamic simulations. Initial results from applying the framework to the entire Wabash Basin (Hydrologic Unit Code 0512, area = 85,300 sq. km) by creating a 1D/2D integrated hydrodynamic model using Integrated Channel and Pond Routing (ICPR) will be presented.

Fast summarizing algorithm for polygonal statistics over a regular grid
Speaker: Scott Haag (Drexel University)
Co-Authors: Ali Shokoufandeh (Drexel University), David Tarboton (Utah State University)

In this presentation we discuss a novel method to calculate univariate statistics for a regular grid given a region of interest (i.e., zonal statistics). This method, called Fast Zonal Statistics (FZS), allows the retrieval of common numerical (e.g., mean, sum, and standard deviation) and categorical (count) attributes. The complexity of the FZS algorithm depends on the length of the region’s boundary, whereas existing techniques to calculate zonal statistics scale in relation to the area of the region. Therefore, we both expect and measure geometric decreases in query time using the FZS algorithm in comparison to existing techniques. The FZS algorithm relies on a simple aggregation preprocessing technique that is run one time over every cell in the input grid. Lastly, we demonstrate an API implementation to return the National Land Cover Dataset (NLCD) zonal statistics for the Chesapeake Bay watershed.
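To illustrate how a one-time aggregation pass can make query cost depend on a region's boundary rather than its area, here is a minimal sketch using row-wise cumulative sums. This is not the FZS algorithm, whose preprocessing the abstract does not specify. It is one generic technique with the same scaling property, and the function names and region encoding are hypothetical.

```python
import numpy as np

def build_row_prefix_sums(grid):
    """One-time preprocessing: cumulative sums along each row,
    with a leading zero column so spans can be differenced directly."""
    grid = np.asarray(grid, dtype=float)
    prefix = np.zeros((grid.shape[0], grid.shape[1] + 1))
    prefix[:, 1:] = np.cumsum(grid, axis=1)
    return prefix

def zonal_sum(prefix, row_spans):
    """Sum over a region given as (row, col_start, col_end_exclusive) spans.

    Each span costs two lookups, so the work scales with the number of
    spans (roughly the boundary length), not the number of cells inside.
    """
    return sum(prefix[r, c1] - prefix[r, c0] for r, c0, c1 in row_spans)

# Hypothetical 4 x 5 grid and a region described by three row spans.
grid = np.arange(20).reshape(4, 5)
prefix = build_row_prefix_sums(grid)
region = [(0, 1, 4), (1, 0, 5), (2, 2, 3)]
total = zonal_sum(prefix, region)
n_cells = sum(c1 - c0 for _, c0, c1 in region)
print(total, total / n_cells)
```

A mean follows from the sum and the cell count, and a standard deviation could be obtained the same way by also storing prefix sums of squared values.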


Synergies between mechanistic and machine learning models
Convener: Jordan Read (U.S. Geological Survey)

Big-data machine learning has enabled novel applications and fresh opportunities in hydrologic modeling. However, there remains a need for hydrologic process understanding to realistically represent unobserved states and processes when predicting the quality and quantity of our water resources. Process understanding can not only be derived from machine learning applications but can also be embedded within those applications to improve on purely data-driven approaches, leading to better predictions even when extrapolating beyond the training data. This session will explore opportunities in hydrology to integrate data-driven modeling with process knowledge, including: enhancing machine learning with process understanding, deriving process understanding from machine learning applications, and creating hybrid machine-learning and process-based models.

Physics guided machine learning: A new paradigm for modeling dynamical systems
Speaker: Vipin Kumar (University of Minnesota)
Co-Authors: Jordan Read (U.S. Geological Survey); Jacob Zwart (U.S. Geological Survey); Alison Appling (U.S. Geological Survey); Xiaowei Jia (University of Minnesota); Jared Willard (University of Minnesota); Michael Steinbach (University of Minnesota); Paul Hanson (University of Wisconsin)

Physics-based models of dynamical systems are often used to study engineering and environmental systems. Despite their extensive use, these models have several well-known limitations due to incomplete or inaccurate representations of the physical processes being modeled. Given rapid data growth due to advances in sensor technologies, there is a tremendous opportunity to systematically advance modeling in these domains by using machine learning (ML) methods. However, capturing this opportunity is contingent on a paradigm shift in data-intensive scientific discovery since the “black box” use of ML often leads to serious false discoveries in scientific applications. Because the hypothesis space of scientific applications is often complex and exponentially large, an uninformed data-driven search can easily select a highly complex model that is neither generalizable nor physically interpretable, resulting in the discovery of spurious relationships, predictors, and patterns. This problem becomes worse when there is a scarcity of labeled samples, which is quite common in science and engineering domains.

This talk makes the case that in real-world systems that are governed by physical processes, there is an opportunity to take advantage of fundamental physical principles to inform the search for a physically meaningful and accurate ML model. While the talk will illustrate this paradigm in the context of modeling water temperature, it has the potential to greatly advance the pace of discovery in a number of scientific and engineering disciplines where physics-based models are used, e.g., power engineering, climate science, weather forecasting, materials science, and biomedicine.
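One way a physical principle can inform an ML model is through the training objective. The sketch below adds a penalty to a data-fit loss whenever predicted water temperatures imply that density decreases with depth. The abstract does not say which principles or loss terms the talk uses, so the constraint, the function names, and the weighting are assumptions. The density formula is a commonly used empirical approximation whose coefficients should be verified before reuse.

```python
import numpy as np

def water_density(temp_c):
    """Approximate freshwater density (kg/m^3) from temperature (deg C).

    Empirical approximation; coefficients are an assumption to verify.
    """
    t = np.asarray(temp_c, dtype=float)
    return 1000.0 * (1.0 - (t + 288.9414) * (t - 3.9863) ** 2
                     / (508929.2 * (t + 68.12963)))

def physics_guided_loss(pred_temp, obs_temp, weight=1.0):
    """Mean squared error plus a penalty for physically inconsistent profiles.

    pred_temp: predicted temperatures ordered from surface to bottom
    obs_temp:  observations at the same depths (NaN where unobserved)
    """
    pred = np.asarray(pred_temp, dtype=float)
    obs = np.asarray(obs_temp, dtype=float)
    mask = ~np.isnan(obs)
    data_term = np.mean((pred[mask] - obs[mask]) ** 2)
    density = water_density(pred)
    # Positive wherever a shallower layer is denser than the layer below it.
    violation = np.maximum(density[:-1] - density[1:], 0.0)
    return data_term + weight * np.mean(violation)

# Hypothetical profiles: warm over cold is stable, cold over warm is not.
stable = [20.0, 15.0, 10.0, 5.0]
inverted = [5.0, 10.0, 15.0, 20.0]
loss_stable = physics_guided_loss(stable, stable)
loss_inverted = physics_guided_loss(inverted, inverted)
print(loss_stable, loss_inverted)
```

Because the penalty term needs no observations, it can be evaluated at depths and times where labels are missing, which is relevant to the scarcity of labeled samples noted above.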

Application of a convolution neural network to the identification of karst features
Speaker: Scott Haag (Drexel University)
Co-Authors: Andrew McDonald (Drexel University); Michael Campagna (Drexel University); Ali Shokoufandeh (Drexel University)

The extraction of Karst-like features, including sinkholes, depressions, and swales, from 3D LIDAR point clouds is often accomplished via manual or mechanistic, labor-intensive processes. Automated computer vision approaches have the potential to drastically speed up this process, thereby allowing accurate identification of features over large regions. Our approach toward solving this problem employs multi-dimensional learned convolutional filters within a convolutional neural network (CNN) architecture. We train our network to detect Karst-like features using a pre-existing labeled dataset of Karst features, and assess its performance against a subset of this dataset. Using standard statistical and cross-validation approaches, we present preliminary results showcasing the merits of using CNNs to detect these features from an accuracy and temporal standpoint.

GLADD: A new Global Lake Dynamics Database created using machine learning and satellite data
Speaker: Ankush Khandelwal (University of Minnesota)
Co-Authors: Anuj Karpatne (Virginia Tech); Zhihao Wei (University of Minnesota); Rahul Ghosh (University of Minnesota); Huangying Kuang (University of Minnesota); Hilary Dugan (University of Wisconsin); Paul Hanson (University of Wisconsin); Vipin Kumar (University of Minnesota)

Freshwater resources play a crucial role in human sustenance as they provide freshwater for agriculture, power generation, human consumption, and recreation. A global database of lakes and reservoirs that provides their location and dynamics can be of great importance to the ecological community as it enables the study of the impact of human actions and climate change on fresh water availability. This paper presents a new database, GLADD (Global Lake Dynamics Database), that has been created by analyzing spectral data from Earth Observation (EO) satellites using novel machine learning (ML) techniques. These ML techniques can construct highly accurate surface area extents of water bodies at regular intervals despite the challenges arising from heterogeneity and missing or poor quality spectral data. The GLADD database aims to provide surface area variations of approximately 500,000 lakes and reservoirs (larger than 0.1 sq. km) at monthly scale from 1984 to 2015. Apart from providing dynamics of water bodies, the database also detects new water bodies that are missing from existing static databases such as GLWD and HydroLAKES. Thus, the GLADD database provides a global view of the changing state of these water bodies that are being impacted by climate change and human actions. Access to surface area variations in conjunction with bathymetry can also enable creation of calibration data for hydrological models, especially for regions where in-situ data is limited or not available. Finally, the visualization of these water bodies and their surface area time series is available online via the GLADD web interface.

Clowder: Open source data sharing leveraging active curation and applied machine learning
Speaker: Bing Zhang (University of Illinois)
Co-Authors: Kenton McHenry (University of Illinois Urbana-Champaign); Praveen Kumar (University of Illinois Urbana-Champaign); Luigi Marini (University of Illinois Urbana-Champaign)

Scientific data is often very heterogeneous. Within geoscience, data spans time series, geospatial, remote sensing, geophysical image, geophysical and geochemical laboratory analyses, experimental outcomes, and images to name a few. For such data to be usable by others, large collections of data spanning these types, some of it unstructured, must be annotated and/or processed into more readily usable products. If datasets are large, which is more and more the case today, local computational capabilities are also often essential towards usability in order to save the user from having to download the data or identify a suitably powerful local computational resource to run analysis. We present Clowder, an open source data management framework built on the notion of Active Curation, providing machine learning and other analysis based tools to facilitate the annotation of large, broad, and unstructured datasets. Being customizable from the ground up, Clowder can be leveraged and deployed as needed at local institutions for specific scientific needs or deployed remotely on cloud/HPC resources, extended to meet new data visualization/analysis needs, and utilized to run custom analysis near the data where it resides, and interoperate with other data infrastructure components e.g. for long term archiving. Clowder has been leveraged to support the data sharing and processing needs of a broad range of communities spanning biology, geoscience, materials science, medicine, social science, cultural heritage and the arts.

Unveiling new innovations in advanced cyberinfrastructure to support a community hydrologic modeling ecosystem
Conveners: Jonathan L. Goodall (University of Virginia); Anthony M. Castronova (CUAHSI); Christina Bandaragoda (University of Washington)

Big data. Machine learning. Open Source Research Software. Cloud computing -- enabling community hydrologic modeling requires advanced cyberinfrastructure. This session seeks to be a venue for sharing ideas, approaches, applications, and tools addressing one or more components of the cyberinfrastructure ecosystem needed to enable community hydrologic modeling. We want to unpack the hype and buzzwords circulating around our ecosystem of modelers and build understanding on best new practices in collaboratively building community models, running them using high performance and high throughput computing, and using online tools to engage a broad set of model users. How do we use advanced cyberinfrastructure to make reproducible models that are easy to reuse, share, and improve on in an open source framework? What design considerations accommodate specific needs within the hydrologic science and water resources communities? What are the latest tools and approaches for accommodating large datasets, variety of alternative model structures, complex data pre-processing workflows, and visualization of large geospatial and temporal outputs from hydrology models? We seek presentations addressing these and related questions that advance understanding of the needed cyberinfrastructure ecosystem (e.g., workflow software, modeling frameworks, data repositories, user interfaces, model wrappers, model metadata, visualization tools, etc.) for community hydrologic modeling.

StreamPULSE: a platform for modeling river and stream metabolism on a global scale
Speaker: Michael Vlah (Duke University)
Co-Authors: Emily Bernhardt (Duke University); James Heffernan (Duke University)

Quantifying metabolic rhythms in streams allows us to understand how floods, droughts, and nutrients alter the capture of light and organic matter that fuel aquatic food webs. Historically, stream metabolism has been complicated to measure, but recent technological improvements have allowed stream monitoring to occur at a near continuous rate. StreamPULSE, an open-source data platform, gives ecologists and resource managers access to a wide array of tools for uploading and cleaning sensor data, modeling metabolism, and visualizing model outputs. It also provides access to data and modeled metabolism estimates from hundreds of streams and rivers around the world, including data collected by NEON and USGS. Find out how StreamPULSE can help you explore and learn from a growing body of foundational ecosystem data.

Design and implementation of cyberinfrastructure to support a cloud-based, community hydrologic modeling ecosystem
Speaker: Young-Don Choi (University of Virginia)
Co-Authors: Jeffrey M. Sadler, Jonathan L. Goodall (University of Virginia), Anthony M. Castronova (CUAHSI), Andrew Bennett, Bart Nijssen (University of Washington), Ray Idaszak (University of North Carolina), Shaowen Wang (University of Illinois), Martyn P. Clark (University of Saskatchewan), David G. Tarboton (Utah State University)

Hydrologic research is tackling more and more complex questions, requiring researchers to collaborate in teams to build complex, integrated model simulations. Accordingly, the use of cyberinfrastructure is increasing due to the need for collaborative modeling, high throughput computing, and reproducibility and usability. However, the design and implementation of cyberinfrastructure to support community hydrologic modeling is still challenging because much functionality, such as the user interface for modeling, online data sharing, and different model execution environments, is necessary to support it. In this research, we present a collaborative, cloud-based modeling system built on the Structure for Unifying Multiple Modeling Alternatives (SUMMA) hydrologic model as an example paradigm for the design and implementation of cyberinfrastructure. The general paradigm consists of three main components: (i) a Python-based model Application Programming Interface (API) for interacting with hydrologic models, (ii) an online repository for storing model input and output files for different simulation runs, and (iii) a public JupyterHub environment for creating and running model simulations that leverages both the Python API and the online data repository. In this instance, we first created pySUMMA as an example API for interacting with the SUMMA modeling framework. Second, we used HydroShare as an online repository for sharing data and models. Finally, we used a JupyterHub instance tailored for running SUMMA model simulations and hosted by the Consortium of Universities for the Advancement of Hydrologic Science, Inc (CUAHSI). Together, these three components serve as a general example of a cloud-based modeling environment that can be used along with other models and modeling frameworks, in addition to SUMMA, to foster a community supported cyberinfrastructure for collaborative hydrologic modeling.

Model and code sharing via CUAHSI hosted MATLAB Online
Speaker: Lisa Kempler (MathWorks)

Are there analysis tools that can work with my data? Have other researchers developed code that I can reuse? How can I find these code examples and ramp up quickly so that I can apply them to my project?

With MATLAB Online hosted directly on Hydroshare.org, researchers, educators, and students can access the relevant data and shared models more easily. This talk will demonstrate the use of MATLAB Online to work with hydrological data using new geospatial data access, data analytics, and visualization techniques. We’ll also cover how to share work as notebooks, complete with embedded graphics, equations, and publication-quality formatting, using the new MATLAB Live Editor, enabling more transparent research and improved teaching and learning of water data science and more.

Temporal evapotranspiration aggregation method: An application for calculating evapotranspiration metrics, exploring the modifiable areal unit problem, and shortening the time to science
Speaker: James Matthew Coll (University of Kansas)

Evapotranspiration metrics play a key role in parameterizing a variety of earth process systems, and accurate parameterization of these is critical to their successful use. To define these metrics, users either find published values, measure them directly, or look to remote sensing platforms to calculate these for themselves. Before the creation of large-scale remote sensing platforms such as Google Earth Engine, this process was tedious, time consuming, and poorly reproducible. Tools like the Google Earth Engine JavaScript API alleviate some of these problems but require a modest amount of programming skill to harness their full potential efficiently. Alternatively, if a simple user interface is created, regional researchers who are interested more narrowly in parameterizing a region can acquire their metric of choice without having to invest the time into calculating it for themselves. To demonstrate this, a Google Earth Engine Web Application called the Temporal Evapotranspiration Aggregation Method (TEAM) was created where users can point and click over a map to query a desired metric over a specified aggregation unit, and optionally receive metric values for the area or by land cover class. Using this framework, we not only provide an application for parameterizing evapotranspiration values, but also drill into how these metrics are affected by the modifiable areal unit problem and demonstrate how other remote sensing datasets may be extended into this same framework. See it in action at https://jamesmcoll.users.earthengine.app/view/team.

A roadmap for Earthdata remote sensing for hydroinformatics
Speakers: Michael Gangl (NASA Physical Oceanography Distributed Active Archive Center); Catalina Oaida (Raytheon); Lewis McGibbney (NASA JPL); Jessica Hausman (NASA JPL)

The Physical Oceanography Distributed Active Archive Center (PO.DAAC) is expected to archive and distribute Surface Water and Ocean Topography (SWOT) [1] data, which will be available in 2022, after launch in 2021. SWOT will provide the very first comprehensive view of Earth’s freshwater bodies from space and will allow scientists to determine changing volumes of fresh water across the globe at an unprecedented resolution [2]. To maximize the impact and reach of this data, PO.DAAC is working to make inroads on search, analysis, and GIS integrations for the hydrology communities. This session will describe these approaches and plans. The PO.DAAC is planning to: Integrate hydrologic units based polygonal search capabilities such as HUC and shapefiles into long standing PO.DAAC search and subsetting tools; build dynamic GIS-ready shapefiles from relevant data products by mosaicing data in time and space for easier integration with the user community; and support services to serve PO.DAAC data in community portals and standards such as ArcGIS and WaterML for easy integration with industry standard tools.

While the SWOT mission is a few years away, PO.DAAC is looking to offer other data products in the above-mentioned formats and services, such as simulated products in the SWOT data model to enable early adopters, and GRACE and GRACE-FO for water table storage [3].

[1] https://podaac.jpl.nasa.gov/SWOT [2] https://swot.jpl.nasa.gov/hydrology.htm [3] https://gracefo.jpl.nasa.gov/science/water-storage/

Developing Open Source Water Resources Web Applications using Tethys Platform
Daniel Ames (Brigham Young University)

Tethys Platform is a collection of free and open source software (FOSS) tools that has been carefully selected and integrated into a Django server framework to address the unique development needs of environmental and water resources web applications. Tools included with Tethys Platform include GeoServer, OpenLayers, 52North, PostGIS and others. Tethys web apps are developed using a Python software development kit (SDK) which includes programmatic links to each software component. Tethys Platform is powered by the Django Python web framework, giving it a solid web foundation with strong security and performance. Numerous Tethys Apps have been developed to support CUAHSI, HydroShare, NASA, US Army Corps of Engineers and other entities. This workshop will introduce the app development environment and will show users how to create water resources web applications using this technology. Attendees should have access to a computer running Oracle VirtualBox and Ubuntu 18. We will start the workshop by installing Tethys on your computers and building a few web applications. We will discuss how to deploy these applications on the HydroShare apps server or your own server. Participants should be comfortable with basic Python, HTML, and JavaScript.

Discovering and Using Water Quality Sampling Data from Water Quality Portal
Dwane Young (U.S. Environmental Protection Agency)

The Water Quality Portal is a joint USGS, EPA, and National Water Quality Monitoring Council effort that contains over 350 million water quality sampling results from over 400 organizations, including EPA, USGS, other federal agencies, states, tribes, and other local groups. During this workshop, attendees will become familiar with the Portal, the Portal web services, and open source tools available for interacting with the Portal and making use of Portal data. Attendees will also be introduced to the Water Quality Exchange (WQX), which acts as the data standard and publishing model for publishing data through the Portal. WQX is the de facto standard for publishing water quality sampling data to the Portal, including physical, chemical, and biological data.
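As a rough illustration of what querying the Portal web services might look like, the sketch below builds a results query URL using only the Python standard library. The endpoint path and the parameter names (`statecode`, `characteristicName`, `startDateLo`, `mimeType`) are assumptions based on the Portal's public web service documentation and should be checked against the current user guide; the helper name `build_wqp_url` is hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed base URL of the Water Quality Portal result service.
WQP_RESULT_URL = "https://www.waterqualitydata.us/data/Result/search"


def build_wqp_url(state_code, characteristic, start_date, mime_type="csv"):
    """Return a query URL for sampling results (hypothetical helper).

    state_code     -- e.g. "US:49" for Utah (assumed FIPS-style code format)
    characteristic -- measured characteristic name, e.g. "Nitrate"
    start_date     -- earliest sample date, assumed MM-DD-YYYY
    """
    params = {
        "statecode": state_code,
        "characteristicName": characteristic,
        "startDateLo": start_date,
        "mimeType": mime_type,
    }
    # urlencode percent-escapes reserved characters such as ":".
    return WQP_RESULT_URL + "?" + urlencode(params)


url = build_wqp_url("US:49", "Nitrate", "01-01-2018")
query = parse_qs(urlparse(url).query)
print(url)
```

The resulting URL could then be fetched with any HTTP client and the CSV read into a table; that step is left out here because it depends on network access and on the service's current behavior.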

HydroQuality: Upload and Download Quality Data
Chao Chen (Boise State University); Connor Scully-Allison (University of Arizona); Chase Carthen (University of Nevada Reno); Rui Wu (East Carolina University)

In this session we will explain and demonstrate an alpha build of the CUAHSI supported HydroQuality web application. In the interest of growing the contributions of smaller research teams and growing a diversity of data products uploaded to CUAHSI data discovery platforms, we propose a web application which provides a pipeline workflow for setting up and performing Quality Control (QC) processes on flat data files. This software will flag data values with meaningful metadata informing users at a glance: what processes were run on the data, what metrics these processes used, what software performed these checks and when these processes were undertaken. Over the course of this hour session we will show you the design and technologies we used to build this software platform, walk you through a typical use case in a live demo, and open the floor for discussion and feedback from you, the water experts who may one day use this software package.

WORKSHOPS

Facilitating the development, adaptation and sharing of active-learning resources in hydrology education
Emad Habib (University of Louisiana); Melissa Gallagher (University of Louisiana); David Tarboton (Utah State University); Daniel Ames (Brigham Young University)

This workshop is offered for hydrology faculty interested in implementing or adapting active-learning, data-driven resources to their educational settings. The workshop aspires to create faculty networking and development opportunities with the overall goal of promoting and reducing barriers against adoption of active-learning resources in hydrology. The workshop will use the recently developed NSF-sponsored HydroLearn platform, along with resources from CUAHSI, HydroShare and other community platforms, to enable participating faculty to develop and share educational resources. The workshop will showcase existing seed modules and will cover best practices in developing student-centered learning activities, including the design of pedagogically-sound learning objectives and assessment rubrics. Faculty who currently teach hydrology-related courses are encouraged to participate, especially those who teach undergraduate or early-level graduate courses. Interested faculty may also be invited to participate in a follow-up funded fellowship program to engage in a semester-long adoption and field testing of the HydroLearn platform and its content. The workshop will be jointly conducted by hydrology faculty along with an expert in education research.

The Western States Water Council Water Data Exchange (WaDE) Workshop: Hands-on use cases for insights into water rights and use in the Western United States
Adel Abdallah (Western States Water Council); Sara Larsen (Upper Colorado River Commission)

The Western States Water Council’s (WSWC) Water Data Exchange (WaDE) program provides an API that streamlines access to data provided by 18 western state water agencies (e.g., water rights, water supply, and water use information). WaDE is built using an agreed upon data schema that reconciles syntactic (e.g., structural) differences and addresses semantic differences through a shared, controlled vocabulary adopted by participating data providers. WaDE complements the widely available data on water discharge in the U.S. by providing water rights information and water budget-related time-series data. WaDE supports the following four distinct types of water data shared by the member states: i) water allocations (e.g., water rights and permitting data), ii) aggregated water budget estimates such as water supply, withdrawal, consumptive use, return flows, and transfers for reporting units using geospatial delineations used by the states such as counties, HUCs, and custom areas over time as time series, iii) site-specific reported water time series data (e.g., withdrawals, consumptive use, return flows), and iv) regulatory and institutional constraints at play within states/basins that regulate water supply and use in specific locations.

Use GDAL, PKTOOLS and GRASS for massive raster operations in hydrology
Giuseppe Amatulli (Yale University); Longzhu Shen (University of Cambridge)

Geospatial Data Abstraction Library (GDAL), Processing Kernel for geospatial data (PKTOOLS) and Geographic Resources Analysis Support System (GRASS) are powerful command line utilities that are essential for manipulating and analyzing large geo-datasets. In the context of hydrological modelling, they can be used for data pre-processing but also to extract hydrographic features (stream network, flow accumulation, flow direction etc.) from digital elevation models. In the first part of the workshop, I will explain the main principles behind these tools by illustrating examples of simple geodata processing for raster cropping and re-projection, image masking, spatial and temporal/spectral filtering. I will also explain how to maximize computational operations and process raster data more efficiently by building routines that allow the saving of temporary raster outputs in RAM and using VRT files for tiling operations in a multicore environment. In the second part of the workshop, I will demonstrate how to correctly start a GRASS session and how to automate common geo-data processing tasks. In the context of hydrological modelling, I will use the MERIT-DEM to predict flow direction, stream network extraction and basin delineation. I will also teach how to compute geomorphology features, such as profile and tangential curvature, first and second order derivatives and geomorphology classes. Basic command line knowledge (any language is fine) and general know-how of geospatial data processing and hydrological modelling are prerequisites for this workshop. Participants should bring laptops with 25GB free on their hard disk, which will be used to install a customized OSGeo-Live virtual machine with GDAL and PKTOOLS and GRASS included.
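To make the tiling idea concrete, here is a minimal Python sketch (not the workshop's own material) that splits a raster into pixel windows and assembles the corresponding `gdal_translate -srcwin` and `gdalbuildvrt` command lines. The file names, raster dimensions, and helper names are hypothetical, and the commands are only constructed, not executed, so GDAL does not need to be installed to follow the logic.

```python
def tile_windows(width, height, tile):
    """Yield (xoff, yoff, xsize, ysize) pixel windows covering the raster."""
    for yoff in range(0, height, tile):
        for xoff in range(0, width, tile):
            # Edge tiles are clipped so they never extend past the raster.
            yield xoff, yoff, min(tile, width - xoff), min(tile, height - yoff)


def translate_commands(src, width, height, tile):
    """Build one gdal_translate command per tile (argument lists only)."""
    commands = []
    for i, (xoff, yoff, xsize, ysize) in enumerate(tile_windows(width, height, tile)):
        out = f"tile_{i:03d}.tif"  # hypothetical output naming scheme
        commands.append([
            "gdal_translate",
            "-srcwin", str(xoff), str(yoff), str(xsize), str(ysize),
            src, out,
        ])
    return commands


def build_vrt_command(tiles, vrt="mosaic.vrt"):
    """Build a gdalbuildvrt command that mosaics the tiles virtually."""
    return ["gdalbuildvrt", vrt, *tiles]


# Hypothetical 2500 x 1800 pixel DEM cut into 1000-pixel tiles.
cmds = translate_commands("dem.tif", 2500, 1800, 1000)
vrt_cmd = build_vrt_command([c[-1] for c in cmds])
for c in cmds:
    print(" ".join(c))
print(" ".join(vrt_cmd))
```

Because each tile command is independent, the list could be handed to a process pool (for example `concurrent.futures` with `subprocess.run`) to use several cores, with the VRT built once all tiles exist.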

CUAHSI compute services for working with data in the cloud
Anthony Castronova (CUAHSI)

Advancements in cyberinfrastructure (CI) to support cloud-based tools and services for the water science community have changed how researchers conduct, share, and publish scientific workflows. These have had a transformative impact on how our community addresses the challenges associated with interdisciplinary collaboration, reproducing scientific findings, and developing real-world educational modules. The Consortium of Universities for the Advancement of Hydrologic Science, Inc (CUAHSI) facilitates discussion around these topics with the water science community to better identify the shortcomings of current CI approaches and define the requirements for the next generation of cloud services. The purpose of this workshop is to introduce and solicit feedback on the current suite of CUAHSI community computational tools that have been designed to improve the way water science research and education is conducted in the cloud. This workshop will cover several technologies that are actively being developed for working with Earth surface data. Our goal is to demonstrate how these compute environments can be used in educational applications, workshops, reproducing published work, and conducting research. Participants will be presented with several approaches for working with their data within the CUAHSI ecosystem of tools. The workshop will focus heavily on interactive examples and will feature several programming languages including Python, R, and MATLAB. Participants are not required to be proficient in these languages but should bring a laptop computer, be ready to work through live examples, and be willing to provide constructive feedback.

Using NHDPlus Value Added Attributes to Create Useful Analytical Tools
Alan Rea (U.S. Geological Survey); Karen Adkins (U.S. Geological Survey); Michele Basile (U.S. Geological Survey)

The NHDPlus version 2 provides a robust geospatial hydrologic modeling framework for the United States (except Alaska) that is being used in many different applications. The NHDPlus High Resolution will provide even greater detail and will extend the framework to include Alaska and add contributing streams and areas in Canada and Mexico. All versions of NHDPlus include a set of Value-Added Attributes (VAAs) which greatly improve the capabilities for upstream and downstream navigation, analysis, and modeling. Examples of these enhanced capabilities include using structured queries for rapid retrieval of all NHDFlowline features and catchments upstream of a selected NHDFlowline feature; selecting stream segments (sorted in hydrologic order) for stream profile analysis and plotting; and calculating cumulative catchment attributes using hydrologic sequence routing attributes. VAA-based routing methods were used to produce NHDPlus HR attributes such as cumulative drainage areas. This workshop will introduce the concepts behind the NHDPlus VAAs and show how they may be used in hydrologic analyses. The workshop will include in-depth examples and demonstrations. Attendees are welcome to follow along on their own laptop, but this is not a hands-on workshop. Open source code repositories containing useful tools, source code, and programming specifications will be introduced. A major goal of the workshop is to begin to build a community around these open source repositories.
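The routing ideas described above can be sketched on a toy network. The Python example below is an illustration only: the five flowlines, their values, and the lower-case field names (`hydroseq`, `dn_hydroseq`, `area_sqkm`) are made up, loosely modeled on the hydrologic sequence attributes, and it assumes that larger sequence numbers lie upstream. It shows upstream navigation by following downstream pointers in reverse, and cumulative area by processing flowlines from upstream to downstream.

```python
from collections import defaultdict

# Toy network (hypothetical values): A and B join into C; C and D join into E.
FLOWLINES = [
    {"id": "A", "hydroseq": 50, "dn_hydroseq": 30, "area_sqkm": 2.0},
    {"id": "B", "hydroseq": 40, "dn_hydroseq": 30, "area_sqkm": 3.0},
    {"id": "C", "hydroseq": 30, "dn_hydroseq": 10, "area_sqkm": 1.5},
    {"id": "D", "hydroseq": 20, "dn_hydroseq": 10, "area_sqkm": 4.0},
    {"id": "E", "hydroseq": 10, "dn_hydroseq": 0, "area_sqkm": 0.5},  # outlet
]


def upstream_ids(flowlines, start_id):
    """Return the ids of start_id and every flowline that drains to it."""
    by_id = {f["id"]: f for f in flowlines}
    contributors = defaultdict(list)  # hydroseq -> ids flowing directly into it
    for f in flowlines:
        contributors[f["dn_hydroseq"]].append(f["id"])
    found, stack = set(), [start_id]
    while stack:
        fid = stack.pop()
        if fid in found:
            continue
        found.add(fid)
        stack.extend(contributors[by_id[fid]["hydroseq"]])
    return found


def cumulative_area(flowlines):
    """Accumulate catchment area downstream in a single sorted pass."""
    total = {f["hydroseq"]: f["area_sqkm"] for f in flowlines}
    ids = {f["hydroseq"]: f["id"] for f in flowlines}
    # Descending order visits every flowline before its downstream neighbor
    # (this relies on the assumed numbering: larger hydroseq is upstream).
    for f in sorted(flowlines, key=lambda f: f["hydroseq"], reverse=True):
        dn = f["dn_hydroseq"]
        if dn in total:
            total[dn] += total[f["hydroseq"]]
    return {ids[h]: area for h, area in total.items()}


upstream_of_c = upstream_ids(FLOWLINES, "C")
areas = cumulative_area(FLOWLINES)
print(sorted(upstream_of_c), areas)
```

The single sorted pass is the point of sequence-style attributes: no recursive network trace is needed to accumulate an attribute, only an ordering that guarantees upstream features are handled first.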

CUAHSI Water Data Services Town Hall
Convener: David Tarboton (Utah State University)

CUAHSI operates a number of data services to enhance access to and provide functionality for sharing and collaboration with data and models. These include the HydroShare repository, where users share and publish data and models in a variety of flexible formats so that this information is citable, shareable, and discoverable; tools (web apps) that act on content in HydroShare, providing users with a gateway to computing and analysis; WaterOneFlow web services; and HydroClient. This users meeting will be a forum for current users and potential new users to provide input and feedback on current services, to share experiences with other users and developers, and to provide input on needed new features and services. The format will be informal open discussion with an opportunity for presentations and short demos from users who have done cool things with CUAHSI data services, as well as an update from developers about features planned. Users who wish to make a presentation (generally 5 min, but may be more if time allows) do not need to provide an abstract in advance, but should email the convener ([email protected]) prior to the meeting. At the meeting users will be invited to contribute to and vote on prioritization of a wish list of new features.

TOWN HALL

Posters are available for viewing throughout the duration of the CUAHSI Conference on Hydroinformatics.

A convolutional neural network for near-real time precipitation estimation from bispectral satellite information
Mojtaba Sadeghi (University of California, Irvine)
Co-Authors: Ata Akbari Asanjan (University of California, Irvine); Mohammad Faridzad (University of California, Irvine); Phu Nguyen (University of California, Irvine); Kuolin Hsu (University of California, Irvine); Soroosh Sorooshian (University of California, Irvine)

Accurate and timely precipitation estimates are critical for monitoring and forecasting natural disasters such as floods. Despite having high-resolution satellite information, precipitation estimation from remotely sensed data still suffers from methodological limitations. The state-of-the-art deep learning algorithms, renowned for their skill in learning accurate patterns within large and complex data sets, seem to fit nicely to the task of precipitation estimation given the ample amount of high-resolution satellite data. In this study, the effectiveness of applying Convolutional Neural Networks (CNNs) together with the Infrared (IR) and Water Vapor (WV) channels from geostationary satellites for estimating the precipitation rate is explored. The proposed model performances are evaluated over the central CONUS at the spatial resolution of 0.08-degree and at an hourly time scale. Precipitation Estimation from Remotely Sensed Imagery using an Artificial Neural Network (PERSIANN) Cloud Classification System (CCS), which is an operational satellite-based product, and PERSIANN-Stacked Denoising Autoencoder (PERSIANN-SDAE), are employed as baseline models. Results from the study demonstrate that the proposed CNN-based model (PERSIANN-CNN) is capable of providing more accurate rainfall estimates compared to the baseline models at various temporal (hourly and daily) and spatial (0.08, 0.16, 0.25, and 0.5 degrees) scales. Specifically, PERSIANN-CNN outperforms PERSIANN-CCS (PERSIANN-SDAE) by 54% (23%) in Critical Success Index (CSI) which shows the detection skills of this model. Furthermore, the Root Mean Squared Error (RMSE) of the rainfall estimates by PERSIANN-CNN was lower than that of PERSIANN-CCS (PERSIANN-SDAE) by 37% (14%) which shows the estimation accuracy of the proposed model.

A deep learning framework for short-range quantitative precipitation forecasting from satellite information
Ata Akbari Asanjan (University of California, Irvine)

Predicting short-range quantitative precipitation is very important for flood forecasting, early flood warning, and other hydrometeorological purposes. This study aims to improve precipitation forecasting skill using a recently developed and advanced machine learning technique known as Generative Adversarial Networks (GANs). The proposed Conditional GAN learns the changing patterns of clouds from infrared (IR) images, retrieved from the infrared channel of the Geostationary Operational Environmental Satellite (GOES), using a dynamic and effective learning method. After learning the dynamics of clouds, the LSTM model predicts the upcoming rainy CTBT events. The proposed model is then merged with a precipitation estimation algorithm termed Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) to provide precipitation forecasts. The results of the merged LSTM with PERSIANN are compared to the results of an Elman-type Recurrent Neural Network (RNN) merged with PERSIANN and the Final Analysis of the Global Forecast System model over the states of Oklahoma, Florida, and Oregon. The performance of each model is investigated during three storm events, each located over one of the study regions. The results indicate that the merged LSTM forecasts outperform the numerical and statistical baselines in terms of Probability of Detection (POD), False Alarm Ratio (FAR), Critical Success Index (CSI), RMSE and correlation coefficient, especially in convective systems. The proposed method shows superior capabilities in short-term forecasting over the compared methods.
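For readers unfamiliar with the verification scores named in the abstracts above, the following Python sketch computes POD, FAR, CSI, and RMSE from paired observed and estimated rain rates using their standard contingency-table definitions. The sample values and the 0.1 rain/no-rain threshold are made up for illustration and are not taken from either study.

```python
import math


def contingency(obs, est, threshold=0.1):
    """Count hits, misses, and false alarms for rain/no-rain detection."""
    hits = misses = false_alarms = 0
    for o, e in zip(obs, est):
        o_rain, e_rain = o >= threshold, e >= threshold
        if o_rain and e_rain:
            hits += 1
        elif o_rain:
            misses += 1
        elif e_rain:
            false_alarms += 1
    return hits, misses, false_alarms


def _ratio(num, den):
    # Scores are undefined when the denominator is zero.
    return num / den if den else float("nan")


def pod(h, m, f):
    return _ratio(h, h + m)        # Probability of Detection


def far(h, m, f):
    return _ratio(f, h + f)        # False Alarm Ratio


def csi(h, m, f):
    return _ratio(h, h + m + f)    # Critical Success Index


def rmse(obs, est):
    """Root Mean Squared Error over all pairs."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


# Made-up rain rates (arbitrary units) for six time steps.
observed = [0.0, 1.2, 3.4, 0.0, 0.5, 2.0]
estimated = [0.3, 1.0, 2.9, 0.0, 0.0, 2.5]

counts = contingency(observed, estimated)
print(counts, pod(*counts), far(*counts), csi(*counts), rmse(observed, estimated))
```

With these sample values there are three hits, one miss, and one false alarm, so POD is 0.75, FAR is 0.25, and CSI is 0.6. Higher POD and CSI and lower FAR and RMSE indicate better agreement with the observations.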

Poster Presentations

A formal Bayesian calibration method for a multi-layer vadose zone model using Markov chain Monte Carlo simulation and field soil moisture data
Ivo Arrey (University of Venda)

Inverse modelling of in situ soil water dynamics plays an important role in process understanding and in estimating soil hydraulic properties at field scale. Accurate estimates of vadose zone flow dynamics at field scales through numerical modelling have been shown to have profound benefits for agriculture and water resources management in general. In this study, we used a formal Bayesian approach to estimate the unknown parameters in the hydrologic model, including its structural and input uncertainties. We demonstrate this method using a multi-layer soil moisture data assimilation scheme in a delineated area of the Nzhelele catchment, South Africa. The system response is simulated by a numerical solution of the Richards equation implemented in a HYDRUS-1D model with the parameterized hydraulic functions, and input of suitable initial and atmospheric boundary conditions as forcings. Our assumption of autocorrelated, heteroscedastic and non-Gaussian distributed models as more realistic representations of the statistical properties describing the residuals is in conformity with previous studies.

A generalized model tree framework for simulating hydrologic systems: Application to reservoirs
Matin Rahnamay Naeini (University of California, Irvine)
Co-Authors: Tiantian Yang (University of Oklahoma); Ahmad Tavakoly (U.S. Army Engineer Research and Development Center); Bita Analui (University of California, Irvine); Amir AghaKouchak (University of California, Irvine); Kuolin Hsu (University of California, Irvine); Soroosh Sorooshian (University of California, Irvine)

Decision tree algorithms are among the popular data-driven methods for simulating hydrologic systems. The models generated by these algorithms are transparent and can reveal useful information about the underlying process of hydrologic systems. Most decision tree algorithms employ a top-down tree growing procedure that involves an exhaustive search method to construct the structure of the trees. This process is computationally intensive and can be biased in variable selection. Also, the generated models cannot capture the variability of continuous systems in many cases. These shortcomings of the decision tree algorithms motivated us to develop a new Generalized Model Tree (GMT) framework for simulating continuous hydrologic processes. Here, we employed the GMT framework to mimic human decision-making in reservoir systems and compared its performance with other algorithms such as Classification and Regression Tree (CART) and M5’.

A global Earth observation app as a data access service
Riley Hales (Brigham Young University)
Co-Author: Jim Nelson (Brigham Young University)

Reliable and accurate earth data governs analysis and decisions for professionals across many disciplines, including the sciences, engineering, and politics. Many countries lack adequate environmental monitoring and modeling infrastructure, leaving large regions of the world unmonitored and undocumented. Global datasets and information powered by remote sensing equipment, especially large satellite networks, have sufficiently grown in reliability and in temporal and spatial coverage to supplement and fill gaps in local monitoring. The Earth Observer Tool is a web application built on Tethys Platform that removes the technical barriers of collecting, processing, and displaying global earth observation data, enabling users to access the data as actionable information. The application automates access to NASA Global Land Data Assimilation System (GLDAS) historical data and NOAA Global Forecast System (GFS) forecasted data. Data are presented through time-enabled map animations with the ability to extract timeseries data at a point or across large regions.

A hydrologic and hydraulic modeling approach for the storm triggered cascading flood inundation
Mengye Chen (University of Oklahoma)
Co-Authors: Xiangyu Luo (University of Oklahoma); Yang Hong (University of Oklahoma)

As global climate change continues to cause unevenly distributed precipitation and water resources, hydrological models can be helpful tools to analyze and forecast water storage and to support water security. This study focuses on coupling a 2D hydraulic model with an existing global flood early warning system to simulate and forecast riverine flooding and stormwater accumulation at a large scale with high resolution (e.g., ≤10 m grid cells) using high performance computing techniques and machine learning modules. High-resolution 2D simulation of flood extent can better help the public understand the impact and help scientists quantify risk and damage. The model was applied to 2017 Hurricane Harvey over the northern Houston area, which includes the entire Spring Creek and Cypress Creek channel networks. The Hurricane Harvey case study showed promising results for stream flows, surface water extents, and water depths, with computation speed efficient enough to meet real-time simulation demand. Since the model is capable of simulating in real time over a large area, it can be driven by real-time QPE (Quantitative Precipitation Estimation) and real-time QPF (Quantitative Precipitation Forecast) over the same region at an overall 10 m resolution, which will be further studied in advanced logistic optimization projects. The model is integrated with EF5, a FEMA-adapted national hydrological simulation framework, to support water resource management at the national scale.

A low-cost, open source, monitoring system for collecting high-resolution water use data on positive displacement residential water meters
Camilo Bastidas (Utah State University)
Co-Authors: Jeffery S. Horsburgh (Utah State University); Josh Tracy (Utah State University)

We present a low-cost monitoring system to collect high-resolution residential water use data without disrupting the operation of common positive displacement meters. The system was designed for installation on top of nutating, magnetically driven, positive displacement residential water meters and is capable of collecting data at a high resolution. It couples an Arduino PRO, a custom-designed interface board, and a magnetometer sensor. The system was developed and calibrated at the Utah Water Research Laboratory and was deployed for testing on 3 single-family residences in Logan, Utah, for a period of 1 month. The system’s design and software are openly available for potential reuse. Battery life for the device was estimated to be over 2 months with continuous data collection at a resolution of approximately 4 seconds. Results indicate a high level of accuracy compared to the meters on which the systems were installed. Given its ability to collect high-resolution data, this system is of special interest for water end-use studies, future projections of residential water use, infrastructure design, and advancing our understanding of water use timing and behavior. Because it is open source, the system can be customized for the specific needs of each research project. The architecture and software are presented in this poster. Results from field deployments are presented to demonstrate the applicability and usefulness of the system.

A method and software for localizing global gridded weather forecasts
Xiaohui (Sherry) Qiao (Brigham Young University)
Co-Authors: Jim Nelson (Brigham Young University); Daniel P. Ames (Brigham Young University)

Global or national scale flood early warning systems (FEWS) can benefit developing countries and ungauged regions that lack observational data, computational infrastructure, and the human capacity to create actionable FEWS. However, existing global land surface models have typically been generated at coarse resolutions that, at least for streamflow, have little value at local scales when used for flood warning. We have developed a tool that can efficiently route the runoff generated by various land surface models (LSMs) used in large-scale modelling into a vector-based river network. The tool works as a “coupler” between various LSMs and a vector-based river routing model, the Routing Application for Parallel computatIon of Discharge (RAPID). The whole procedure includes a GIS-based preprocessing toolset and an open source Python package called RAPIDpy. A set of web applications has been developed to demonstrate that this mapping method can support water and flood management in data-scarce regions by enabling the estimation of streamflow from various global forecasting systems.
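
The coupling idea described above can be pictured as an area-weighted transfer of gridded runoff depth onto river catchments. The following is a minimal sketch of that idea only; it is not the RAPIDpy API, and the function name `catchment_inflow`, the weight-table layout, and all sample numbers are hypothetical.

```python
def catchment_inflow(runoff_mm, weights, dt_seconds):
    """Convert gridded runoff depth into a mean inflow rate per catchment.

    runoff_mm  : dict mapping grid-cell id -> runoff depth (mm) over one time step
    weights    : dict mapping catchment id -> list of (cell id, overlap area in m^2)
    dt_seconds : length of the time step in seconds
    Returns a dict mapping catchment id -> inflow in m^3/s.
    """
    inflow = {}
    for catchment, overlaps in weights.items():
        volume_m3 = 0.0
        for cell, area_m2 in overlaps:
            # Depth (mm converted to m) times overlapping area gives a volume.
            volume_m3 += runoff_mm[cell] / 1000.0 * area_m2
        inflow[catchment] = volume_m3 / dt_seconds
    return inflow


# Hypothetical example: two coarse LSM cells feeding two river catchments.
runoff = {"cell_a": 3.0, "cell_b": 0.0}  # mm of runoff in a 3-hour step
table = {
    "reach_1": [("cell_a", 2.0e6), ("cell_b", 1.0e6)],
    "reach_2": [("cell_b", 4.0e6)],
}
flows = catchment_inflow(runoff, table, dt_seconds=3 * 3600)
```

In a sketch like this, the weight table would be computed once in a GIS preprocessing step, so that each new forecast only requires the inexpensive summation.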

A visualization workflow for quantifying parameter sensitivities to uncertainties for hydrologic models
Kyla Semmendinger (Cornell University)
Co-Authors: Catherine Finkenbiner (Oregon State University); David Blodgett (U.S. Geological Survey); Nels Frazier (University of Wyoming)

From regional to continental scales, models require modular components for representing individual hydrologic processes due to factors such as data availability and physical attributes. The hydrologic field therefore needs a common procedure for evaluating model output based on parameter sensitivity and performance metrics. We developed a reproducible workflow, with the help of National Water Center staff, for evaluating parameter sensitivities and uncertainties using the hydrologic modeling framework of the NOAA National Water Model (NWM). The NWM simulates observed and forecasted streamflow across the contiguous United States (CONUS) several times a day. High variability in soil types, elevations, vegetation characteristics, and forcings (e.g., precipitation) across CONUS leads to complexities when comparing model outputs and observed streamflow datasets. Our workflow objectively evaluates model output as a function of parameter choice using both numerical and visualization techniques. The workflow was implemented for three case studies, provided by graduate student researchers at the National Water Center Innovators Program: Summer Institute: 1) hydrologic models of soil physical processes from the NWM and TOPMODEL, 2) channel parameter ensemble schemes and streamflow from the NWM, and 3) representations of near-coastal dynamics developed from NWM and D-Flow models. The results can be reproduced and visualized using Python/R Jupyter notebooks within a community GitHub code repository. Model parameter sensitivity was evaluated using variance-based methods and Bayesian theory. Uncertainty in parameter spaces was quantified to highlight the impact of unreliable input data on model output. Model parameter sensitivities and uncertainties were evaluated numerically and visually to provide a comprehensive outlook on their impacts on model output. For each of the three case studies, we provided a summary and interpretation of the workflow results. Our workflow can be integrated into hydrologic modeling frameworks, like the NWM, for objective modular model and parameter scheme evaluations based on a data-driven approach to model selection.
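
As a rough illustration of what a variance-based sensitivity measure computes, the sketch below estimates a first-order index, Var(E[Y | X]) / Var(Y), by binning samples. It is not the authors' workflow: the toy model, the parameter names `k1` and `k2`, and the binning estimator are assumptions chosen to keep the example self-contained.

```python
import random
from statistics import fmean, pvariance


def first_order_index(x, y, bins=20):
    """Estimate the first-order index Var(E[Y | X]) / Var(Y).

    Samples are sorted by x and split into equal-count bins; the variance
    of the bin means of y approximates Var(E[Y | X]).
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    size = n // bins
    overall_mean = fmean(y)
    between = 0.0
    for b in range(bins):
        start = b * size
        stop = n if b == bins - 1 else start + size
        chunk = [value for _, value in pairs[start:stop]]
        between += len(chunk) / n * (fmean(chunk) - overall_mean) ** 2
    return between / pvariance(y)


# Toy stand-in for a hydrologic model: the output depends strongly on
# parameter k1 and weakly on parameter k2 (both hypothetical).
random.seed(0)
k1 = [random.random() for _ in range(20000)]
k2 = [random.random() for _ in range(20000)]
output = [4.0 * a + 1.0 * b for a, b in zip(k1, k2)]

s_k1 = first_order_index(k1, output)
s_k2 = first_order_index(k2, output)
```

For this toy model the exact indices are 16/17 for `k1` and 1/17 for `k2`, so the estimates should rank `k1` as far more influential.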

Application and evaluation of WRF-Hydro National Water Model configuration to a headwater snowmelt dominated watershed in a mountain Karst region
Irene Garousi-Nejad (Utah State University)
Co-Author: David Tarboton (Utah State University)

Physically based distributed hydrologic models are widely used to understand the hydrological dynamics of a catchment, quantify the response of hydrologic systems to climate and land use changes, and manage water resources. WRF-Hydro is a physically based distributed hydrologic model representing the key physical processes between the land and atmosphere involved in streamflow generation from rainfall and weather inputs, at a level of detail sufficient to capture the physical sensitivities to changes in climate or land surface properties. The WRF-Hydro modeling framework is used by the National Water Model (NWM) to simulate storages (snow accumulation, soil moisture, canopy interception of both liquid and snow) and fluxes of the system (evaporation, transpiration, surface and subsurface runoff). In this study, we applied WRF-Hydro to the Logan River watershed (within the Bear River Range in the United States) to: (1) evaluate differences between observations, our results, and operational NWM results; (2) explore the use of different parameters; and (3) evaluate alternative physical process representations. This is a snowmelt-dominated watershed with significant karst conduit groundwater flow. The required inputs to WRF-Hydro were obtained from the CUAHSI Subsetter tool, which extracts static NWM inputs (such as soil properties, land use, and NOAH-MP parameters) for a watershed of interest. We used the meteorological forcing data from the NWM archive at RENCI. We examined the overall water balance with a particular focus on snow, and explored calibration of some of the infiltration and retention parameters to account for the presence of karst.

Assessing machine learning techniques for predicting macroscale catchment function from stream bacterial DNA sequences
Dawn R Urycki (Oregon State University)
Co-Authors: Stephen P. Good (Oregon State University); Byron C. Crump (Oregon State University)

The stream microbiome, as identified by sequencing DNA collected from the stream, has been shown to be related to catchment hydrology and has recently been used to estimate stream discharge. Given that most aquatic bacteria in streams originate in upslope environments and that stream water at outlets integrates runoff from across catchments, we posit that the stream microbiome also carries information about the macroscale catchment environment. In this study, we refine and extend methods to relate the stream microbiome to the hydrology, ecology, and geochemistry of catchments. To explore our hypothesis, we extracted, amplified, and sequenced bacterial 16S rRNA gene fragments collected at 10 stream sites in the HJ Andrews Experimental Forest in the Cascade Mountains of Oregon. We then clustered very similar sequences into operational taxonomic units (OTUs), resulting in over 4000 different OTUs present throughout our 10 study streams. We used statistical models developed through machine learning techniques to relate the bacterial community composition (i.e., relative abundance of OTUs) to hydrology, ecology, and geochemistry in the catchment and then applied these models to estimate catchment characteristics. Our models for discharge and stand age were very sensitive to the subset of OTUs used in the model. Beyond threshold values, models were less sensitive to the free parameters in the machine learning algorithms. Our approach could be used in other studies that apply machine learning techniques to relate bacterial DNA to ecohydrological characteristics, in order to take advantage of the wealth of information contained within stream water bacterial DNA fragments.

Data support for the Critical Zone Observatories
Zhiyu (Drew) Li (Brigham Young University)
Co-Author: Martin Seul (CUAHSI)

The NSF-sponsored Critical Zone Observatories (CZO) program has been collecting physical, chemical, and biological datasets across the US for more than a decade, providing an invaluable and diverse data archive for cross-disciplinary research. CZO data are currently spread across individual observatories in an ad hoc manner, and only a fraction of the data are stored in a long-term archive. CZO is partnering with CUAHSI to migrate these distributed datasets to HydroShare, a centralized long-term science data repository and sharing platform designed by domain scientists to serve domain communities. In addition to a more robust data service, CZO users will also benefit from improved search and browse, fine-grained access control, sharable modeling and reproducible workflows, and various out-of-the-box web apps for fast visualization and data analysis. This paper presents the latest status of the ongoing CZO-to-HydroShare data migration work. The challenges faced and lessons learned are shared and discussed, which might provide guidance for future science data management projects amid the current Big Data torrent.

Development of IT animations for engaging citizens with hydrogeology knowledge
Ruopu Li (Southern Illinois University-Carbondale)

Public awareness and participation are critical to the success of various programs in support of water resources planning and management. Although a tremendous amount of scientific information has been produced by various government agencies and research institutions, effective communication of that information to the general public has been a challenge due to its technical complexity and the limited staff dedicated to such endeavors. There tends to be a widening gap between the enormous volume of technical information and the public's need for quality, comprehensible information. This study developed browser-based interactive computer programs using HTML5, CSS, and JavaScript, without relying on the conventional Adobe Flash tool. The key concepts presented by the programs include the basics of the hydrologic cycle in an agricultural context and the nature of stream and aquifer interaction associated with irrigation activities. The project results in computerized interactive animation programs that visualize the fundamental hydrogeologic information associated with surface water and groundwater interactions.

Dimension reduction approaches to assist IoT implementation in agro-hydrology
Raghavendra B. Jana (Skolkovo Institute of Science and Technology)
Co-Author: Ivan Oseledets (Skolkovo Institute of Science and Technology)

We present a study that extends the concept of temporal stability and uses dimension reduction techniques such as Dynamic Mode Decomposition (DMD) and Discrete Empirical Interpolation Method (DEIM) to determine optimal locations for long-term soil moisture sensor installation. Such locations provide not simply the mean soil moisture values for the domain, but also help recover the variability in space across all other locations in the study domain. As with time stability, the initial analysis is dependent on an intensive sampling history. The DMD/DEIM method is an application of model reduction techniques for non-linearly related measurements. Using this technique, it is possible to determine the number of sampling points that would be required for a given accuracy of prediction across the domain, and the location of those points. The technique can be adapted for other variables of interest and to other hydro-climates easily.
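
For readers unfamiliar with the classical temporal stability concept that this study extends, the sketch below computes the mean relative difference of each location from the field average over time. It illustrates only that baseline idea, not the DMD/DEIM method; the function name `temporal_stability` and the sample soil moisture values are hypothetical.

```python
from statistics import fmean, pstdev


def temporal_stability(theta):
    """Classical temporal stability statistics.

    theta[t][i] is the soil moisture at sampling time t and location i.
    Returns one (mean relative difference, standard deviation) pair per
    location, where the relative difference is taken against the spatial
    mean of each sampling time.
    """
    n_locations = len(theta[0])
    relative = [[] for _ in range(n_locations)]
    for row in theta:
        field_mean = fmean(row)
        for i, value in enumerate(row):
            relative[i].append((value - field_mean) / field_mean)
    return [(fmean(r), pstdev(r)) for r in relative]


# Hypothetical volumetric soil moisture: 3 sampling dates x 3 locations.
theta = [
    [0.20, 0.30, 0.25],
    [0.16, 0.24, 0.20],
    [0.28, 0.42, 0.35],
]
stats = temporal_stability(theta)

# The location whose mean relative difference is closest to zero best
# tracks the field-average soil moisture.
representative = min(range(len(stats)), key=lambda i: abs(stats[i][0]))
```

A location selected this way recovers only the domain mean, which is the limitation the abstract says the DMD/DEIM approach is meant to address by also recovering spatial variability.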

Evaluating groundwater response to coastal storms: an application of hydro-informatics in southern Rhode Island
Jeeban Panthi (University of Rhode Island)
Co-Authors: Thomas Boving (University of Rhode Island); Soni Pradhanang (University of Rhode Island)

Groundwater response to coastal storms is complex because multiple processes occur together. Storm surge pushes salty water inland and, due to the density gradient, the saltwater infiltrates into the groundwater. Heavy precipitation during storm events causes higher recharge to the groundwater in areas not inundated by storm surge, and the freshwater lens sitting on top of the saltwater is pushed up by the pressure of the landward saltwater incursion. The objective of this research project is to evaluate the groundwater response to storm surge and heavy precipitation. We have installed ground-based sensors to measure groundwater level, salinity, and water temperature on the southern coast of Rhode Island. Climate and tidal data are collected from NOAA ground stations at the location. To delineate the interface between fresh groundwater and salty ocean water, we have been surveying the aquifer using Ground Penetrating Radar along 6 transects from ocean to inland. After integrating these data sets in a usable format for the nor'easter winter storm of March 2018, results show that ocean surge is a more powerful predictor of water table response in the near-shore aquifer than precipitation, and that the freshwater lens is thinner near the shoreline than inland. This study is important for safeguarding freshwater resources because 40% of the drinking water in Rhode Island comes from groundwater, and the share rises to 100% in southern towns.

Examining the influence of influent water quality on chemical use at water treatment plant
Hui Wang (Tampa Bay Water)
Co-Authors: Tirusew Asefa (Tampa Bay Water); Jack Thornburgh (Tampa Bay Water)

Chemical use is a large portion of the operational cost of surface water treatment plants and depends heavily on influent water quality. Understanding the impact of influent water quality on chemical use is critical to: (a) identify a range of favorable water quality conditions for operation; and (b) build a real-time predictive model that forecasts water quality and chemical use. Such a predictive model can serve as a decision support tool for surface water treatment plants. This study presents an exploratory data analysis (EDA) using SCADA data and chemical use data for a regional water supply utility, Tampa Bay Water, located on the west coast of Florida, United States. Water quality data, including pH, color, turbidity, and Total Organic Carbon (TOC), were collected from the SCADA system for the last five years (2014 – 2018). A range of chemicals, including sulfuric acid, ferric sulfate, and liquid oxygen, are analyzed. The EDA reveals that TOC and color are the two primary water quality parameters dominating chemical use in the treatment process. In addition, the potential impact of extreme hydrologic events, e.g., heavy rainfall, is also examined. This study is an important step towards building a real-time decision tool for forecasting water quality and chemical use.

Five years of innovation - Student perspectives of the National Water Center Innovators Program: Summer Institute
Catherine Finkenbiner (Oregon State University)
Co-Authors: Kyla Semmendinger (Cornell University); Danielle Tijerina (CUAHSI)

The National Water Center Innovators Program: Summer Institute is a seven-week residential program where graduate students and Theme Leaders (i.e., senior academic faculty and federal scientists) conduct collaborative research projects that involve rapid prototyping of new ideas focused on the NOAA National Water Model (NWM). The Summer Institute, jointly sponsored by the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) and the National Weather Service, fosters an environment for innovation and facilitates the engagement of students from many universities to exchange ideas and advance model formulation and function. The inaugural Summer Institute, initially named the National Flood Interoperability Experiment (NFIE), was held in 2015. For the past five years, themes were selected based on the research and development focus of the NWM. The 2019 Summer Institute was held from June 10th - July 24th, had eighteen graduate student researchers, and was led by five Theme Leaders, all with different research interests and expertise. Over a two-week intensive training period, the graduate students formed teams and structured research projects and questions while collaborating with Theme Leaders, federal agencies, and National Water Center staff. The students spent the following five weeks executing projects encompassing the 2019 themes of modeling near-coastal dynamics, scaling hydrologic and hydraulic models from small basins to regional watersheds, and hydroinformatic expansions on the previously mentioned topics. Beyond the research component of the Summer Institute, the students were exposed to educational and networking opportunities available to them at the National Water Center.

Here we present the measurable outcomes from the last five years of the Summer Institute, including the number of student participants and partnerships with universities and organizations, as well as soft outcomes such as networking opportunities, experiential learning, friendships, and personal accounts. Each of these outcomes fulfills the mission of the program: “to develop an innovation incubator where graduate students from many universities can exchange ideas and advance concepts that, although developed over a short timeframe and smaller study areas, are illustrative of issues that affect the functions of the NWM.”

From flood forecast to flood impact
Chris Edwards (Brigham Young University)
Co-Author: Jim Nelson (Brigham Young University)

As flood frequency and flood duration increase on a global scale, more effective tools to prepare for future disasters are needed. A new approach is needed to provide essential flood mapping information sufficient to issue warnings that can improve response before an event and recovery after it. Using the Group on Earth Observations (GEO) Global Water Sustainability (GEOGloWS) global streamflow forecast services and the Height Above Nearest Drainage (HAND) method, we have piloted a solution that can help provide flood inundation maps that lead to estimates of flood impact. This approach is free and easy to use so that it is accessible to anyone who needs the information. It is open source so that users can easily view, extend, or use the model for their own purposes.
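
The core of HAND-based mapping is a simple per-cell comparison: a cell is inundated when the water stage in its nearest drainage exceeds the cell's height above that drainage. The sketch below shows that comparison under the assumption that a stage has already been derived from the forecast discharge; the function name `inundation_depth` and the small grid are hypothetical and do not represent the authors' implementation.

```python
def inundation_depth(hand_m, stage_m):
    """Water depth per cell: stage minus HAND where positive, otherwise 0.

    hand_m  : 2D list of Height Above Nearest Drainage values in metres
    stage_m : water stage in the nearest drainage, in metres (assumed to be
              derived beforehand from the forecast discharge)
    """
    return [[max(stage_m - h, 0.0) for h in row] for row in hand_m]


# Hypothetical 2 x 3 HAND raster for a single river reach.
hand = [
    [0.0, 1.5, 4.0],
    [0.5, 2.0, 6.0],
]
depth = inundation_depth(hand, stage_m=2.0)
flooded_cells = sum(d > 0 for row in depth for d in row)
```

Because the HAND raster is computed once from terrain data, producing a new map for each forecast reduces to this inexpensive subtraction, which is what makes the method practical over large areas.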

GEOGloWS global streamflow forecast
Kyler Ashby (Brigham Young University)
Co-Author: Jim Nelson (Brigham Young University)

The vision of the Group on Earth Observations (GEO) is to realize a future where decisions and actions, for the benefit of humankind, are informed by coordinated, comprehensive, and sustained Earth Observation information and services. The GEO Global Water Sustainability Initiative (GEOGloWS) is working to provide relevant, actionable information about water that promotes the use of earth observations while strengthening observational networks in local operational frameworks. By leveraging advances in earth observations, numerical weather predictions, supercomputing, hydrologic modeling, cloud services, and big data hydroinformatics, the GEOGloWS partnership has established a service that provides streamflow forecasts on every river in the world. The service naturally brings together the relevant disciplines necessary to create a holistic, earth-systems approach to providing actionable water resources information to regional, national, and local decision-makers. This system is capable of transforming the way we understand, analyze, and solve critical problems that require streamflow information.

Geomorpho90m: Global high-resolution topographic variables for environmental modeling
Giuseppe Amatulli (Yale University)

Topographical relief involves the vertical and horizontal variation of the Earth’s terrain, and it drives processes in hydrology, climatology, geography, and ecology. Its assessment and characterization are fundamental for various types of modeling and simulation analysis. In this regard, the Multi-Error-Removed Improved Terrain (MERIT) Digital Elevation Model (DEM) currently provides the best high-resolution DEM globally available, at a 3 arc-second resolution (90 m), due to the removal of multiple error components from the underlying SRTM3 and AW3D30 DEMs. To depict topographical variations worldwide, we developed a new dataset comprising different terrain features derived from the MERIT-DEM. The fully standardized topographical variables consist of slope, aspect, eastness, northness, roughness, terrain roughness index, vector ruggedness measure, topographic position index, stream power index, convergence, profile/tangential curvature, first/second order partial derivative, and 10 geomorphological landform classes with their parameter features (intensity, exposition, range, variance, elongation, azimuth, extend and width). To assess how potential errors in the MERIT-DEM affect the derived topographic variables, we compared our results with the same variables derived from the National Elevation Dataset (NED), which is the best-available gridded elevation dataset for the United States. We compared the two data sources by calculating the first order derivative (i.e., rate of change through space measured in degrees) of the difference between a MERIT-derived and a NED-derived topographic variable. All newly created topographic variables are readily available at a 3 arc-second resolution for use as input data in various environmental models and analyses in the fields of geography, geology, hydrology, ecology, and biogeography.

HydroEval: A web interface for performance evaluation of hydrologic model simulations
Navid Jadidoleslam (University of Iowa)
Co-Authors: Radoslaw Goska (University of Iowa); Witold F. Krajewski (University of Iowa); Ricardo Mantilla (University of Iowa)

Assessment of hydrologic model performance in space and time is challenging due to large data sizes and the multitude of observations and hydrologic models. To address this challenge, the authors developed an integrated web interface for visualizing observed and simulated time series and analyzing model evaluations. The web-based interactive platform enables users to evaluate the performance of model simulations for any time period and to export summary reports. The results can also be visualized as maps in the georeferenced context of relevant factors. Examples of model performance evaluations are given for soil moisture and streamflow data. The simplicity and modularity of the developed platform facilitate the addition of new datasets for evaluation with minimal effort or expertise in informatics.
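
The abstract does not list which performance metrics the platform reports. As one widely used example of the kind of score such an evaluation might include, the sketch below computes the Nash-Sutcliffe efficiency (NSE) for a pair of observed and simulated series; the choice of metric and the sample values are assumptions for illustration only.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared model error
    to the squared deviation of observations from their own mean.

    A value of 1 means a perfect match; 0 means the simulation is no better
    than always predicting the observed mean.
    """
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


# Hypothetical streamflow values (m^3/s) for four time steps.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.0, 2.0, 3.0, 5.0]
score = nash_sutcliffe(obs, sim)
```

Evaluating a score like this over a user-selected time window, per site, is what would allow results to be summarized both as time series and as maps.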

Input parameterization and database development for riparian model in glaciated landscape
Marzia Tamanna (University of Rhode Island)
Co-Authors: Soni M. Pradhanang (University of Rhode Island); Arthur Gold (University of Rhode Island); Philippe Vidon (SUNY College of Environmental Science and Forestry); Kelly Addy (University of Rhode Island)

Riparian zones are widely used as best management practices (BMPs) to mitigate the impact of agriculture on the quality of our waters. Existing tools such as the Riparian Ecosystem Management Model (REMM) evaluate the effectiveness of the buffering capacity of riparian zones in mitigating Nitrogen losses to streams. REMM requires a considerable amount of site-specific information to parameterize and run, which often leads users to rely on default parameters. REMM simulations are also not bounded by maximum or minimum values, which can lead to unrealistic simulation results if the model is poorly parameterized or not properly validated. This study focuses on the glaciated settings of the US Northeast and US Midwest and provides an additional way to validate REMM results by comparing REMM estimates of riparian functions with respect to Nitrogen and Phosphorus with those collected in the field. We developed a site-specific input database for a number of riparian sites based on information from hundreds of published studies on riparian zones. The product of this study will help managers and landowners make riparian zone management decisions by placing, restoring, and protecting riparian zones more effectively as BMPs for water quality protection.

Integrating hydrologic modeling web services with online data sharing to prepare, store, and execute models in hydrology
Tian Gan (Utah State University)
Co-Authors: David Tarboton (Utah State University); Tseganeh Gichamo (Utah State University)

Web-based apps, web services, and online data and model sharing technology are becoming increasingly available to support hydrologic research. This promises benefits in terms of collaboration, computer platform independence, and reproducibility of modeling workflows and results. In this research, we designed an approach that integrates hydrologic modeling web services with online data sharing systems to support reproducible hydrologic modeling work. The general architecture design consists of a user interface layer, a data service layer, and a data storage layer. As a case study, we implemented this approach with two example systems: HydroDS, a system of modeling web services for input data preparation and execution of a snow model, and HydroShare, a hydrologic information system for sharing hydrologic data, models, and analysis tools. We developed a web app to serve as the user interface layer and simplify the use of HydroDS. HydroDS acts as the data service layer, receiving web requests from the web app to process the data and execute the model. HydroShare is the data storage layer that supports storage and sharing of the results generated by HydroDS. Snowmelt modeling for a test watershed in the Colorado River Basin served as a use case to validate the integration of the functionality implemented. We demonstrate that, with the resulting system, users can prepare model inputs or execute the model through the web app without writing code. The model input/output files and associated metadata are stored and shared in HydroShare. These files include a Python script that is automatically generated by HydroDS to help document and reproduce the model input preparation workflow. Once the work is stored in HydroShare, researchers can share or formally publish it so that others can directly discover, access, repeat, or modify the modeling work in HydroShare. This approach contributes to the reuse of open source cyberinfrastructure to support web-based simulation, which improves research reproducibility and the usability of hydrologic modeling web services. Moreover, the three-layer architecture can be adopted or adapted to integrate other hydrologic modeling web services with data sharing systems like HydroShare, to better serve the hydrologic science community in solving critical water issues.

Interactive data management tools in Jupyter notebooks and Tethys Platform
Scott Christensen (U.S. Army Engineer Research and Development Center)
Co-Author: Marvin S. Brown (U.S. Army Engineer Research and Development Center)

The US Army Corps of Engineers relies on numerical simulations to support design, planning, and analysis tasks involving many aspects of the environment. A common element in these extensive modeling efforts is the need for data management. As part of a team seeking to streamline hydrodynamic modeling workflows, we have created a set of tools that support various data management tasks including discovery, retrieval, organization, manipulation, and archival. Quest, an extensible Python library, provides an Application Programming Interface (API) that facilitates automating these data management tasks. Quest supports many commonly used data sources including NASA, NOAA, and the USGS, but is also capable of dynamically adding data sources through plugins. This flexibility enables Army Corps researchers to access local data repositories and public data services using the same tools. In addition to Quest as the core library, we have created additional tools that leverage Quest’s data automation capabilities to provide interfaces for users to interactively explore and visualize their data. Quest is designed to integrate with the PyViz tool stack so that it can be used in Jupyter notebooks with interactive widgets and plots. We have also created a web application using Tethys Platform that provides a map-based interface for the Quest API. These tools enable an interactive user experience when searching for and retrieving environmental data needed to build critical environmental models and workflows. In this presentation, we demonstrate the capabilities of these tools by showing several example workflows using both Jupyter notebooks and the Quest Tethys App.

Leveraging global and local data sources for flood hazard mitigation: An application of machine learning to Manila
Masahiko Haraguchi (Harvard University)

The objective of this paper is to illustrate how developing countries with limited datasets and capacity can utilize global hazard data to support risk-informed decision-making at the local level. Using urban hydrologic models for flood risk assessment requires intensive data collection for model calibration, and even after such an effort considerable uncertainty remains for spatially specific risk assessment. The case study in this paper examines flooding events in Metro Manila, which routinely experiences such disasters. We explore whether relationships between flood occurrence in Manila and remotely sensed environmental data, such as rainfall amounts and vegetation moisture, could be established using machine learning techniques such as visualization, decision trees, and logistic regression. The study demonstrated that a model that uses an appropriate rainfall index is better than one that uses only daily rainfall amounts or adds a vegetation index. The best predictive models are found to be (i) one that uses rainfall type and rainfall amount and (ii) one that integrates all the information, including rainfall amount, rainfall type, and vegetation index. The case study demonstrates that globally available data used with machine learning techniques can be effective for local flood management.

Machine learning and remote sensing to identify temperature and precipitation relationships to rainfed wheat productivity
Daniyal Hassan (University of Utah)
Co-Authors: Ryan Johnson (University of Utah); Steven J. Burian (University of Utah)

Agricultural production is the most important sector in developing nations, where food and fiber prerequisites determine national security, economic growth, and stability. Due to the prohibitive cost of developing extensive irrigation networks, many of these regions depend exclusively on seasonal precipitation to satisfy crop water requirements. As part of the water, energy, food (WEF) nexus, we evaluate growing season precipitation and temperature influences on wheat productivity in the Potohar Plateau, a rainfed agricultural region in northeastern Pakistan. NASA TRMM, GPM, and GLDAS 3-hr remote sensing data overcome sparse data availability and provide the necessary resolution to evaluate meteorological-agricultural feedbacks. Using Scikit-learn machine learning algorithms, over 64 meteorological parameters are evaluated for relationship strength to wheat yield productivity. Our results show January to March cumulative precipitation, February to March cumulative precipitation, February minimum temperature, February maximum consecutive drought duration, and February mean consecutive drought duration to be the most influential meteorological parameters on yield productivity, with feature importances of 20%, 12.5%, 10.7%, 4%, and 3.5%, respectively. This research indicates that the region's wheat shooting and heading phases are especially sensitive to February drought and low temperatures, particularly sub-freezing temperatures. This methodology sets the foundation for globally identifying climate-rainfed agricultural yield relationships, identifying rainfed agricultural regions most vulnerable to climate change, and informing decisions on infrastructure resource allocation.
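
As an illustration of how drought-duration predictors of this kind might be derived from a daily precipitation series, the sketch below computes the maximum and mean length of consecutive dry-day runs. The 1 mm dry-day threshold, the function name, and the sample values are assumptions made for illustration; the abstract does not state how drought days were defined.

```python
from itertools import groupby


def dry_spell_stats(daily_precip_mm, dry_threshold_mm=1.0):
    """Return (max, mean) length of consecutive dry-day runs.

    A day counts as dry when precipitation is below dry_threshold_mm.
    This definition is an assumption, not taken from the study.
    """
    is_dry = [p < dry_threshold_mm for p in daily_precip_mm]
    # Length of every run of consecutive dry days
    runs = [sum(1 for _ in group) for dry, group in groupby(is_dry) if dry]
    if not runs:
        return 0, 0.0
    return max(runs), sum(runs) / len(runs)


# Hypothetical daily precipitation values (mm)
sample_precip = [0.0, 0.0, 5.2, 0.0, 0.0, 0.0, 12.1, 0.3, 0.0, 2.0]
longest, average = dry_spell_stats(sample_precip)
print(longest, average)
```

Features computed this way for each growing season could then be passed, together with yield records, to any estimator that reports feature importance.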

Modelling groundwater recharge with multiple climate models in machine learning frameworks
Kevin Achieng (University of Wyoming)

Groundwater supplies 70% of global irrigation water needs and 25% of total freshwater consumption in the United States, and it is a source of safe drinking water for 90% of the rural population of the United States. Climate models are increasingly being used to simulate groundwater recharge. However, these climate models often have uncertainty in their recharge predictions. This uncertainty stems from differences in the models' structure, parameters, and physics. In this study, ten regional climate models (RCMs) are used to model groundwater recharge. The RCMs used in this study were obtained from the North American Regional Climate Change Assessment Program (NARCCAP). To combat the uncertainty in the RCMs' recharge predictions, the predictions are averaged in machine learning frameworks. The machine learning models used in this study include artificial neural network (ANN), deep neural network (DNN), and support vector regression (SVR) models. Results suggest that the radial basis function-based SVR model was the superior model for modelling recharge.

Modelling coastal hydrodynamics under climate change impacts
Fateme Yousefi Lalimi (Arizona State University and Northern Arizona University)
Co-Author: Ian Walker (Arizona State University)

Coastal zones are transitional zones between ocean and land and are among the most complex systems on earth. The stability of these environments is threatened by climate change impacts such as sea level rise and the increasing frequency and magnitude of storm events. In this work, we use long-term time series of wave climate data and employ an open-source model to simulate coastal hydrodynamics influencing coastal dunes in Arcata, CA, which is experiencing some of the highest rates of relative sea level rise in California. To validate the model, we apply it to existing vegetation characteristics and elevation data obtained from cross-shore surveys along the shoreline. Then, we evaluate the amount of erosion and deposition of the system controlled by hydrodynamics and vegetation processes under future scenarios of sea level rise and increasing storminess.

Multiscale routing – integrating fine scale fuels treatments into a watershed ecohydrologic model
William Burke (University of California, Santa Barbara)
Co-Author: Naomi Tague (University of California, Santa Barbara)

Integrating model processes across scales is central to the goal of making models physically realistic while still useful and accessible to managers and decision-makers who operate at watershed or regional scales. Current development of the Regional Hydro-Ecologic Simulation System (RHESSys) has focused on the inclusion of fire and fuels treatments. Management actions, such as fuel treatments, as well as fire effects, especially for moderate severity fires, occur at the scale of individual trees, and the subsequent modification to stand structures can lead to watershed-scale effects. A key challenge is how to incorporate these tree-scale changes into watershed scale models that can be used to assess impacts on streamflow, groundwater recharge, and landscape-scale carbon and nutrient cycles. RHESSys, similar to other watershed scale ecohydrology models, is typically constrained by computation and data to resolutions of ~30 m that preclude explicit modeling of tree-scale interactions. To bridge this gap between tree scale effects and watershed processes, we have developed multiscale routing. Multiscale routing adds multiple non-spatial units for each of the smallest spatial model units, and subsequently allows for the transfer of water and nutrients between those non-spatial units. This approach enables the simulation of processes and fluxes at scales smaller than RHESSys has traditionally been able to model, with a relatively light increase in parameters and computational load. This broader architecture also has a variety of potential future uses, including more accurate and effective upscaling of fine-scale remote sensing and the ability to characterize the complex routing that occurs at fine scales in urban areas. Multiscale routing enables us to better characterize fine scale processes without sacrificing the broader versatility of the RHESSys model or the ability to produce model outputs at scales useful to decision makers.

On the application of data mining self-organizing map (SOM) in meteorological drought clustering
Arash Modaresi Rad (Boise State University)
Co-Author: Mojtaba Sadegh (Boise State University)

Appropriateness of optimum clusters of raingauge stations is investigated for spatiotemporal meteorological drought applications. The study area is the semi-arid Karkheh watershed (western Iran), represented by 16 raingauge stations with 41 years of data. The Kendall-tau and Mann-Kendall tests detected dependency and a significant temporal trend in the annual rainfall data; therefore, stations with data dependency were removed from our analysis. A boxplot approach identified rainy months for SPI drought analysis (time scales of 3, 6, 9, and 12 months). The self-organizing map (SOM) is a popular unsupervised neural network model for the analysis of high-dimensional patterns in data mining applications, and it has been used in modeling rainfall-runoff processes, forecasting flood events, and identifying homogeneous regions. SOM was used to identify homogeneous regions of drought, and several cluster validation techniques, such as the Silhouette, C-index, Davies and Bouldin index, and Calinski and Harabasz index, were used to find the partitioning that best fits the underlying data. Furthermore, these indices were used to find the optimum number of clusters and to ensure the maximum homogeneity within each cluster. Cluster appropriateness for spatiotemporal drought applications was based on the achievement of relative coherency observed among SPI time series graphs of member stations. According to the results, all time scales produced non-coherent clusters, in which at least one raingauge station had different SPI time series behavior compared to the other cluster members. This research has shown that identification of cluster formation and the optimum number of clusters based on validation processes requires checking for relative coherency behavior prior to spatiotemporal meteorological drought applications.

Parameter Based Ensemble Solutions for the NWM
Austin Raney (University of Alabama)
Co-Authors: Iman Maghami (University of Virginia), Yenchia Feng (Stanford University), and David Blodgett (U.S. Geological Survey)

The process of hydrological model calibration is often burdensome and time consuming, yet calibrated models still often underrepresent extreme hydrologic events such as floods and droughts. With the advent of forecasting hydrological models, it has become increasingly important to best represent both average and outlying events. A common approach to improve model representation across the gamut of hydrological events includes creating ensemble outputs by perturbing forcing inputs to the model. However, this approach does not bode well when using an uncalibrated model. Currently, 100% of the National Water Model version 2 (NWM) channel parameters remain uncalibrated due to the processing time and disk space required to calibrate such a model at a continental scale. This may also be because of the new complex channel formulation added in version 2. Additionally, in NWM calibrated domains only a handful of model parameters are used during the calibration process. To improve upon these issues, in this study we hypothesize that by perturbing NWM parameter spaces and generating numerous model outputs, an ensemble of outputs will more accurately represent outlying hydrologic events (e.g., floods and droughts) than the current formulation. We test our hypothesis on the channel parameters that dictate the characteristics of the idealized trapezoidal channel present in the NWM. A Python framework for deploying and tracking numerous Docker containers running instances of the NWM was built to make this task possible. This framework allows a user to simply supply the input files required to run the NWM, and the framework will spin up and execute NWM runs. In doing so, we hope to demonstrate a candidate solution that could be implemented operationally and that more accurately represents outlying hydrological events than the current operational model.
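
The idea of a parameter-perturbation ensemble for an idealized trapezoidal channel can be sketched as follows. This is not the NWM code or the authors' Docker framework: it simply perturbs three channel parameters and evaluates Manning's equation for steady uniform flow for each ensemble member. The parameter names, base values, and the uniform multiplicative perturbation of ±20% are all assumptions chosen for illustration.

```python
import math
import random


def manning_discharge(depth, bottom_width, side_slope, n, bed_slope):
    """Discharge (m^3/s) in a trapezoidal channel from Manning's equation (SI units).

    side_slope is the horizontal run per unit vertical rise (z in z:1).
    """
    area = (bottom_width + side_slope * depth) * depth
    wetted_perimeter = bottom_width + 2.0 * depth * math.sqrt(1.0 + side_slope ** 2)
    hydraulic_radius = area / wetted_perimeter
    return (1.0 / n) * area * hydraulic_radius ** (2.0 / 3.0) * math.sqrt(bed_slope)


def perturb(params, rel_spread, rng):
    """Scale every parameter by a random factor in [1 - rel_spread, 1 + rel_spread]."""
    return {k: v * rng.uniform(1.0 - rel_spread, 1.0 + rel_spread) for k, v in params.items()}


# Hypothetical base channel parameters
base = {"bottom_width": 10.0, "side_slope": 2.0, "n": 0.035}

rng = random.Random(42)  # fixed seed so the ensemble is reproducible
ensemble = [perturb(base, 0.2, rng) for _ in range(50)]
flows = [manning_discharge(depth=1.5, bed_slope=0.001, **member) for member in ensemble]
print(min(flows), max(flows))
```

The spread between the smallest and largest member gives a first impression of how sensitive the simulated discharge is to the channel geometry and roughness at a fixed depth.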

Physics-guided deep transfer learning: An application in predicting lake temperature
Jared Willard (University of Minnesota)
Co-Authors: Xiaowei Jia (University of Minnesota); Anuj Karpatne (Virginia Tech); Jordan S. Read (U.S. Geological Survey); Jacob Zwart (U.S. Geological Survey); Alison Appling (U.S. Geological Survey); Michael Steinbach (University of Minnesota); Paul Hanson (University of Wisconsin); Vipin Kumar (University of Minnesota)

Current machine learning models to predict lake temperature are often limited by both a paucity of labeled data and the inability to output values consistent with the laws of physics. Though the recent deluge of sensor data in water resources has created new opportunities for knowledge discovery with the use of deep learning, many lakes still have little to no observation data and have thus been inaccessible for traditional deep learning models. This paper introduces a physics-guided deep transfer learning (PGDTL) framework that uses data from nearby lakes with similar depth profiles and limnological properties to predict a given lake's temperature using no direct observations from that lake. This framework also leverages natural science domain knowledge to output solutions physically consistent with conservation laws and properties of water density, while also reducing the error relative to more traditional deep learning frameworks such as the simple Long Short-Term Memory (LSTM). We find that PGDTL can outperform the state-of-the-art calibrated General Lake Model (GLM) without using any labeled data from the lake itself.

Prediction of road salt on suburban watershed areas using Artificial Neural Networks (ANN) and analysis of its impact
Khurshid Jahan (University of Rhode Island)
Co-Author: Soni M. Pradhanang (University of Rhode Island)

Road salts in stormwater runoff in both urban and suburban areas are of concern to many urban and suburban dwellers. The chloride-based de-icers, i.e., sodium chloride (NaCl), magnesium chloride (MgCl₂), and calcium chloride (CaCl₂), dissolve in stormwater runoff and also percolate into soils during storage and transport. Event-based stormwater runoff was used to predict the chloride concentration and its seasonal changes at different sites in a suburban watershed. Water quality data for 34 rainfall events (2016-2019) greater than 0.5 inch were used in this study. The Artificial Neural Network (ANN) model was developed using rainfall amount, turbidity, total suspended solids (TSS), dissolved organic carbon (DOC), sodium, chloride, and total nitrate. The model was trained using the Levenberg-Marquardt backpropagation algorithm. The model was applied at six different sites within a small area to predict the seasonal change. Sensitivity analyses were conducted to determine the influence of the input variables on the dependent variable and to check the uncertainty of the study results. The study aims to develop the model, predict the seasonal changes in chloride concentration, and assess its impact on roadside areas.

Process-guided deep learning for water resources predictions: a case study with lake water temperatures
Jordan S. Read (U.S. Geological Survey)
Co-Authors: Xiaowei Jia (University of Minnesota); Jared Willard (University of Minnesota); Alison P. Appling (U.S. Geological Survey); Jacob A. Zwart (U.S. Geological Survey); Samantha K. Oliver (U.S. Geological Survey); Anuj Karpatne (Virginia Tech); Gretchen J.A. Hansen (University of Minnesota); Paul C. Hanson (University of Wisconsin-Madison); William Watkins (U.S. Geological Survey); Michael Steinbach (University of Minnesota); Vipin Kumar (University of Minnesota)

The recent growth in environmental data is staggering, and many opportunities exist to direct water resources data and computing resources towards answering societally relevant questions. Data growth and methodological advances have introduced deep learning techniques that improve prediction accuracy and can aid scientific discovery. Hybrid models that integrate theory with state-of-the-art empirical techniques have the potential to improve predictions while remaining true to physical laws. We designed a Process-Guided Deep Learning (PGDL) hybrid model to predict water temperature. The PGDL has three primary components: a deep learning model with temporal awareness, theory-based feedbacks (model penalties for violating conservation of energy), and model pre-training to initialize the network with water temperature predictions from a process-based model. We evaluated the PGDL model performance compared to a deep learning (DL) and a process-based (PB) model in conditions where training data were sparse and when predictions were made outside of the range in the training dataset. The PGDL model performance was superior for all modeling conditions (when ten or more temperature observation dates were available) compared to PB and DL models. The PGDL model also performed well when applied to temperature predictions for sixty-eight lakes in the Midwest U.S., with a median RMSE of 1.21°C during the test period (range of 0.78° to 2.39°C). This case study demonstrates that integration of scientific knowledge into deep learning tools shows promise for improving predictions of many societally relevant environmental variables.
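
The theory-based feedback described above can be pictured as an extra penalty term added to the training loss. The sketch below is a simplified illustration and not the authors' implementation: the energy-balance inputs (`lake_energy`, `net_flux`), the tolerance, the weight `lam`, and the sample numbers are all assumed for the example.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error over time steps that have an observation (None = missing)."""
    pairs = [(p, o) for p, o in zip(pred, obs) if o is not None]
    return math.sqrt(sum((p - o) ** 2 for p, o in pairs) / len(pairs))


def energy_penalty(lake_energy, net_flux, dt, tolerance=0.0):
    """Mean violation of dE/dt = net_flux between consecutive time steps."""
    violations = []
    for t in range(len(lake_energy) - 1):
        d_energy = (lake_energy[t + 1] - lake_energy[t]) / dt
        # Only imbalances larger than the tolerance are penalized
        violations.append(max(0.0, abs(d_energy - net_flux[t]) - tolerance))
    return sum(violations) / len(violations)


def process_guided_loss(pred, obs, lake_energy, net_flux, dt, lam=0.1):
    """Supervised error plus a weighted energy-conservation penalty."""
    return rmse(pred, obs) + lam * energy_penalty(lake_energy, net_flux, dt)


# Hypothetical values: three time steps, one missing observation
loss = process_guided_loss(
    pred=[10.0, 11.0, 12.0],
    obs=[10.0, None, 13.0],
    lake_energy=[100.0, 110.0, 125.0],
    net_flux=[10.0, 10.0],
    dt=1.0,
)
print(loss)
```

Because the penalty needs no temperature observations, a term of this kind can also be evaluated on unlabeled time steps, which is one reason such constraints are attractive when training data are sparse.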

Real time monitoring of surface waters using digital repeat photography by a large network of cameras
Bijan Seyednasrollah (Northern Arizona University and Harvard University)
Co-Author: Andrew D. Richardson

Global water, carbon, and energy cycles largely depend on the interactions between surface water, vegetation, and atmosphere. However, extensive, continuous, and accurate in-situ observations at high temporal resolution are scarce and often not available in real time. This can be addressed by monitoring wetlands using digital repeat photography and real time processing. In this technique, a camera (often known as a phenological camera or PhenoCam) is mounted near the surface to capture images at high frequency with the goal of extracting quantitative metrics to study the ecosystems. We introduce a large network of cameras, named the PhenoCam Network, that collects real time observations using near surface remote sensing from nearly 60 wetlands, including freshwater and saltwater ecosystems. With an emphasis on North American ecosystems, the aquatic cameras are distributed globally from 4°N to 68°N latitude and from 11°E to 150°W longitude. The cameras capture and upload visible range and near infrared images at 15-minute or 30-minute intervals to our central server, where the images are processed and analyzed nightly and the generated time series become freely available in near real time. We process images by delineating a region of interest for each camera and then extracting color-based statistics from each image. This extensive observatory network has produced more than 130 site-years of high temporal resolution data from aquatic ecosystems, which are a valuable resource for studying ecosystem processes across scales. While the PhenoCam network and datasets are extensively used to study the seasonal cycles of terrestrial vegetation, we would like to encourage the hydrology community to utilize this dataset as a source of in situ observations for studying aquatic ecosystems and surface water processes. As the images are publicly available under a Creative Commons (CC) license, we believe the image archive can be used to quantify a wide range of metrics such as water level, snow depth, and spatial distribution of solar radiation, in addition to aquatic vegetation seasonal cycles.
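
The step of extracting color-based statistics from a region of interest can be illustrated with the green chromatic coordinate, G / (R + G + B), a statistic commonly used with repeat photography. The sketch below is a minimal illustration and not the network's processing code; the image representation (rows of RGB tuples), the boolean mask, and the sample pixels are assumptions.

```python
def mean_gcc(image, roi_mask):
    """Mean green chromatic coordinate, G / (R + G + B), over region-of-interest pixels.

    image: rows of (R, G, B) tuples; roi_mask: rows of booleans of the same shape.
    Pixels with R + G + B == 0 are skipped to avoid division by zero.
    """
    values = []
    for image_row, mask_row in zip(image, roi_mask):
        for (r, g, b), inside in zip(image_row, mask_row):
            total = r + g + b
            if inside and total > 0:
                values.append(g / total)
    return sum(values) / len(values) if values else float("nan")


# Hypothetical 2 x 2 image and region-of-interest mask
sample_image = [[(50, 100, 50), (10, 10, 10)],
                [(0, 0, 0), (60, 120, 20)]]
sample_mask = [[True, False],
               [True, True]]
print(mean_gcc(sample_image, sample_mask))
```

Applying the same function to every image from a camera yields a time series from which seasonal cycles can be read.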

Simplifying the development and deployment of water resources web applications using Tethys platform
Rohit Khattar (Brigham Young University)
Co-Author: Daniel P. Ames (Brigham Young University)

From the dawn of the internet, water resources data scientists and engineers have engaged with the challenge of developing and deploying interactive water data visualization and analysis web sites. This challenge is characterized by ever-changing internet technologies, new and endlessly varying programming languages and libraries, rapidly growing datasets, and increasingly complex analytical and modeling techniques. It is likely that such challenges will exist for many future generations of hydroinformaticists. Tethys Platform for water resources web apps has been developed to help with this challenge. This platform combines a number of key visualization and data management technologies within a Django-based Python programming environment that simplifies deploying GIS-enabled water resources web apps. The system provides developers and users with an app portal, not entirely unlike the app paradigm that is common on tablets and mobile phones, where each app is developed, tested, deployed, and operated independently of other apps in the same portal. This presentation will give an architectural overview of the free and open source Tethys Platform and will illustrate the capabilities of the framework.

Topographically conditioned parameter grids for mechanistic, statistical, and machine learning hydrologic models
Theodore B. Barnhart (U.S. Geological Survey)
Co-Authors: Roy Sando (U.S. Geological Survey); Seth Siefken (U.S. Geological Survey); Peter McCarthy (U.S. Geological Survey)

Mechanistic, statistical, and machine learning models of hydrologic systems require parameters representative of conditions on the land surface. Often, parameters are computed for hydrologic modeling units or the area upstream of a forecast point using a delineated boundary and zonal statistics. We introduce a toolset to compute parameter grids that are conditioned by the underlying topography. The primary benefit of using the toolset is that it eliminates the need for repeated delineation of modeling units, drastically reducing the time needed to generate parameters. A topographically conditioned parameter grid is generated by first deriving flow direction and upstream accumulated surface area grids from a digital elevation model. Second, the parameter of interest is accumulated following the flow direction grid generated in the previous step. The accumulated parameter grid is then normalized by the upstream accumulated surface area grid resulting in a grid where each cell holds the mean upstream value of the parameter. The calculation of the final grid can be altered to account for missing data in the parameter grid of interest. The topographically conditioned parameter grid can then be queried to train a machine learning model, parameterize a mechanistic hydrologic model, be used as a basin characteristic for statistical models of high and low flows, or be used to drive Web applications that serve hydrologic predictions for user-specified locations. The toolset currently works for parameters such as aspect and slope; however, it can also be used on input data sets such as precipitation and air temperature. Work is ongoing to extend the toolset to more complex parameters, such as longest flow path, for statistical modeling.
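
The accumulate-then-normalize procedure described above can be sketched on a tiny grid. This is not the authors' toolset: it assumes equal-area cells (so upstream area is counted in cells), represents the flow direction grid as the (row, column) index of each cell's downstream neighbor, and includes each cell in its own upstream total. All names and sample values are hypothetical.

```python
from collections import deque


def accumulate(values, flow_dir):
    """Accumulate values downstream along a flow-direction grid.

    flow_dir[r][c] is the (row, col) of the cell that (r, c) drains to,
    or None at an outlet. Each cell's total includes its own value.
    """
    rows, cols = len(values), len(values[0])
    total = [row[:] for row in values]
    inflows = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if flow_dir[r][c] is not None:
                tr, tc = flow_dir[r][c]
                inflows[tr][tc] += 1
    # Start at cells with no upstream neighbors and work downstream
    queue = deque((r, c) for r in range(rows) for c in range(cols) if inflows[r][c] == 0)
    while queue:
        r, c = queue.popleft()
        target = flow_dir[r][c]
        if target is None:
            continue
        tr, tc = target
        total[tr][tc] += total[r][c]
        inflows[tr][tc] -= 1
        if inflows[tr][tc] == 0:
            queue.append((tr, tc))
    return total


def upstream_mean(param, flow_dir):
    """Mean upstream parameter value per cell, assuming equal-area cells."""
    ones = [[1] * len(row) for row in param]
    acc_param = accumulate(param, flow_dir)
    acc_cells = accumulate(ones, flow_dir)
    return [[p / n for p, n in zip(p_row, n_row)] for p_row, n_row in zip(acc_param, acc_cells)]


# Hypothetical 2 x 2 grid: every cell eventually drains to the outlet at (1, 1)
sample_flow_dir = [[(0, 1), (1, 1)],
                   [(1, 1), None]]
sample_param = [[2.0, 4.0],
                [6.0, 8.0]]
print(upstream_mean(sample_param, sample_flow_dir))
```

Once the grid is built, the mean upstream value for any location is a single cell lookup, which is why no basin delineation is needed at query time.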

Tracking historic land-use change using machine learning techniques: Random forests and support vector machine
Jiada Li (University of Utah)
Co-Authors: Steven Burian (University of Utah); Carlos Oroza (University of Utah)

The Sugar House area in Salt Lake City has become a denser urban drainage catchment with growing urbanization in recent years. So far, it remains unclear to what extent land use has changed during the urbanization process. To track the historic land-use change of the Sugar House urban catchment, a 10-year series of aerial images from 2007 to 2017 was collected from Google Earth. Considering the imperviousness percentage as the indicator of land-use change, this study adopted two machine learning classification algorithms, random forest and support vector machine, to classify pervious and impervious areas. To improve the accuracy of the land-use classification, principal component analysis was incorporated into the classifiers to reduce the dimensionality of the image pixels. The results of this study show that the imperviousness percentage of the Sugar House area increased slightly from 2007 to 2017. The random forest algorithm reported an increase in imperviousness ratio from 48.75% to 50.21%, while the support vector machine technique reported an increase from 51.00% to 52.89%. Future work will center on evaluating the performance of these two machine learning approaches in flooding risk representation.

Tracking long-term lake dynamics with Landsat imagery and topographic/bathymetric datasets
David M. Weekley (University of Kansas)
Co-Author: Xingong Li (University of Kansas)

The spatial and temporal distribution of water is a critical aspect of the environment, with dramatic effects on ecology, economy, and human welfare. While water presence and surface area are often easy to detect using standard remote sensing techniques, water volume calculations require additional information such as water surface elevation, traditionally acquired using in-situ gauge monitoring stations or space-based altimetry satellites, both of which are generally limited to larger lakes and reservoirs. This research uses Google Earth Engine to estimate long-term lake dynamics (surface elevation, surface area, volume, and volume change) for multiple reservoirs using Landsat imagery combined with Lidar and bathymetry data products. The accuracy of water surface elevation estimates was analyzed using a variety of water indices (NDWI, MNDWI, AWEI), segmentation thresholds, water boundaries, and statistics. Additionally, image contamination, such as cloud cover, shadow, snow, and ice, is identified in each image via the Pixel QA band in the Landsat Surface Reflectance data product, which serves to increase the water surface elevation accuracy as well as provide increased temporal resolution by allowing analysis of “imperfect” imagery. Water surface area, volume, and volume change were then calculated using elevation/surface area/volume relationships derived from the combined Lidar/bathymetry data, providing 40+ years of lake dynamic data applicable to a wide variety of fields and interests.
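
The chain from water index to volume can be sketched in a few lines. This is a simplified illustration and not the authors' Google Earth Engine workflow: it uses the McFeeters NDWI, (green − NIR) / (green + NIR), with an assumed threshold of 0, an assumed 900 m² pixel area, and a hypothetical elevation-area curve with strictly increasing area. Storage is obtained by trapezoidal integration of area over elevation.

```python
def ndwi(green, nir):
    """McFeeters NDWI for one pixel: (green - nir) / (green + nir)."""
    return (green - nir) / (green + nir)


def elevation_from_area(area, curve):
    """Linearly interpolate water surface elevation from surface area.

    curve: (elevation_m, area_m2) pairs sorted by elevation, with strictly increasing area.
    """
    for (e0, a0), (e1, a1) in zip(curve, curve[1:]):
        if a0 <= area <= a1:
            return e0 + (e1 - e0) * (area - a0) / (a1 - a0)
    raise ValueError("area outside the range of the curve")


def volume_below(elevation, curve):
    """Storage (m^3) below `elevation` by trapezoidal integration of area over elevation."""
    volume = 0.0
    for (e0, a0), (e1, a1) in zip(curve, curve[1:]):
        if elevation <= e0:
            break
        top = min(elevation, e1)
        a_top = a0 + (a1 - a0) * (top - e0) / (e1 - e0)
        volume += 0.5 * (a0 + a_top) * (top - e0)
    return volume


# Hypothetical (green, nir) reflectance pairs and elevation-area curve
sample_pixels = [(0.30, 0.10), (0.25, 0.05), (0.10, 0.40)]
sample_curve = [(100.0, 0.0), (101.0, 1000.0), (102.0, 3000.0)]
PIXEL_AREA_M2 = 900.0  # assumed 30 m x 30 m pixel

water_area = PIXEL_AREA_M2 * sum(1 for g, n in sample_pixels if ndwi(g, n) > 0.0)
surface_elevation = elevation_from_area(water_area, sample_curve)
print(water_area, surface_elevation, volume_below(surface_elevation, sample_curve))
```

Repeating the calculation for each usable image date gives time series of area, elevation, and volume, and differencing consecutive volumes gives volume change.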

Using LSTM for daily streamflow forecasting
Shiva Shrestha (University of Rhode Island)

Artificial neural network (ANN) models are widely applied to water resources and hydrology problems. Streamflow forecasting is a key tool for water resources planning and management, as it supports flood warning, operation of flood-control reservoirs, determination of river water potential, hydroelectric energy production, and more. There have been many studies applying recurrent neural networks (RNNs) to this problem. Here, we use Long Short-Term Memory (LSTM) for daily streamflow forecasting. LSTM is an RNN architecture that is well suited to many time series problems with which feed-forward networks using fixed-size time windows struggle. LSTMs are suited to time series analysis because there can be lags of unknown duration between important events in a time series. The choice of LSTM over a traditional RNN is motivated by its relative insensitivity to the exploding and vanishing gradient problems encountered when training traditional RNNs.

Using a 3D model to investigate the spatial patterns of algal growth in western Lake Michigan
Bahram Khazaei (University of Wisconsin-Milwaukee)
Co-Authors: Harvey A. Bootsma (University of Wisconsin-Milwaukee); Hector R. Bravo (University of Wisconsin-Milwaukee)

Algal dynamics play an important role in the biogeochemical functioning of coastal areas and inland water bodies. In particular, analysis of water quality samples from the western coast of Lake Michigan has shown that Cladophora is an important component of the phosphorus cycle in the lake. Mussel activity is another driver of phosphorus variability and Cladophora abundance in the nearshore shallow waters. Yet, the spatial distribution of Cladophora and its interactions with other biogeochemical and physical variables of the system have not been studied well. To understand the patterns of Cladophora variation, spatial data on temperature, phosphorus availability, wave conditions, and light are required near the bottom, all of which can be provided by a 3D hydrodynamic model. This highlights the need for a coupled model that can simulate physical and biogeochemical processes simultaneously. Modeling physical processes is a challenging task since it requires intensive and complex numerical computations of the continuity, momentum, and energy equations. Additionally, simultaneous and extensive observations of meteorological variables, as well as sufficient measurements of currents, water temperature, and transported substances, are required to run and validate such models. Numerical modeling of water systems and hydrodynamic processes has improved remarkably with the advancement in computational tools and techniques. A better understanding of the physical processes will result in better estimates of the biogeochemical characteristics of aquatic systems. This study presents a successful example of coupling physical and biogeochemical models in the Lake Michigan nearshore zone around Milwaukee. The physical model is nested into the National Oceanic and Atmospheric Administration (NOAA) Lake Michigan model of the Great Lakes Coastal Forecasting System and is based on the Princeton Ocean Model. The biogeochemical model is developed to simulate the phosphorus cycle in dissolved and particulate forms and comprises Cladophora, mussel, and sediment storage modules. The current model needs a significant amount of observations and data processing to carry out the simulations, including air temperature, dew point temperature, wind, cloud cover, and heat flux data at the surface of the lake for the entire simulation period required for running the model, as well as in situ point observations of currents, water temperature, particulate, dissolved, and soluble reactive phosphorus, mussel density, Cladophora biomass, and Cladophora phosphorus content necessary for model validation. We applied our model to the outflows of the Milwaukee South Shore Wastewater Treatment Plant in Lake Michigan in two different years, in order to help establish phosphorus discharge limits for the plant. Results indicated reasonable agreement between simulated and measured phosphorus concentrations and Cladophora biomass and phosphorus content. Results have also shown that including the mussel module in the modeling framework improved Cladophora simulations.

Water quality assessment in a depleting aquifer based on data acquisition from a comprehensive groundwater monitoring network in Khorasan Razavi, Iran
Seyedehfatemeh Seyedi (Marquette University)
Co-Authors: Hamidreza Khazaei (Ferdowsi University of Mashhad); Seyedehfatemeh Seyedi (Marquette University); Kamran Davari (Ferdowsi University of Mashhad)

The Neyshabur basin is one of the most populated regions in northeastern Iran, yet it is a heavily stressed hydrologic system. The basin has a semi-arid climate, and precipitation has declined considerably over the past 50 years. Over the same period the population has almost doubled, markedly increasing water demand in the basin. Nearly half a million people live in the basin, and their economy depends on agriculture. Surface water supplies less than 15% of the water demand, so groundwater is the main source of water in the basin. Improper management of water resources and a lack of alternatives to existing water-intensive agricultural practices have led to the depletion of the Neyshabur aquifer. However, significant efforts have been made to monitor water levels and water quality parameters across the aquifer. From 1987 to 2015, the water level in 62 wells was observed and recorded as part of a groundwater monitoring program. Additionally, a suite of water quality parameters was measured in or near these wells. Observations showed that the water level in the majority of the wells declined by about 1 m per year over the study period. Multiple water quality parameters, including electrical conductivity (EC), pH, major ion concentrations, total dissolved solids (TDS), total hardness (TH), sodium adsorption ratio (SAR), and sodium percentage (%Na), were examined in some of the most depleted wells. EC, pH, TDS, and TH were used to assess the suitability of groundwater for domestic purposes, while SAR, TDS, and %Na were used to assess its suitability for irrigation. Results indicated that water quality has degraded considerably in the wells with significant groundwater depletion. 
Based on the EC, TDS, and TH of all wells, groundwater in the Neyshabur basin is not suitable for drinking: most of the wells are in critical condition, and a few can be considered for drinking only in case of water shortage. The measured pH was about 8, indicating moderate to high alkalinity of the groundwater. Few of the wells meet the standards for irrigation according to their SAR, TDS, and %Na values. Increased EC and TDS indicate increased salinity and contamination in the groundwater. Additionally, increases in the concentrations of cations such as Ca²⁺, Mg²⁺, Na⁺, and K⁺ and anions such as CO₃²⁻, HCO₃⁻, Cl⁻, and SO₄²⁻ suggest potential health-related or environmental issues that may be encountered in the future. These data can help improve our understanding of changes in the aquifer's water storage and of how groundwater depletion has reduced groundwater quality. Furthermore, the current groundwater monitoring program has provided a rich dataset that decision makers, utilities, and authorities can use to modify policies as needed and to develop future water resources management plans.
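
The abstract does not say how SAR and %Na were computed. As a minimal sketch, assuming the commonly used definitions (SAR = Na / sqrt((Ca + Mg) / 2) and %Na = 100 (Na + K) / (Ca + Mg + Na + K), with all concentrations in meq/L), the two irrigation indices could be derived from major ion concentrations as follows. The function names and the sample concentrations are hypothetical and are not taken from the study.

```python
import math

# Approximate equivalent weights (g/eq): atomic mass divided by ionic charge.
EQ_WEIGHT = {"Ca": 20.04, "Mg": 12.15, "Na": 22.99, "K": 39.10}


def to_meq(mg_per_l, ion):
    """Convert a concentration in mg/L to meq/L for one of the ions above."""
    return mg_per_l / EQ_WEIGHT[ion]


def sar(na, ca, mg):
    """Sodium adsorption ratio; all inputs in meq/L (assumed definition)."""
    return na / math.sqrt((ca + mg) / 2.0)


def percent_sodium(na, k, ca, mg):
    """Sodium percentage (%Na); all inputs in meq/L (assumed definition)."""
    return 100.0 * (na + k) / (ca + mg + na + k)


# Hypothetical sample in mg/L, for illustration only (not data from the study).
sample_mg_l = {"Ca": 80.0, "Mg": 36.0, "Na": 230.0, "K": 8.0}
sample_meq = {ion: to_meq(c, ion) for ion, c in sample_mg_l.items()}

sample_sar = sar(sample_meq["Na"], sample_meq["Ca"], sample_meq["Mg"])
sample_pct_na = percent_sodium(
    sample_meq["Na"], sample_meq["K"], sample_meq["Ca"], sample_meq["Mg"]
)
print(f"SAR = {sample_sar:.2f}, %Na = {sample_pct_na:.1f}")
```

Both indices rise as sodium increases relative to calcium and magnesium, which is why they are used alongside TDS to judge irrigation suitability.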


CONSORTIUM OF UNIVERSITIES FOR THE ADVANCEMENT OF HYDROLOGIC SCIENCE, INC.

150 CAMBRIDGEPARK DRIVE, SUITE 203 | CAMBRIDGE, MA 02140

(339) 221-5400 | [email protected]

CUAHSI Staff

Dr. Jerad Bales, Executive Director
Jessica Annadale, Controller
Ainsley Brown, Administrative Assistant
Dr. Anthony Castronova, Hydrologic Scientist
Phuong Doan, DevOps Engineer
Nicole Farrell, Accounting Assistant
Mark O’Brien, Software Engineer
Jonathan Pollak, Program Manager
Martin Seul, Technical Director
Danielle Tijerina, Hydrologist
Elizabeth Tran, Community Relations Specialist