Trevor Skaggs, Matthew Hanson, Dan Pilone Exploring the advantages of cloud native, easily consumable, scalable formats for downstream scientific exploitation of point cloud data

Post on 05-Oct-2020

Page 1: consumable, scalable formats for downstream scientific ...ceos.org/document_management/Working_Groups/WGISS...Sep 23, 2020  · Trevor Skaggs, Matthew Hanson, Dan Pilone Exploring

Trevor Skaggs, Matthew Hanson, Dan Pilone

Exploring the advantages of cloud native, easily consumable, scalable formats for downstream scientific exploitation of point cloud data

About the Project

Project Objective: To perform a small-scale proof-of-concept transformation of the ICESat-2 ATL06 product into a cloud-native, easily usable, and scalable format for downstream scientific exploitation.

The work is being performed under the NASA ACCESS Grant Program, which works towards increasing the scalability and accessibility of all large NASA satellite datasets.

Who is Element 84?

Element 84 is a company that aims to help organizations and government agencies (NASA, USGS, etc.) solve problems using big remote sensing, life sciences, and transportation datasets.

Trevor Skaggs - Senior Geospatial Engineer
Matthew Hanson - Senior Software Engineer
Dan Pilone - CEO/CTO

What is a Point Cloud?

A point cloud is a collection of points that represents a 3D shape or feature. Each point has its own X, Y, and Z coordinates and, in some cases, additional attributes.

Point clouds are most often created by photogrammetry or remote sensing methods. Remote sensing is a way of collecting data about the Earth using satellites or aircraft.

Source: https://community.safe.com/s/article/what-is-a-point-cloud-what-is-lidar

What is Entwine Point Tiles (EPT)?

Simple and flexible octree-based storage format for point cloud data

Encoding-agnostic

Flexible attribute schema

Lossless

Supporting Groups: https://entwine.io, https://hobu.co

Octree image: https://en.wikipedia.org/wiki/Octree#/media/File:Octree2.svg
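To make the octree idea concrete, here is a minimal sketch of how one node of an octree splits into eight children. It assumes the depth-x-y-z ("D-X-Y-Z") node naming described in the EPT specification; the function names are illustrative and are not part of Entwine or PDAL.

```python
from typing import List, Tuple

# An octree node is addressed by (depth, x, y, z), written "D-X-Y-Z".
Key = Tuple[int, int, int, int]


def child_keys(key: Key) -> List[Key]:
    """Return the eight children of a node: one level deeper, with each
    axis index doubled and offset by 0 or 1."""
    d, x, y, z = key
    return [
        (d + 1, 2 * x + dx, 2 * y + dy, 2 * z + dz)
        for dx in (0, 1)
        for dy in (0, 1)
        for dz in (0, 1)
    ]


def key_name(key: Key) -> str:
    """Format a key the way it would appear in a node name, e.g. '1-0-1-0'."""
    return "-".join(str(v) for v in key)


if __name__ == "__main__":
    root = (0, 0, 0, 0)
    for child in child_keys(root):
        print(key_name(child))
```

Because each node only covers one eighth of its parent's volume, a reader can fetch coarse data from shallow nodes first and descend only where more detail is needed.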

EPT Processing Overview

Download Data

Sort By Cycle

Convert to LAZ

Index to EPT

Upload to S3
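The "Sort By Cycle" step could be sketched as follows. This is not the authors' code: it assumes the cycle number is read from the standard ATL06 granule file name (ATL06_[timestamp]_[track][cycle][region]_[release]_[revision].h5), and the file names in the usage example are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed ATL06 granule naming: a 14-digit timestamp, then 4 digits of
# reference ground track, 2 digits of cycle, 2 digits of region.
ATL06_NAME = re.compile(
    r"ATL06_(?P<timestamp>\d{14})_(?P<rgt>\d{4})(?P<cycle>\d{2})(?P<region>\d{2})"
    r"_(?P<release>\d{3})_(?P<revision>\d{2})\.h5$"
)


def cycle_of(filename: str) -> int:
    """Extract the cycle number from an ATL06 granule file name."""
    match = ATL06_NAME.search(filename)
    if match is None:
        raise ValueError(f"not an ATL06 granule name: {filename!r}")
    return int(match.group("cycle"))


def sort_by_cycle(filenames: Iterable[str]) -> Dict[int, List[str]]:
    """Group granule file names by cycle, with cycles in ascending order."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for name in filenames:
        groups[cycle_of(name)].append(name)
    return {cycle: sorted(names) for cycle, names in sorted(groups.items())}


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    sample = [
        "ATL06_20190115000000_02350210_003_01.h5",
        "ATL06_20181014000347_02350110_003_01.h5",
    ]
    print(sort_by_cycle(sample))
```

Grouping by cycle before indexing is what allows one EPT dataset to be built per cycle, as in the per-cycle endpoints later in the deck.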

Sample ATL06 Data

Bounding Box: 156, -81, 162, -69

Start Time: 2018-10-14T00:00:00Z

End Time: 2020-05-21T18:52:30Z

Data across Cycles 1-6

Sample ATL06 Data (cont)

Standard Attributes Mapping
X: [beam]/land_ice_segments/longitude
Y: [beam]/land_ice_segments/latitude
Z: [beam]/land_ice_segments/h_li
GpsTime: [beam]/land_ice_segments/delta_time

Custom Attributes: atl06_quality_summary, h_li_sigma, segment_id, sigma_geo_h, CycleNumber, FileId (PointSourceId), BeamId (ReturnNumber)

We are using Return Number as a proxy for the beam id. The mappings are as follows:
gt1l -> 0
gt1r -> 1
gt2l -> 2
gt2r -> 3
gt3l -> 4
gt3r -> 5
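The attribute mapping above can be written down as plain lookup tables. The sketch below is illustrative only: the dictionary and function names are assumptions, while the beam numbers and HDF5 dataset names are the ones listed on the slide.

```python
# Beam id -> ReturnNumber value, as listed on the slide.
BEAM_TO_RETURN_NUMBER = {
    "gt1l": 0,
    "gt1r": 1,
    "gt2l": 2,
    "gt2r": 3,
    "gt3l": 4,
    "gt3r": 5,
}

# Inverse lookup, for turning a stored ReturnNumber back into a beam id.
RETURN_NUMBER_TO_BEAM = {number: beam for beam, number in BEAM_TO_RETURN_NUMBER.items()}

# Point cloud dimension -> dataset name inside [beam]/land_ice_segments.
STANDARD_DIMENSIONS = {
    "X": "longitude",
    "Y": "latitude",
    "Z": "h_li",
    "GpsTime": "delta_time",
}


def hdf5_path(beam: str, dimension: str) -> str:
    """Return the HDF5 dataset path that feeds a standard dimension for a beam."""
    if beam not in BEAM_TO_RETURN_NUMBER:
        raise KeyError(f"unknown beam: {beam!r}")
    return f"{beam}/land_ice_segments/{STANDARD_DIMENSIONS[dimension]}"


if __name__ == "__main__":
    for beam in BEAM_TO_RETURN_NUMBER:
        print(beam, BEAM_TO_RETURN_NUMBER[beam], hdf5_path(beam, "Z"))
```

Reusing ReturnNumber for the beam id keeps the data within the standard LAS dimension set, at the cost of giving that field a non-standard meaning that consumers need to know about.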

Sample ATL06 Data (cont)

Raw HDF5 Files: 1208 total files, 88 GB

LAZ Files (intermediate product): 1208 total files, 4.5 GB

EPT Files: 13127 total files, 6.4 GB

6.4 GB / 88 GB = ~7.25% of original storage size

This size will increase if we decide to add additional attributes. Please see the "Cycle Data Breakdown" slide at the end of the deck for a breakdown of HDF5/LAZ/EPT file counts and sizes by cycle.

Sample ATL06 Data - Cycle 1 End Point

Once the data is uploaded to S3, users can access each endpoint directly (without a server) through PDAL's EPT reader stage.

Cycle 1 Endpoint URL: https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/1
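As a rough illustration of serverless access, the sketch below builds a PDAL pipeline definition that reads the Cycle 1 endpoint with the `readers.ept` stage and writes a local LAS file. It only produces the JSON text and does not run PDAL. The endpoint URL is reconstructed from the slide (the path separator before the cycle number is an assumption), and the function name, output file name, and example bounds are hypothetical.

```python
import json
from typing import Optional, Tuple

# Reconstructed from the slide; the "/1" path segment is an assumption.
CYCLE_1_ENDPOINT = "https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/1"


def ept_read_pipeline(
    endpoint: str,
    output_path: str,
    bounds: Optional[Tuple[float, float, float, float]] = None,
) -> str:
    """Build PDAL pipeline JSON: an EPT reader followed by a LAS writer.

    bounds is (xmin, ymin, xmax, ymax); when given, it is passed to the
    reader so that only nodes overlapping that window are fetched.
    """
    reader = {
        "type": "readers.ept",
        "filename": endpoint.rstrip("/") + "/ept.json",
    }
    if bounds is not None:
        xmin, ymin, xmax, ymax = bounds
        # PDAL writes 2D bounds as ([xmin, xmax], [ymin, ymax]).
        reader["bounds"] = f"([{xmin}, {xmax}], [{ymin}, {ymax}])"
    writer = {"type": "writers.las", "filename": output_path}
    return json.dumps({"pipeline": [reader, writer]}, indent=2)


if __name__ == "__main__":
    # Hypothetical window and output name, for illustration only.
    print(ept_read_pipeline(CYCLE_1_ENDPOINT, "cycle1_subset.las", (158, -78, 159, -77)))
```

The printed JSON could be saved to a file and run with `pdal pipeline`, assuming PDAL is installed and the bucket is still publicly readable.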

Sample ATL06 Data Visualized

Cycle 1 Only, Colorized by Elevation
https://potree.entwine.io/data/view.html?r=%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/1%22

Cycles 1-6, Colorized by Elevation
https://potree.entwine.io/data/view.html?r=[%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/1%22,%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/2%22,%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/3%22,%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/4%22,%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/5%22,%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/6%22]
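The viewer links above follow a simple pattern: the `r` query parameter holds either one quoted endpoint URL or a bracketed list of quoted URLs, with the quotes percent-encoded as %22. A minimal sketch of building such links is shown below; the base URLs are reconstructed from the slide (the path separator before the cycle number is an assumption) and the function name is illustrative.

```python
import json
from typing import Sequence
from urllib.parse import quote

VIEWER = "https://potree.entwine.io/data/view.html"
# Reconstructed from the slide; the "/<cycle>" path segment is an assumption.
ENDPOINT_BASE = "https://s3-us-west-2.amazonaws.com/access-icesat2-entwine"


def viewer_url(endpoints: Sequence[str]) -> str:
    """Build a viewer link for one endpoint or a list of endpoints."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    if len(endpoints) == 1:
        resources = json.dumps(endpoints[0])  # a single quoted string
    else:
        resources = json.dumps(list(endpoints), separators=(",", ":"))  # compact list
    # Percent-encode the double quotes but leave URL punctuation readable.
    return f"{VIEWER}?r={quote(resources, safe=':/[],')}"


if __name__ == "__main__":
    print(viewer_url([f"{ENDPOINT_BASE}/1"]))
    print(viewer_url([f"{ENDPOINT_BASE}/{cycle}" for cycle in range(1, 7)]))
```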

Sample ATL06 Data Visualized

https://potree.entwine.io/data/view.html?r=%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/1%22

Cycle 1 Only, Colorized by PointSourceId (File ID)

Cycle 1 Only, Colorized by Index

Sample ATL06 Data Visualized

https://potree.entwine.io/data/view.html?r=%22https://s3-us-west-2.amazonaws.com/access-icesat2-entwine/1%22

Cycle 1 Only, Colorized by Elevation, Filtered to gt1l Beam Only

Cycle 1, Colorized by Elevation, Filtered to gt1l/gt1r Beam Pair
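The same beam filtering shown in the viewer could also be done when reading the data. Because the beam id is stored in ReturnNumber, a PDAL `filters.range` stage can select beams by number. The sketch below only builds the stage definition; it assumes that repeated ranges on the same dimension are combined as a logical OR, as described in the PDAL documentation for `filters.range`. The function names are illustrative.

```python
from typing import Dict, Sequence

# Beam id -> ReturnNumber value, as listed earlier in the deck.
BEAM_TO_RETURN_NUMBER = {
    "gt1l": 0,
    "gt1r": 1,
    "gt2l": 2,
    "gt2r": 3,
    "gt3l": 4,
    "gt3r": 5,
}


def beam_limits(beams: Sequence[str]) -> str:
    """Build a filters.range 'limits' expression selecting the given beams."""
    if not beams:
        raise ValueError("at least one beam is required")
    numbers = [BEAM_TO_RETURN_NUMBER[beam] for beam in beams]
    # One closed range per beam, e.g. ReturnNumber[0:0] keeps only value 0.
    return ",".join(f"ReturnNumber[{n}:{n}]" for n in numbers)


def beam_filter_stage(beams: Sequence[str]) -> Dict[str, str]:
    """Return a pipeline stage that can be placed after an EPT reader stage."""
    return {"type": "filters.range", "limits": beam_limits(beams)}


if __name__ == "__main__":
    print(beam_filter_stage(["gt1l"]))
    print(beam_filter_stage(["gt1l", "gt1r"]))
```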

Thank you

Cycle Data Breakdown

Cycle   HDF5 Files   HDF5 Size (MB)   LAZ Files   LAZ Size (MB)   EPT Files   EPT Size (MB)
1       168          13000            168         638             1810        869
2       226          16000            226         841             2406        1200
3       237          17000            237         918             2685        1300
4       154          12000            154         604             1774        841
5       240          18000            240         928             2664        1300
6       183          12000            183         635             1788        878
Total   1208         88000            1208        4564            13127       6388
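The totals and the storage ratio quoted earlier in the deck can be checked directly from the per-cycle rows. The sketch below copies the table values as given; the variable and function names are illustrative.

```python
from typing import Dict, Tuple

# Per cycle: (HDF5 files, HDF5 MB, LAZ files, LAZ MB, EPT files, EPT MB),
# copied from the Cycle Data Breakdown table.
ROWS: Dict[int, Tuple[int, int, int, int, int, int]] = {
    1: (168, 13000, 168, 638, 1810, 869),
    2: (226, 16000, 226, 841, 2406, 1200),
    3: (237, 17000, 237, 918, 2685, 1300),
    4: (154, 12000, 154, 604, 1774, 841),
    5: (240, 18000, 240, 928, 2664, 1300),
    6: (183, 12000, 183, 635, 1788, 878),
}


def totals(rows: Dict[int, Tuple[int, ...]]) -> Tuple[int, ...]:
    """Sum each column across all cycles."""
    return tuple(sum(column) for column in zip(*rows.values()))


def percent_of(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    hdf5_files, hdf5_mb, laz_files, laz_mb, ept_files, ept_mb = totals(ROWS)
    print("EPT as % of HDF5 size:", round(percent_of(ept_mb, hdf5_mb), 2))
    print("LAZ as % of HDF5 size:", round(percent_of(laz_mb, hdf5_mb), 2))
```

With these figures the EPT output comes to roughly 7.3% of the raw HDF5 size, consistent with the approximate 6.4 GB / 88 GB ratio stated on the storage slide.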
