ceos future data access & analysis architectures studyceos.org › document_management ›...

33
CEOS Future Data Access & Analysis Architectures Study Interim Report Version 1.0 – October 2016

Upload: others

Post on 24-Jun-2020

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

CEOS Future Data Access & Analysis Architectures Study

Interim ReportVersion 1.0 – October 2016

Page 2: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

1

CEOSFutureDataAccess&AnalysisArchitecturesStudy

Interim Report Version 1.0, 13th October 2016

IssuesforPlenaryDiscussionandDecision

1. ApprovalfortheAd-hocTeamonFutureDataAccessandAnalysisArchitecturestocontinueforafurtheryeartocompletethemandate.AndconfirmationoftheCo-Chairs.

2. Agreementfortheproposedpilotprojecttobeprogressedinparallelwiththeongoingreportwork,withoversightbytheFDAteamandcontributionsfromLSI-VC,SEO,andSDCG.

3. InvitationforfurtherproposalsforpracticaldemonstrationsintheareaofFDAfor‘lessonslearnt’evaluationbyCEOSPrincipalsatCEOS-31.

4. ActionforCEOSandSITChairstoconferwiththeFDATeamtoensurenecessaryCEOSPrincipalengagementonthestrategicissuesarisingfromthe2017Report,insupportofidentifyingcommongroundasthebasisforalong-termCEOSstrategy.

ThisreportwasachievedthankstoallagenciesandtheirrepresentativeswhoparticipatedintheAd-hocTeamprocess,inparticularthewritingteam:RobertWoodcock(CSIRO),TomCecere(USGS),AndrewMitchell(NASA),BrianKillough(NASA/SEO),GeorgeDyke(CEOSChairTeam),JonathonRoss(GA),MirkoAlbani(ESA),StephenWard(CEOSChairTeam),andSteveLabahn(USGS).

Page 3: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

2

1.Introduction

OverviewWitheachpassingyear,newgenerationsofEarthobservation(EO)satellitesarecreatingincreasinglysignificantvolumesofdatawithsuchcomprehensiveglobalcoveragethatformanyimportantapplications,a‘lackofdata’nolongerneedstobethelimitingfactor.

Extensiveresearchanddevelopmentactivityhasdeliverednewapplicationsthatoffersignificantpotentialtodelivergreatimpacttoimportantenvironmental,economicandsocialchallenges,includingattheregionalandglobalscalesnecessarytotackle‘thebigissues’.SuchapplicationshighlightthevalueofEOtoMinistersandotherswhoultimatelyadjudicateoninvestmentinprogrammesandmissions

ThechallengeisinprovidingtherightsettingssothepotentialcantranslatetorealitybothforindividualCEOSmembersthroughtoglobalinitiatives.

ForEO,thegapbetweendata,applicationanduserneedstobebridged.Currently,manyapplicationsfailtosuccessfullyscaleupfromsmall-scaleresearchtoglobalorregionaloperationsbecauseofalackofsuitabledatainfrastructure.Eventoday,mucharchivedEOsatellitedatasitunder-utilizedontapes.Despitemultipleexamplesofbigdataanalyticsacrossapplicationdomains,significantdevelopmentremainsconsignedtoprototypes,pilotprojects,exemplarsandtest-beds.

Addressingthischallengeisdifficultforadvancedeconomies.Itissimplynottechnicallyfeasibleorfinanciallyaffordabletoconsidertraditionalprocessing(e.g.localdesktopworkstation)anddatadistributionmethods(e.g.scenebasedfiledownload)toaddressthis‘scaling’challengeinmanyeconomies,asthesizeofthedataandcomplexitiesinpreparation,handling,storage,analysisandbasicprocessingremainsignificantobstacles.ThischallengeisalreadyholdingbackkeyGEO/CEOSinitiativessuchastheGlobalForestObservationsInitiative(GFOI),Disasters,WaterResourcesandtheGEOGlobalAgriculturalMonitoringinitiative(GEOGLAM).

Addressingthisproblembyindividualusersworkingontheirdesktopworkstationshasnotresultedinanoptimalsolutionandmissestheopportunitiesofferedthroughcollaborativeenvironmentsthatbridgedataproviders,intermediaryvalue-adders(suchasresearchersandindustry)anduserstoworktogetheracrossdomains,andacrossgeographicboundaries,toco-createsolutions.

Fortunately,justassatelliteEarthobservationtechnologyhasadvancedsignificantly,sotohasinformationandcommunicationtechnology.Thedatamanagementandanalysischallengesarisingfromtheexplosioninfreeandopendatavolumescanbeovercomewiththehigh-performanceICTinfrastructure,technologiesandarchitecturesnowavailable.Thesesolutionshavegreatpotentialtostreamlinedatadistributionandmanagementforproviderswhilesimultaneouslyloweringthetechnicalbarriersforuserstoexploitthedatatoitsfullpotential.

Page 4: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

3

PurposeInresponsetothesechanges,theCEOSFutureDataAccessandAnalysisArchitecturesAd-hocteam(FDA-AHT)hasbeentaskedbytheCEOSChairteamtoassessthepotentialofnewtechnologiesandapproaches,identifykeyissuesandopportunities,andproposeaplanofactionforconsiderationbyCEOS.

Thisreporthas:

1. ReviewedrelevantinitiativesandplansbeingundertakenbyCEOSandrelatedagencies;

2. ReviewedlessonslearnedfromtheearlyCEOS-ledprototypescurrentlyunderwaywiththegovernmentsofKenyaandColombia;

3. IdentifiedkeyissuesandopportunitiesresultingfromthetrendtowardsBigData,AnalysisReadyData,EOapplicationplatforms,etc;

4. MaderecommendationsforthewayforwardforCEOSanditsagencies,includinginrelationtostandardisation,interoperability,andhowcurrentCEOSprioritiesmightbeadvancedthroughasetofproposedactivities.

ThisstudyisanticipatedtobeofvaluebothtoCEOSAgenciesasdataprovidersandtoexistingandprospectiveusersandbeneficiariesofEOsatellitedata.ThefullpotentialofEOsatellitedatawillnotberealisedwiththeobstaclesthatusersfaceincurrentdatahandlingandanalysisapproaches.GlobalinitiativessuchasGFOIandGEOGLAMexemplifythedifficultiesthatcountrieswithoutdevelopednationalspatialdatainfrastructuresfaceintermsoflackofcapacitytohandleEOsatellitedata.ThiscapacitygapisamajorhindrancetotheuptakeofEOdatainthetypesofglobalinitiativesandagendasCEOShasstatedasimportantinrecentyears.Moreover,evenmanydevelopedcountriesarestrugglingtodeterminehowbesttocapitaliseonlargeandrapidlygrowingEOdatacollectionsandwouldappreciateguidanceonbothbestpracticetheycanadoptthemselves,andapproachesthatcanbringCEOSAgenciestogethertosupportapproachesthatmaximisevaluefromtheircollectiveconstellationofover130Earthobservationsatellites.

StructureoftheReportInthecreationofthisreportsubmissionsweremadebyanumberofCEOSmembersregardingcurrenttrendsandtheirspecificdevelopmentresponsesinEOsystemsarchitecturesandapplications.Aseachagencyhasdifferentterminology,operationalmethods,language,policycontextandbusinessdriversthesubmissionsappeardifferentinthedetail.CarefulanalysisthoughshowscommontrendsandresponsesthatareparticularlyrelevanttothemissionofCEOS.Inaddition,thereporthasbeenlimitedtoonlythoseaspectsoftheEOsystemsarchitecturethatarehighpriorityorimpactdirectlyontheCEOSmission.

Thereportisstructuredasfollows:

Page 5: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

4

Chapter2consolidatescontributionsandidentifiestrendsandprioritiesinEOsystemsarchitecturedevelopmentacrossCEOSagencies.Itservesasabaselineofcurrentarchitectureandnearfuturedevelopmentresponses.Chapter3discussesthechallengesfacedinEOsystemarchitecturedesignanddevelopmentforthemediumtolongtermfuture.ItservestoidentifythekeychallengesthatmustbeaddressedinfuturedataarchitecturesChapter4describeskeyarchitecturalresponsesthatseektoresolvethechallengesidentifiedinChapter3.ItisnotacompletearchitecturaldescriptionandfocusesontheessentialelementsnecessaryfortheCEOSmissionleavingdetailstofutureprojectsorAgencydevelopments.Chapter5summarisestheoutcomesofthereportandpresentsrecommendationsonFutureDataArchitecturesandactivitiesforCEOS.

Page 6: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

5

2.CurrenttrendsanddevelopmentsinEOsystemsarchitectureandapplicationsEarthObservation(EO)programmesofspaceagenciesarefacinganumberoftrendswhich,takentogether,aredrivingtheneedforchangeinthewaysinwhichdataareprocessed,accessed,distributed,andanalysed.ThemagnitudeandspeedofthesechangesisdeterminingtheimportanceandurgencywithwhichchangeisrequiredinfutureEOsystemarchitectures.Thischapterwillattempttosummarisesomeofthosekeytrendsanddevelopanassessmentofthestateofthesesystemsarchitecturesandtheirabilitytomeettheseuserneedsanduserapplications.

MaximisingtheValueofEarthObservationsMaximisingthevalueofEOisafundamentaldriverforallCEOSagenciesandakeypartoftheCEOSStrategicGuidance.ThereisanexpectationthatpubliclyfundedEOagenciesshouldmaximisethevaluereturnedtothecountrythroughtheapplicationofnationaldataholdings.AsafundamentaldrivermostagencysystemsarchitectureshavebeendesignedtodelivercalibratedobservationsandproducevalueaddedproductsforusebyotherGovernmentagenciesonpredominantlynationalandglobalsocietal,environmental,andscientificproblems.InrecentyearstherehasbeenasteadytrendacrossallagenciestowardsgreaterintegrationofdiverseEOdataholdingsforland,inlandwater,coastal,climateandoceanpurposeswithotherdatatypesheldbymorediverseGovernmentagencies-“Comprehensivecollectionandintegrationof...informationindependentlycontrolledbygovernmentalagenciesshouldbepromotedandsuchinformationshouldbedisclosedappropriatelytoincreasetheconvenienceforuserstoaccessandhandlesuchinformation.”(JAXA,OceanPolicy,2013).Theincreasedintegrationproducesamorediverserangeofapplicationsandleadstocomplexityinthevaluechainasobservationsarecombinedwithanalyticstomeetmultipleuserrequirements(Figure2-1).

Figure2-1:Oneexampleofthecomplexinter-dependenciesthatexistinmeetinguserrequirementsfromdiverseEOdatasources.CourtesyofJAXAOceanData

Infrastructure.

Page 7: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

6

IncreasinglyEOdataarevaluednotonlyforitsscientificandtechnologicalvaluebutasapotentialfieldforeconomicgrowththroughnewcommercialventuresandindustrydevelopment.AgenciesarebeingaskedtopromoteandstrengthenanEOindustrywhilstcontinuingtomaintaintheirstrongscientificandtechnologicalfoundation.

Additionally,arecentUSGSreport(Milleretal,2012)attemptedtoevaluatethebenefitsofLandsatdatatoitsusers.Thereportconcludedthatmorethan80%oftheuserssawenvironmentalbenefitsandmorethan90%sawimprovementsindecision-making.Theestimatedannualeconomicbenefitofthisfree/opendataisgreaterthanUS$2billionperyear.Thoughthisisoneexample,therearelikelymanymoresimilarexamplesamongCEOSmissionsanddatasets.Overall,thereisanincreasedrelevanceofEOmissionsforresourcemanagementanddecision-making.

OpenDataPoliciesBeginningwiththeInstitutoNacionaldePesquisasEspaciais(INPE’s)movetowardfreeandopendatapoliciesin2004,datapolicychangeshavebeencriticalandareinfluentialinleadingtoasignificanttrendacrossallCEOSagencies.AnotherimportantpolicychangewastheUSGSadoptionoffreeandopenLandsatdatain2008.ThisallowedinternationalLandsatcollaboratorstochangebusinessmodelsandtomovefrombeingimageresellerstobeingdatascientistsandproviders.OtheragenciesincludingESAandJAXAalsochampionedthesechanges,howevertheglobalreachofLandsatdatameantthattheUSdecisionhadthewidestimplications.NASAhasbeenoperatingunderanopendatapolicysince1990andotherorganizationsaremovingtowardssuchpoliciesinrecentyears.WehaveseentherecentmovementfromEurope(e.g.CopernicusSentineldata,ESAEarthExplorersandHeritagemissions’data)andJapan(e.g.midtocoarseresolutiondata)toprovidefreeandopendata.

WithinAustralia,changesingovernmentpolicyfurthersupportedthedirectionoffreeandopendata.AgenciessuchasGeoscienceAustraliawereabletosupportthedevelopmentofsimplebuteffectiveopenlicensestoapplytotheirEOdatadistribution.Resourcespreviouslycommittedtolicensemanagementandmanualdistributionofproductswereabletobere-focusedonscientificexploitationofLandsatdata.

Fewerhurdlestodataaccessimpliesbroaderuseofdata.Thisresultsinamuchhigherreturnoninvestmentbyorganizationsfromtheirspaceborneandgroundsystems’assets.However,thisalsomeanshigherworkloadsfordatasystems.Itcanalsobeverydifficulttotracktheresultingimpactonceitleavestheconfinesoftheagency.ThereisconsiderablebusinessmodelinnovationoccurringacrossmanyagenciesandusersofEOdataastheimpactandvalueofEOdatanowreadilyavailableisunderstood.

OpenSourceSoftwareInadditiontofreeandopendata,itisalsodesirabletohavefreeandopenaccesstoanalysissoftwareandtoolsthatfacilitatetheuseofdata.SomeexamplesofthisincludeQGIS(anopensourcegeographicinformationsystem),THREDDSandGeoserver(opensourceEOandothergeospatialdatadeliveryservices).Opensourcesoftwarepolicieshelpwiththisdataexploitation.Adraftopensourcepolicyhasjustbeenreleased(March2016)forpubliccommentbytheU.S.FederalChiefInformationOfficer(see

Page 8: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

7

https://sourcecode.cio.gov/)emphasisingthatusingandcontributingbacktoopensourcesoftwarecanfuelinnovation,lowercosts,andbenefitthepublic.AspecificexamplewithinCEOSistheDataCubeinitiative.Thisprojectreliesonopensourcesoftwareforthecreationofdatacubes(ingestingsatellitedata)andtheinteractionwithdatacubes(applicationprogramminginterfaces).Itisbelievedthatopensourcesoftwarewillstimulateapplicationinnovationandtheincreaseduseofsatellitedatasincetheseadvancedtechnologiescanbeutilisedgloballyandevenbydevelopingcountriesthatarenottraditionalusersofsatellitedata.

EmergentEOAnalysisPlatformsintheCloudCloudcomputinghashadadramaticimpactontheavailabilityofhighlyscalableandaccessiblecomputationalanddatainfrastructure.WithrecentadvancesintheCloudtechnologytheuseofscientificcomputingintheCloudhasgrownsignificantlywithmanypreviouslyHPConlyapplicationsnowrunningeffectivelyintheCloudenvironmentamongstthereareseveralEOofferings.

Themostwell-knownofthesewouldtheGoogleEarthEngine,which“combinesamulti-petabytecatalogueofsatelliteimageryandgeospatialdatasetswithplanetary-scaleanalysiscapabilities...availableforscientists,researchers,anddeveloperstodetectchanges,maptrendsandquantifydifferencesontheEarth’ssurface”.

AmazonWebServicesCloudofferingsoperateviaadifferentbusinessmodel,providingaccesstotheCloudplatformbutnotdirectlyofferingapplications,preferringtoprovideabaseplatformuponwhichothersbuildtheirown.AmongsttheEOofferingsavailableareopensourceEOtechnologieslikeGeoTrelliswhichmakeheavyuseofwebservicesarchitecturestoprovidescalablerasteroperations,andcommercialofferingslikePlanetLabsprovidingfullyautomatedscalableimageprocessingpipelinesdownloadingdata,performingcorrectionsandanalyticsat5-10terabytesperday.

WhilstAWSandGEEarementionedsimilarinitiativesexistonMicrosoftAzure(Layerscape)andotherorganisations.EachoftheseplatformsacquireCEOSagencydataandplacesitintoamanagedenvironmentwithcloselycoupledandhighlyscalableanalyticalcapabilitiesintheCloud.Significantlythebusinessmodelsinusevaryfromfree,toopensource,throughtofullcommercialserviceofferings.Howwellthesemodelsworkbothfortheorganisationoperatingtheserviceandfortheircustomershasnotbeenexploredinthisreportbutthebusinessmodelinuseisclearlyanimportantconsideration.TherearealsooperationalissuesfortheseorganisationsinacquiringtheCEOSdata,asoneexampleNASAhasbeenworkingwithmultiplecloudproviders(Amazon,MicrosoftandGoogle)tobetterunderstandhowNASAdatasystemscanbettersupportbulkdatadownloadsbycloudproviders.Theobjectiveistoenableefficientdiscovery,accessandtransferoflargevolumesofdatafromNASAarchivestocommercialcloudsandtomakethetransitiontocommercialcloudinfrastructureeasier.Someoftheseareprovidingefficientmetadata,theuseofstandardfileformats,useofstandardstructureddirectoriestoholdfilesanduseofwell-definedmapprojections.

Page 9: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

8

IncreasedCommercialandNon-GovernmentalInteractionsNASA,NOAA,andtheUSGSconductalargepartoftheirEOactivitiesthroughcontractswithcommercialentities.ESAandJAXAconducttheiractivitiesthroughcommercialentitiesaswell,eventhoughtherearedifferencesinthenatureofcontractsamongthedifferentcountries.Thereisadistinctionbetweencommercialentitiesworkingundercontractswithgovernmentagenciesfordevelopmentandoperationofobservingsystemsanddatasystems,andothercommercialentitiesthatapplytheresultinginformationtosomeself-sustainingprofit-generatingactivities.TheFederationofEarthScienceInformationPartners(ESIP)isanexampleofbothtypesofcommercialentitiescollaboratingwithgovernmentanduniversityorganizations.Fromthepointofviewofdataarchitecture,commercialinteractionshaveaninfluenceonstandardsforinteroperability,amongotherthings.Withtheincreaseinopendatapoliciesandopensourcesoftware(seeexamplesabove),therewillbeanincreasingneedtoworkcloserwithcommercialentitiestoexpandtheuseofsatellitedataanditsbenefits.Thoughtherearemanyexamplesinthecommercialworld,someofthegreatestimpactsonsatellitedataapplicationhavebeenmadebyGoogleandAmazon.

Inadditiontothetraditionalcommercialentities,onemustalsoconsiderthenon-traditionalornon-governmentalgroupsastheyalsoplayamajorroleinconnectingEOdatatousers.SomerecentexamplesinCEOShavebeenconnectionswiththeUN-FAO(supportingforestmanagement),WorldBank(highinterestinwatermanagement),SilvaCarbon(fundedbyUSAIDtosupportforestmanagement),andtheClintonFoundation(workingincentralAfrica).Thesegroups,andmanyothers,utilisesatellitedataandwillcontinuetoincreasetheirdemandforsuchdatatosupportregionalandlocalapplications.

Pre-processedAnalysisReadyDataCountriesandinternationalorganizationshaveexpressedadesireforsupportfromCEOSagenciestofacilitateaccesstoandprocessingofsatellitedataintoCEOSAnalysisReadyDataforLand(CARD4L)products.CARD4Laresatellitedatathathavebeenprocessedtoaminimumsetofrequirementsandorganizedintoaformthatallowsimmediateanalysiswithoutadditionalusereffort.ExistingCEOSagencyeffortsincludeNASA'sMODISmodelthatsetthestandardforARD,Dr.DavidRoy'sWebEnabledLandsatData(WELD)modelthathasstimulateddemandformorehighlyprocessedLandsatdata,GeoscienceAustralia’sefforts,aswellasUSGSLandChangeMonitoringAssessmentandPredictionARDefforts.

SystematicandregularprovisionofCARD4Lwillgreatlyreducethetimeandtechnicalburdenonglobalsatellitedatausers,whohaveuptothispointneededtoinvestsignificanteffortsinpreparingEOdataforfurtheranalysis.Theprovisionofthisdataispossiblethroughmanyoptionsincludingsystematicprocessinganddistribution,processingonhostedplatforms,andprocessingviatoolkitsprovidedtousers.TheCEOSLandSurfaceImagingVirtualConstellation(LSI-VC)teamisdevelopingCARD4LdefinitionandspecificationdocumentsthatwilldefinethedetailsofCARD4L.ThesedocumentsintendtoimprovethecurrentandfutureprovisionofEOdataandtomaximisethevalueofthedatatousersandaddresstheneedsofthemajorityofglobalusers.Inaddition,CEOSisdevelopingaDataCube(spatiallyalignedtimeseriesstackof

Page 10: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

9

pixels)architecturethatdependsonCARD4LtoallowimmediatecreationofDataCubesandsubsequentanalyses.Theresultofthiseffortwillbeimprovedinteroperabilityamonglandbaseddatasets,facilitatingtimeseriesanalysesandenhancedglobaluseandscientificvalueofsatellitedata.ThereisasimilardesiretodevelopARDspecificationsforocean,inlandwater,andcoastalenvironmentsaswellfornon-opticallybasedinstrumentslikeSAR.

AdvancedStorageandDistributionArchitectureforGrowingDataVolumesGrowingdatavolumeswillcontinuetoplacenewrequirementsonadvancedstorageanddistributionarchitectures.WiththeincreaseinCEOSmissionsandsensorsandincreasedspatialandtemporalresolution,theamountofdataavailabletotheworldwillbeordersofmagnitudegreaterthaninthepast.InarecentCEOSAd-hocSpaceDataCoordinationGroup(SDCG)GlobalDataFlow(GDF)StudyfortheGlobalForestObservationsInitiative(GFOI),itwasfoundthatexpectationsmustbemanagedtoachievesustainablesolutionsthatcanbeadaptedascountrycapacityincreases.Significantrisksareassociatedwithmaintaininginfrastructureandexpertisethoughthereisalwaysadesiretoutilisethelatestmoderntechnologiesforstorageanddistributionofkeysatellitedata.

ThroughtheeffortsoftheWorkingGrouponInformationSystemsandServices(WGISS),CEOSisdevelopingnewapproachestoeasedatadiscoverabilityanddevelopingstandardsformetadataanddistributionformats(i.e.spatialortemporaltilesvs.standardscenefiles).TheCEOSWorkingGrouponCalibrationandValidation(WGCV)isworkingonnewapproachesforproductvalidationtoimprovedataquality.Finally,theCEOSDataCubeinitiativeisinvestigatingnewadvancedstorageformats(pixel-baseddatacubesversusscenes)thatwillallowforsubsettingofdata(spatialandtemporalrelevance),significantdatacompressionandfacilitatedistributiontonon-expertusers.

TimeSeriesAnalysesandChangeDetectionTheextendedlifetimeandsuccessofmanymissions(e.g.LandsatandMODIS)haveallowedanewabilitytoexploitinformationfromlongtimeseries.FollowingtheopenreleaseofLandsatdata,therehasbeenasignificantnumberoftheseanalysesfocusedonlandusechange(includinginlandwaterandcoastlines).Theseanalyseshaveutilizedavarietyofchangedetectiontools(e.g.,ContinuousChangeDetectionandClassification(CCDC),BreaksForAdditiveSeasonandTrend(BFAST),Hansenetal.'sglobalforestgainsandlosses)tofindtrendsonthedataoridentifyperiodsofsignificantchange.OneexampleistheAustralianWaterDetectionfromSpace(WOFS)algorithmthatcalculatestimeseriespixel-levelwaterobservations.Theseresultsprovidecriticalinformationforwatermanagementthatwillallowuserstoassesswatercycledynamics,historicalwaterextentandtheriskoffloodsanddroughts.

TheavailabilityoftimeseriesdatawillcontinueintothefutureasprogramssuchasSentineldevelopplansforsustainedlong-termmeasurements.Toefficientlyandeffectivelyusetheselargetimeseriesdatasets,therewillbeaneedtousenewtechnologiesanddataarchitecturessuchasAnalysisReadyData,DataCubes,Data

Page 11: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

10

Provenance,andadvanceddatabases.Withsuchadvancements,wewillbeabletoassesstheimpactofclimateandlandchangeonpeopleandnaturalresourcesovertime.

AdvancedUserRequirementsAdvanceduserrequirementsarethoserequirementsthatextendbeyondthetypicalusecasesandplacesignificantdemandsonfuturedataarchitectures.Theseinclude,butarenotlimitedto:realtimeapplicationsincludingrapidmonitoringoflandandwaterchanges;diverseapplicationsandoutputneedsformonitoring,assessmentsandprojections;integrationofmultipledatasets(climate,insitu,economic,demographic)andsensortypes(e.g.opticalandradar);fusionofdatasets(e.g.combiningLandsatandSentinel-2);accesstolowerlevelproductsfor“powerusers”;theuseofhighperformancecomputingforcomplexanalyses;andClimateDataRecords(CDRs)andEssentialClimateVariables(ECVs).Asmorecomplexdatabecomesavailable,therewillbeanincreasingneedfornewtechnologiesanddataarchitecturestomeetthoseneeds.

AsanexampleofhighperformancecomputingneedsinAustralia,HighPerformanceComputing(HPC)becameavailablein2011forthemanagementandanalysisofAustralia’sLandsatdatacollectionsunderthe“UnlocktheLandsatArchive”project.ThisworkappliedtheautomatedproductionsystemstotheentireAustralianLandsatcollection,whichwasmovedtotheAustralianNationalUniversityNationalComputationalInfrastructure(NCI)tocompletethiswork.GeoscienceAustraliabecameapartnerintheNCIin2012.TheHPCenvironmentallowedthedatatobeheldondisc,ratherthantape,anddirectlyattachedtolargecomputingresources.

AnexampleofadvanceduserrequirementsinJapanisfoundintheWDTMi-Coreproject.Thegoalofthisprojectistoprovideintegrateddataofbothinternationalocean-observingsatellitesandin-situobservations.Theprojectsintendtodevelopacommoninfrastructurewhichintegrates,manages,andprovidesdata,models,andanalyticalresultsfromoceanrelatedsatellitesandIn-Situobservation.

IncreaseintheNumberandDiversityofUsersInthepast,satellitedatausersweretraditionalscientistsandresearcherswiththecapacitytoobtainandanalyzethedatafordecision-making.Duetotrendsinopendataandcloud-basedhosting(e.g.GoogleandAmazon),therehasbeenasignificantincreaseinthenumberanddiversityofglobalEOdatausers.Nolongercanwesaythattheusersareonlytechnical,directlymanipulatingEOimages.Wemustnowconsiderhundredsandthousandsofnon-expertusersthatrelyonEOderivedproductslikelandcover,vegetationcondition,etc.Examplesofthesenon-expertusersincludelocaldecision-makersutilizingGoogleEarth,“crowdsourcing”projectstocompileEOdata,andtheuseofcommonsmartphonestoaccessEOdata.Therapiddevelopmentoftheearthobservationindustryandthelimitedavailabilityofscientificexpertiseitledtohasfosteredaparadigmofstart-to-finishsciencedelivery,startingwithearthobservationdataselection,todatapreparation,toconductingearthobservationanalytics,andconcludingbyrelatingresultingearthobservationpatternsandtrendstoimplicationstospecificapplicationdomains.However,thegrowingawarenessofthepowerofadvancedearthobservationanalyticshassubstantially

Page 12: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

11

increaseddemandfromtheend-usercommunityforaccessibleandusablevalue-addedproductsbuiltfromdata-cubetechnologieswhichremaininthedomainsolelyoftheearthobservationexpertcommunity.Thisincreaseindemandfortheearthobservationscientificcommunitytodeliveronneedsandrequirementsdomesticallyandinternationallycannotbemetunderthetraditionalparadigmofsciencedelivery.Thus,theparadigmofstart-to-finishsciencedeliveryisrapidlydiminishinginfavourofthedevelopmentofanewparadigmwithend-userstakingownershipofthefinalstagesandearthobservationscientistsworkingtodeveloprichanddiversifiedanalysisreadydatathatenablesscientistsanddecision-makersacrossmultipleapplicationdomains.

Asthenumberofusersandtheirdiversityincreases,therewillbeincreasedquestionsoverthecontrolofEOinformationanditsapplicationtodecision-making.AsstatedbyKarenLitfin(MITPress,1998),therelationshipbetweensatellitetechnologyandstatesovereigntycontinuestobecomemorecomplex.Today,usersincludemultinationalcorporations,scientists,policymakers,grassrootsenvironmentalgroups,andindigenouspeoples.”Withthewidespreaduseofcloudstorageandcomputingservicesthetraditionalgeopoliticalbarriersnolongerexist.Usersarenowfreetointeractanddownloaddatafromcloud-basedservers,thoughtheymust“trust”theseprovidersandadheretotheirservicelevelagreements.Formanyinternationalusers,thisisaverylarge“leapoffaith”andmanyarequitereluctanttousecloud-basedservicesandprefertostoreandanalyzethedatalocallytoensure“ownership”andcontroloftheinformation.Withincreasingdatavolumesandcomplexityofdata,thisapproachisnotsustainableforthefuture,andthisparadigmmustshifttoenabletheseuserstotakefulladvantageofEOdata.

Successfullyabstractingthecomplexityofsensordatawillfreeuserstofocusonthedevelopmentofalgorithmswhichcanrunacrosssensors.Thisisakeyareaforfurtherdevelopmentandisnowonlypossibleduetotheadventofdatacubetechnologieswhichcanapplytechniquessuchasmachinelearningtomodelthevariabilityintargetspectralresponsebetweensensors.Withtherangeofnewsensormissionsonthehorizon,removingtheneedfordeepunderstandingofeachsensorwillbeintegraltotheeffectiveutilisationofthesenewdata.

LimitedInternetinDevelopingCountriesThoughdevelopedcountriesdependontheinternetformostoftheirdataaccessandanalyses,thisisnotalwaysthecasefordevelopingcountries.InarecentCEOSGlobalDataFlow(GDF)StudyforGFOIitwasfoundthat50%ofthestudiedcountrieshadinternetspeedsbelow5Mbps,whichwouldrequire~19daystodownload1TBofdata.Evenwithspeedsimproving,thecostofdownloadsisoftenaprohibitivefactor(http://www.tandfonline.com/doi/full/10.1080/01431160903486693).Withoutconsistentandcosteffectiveinternet,theremustbeotheroptionsforuserstoobtaindataoratleastinteractwithdata.AspartoftheCEOSDataCubeproject,theuseofregionaldatahubs(e.g.SERVIR)thatareclosetousersandhaveimprovedinternetperformancearebeinginvestigated.Otherapproachesareconsideringhostingofdataonlargercloud-basedhubs,suchasGoogleorAmazonWebServices,totakeadvantageofwebmappingservices(WMS)thatallowuserstoworkwiththedataremotely,usetheadvancedcloudcomputingcapabilitiesforanalysisandthenonlydownloadsmallresultingproductsoverlimitedinternetbandwidth.Thissameapproachisalsobeing

Page 13: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

12

implementedinEuropefortheCopernicusServicesto“bringuserstothedata”forinteractingwithSentineldataoverlimitedbandwidthinternetwhileavoidingdownloadoflargedatasets.

Page 14: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

13

3.ThechallengeandopportunityofchanginguserexpectationsandincreasingEOdatavolume,varietyandvelocityonEOsystemsarchitecture

Theimpactofvolume,velocityandvariety

Withindividualmissionsproducing10’sofPetabytesperyearby2020,and100’sofCEOSmissions,thereisnoquestiondatavolumesareundergoingastepchangeingrowthrate.Ingeneral,individualCEOSagenciesfactorinoperationaldatarequirementsaspartofmissiondesignandthisisuniquetothatorganization'sneeds(sharingofdesigninformationandlessonslearnedclearlyoccurs).Whilstvolumeandgrowthrateareclearlymajorchallengesforamission,itisbothvelocityandvarietythatposeamajorchallengeforEOsystemsarchitectureswhenviewedfromaCEOScommunityperspective:

● Largedatavolumesimplyaneedforveryaccurateandstructuredsearchservicessothatusersarenotoverwhelmedwiththevolumestheyneedtoaccess.Filediscoveryisnolongersufficient.

● Higheracquisitionratesandthenewnear-realtimeapplicationsthatbenefitfromthem(e.g.Disastermonitoringforbushfires,Floodmonitoring;Agriculturalmonitoring,etc.)requiretheentireacquisition,calibrationthroughproductgenerationpipelineinitsentiretytobecompletedpriortonextacquisition.Automationisessentialatallstagesandincludesthird-parties.

● Thebenefitofincreasingvarietycanonlyberealizedifthebarrierstoaccessingmultiplecollectiontypessimultaneouslyandconsistentlyarereducedsignificantlysothattheburdenofdiscoveryandintegrationdoesnotscalealongwithvolume,velocityandvariety.

● Datavolumesandvelocityaresuchthatinanincreasingnumberofcases,thevolumeistoolargetomovedatatoalocalanalysisplatformontheavailablenetworks.

● Manylocalanalysisplatforms(e.g.PCs,mobileplatforms,departmentalclusters)arenotlargeenoughtobenefitfromtheincreasingvolumesofdataavailable

● Withincreasedvolumes,automationandthirdpartyapplicationdevelopmentconveyingdataqualityinformationbecomeincreasinglyimportant

EarthObservation(EO)datasystemstodayfacechallengesfromtwodirections.Onthe“push”side,newinstrumentsandmodelsareproducingevergreatervolume,velocityandvarietyofdata.Onthe“pull”side,userexpectationsofthedata,andthesystemsthatservethem,areexpanding.

ThepushsidechallengesarethoseofthewiderBigDatamovement.Volumegrowthstemsfromimprovementsinsuchfactorsassensorresolution,space-to-groundbandwidth,retrievalalgorithmsandthecomputingpowerforprocessingthem.

WithinNASA,forinstance,theEarthObservationdatavolumeof15PBasofJanuary,2016,isexpectedtoincreasebyafactoroftenby2024.Inadditiontofindingaffordable

Page 15: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

14

spaceforallofthedata,thevolumeincreasealsomanifestseitherinmoredatafiles,orlargerdatafiles,orboth,compoundingthedatamanagementproblem.

TheamountofdataandinformationbeinggeneratedbytheCopernicusspaceandservicecomponentsarealsochallengingtraditionaldisseminationchannels.ForthespacecomponentalonethecombinedarchivesofunitsAandBofSentinels-1,-2,-3willamounttoapproximately6PBperyearfrom2018onwards.Extrapolatingfromcurrentusagepatterns,eachproductmaybedownloaded10timesonaverage,amountingto60PBdownloadedperyear,or160TBperday.

ThesourcesofthevarietygrowthlieinthedevelopmentofmoreprocessingalgorithmsextractingyetmoreinformationfromtherawdataandtheincreasingavailabilityofmodeldataalongsideEarthObservationdata.

Thechangeinuserexpectationsonthepullsidehastwosources:technologyadvancementcausesuserstoexpectmoremoderncapabilitiesandinterfaces,suchasbroadlysearchingfordataacrossmanyfederationsofsystemsandthenretrievingdatadirectlyfromthediskonwhichtheyarearchivedandretrievingonlythespecificpieceofthedataneeded,orhighlyinteractive,responsive-designuserinterfaces.However,anothersourceofchangeinuserexpectationsistheexpansionoftheusercommunitiestoincludemoredifferentkindsofusersinsuchdiverseareasasinterdisciplinaryresearch,businessandgovernmentapplicationsandeducation.Withthebroadeninganddiversificationoftheusercommunities,informationaboutdataqualityanddataprovenancewillincreaseinimportanceasusersaccessdatafrommanymoredataproviders.Increasingfreeandopendatapolicieswillreducethehurdlestodataaccessandfacilitatebroaderuseofdata.Thiswillresultinamuchhigherreturnoninvestmentbyorganizationsfortheirspace-borneandgroundsystemsassets.

Section2hasalreadyshownthecurrenttrendtowardsmorenon-EOspecialistusers,theubiquityanddiversityofgeospatialapplicationsandchangingroleofparticipantsinapplicationdevelopment.Inthesubsectionsbelow,welookathowthesechallengesmanifestinandarehandledbythevariousareasofEOdatasystems.

DataDiscovery

ThebiggestchallengeinDataDiscoveryispresentedbytheincreaseinVarietyofdata.Thechiefsourcesofvarietyare:

● Sensor/instrument● Platform● Spatialfootprint● Spatialaggregation● Temporalaggregation● Levelofprocessing● Retrievalalgorithm

Giventhelargevarietyofsourcesandtypesofdata,inevitably,similardataproductsmeetingagivenapplication'sneedbecomeavailablefromvariousdataarchives.Thismakesitdifficultforenduserstosiftthroughtofindthemostappropriatedataproductstomeettheirneeds.Ametadataclearinghousecansimplifythesearchtools’taskof

Page 16: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

15

queryingthenecessarysourcesfordataproductinformation.However,suchaclearinghouserequiresacommonmetadatamodel,suchasISO19115,whichcanprovidesomelevellingtoallowfordataproductcomparisons.Manymetadataclearinghousesstandardizetheirmetadatatoasingle,interoperablemetadataformat,suchasISO19115.However,systemdesignersarenowbecomingawarethattheyneedtocontinuesupportingmultiplemetadatastandardsintheirclearinghouse.Thisismostlyinresponsetoconcernsexpressedbythedataprovidercommunityovertheexpenseinvolvedinconvertingexistingmetadatasystemstosystemscapableofgeneratinganewmetadataformat.Asanexample,inordertocontinuesupportingmultiplemetadatastandards,NASAdesignedamethodtoeasilytranslatefromonesupportedstandardtoanotherandconstructedtheUnifiedMetadataModel(UMM)tosupporttheprocess.

Likewise,itisalsohelpfultoprovideanApplicationProgramInterface(API)toallowthedevelopmentofavarietyofsearchclients,rangingfromsimpledatasearch-and-fetchscriptstofull-featuredwebuserinterfaces.TherearetwocommonstandardsforsuchAPIsusedinthecommunity:theCatalogServicesfortheWeb(CSW)isahighlystructuredAPI,whiletheOpenSearchAPIislightweightandbasedprimarilyonkeywordsearch.Notethatsupplyinga“flagship”searchclient,orreferenceimplementation,canprovidenotonlyausefulsearchtoolforthecommunityinitsownright,butalsoaplatformforaddingnewsearchenginefeaturesandastartingpointforprospectiveapplicationdevelopers.Onearearequiringmoreattentioninthefutureisdealingwiththegrowingdiversityofusercommunities.Forexample,thehighlytechnical,detaileddataproductdescriptionsdemandedbythescienceresearchcommunityareoftennotappropriateorusefulto,say,acitizenscientistorabusinessapplicationowner.

DataAccess

Tousethedataandinformation,usershavetobeabletoaccessthem.Twotypesofaccessareconsidered:

A. eitherthedataandinformationaremovedtotheusers'premises(download/pull/push/broadcastreception)sothattheuserscanprocessthemandobtainfromthemtheresultstheywant;

B. orthedatastaysinthedatacentreandtheprocessingoccursnexttothedatawiththeobtainedresultseithersenttotheusersorstoredonlineforfurtherusebyotherusersdowntheline("bringtheusertothedata"scheme).

Bothtypesofaccessarepossiblebutthecostsinvolvedaredifferent:eitherthecostsconsistofhighernetworkbandwidthtomovethedatafromthestoragefacilitytotheusers'ownprocessinginfrastructureorthecostsconsistofmoreprocessingcapacitiesnexttothedataandoflessnetworkbandwidthformovingtheresultstotheusers(iftheusersstillneedtodownloadthem,ifnot,theresultingvalueaddedproductsmaywellbestoredinthesamedatacentreandpublishedfromtheretobefurtherusedbyothers).

Toreachoveralloptimalefficiencyofmodel(b),itshouldbereinforcedbytheguaranteethatthedatawillalwaysbeavailable,otherwiseuserswillbetemptedtodownloadthe

Page 17: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

16

dataandarchivethemasaninsuranceofpermanentavailabilitywithoutnecessarilyimmediatelyknowingwhattodowiththedata/informationdownloaded.Thisaspectofavailabilityalsoincorporateslatencywherebydataassetsthatrequiresignificantlagperiodsfordeliverywillalsoencourageuserstoduplicatethedataholdings.

Bothmodelsmayscale,howevermodel(a)onlywouldonlyscaleifthebandwidthsarescalable,whilemodel(b)scalabilitymayplayondifferentfacilitieswithmoreloadbalancingpossibilities(storage,processing,bandwidthavailability).Thehigherthevolumeofdataandinformationtobeaccessed,themoreadvantageousmodel(b)islikelytobecome.

TheclearestchallengetoDataAccessarisesfromdatavolumes.Ifstoringthevolumesondiskisunaffordable,accesstothedatathatmustbearchivedontapeissignificantlydegraded,bothinlatencyandoverallthroughput.Thelatencyinturngenerallyrequiresanasynchronousaccessmethod,withnotificationstotheuserondatareadiness.Accordingly,compressiontechniquesaretypicallyappliedwheneverpossible,ideallylosslessinternalcompressionsuchasthatavailablewiththeHierarchicalDataFormat(HDF)andNetworkCommonDataForm(NetCDF)version4.DataAccessisalsocomplicatedbylargerdatavolumeswhichpromptmoreuserstorequestdatasubsetsforjustthedataofinterest.Subsettingiscomplicatedbythefactthatdataproductspecificsubsettingtoolsdonotscalewellwithvariety.Preferablearestandardformats(e.g.,HDFandnetCDF)thataretractabletogeneraltools.Also,somedataservices,suchasthoseofferedbytheOpenGeospatialConsortium’s(OGC)WebCoverageService(WCS)ortheOpenSourceNetworkforaDataAccessProtocol(OPeNDAP)canprovidesubsettingontheflyfordatainstandardformats,overtheInternet,alongwithotherservicessuchasreformattingthatsmoothovertheVarietyaspect.

DataUsage

Mostofthedataarestoredtypicallyinformsconvenienttoproducers.Theseformsarenotnecessarilyconvenientforuserstoaccess.Differentusersneeddifferentaccessmechanisms.Somewantaccessviathetraditionalgranuledownloadaccess;somewantgranulelevelaccessaftersomesubsetting(time,space,bands);somewantpixellevelaccess,i.e.,withoutregardtogranuleboundaries,andsomewantpre-builtstandardizedvalue-addedproducts.Itischallengingtoprovideaccesstospatialsubsetsoflongtimeseriesofdata.Howcanweprovideenoughdescriptiveinformationtotheuserstoenablethetypesofaccesstheyneed?

Giventhevarietyofpotentialapplications,'rightformats'meanthatdatawouldbeneededinseveraldifferentformatsandprocessinglevelsasneededbythedifferentusercommunityserved.Thisincludeslongtimeseriesofhomogeneousdatatomonitorchangesandlongtermtrends(e.g.Climate),lowerleveldata(i.e.Level0)toallowscientiststocontributetoalgorithmsdevelopment,higherlevelproducts(i.e.Level1,Level2andhigher)forresearchandapplicationactivitiesandoperationalservices.Dataneedtobecontinuouslyupgradedandvalorisedtoensurecontinuityofobservationsandcomparabilitywithnewmissiondata(e.g.Sentinels)andfitnessforpurposeforaneffectiveutilisationandexploitation.Asindicatedinthe“EOScienceStrategyforESA”,“Long-term,carefullycalibratedanddocumenteddatasetsoftheEarthsystemderived

Page 18: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

17

fromEOsatelliteswillbecomealegacyofthehighestimportanceforscience,policymakersandsociety”.

Traditionally,formatshavebeenthemostproblematic,butcustomASCIIandbinaryformatshavelargelygivenwaytostandard,self-describingformatssuchasHDFandNetCDF.Thistrendinturnhasgivenrisetothedevelopmentofanumberofversatiledatatoolsthatworkonlargenumbersofdatasets,suchasPanoply,IDV,GrADS,aswellasfindingtheirwayintosupportbycommercialtoolssuchasArcGIS,IDLandMatlab.

However,whilethedataformathasbecomelessofanissue,itisstillimportanttofollowconventionsondatastructuresandattributes(suchastheClimateForecastconvention)toenablethesetoolstobeeffective.Furthermore,therearesomeareas,suchasdiversemapprojections,thatremainchallenging;evenwheretoolssupporttransformationstootherprojections,naiveusersapplyingthesere-projectionsmayunknowinglyintroducesignificantartefactsintotheresult.

DataVolumerepresentsasecondchallengetodatausagebytheenduser,whomaybefacedwithfindingenoughspacetostorethedataorenoughprocessingpowertoanalyzeitinareasonableamountoftime.Toaddressthis,somesystemsofferacertainamountofprocessingatthearchive,whichmayrangefromsophisticatedsubsettingschemestorunningtheuser’salgorithmatthearchive.

Fig.3-1Useranalysisofdatarangesfromfairlysimplevariablesubsettingtobothmorecomplexcontentsubsetting(e.g.,qualityfiltering)andtouser-provided

algorithms.

Page 19: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

18

DataSystemFunctions

Inadditiontotheuser-centricaspectsenumeratedabove,somechallengesfallonthedatasystemsservingthem.Perhapsthemostimportantisthestewardshipofthedata.Thisiscomplicatedbyboththedatavolumeandvariety.Largevolumesmayforceadatacentertoresorttotapebackup.Althoughmediacostsarecheaprelativetodisk,thetapesmustbecontinuallyinspected(read)toensuretheyarestillreadable.Furthermore,thevarietyofdataleviesaresponsibilityofstewardingasmuchoftheavailablecontextaspossible,fromdescriptionsoftheinstrumenttoalgorithmdocumentstoproductspecifications.

Inaddition,sciencedatacentersrequireanumberofbackofficecapabilitiestoplanandmanageevolutionofthesystemwithchangingrequirements,usercommunitiesandtechnologies.Theseincludereliableandthoroughmetricsofdataarchivedanddistributed.Ideally,thesciencedatacenteralsomaintainsarepositoryofdatacitations,inordertogaugetheresearchvalueofdatasets,particularlywhenmanyarebeingmanaged.Thedatavarietyalsoaffectscommunitiesofdiversesciencedatacentersthatcoordinatetheirdevelopment,whichbenefitsfromashareddevelopmentenvironmentwithsuchelementsaswikis,tickettrackingsystems,andcoderepositories.

Identifiedaspirations,constraintsandopenproblems

Whilstdatavolumes,varietyandvelocityareclearlyamajortechnicalchallenge,probablythegreatestchallengetomaximisingvaluefromEOdata,andonthesystem'sarchitectureisthechangingexpectationsofusers.

ThereareanumberofCEOScommunityrelatedissuesthatalsoneedtobeconsideredindevelopingFutureDataArchitectures:

● Costre-allocation-therearepotentialopportunities,particularlywithCloudbasedbusinessmodels,tore-allocatethecostofdatadistributionandcomputationintoamoreuser-paysorientedmodelratherthanhavingtheentireburdensupportedbyanagency.

● EconomicsandPerformanceofCloudcomputingforEOstorageandanalysis-thisremainsanopenproblemcurrently.WhilstCloudpotentiallyhasexcellentscalabilityforanalysisagainstlocaldata,theeconomicsofstoringthedata,sovereigntyandsoftwareperformancearestillbeingassessedforEOapplications.

● Capacitybuilding-withexpectationsofeconomicgrowththroughnewEObusinessesandtheexpandinguserbasetherewillbeincreasingdemandfortrainingandcapacitybuilding.UsersupportandcommunicationtoabroadergroupwillalsobenecessaryplacingpressureonCEOSspaceagencies.

● Standardisationandinteroperability-Standardsarekeytointeroperability.ItispreferabletohaveasmallnumberofstandardstofacilitatesearchandaccesstodatainaninteroperablemanneracrosstheCEOScommunity(searchstandards,controlledkeywords,metadata).Wealsoneedamorediversesetofaccess

Page 20: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

19

standardstosupportactionssuchassubsetting,fileformatconversions,spatialprojectionconversions,tileaccess,andpixellevelaccess.

● Administrativereporting-essentialtomaintaininginvestmentinEOactivitiesisbeingabletomeasuretheROIthroughitsuseandvaluegeneration.Inafuturewheretherearemanymorethirdpartiesdevelopingapplicationsandbusiness,alongwithmassiveautomationandconsumeruseofopendata,itwillbeincreasinglydifficulttocollectmetricsnecessaryforadministrativereportingusingthingslikeuserloginsoragencyportalaccess.Withmachinetomachineconnectivityitwillbenecessarytousealternatemethodstogathersuchinformationwhilstrespectingprivacyissuesandremainingtruetotheprincipleofopendata.TherisktoEOdatabecomingananonymouscontributortomajorapplicationsoutcomesishighasincreasinguseseesitbecometakenforgranted.

Page 21: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

20

4.TheFutureofEODataArchitectures

Notwithstandingtheprogressmadeinrecentyears,thedifficultyoffindingandusingEOdataisstillabarriertorealizingtheirfullpotentialandtoproperlyharness“BigEarthScienceData”forsocietalbenefit.ThisismainlyduetotheinabilityofcurrentparadigmsinarchitecturestokeeppacewiththerapidlychangingEOdatamanagementlandscapeandimpedimentstothefreeflowofdatathroughanalysisworkflowsonsuitablecomputationalhardwarethroughouttheentirelifecycle.Thesimultaneousavailabilityofcomplete,high-quality,trusted,ready-to-useandintegratabledatasetsandoftheenablinginfrastructureallowingtheireffectiveutilisationandexploitationbyanincreasinglydiverseusercommunityiskeytomaximizetheimpactofEOdataassets.

Anenticingpossibility,evidentinmanyofthedevelopmenttrends,ischangingtheparadigmofuseranalysisfromthecurrentoneinwhichusersdownloadtheirowncopiesfrommultipledatacentresinordertoperformlocalanalysisovermultipledataproducts.ProvisioningdataintheCloudmayleadtomoreprocessinginplace(i.e.,intheCloudwiththedata),thusreducingnetworktransfersanduserdatapreparationandmanagementheadaches.Thisinturnmayenablemorecross-sensorandinterdisciplinaryanalysis.Thischangeintheuserparadigmfordataanalysisisleadingtodevelopmentofprocessesthatshort-cutwhattheuserneedstodotostartthedataanalysis.CEOSAnalysisReadyDataforLand(CARD4L)reducestheneedfordeepEOcalibrationexpertiseandbroadenstheaccessibleusercommunity.DataCubetechnologydemonstratedintheCEOSandAustralianGeoscienceDataCubeprojectsshowshowsupportingpixel-basedaccesstoCARD4Lwillallowuserstoaccessthespecificspatial,temporalandsensorrangesofthesatellitedataneededforscienceorindustryapplicationsimprovingtheeaseofuseoftimeseriesandinteroperabledatasets.

Anotherexampleistheshiftincostofstoragevscomputevsdatatransfer.Withincreasingcomputecapacityavailabletheabilitytoprocessdataon-the-flywillbecomearealityandwilldrivedowntherequirementforpersistentstorageofderivedproducts.DataCubes,orderivedproductscouldthenbeconsideredtransientcachesandspunupasrequiredatsmallcostratherthanpersistingthem.

Table4-1summarizesthestateofseveraldataarchitectureoptionsandtheirabilitytoaccomplishkeyuserrequirementsanduserapplications.Thismatrixismeanttocomparetraditionalapproachestodatastorageanddistribution(scenebasedmethods)andnewerinnovativeapproaches(pixel-basedmethods).Itisassumedthatpixel-basedmethodscantakeadvantageofsubsetting(e.g.onlyacquirethespatialandtemporaldataneeded)andcompression.Asummaryoftheassessmentisfoundbelowthetable.

Page 22: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

21

Table4-1:AssessmentofUserRequirementsandUserApplicationsvs.DataArchitectureOptions

Page 23: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

22

Bringingtheusertothedata:EarthObservationVirtualLaboratoriesEOVirtualLaboratories(VL),accessiblethroughwebbrowsers,virtualmachines,mobileapplicationsandotherweb-basedormachine-to-machineinterfaces,areonemeanstoaddresstheobjectivesabove.Suchplatformsarevirtualenvironmentsinwhichtheusers-individuallyorcollaboratively-haveaccesstotherequiredanalysisreadydatasourcesandprocessingtools,asopposedtodownloadingandhandlingthedata‘athome’.Akeyqualityofsuchplatformsisthattheyareshapedbyandscalableaccordingtotheneedsandambitionsofusers,theyco-locatedatacollectionsandcomputationalcapacityandtheyarecomposedofarangeofflexiblyinterconnectedservices,oftenfederated,allowingsubstantialtailoringandre-use.TheseVLplatformsaretypicallyimplementedinthe“Cloud”connectinghundredstoseveralthousandcomputernodesacrossanetworkofdatacentresorinregionalhubswithHighPerformanceDatacapabilities.SuchEOVLplatformsareintendedtobringtogetherthefollowingmainfunctionalities:

● dataforbothEOandnon-EOapplications● powerfulcomputingresources● large-scalestoringandarchivingcapabilities● collaborativetoolsforprocessing,datamining,dataanalysis● concurrentdesignandtestbenchfunctionswithreferencedata● high-bandwidthweb-basedaccess● applicationshopsandmarketplacefunctionalities● communicationtools(socialnetwork)anddocumentation● accountingtoolstomanageresourceutilizationandflexibilityinfreeorpay-for-

usebusinessmodels● securityandprivacyenforcement

ThetermVirtualLaboratory,isn’tuniversalnorexclusiveandtherearemanyothervalidnamesforsuchsystems.Thephraseisusedheretosimplyconveythesetofcharacteristicsthatarecommontosuchenvironments,regardlessofwhatnametheyarereferredto.WiththisdefinitiontheAGDCwhencombinedwithsuitablewebservicesinterface,theCEOSSEODataCube,andtheEuropeanThematicExploitationPlatforms(Fig4-1)areallvalidexampleswithmany,ifnotall,ofthesecharacteristics.

Page 24: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

23

FIG.4-1 - EO Community Platform Concept. Image courtesy of ESA.

Conceptually,thisapproachispavingthewayforan“InformationasaService”scenario(Fig.4-2).EOandnon-EOdataareflexiblyandintelligentlylinkedandcombinedbymeansofmodernICTservices(“InfrastructureasaService”,“SoftwareasaService”,“DataasaService”,etc.)thusincreasinglyintegratingtheEOsectorintotheoveralldigitaleconomy.Theinherentchallengeinsuchanapproachistheorchestrationofheterogeneoussystems,datasets,processingtools,anddistributionplatformsleadingtothecreationofinnovativeandhigh-qualityinformationservicesforabroadrangeofusers.

Fig.4-2 - Model for the generation of “Information as a Service”. Image courtesy of ESA.

Page 25: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

24

Insuchenvironments,evolutionisdrivenbyusercommunitiesandtheirneeds.Ideally,usercommunitieswillhaveaccesstoafullyscalableIT-infrastructureenablingthemtodevelopnewbusinessmodelsandtointroducenewapplicationsandservices.CrowdsourcingPlatforms,whichinsomecasesarealreadyprovidingsignificant“CitizenScience”output,mayprovideanumberofrelevantpointersinthiscontext.

Europeismovinginthisdirectionthroughtheimplementationofthe“EOInnovationEurope”conceptincoordinationbetweentheEuropeanCommission,ESAandotherEuropeanSpaceAgencies,andindustry.The“EOInnovationEurope”concepthastheobjectivetoenablelargescaleexploitationofthecomprehensiveEuropeanEOdataassetsforstimulatinginnovationandtomaximizetheirimpact.SimilararchitecturesexistintheUS,Japan,andoutsidespaceagenciesinthebroaderCEOScommunitywiththeCEOSSEODataCubeprojectdevelopingsuchplatformsforKenyaandColombiaandtheAustralianGeoscienceDataCubecombiningmultipleagencycollectionsobservingAustraliafromtheUS,JapanandEuropeanmissionsintoasingleplatformforbroaderuse.Takentogethertheseactivities,whilststillindevelopment,illustratemuchofthefutureofEOdataarchitectureswithimprovedaccessibilityandgreaterdistributionofcomputationempoweringdiverseend-userapplications.

ArchitecturalchangeArchitecturallytheseEOVLsareanetworkofinteroperableinterconnectedplatformsbuiltaroundcore-enablingelements,opentomulti-sourcefundinginitiativesorimplementationbyindependentorganisationsandrelyingoncommonstandardsforintegration.Thecommoditizationandseparationofcore-enablingelementsbeyondasingleagencyboundarysupportsscalabilityandempowersend-usersandindustrytovalue-addandallowsSpaceAgenciestoutiliselargescalecommercialinfrastructures(e.g.Cloudcomputing)whenitismorecosteffectivetodoso.

Figure4-3illustratestheEOInnovationEuropeapproachwherediverseinstitutionalandcommercialplatformsalreadyexistorarecurrentlybeingimplemented.Basedonfunctionalanalysesandidentificationofbestpractices,EO-InnovationEuropehasbeenstructuredaroundthreeelements:anenablingelement(actingasabackoffice),astimulatingelementandanoutreachelement(actingasafrontoffice).

Page 26: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

25

Fig.4-3 – EO Innovation Europe. Image courtesy of ESA.

Withintheenablingelement(backoffice),a“mutualisation”(i.e.sharing)ofeffortsandfundingbetweenpublicinstitutionsshouldpreventanunnecessaryduplicationofinvestmentsforenablinginfrastructuresandwillstimulatetheexistenceofmanyexploitationplatformsorvalue-addingadd-onsfundedbydifferentpublicandprivateentitiesintheoutreachelement(frontoffice).ComparedtoexistingEOarchitecturesratherthansupplyanendtoendproductorbasedata,theSpaceAgencyprovidestheenablingelementandsuitabledatapreparationandinterfacestoallowdynamicusebythird-partiesforresearchorindustryexploitationasrequired.Theabilityforthird-partiestoquicklyandflexiblycreatenewvaluechainsandprovideinnovativeservicesoverthegenericenablingelementsisexpectedtosignificantlyenlargetheuserbase.

Thechangetoamoredistributedarchitecture,bothintermsofcomponentsandparticipation,isaccompaniedbyseveralarchitecturalprinciplesmanyofwhicharealreadyevidentinthetrendsdiscussedinSection2:

DiscoveryandAccess

1. MachineLevelDiscoveryandAccess:Alldataareavailableforsearchandaccesswithmachine-callableAPIs

2. Cross-agencyDiscovery:Cross-agencydatadiscoveryisseamlessatapixelbasedaccesslevel(pixelandattributelevelsub-setting)

3. DatasetSelectionGuidance:Guidanceisavailableondataselectionbasedonfitnessforpurpose.

4. MetadataNamingConventions:KeymetadatafollowstandardnamingconventionsforVariables,Platforms,Instruments,SpatialResolution,TemporalResolution

5. VirtualCollections:Virtualcollectionscanbeorganized/orientedaroundascienceproblemorTheme(eg.hurricanes,agriculture,algalblooms,fires)containingmultiplesensorcollectionsratherthansinglesensorbasedcollection(eg.Landsat,

Page 27: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

26

Sentinel2).TheNASAESDSWGVirtualCollectionsWorkingGrouphasexploredideasaboutsuch“VirtualCollections”

6. AnalysisReadyData:Preparationanddistributionoftrusted,calibrated,welldocumenteddatafrommultiplesensorsinananalysisreadyformforland,inlandwater,coastalandoceanapplications.

7. Changestometadatarelatedtodiscoveryandqualitytoenablepixel-basedretrievalandfinegrainedqueryofverylargedatacollections(e.g.%ofcloudcoverformyregionandsensorsofinterest,ratherthan%ofcloudcoverinascenefile).

8. Webbasedseamlessvisualisationandbrowsingofentirecollections

Usage

1. IntelligentToolCatalogs:Intelligenttoolcatalogsautomaticallysuggestdataanalytics/visualizationtoolstoworkwiththedata.

2. LiveDataCitation:Publicationsarelinkedtodataandtoolsthatallowinteractionswiththedata.

3. MobileDataandProcessing:Dataandprocessingmovetransparentlyasnecessarytoachieveoptimalperformance.

4. QuantitativeQuality:Alldatahavequantitativemeasuresfordataquality.5. Reproducibility:Scientistscanreproduceotherscientists’researchresultswithhigh

precision.6. High-QualityDocumentation:Concise,ComprehensiveandConsistent

documentationexistsforalldatavariables.7. CapacityBuilding:Arichsetofcapacity-buildingandtranslationmechanismsexists

tofacilitateleveragingdataforusebypeoplewithlimitedliteracyinscienceandadvancedtechnology,and/orEnglish.

8. DataAnalysisatScale:Usersareabletoanalyzetheentiredatarecordforanydatavariableoveranyarbitrarilydefinedarea.

9. DatasetUpgrading:High-valuedatasetsareupgradedasnecessarytofullysupportintherichcapabilitiesavailableinthedatasystems.

Integration

1. Data:EOdatacanbeeasilycompared,merged,fusedand/orassimilatedwith(dependingontheapplication)datafromotheragencies,nationsandotherentities.Thisnotonlyrequiresclarityandqualityofgeospatialandtemporalcharacteristicsbutcomparablecalibrationongeophysicalobservations(e.g.inusingtwodifferentopticalsatelliteswithsimilarsensordetectionwavelengths).

2. ToolsandServices:Toolsandserviceswithinthecommunityareeasytouseinconcertandco-hostedwithlargeEOdatacollections

3. Sharing:TheEOcommunityareabletoshareallscientificresources(data,tools,results,workflows,contextualknowledge)

4. Standardisationofprogramminginterfacesandvocabulariesacrosssensortypestobettersupportdataintegration,discovery,andanalysis.

Page 28: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

27

Infrastructure

1. HostedprocessinginfrastructurewithEOtoolboxes(visualisation,analytics)anddataavailableforIndustryandresearchuse

2. Virtualisationofhardwareandsoftwareservicestosupport“payforwhatyouuse”scalability

3. UseofOpenSourcesoftwarelicensingasamechanismtosupportinnovationbythirdpartiesbuildingoutfromagencyandresearchsuppliedtools

4. Consolidationofcommonarchitecturalcomponentsacrosscollections(samedeliverymechanism/experienceforallsensortypes)

5. Automationandaccelerationofthedatapreparationlifecycle(calibration,atmosphericandterraincorrection)tosupportnearrealtimeanalysis

Page 29: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

28

5.Conclusions

CSIRO,asCEOSChairfor2016,establishedanAd-hocTeamonFutureDataAccess&AnalysisArchitectures(FDA)tosurveythechallengesandopportunitiesaroundEOdataarchitecturesgiventheoperatingenvironmentinwhichgovernmentsponsoredEOprogrammesareworking.

ProgresstodatehasestablishedthatCEOSshouldintensifystrategiceffortsinrelationtoFDAifEOistorealiseitsfullpotentialinsupportofsociety,includingmakingbestuseofallavailableCEOSagencydatasothat:

• densetimeseriesanalysisandapplicationsaremadefeasibleforallusers,particularlyinthe‘landdomain’butnotrestrictedtoitgiventheinteractionsbetweendomains.;

• CEOSeffortsinrelationtothegrandchallengesofclimate,foodsecurity,theSustainableDevelopmentGoals(SDGs),anddisastermitigationcanbesupportedthroughapplicationofEOdataandproductsbyusersacrosssectorsandnations.

Thefirstconclusionoftheteam’s2016effortsistorecognisethatthisactivityisverymuchneededandindeedoverduewithinCEOS.AllCEOSagenciesrecognisetheneedtodomoretoremoveobstaclestodataaccess,analysis,uptake,applicationandrealisationofbenefitstosociety.ThereissignificantactivityacrossCEOSagenciesinthisareawithgreatdiversityinapproachesandcapacities.Thismeansthereisalottobuildon;italsomeansthatmovingforwardinacoordinatedwaywillbemoredifficult.

TheteamwouldliketohighlightkeytrendsthatarecompellingactionbyspaceagenciesintheareaofFDA(presentedinmoredetailinsection2ofthereport):

- Themovetofully‘on-line’datasystems,plustheincreasedsizeandcomplexityofthedatabeingproduced(referredtointhereportasvolume,velocityandvarietyofdata)isoverwhelmingtraditionalapproachestodataarchitectures;

- TheBigDataplayersandtheiradvancedplatforms,populatedwithCEOSagencydata(amongstothers),arechangingexpectationsastohoweasyitcouldandshouldbetoaccess,analyseandapplyEOsatellitedata;

- ThesenewcapabilitiesareprovidingawelcomebroadeningoftheuserbaseforEOsatellitedata,tomoresectorsandmoreusers,manyorwhomarenon-expert,and/ornotfromlargetechnicalinstitutions.Theseeffortsaredemonstratingthatup-frontefforttoremovetheobstaclestoEOdatahandlingandusearepayingdividendstothemainstreamingofEOdataandtoitssocietalimpact.CEOSagenciesmusttakenoteastotheimplicationsforeaseofdatahandlingforCEOSagencymissiondata;

Page 30: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

29

- CEOShasbeenplacingmoreemphasisonsupportinguptakeandapplicationofdata,includingforgrandthemesliketheSDGs,climate,andfoodsecurity.Theseinitiatives,suchasGFOI,havereporteddifficultiesinuseruptakeduetocomplexityofdataaccessandhandling,andthechangesinuserexpectationsfromexposuretoadvancedplatformsofcommercialbigdatacompanies.UsersareseeingsolutionstotraditionalobstaclestoEOdataaccessandapplicationandexpectspaceagenciestoadoptthem.

- Evolutionintheapplicationspaceiscreatingmoredemandforcapabilitytobringtogetherdifferentdatasetstoanswercomplexquestions,overlargeareas,overlongperiods,andincombinationwithother(e.g.socio-economic)non-spacedata;intheterrestrialdomaininparticularthisisfarfromeasytodo.

CEOShas,overtheyears,investedconsiderableeffortintocooperationininteroperabilityofdatadiscoveryandaccess.Effectivefuturecooperationshouldextendthistoencompasscommonworkonuser-datainteraction,facilitationofdataintegration/interoperability,andcompatibleserviceinterfacesforanalysis.CEOSeffortsshouldreflectthetrendsaroundincreasinguseof‘in-place’AnalysisReadyDatatoreplacedatadiscoveryanddownloadasaresponsetothe‘volume,velocity,variety’challenge.

Individualagencystrategiesarequitediverseandinclude:

- bringingtheusertothedata,incontrasttotryingtotransmitlargeamountsof‘rawdata’withtheassociatedcommunications,storageanddatamanagementproblemsbecomingahuge‘barriertoentry’forusers;

- APIs/VirtualLaboratories,enabledbystandards(suchasThematicExploitationPlatforms)thatmakeiteasierfortheworkofsometobeintegratedandlinkedwiththeworkofothers,toaccelerateprogress;

- pre-processingdatatoapointwhereitisameasurementcomparableinspaceandtimewithmeasurementsfrombothothersatelliteinstrumentsandothersectors,helpingisolateapplications(andusers)fromnon-relevant(totheirapplication)changesinthespacesegment(i.e.sensoragnostic);

- novelservicemodels(includingnewopportunitiestointegratecommercialdata‘onthefly’toaugmentproductsprimarilyusingpublicsectordata);

- movingtheburdenofdatapreparationprocessing(calibration,location,atmosphericcorrectionetc)fortheextractionofapplicationinformationfromtheuserstothespaceagencies(suchasAnalysisReadyData);and

- flexibleapproachestocomputing(includingHPCandCloud)thatshowcasetheabilitytoprocessmultipleobservationdatasets,atfullresolution,atcontinentandglobalscales.

FutureDataArchitecturescanbeassumedtoconsistofmultipleapproachestocoverallcircumstancesanduses.TheAd-hocTeamhasnotsoughttojudgeindividualagencyapproachesbutinsteadtoidentifycommongroundforeffectivecooperationthatwill:

Page 31: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

30

- deliverbenefitsbacktoAgencyactivities;

- laythefoundationforAgenciestoworktogether,throughCEOS,toofferamoreintegrated‘wayin’tosatelliteEarthobservationdataanalyticsforusersandglobalintiatives.

AnumberofactivitiesinvolvingcoretechnologiesandexamplesarealreadyunderwayaspilotsinitiatedbyGFOI,theCEOSSEOandLSI-VC,inrelationtoAnalysisReadyDataandtheCEOSDataCube.Inaddition,thereisongoingfundamentalworkwithinWGISSaroundmetadatastandards,dataprovenanceandpreservation,andongoingtechnologyexploration.

ThetechnologyandinfrastructurebehindFDAischangingrapidlyandthechallengeforCEOSwillbetoestablishaprogrammeofworktotakeadvantageoftheseopportunities–withanemphasisonapproachesthatde-riskandsimplifyEOdataforusers,allowinguserstomakeuseofALLavailableandrelevantCEOSagencydata,andthatwillsupportCEOSambitionsinrelationtoitschosengrandchallenges.FutureworkshouldhelpusersbenefitfromallrelevantCEOSdataatallstages–complementingpasteffortswhichhaveemphasisedunityof‘discovery’,withoutfullyaddressingthesignificantchallengesintheanalysisandexploitationofdisparateorincompatibledataafterdiscovery.

RecommendationsOneyeardidnotpermitsufficientcapacityortimetobebroughttobearonwhathasbeenfoundtobeoneofthebiggeststrategicissuesforCEOSAgenciesandforCEOSastheirforumforinternationalcoordination.Theissuesarecomplex,andmoretimeisrequiredtoestablishconsensusamongCEOSagenciesastothemostproductivewayforwardforCEOSasacoordinatingbody-reconcilingthevariousagencyapproachesandprioritieswhilstfindingcommongroundforco-operation.

Asaconsequence,theteamrecommendsthatCEOSadoptatwo-streamapproachtocontinueitsengagementwiththistopic:

- afurtheryearofworktocontinueexplorationofkeyareas,andforfacilitatedstrategicdiscussions;

- paralleleffortstoprogressestablishedCEOSpilotprojectstoensurestrategicdiscussionsaresupportedbyreal-worldevidence.

Thisreportthereforefocusesonlyonshort-termrecommendations.

1. USGSandCSIROhavebothagreedtocontinuetoprovideleadershipfortheTeam,whichwillseeka1-yearextensiontoitsoperationasaCEOSAd-hocTeamattheNovember2016BrisbaneCEOSPlenary.TheAHTrecommendsthatathirdco-chairbeidentifiedfromESA/ECtocontributetotheformulationofarecommendedwayforwardforCEOSin2017,sincethefreeandopenglobaldataprogrammesofboththeUSandEuropewillnecessarilybeattheheartofanystrategyultimatelyadoptedbyCEOS.ItisalsocriticalthatotherAgenciesactivelyengagealso,asthewayforwardmustprovideopportunitiesforallAgenciestocontributeandbenefit.The

Page 32: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

31

AHTrecommendsthatCEOSPlenarydirecttheTeamtoconcludeitsworkintimefordebateanddecisionattheUSGS-hostedPlenaryinOctober2017.

2. InparallelwiththeconclusionoftheAHTstudyandreportin2017,CEOSshouldprogress,accelerate,andintegratethepilotactivitiesalreadyunderwaywithinitssubsidiarygroups,includingandinparticular:

- LSI-VCworkinrelationtodefinitionsforCEOSAnalysisReadyData(ARD)forLand(CARD4L)andguidelinesforitsuse;and

- SEO/SDCGworkinrelationtodemonstrationsofaCEOSDataCubeanditsbenefitsforbothdataprovidersanddatausers,andasawayofengagingwithdonorinstitutionsonpracticalcapacitydevelopmentprojects.

TheAHTrecommendsthat2017effortsbedesignedaroundthefollowingobjectives:

- Asmall-scaledemonstrationoftheproductionandapplicationofCEOSARDanditsvaluetobothdataprovidersanddatausers,drawingonkeyfreeandopenglobalprogrammes;thisshouldhelpallCEOSagenciesdevelopanunderstandingastothenatureandscaleoftheactivityofdataproductionandintegration–andimportantlyshouldincorporateafeedbackloopfromengageduserorganisationsandfundingagenciesastolessonslearnedandthebenefitsforthem;andshouldemphasisewhatspaceagenciesgetbackbywayofdatauptakeasthemotivationforexpansionoftheconceptofCEOSARD;

- ContinueddevelopmentoftheCEOSDataCubeasawidely-supported,opensourceexampleofthebenefitsofHPCapproachestosupporteaseofuseofEOdata.Significantworkisalreadyunderway,documentedinaninformal3-yearWorkPlanpreparedbytheSEO,andincludesapilotdemonstrationwiththeGovernmentofColombiaforGFOI(andother)purposes.ThispilotcouldbeemployedasthepracticalapplicationfocusfortheARDproductiontrial–combiningmultipleopticalandradardatasetsusingtheCEOSARDdefinitions,inaframeworkthatwouldguaranteeuserfeedbackandpracticallessonslearned.TheSEOandSDCGhavebothindicatedawillingnesstosupportsuchanintegrationofexistingCEOSactivitiesrelevanttoFDA.TheworkwoulddemonstratethebenefitsofdensetimeseriesdataforGFOIpurposes,amongstothers.

ThisapproachwouldintegratedifferentaspectsofCEOSFDAworkinanapplicationfocusedactivitythatwouldguaranteeuserfeedbackanddevelopCEOSexperiencearoundtheproductionof,andbenefitsfrom,ARD.TheAHTcouldprovidethenecessaryintegrationeffortfortheseeffortsin2017toinformthedesignoftheCEOSFDAstrategywhilstcontributingactivitieswouldcontinueunderleadershipofLSI-VC,SEO,andSDCG,withexpertinputfromgroupssuchasWGISSandWGCV.Suchademonstrationwouldlikelyrequire12-24months.ItcouldeffectivelybuildontheexistingfoundationsunderwaywithinGFOI/SDCGandcouldadapttothefinalFDAReportrecommendationsinlate2017asneeded.

TheSEO,asthetechnicalleadforthisinitiative,shouldbeassuredsupportfromacoregroupofAgencieswillingtocontributetoevolutionofthetechnicaldesignanddevelopmentofnewcapability.

Page 33: CEOS Future Data Access & Analysis Architectures Studyceos.org › document_management › Meetings › Plenary › 30 › ... · 2016-10-17 · 1 CEOS Future Data Access & Analysis

32

3. CEOSshouldcontinuesupportforongoingWGISSworkinrelationto:discoverysearchengineoptimization(searchrelevancy,keywordsearch,persistentidentifiers);accesscommonstandardsforinteroperabilityofproductformats(metadata/data)andapplicationprograminterface(API)foranalyticsanddataaccessservices;explorationofemergingbigdataservicesincludingcloudcomputing.WGISSshouldcontinuetoprovideguidancenotesandbestpracticesthatagenciescantakeonboardwhenplanningfutureinvestments.

4. Inthecourseofthe2016workoftheAHTfurthersuggestionsforpilotactivitieshaveemerged,includingnotablyaproposalfromESAforaCEOSThematicExploitationPlatformforDisasters.TheleadershipoftheAHTshouldestablishtheavailablecapacityandleadershipforthisandanyotherrelevantproposalsemergingfromthework,andbringtoCEOS(atthekeymeetingsofSITorPlenary)suitableproposalsfordebateandfurtherdevelopmentasappropriate.Suchproposalsmightbeevolutionsofexistingworkandwithinexistinggroupsormightmeritestablishmentofnewinitiativesandthemeanstomanagethem.AnynewinitiativesshouldbedevelopedinaccordancewiththeCEOSProcessPaper.USGS,asincomingCEOSChairhasalreadyproposedtoundertake2017activitiesinrelationtointeroperabilityofmoderateresolutionopticaldataproductsandthesecancontributetotheaboveandemergingproposalsrelatedtotheFDAwork.

5. FutureCEOSdecisionsonastrategyaroundFDAissueswillbebothstrategicandsensitivesincethebroadercontextinvolvesactivitiesandcompetitivenessofnational/regionalindustriesandcompaniesinrelationtoEOdatauptakeandapplications.Yettheconsequencesaresocentraltothefuturesuccessandhealthofgovernment-sponsoredEOprogrammes,andtotheeffectivenessofCEOSasacoordinationbodywithsignificantuser-facingglobalinitiatives,thatCEOSmustidentifyaworkablecooperationpathofcommoninteresttoitsagencies.TheAHTrecommendsthatextensiveCEOSPrincipalinputbeassuredinthedevelopmentofthefinalreportonAHTmattersandthatCEOSChairandSITChaircooperatein2017toensurethatthewayforwardisformulatedinlightofinputsfrombothworking-levelCEOScontributors(liketheAHT)andthosefromCEOSPrincipalsandLeadership.