Linking Literature to Astronomy Data with DOIs
Some Challenges
Sarah Weissman – STScI/MAST
Code4Lib DMV 2017
STScI/MAST – who are we?
• Hubble! JWST!
• …in Baltimore!
Image credits: HST, NASA 2009 via hubblesite.org; JWST, Northrop Grumman via webbtelescope.org; Divine from Pink Flamingos, via baltimoreorless.com; Mikulski, STScI via YouTube
Data DOIs
• DOI = Digital Object Identifier. It's a permanent link to a digital object.
• URL + ID + metadata container
• Data DOI demo
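
The "URL + ID" part of that description can be sketched in a few lines of Python. This is only an illustration: it assumes the public resolver at https://doi.org/, the helper names `split_doi` and `resolver_url` are made up for this example, and the DOI is the illustrative string from the next slide, used here only as text.

```python
from typing import Tuple
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.strip().partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Turn the ID into the URL a browser or script would follow."""
    prefix, suffix = split_doi(doi)
    # Percent-encode characters that are unsafe in a URL path; keep "/" as is.
    return DOI_RESOLVER + prefix + "/" + quote(suffix, safe="/")


example = "10.7059/T9G0bled33"  # illustrative string taken from the slides
print(split_doi(example))
print(resolver_url(example))
```

The prefix identifies who registered the DOI and the suffix is whatever that registrant chose; the metadata lives with the registration agency, not in the string itself.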
What are DOIs good for?
• Globally resolvable (kind of like URLs)
• Machine readable (like URLs)
• Persistent links (as long as you update them)
• Usually come packaged with some kind of metadata (if you can find it)
• Not easily bookmarked
• Obscure their destination (like bit.ly)
• Have to be updated
• Meant for machines, not humans (10.7059/T9G0bled33)
By Williams, Henry Smith, 1863-; Williams, Edward Huntington, 1868-1944, joint author [No restrictions], via Wikimedia Commons
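
One way to "find" the metadata mentioned above is DOI content negotiation: ask the resolver for a metadata format instead of the landing page. The sketch below only builds the request and does not send it. The DOI is a placeholder, `metadata_request` is a name invented for this example, and which media types are honored depends on the registration agency, so treat the default as an assumption.

```python
import urllib.request


def metadata_request(doi, media_type="application/vnd.citationstyles.csl+json"):
    """Build (but do not send) a content-negotiation request for a DOI."""
    return urllib.request.Request(
        "https://doi.org/" + doi,
        headers={"Accept": media_type},  # ask for metadata, not the HTML page
    )


req = metadata_request("10.5555/EXAMPLE-1")  # placeholder DOI
print(req.full_url, req.get_header("Accept"))
# To fetch for real: json.load(urllib.request.urlopen(req))
```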
Why Data DOIs
• Allow astronomers to link to their data in a standardized way
• Convenience
• Reproducibility, openness
• Make telescope bibliographies easier
• Largely a manual process currently
• Archive planning
• Justify funding
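
As a rough illustration of how data DOIs could take some of the manual work out of a telescope bibliography, the sketch below scans the text of a paper for DOIs carrying an archive's prefix. The pattern is deliberately loose, the prefix `10.5555` and the sample text are placeholders, and `find_data_dois` is a hypothetical helper, not part of any MAST tool.

```python
import re

# Loose DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_data_dois(text, prefixes):
    """Return DOIs in `text` whose prefix is in `prefixes`, in order, without repeats."""
    found = []
    for match in DOI_PATTERN.finditer(text):
        # Drop trailing sentence punctuation (this would also trim a DOI
        # that legitimately ends in one of these characters).
        doi = match.group(0).rstrip(".,;)")
        prefix = doi.split("/", 1)[0]
        if prefix in prefixes and doi not in found:
            found.append(doi)
    return found


paper = (
    "Data are available at doi:10.5555/ABC123. "
    "See also 10.9999/other, and 10.5555/ABC123."
)
print(find_data_dois(paper, {"10.5555"}))
```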
Data DOIs at MAST
• Collaboration between MAST, AAS Publishing and the STScI Library
• Debuted our DOI service in April 2016, currently in Beta mode
• ~14 STScI authors have created DOIs for publication
• Fall/Winter 2017 - Plan to open service to 12 other institutions.
• Links:
• http://archive.stsci.edu/doi/search/ (Main DOI entry point)
• https://mast.stsci.edu/portal/DOI/help (DOI API documentation)
Challenges – Permanence & Uniqueness
• How to make sure that your DOI links keep working?
• Landing page + service (demo)
• YALI – yet another level of indirection
• How to link to "data"?
• Data is often the not-well-defined glob of stuff (files, database records)
• What level to link to your data?
• Observation, data product
• Once a scientist downloads data, they typically transform it, so it no longer resembles its form in the archive.
• Had a data model (CAOM), so we used it
• Even given this, things are messy. ID formats not well defined and not actually unique!
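
The "landing page + service" idea and the ID-uniqueness problem can be sketched together. This is a toy model, not CAOM and not the MAST implementation: the class names, the `archive.example.org` URLs, the placeholder DOI and the normalization rule in `duplicate_ids` are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataCollection:
    """What a data DOI points at: a landing page listing archive records."""
    doi: str
    title: str
    observation_ids: List[str] = field(default_factory=list)


class LandingPageService:
    """Hypothetical extra level of indirection between a DOI and archive URLs."""

    def __init__(self, archive_base: str):
        self.archive_base = archive_base.rstrip("/")
        self._collections: Dict[str, DataCollection] = {}

    def register(self, collection: DataCollection) -> None:
        self._collections[collection.doi] = collection

    def landing_page(self, doi: str) -> Dict[str, object]:
        c = self._collections[doi]
        # Archive links are computed at request time, so the archive can be
        # reorganized without changing the target registered for the DOI.
        links = [f"{self.archive_base}/{oid}" for oid in c.observation_ids]
        return {"doi": c.doi, "title": c.title, "links": links}


def duplicate_ids(ids: List[str]) -> List[str]:
    """IDs that collide once whitespace and letter case are ignored."""
    counts = Counter(i.strip().lower() for i in ids)
    return sorted(i for i, n in counts.items() if n > 1)


service = LandingPageService("https://archive.example.org/obs")
service.register(DataCollection("10.5555/EXAMPLE-1", "Example", ["obs_001", "obs_002"]))
print(service.landing_page("10.5555/EXAMPLE-1")["links"])

service.archive_base = "https://new-archive.example.org/data"  # archive moves
print(service.landing_page("10.5555/EXAMPLE-1")["links"])      # DOI unchanged

print(duplicate_ids(["obs_001", "OBS_001 ", "obs_002"]))
```

The DOI only ever points at the landing page; everything behind it can change, which is the extra level of indirection the slide refers to.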
Challenges – Buy-in from publishers
• Luckily we had a good working relationship with AAS Journals and EJpress
• Luckily the world of astronomy wrt data and publishing is relatively open. (E.g. http://adswww.harvard.edu/)
• Publishers aren't building their own software.
• Build relationships with each publisher
• (Publisher has to build up programs with each data center.)
Challenges – Metadata
• There are a number of standards for metadata – DataCite, ERC, DC, CrossRef
• We went with DataCite **BUT** we don't want data DOIs to be first-class citable objects.
• Ad hoc collections of data that could in theory change
• Usually when an astronomer publishes a dataset there is a paper and THAT should be cited
• DataCite has domain-specific elements (relatedIdentifierType), which makes it hard to use for general purpose metadata.
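
To make the "cite the paper, not the data DOI" point concrete, here is a sketch of a metadata record that points from a data collection to its paper. The property names are modeled loosely on DataCite's JSON (including `relatedIdentifierType`) and should be checked against the schema before use. The DOIs are placeholders, `datacite_like_record` is a hypothetical helper, and the choice of `IsSupplementTo` as the relation is an assumption for this example rather than something stated in the talk.

```python
import json


def datacite_like_record(doi, title, creators, publisher, year,
                         paper_id, paper_id_type="DOI"):
    """Build a simplified, DataCite-style record for a data collection."""
    return {
        "doi": doi,
        "titles": [{"title": title}],
        "creators": [{"name": name} for name in creators],
        "publisher": publisher,
        "publicationYear": year,
        "types": {"resourceTypeGeneral": "Dataset"},
        # The link to the paper that should actually be cited.
        "relatedIdentifiers": [{
            "relatedIdentifier": paper_id,
            "relatedIdentifierType": paper_id_type,
            "relationType": "IsSupplementTo",  # assumed relation
        }],
    }


record = datacite_like_record(
    doi="10.5555/EXAMPLE-1",               # placeholder data DOI
    title="Example data collection",
    creators=["Astronomer, A."],
    publisher="Example Archive",
    year=2017,
    paper_id="10.5555/example-paper",      # placeholder paper DOI
)
print(json.dumps(record, indent=2))
```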
Challenges – Large Data
• Limitations on our software (JavaScript) only allow users to work with so much data at once.
• Large datasets like catalogs can contain millions, even billions of rows; how to efficiently represent any subset of this data?
Robert Williams and the Hubble Deep Field Team (STScI) and NASA
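
The slide leaves the subset question open. One possible representation, sketched here purely as an illustration, stores a selection of catalog rows as inclusive ranges instead of individual row numbers. It only pays off when selections are mostly contiguous; storing the query that produced the selection is another option. The function names are invented for this example.

```python
def to_ranges(row_ids):
    """Collapse row numbers into sorted, inclusive (start, end) runs."""
    ranges = []
    for rid in sorted(set(row_ids)):
        if ranges and rid == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], rid)  # extend the current run
        else:
            ranges.append((rid, rid))          # start a new run
    return ranges


def contains(ranges, rid):
    """Binary search for a row number in sorted, non-overlapping ranges."""
    lo, hi = 0, len(ranges) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end = ranges[mid]
        if rid < start:
            hi = mid - 1
        elif rid > end:
            lo = mid + 1
        else:
            return True
    return False


subset = to_ranges([5, 1, 2, 3, 9, 10])
print(subset)
print(contains(subset, 2), contains(subset, 4))
```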
Challenges – API
TFW:
• You want your DOI landing page URL to contain the DOI, but it doesn't exist yet.
• Your API is really a thinly veiled proxy for another API.
• Your API is supposed to be generally applicable, but it's actually inextricably linked to this gddmn JavaScript GUI.
• Your identifiers aren't recognized by DataCite so you have to use a customized metadata format.
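
The first item, wanting the DOI in the landing page URL before the DOI exists, has a possible workaround when the registrar accepts a suffix chosen by the client: pick the suffix locally, build the URL from it, then register the pair in one step. The sketch below assumes exactly that. The prefix, base URL and the `send` callable are placeholders, and services that assign suffixes themselves would need a create-then-update flow instead.

```python
import secrets

PREFIX = "10.5555"                                   # placeholder DOI prefix
LANDING_BASE = "https://archive.example.org/doi/"    # hypothetical landing pages


def new_suffix(nbytes=5):
    """Random suffix chosen locally, before anything is registered."""
    return secrets.token_hex(nbytes).upper()


def plan_doi(suffix=None):
    """Decide the DOI and its landing page URL together."""
    suffix = suffix or new_suffix()
    doi = f"{PREFIX}/{suffix}"
    return {"doi": doi, "landing_url": LANDING_BASE + doi}


def register(plan, send):
    """Hand the finished pair to a registrar client `send` (hypothetical)."""
    return send(plan["doi"], plan["landing_url"])


plan = plan_doi("T9EXAMPLE")
print(plan["landing_url"])
# A stand-in for a real registrar call, just to show the shape of the flow.
print(register(plan, lambda doi, url: f"registered {doi} -> {url}"))
```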
Future Work
• Get out of Beta. Expand to more users.
• Fully integrate with our data search tool (mast.stsci.edu). Right now we are just a client.
• Provide more links between related DOIs, related literature.
• **Use data tagging to build tools for data enrichment.**
Questions?
• sweissman [at] stsci.edu