Preserving State and Local Government Digital Geospatial Data
North Carolina Partnerships
Steve Morris
NCSU Libraries
Workshop on Archiving Digital Cartography and Geoinformation
December 4, 2008
One of eight initial collection building projects in the Library of Congress-funded NDIIPP (National Digital Information Infrastructure and Preservation Program)
Lead organizations: North Carolina State University Libraries and North Carolina Center for Geographic Information & Analysis (NCCGIA)
Focus: State and local government geospatial data in NC
Repository development as catalyst for discussion
Goal: Engage spatial data infrastructure (SDI) in data archiving and preservation
Initial 3-year project extended to March 2009
NC Geospatial Data Archiving Project
Repository Goal
• Capture at-risk data
• Explore technical and organizational challenges
Project End Goal
• Data Producers: Improved temporal data management practices
• Archives: More efficient means of acquiring and preserving data
• Progress towards best practices
Temporal data management vs. long-term preservation
NCGDAP Project Goals
Spatial Data Infrastructure Role in Archiving
• Metadata standards and outreach: metadata quality, best practices
• Inventories: reduce “contact fatigue”, shareable info store
• Content exchange networks: leverage more compelling business reasons to put data in motion; automate process, add technical & administrative metadata
• Framework data communities: snapshot frequency, schemas, format strategies
NCGDAP Data Types – Digital Orthophotography
• All 100 NC counties with orthos
• 1-5 flight years per county
• 200-300 GB per flight
NCGDAP Data Types – Vector Data
• Point, line, and polygon
• Attached attribute data
• Some layers frequently updated
• Cadastral (tax parcels)
• Street centerlines
• Zoning
• Topographic contours
• Public utilities
• School, sheriff, fire
• Voting precincts
• More …
Frequent update
More detailed, current, and accurate than state/federal data sources
Downtown Raleigh Near State Capitol
2005 Wake County Ortho
Imagery = Durable: static, simple structure, mostly open formats
Vector data = Volatile: frequent update, complex structure, mostly proprietary formats
Carrboro, NC : Population 17,797 (2005 est.)
24 downloadable GIS data layers
4 OGC WMS services (web services)
6 web mapping applications
9 downloadable PDF map layers
Value in Older Data: Cultural Heritage
Future uses of data are difficult to anticipate (as with Sanborn Maps)
Value in Older Data: Solving Business Problems
Suburban Development 1993/2002
Near Mecklenburg County-Cabarrus County NC border
Land use change analysis
Real estate trends analysis
Site location analysis
Disaster response
Resolution of legal challenges
Impervious surface maps
Industry focus on “latest and greatest” data
• Industry temporally impaired from the point of view of data availability, software support, etc.
“Kill and fill” as a common approach to data management (past versions of vector data lost)
Loss of memory about the data
Of superseded county orthophoto flights in NC:
• Only 22% recorded in the state’s GIS inventory
• Only 30% accessible through county map servers
Some older inventories only available through Internet Archive
Problem: Lack of Temporal Data
Complex vector formats: multi-file, multi-format
• No non-proprietary, well-supported format for vector data
Shift to web services-based access
• Data becoming more ephemeral
Often: Inadequate or nonexistent metadata
• Impedes discovery and use
Increasing use of spatial databases for data management
• The whole is greater than the sum of the parts but the whole is very hard to preserve
Content packaging
• No geospatial industry standard
Temporal Challenges with Geospatial Data
Problem: Putting the Data in Motion
Most costly part of archive development is identifying, negotiating acquisition, and then transferring data
Local agency “contact fatigue” resulting from repeated state, federal, and university requests for data
Archive development is low priority – leverage other business uses that can put the data in motion
• Continuity of operations
• Highway planning
• Floodplain mapping
Objective
• Minimize direct contacts
• Document data
• Clarify rights
• Routinize transfer
Problem: Metadata
Survey of current archiving practice among NC counties and municipalities
Metadata is often asynchronous, inconsistently structured, incomplete, or missing.
Chart: Metadata archived with data?
Categories: FGDC, Locally Defined, NC OneMap Starter Block, None
Values: 25%, 9%, 6%, 60%
Problem: Content Packaging
XML Database Export
TIFF Images
• Pixel Value and Header file
• World file
• Coordinate System file
• Metadata file
Shapefiles
• Geometry file
• Index file
• Attribute file
• Metadata file
• Coordinate System file
• Spatial Index files
Potential Ingest Objects
• Complex multi-file, multi-format objects
• Shared ancillary components
• Need to add administrative & technical metadata beyond geospatial metadata
Metadata Exchange Format (MEF) in GeoNetwork is a form of content packaging
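The multi-file packaging problem described above can be illustrated with a minimal sketch. It assumes shapefile components sit side by side in one directory and share a base name; the function names, the manifest layout, and the choice of SHA-256 checksums are hypothetical illustrations, not the NCGDAP ingest design and not the MEF format.

```python
import hashlib
from pathlib import Path

# Common shapefile component extensions, mapped to the roles listed on the slide.
SHAPEFILE_ROLES = {
    ".shp": "geometry",
    ".shx": "index",
    ".dbf": "attributes",
    ".prj": "coordinate system",
    ".shp.xml": "metadata",
    ".sbn": "spatial index",
    ".sbx": "spatial index",
}
# Components without which the shapefile cannot be read.
REQUIRED = {".shp", ".shx", ".dbf"}


def split_name(path):
    """Return (base name, extension) for a known component, else None."""
    lowered = path.name.lower()
    # Try the longest extension first so ".shp.xml" is not mistaken for ".xml".
    for ext in sorted(SHAPEFILE_ROLES, key=len, reverse=True):
        if lowered.endswith(ext):
            return path.name[: -len(ext)], ext
    return None


def build_manifests(directory):
    """Group component files by base name into one manifest per dataset."""
    manifests = {}
    for path in sorted(Path(directory).iterdir()):
        parts = split_name(path) if path.is_file() else None
        if parts is None:
            continue
        base, ext = parts
        entry = manifests.setdefault(base, {"files": []})
        entry["files"].append({
            "name": path.name,
            "ext": ext,
            "role": SHAPEFILE_ROLES[ext],
            # Fixity value: a piece of technical metadata added at ingest.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    for entry in manifests.values():
        present = {f["ext"] for f in entry["files"]}
        entry["missing"] = sorted(REQUIRED - present)
        entry["has_metadata"] = ".shp.xml" in present
    return manifests
```

Calling `build_manifests` on a county delivery directory would flag datasets that arrive without an attribute file or without a metadata record, which is the kind of check an archive might run before accepting a transfer. Base names are compared case-sensitively here; a real ingest tool would likely need to normalize them.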
Technical solutions: How do we preserve acquired content over the long term?
Cultural/Organizational solutions: How do we make the data more preservable—and more prone to be preserved—from point of production?
Current use and data sharing requirements – not archiving needs – are most likely to drive improved preservability of content and improvement of metadata
Different Ways to Approach Preservation
Preservation Approaches: Temporal Data Snapshots
Issue: How frequently should county and municipal vector data layers be captured in archives?
Parcels, centerlines, jurisdictions, zoning, …
Parcel Boundary Changes 2001-2004, North Raleigh, NC
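One way to inform the snapshot-frequency question is to measure how much a layer changes between two captures. The sketch below is an illustration only: it assumes each snapshot has already been loaded into a dictionary keyed by a stable feature ID (such as a parcel number), with geometry as a text string and attributes as a dictionary. The function names and the 5% threshold are hypothetical and do not come from the project.

```python
import hashlib


def fingerprint(record):
    """Hash one feature's geometry text and attributes in a stable order."""
    geometry, attributes = record
    pairs = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(f"{geometry}|{pairs}".encode("utf-8")).hexdigest()


def compare_snapshots(old, new):
    """Summarize added, removed, and changed features between two snapshots."""
    old_ids, new_ids = set(old), set(new)
    added = new_ids - old_ids
    removed = old_ids - new_ids
    changed = {
        fid for fid in old_ids & new_ids
        if fingerprint(old[fid]) != fingerprint(new[fid])
    }
    total = len(old_ids | new_ids)
    touched = len(added) + len(removed) + len(changed)
    return {
        "added": sorted(added),
        "removed": sorted(removed),
        "changed": sorted(changed),
        "change_fraction": touched / total if total else 0.0,
    }


def should_capture(summary, threshold=0.05):
    """Hypothetical rule: archive a new snapshot once enough features differ."""
    return summary["change_fraction"] >= threshold
```

A fixed threshold is only one possible policy. The surveys of data custodians described in this section are aimed at learning which capture intervals actually match local production patterns and uses.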
How often should continually changing vector datasets be captured?
Tap into data custodian understanding of production patterns and uses
Tap into local innovation
Learn about local business drivers for data archiving
2006 and 2008 surveys of NC cities and counties
2008 survey of archival practice in state agencies in NC
Planned survey of data users in NC
http://www.nconemap.com/AboutNCOneMap/tabid/289/Default.aspx#preservation
NC Frequency of Data Capture Surveys
Preservation Approaches: Original Data vs. “Desiccated” Data
Complex data representations can be made more preservable (and less useful) through simplification
Complex documents may be very hard to preserve over time
• GIS project files
• Layer definitions
• Web services or API interactions
Image outputs capture some sense of final product--but lose underlying data intelligence
GeoMAPP Multistate project: Engagement with ESRI on complex project archiving issues
Capturing Complex and Ephemeral Data Representations
Desiccated data: PDF and GeoPDF
Counterpart to analog map = datasets plus data models, symbolization, classification, annotation, etc.
More data intelligence survives in PDF documents than survives in most other “desiccated” formats
Explosion of geospatial PDF content recently
Standards issues
• GeoPDF: formerly proprietary TerraGo technology now going through OGC standards process
• PDF an open ISO standard
• Open PDF variants created through ISO standards process (PDF/E, PDF/X, PDF/A, …)
NCGDAP approach: PDF content retained in addition to, NOT instead of, original data
Geospatial PDF Trends
Changes in the Domain: New Location-Based Content
Present-day value in location-based services and mobile applications
Street Views
Oblique Imagery
3D Images
Future value as cultural heritage resource
More descriptive of place and function than spatial data
Ortho image
GICC Archival and Long-Term Access Committee
Geo Multistate Archival and Preservation Partnership (GeoMAPP)
OGC Data Preservation Working Group
Moving Forward
Nov. 2007: NC Geographic Information Coordinating Council (GICC): Ten Recommendations in Support of Geospatial Data Sharing released
• Recommendation: “Establish archive and long term data access strategies”
• Suggested best practices include: “Establish a policy and procedure for the provision of access to historic data, especially for framework data layers.”
Community Response to the Data Archiving Challenge
Initiated Feb. 2008 in response to agency requests for guidance on temporal data management
Federal, state, regional, and local agency representation
Key focus
• Best practices for data snapshots and retention
• State Archives processes: appraisal, selection, retention schedules, etc.
• Who, What, Why, When, Where, How
Final Report delivered to GICC in November 2008
NC GICC Archival and Long-Term Access Committee
Lead organizations: North Carolina Center for Geographic Information & Analysis (NCCGIA), State Archives of NC, with Library of Congress
Partners:
• State geospatial organizations of Kentucky and Utah
• State Archives of Kentucky and Utah
• NCSU Libraries in catalytic/advisory role
State-to-state and geo-to-Archives collaboration
2-year project: Nov. 2007-Dec. 2009
Archives as part of Spatial Data Infrastructure
GeoMAPP: Geospatial Multistate Archival and Preservation Partnership
Formed Dec. 2006
Engage archival community
Find points of intersection with other OGC activities:
• GML for archiving
• Content packaging
• Large scale data transfers
• Time in decision support
OGC Data Preservation Working Group
“Supporting temporal analysis requirements” gets more attention than “archiving and preservation”
Leverage existing infrastructure
• Current data sharing needs drive infrastructure improvements that help archiving
• Leverage business needs that are more compelling than preservation (e.g., continuity of operations)
Facilitate stakeholder ownership of the solutions
Mine state and local archiving innovations
Conclusions
Thank You!
Contact:
Steve Morris
Head, Digital Library Initiatives
North Carolina State University Libraries
Steven_Morris@ncsu.edu
NCGDAP:
http://www.lib.ncsu.edu/ncgdap
GeoMAPP
http://www.geomapp.com