WorldMap: A Spatial Infrastructure to Support Teaching and Research (Brown Bag Talk)
DESCRIPTION
The WorldMap platform (http://worldmap.harvard.edu) is the largest open source collaborative mapping system in the world, with over 13,000 map layers contributed by thousands of users from Harvard and around the world. Researchers may upload large spatial datasets to the system, create data-driven visualizations, edit data, and control access. Users may keep their data private, share it in groups, or publish it to the world. The user base is interdisciplinary, including scholars from the humanities, social sciences, sciences, public health, design, planning, and other fields. All are able to access, view, and use one another’s data, whether online, via map services, or by downloading. Work is underway to create and maintain a global registry of map services, a step closer to one-stop access for public geospatial data. Another project is building tools to support the visualization of spatial datasets with over a billion features. Collaborations are underway with groups inside Harvard, such as Dataverse, HarvardX, and various departments, and with groups outside Harvard, such as Cornell University and the University of Pennsylvania. Major additional contributors to the underlying source code include the World Bank, the U.S. State Department, and the United Nations. The source code for the WorldMap platform is available on GitHub at https://github.com/cga-harvard/cga-worldmap.
Location: E25-202
Discussant: Ben Lewis is system architect and project manager for WorldMap, an open source infrastructure that supports collaborative research centered on geospatial information. Before joining Harvard, Ben was a project manager with Advanced Technology Solutions of Pennsylvania, where he led the company in adopting platform-independent approaches to GIS system development. Ben studied Chinese at the University of Wisconsin and has a master's in planning from the University of Pennsylvania. After Penn, Ben helped start the GIS Lab at U.C. Berkeley, founded the GIS group for the transportation engineering firm McCormick Taylor, and coordinated the Land Acquisition Mapping System for the South Florida Water Management District. Ben is especially interested in technologies that lower the barrier to spatial technology access.
Information Science Brown Bag talks, hosted by the Program on Information Science, consist of regular discussions and brainstorming sessions on all aspects of information science and on uses of information science and technology to assess and solve institutional, social, and research problems. These are informal talks. Discussions are often inspired by real-world problems being faced by the lead discussant.
TRANSCRIPT
A Spatial Platform to Support Research and Collaboration
http://worldmap.harvard.edu
Ben Lewis [email protected]
MIT, Cambridge, September 2014
Outline
• Why lat/long is valuable and what research support operations should do about it
• WorldMap as straw man
  – Overview of current system
  – New projects
    • Global services registry (OGP integration?)
    • Dataverse integration
    • HarvardX integration
    • Neatline integration
• Discussion
Lat/long (location) as organizing facet
http://www.architectmagazine.com/technology/meet-the-geodesigner.aspx
Spatial search
• Improve access to both spatial and non-spatial materials with a “spatial visualization facet”
  – Books, documents
  – Paper maps
  – Local spatial data
  – Remote spatial data
  – Local map display and processing services
  – Remote map display and processing services
  – Wikipedia
  – Social media: tweets, etc.
  – The web
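A spatial facet of this kind can be sketched as a bounding-box filter over catalog records. This is only a minimal illustration: the `Record` type, its fields, and the sample entries are hypothetical and are not taken from WorldMap.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# (min_lon, min_lat, max_lon, max_lat) in decimal degrees
BBox = Tuple[float, float, float, float]


@dataclass
class Record:
    """A hypothetical catalog entry: a book, paper map, dataset, or service."""
    title: str
    kind: str
    bbox: Optional[BBox]  # None when the item has not been spatialized yet


def intersects(a: BBox, b: BBox) -> bool:
    """True when two boxes overlap or touch (antimeridian crossing is not handled)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def spatial_facet(records: Iterable[Record], view: BBox) -> List[Record]:
    """Keep only records whose extent overlaps the current map view."""
    return [r for r in records if r.bbox is not None and intersects(r.bbox, view)]


# Illustrative records with made-up extents
CATALOG = [
    Record("Boston street atlas", "paper map", (-71.2, 42.2, -70.9, 42.4)),
    Record("Tokyo census tracts", "local spatial data", (139.5, 35.5, 139.9, 35.8)),
    Record("Undated travel diary", "document", None),
]

boston_view = (-71.5, 42.0, -70.5, 42.6)
hits = spatial_facet(CATALOG, boston_view)
print([r.title for r in hits])
```

Items with no extent drop out of the facet, which is the case the "spatializing materials" goal is meant to address.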
Build platforms that lower barriers to:
• Finding spatial materials which reside inside or outside the academy
• Visualizing this data
• Using this data
• Sharing this data and views of it
• Mashing up private and public data
• Spatializing materials that are not yet spatial
Support spatial and spatio-temporal visualization
• Make it easy to visualize datasets on a map (a great way to see temporal patterns is on a map: http://worldmap.harvard.edu/tweetmap)
• Make it easy to mash up private and public data
• Make it easy to share spatial data and views of data
• Make it easy to crowd-curate data
Support the research life cycle
Research lifecycle:
• Scoping
• Data gathering / exploration
• Synthesis / analysis
• Writing / communication
• Publishing / delivery
Archiving?
WorldMap In a Nutshell
• Designed to lower barriers for researchers who wish to use spatial technology
• Web-based, cloud hosted
• Open source software
• Service oriented architecture
Allows researchers to…
• Organize: their own (large) mapping datasets and share them
• Visualize: maps with data-driven symbology
• Publish: data to the world or to just a few collaborators
• Mash up / Combine: one’s own data with data provided by others
• Analyze: basic, but can be easily extended
• Collaborate: by letting several people edit the same map
Statistics
12,000 Registered Users
14,000 Data Layers
3,500 Map Collections
800,000 Visits
Traffic By City
Built on Open Source Software
Loosely Coupled Approach for Adding Major Capabilities
[Diagram labels: MapWarper, Neatline, Dataverse, EdX, New, WMS, http, Service Registry, CSW]
Contributing Organizations (WorldMap and GeoNode)
WorldMap
• Boston Area Research Initiative - BostonMap
• UNICEF - Education Access in Cameroon
• Cornell University - Global Health Map
• UN University - Wildlife Enforcement Monitoring
• Virtue Foundation - Women in the World
• Amazon - Hardware
• Others…
GeoNode
• World Bank - GFDRR, Dominode, Risiko
• U.S. State Dept. - ROGUE, HIU, Syria Damage Assessment
• NOAA - GeoCloud
• UN World Food Program - WFP GeoNode
• Australian Govt. - AIFDR, TsuDAT
• MapStory Foundation - MapStory App
• Others…
Niche between desktop and web
[Diagram: axes labeled Analysis, Ease of Use, and Collaboration; positions marked for WorldMap, web apps, and desktop apps]
Openness is key
• Open registration
• Open access to data
• Open service protocols (WMS, WFS, Esri REST)
• Open data formats (Shapefile, GeoTIFF, GeoRSS, KML, JSON, CSV)
• Open source code (GPL, on GitHub)
• Runs on open source operating systems (Linux) and could run on Windows
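As an illustration of why open formats lower the barrier, a CSV file with latitude and longitude columns can be turned into a GeoJSON FeatureCollection with the standard library alone. This is a sketch under assumptions: the column names `lat` and `lon` and the sample rows are hypothetical, and it does not describe WorldMap's own upload code.

```python
import csv
import io
import json


def csv_to_geojson(text: str, lat_field: str = "lat", lon_field: str = "lon") -> dict:
    """Convert CSV text with point coordinates into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        # Remove the coordinate columns so only attributes remain in `row`
        lat = float(row.pop(lat_field))
        lon = float(row.pop(lon_field))
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample data for illustration
SAMPLE = "name,lat,lon\nSite A,42.4,-71.1\nSite B,35.7,139.7\n"

collection = csv_to_geojson(SAMPLE)
print(json.dumps(collection, indent=2))
```

The one detail that most often goes wrong in such conversions is axis order: GeoJSON puts longitude first.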
Data Connectors to…
• Google Maps, OpenStreetMap, Bing, MapQuest, Esri
• Geo-tweets
• Google Street View
• Google Earth
• Flickr, GeoRSS
• Geonames, Google Places, Yahoo Places
• Social Explorer, Yelp
• WMS, WFS, Esri REST
• More to come…
Examples…
• Map applications people and organizations have created using WorldMap…
Reischauer Institute: Japan Earthquake Archive
Boston Area Research Initiative: Boston Data
University of Barcelona: Historic Planning Data
Professor Colin Gordon: Mapping Decline, St. Louis
Virtue Foundation: Women-run NGOs
What’s Coming in WorldMap
• Services Registry
• HarvardX Integration
• Dataverse Integration
• Neatline Integration
• Geo-tweet Archive
WorldMap and the Services Registry
• WorldMap is a web-based, open source, collaborative mapping platform developed at Harvard CGA since 2012
• National Endowment for the Humanities Implementation Grant
  – Key objective: create a comprehensive and sustainable map service registry which researchers and the public can use to discover, create, and share any work that can be represented spatially
Lower the Barrier to Geodata Access, Across Disciplines
[Diagram labels: Internet, Commercial Systems, Government Systems, Other Institutions, local data; links labeled upload, download, and service view; protocols: Esri REST, WMS, WFS, RSS, etc.]
WorldMap Data and Service Registry Key Goals
• Support discovery of the millions of web maps that are exposed but not easy to find (the “dark” geoweb)
• Allow anyone to mash up content from any source from within any mapping application
• Enable non-IT professionals to create their own map services without IT support
• Crowd-source map data curation in a metadata-weak environment
Creating a Global Service Registry
-- A basic piece of geo-infrastructure that doesn’t exist
• Build registry of web map services (millions of map layers)
• Make API available so any system can use it
• Provide a fast, faceted search interface
• Allow anyone to add new services to the registry
• Maintain uptime statistics on each service
• Use WorldMap usage statistics to improve search (eventually bring in stats from systems outside WorldMap which use API)
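Two of the goals above, letting anyone add a service and keeping uptime statistics on each one, can be sketched with a small in-memory structure. Everything here is hypothetical (the `Registry` and `ServiceEntry` names, the example URLs, the recorded check results); the actual HTTP probing is left out so the sketch stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceEntry:
    url: str
    service_type: str  # e.g. "WMS" or "Esri REST"
    checks: List[bool] = field(default_factory=list)  # True = service responded

    def record_check(self, ok: bool) -> None:
        """Store the outcome of one availability probe."""
        self.checks.append(ok)

    @property
    def uptime(self) -> float:
        """Fraction of probes that succeeded; 0.0 if never probed."""
        return sum(self.checks) / len(self.checks) if self.checks else 0.0


class Registry:
    def __init__(self) -> None:
        self._services: Dict[str, ServiceEntry] = {}

    def add(self, url: str, service_type: str) -> ServiceEntry:
        """Register a service; submitting a known URL returns the existing entry."""
        return self._services.setdefault(url, ServiceEntry(url, service_type))

    def ranked(self) -> List[ServiceEntry]:
        """Most reliable services first, e.g. for ordering search results."""
        return sorted(self._services.values(), key=lambda s: s.uptime, reverse=True)


registry = Registry()
wms = registry.add("http://example.org/wms", "WMS")
rest = registry.add("http://example.com/arcgis/rest/services", "Esri REST")

# Pretend results from a periodic checker
for ok in (True, True, False, True):
    wms.record_check(ok)
for ok in (True, False):
    rest.record_check(ok)

print([(s.url, s.uptime) for s in registry.ranked()])
```

Keying entries by URL is one simple way to deduplicate crowd submissions; a real registry would likely need to normalize URLs first.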
How Many Services Are Out There?
• We estimate millions, each containing many map layers totaling petabytes of data which is currently VERY hard for the average researcher to find and use.
• Try this to estimate number of Esri REST servers (15 million)
  – allinurl: http "arcgis rest services" mapserver -test -kml -kmz -sitemap -query
• Try this to estimate number of WMS servers (47 thousand)
  – allinurl: http "?request getcapabilities" -test
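The second query works because every WMS server answers a GetCapabilities request, which lists the layers it offers. A minimal sketch of building that request and reading layer names from the response follows. The base URL and the sample XML are made up for illustration, the sketch assumes a WMS 1.3.0 response, and no network call is made.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

WMS_NS = {"wms": "http://www.opengis.net/wms"}  # XML namespace used by WMS 1.3.0


def capabilities_url(base_url: str) -> str:
    """Build a GetCapabilities URL (assumes base_url has no query string yet)."""
    params = {"SERVICE": "WMS", "REQUEST": "GetCapabilities", "VERSION": "1.3.0"}
    return f"{base_url}?{urlencode(params)}"


def layer_names(capabilities_xml: str) -> List[str]:
    """Return the names of all named (requestable) layers in document order."""
    root = ET.fromstring(capabilities_xml)
    names = []
    for layer in root.iter("{http://www.opengis.net/wms}Layer"):
        name = layer.find("wms:Name", WMS_NS)  # direct child only
        if name is not None and name.text:
            names.append(name.text)
    return names


# Hand-written sample response, trimmed to the parts used here
SAMPLE = """
<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <Title>Root</Title>
      <Layer><Name>roads</Name><Title>Roads</Title></Layer>
      <Layer><Name>parcels</Name><Title>Parcels</Title></Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>
""".strip()

url = capabilities_url("http://example.org/geoserver/wms")
names = layer_names(SAMPLE)
print(url)
print(names)
```

A crawler could apply the same two steps to each candidate URL it finds and register every layer that comes back.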
Service Registry Challenges
• Metadata - tagging and usage statistics
• Projections - cascading
• Persistence - caching
• Discovery - central index, usage statistics
• Performance - caching
Open API to Registry
• Public, RESTful API
• Access all (public) map layers within WorldMap
• Access all service layers outside WorldMap
• Access all Maps (collections of layers) within WorldMap
• Search on information:
  – Metadata
  – Usage statistics
  – Attribute info (for local layers)
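The kind of search listed above, matching on metadata and ranking by usage statistics, can be sketched over a handful of in-memory layer records. The `Layer` fields, the sample data, and the function names are hypothetical; they illustrate the idea and are not the registry's actual API.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Layer:
    name: str
    abstract: str  # free-text metadata
    source: str    # "local" for WorldMap-hosted layers, "remote" for outside services
    views: int     # usage statistic


def search(layers: Iterable[Layer], text: str) -> List[Layer]:
    """Case-insensitive match on name or abstract, most-viewed layers first."""
    needle = text.lower()
    hits = [
        layer for layer in layers
        if needle in layer.name.lower() or needle in layer.abstract.lower()
    ]
    return sorted(hits, key=lambda layer: layer.views, reverse=True)


def facet_counts(hits: Iterable[Layer]) -> Dict[str, int]:
    """Count hits per source so a UI could show facet totals."""
    return dict(Counter(layer.source for layer in hits))


# Made-up records for illustration
LAYERS = [
    Layer("boston_parcels", "Tax parcels for Boston", "local", 120),
    Layer("boston_transit", "Transit stops in Boston", "remote", 450),
    Layer("st_louis_zoning", "Historic zoning, St. Louis", "local", 300),
]

results = search(LAYERS, "boston")
print([layer.name for layer in results])
print(facet_counts(results))
```

Ranking by view count is the simplest possible use of usage statistics; it stands in for whatever relevance scoring the real search would apply.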
[Architecture diagram labels: Distributed Map Services; Services; API; Service Registry; Distributed Users (WorldMap; OpenLayers, Leaflet; Esri clients; any map client); Find and bind to layers; Crowd curation, user submitted services; WorldMap; Service caching, reprojection; WorldMap Local Services; Uptime Checker; WorldMap Core; Service Crawler; Common Crawl*]
*Start with Hadoop search of Common Crawl dataset http://commoncrawl.org/
Faceted Services Search UI Mockup (draft)
More information…
WorldMap http://worldmap.harvard.edu
Center for Geographic Analysis http://gis.harvard.edu
HarvardX Integration
Dataverse Integration (Social Science Archive)
Neatline Integration (map-based storytelling platform)
TweetMap Archive
Demo of Service Registry Test Server
• http://107.22.231.227/
  – Show services
  – Add service
  – Find service layers as layers
  – Metadata for services
  – Layer page for services
  – Statistics for services
  – Saving map with service layers
Service Registry Architecture
End
Information Exists Within Spectrum of Publicness and Curatedness
• Publicness: Private - Group - University - Consortium - Public
• Curatedness: Data created - Reviewed by expert - Reviewed by crowd
Our job
• Create platforms (of people and software and objects) to lower barriers to data access:
  – Data hosting
  – Data formatting
  – Metadata creation
  – Data and view sharing
  – Search tools: local and remote resources
  – Connecting systems where relevant
  – Analytical services
  – Publishing