GIS Data Acquisition and Data Formats
Presentation Agenda
• Discuss different ways of acquiring GIS data.
• Discuss examples of issues that might arise when acquiring GIS data.
• Demonstrate the use of the main Ontario and Canada online GIS data portals
and of tools for format conversion.
Tomislav Sapic
GIS Technologist
Lakehead University
Faculty of Natural Resources Management
GIS Data Acquisition and Data Formats
• GIS data created or captured by the user.
o Primary data capture
o Secondary data capture
• Already created GIS data obtained.
o Official data
o Data obtained from individuals in the government or
private organizations
Source and Issues: Primary Data Capture
GPS-collected data
• Understand the accuracy of the collected data:
- possible errors present in the system (multipath, ionosphere, geomagnetic disturbance, etc.).
- the real accuracy, not the accuracy suggested by the unit.
Data captured from aerial and satellite images
• Is the image in a perspective projection or has it been orthorectified?
• Learn the basics of the imagery in question (which satellite; the spatial, spectral, radiometric, and temporal resolutions) in order to understand the pixel values and know how best to display them for data capture.
GIS data created or captured by the user
Primary Data Capture
o GPS Collected Data
• Use available methods of increasing positional accuracy:
Differential GPS
o Post-processing
o Real-time
1) WAAS
2) Public DGPS (not available in remote areas).
3) Real-time kinematic (RTK), carrier-phase-based GPS (very accurate but expensive, requiring special receivers).
Base stations: Canadian Spatial Reference System (CSRS)
https://www.nrcan.gc.ca/maps-tools-publications/tools/geodetic-reference-systems-tools/tools-applications/10925
Source and Issues: Secondary Data Capture
Hard copy maps
• Hard copy maps can be distorted due to drying and shrinking.
• Georeference the map in the same map projection in which the map was made.
• Be cognizant of the map scale. A note (metadata!) should be made about it because GIS data don’t have an inherent scale: data collected at small scales can be used at large scales, carrying over the small-scale coarseness and digitizing errors.
• Maps are cartographic products, which means that they have been made by using cartographic rules for representation. Many features are abstract representations of their real-world sources.
Soft copy maps, scanned maps
• Soft copy maps also have a degree of distortion in them if created through scanning.
o Large-format feed scanners are the best choice when weighing price, accuracy, and speed.
o Drum scanners are very accurate but also very expensive and too slow.
o Flat-bed scanners are too small and inaccurate.
GIS data created or captured by the user
Figure: Error vector field of a map scanned by a continuous feed scanner (from Devillers and Jeansoulin 2006).
Secondary Data Capture
o Hard and Soft Copy Maps
• If georeferencing in ArcGIS, set the data frame map projection to the map projection of the map.
• If no map projection is stated on the map, try to contact the person who created or published the map. If that proves futile, as a last resort, try to figure it out. Here are some hints that can help:
In Canada, two of the most used map projections are UTM and (Canada) Lambert
Conformal Conic:
If it’s a topographic or a topographic-based map showing a small area (up to a few hundred kilometers across), the projection is likely UTM.
If the area is large, such as an entire province or larger, the map projection is likely Lambert Conformal Conic. If such a map is in UTM, distortions start to appear at its east and west ends.
If the features are stretched east-west, e.g. squares look like horizontal rectangles and circles like horizontal ellipses, the ‘map projection’ is likely the Geographic Coordinate System (GCS). This coordinate system is very unlikely, though, to be used in creating a map.
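The east-west stretching hint can be put into numbers. When latitude and longitude are drawn as plain x and y, a degree of longitude is drawn as wide as a degree of latitude, although on the ground it is shorter by a factor of cos(latitude). A minimal sketch, assuming a spherical Earth; the function name and the sample latitude are illustrative only:

```python
import math


def east_west_stretch(latitude_deg):
    """Approximate east-west stretch factor of features drawn in GCS.

    Assumes a spherical Earth and that degrees are plotted directly
    as planar x/y units.
    """
    if not -90.0 < latitude_deg < 90.0:
        raise ValueError("latitude must be between -90 and 90 degrees")
    return 1.0 / math.cos(math.radians(latitude_deg))


# Assumed example latitude of 48.4 degrees N (northwestern Ontario).
print(round(east_west_stretch(48.4), 2))
```

A factor well above 1 at the latitude of the map is consistent with squares appearing as horizontal rectangles.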
Figure panels: OBM tiles and a circle in UTM, Zone 15; OBM tiles and a circle in the Geographic Coordinate System (GCS); OBM tiles and a circle in Canada Lambert Conformal Conic; Eastern Ontario OBM tiles in UTM, Zone 15.
• When georeferencing a map you will need to choose the transformation method and understand the root mean square (RMS) error.
Common Transformation Methods Used for Georeferencing Images
1st-order polynomial (affine): must have ≥ 3 control points. Used for mostly flat areas.
2nd-order polynomial: must have ≥ 6 control points. Used for hilly areas (relief displacement!) and distorted images.
3rd-order polynomial: must have ≥ 10 control points. Used for rugged, mountainous terrain (relief displacement!) and distorted images.
Notes: - using more than the required minimum of control points is advised, to lower the overall error.
- the higher the order of the transformation, the greater the image distortions can be in areas far from the control points.
Important: Cartographic maps are planimetrically correct and, unless distorted through their manipulation, do not require more than the affine transformation method.
• The transformation (the model, e.g. the affine transformation) produces output locations for the control points. The transformed output locations most likely won’t match the true control point locations, and each difference is an error. The overall error can be summarized by taking the root mean square (RMS) of all errors.
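How a transformation produces output locations and per-point errors can be sketched in a few lines. This is an illustration only, not the ArcGIS implementation: it fits an affine model by ordinary least squares, and the function names and control point values are hypothetical.

```python
import math


def _det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def _solve3(m, v):
    """Solve the 3x3 system m * p = v with Cramer's rule."""
    d = _det3(m)
    if abs(d) < 1e-12:  # crude check, adequate for a sketch
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = v[r]
        solution.append(_det3(replaced) / d)
    return solution


def fit_affine(src, dst):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least 3 paired control points")
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations: (A^T A) p = A^T t, where each row of A is [x, y, 1].
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in (0, 1):
        atb = [sum(r[i] * t[k] for r, t in zip(rows, dst)) for i in range(3)]
        params.append(_solve3(ata, atb))
    return params  # [[a, b, c], [d, e, f]]


def apply_affine(params, point):
    """Transform one source point into output (map) coordinates."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def control_point_errors(params, src, dst):
    """Distance between each transformed and true control point location."""
    errors = []
    for s, t in zip(src, dst):
        mx, my = apply_affine(params, s)
        errors.append(math.hypot(mx - t[0], my - t[1]))
    return errors


# Hypothetical control points: scanned-map pixel (col, row) -> map (x, y) in meters.
src = [(0, 0), (1000, 0), (0, 1000), (1000, 1000)]
dst = [(500.0, 2500.0), (2500.0, 2498.0), (503.0, 500.0), (2499.0, 501.0)]
params = fit_affine(src, dst)
for err in control_point_errors(params, src, dst):
    print(round(err, 2))
```

With exactly three control points the affine model fits them perfectly and every error is zero, which is one reason to use more than the minimum.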
Root Mean Square Error (RMSE)
• RMSE measures the displacement between the actual and estimated locations of the control points (Chang 2008).
• RMSE should be brought down to the level that the project accuracy requires (e.g., a stated number of meters of error) or that the user feels comfortable with. With Landsat images, RMSE should usually be brought down to 1 pixel.
• RMSE is similar to a standard deviation, and the errors have a distribution similar to the normal distribution.
~68% of errors ≤ 1 RMSE
~95% of errors ≤ 2 RMSE
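The RMSE calculation itself is short. A minimal sketch, assuming the estimated and actual control point locations are available as paired (x, y) tuples in the same units; the function names and coordinates are illustrative only:

```python
import math


def point_errors(estimated, actual):
    """Distance between each estimated and actual control point location."""
    if len(estimated) != len(actual):
        raise ValueError("point lists must have the same length")
    return [math.hypot(ex - ax, ey - ay)
            for (ex, ey), (ax, ay) in zip(estimated, actual)]


def rmse(errors):
    """Root mean square of a list of control point errors."""
    if not errors:
        raise ValueError("need at least one error value")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def share_within(errors, limit):
    """Fraction of errors that do not exceed the given limit."""
    return sum(1 for e in errors if e <= limit) / len(errors)


# Hypothetical control points, in meters.
estimated = [(100.0, 200.0), (305.0, 398.0), (498.0, 603.0), (701.0, 799.0)]
actual = [(101.0, 201.0), (300.0, 400.0), (500.0, 600.0), (700.0, 800.0)]
errors = point_errors(estimated, actual)
total = rmse(errors)
print(round(total, 2), share_within(errors, total), share_within(errors, 2 * total))
```

With only a handful of control points the shares will not match the 68% and 95% figures closely; those describe the approximate behavior of a large set of errors.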
Already created GIS data obtained
Source Issues
Official sources, government and non-government organizations
Topographic databases (OBM, NTDB), DEMs, thematic GIS datasets, imagery.
• Datasets might be delivered as .e00 files – transfer file format for coverages (an older ESRI file format).
• Most likely there are abundant metadata that accompany the database – make sure to find them and read parts relevant to your GIS work.
CanVec+ (NTDB > CanVec > CanVec+): http://ftp.geogratis.gc.ca/pub/nrcan_rncan/vector/canvec/doc/info.html (documentation files for the CanVec product)
Ontario datasets on Ontario GeoHub: https://geohub.lio.gov.on.ca/ , metadata example https://geohub.lio.gov.on.ca/datasets/forest-processing-facility
LiDAR data
• If the map projection is not defined, use available tools to define the projection. If it is defined, try to double-check the map projection through the metadata or with the data provider.
Already created GIS data obtained
Source Issues
Data from individuals in government or private entities, without metadata or with only partially proper metadata. Mostly local datasets.
• There are often mistakes in the data themselves and problems with improper formatting.
• A common example is the absence of the negative (-) sign for longitude decimal-degree coordinates in the western hemisphere (e.g., Canada) provided in a spreadsheet.
• Obtain as much metadata as possible about the datasets from the sender.
• Extra caution should be exercised: the data should be properly reviewed and their quality checked.
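The missing negative sign can be repaired before the spreadsheet is brought into GIS. A minimal sketch, assuming the table is exported to CSV and that every point is known to lie in the western hemisphere; the column names `lat` and `lon` and the sample rows are hypothetical:

```python
import csv
import io


def fix_western_longitude(lon):
    """Return the longitude with a negative sign.

    Only valid when the point is known to be in the western hemisphere.
    """
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude out of range: %r" % lon)
    return -abs(lon)


def read_points(csv_text, lon_field="lon", lat_field="lat"):
    """Read (lon, lat) pairs from CSV text, repairing the longitude sign."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(fix_western_longitude(float(row[lon_field])), float(row[lat_field]))
            for row in reader]


# Hypothetical spreadsheet export; the first row lacks the negative sign.
sample = "lat,lon\n48.42,89.26\n48.40,-89.30\n"
print(read_points(sample))
```

The range check also catches some other formatting problems, such as swapped columns or coordinates that are not in decimal degrees.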
GIS Datasets Available Through Lakehead Library
• OBM datasets.
• Datasets available through the Ontario Geospatial Data Exchange (OGDE Data), for example:
o Forest Resource Inventory (FRI) vector GIS files for each of the forest management units (FMUs) in Ontario.
o 20 cm panchromatic, 40 cm multispectral, and stereo aerial imagery used to
photointerpret FRI.
o Triangulated digital surface models (DSM) for the flown FMUs.
• Census of Canada geography files and table databases.
Obtaining and using the OGDE data requires signing a licence agreement stating that the data will be used for research or education purposes only.
Ontario Base Map (OBM) GIS Database
• The entire province is divided into square tiles, 5 x 5 km in the south and 10 x 10 km in the north.
• Datasets are created based on a scale of 1:10000 in the south and 1:20000 in the north.
• Each tile contains a multitude of datasets, such as roads, rivers,
lakes, dams, towers, parks, contours, DTM, and many more.
• Each tile is named by adding scale + UTM zone + UTM easting of
the south-west corner + UTM northing of the south-west corner.
• Usually the name is shortened by leaving out the scale and
sometimes the UTM zone, and by truncating the UTM coordinates.
• Often OBM layers are referred to as ‘basic layers.’
Example: tile 163205320 is in Zone 16, with SW corner x = 320000 and SW corner y = 5320000.
Figure labels: UTM Zone 15, UTM Zone 16.
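The shortened tile name in the example can be unpacked programmatically. A minimal sketch that assumes the 9-digit layout implied by that single example (2 digits of UTM zone, 3 digits of easting in kilometers, 4 digits of northing in kilometers); other shortened or full-length OBM names may follow a different layout, and the function name is hypothetical:

```python
def parse_obm_short_name(name):
    """Split a 9-digit shortened OBM tile name into zone and SW corner.

    Assumed layout: ZZ EEE NNNN, with easting and northing truncated
    to whole kilometers.
    """
    if len(name) != 9 or not name.isdigit():
        raise ValueError("expected a 9-digit tile name, got %r" % name)
    zone = int(name[:2])
    easting_m = int(name[2:5]) * 1000
    northing_m = int(name[5:9]) * 1000
    return zone, easting_m, northing_m


zone, x, y = parse_obm_short_name("163205320")
print(zone, x, y)
```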
• Many GIS datasets can be downloaded from the Web; some of the more prominent GIS data portals are listed at http://flash.lakeheadu.ca/~forspatial/ (Web GIS Resources).
Examples:
Ontario Scholars GeoPortal: http://geo1.scholarsportal.info/
Ontario datasets on Ontario GeoHub: https://geohub.lio.gov.on.ca/
CanVec+ (Canada basic vector layers): http://maps.canada.ca/czs/index-en.html
Some Common Issues and Challenges with
Acquired GIS Data
OBM tile files are often received as .e00 files.
E00 files are interchange files for older ESRI (ArcINFO) file formats, including the vector file type called coverage. E00 files need to be imported into ArcGIS by following a specific procedure:
http://resources.arcgis.com/en/help/main/10.2/index.html#//001200000046000000
ESRI Coverages
• Vector data models that are formatted as composite files.
• Can’t be edited in ArcGIS.
• Used to be ESRI’s flagship data format but are being replaced with
geodatabases.
Coverages can be exported to shapefiles or geodatabase feature classes through ArcCatalog or ArcMap (Export Data of a coverage layer).
Garmin .gpx Files
• If dealing directly with Garmin .gpx files, the easiest way to bring them into GIS is to use
the free software DNR Garmin.
1) File > Load From
2) File Format: .gpx
3) Feature type (e.g. waypoint).
4) File > Save to > e.g. shapefile
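Where DNR Garmin is not available, waypoints can also be read from a .gpx file with a short script. This is an alternative to the steps above, not part of them. A minimal sketch using only the Python standard library and assuming a GPX 1.1 file (older GPX 1.0 files use a different namespace); the function name and sample waypoints are hypothetical:

```python
import xml.etree.ElementTree as ET

# Namespace of GPX 1.1 documents.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_waypoints(gpx_text):
    """Return (name, lon, lat) tuples for every waypoint in GPX 1.1 text."""
    root = ET.fromstring(gpx_text)
    points = []
    for wpt in root.findall("gpx:wpt", GPX_NS):
        name = wpt.findtext("gpx:name", default="", namespaces=GPX_NS)
        points.append((name, float(wpt.get("lon")), float(wpt.get("lat"))))
    return points


# Hypothetical file content with two waypoints.
SAMPLE = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <wpt lat="48.4210" lon="-89.2590"><name>PLOT01</name></wpt>
  <wpt lat="48.4302" lon="-89.2711"><name>PLOT02</name></wpt>
</gpx>"""

for name, lon, lat in read_waypoints(SAMPLE):
    print(name, lon, lat)
```

GPX coordinates are geographic (latitude and longitude on WGS84), so the resulting points need that coordinate system defined when they are loaded into GIS.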
References:
Chang, K. 2008. Introduction to Geographic Information Systems. McGraw-Hill, New York.
Devillers, R. and R. Jeansoulin. 2006. Fundamentals of Spatial Data Quality. ISTE Ltd.
ESRI. 2010. ArcGIS ArcMap Help File.
Longley, P. A., M. F. Goodchild, D. J. Maguire, and D. W. Rhind. 2011. Geographic Information Systems & Science. John Wiley & Sons, Inc.
McMaster, R. B. and K. S. Shea. 1992. Generalization in Digital Cartography. Washington, DC: Association of American Geographers.
Lekkerkerk, H. J. 2007. The GPS Handbook for Professional Users. CMedia Productions. Netherlands.