1

Lecture 5:

GIS Data Management

GE 118: INTRODUCTION TO GIS

Engr. Meriam M. Santillan

Caraga State University

2

File Structures

(File-based datasets)

Simple list

Ordered sequential files

Indexed files

3

Simple List

Simplest file structure

Unordered/unstructured

Records are kept in the order in which they were entered

4

Ordered Sequential Files

Simple lists that are arranged according to some order (e.g., alphabetical order)
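The practical difference between a simple list and an ordered sequential file shows up when searching. The sketch below is only an illustration with made-up feature names: `linear_search` scans an unordered list record by record, while `binary_search` uses the ordering to halve the search range at each step. Both function names are hypothetical.

```python
from bisect import bisect_left

def linear_search(records, key):
    """Scan an unordered simple list from the start; return the index or -1."""
    for i, rec in enumerate(records):
        if rec == key:
            return i
    return -1

def binary_search(sorted_records, key):
    """Search an ordered sequential file by repeatedly halving the range."""
    i = bisect_left(sorted_records, key)
    if i < len(sorted_records) and sorted_records[i] == key:
        return i
    return -1

# Hypothetical sample records, stored in order of entry
simple_list = ["road", "bridge", "river", "school", "park"]
# The same records arranged alphabetically
ordered_file = sorted(simple_list)

print(linear_search(simple_list, "river"))
print(binary_search(ordered_file, "river"))
```

For a handful of records the two approaches behave alike; the ordered file pays off as the number of records grows.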

5

Indexed Files

An index to the directory is needed to search efficiently for entries that match given criteria

Can be developed as direct files or inverted (indirect) files

6

Direct Indexed Files

The records themselves are used to provide access to other pertinent information

7

Indirect Indexed Files

Index is based on possible search criteria,

not on the entities themselves

Attributes are the primary search criteria and

the entities rely on them for selection
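The two kinds of index can be contrasted in a short sketch. It assumes a hypothetical `parcels` table keyed by entity ID: direct access looks up a record by its ID, while the inverted index built by the hypothetical helper `build_inverted_index` maps each attribute value to the entities that have it.

```python
from collections import defaultdict

# Hypothetical entity records: entity ID -> attributes
parcels = {
    101: {"land_use": "residential", "soil": "clay"},
    102: {"land_use": "forest", "soil": "loam"},
    103: {"land_use": "residential", "soil": "loam"},
}

def build_inverted_index(records, attribute):
    """Map each value of `attribute` to the list of entity IDs that carry it."""
    index = defaultdict(list)
    for entity_id, attrs in records.items():
        index[attrs[attribute]].append(entity_id)
    return dict(index)

# Direct access: the record itself leads to its other information
print(parcels[102])

# Inverted access: the search criterion (an attribute value) leads to the entities
land_use_index = build_inverted_index(parcels, "land_use")
print(land_use_index["residential"])
```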

8

Database

An integrated set of data on a particular

subject

Collection of interrelated data stored

together with controlled redundancy to

serve one or more applications in an

optimal fashion

Requires more elaborate structure

called a database structure or

database management system

9

Significance of Database

Most GIS activities consist of storing entity and

attribute data so that we can retrieve any

combination of these objects.

Each graphical feature must be stored explicitly with

its attributes so that their combined search becomes

faster.

10

Advantages of Database over

File-based datasets

Collecting data at a single location reduces

redundancy and duplication

Lower maintenance cost due to better organization

and decreased data duplication

Multiple applications can use the same data and can

evolve separately over time

11

Advantages of Database over

File-based datasets

User knowledge can be transferred between applications more easily because database remains constant

Facilitated data sharing, with a corporate view provided to data managers and users

Security and standards for data and data access can be established and enforced

12

Database Management System

A software application designed to organize the efficient and effective storage of, and access to, data

A suite of software programs designed to store, retrieve and manipulate data within a database

13

Types of Database Structure

1. Hierarchical Data Structures

2. Network Systems

3. Relational Database Structures

14

Hierarchical Data Structure

‘one-to-many’ or ‘parent-child’ relationship

Implies that each element has a direct relationship

to a number of symbolic children

Each child is capable of having the same direct

relationship with his/her own offspring, and so on.
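A parent-child structure can be sketched as a nested tree. The example below is an assumption-laden illustration, not a real hierarchical DBMS: the administrative names are placeholders, and the hypothetical function `find_path` walks down from the root, following one branch at a time.

```python
# Hypothetical hierarchy: each node has one parent and any number of children
tree = {
    "name": "Region",
    "children": [
        {"name": "Province A", "children": [
            {"name": "City A1", "children": []},
            {"name": "City A2", "children": []},
        ]},
        {"name": "Province B", "children": [
            {"name": "City B1", "children": []},
        ]},
    ],
}

def find_path(node, target, path=()):
    """Return the chain of names from the root down to `target`, or None."""
    path = path + (node["name"],)
    if node["name"] == target:
        return list(path)
    for child in node["children"]:
        found = find_path(child, target, path)
        if found:
            return found
    return None

print(find_path(tree, "City B1"))
```

Every lookup has to start at the root and descend through the parents, which is why queries are tied to the branches of the hierarchy.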

15

Hierarchical Data Structure

16

Hierarchical Data Structure

Advantages:

Simple and straightforward data access since parent

and children are directly linked

Easy to search since structure is well defined

Relatively easy to expand by adding new branches

and formulating new decision rules

17

Hierarchical Data Structure

Disadvantages:

Confined to queries along one branch only

Difficult restructuring to allow other possible search

criteria

Creates large index files

Redundant entries for searching

18

Network Systems

‘many-to-many’ relationship

Each data item can be linked directly to any other item in the database using pointers, without requiring a parent-child relationship.
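A rough sketch of pointer-based linkage follows, using Python object references as stand-ins for pointers. The `Record` class, the `link` helper, and the owner and parcel names are all hypothetical; the point is only that one parcel can point to several owners and one owner to several parcels.

```python
class Record:
    """A data item that stores explicit pointers to other records."""
    def __init__(self, name):
        self.name = name
        self.links = []

def link(a, b):
    """Create a two-way pointer between records a and b."""
    a.links.append(b)
    b.links.append(a)

def linked_names(record):
    """Follow the pointers of a record and list what it is linked to."""
    return [r.name for r in record.links]

owner1, owner2 = Record("Owner 1"), Record("Owner 2")
parcel_a, parcel_b = Record("Parcel A"), Record("Parcel B")

# Many-to-many: Owner 1 holds two parcels, Parcel B has two owners
link(owner1, parcel_a)
link(owner1, parcel_b)
link(owner2, parcel_b)

print(linked_names(owner1))
print(linked_names(parcel_b))
```

Each link has to be created explicitly, so the number of pointers grows with the number of relationships.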

19

Network Systems

20

Network Systems

Advantages:

Less rigid compared to hierarchical structure

Can handle many-to-many relationships

Allows much greater flexibility

Reduced redundancy of data

21

Network Systems

Disadvantages:

In very complex GIS, the number of pointers can become large, thus requiring a lot of storage space

Linkages between data must still be explicitly defined using pointers

Numerous possible linkages can become extremely tangled, resulting in confusion and incorrect linkages

Not recommended for novice users

22

Relational Database

Management Systems

(RDBMS)

Data are stored as ordered records or rows of attribute values called tuples

Tuples are grouped with corresponding data rows in a form called relations

Each column represents data for a single attribute for the entire dataset

23

Relational Database

Management Systems

(RDBMS)

Primary key – a column (or set of columns) whose values uniquely identify each row; it serves as the main search criterion

Foreign key – a column in a second table that refers to the primary key of the first table, linking the two tables
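The role of the two keys can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch with hypothetical tables `owners` and `parcels`; the helper `insert_is_rejected` is also an invention for the example. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# owner_id is the primary key of owners and a foreign key in parcels
conn.execute("CREATE TABLE owners (owner_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""CREATE TABLE parcels (
    parcel_id INTEGER PRIMARY KEY,
    area_ha   REAL,
    owner_id  INTEGER REFERENCES owners(owner_id))""")
conn.execute("INSERT INTO owners VALUES (1, 'Owner 1')")
conn.execute("INSERT INTO parcels VALUES (10, 2.5, 1)")

def insert_is_rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A repeated primary key value and a dangling foreign key are both refused
dup_pk = insert_is_rejected("INSERT INTO owners VALUES (1, 'Someone else')")
bad_fk = insert_is_rejected("INSERT INTO parcels VALUES (11, 1.0, 99)")
print(dup_pk, bad_fk)
```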

24

Relational Database

Management Systems

(RDBMS)

Normal forms – set of rules to indicate the

forms that the tables should take

1. First Normal Form

2. Second Normal Form

3. Third Normal Form

25

First Normal Form

Table must contain columns and

rows

Because the columns are to be used as search keys, there should be only a single value in each row-column location (cell)
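As an illustration of the single-value rule, the sketch below takes a hypothetical table in which the `crops` cell holds several values and rewrites it so that every cell holds exactly one. The function name `to_first_normal_form` and the sample data are assumptions made for the example.

```python
# Not in first normal form: one "crops" cell holds two values at once
unnormalized = [
    {"parcel_id": 1, "crops": "rice, corn"},
    {"parcel_id": 2, "crops": "coconut"},
]

def to_first_normal_form(rows, column):
    """Emit one row per value found in a comma-separated cell."""
    out = []
    for row in rows:
        for value in row[column].split(","):
            new_row = dict(row)
            new_row[column] = value.strip()
            out.append(new_row)
    return out

for row in to_first_normal_form(unnormalized, "crops"):
    print(row)
```

After the rewrite, a search for a single crop can compare whole cell values instead of picking apart text inside a cell.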

26

Second Normal Form

Requires that every column that is

not a primary key be totally

dependent on the primary key

Simplifies the tables

Reduces redundancy by imposing the restriction that each column be searchable only through the primary key

27

Third Normal Form

States that columns that are not primary keys must “depend” on the primary key, whereas the primary key does not depend on any non-primary-key column

Primary key must be used to find other columns

But the other columns are not needed to search for values in the primary key column

Idea is to reduce redundancy
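One common way the higher normal forms reduce redundancy is by moving columns that depend on something other than the table's own key into a separate table. The sketch below assumes a hypothetical flat table in which `owner_name` really depends on `owner_id` rather than on `parcel_id`; the function `decompose` is an illustrative helper, not a general normalization algorithm.

```python
# Hypothetical flat table: the owner's name is repeated for every parcel owned
flat = [
    {"parcel_id": 1, "owner_id": 7, "owner_name": "Owner 7"},
    {"parcel_id": 2, "owner_id": 7, "owner_name": "Owner 7"},
    {"parcel_id": 3, "owner_id": 8, "owner_name": "Owner 8"},
]

def decompose(rows):
    """Split the flat table so each owner name is stored exactly once."""
    parcels = [{"parcel_id": r["parcel_id"], "owner_id": r["owner_id"]} for r in rows]
    owners = {r["owner_id"]: r["owner_name"] for r in rows}
    return parcels, owners

parcels, owners = decompose(flat)
print(parcels)
print(owners)
```

The owner name can still be reached from any parcel through `owner_id`, but a change of name now has to be made in only one place.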

28

Relational Database

Management Systems

(RDBMS)

Advantages:

Allow us to collect data in reasonably simple tables, keeping organization also simple

Capable of doing relational joins, as long as there is at least one column common to the tables to be joined

Allows greatest flexibility, both in design and querying

29

Data Storage in a DBMS

Object classes/layers are stored in database tables

Each layer is stored as a single database table in a database management system

Rows contain objects, while columns contain attributes/properties of the objects

30

Data Storage in a DBMS

Geographic database tables have a geometry column (or shape column), which non-geographic tables don’t have
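A minimal sketch of a layer stored as a table with a geometry column is given here. It assumes the geometry is kept as well-known text (WKT) in a plain TEXT column of an SQLite table; spatial databases typically provide a dedicated geometry type instead. The `roads` table and the `parse_linestring` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One layer = one table; the geometry column sits beside the attribute columns
conn.execute("CREATE TABLE roads (road_id INTEGER PRIMARY KEY, name TEXT, geometry TEXT)")
conn.execute("INSERT INTO roads VALUES (1, 'Road 1', 'LINESTRING (0 0, 3 4)')")

def parse_linestring(wkt):
    """Turn 'LINESTRING (x y, x y, ...)' into a list of (x, y) tuples."""
    body = wkt[wkt.index("(") + 1 : wkt.rindex(")")]
    return [tuple(float(v) for v in pair.split()) for pair in body.split(",")]

name, wkt = conn.execute(
    "SELECT name, geometry FROM roads WHERE road_id = 1"
).fetchone()
coords = parse_linestring(wkt)
print(name, coords)
```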

31

Basic Database

Functions/Operations

Join

Tables are joined together using common row/column

values or keys

After joining two or more tables, a new table is created

which contains all the values of the joined tables

Database tables can be joined together to create new

relations, or views of the database.
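A join on a common key can be sketched with `sqlite3`. The tables `parcels` and `soils`, the shared column `soil_code`, and the view name `parcel_soils` are hypothetical; the view plays the part of the new relation produced by the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, soil_code TEXT);
CREATE TABLE soils (soil_code TEXT PRIMARY KEY, soil_name TEXT);
INSERT INTO parcels VALUES (1, 'C'), (2, 'L'), (3, 'C');
INSERT INTO soils VALUES ('C', 'clay'), ('L', 'loam');

-- The join result is exposed as a new relation (a view)
CREATE VIEW parcel_soils AS
    SELECT p.parcel_id, p.soil_code, s.soil_name
    FROM parcels AS p JOIN soils AS s ON p.soil_code = s.soil_code;
""")

rows = conn.execute("SELECT * FROM parcel_soils ORDER BY parcel_id").fetchall()
for row in rows:
    print(row)
```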

32

Basic Database

Functions/Operations

Link

Tables are linked using common row/column values or

keys

Unlike joining, linking tables does not result in a new table. The original tables are retained, but accessing one enables the user to also access a table linked to it
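Linking can be pictured as a lookup made on demand through the shared key, with both tables left as they are. The sketch below uses hypothetical `parcels` and `soils` tables held as plain Python structures and a hypothetical helper `linked_record`; it illustrates the idea only and is not how any particular GIS package implements links.

```python
# Two separate tables that share the key "soil_code" (hypothetical data)
parcels = [
    {"parcel_id": 1, "soil_code": "C"},
    {"parcel_id": 2, "soil_code": "L"},
]
soils = {"C": {"soil_name": "clay"}, "L": {"soil_name": "loam"}}

def linked_record(row, linked_table, key="soil_code"):
    """Follow the shared key from one table to the matching record in the other."""
    return linked_table[row[key]]

# Accessing a parcel gives access to its soil record; no combined table is built
print(linked_record(parcels[0], soils))
print(parcels[0])
```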

33

Database Design

Involves three stages: conceptual, logical,

and physical

Involves six practical steps (see Figure)

34

Stages of Database Design

Conceptual Model: User View, Objects and Relationships, Geographic Representation

Logical Model: Geographic Database Types, Geographic Database Structure

Physical Model: Database Schema

35

Conceptual Model

Steps involved are:

1. Model the user’s view

2. Define objects and their relationships

3. Select geographic representation

36

Model the User’s View

Identifying organizational functions, determining the data requirements of these functions, and organizing data into groups for data management

May be presented using a report with tables

37

Define Objects and Their

Relationships

Specification of object types/classes and

functions, and their relationships

May be presented using diagrams

38

Select Geographic

Representation

Choosing between the types of discrete objects (point, line, or polygon) or field to represent the data

Selection has a critical impact on the database use

Although it is possible to switch between representations later on, doing so is computationally expensive and can lead to information loss
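The information loss can be seen in a small sketch that converts point coordinates to grid cells and back. The cell size, the sample coordinates, and the function names `to_cell` and `cell_center` are assumptions chosen for the example.

```python
def to_cell(x, y, cell_size):
    """Convert a point to the (column, row) of the grid cell that contains it."""
    return (int(x // cell_size), int(y // cell_size))

def cell_center(col, row, cell_size):
    """Convert a grid cell back to a point at the cell's center."""
    return ((col + 0.5) * cell_size, (row + 0.5) * cell_size)

# Two distinct hypothetical points
points = [(12.3, 47.9), (14.8, 41.2)]

# Discrete objects -> field-like grid with 10-unit cells
cells = [to_cell(x, y, 10.0) for x, y in points]

# Grid -> points again: both come back as the same cell center
recovered = [cell_center(c, r, 10.0) for c, r in cells]
print(cells)
print(recovered)
```

Once two points fall in the same cell, their original positions cannot be told apart when converting back.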

39

Logical Model

Steps involved are:

1. Match to geographic database types

2. Organize geographic database structure

40

Match to Geographic Database

Types

Matching of object types to be studied to

specific data types supported by the GIS

41

Organize Geographic Database

Structure

Defining topological associations, specifying

rules and relationships, and assigning

coordinate systems

42

Physical Model

Step involved is:

1. Define database schema

Definition of the actual physical database schema that will hold the data values

Usually created using the DBMS software’s data definition language (e.g., SQL)
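A schema definition written in SQL might look like the sketch below, run here through Python's `sqlite3` module. The table `land_parcels`, its columns, and its constraints are hypothetical; a production GIS schema would normally also use the spatial types of its DBMS.

```python
import sqlite3

# Data definition statements: they describe the table, they do not add data
SCHEMA = """
CREATE TABLE land_parcels (
    parcel_id INTEGER PRIMARY KEY,
    land_use  TEXT NOT NULL,
    area_ha   REAL CHECK (area_ha > 0),
    geometry  TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Read the schema back: each row of table_info is (cid, name, type, notnull, default, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(land_parcels)")]
print(columns)
```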

43

Database

Organization/Structuring

Necessary for efficient query, analysis, and

mapping

44

Structuring Techniques

1. Topologic Creation

2. Indexing

45

Topologic Creation

Can be created for vector data using either batch or interactive techniques

Batch Topology – for CAD, survey, simple feature and other unstructured vector data

– an iterative process

Interactive Topology – performed dynamically at the time objects are added to the database
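A heavily simplified picture of batch topology building: start from unstructured lines that are only pairs of endpoints, then derive which lines meet at which nodes. The sketch assumes endpoints match exactly; real topology tools also snap nearby vertices, split crossing lines, and assemble polygons. The data and the function names `build_node_topology` and `connected_lines` are hypothetical.

```python
from collections import defaultdict

# Hypothetical unstructured line work: each line is just (start, end)
lines = {
    "a": ((0, 0), (5, 0)),
    "b": ((5, 0), (5, 5)),
    "c": ((5, 0), (9, 0)),
}

def build_node_topology(line_table):
    """Map each endpoint (node) to the IDs of the lines that touch it."""
    nodes = defaultdict(list)
    for line_id, (start, end) in line_table.items():
        nodes[start].append(line_id)
        nodes[end].append(line_id)
    return dict(nodes)

def connected_lines(line_table, line_id):
    """List the other lines that share a node with the given line."""
    nodes = build_node_topology(line_table)
    start, end = line_table[line_id]
    touching = {other for node in (start, end) for other in nodes[node]}
    return sorted(touching - {line_id})

print(build_node_topology(lines))
print(connected_lines(lines, "a"))
```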

46

Indexing

Can help speed up certain types of queries

Three main indexing methods in GIS are grid indexes, quadtrees, and R-trees.

Database index – a special representation of information about objects that improves searching
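Of the three methods, the grid index is the simplest to sketch. The example below assumes square cells of a fixed size and point features only; `build_grid_index` and `query_box` are hypothetical names, and the sample coordinates are made up. A box query inspects only the cells the box overlaps instead of every point.

```python
from collections import defaultdict

def build_grid_index(points, cell_size):
    """Assign each point ID to the grid cell (column, row) that contains it."""
    index = defaultdict(list)
    for point_id, (x, y) in points.items():
        index[(int(x // cell_size), int(y // cell_size))].append(point_id)
    return index

def query_box(index, points, cell_size, xmin, ymin, xmax, ymax):
    """Return IDs of points inside the box, checking only overlapping cells."""
    hits = []
    for col in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for row in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            for point_id in index.get((col, row), []):
                x, y = points[point_id]
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(point_id)
    return sorted(hits)

# Hypothetical point features: ID -> (x, y)
points = {1: (2.0, 3.0), 2: (12.0, 4.0), 3: (25.0, 25.0), 4: (8.0, 9.0)}
index = build_grid_index(points, 10.0)
print(query_box(index, points, 10.0, 0, 0, 10, 10))
```

Quadtrees and R-trees follow the same idea of narrowing the candidates before the exact test, but adapt their partitions to the data rather than using fixed cells.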

47

Thank you!
