comp4 unit6a lecture slides

Introduction to Information and Computer Science Databases and SQL Lecture a This material (Comp4_Unit6a) was developed by Oregon Health and Science University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information Technology under Award Number IU24OC000015.


Page 1: Comp4 Unit6a Lecture Slides

Introduction to Information and Computer Science: Databases and SQL

Lecture a

This material (Comp4_Unit6a) was developed by Oregon Health and Science University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information Technology under Award Number IU24OC000015.

Page 2: Comp4 Unit6a Lecture Slides

Databases and SQL: Learning Objectives

• Define and describe the purpose of databases (Lecture a)
• Define a relational database (Lecture a)
• Describe data modeling and normalization (Lecture b)
• Describe the structured query language (SQL) (Lecture c)
• Define the basic data operations for relational databases and how to implement them in SQL (Lecture c)
• Design a simple relational database and create corresponding SQL commands (Lecture c)
• Examine the structure of a healthcare database component (Lecture d)

Health IT Workforce Curriculum Version 3.0/Spring 2012


Page 3: Comp4 Unit6a Lecture Slides

Data Representation

• It’s all 1s and 0s
• 01000001 can mean
  – 65 as a binary number
  – ‘A’ as an alphanumeric character (ASCII)
  – Many other options, including CPU instructions and multimedia data
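The point about 01000001 can be checked directly. The following is a minimal Python sketch, not part of the original slides; the variable names are chosen only for illustration. It reads the same eight bits as a number, as an ASCII character, and as a raw byte.

```python
# One bit pattern, several readings. Which reading applies depends on
# how the program chooses to interpret the bits.
bits = "01000001"

as_number = int(bits, 2)          # interpret the bits as a base-2 integer
as_character = chr(as_number)     # interpret that integer as an ASCII code
as_raw_byte = bytes([as_number])  # the same value stored as a single byte

print(as_number)
print(as_character)
print(as_raw_byte)
```

The bits never change between the three lines; only the interpretation does.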


Page 4: Comp4 Unit6a Lecture Slides

Data Storage

• Managing data is a large component of computer systems
• Storing and retrieving data are important functions
  – Efficiency
  – Speed


Page 5: Comp4 Unit6a Lecture Slides

Data Storage Options

• Text/data files
• Spreadsheets
• Databases


Page 6: Comp4 Unit6a Lecture Slides

Files

• A collection of information stored electronically in a single location
• Can store text or data
• Files have different formats


Page 7: Comp4 Unit6a Lecture Slides

Advantages/Disadvantages of Files

Advantages
• Easy to create and store
• Easy to share
• Used by many applications
  – Input or output data from scientific computations

Disadvantages
• Limited security
• Multiple user access isn't supported
• Redundant and inconsistent data


Page 8: Comp4 Unit6a Lecture Slides

Contact Information Example

File with contact information:

Bill Robeson, 1312 Main, Portland, OR, Community Hospital, Inc.

Walter Schmidt, 14 12th St., Oakland, CA, Oakland Providers LLC

Mary Stahl, 14 12th St., Oakland, CA, Oakland Providers LLC

Albert Brookings, 1312 Main, Portland, OR, Community Hospital Incorporated

Catherine David, 14 12th Street, Oakland, CA, Oakland Providers LLC


Page 9: Comp4 Unit6a Lecture Slides

Quick!

• Do Bill and Albert work for the same company?
• Is there an issue with Catherine and Walter?
• Can a computer application tell?
• Give me a contact list sorted by last name
• Imagine with 10,000 contacts!


Page 10: Comp4 Unit6a Lecture Slides

Quick! Answers

• Bill and Albert work for the same company, but it’s represented differently
• Catherine and Walter have the same address, again represented differently
• It’s hard for a computer application to tell
• You CAN sort by hand, but it’s a challenge
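To see why an application struggles here, the sketch below loads the five contact lines into Python and asks the same questions with plain string comparison. It is an illustration under assumptions, not the lecture's method: the `parse_line` helper and the rule "the last name is the last word of the name" are hypothetical choices made for this example.

```python
contacts_file = """\
Bill Robeson, 1312 Main, Portland, OR, Community Hospital, Inc.
Walter Schmidt, 14 12th St., Oakland, CA, Oakland Providers LLC
Mary Stahl, 14 12th St., Oakland, CA, Oakland Providers LLC
Albert Brookings, 1312 Main, Portland, OR, Community Hospital Incorporated
Catherine David, 14 12th Street, Oakland, CA, Oakland Providers LLC
"""

def parse_line(line):
    # Split on the first four separators only, because a company name
    # such as "Community Hospital, Inc." contains a comma itself.
    name, street, city, state, company = line.split(", ", 4)
    return {"name": name, "street": street, "city": city,
            "state": state, "company": company}

contacts = {c["name"]: c for c in map(parse_line, contacts_file.splitlines())}

bill, albert = contacts["Bill Robeson"], contacts["Albert Brookings"]
catherine, walter = contacts["Catherine David"], contacts["Walter Schmidt"]

# Exact comparison treats different spellings as different values.
same_company = bill["company"] == albert["company"]
same_street = catherine["street"] == walter["street"]

# Sorting by last name needs an extra rule that the file does not state.
by_last_name = sorted(contacts, key=lambda full_name: full_name.split()[-1])

print(same_company, same_street)
print(by_last_name)
```

Both comparisons come out false even though a person reading the file would say the company and the address are the same. The file stores text, and nothing in it says that two spellings refer to one thing.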


Page 11: Comp4 Unit6a Lecture Slides

Another Problem

• What do you do if “Community Hospital” becomes “Community General”?
  – Find every instance of “Community Hospital” or variation thereof
  – Change EVERY entry
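A short Python sketch of what goes wrong with a simple find-and-replace on the file. The two lines come from the contact example; the replacement spelling `new_name` is an assumption made for this illustration.

```python
lines = [
    "Bill Robeson, 1312 Main, Portland, OR, Community Hospital, Inc.",
    "Albert Brookings, 1312 Main, Portland, OR, Community Hospital Incorporated",
]

old_name = "Community Hospital, Inc."  # the spelling used in Bill's entry
new_name = "Community General, Inc."   # hypothetical new spelling

# Replace the old spelling wherever it appears exactly.
updated = [line.replace(old_name, new_name) for line in lines]

# Record which entries were actually changed.
changed = [before != after for before, after in zip(lines, updated)]

print(changed)
print(updated[1])
```

Only Bill's entry is changed. Albert's entry uses a different spelling of the same company, so the exact match misses it and the file now disagrees with itself.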


Page 12: Comp4 Unit6a Lecture Slides

Another Solution: Spreadsheets

• Spreadsheet applications store, manipulate and present data
• Provide more functionality than plain text files
  – Calculations
  – Sorting
  – Filtering
  – Data analysis


Page 13: Comp4 Unit6a Lecture Slides

Spreadsheet Example

OpenOffice Calc spreadsheet example. (PD-US, 2011).


Page 14: Comp4 Unit6a Lecture Slides

Advantages/Disadvantages of Spreadsheets

Advantages
• Widely available
• Powerful calculations
• Basic sorting and filtering

Disadvantages
• Limited security
• Multiple user access isn't supported
• Redundant and inconsistent data


Page 15: Comp4 Unit6a Lecture Slides

Databases

• Definition:
  – Structured data collection accessed electronically
• Files are simple databases
• Relational databases maintain relationships between data


Page 16: Comp4 Unit6a Lecture Slides

Relational Database

• Introduced by Dr. Edgar Codd of IBM Research Laboratory in 1970
  – “Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation).”
• Definition:
  – An organized collection of data accessible by electronic means where the information type and information relationships are maintained


Page 17: Comp4 Unit6a Lecture Slides

Relational Database Contents

• A relational database contains tables
• Tables contain multiple rows of data
• Rows contain data of specified type(s) in a column order
• Data and type are independent
• Row order does not matter, but column order does
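The points above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch under assumptions: the `contact` table, its column names, and the choice of SQLite are all made for this illustration, and the SQL statements themselves belong to Lecture c.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# A table declares its columns, their order, and their types.
conn.execute("""
    CREATE TABLE contact (
        first_name TEXT,
        last_name  TEXT,
        city       TEXT,
        state      TEXT
    )
""")

# Values are matched to columns by position, so column order matters here.
rows = [
    ("Bill", "Robeson", "Portland", "OR"),
    ("Walter", "Schmidt", "Oakland", "CA"),
    ("Mary", "Stahl", "Oakland", "CA"),
]
conn.executemany("INSERT INTO contact VALUES (?, ?, ?, ?)", rows)

# Row order is not part of the data; request an order when one is needed.
sorted_names = conn.execute(
    "SELECT first_name, last_name FROM contact ORDER BY last_name DESC"
).fetchall()

print(sorted_names)
```

The rows were inserted in one order and read back in another, and nothing about the stored data changed. Swapping two values inside a row, by contrast, would put a city into the last-name column.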


Page 18: Comp4 Unit6a Lecture Slides

Advantages/Disadvantages of Relational Databases

Advantages
• Secure
• Multiple user access
• Relationships prevent redundancy and inconsistency
• Optimized operations
• Complex queries

Disadvantages
• Expertise required
• Limited data calculations
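The claim that relationships prevent redundancy and inconsistency can be illustrated with the company rename from the earlier slide. The sketch below uses Python's built-in `sqlite3` module. The two-table layout, the table and column names, and the id values are assumptions made for this example, not a design taken from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE company (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE contact (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        company_id INTEGER NOT NULL REFERENCES company(id)
    );
    INSERT INTO company (id, name) VALUES (1, 'Community Hospital, Inc.');
    INSERT INTO contact (name, company_id) VALUES ('Bill Robeson', 1);
    INSERT INTO contact (name, company_id) VALUES ('Albert Brookings', 1);
""")

# The company name is stored once, so the rename touches one row.
conn.execute("UPDATE company SET name = 'Community General' WHERE id = 1")

# Each contact points at the company row instead of repeating its name.
employers = conn.execute("""
    SELECT contact.name, company.name
    FROM contact JOIN company ON contact.company_id = company.id
    ORDER BY contact.id
""").fetchall()

print(employers)
```

Both contacts show the new company name after a single update, because the name lives in one place and the contacts only refer to it.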


Page 19: Comp4 Unit6a Lecture Slides

Databases and SQL: Summary – Lecture a

• Data can be stored in files, spreadsheets or databases
• Files and spreadsheets
  – Widely available
  – Good for computations
• Databases
  – Secure
  – Optimized for speed
  – Multiple user access
  – Store relationships


Page 20: Comp4 Unit6a Lecture Slides

Databases and SQL: References – Lecture a

References
• American National Standards Institute. (2007). Information Systems - Coded Character Sets - 7-Bit American National Standard Code for Information Interchange (7-Bit ASCII) (No. ANSI INCITS 4-1986 (R2007)).
• Codd, E. F. (1970). A relational model of data for large shared data banks. Communications of the ACM, 13(6), 377-387.

Images
• Slide 13: OpenOffice Calc spreadsheet example. (PD-US, 2011).
