1.8 database and data modelling

Post on 21-Feb-2022

Unit 1 8 Database and data modelling

1.8 DATABASE AND DATA MODELLING

8.1 Database Concepts

File-based approach for the storage and retrieval of data

File-based systems were an early attempt to computerise the manual filing system. A file-based system is a collection of application programs that perform services for end-users; each program defines and manages its own data.

File-based systems used flat files. A flat file (also called a flat database or text database) is a file of data that contains no links to other files; it is non-relational. A good example is a single text-only file that holds all the data a program needs, with the values often separated by some kind of delimiter.
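As a rough illustration of a delimited flat file, the sketch below reads comma-separated records with Python's standard csv module. The file contents, field names and the helper name read_flat_file are assumptions made up for this example, not part of the original notes.

```python
import csv
import io

# Hypothetical flat file: one record per line, fields separated by commas.
FLAT_FILE = """customer_id,name,balance
C001,Bill Jones,25000
C002,Linda Jones,1200
"""

def read_flat_file(text, delimiter=","):
    """Return each record of a delimited flat file as a dict keyed by field name."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)

records = read_flat_file(FLAT_FILE)
for record in records:
    # Every value is plain text; the file itself carries no types or links.
    print(record["customer_id"], record["name"], record["balance"])
```

Note that every value comes back as text: the program, not the file, has to know that balance is a number.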

Storing large amounts of data has always been a major concern. In the early days, file-based systems were used. In such a system, data was stored in discrete files, and a collection of these files was stored on a computer, where a computer operator could access them. Files of archived data were called tables because they looked like the tables used in traditional file keeping. Rows in the table were called records and columns were called fields.

Limitations of the File-Based Approach

1. Data redundancy and inconsistency: Since data resides in different private data files, there are

chances of redundancy and resulting inconsistency.

a. Duplication wastes time and money since data is entered more than once

b. Duplication takes up additional storage space

c. Duplication introduces the risk of a loss of data integrity, since changes to the data may not

be accurately reflected in all files.

For example, a customer can have a savings account as well as a mortgage loan. Here, the customer

details may be duplicated since the programs for the two functions store their corresponding data

in two different data files. This gives rise to redundancy in the customer's data. Since the same data

is stored in two files, inconsistency arises if a change made in the data of one file is not reflected in

the other.

2. Unanticipated queries: In a file-based system, handling sudden/ad-hoc queries can be difficult, since

it requires changes in the existing programs.

For example, suppose the bank officer needs to generate a list of all the customers who have an account balance of $20,000 or more. The bank officer has two choices: either obtain the list of all customers and have the needed information extracted manually, or hire a system programmer to design the necessary application program. Both alternatives are obviously unsatisfactory. Suppose that such a program is written and, several days later, the officer needs to trim that list to include only those customers who opened their accounts one year ago. Because no program exists to generate such a list, the data is again difficult to access.
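For contrast, here is a minimal sketch of how the same ad-hoc requests might look when the data sits in a database with a query language. It uses Python's built-in sqlite3 module; the account table, its columns, the sample rows and the cut-off date are all assumptions invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (customer TEXT, balance REAL, opened TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [
        ("Bill Jones", 25000, "2019-03-14"),
        ("Linda Jones", 1200, "2021-07-02"),
        ("Samantha Lee", 20000, "2021-11-30"),
    ],
)

def customers_with_balance(minimum, opened_before=None):
    """List customers holding at least `minimum`, optionally opened before an ISO date."""
    sql = "SELECT customer FROM account WHERE balance >= ?"
    params = [minimum]
    if opened_before is not None:
        # ISO dates stored as text compare correctly as strings.
        sql += " AND opened < ?"
        params.append(opened_before)
    sql += " ORDER BY customer"
    return [name for (name,) in conn.execute(sql, params)]

first_list = customers_with_balance(20000)
trimmed_list = customers_with_balance(20000, opened_before="2021-01-01")
print(first_list, trimmed_list)
```

The second request only adds one condition to the query; no new application program has to be written.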

3. Data isolation: Data are scattered in various files, and files may be in a different format. Though data

used by different programs in the application may be related, they reside as isolated data files.

4. Concurrent access anomalies: In large multi-user systems, the same file or record may need to be

accessed by multiple users simultaneously. Handling this in a file-based system is difficult.

5. Security problems: In data-intensive applications, security of data is a major concern. Users should

be given access only to required data and not to the whole database.

For example, in a banking system, payroll personnel need to view only that part of the database that

has information about the various bank employees. They do not need access to information about

customer accounts. Since application programs are added to the system in an ad-hoc manner, it is

difficult to enforce such security constraints. In a file-based system, this can be handled only by

additional programming in each application.

6. Integrity problems: In any application, there will be certain data integrity rules, which need to be

maintained. These could be in the form of certain conditions/constraints on the elements of the data

records. In the savings bank application, one such integrity rule could be 'Customer ID, which is the

unique identifier for a customer record, should not be empty'. There can be several such integrity

rules. In a file-based system, all these rules need to be explicitly programmed in the application

program.
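By way of comparison, a database can hold such a rule once, in the table definition, instead of in every program. The sketch below is one possible way to express the 'Customer ID must not be empty' rule in SQLite through Python's sqlite3 module; the table layout and the helper try_insert are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id TEXT NOT NULL PRIMARY KEY CHECK (customer_id <> ''),
        name        TEXT NOT NULL
    )
    """
)

def try_insert(customer_id, name):
    """Return True if the row is accepted, False if an integrity rule rejects it."""
    try:
        conn.execute("INSERT INTO customer VALUES (?, ?)", (customer_id, name))
        return True
    except sqlite3.IntegrityError:
        return False

valid_row = try_insert("C001", "Bill Jones")      # accepted
null_id = try_insert(None, "Linda Jones")         # rejected: NOT NULL
empty_id = try_insert("", "Samantha Lee")         # rejected: CHECK
repeated_id = try_insert("C001", "Someone Else")  # rejected: PRIMARY KEY must be unique
print(valid_row, null_id, empty_id, repeated_id)
```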

Although all these issues are common to any data-intensive application, each application had to handle them on its own. The application programmer had to worry not only about implementing the business rules of the application but also about handling these common issues.

Relational database

A relational database (RDB) is a collection of data sets organised into tables, records and columns. An RDB establishes well-defined relationships between its tables. Tables share information through these relationships, which makes the data easier to search, organise and report on.

Database Elements

Tables: A database table is composed of records and fields that hold data.

Records: Data is stored in records. A record is composed of fields and contains all the data about one particular person, company, or item in a database.

Fields: A field is part of a record and contains a single piece of data for the subject of the record.

Use of Terminology Associated with a Relational Database Model

Each database is a collection of related tables; these are also called relations, hence the name "relational database". Each table represents an entity or object in a tabular format consisting of columns and rows. Columns are the fields of a record or the attributes of an entity. The rows contain the values or data instances; these are also called records or tuples.

Terminology associated with a relational database model

1. Entity

An entity is a real-world object, animate or inanimate, that can be easily identified. For example, in a school database, students, teachers, classes, and courses offered can be considered entities. All these entities have attributes or properties that give them their identity.

2. Table

A table is a collection of related data held in a structured format within a database. It consists of fields (columns) and rows.

3. Tuple

A single row of a table, which contains a single record for that relation, is called a tuple.

4. Attribute

Entities are represented by means of their properties, called attributes. All attributes have values. For example, a student entity may have name, class, and age as attributes.

5. Primary key

The attribute or combination of attributes that uniquely identifies a row or record in a relation is known as the primary key.

6. Candidate key

A relation may contain several fields, or combinations of fields, that could each uniquely identify a record; each of these is a candidate key. A relation can have only one primary key, so one candidate key is chosen as the primary key. The candidate keys that are not chosen are known as alternate keys.

7. Foreign key

A foreign key is an attribute or combination of attributes in a relation whose value matches a primary key in another relation. The table in which the foreign key is created is called the dependent table. The table to which the foreign key refers is known as the parent table.

8. Relationship

The logical association among entities is called a relationship. Relationships are mapped with entities in various ways. Mapping cardinalities define the number of associations between two entities.

9. Referential integrity

Referential integrity is a property of data which, when satisfied, requires every value of one attribute

(column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation

(table).

For referential integrity to hold in a relational database, any field in a table that is declared a foreign key can contain either a null value, or only values from a parent table's primary key or a candidate key. In other words, when a foreign key value is used it must reference a valid, existing primary key in the parent table. For instance, deleting a record that contains a value referred to by a foreign key in another table would break referential integrity.
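The following sketch shows how a DBMS might enforce this rule, using SQLite through Python's sqlite3 module. The customer and invoice tables and the helper violates_integrity are assumptions for the example; note also that SQLite only checks foreign keys after PRAGMA foreign_keys = ON, which is a detail of SQLite rather than of relational databases in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE invoice (
        invoice_id  INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        amount      REAL
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Bill Jones')")
conn.execute("INSERT INTO invoice VALUES (100, 1, 49.5)")      # references an existing parent row
conn.execute("INSERT INTO invoice VALUES (101, NULL, 10.0)")   # a null foreign key is allowed

def violates_integrity(sql):
    """Run a statement and report whether the database rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A foreign key value with no matching parent row is rejected.
orphan_rejected = violates_integrity("INSERT INTO invoice VALUES (102, 99, 5.0)")
# Deleting a parent row that is still referenced is rejected too.
delete_rejected = violates_integrity("DELETE FROM customer WHERE customer_id = 1")
print(orphan_rejected, delete_rejected)
```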

10. Secondary key

A field or combination of fields that is used as a basis for retrieval is known as a secondary key. A secondary key need not be unique: one secondary key value may refer to many records.

11. Indexing

A database index is a data structure that improves the speed of data retrieval operations on a database table, at the cost of additional writes and storage space to maintain the index. Indexes are used to locate data quickly without having to search every row each time the table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.
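To see an index change how a lookup is carried out, the sketch below asks SQLite (via Python's sqlite3 module) to describe its plan for the same query before and after an index is created. The table, the generated sample data and the index name idx_customer_city are assumptions for this example; the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [(f"Customer {i}", "London" if i % 2 else "Leeds") for i in range(1000)],
)

def query_plan(sql):
    """Return SQLite's own description of how it would run the query."""
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail column.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

QUERY = "SELECT name FROM customer WHERE city = 'London'"

plan_before = query_plan(QUERY)   # no index on city: every row is scanned
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
plan_after = query_plan(QUERY)    # the plan now names the index

print(plan_before)
print(plan_after)
```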

Relational design: Drawing Entity Relationship Diagram (ER Diagram)

An entity-relationship diagram (ERD) is a graphical representation of an information system that shows

the relationship between entities (people, objects, places, concepts or events) within that system. An

ERD is a data modeling technique that can help define business processes and can be used as the

foundation for a relational database.

A very simple computer system may be supported by a very simple database design that includes only a single table. However, if the design needs to be enhanced to support more complex requirements, the single-table design would almost always end up being normalised into multiple tables linked together through relationships. This is required to reduce data redundancy and to improve efficiency.

There are 3 types of table relationships:

1. One-to-one relationships

2. One-to-many relationships

3. Many-to-many relationships

▪ One-to-One Relationships

In a one-to-one relationship, each row in one database table is linked to one and only one other row in

another table. In a one-to-one relationship between Table A and Table B, each row in Table A is linked

to another row in Table B. The number of rows in Table A must equal the number of rows in Table B.

It might seem that one-to-one relationships are not very useful, since the database designer might as well merge both tables into a single table. This is true in general. However, there are some situations in which a one-to-one relationship may improve performance.

For example, if a database table contains a few columns that are used frequently while the remaining columns are used infrequently, the database designer may split the single table into two tables linked through a one-to-one relationship. Such a design avoids the overhead of retrieving the infrequently used columns whenever a query is performed on the table.

▪ One-to-Many Relationships

In a one-to-many relationship, each row in the related-to table (the "one" side) can be related to many rows in the relating table (the "many" side). This saves storage, as the related record does not need to be stored multiple times in the relating table.

For example, all the customers of a business are stored in a customer table while all the customer invoices are stored in an invoice table. Each customer can have many invoices, but each invoice can be generated for only a single customer.
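A minimal sketch of the customer and invoice example, run in SQLite through Python's sqlite3 module, is given below. The column names and sample rows are assumptions for the illustration; the point is that each invoice row carries exactly one customer_id, while a customer_id may appear on many invoice rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoice (
        invoice_id  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Bill Jones'), (2, 'Linda Jones');
    INSERT INTO invoice VALUES (10, 1, 100.0), (11, 1, 250.0), (12, 2, 75.0);
    """
)

# Count the invoices on the "many" side for each customer on the "one" side.
invoice_counts = dict(
    conn.execute(
        """
        SELECT c.name, COUNT(i.invoice_id)
        FROM customer AS c
        JOIN invoice AS i ON i.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        """
    )
)
print(invoice_counts)
```

The customer's name is stored once, however many invoices refer to it.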

▪ Many-to-Many Relationships

In a many-to-many relationship, one or more rows in a table can be related to zero, one or many rows in another table. A mapping table is required in order to implement such a relationship.

For example, all the customers of a bank are stored in a customer table while all the bank's products are stored in a product table. Each customer can have many products and each product can be assigned to many customers.
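The mapping table can be sketched as follows, again using SQLite through Python's sqlite3 module. The table name customer_product, the column names, the sample rows and the two helper functions are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Mapping table: one row per (customer, product) pairing.
    CREATE TABLE customer_product (
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        product_id  INTEGER NOT NULL REFERENCES product(product_id),
        PRIMARY KEY (customer_id, product_id)
    );
    INSERT INTO customer VALUES (1, 'Bill Jones'), (2, 'Linda Jones');
    INSERT INTO product  VALUES (10, 'Savings account'), (20, 'Mortgage loan');
    INSERT INTO customer_product VALUES (1, 10), (1, 20), (2, 10);
    """
)

def products_of(customer_name):
    """Titles of all products held by one customer."""
    rows = conn.execute(
        """
        SELECT p.title
        FROM product AS p
        JOIN customer_product AS cp ON cp.product_id = p.product_id
        JOIN customer AS c ON c.customer_id = cp.customer_id
        WHERE c.name = ?
        ORDER BY p.title
        """,
        (customer_name,),
    )
    return [title for (title,) in rows]

def customers_of(product_title):
    """Names of all customers holding one product."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM customer AS c
        JOIN customer_product AS cp ON cp.customer_id = c.customer_id
        JOIN product AS p ON p.product_id = cp.product_id
        WHERE p.title = ?
        ORDER BY c.name
        """,
        (product_title,),
    )
    return [name for (name,) in rows]

print(products_of("Bill Jones"))
print(customers_of("Savings account"))
```

In effect the mapping table turns one many-to-many relationship into two one-to-many relationships.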

Complete the following diagram with relationships considering the relationship rules given.

• An author writes many novels and a novel can be written by one author

• A novel can be read by many readers and a reader can read many novels

[Diagram: the entities Author, Novel and Reader, to be connected with relationships]

Normalisation Process: First (1NF), Second (2NF) and Third Normal Form (3NF)

Database Normalisation is a technique of organising the data in the database. Normalisation is a

systematic approach of decomposing tables to eliminate data redundancy and undesirable

characteristics like Insertion, Update and Deletion Anomalies. It is a multi-step process that puts data

into tabular form by removing duplicated data from the relation tables.

Consider the following delivery note from Easy Fasteners Ltd.

In this example, the delivery note has more than one part on it. This is called a repeating group. In the relational database model, each field must contain only one item of data. Also, each record must be of a fixed length, so a variable number of fields is not allowed.

Num  CustName    City    Country  ProdID  Description
005  Bill Jones  London  England  1       Table
                                  2       Desk
                                  3       Chair

Un-normalised Data (UNF)

ORDER(Num, CustName, City, Country, (ProdID, Description))

where ORDER is the name of the relation (or table) and Num, CustName, City, Country, ProdID and

Description are the attributes. ProdID and Description are put inside parentheses because they form a

repeating group.

ORDER

Num  CustName    City    Country  ProdID  Description
005  Bill Jones  London  England  1       Table
005  Bill Jones  London  England  2       Desk
005  Bill Jones  London  England  3       Chair

This again shows the repeating group. We say that this is in un-normalised form (UNF). To put it into 1st

normal form (1NF) we complete the table and identify a key that will make each tuple unique.

To make each row unique we need to choose Num together with ProdID as the key. Remember, another

delivery note may have the same products on it, so we need to use the combination of Num and ProdID

to form the key. We can write this as

ORDER(Num, CustName, City, Country, ProdID, Description)

To indicate the key, we simply underline the attributes that make up the key (here, Num and ProdID).

Because we have identified a key that uniquely identifies each tuple, we have removed the repeating

group.
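The step from the un-normalised delivery note to the completed table can be sketched in plain Python. The dictionary layout and the function name flatten are assumptions for this illustration; the data is the delivery note from the example.

```python
# One un-normalised delivery note: the product list is a repeating group.
unf_order = {
    "Num": "005",
    "CustName": "Bill Jones",
    "City": "London",
    "Country": "England",
    "Products": [(1, "Table"), (2, "Desk"), (3, "Chair")],
}

def flatten(order):
    """Repeat the order details once per product so every row has the same fields."""
    return [
        {
            "Num": order["Num"],
            "CustName": order["CustName"],
            "City": order["City"],
            "Country": order["Country"],
            "ProdID": prod_id,
            "Description": description,
        }
        for prod_id, description in order["Products"]
    ]

flat_rows = flatten(unf_order)

# Num alone repeats on every row; Num together with ProdID identifies each row.
num_only = [row["Num"] for row in flat_rows]
composite = [(row["Num"], row["ProdID"]) for row in flat_rows]
num_is_unique = len(set(num_only)) == len(num_only)
composite_is_unique = len(set(composite)) == len(composite)
print(num_is_unique, composite_is_unique)
```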

First Normal Form (1NF)

A table with no repeating groups is said to be in First Normal Form.

First normal form (1NF) sets the very basic rules for an organised database:

▪ Define the data items required, because they become the columns in a table. Place related data

items in a table.

▪ Ensure that there are no repeating groups of data.

▪ Ensure that there is a primary key.

ORDER

Num  CustName    City    Country
005  Bill Jones  London  England

ORDER-PRODUCT

Num  ProdID  Description
005  1       Table
005  2       Desk
005  3       Chair

Underline the primary keys of both the tables.

Second Normal Form (2NF)

A table is in Second Normal Form if any partial dependencies have been removed.

Second normal form requires that the table meets all the rules for 1NF and that no column depends on only part of the primary key (no partial dependencies):

ORDER-PRODUCT

Num  ProdID
005  1
005  2
005  3

PRODUCT

ProdID  Description
1       Table
2       Desk
3       Chair

Underline the primary keys of both the tables.
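The removal of the partial dependency can be sketched in plain Python. Description depends on ProdID alone, which is only part of the key (Num, ProdID), so it moves to its own table. The function name to_2nf and the extra delivery note 006 are assumptions added to make the redundancy visible.

```python
# 1NF ORDER-PRODUCT rows as (Num, ProdID, Description).
order_product_1nf = [
    ("005", 1, "Table"),
    ("005", 2, "Desk"),
    ("005", 3, "Chair"),
    ("006", 1, "Table"),  # hypothetical second note: "Table" is stored again
]

def to_2nf(rows):
    """Split off the attribute that depends on only part of the key."""
    # ORDER-PRODUCT keeps just the composite key (Num, ProdID).
    order_product = sorted({(num, prod_id) for num, prod_id, _ in rows})
    # PRODUCT stores each description once, keyed by ProdID.
    product = sorted({(prod_id, description) for _, prod_id, description in rows})
    return order_product, product

order_product, product = to_2nf(order_product_1nf)
print(order_product)
print(product)
```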

Third Normal Form (3NF)

A table is in Third Normal Form if any non-key dependencies have been removed.

A table is in third normal form when the following conditions are met:

▪ It is in second normal form.

▪ All non-key fields depend only on the primary key, and not on any other non-key field.

ORDER

Num  CustName      City
005  Bill Jones    London
006  Linda Jones   Melbourne-CN
007  Samantha Lee  Melbourne-AUS

CITY-COUNTRIES

City           Country
London         England
Melbourne-CN   Canada
Melbourne-AUS  Australia

Underline the primary keys of both the tables.
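The 3NF step can be sketched in the same way. Country depends on City, which is not a key of ORDER, so the City and Country pair moves to its own table. The function name to_3nf is an assumption; the rows are those shown in the tables above.

```python
# ORDER rows before 3NF as (Num, CustName, City, Country).
order_2nf = [
    ("005", "Bill Jones", "London", "England"),
    ("006", "Linda Jones", "Melbourne-CN", "Canada"),
    ("007", "Samantha Lee", "Melbourne-AUS", "Australia"),
]

def to_3nf(rows):
    """Move the non-key dependency City -> Country into its own table."""
    order = [(num, name, city) for num, name, city, _ in rows]
    city_countries = {}
    for _, _, city, country in rows:
        # If one city mapped to two countries, City would not determine Country.
        if city_countries.setdefault(city, country) != country:
            raise ValueError(f"City {city!r} maps to more than one country")
    return order, city_countries

order, city_countries = to_3nf(order_2nf)

# Joining the two tables on City recovers the original rows, so nothing was lost.
rejoined = [(num, name, city, city_countries[city]) for num, name, city in order]
print(rejoined == order_2nf)
```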

Summary of design changes

Stage Tables

UNF

1NF

2NF

3NF

Final ER Diagram

8.2 Database Management System (DBMS)

A DBMS can be defined as a collection of related records and a set of programs that access and manipulate these records. A DBMS enables the user to enter, store, and manage data. The main problem with the earlier DBMS packages was that the data was stored in a flat-file format, so the information about different objects was maintained separately in different physical files. Hence, the relations between these objects, if any, had to be maintained in a separate physical file. Thus, a single package would consist of too many files and vast functionalities to integrate them into a single system.

A solution to these problems came in the form of a centralised database system. In a centralised

database system, the database is stored in the central location. Everybody can have access to the data

stored in a central location from their machine.

For example, a large central database system would contain all the data pertaining to the employees.

The Accounts and the HR department would access the data required using suitable programs. These

programs or the entire application would reside on individual computer terminals.

A Database is a collection of interrelated data, and a DBMS is a set of programs used to add or modify

this data. Thus, a DBMS is a set of software programs that allow databases to be defined, constructed,

and manipulated.

A DBMS provides an environment that is both convenient and efficient to use when there is a large

volume of data and many transactions to be processed. Different categories of DBMS can be used,

ranging from small systems that run on personal computers to huge systems that run on mainframes.

Features provided by a DBMS

▪ Data management, including maintaining a data dictionary

A data dictionary provides a descriptive list of the names, definitions, and attributes of the data elements to be captured in an information system or database. It describes the expected meaning and acceptable representation of each data element within a defined context. It also provides metadata, that is, information about the data.

Example: Data Dictionary of the Client table

The metadata may include other attributes or characteristics such as length of data element, data type

(e.g., alphanumeric, numeric, date, special symbols), data frequency (mandatory or not), allowable

values or constraints, originating source system, data owner, data entry date, and when the data

element is no longer collected.
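Most DBMSs keep this kind of metadata themselves and let it be queried. As a small sketch, the code below builds a simple data dictionary for a table from SQLite's PRAGMA table_info, via Python's sqlite3 module. The client table and its columns are assumptions for the example and are not the Client table from the original figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE client (
        client_id INTEGER NOT NULL PRIMARY KEY,
        name      TEXT NOT NULL,
        joined    DATE
    )
    """
)

# PRAGMA table_info returns one row per column:
# (position, name, declared type, not-null flag, default value, primary-key flag).
data_dictionary = [
    {
        "field": name,
        "type": col_type,
        "mandatory": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in conn.execute("PRAGMA table_info(client)")
]

for entry in data_dictionary:
    print(entry)
```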

▪ Data modelling

Data modelling is the analysis of data objects and their relationships to other data objects. It is often the first step in database design and object-oriented programming, as the designers first create a conceptual model of how data items relate to each other. Data modelling involves a progression from conceptual model to logical model to physical schema.

A data model can be thought of as a diagram or flowchart that illustrates the relationships between data.

Three levels of data modeling

1. Conceptual data model

A conceptual data model identifies the highest-level relationships between the different entities.

Features of conceptual data model include:

• Includes the important entities and the relationships among them.

• No attribute is specified.

• No primary key is specified.

Conceptual data model is created by gathering business

requirements from various sources like business documents,

discussions with functional teams, business analysts,

management experts and users who do the reporting on the

database.

For Example: School (Teaching-Learning)

Entities Relationships Diagram

2. Logical data model

A logical data model describes the data in as much detail as possible, without regard to how they will be physically implemented in the database. Features of a logical data model include:

• Includes all entities and relationships among them.

• All attributes for each entity are specified.

• The primary key for each entity is specified.

• Foreign keys (keys identifying the relationship between different entities) are specified.

• Normalisation occurs at this level.

3. Physical data model

A physical data model represents how the model will be built in the database. A physical database model shows all table structures, including column name, column data type, column constraints, primary key, foreign key, and relationships between tables. Features of a physical data model include:

• Specification of all tables and columns.

• Foreign keys are used to identify relationships between tables.

• De-normalisation may occur based on user requirements.

• Physical considerations may cause the physical data model to be quite different from the logical data model.

• The physical data model will be different for different RDBMSs. For example, the data type for a column may be different between MySQL and SQL Server.

▪ Logical schema

A logical schema is a data model of a specific problem domain expressed in terms of a particular data

management technology.

Without being specific to a particular database management product, it is in terms of relational tables

and columns, object-oriented classes, or XML tags. This is as opposed to a conceptual data model, which

describes the semantics of an organisation without reference to technology, or a physical data model,

which describes the particular physical mechanisms used to capture data in a storage medium.

▪ Data integrity

Data integrity refers to maintaining and assuring the accuracy and consistency of data over its entire life-cycle, and is a critical aspect of the design, implementation and usage of any system which stores, processes, or retrieves data.

Data integrity is the opposite of data corruption, which is a form of data loss.

[Refer 1.6.2] in terms of database

▪ Data security, including backup procedures and the use of access rights to individuals/groups

of users.

Data security is the practice of keeping data protected from corruption and unauthorised access. The

focus behind data security is to ensure privacy while protecting personal or corporate data.

[Refer 1.6.2] in terms of database

Software tools found within a DBMS

▪ Developer interface

Since databases support a number of user groups, the DBMS must have languages and Developer

Interfaces that support each user group. The DBMS interfaces are the simplest kind of extensibility

services. DBMS interfaces are made available through extensions to SQL or to the Oracle Call Interface

(OCI).

▪ Query processor

With higher level database query languages such as SQL and QUEL, a special component of the DBMS

called the Query Processor takes care of arranging the underlying access routines to satisfy a given

query.

Thus queries can be specified in terms of the required results rather than in terms of how to achieve

those results.

What is query processing?

• A given SQL query is translated by the query processor into a low-level program called an execution plan.

• An execution plan is a program in a functional language: the physical relational algebra, specialised for the internal storage representation in the DBMS.

• The physical relational algebra extends the relational algebra with primitives to search through the internal storage structures of the DBMS.

High-level languages provide accessing facilities for data stored in a database

Many applications in the real world need databases to store the data they process. Thus, programming languages for such applications must also support some mechanism to organise access to databases. This can be done in a way that is largely independent of the underlying programming language.

In principle, high-level languages provide a natural framework for connecting to databases, since relations stored in a relational database can be considered as facts defining a predicate of a logic program.
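As one concrete instance of a high-level language reaching a database, the sketch below uses Python's standard sqlite3 module. The employee table, its rows and the function employees_in are assumptions for this example; other languages offer comparable database interfaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets result columns be read by name
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", "HR"), (2, "Ben", "Accounts"), (3, "Chen", "HR")],
)

def employees_in(department):
    """Names of the employees in one department, fetched with a parameterised query."""
    cursor = conn.execute(
        "SELECT name FROM employee WHERE department = ? ORDER BY name",
        (department,),
    )
    return [row["name"] for row in cursor]

hr_staff = employees_in("HR")
accounts_staff = employees_in("Accounts")
print(hr_staff, accounts_staff)
```

The program states which rows it wants; how they are located is left to the DBMS.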

8.3 Data Definition Language (DDL) and Data Manipulation Language (DML)

Data Definition Language (DDL)

DDL, which is usually part of a DBMS, is used to define and manage all attributes and properties of a

database, including row layouts, column definitions, key columns, file locations, and storage strategy.

DDL statements are used to build and modify the structure of tables and other objects such as views,

triggers, stored procedures, and so on. For each object, there are usually CREATE, ALTER, and DROP

statements (such as, CREATE TABLE, ALTER TABLE, and DROP TABLE). Most DDL statements take the

following form:

• CREATE object_name

• ALTER object_name

• DROP object_name

In DDL statements, object_name can be a table, view, trigger, stored procedure, and so on.

CREATE DATABASE

Many database servers allow for the presence of many databases. In order to create a database, a

relatively standard command ‘CREATE DATABASE’ is used.

The general format of the command is:

CREATE DATABASE <database-name> ;

The name can be pretty much anything; usually it shouldn’t contain spaces (or those spaces have to be properly escaped). Some databases allow hyphens and/or underscores in the name. The name is usually limited in size.

DROP DATABASE

Just like there is a ‘create database’ there is also a ‘drop database’, which simply removes the database.

Note that it doesn’t ask you for confirmation, and once you remove a database, it is gone forever.

DROP DATABASE <database-name> ;

CREATE TABLE

Probably the most common DDL statement is ‘CREATE TABLE’. Intuitively enough, it is used to create

tables. The general format is something along the lines of:

CREATE TABLE <table-name> (

...

);

The ... is where column definitions go. The general format for a column definition is the column name

followed by column type. For example:

PERSONID INT

This defines a column named PERSONID, of type INT. Column definitions have to be comma separated, e.g.:

CREATE TABLE PERSON (
    PERSONID INTEGER,
    LNAME TEXT (20),
    FNAME TEXT (20) NOT NULL,
    DOB DATE,
    PRIMARY KEY (PERSONID)
);

The above creates a table named PERSON, with person id, last name, first name, and date of birth. There is also the ‘primary key’ definition. A primary key is a column (or combination of columns) whose value uniquely identifies a database record. So for example, we can have two ‘person’ records with the same last name and first name, but with different ids.

Besides the primary key, there are many other constraints we can specify for table columns. For example, in the above example, FNAME is marked as NOT NULL, which means it is not allowed to have NULL values.

Many databases implement various extensions to the basics, and you should read the documentation to

determine what features are present/absent, and how to use them.
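To try the statement out, the sketch below runs the CREATE TABLE PERSON definition in SQLite through Python's sqlite3 module and checks the behaviour described above. SQLite is only one possible database, and it accepts but does not enforce the (20) length; the helper insert_person and the sample values are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE PERSON (
        PERSONID INTEGER,
        LNAME TEXT (20),
        FNAME TEXT (20) NOT NULL,
        DOB DATE,
        PRIMARY KEY (PERSONID)
    )
    """
)

def insert_person(person_id, lname, fname, dob):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO PERSON VALUES (?, ?, ?, ?)",
            (person_id, lname, fname, dob),
        )
        return True
    except sqlite3.IntegrityError:
        return False

first = insert_person(1, "Jones", "Bill", "1980-04-12")
same_name = insert_person(2, "Jones", "Bill", "1975-09-30")      # same names, different id
duplicate_id = insert_person(1, "Lee", "Samantha", "1990-01-01")  # primary key already used
missing_fname = insert_person(3, "Lee", None, "1990-01-01")       # FNAME is NOT NULL
print(first, same_name, duplicate_id, missing_fname)
```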

ALTER TABLE

The general syntax to add a field is:

ALTER TABLE <table-name>

ADD <field-name><data-type>;

The field declaration is pretty much exactly what it is in the ‘create table’ statement.

The general syntax to drop a field is:

ALTER TABLE <table-name>

DROP <field-name>;

Note that very few databases let you drop a field. The drop command is mostly present to allow for

dropping of constraints (such as indexes, etc.) on the table.

The general syntax to modify a field (change its type, etc.) is:

ALTER TABLE <table-name>

MODIFY <field-name><new-field-declaration>;

Note that you can only do this to a certain extent on most databases. Just as with ‘drop’, this is mostly

useful for working with table constraints (changing ‘not null’ to ‘null’, etc.)
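The add form can be tried out with a short sketch, again assuming SQLite through Python's sqlite3 module. Only ADD is shown, because SQLite does not implement MODIFY. The EMAIL field and the helper column_names are hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PERSON (PERSONID INTEGER PRIMARY KEY, FNAME TEXT NOT NULL)")
conn.execute("INSERT INTO PERSON (PERSONID, FNAME) VALUES (1, 'JOHN')")

def column_names(table):
    """List the column names of a table using SQLite's PRAGMA table_info."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

before = column_names("PERSON")
conn.execute("ALTER TABLE PERSON ADD EMAIL TEXT")  # add a field to an existing table
after = column_names("PERSON")

# Records that existed before the change hold NULL in the new field.
existing_email = conn.execute(
    "SELECT EMAIL FROM PERSON WHERE PERSONID = 1"
).fetchone()[0]
print(before, after, existing_email)
```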

Data Manipulation Language (DML)

DML is used to select, insert, update, or delete data in the objects defined with DDL. All database users

can use these commands during the routine operations on a database. The different DML statements

are as follows:

SELECT statement

INSERT statement

UPDATE statement

DELETE statement


SELECT Statement

Probably the most used statement in all of SQL is the SELECT statement. The select statement has the general format:

SELECT <column-list>

FROM <table-list>

WHERE <search-condition>;

The column-list indicates what columns you’re interested in (the ones which you want to appear in the

result), the table-list is the list of tables to be used in the query, and search-condition specifies what

criteria you’re looking for.

As a shorthand, * stands for all columns. For example, to retrieve all of the ‘person’ records we’ve been using:

SELECT * FROM PERSON;
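The difference between the shorthand and an explicit column list can be seen in a small sketch, assuming SQLite through Python's sqlite3 module. The two sample records are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PERSON (PERSONID INTEGER PRIMARY KEY, LNAME TEXT, "
    "FNAME TEXT NOT NULL, DOB DATE)"
)
conn.executemany(
    "INSERT INTO PERSON (PERSONID, LNAME, FNAME, DOB) VALUES (?, ?, ?, ?)",
    [(1, "DOE", "JOHN", "1956-11-23"), (2, "KEATS", "JANE", "1971-04-02")],
)

# Shorthand: every column of every record.
all_rows = conn.execute("SELECT * FROM PERSON").fetchall()

# Column list plus search condition: two columns of the matching record only.
names = conn.execute(
    "SELECT FNAME, LNAME FROM PERSON WHERE PERSONID = 2"
).fetchall()
print(all_rows)
print(names)
```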

The WHERE Clause

The WHERE clause is used in UPDATE, DELETE, and SELECT statements, and has the same format in all these cases. The condition is checked against each record and must evaluate to either true or false. Table 1 lists some of the common operators.

Operator   Meaning
=          equal to
>          greater than
<          less than
>=         greater than or equal to
<=         less than or equal to
<>         not equal to

There is also IS, which can be used to check for NULL values, for example: column-name IS NULL

We can also use AND, OR and parentheses to group expressions. Besides these operators, we can also call built-in functions, as well as stored procedures we define ourselves (if the database supports stored procedures).

An example of the operators in use would be:

something < 5 OR something IS NULL AND somedate = TO_DATE('01/03/93','MM/DD/YY')
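In SQL, AND binds more tightly than OR, so the expression above is read as something < 5 OR (something IS NULL AND somedate = ...). The sketch below checks this on a hypothetical ITEM table, assuming SQLite through Python's sqlite3 module. SQLite has no TO_DATE function, so the date is compared as a plain text value here, and the helper ids is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ITEM (ID INTEGER PRIMARY KEY, SOMETHING INTEGER, SOMEDATE TEXT)")
conn.executemany(
    "INSERT INTO ITEM (ID, SOMETHING, SOMEDATE) VALUES (?, ?, ?)",
    [
        (1, 3, "1990-06-01"),     # something < 5
        (2, None, "1993-01-03"),  # NULL and a matching date
        (3, None, "1990-06-01"),  # NULL but a different date
        (4, 9, "1993-01-03"),     # matching date, but something is 9
    ],
)

def ids(where):
    """Return the IDs of the records that satisfy the given WHERE condition."""
    sql = f"SELECT ID FROM ITEM WHERE {where} ORDER BY ID"
    return [row[0] for row in conn.execute(sql)]

unbracketed = ids("SOMETHING < 5 OR SOMETHING IS NULL AND SOMEDATE = '1993-01-03'")
and_grouped = ids("SOMETHING < 5 OR (SOMETHING IS NULL AND SOMEDATE = '1993-01-03')")
or_grouped = ids("(SOMETHING < 5 OR SOMETHING IS NULL) AND SOMEDATE = '1993-01-03'")
print(unbracketed, and_grouped, or_grouped)
```

The unbracketed condition selects the same records as the version with the AND part in parentheses. Grouping the OR part instead gives a different result, so parentheses are worth adding whenever AND and OR are mixed.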


INSERT INTO Statement

To get data into a database, we need to use the ‘insert’ statement. The general syntax is:

INSERT INTO <table-name> (<column1>,<column2>,<column3>,...)

VALUES (<column-value1>,<column-value2>,<column-value3>,...);

The column names (i.e. column1, etc.) must correspond, in order, to the column values (i.e. column-value1, etc.).

There is a short-hand for the statement:

INSERT INTO <table-name> VALUES (<column-value1>,<column-value2>,<column-value3>);

Here the column values must follow exactly the order in which the columns appear in the ‘create table’ declaration. This form of the statement should be avoided: if someone changes the table or moves columns around in the table declaration, code using the shorthand insert statement may fail or put values into the wrong columns.

A typical example, inserting a record into the ‘person’ table we created earlier, would be:

INSERT INTO PERSON (PERSONID,LNAME,FNAME,DOB) VALUES (1,'DOE','JOHN','1956-11-23');
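To see why naming the columns is safer than the shorthand, consider a sketch that assumes SQLite through Python's sqlite3 module and a hypothetical variant of PERSON in which FNAME and LNAME have swapped places in the declaration. The ? placeholders are sqlite3's way of passing values separately from the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical variant of PERSON: FNAME now comes before LNAME.
conn.execute(
    "CREATE TABLE PERSON (PERSONID INTEGER PRIMARY KEY, FNAME TEXT NOT NULL, "
    "LNAME TEXT, DOB DATE)"
)

# Explicit column list: each value is matched to its column by name.
conn.execute(
    "INSERT INTO PERSON (PERSONID, LNAME, FNAME, DOB) VALUES (?, ?, ?, ?)",
    (1, "DOE", "JOHN", "1956-11-23"),
)
# Shorthand: values are matched by position in the table declaration.
conn.execute("INSERT INTO PERSON VALUES (?, ?, ?, ?)", (2, "DOE", "JOHN", "1956-11-23"))

explicit_row = conn.execute(
    "SELECT FNAME, LNAME FROM PERSON WHERE PERSONID = 1"
).fetchone()
shorthand_row = conn.execute(
    "SELECT FNAME, LNAME FROM PERSON WHERE PERSONID = 2"
).fetchone()
print(explicit_row, shorthand_row)
```

In this sketch the shorthand insert does not raise an error. It quietly stores the last name in FNAME and the first name in LNAME, which can be harder to notice than an outright failure.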

UPDATE Statement

The update statement is used for changing records. The general syntax is:

UPDATE <table-name> SET <column1> = <value1>, <column2> = <value2>, ...

WHERE <criteria>;

The criteria select the records to update. The ‘set’ portion indicates which columns should be updated and to what values. An example of its use would be:

UPDATE PERSON

SET FNAME='Jane', LNAME='Keats'

WHERE FNAME='Jean';
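The update changes every record that matches the criteria, not just the first one. A minimal sketch, assuming SQLite through Python's sqlite3 module and three made-up sample records:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PERSON (PERSONID INTEGER PRIMARY KEY, LNAME TEXT, FNAME TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO PERSON (PERSONID, LNAME, FNAME) VALUES (?, ?, ?)",
    [(1, "Smith", "Jean"), (2, "Brown", "Jean"), (3, "Doe", "John")],
)

cursor = conn.execute(
    "UPDATE PERSON SET FNAME = 'Jane', LNAME = 'Keats' WHERE FNAME = 'Jean'"
)
updated = cursor.rowcount  # number of records changed by the statement
rows = conn.execute(
    "SELECT PERSONID, FNAME, LNAME FROM PERSON ORDER BY PERSONID"
).fetchall()
print(updated, rows)
```

Both records with the first name Jean are changed, while the third record is left alone. Leaving out the WHERE clause would apply the change to every record in the table.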


DELETE Statement

The ‘delete’ statement is used to remove records from a table. The syntax is very similar to that of the update and select statements:

DELETE FROM <table-name>

WHERE <criteria>;

Basically we select which records we want to delete using the where clause. An example use would be:

DELETE FROM PERSON

WHERE PERSONID=54598;
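A minimal sketch of the delete, assuming SQLite through Python's sqlite3 module and two made-up sample records. It uses the standard DELETE FROM form, which is what SQLite accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PERSON (PERSONID INTEGER PRIMARY KEY, FNAME TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO PERSON (PERSONID, FNAME) VALUES (?, ?)",
    [(54598, "JOHN"), (54599, "JANE")],
)

# Only the record selected by the WHERE clause is removed.
deleted = conn.execute("DELETE FROM PERSON WHERE PERSONID = 54598").rowcount
remaining = [row[0] for row in conn.execute("SELECT PERSONID FROM PERSON")]
print(deleted, remaining)
```

As with update, a delete without a WHERE clause removes every record in the table, so the criteria deserve a careful check before the statement is run.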
