UNIT-1
INTRODUCTION
DATABASE SYSTEM: DATABASE AND ITS PURPOSE
DATABASE:
A database is a collection of information that is organized so that it can be easily accessed, managed and
updated.
Data is organized into rows, columns and tables, and it is indexed to make it easier to find relevant information.
Data gets updated, expanded and deleted as new information is added. Databases process workloads to create
and update themselves, querying the data they contain and running applications against it.
Computer databases typically contain aggregations of data records or files, such as sales transactions, product
catalogs and inventories, and customer profiles.
1. To see why database management systems are necessary, let's look at a typical "file-processing system" supported by a conventional operating system.
The application is a savings bank:
o Savings account and customer records are kept in permanent system files.
o Application programs are written to manipulate files to perform the following tasks:
  • Debit or credit an account.
  • Add a new account.
  • Find an account balance.
  • Generate monthly statements.
2. Development of the system proceeds as follows:
o New application programs must be written as the need arises.
o New permanent files are created as required.
o Over a long period of time, files may be in different formats, and
o Application programs may be in different languages.
3. So we can see there are problems with the straight file-processing approach:
o Data redundancy and inconsistency
  • The same information may be duplicated in several places.
  • All copies may not be updated properly.
o Difficulty in accessing data
  • May have to write a new application program to satisfy an unusual request, e.g. find all customers with the same postal code.
  • Could generate this data manually, but a long job...
o Data isolation
  • Data in different files.
  • Data in different formats.
  • Difficult to write new application programs.
o Multiple users
  • Want concurrency for faster response time.
  • Need protection for concurrent updates. E.g. two customers withdrawing funds from the same account at the same time: the account has $500 in it, and they withdraw $100 and $50. The result could be $350, $400 or $450 if no protection.
o Security problems
  • Every user of the system should be able to access only the data they are permitted to see.
  • E.g. payroll people only handle employee records, and cannot see customer accounts; tellers only access account data and cannot see payroll data.
  • Difficult to enforce this with application programs.
o Integrity problems
  • Data may be required to satisfy constraints, e.g. no account balance below $25.00.
  • Again, difficult to enforce or to change constraints with the file-processing approach.
These problems and others led to the development of database management systems.
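The concurrent-withdrawal example above ($500 in the account, withdrawals of $100 and $50) can be checked with a small simulation. This is only a sketch under stated assumptions: each withdrawal is modeled as a separate "read balance" step followed by a "write balance" step, and the names `run` and `all_outcomes` are made up for this illustration. It enumerates every order in which the four steps can interleave when nothing protects the account.

```python
from itertools import permutations

AMOUNTS = {"A": 100, "B": 50}  # customer A withdraws $100, customer B withdraws $50


def run(schedule, start=500):
    """Replay one interleaving of read/write steps and return the final balance."""
    balance = start   # the value stored in the account file
    local = {}        # each customer's private copy of the balance
    for who, step in schedule:
        if step == "read":
            local[who] = balance                   # read the stored balance
        else:
            balance = local[who] - AMOUNTS[who]    # write back a possibly stale result
    return balance


def all_outcomes():
    """Collect the final balance for every valid unprotected interleaving."""
    steps = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
    outcomes = set()
    for order in permutations(steps):
        a_ok = order.index(("A", "read")) < order.index(("A", "write"))
        b_ok = order.index(("B", "read")) < order.index(("B", "write"))
        if a_ok and b_ok:  # each customer must read before writing
            outcomes.add(run(order))
    return outcomes


print(sorted(all_outcomes()))
```

Only the interleavings in which one withdrawal finishes before the other starts give the correct $350. The others lose one of the updates, which is the protection a DBMS is expected to provide.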
CHARACTERISTICS OF DATABASE APPROACH:
Traditionally, data was organized in files. The DBMS was a new concept then, and research focused on
overcoming the deficiencies of the traditional style of data management. A modern DBMS has the
following characteristics −
Real-world entity − A modern DBMS is more realistic and uses real-world entities to design its
architecture. It uses the behavior and attributes too. For example, a school database may use students as
an entity and their age as an attribute.
Relation-based tables − DBMS allows entities and relations among them to form tables. A user can understand the architecture of a database just by looking at the table names.
Isolation of data and application − A database system is entirely different from its data. A database is an active entity, whereas data is said to be passive: it is what the database works on and organizes. DBMS also stores metadata, which is data about data, to ease its own process.
Less redundancy − DBMS follows the rules of normalization, which splits a relation when any of its attributes has redundant values. Normalization is a mathematically rich and scientific process that reduces data redundancy.
Consistency − Consistency is a state where every relation in a database remains consistent. There exist methods and techniques that can detect an attempt to leave the database in an inconsistent state. A DBMS can provide greater consistency as compared to earlier forms of data storing applications like file-processing systems.
Query Language − DBMS is equipped with a query language, which makes it more efficient to retrieve and manipulate data. A user can apply as many and as different filtering options as required to retrieve a set of data. Traditionally this was not possible where a file-processing system was used.
ACID Properties − DBMS follows the concepts of Atomicity, Consistency, Isolation, and Durability
(normally shortened as ACID). These concepts are applied on transactions, which manipulate data in a
database. ACID properties help the database stay healthy in multi-transactional environments and in
case of failure.
Multiuser and Concurrent Access − DBMS supports a multi-user environment and allows users to access and manipulate data in parallel. There are restrictions on transactions when users attempt to handle the same data item, but users are generally unaware of them.
Multiple views − DBMS offers multiple views for different users. A user who is in the Sales department will have a different view of the database than a person working in the Production department. This feature enables the users to have a focused view of the database according to their requirements.
Security − Features like multiple views offer security to some extent, since users are unable to access data of other users and departments. DBMS offers methods to impose constraints while entering data into the database and retrieving the same at a later stage. DBMS offers many different levels of security features, which enable multiple users to have different views with different features. For example, a user in the Sales department cannot see the data that belongs to the Purchase department. Additionally, it can also be managed how much data of the Sales department should be displayed to the user. Since data is reached through the DBMS rather than read directly from files as in traditional file systems, it is harder for miscreants to get at it.
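The atomicity part of the ACID properties listed above can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch, not a description of how any particular DBMS is built: the `account` table, the `transfer` and `balances` helpers and the sample balances are all assumptions made for this example. The transfer credits the destination first and then debits the source, so a failed debit shows that the earlier credit is undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # no overdrafts allowed
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected the debit


def balances(conn):
    """Return the current balances as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account"))


failed = transfer(conn, 1, 2, 700)   # account 1 only holds 500
after_failed = balances(conn)
succeeded = transfer(conn, 1, 2, 200)
after_success = balances(conn)
print(failed, after_failed, succeeded, after_success)
```

After the failed transfer both balances are unchanged, even though the credit statement had already run before the debit was rejected.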
ADVANTAGES AND DISADVANTAGES OF DATABASE SYSTEM
Advantages:
1. Controlling Data Redundancy
In the conventional file processing system, every user group maintains its own files for handling its data files. This may lead to
• Duplication of same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors may be generated when the same data is updated in different files.
• Time in entering data again and again is wasted.
• Computer Resources are needlessly used.
• It is very difficult to combine information.
2. Elimination of Inconsistency
In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. If they are not, the data becomes inconsistent. So we need to remove this duplication of data in multiple files to eliminate inconsistency.
To avoid the above problem, there is a need to have a centralized database in order to avoid this conflicting information.
On centralizing the database, the duplication will be controlled and hence inconsistency will be removed.
3. Better service to the users
A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information in a timely manner from existing systems that were not designed to produce it.
Once several conventional systems are combined to form one centralized database, the availability of information and how up to date it is are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise.
Also, use of a DBMS should allow users who don’t know programming to interact with the data more easily, unlike a file processing system, where a programmer may need to write new programs to meet every new demand.
4. Flexibility of the System is Improved
Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system.
Application programs need not be changed when the data in the database changes.
5. Integrity can be improved
Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to entry of incorrect data in some of the files.
Even if we centralize the database, it may still contain incorrect data. For example:
• The salary of a full-time clerk may be entered as Rs. 1500 rather than Rs. 4500.
• A student may be shown to have borrowed library books but has no enrollment.
The above problems can be avoided by defining validation procedures that run whenever any update operation is attempted.
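One way to picture such validation is with declarative constraints, sketched here with Python's `sqlite3` module. The table names, the helper `try_insert` and the salary range of Rs. 4000 to Rs. 10000 are assumptions invented for this illustration. The notes only say that Rs. 1500 is a wrong entry for a clerk whose salary is Rs. 4500.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE clerk ("
    " name TEXT PRIMARY KEY,"
    " salary INTEGER NOT NULL CHECK (salary BETWEEN 4000 AND 10000))"  # assumed range
)
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE loan ("
    " book TEXT NOT NULL,"
    " student_id INTEGER NOT NULL REFERENCES student(student_id))"  # borrower must be enrolled
)


def try_insert(sql, params):
    """Attempt one insert; return False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


bad_salary = try_insert("INSERT INTO clerk VALUES (?, ?)", ("Asha", 1500))
good_salary = try_insert("INSERT INTO clerk VALUES (?, ?)", ("Asha", 4500))
loan_without_enrollment = try_insert("INSERT INTO loan VALUES (?, ?)", ("DBMS Notes", 7))
enrolled = try_insert("INSERT INTO student VALUES (?, ?)", (7, "Ravi"))
loan_after_enrollment = try_insert("INSERT INTO loan VALUES (?, ?)", ("DBMS Notes", 7))
print(bad_salary, good_salary, loan_without_enrollment, enrolled, loan_after_enrollment)
```

The checks live in the database rather than in each application program, so every update path goes through the same validation.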
6. Standards can be enforced
Standards are easier to enforce in database systems because all the data in the database is accessed through a centralized DBMS.
Here standards may relate to the naming of data, structure of data, format of the data etc.
Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved
In conventional systems, applications are developed in an ad hoc manner.
Often different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult.
Setting up of a database makes it easier to enforce security restrictions since data is now centralized.
It is easier to control who has access to what parts of the database. Different checks can be established for each type of access (retrieve, modify, delete etc.) to each piece of information in the database.
8. Organization’s requirement can be easily identified
All organizations have sections and departments and each of these units often consider the work of their unit as the most important and therefore consider their need as the most important.
Once a database has been set up with centralized control, it will be necessary to identify the organization’s requirements and to balance the needs of the competing units.
So it may become necessary to ignore some requests for information if they conflict with higher priority need of the organization.
9. Data Model must be developed
Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as per the needs of particular applications.
The overall view is often not considered. Building an overall view of an organization’s data is usually cost effective in the long term.
10. Provides backup and Recovery
Centralizing a database provides schemes for backup and recovery from failures, including disk crashes, power failures and software errors. These may help the database to recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
Disadvantages of Database Systems
The following are the disadvantages of Database Systems
1. Database Complexity
The design of a database system is a complex, difficult and very time-consuming task.
2. Substantial hardware and software start-up costs
A large investment is needed to set up the required hardware and the software needed to run those applications.
3. Damage to database affects virtually all application programs
If one part of the database is corrupted or damaged because of a hardware or software failure, all the application programs that depend on the database are affected, since there are not many separate copies of the data as in a file-based system.
4. Extensive conversion costs in moving from a file-based system to a database system
If you are currently working on a file-based system and need to upgrade it to a database system, a large cost is incurred in purchasing different tools and adopting different techniques as per the requirement.
5. Initial training required for all programmers and users
A large amount of human effort, time and cost is needed to train the end users and application programmers to get used to the database system.
CLASSIFICATION OF DBMS USERS
A typical DBMS has users with different rights and permissions who use it for different purposes. Some users
retrieve data and some back it up. The users of a DBMS can be broadly categorized as follows −
Administrators − Administrators maintain the DBMS and are responsible for administrating the database. They are responsible for looking after its usage and deciding by whom it should be used. They create
access profiles for users and apply limitations to maintain isolation and enforce security. Administrators
also look after DBMS resources like system license, required tools, and other software and hardware
related maintenance.
Designers − Designers are the group of people who actually work on the designing part of the database.
They keep a close watch on what data should be kept and in what format. They identify and design the
whole set of entities, relations, constraints, and views.
End Users − End users are those who actually reap the benefits of having a DBMS. End users can
range from simple viewers who pay attention to the logs or market rates to sophisticated users such as
business analysts.
ACTORS ON THE SCENE
For a small personal database, such as the list of addresses discussed in the previous section, one person typically
defines, constructs, and manipulates the database, and there is no sharing. However, in large organizations, many
people are involved in the design, use, and maintenance of a large database with hundreds of users. In this
section we identify the people whose jobs involve the day-to-day use of a large database; we call them the actors
on the scene. In Section 1.5 we consider people who may be called workers behind the scene—those who work
to maintain the database system environment but who are not actively interested in the database contents as part
of their daily job.
1. Database Administrators
In any organization where many people use the same resources, there is a need for a chief administrator to
oversee and manage these resources. In a database environment, the primary resource is the database itself, and
the secondary resource is the DBMS and related software. Administering these resources is the responsibility of
the database administrator (DBA). The DBA is responsible for authorizing access to the database,
coordinating and monitoring its use, and acquiring software and hardware resources as needed. The DBA is
accountable for problems such as security breaches and poor system response time. In large organizations, the
DBA is assisted by a staff that carries out these functions.
2. Database Designers
Database designers are responsible for identifying the data to be stored in the database and for choosing
appropriate structures to represent and store this data. These tasks are mostly undertaken before the database is
actually implemented and populated with data. It is the responsibility of database designers to communicate
with all prospective database users in order to understand their requirements and to create a design that meets
these requirements. In many cases, the designers are on the staff of the DBA and may be assigned other staff
responsibilities after the database design is completed. Database designers typically interact with each potential
group of users and develop views of the database that meet the data and processing requirements of these
groups. Each view is then analyzed and integrated with the views of other user groups. The final database design
must be capable of supporting the requirements of all user groups.
3. End Users
End users are the people whose jobs require access to the database for querying, updating, and generating
reports; the database primarily exists for their use. There are several categories of end users:
Casual end users occasionally access the database, but they may need different information each time.
They use a sophisticated database query language to specify their requests and are typically middle- or high-
level managers or other occasional browsers.
Naive or parametric end users make up a sizable portion of database end users. Their main job
function revolves around constantly querying and updating the database, using standard types of queries and
updates—called canned transactions—that have been carefully programmed and tested. The tasks that such
users perform are varied:
Bank tellers check account balances and post withdrawals and deposits.
Reservation agents for airlines, hotels, and car rental companies check availability for a given request and make
reservations.
Employees at receiving stations for shipping companies enter package identifications via bar codes and
descriptive information through buttons to update a central database of received and in-transit packages.
Sophisticated end users include engineers, scientists, business analysts,
and others who thoroughly familiarize themselves with the facilities of the DBMS in order to implement their
own applications to meet their complex requirements.
Standalone users maintain personal databases by using ready-made program
packages that provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax
package that stores a variety of personal financial data for tax purposes.
A typical DBMS provides multiple facilities to access a database. Naive end users need to learn very little about
the facilities provided by the DBMS; they simply have to understand the user interfaces of the standard
transactions designed and implemented for their use. Casual users learn only a few facilities that they may use
repeatedly. Sophisticated users try to learn most of the DBMS facilities in order to achieve their complex
requirements. Standalone users typically become very proficient in using a specific software package.
4. System Analysts and Application Programmers
(Software Engineers)
System analysts determine the requirements of end users, especially naive and parametric end users, and
develop specifications for standard canned transactions that meet these requirements. Application
programmers implement these specifications as programs; then they test, debug, document, and maintain these
canned transactions. Such analysts and programmers—commonly referred to as software developers
or software engineers—should be familiar with the full range of capabilities provided by the DBMS to
accomplish their tasks.
UNIT-2
DATABASE SYSTEM CONCEPT AND ARCHITECTURE
DATA MODELS:
Data models define how the logical structure of a database is modeled. Data Models are fundamental entities to
introduce abstraction in a DBMS. Data models define how data is connected to each other and how they are
processed and stored inside the system.
The earliest data models were flat data models, in which all the data is kept in the same plane.
These early models were not very rigorous, so they were prone to introducing lots of duplication and update
anomalies.
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships among them.
When a real-world scenario is formulated as a database model, the ER Model creates entity sets, relationship
sets, general attributes and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
Entities and their attributes.
Relationships among entities.
These concepts are explained below.
Entity − An entity in an ER Model is a real-world entity having properties called attributes.
Every attribute is defined by its set of values called domain. For example, in a school database, a
student is considered as an entity. Student has various attributes like name, age, class, etc.
Relationship − The logical association among entities is called a relationship. Relationships are mapped
with entities in various ways. Mapping cardinalities define the number of associations between two
entities.
Mapping cardinalities −
o one to one
o one to many
o many to one
o many to many
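The four mapping cardinalities can be made concrete with a short Python sketch. The function name `mapping_cardinality` and the sample data are hypothetical. Note that it classifies one observed set of associations, whereas in an ER design the cardinality is a constraint chosen by the designer. Here "one to many" means that one entity on the left may be associated with many entities on the right.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a relationship set given as (left_entity, right_entity) pairs."""
    rights_of = defaultdict(set)  # left entity  -> right entities it is linked to
    lefts_of = defaultdict(set)   # right entity -> left entities it is linked to
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)
    # some left entity is linked to several right entities
    left_has_many = any(len(linked) > 1 for linked in rights_of.values())
    # some right entity is linked to several left entities
    right_has_many = any(len(linked) > 1 for linked in lefts_of.values())
    if left_has_many and right_has_many:
        return "many to many"
    if left_has_many:
        return "one to many"
    if right_has_many:
        return "many to one"
    return "one to one"


# Hypothetical school data in the spirit of the student example above.
head_of = [("Mr. Rao", "Class 9"), ("Ms. Iyer", "Class 10")]
teaches = [("Mr. Rao", "Asha"), ("Mr. Rao", "Ravi")]
enrolled_in = [("Asha", "Class 9"), ("Ravi", "Class 9")]
studies = [("Asha", "Maths"), ("Asha", "Physics"), ("Ravi", "Maths")]

for name, pairs in [("head_of", head_of), ("teaches", teaches),
                    ("enrolled_in", enrolled_in), ("studies", studies)]:
    print(name, "->", mapping_cardinality(pairs))
```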
Relational Model
The most popular data model in DBMS is the Relational Model. It is more scientific a model than others. This
model is based on first-order predicate logic and defines a table as an n-ary relation.
The main highlights of this model are −
Data is stored in tables called relations.
Relations can be normalized.
In normalized relations, values saved are atomic values.
Each row in a relation is unique.
Each column in a relation contains values from the same domain.
Database Schema
A database schema is the skeleton structure that represents the logical view of the entire database. It defines
how the data is organized and how the relations among them are associated. It formulates all the constraints that
are to be applied on the data.
A database schema defines its entities and the relationship among them. It contains a descriptive detail of the
database, which can be depicted by means of schema diagrams. It’s the database designers who design the
schema to help programmers understand the database and make it useful.
A database schema can be divided broadly into two categories −
Physical Database Schema − This schema pertains to the actual storage of data and its form of storage
like files, indices, etc. It defines how the data will be stored in a secondary storage.
Logical Database Schema − This schema defines all the logical constraints that need to be applied on the data stored. It defines tables, views, and integrity constraints.
Database Instance
It is important to distinguish the two terms schema and instance. A database schema is the skeleton of the database. It
is designed when the database doesn't exist at all. Once the database is operational, it is very difficult to make
any changes to the schema. A database schema does not contain any data or information.
A database instance is a state of operational database with data at any given time. It contains a snapshot of the
database. Database instances tend to change with time. A DBMS ensures that its every instance (state) is in a
valid state, by diligently following all the validations, constraints, and conditions that the database designers
have imposed.
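The difference between schema and instance can be seen in a few lines of Python using `sqlite3`. This is a sketch: the `student` table and its rows are assumptions for illustration, and `sqlite_master` is SQLite's own catalog table, so other systems expose their schema differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)


def schema(conn):
    """Return the stored table definitions (the skeleton, with no data)."""
    return [row[0] for row in conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")]


def instance(conn):
    """Return the rows currently in the table (the state at this moment)."""
    return conn.execute(
        "SELECT student_id, name, age FROM student ORDER BY student_id"
    ).fetchall()


schema_before = schema(conn)
state_1 = instance(conn)                                    # no rows yet
conn.execute("INSERT INTO student VALUES (1, 'Ravi', 15)")
state_2 = instance(conn)                                    # one row
conn.execute("INSERT INTO student VALUES (2, 'Sana', 16)")
state_3 = instance(conn)                                    # two rows
schema_after = schema(conn)

print(schema_before == schema_after)  # the schema did not change
print(state_1, state_2, state_3)      # the instance changed with every insert
```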
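Database State: In DBMS terminology, the database state is the data held in the database at a particular moment, that is, the database instance described above. The phrase ‘database state’ is also used in a separate, non-technical sense, for the tendency to try to use computers to manage society by watching people.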
There are many interlocking government plans that do this. Together they mean officials poking into YOUR private life more than ever before.
All the databases could be linked to, or indexed by, the National Identity Register (NIR) that is the main aim of
the ‘ID cards’ scheme. Your NIR number would be the key to your whole life. And by “information sharing”,
what you tell one public servant could be passed to anyone. The government name: “Transformational
Government” sounds nicer — until you understand what is being transformed is not government but its power
over you.
Currently the planned systems include:
ePassports that help collect data about your travel for…
International eBorders schemes that exchange
Passenger Name Record information with foreign countries as well as collecting them
Recording of all car journeys, using Automatic Number Plate Recognition (ANPR)
‘Entitlement cards’ as part of, or linked to the ID scheme, logging use of public services
Centralised medical records without privacy
Biometrics in schools — fingerprinting children as young as 4 or 5
‘ContactPoint’, a database collecting sensitive information on every child
Fingerprinting in pubs and bars — landlords forced to monitor their patrons
A greatly expanded National DNA Database (NDNAD)
New police powers to check identity
Increasing Criminal Records Bureau (CRB) checks for employees and volunteers
Businesses under pressure to verify ID of staff and customers with the government
Database Architecture:
The design of a DBMS depends on its architecture. It can be centralized or decentralized or hierarchical. The
architecture of a DBMS can be seen as either single tier or multi-tier. An n-tier architecture divides the whole
system into related but independent n modules, which can be independently modified, altered, changed, or
replaced.
In 1-tier architecture, the user works directly on the DBMS. Any
changes done here are made directly on the DBMS itself. It does not provide handy tools for end-users.
Database designers and programmers normally prefer to use single-tier architecture.
If the architecture of DBMS is 2-tier, then it must have an application through which the DBMS can be
accessed. Programmers use 2-tier architecture where they access the DBMS by means of an application. Here
the application tier is entirely independent of the database in terms of operation, design, and programming.
3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the complexity of the users and how they use
the data present in the database. It is the most widely used architecture to design a DBMS.
Database (Data) Tier − At this tier, the database resides along with its query processing languages. We
also have the relations that define the data and their constraints at this level.
Application (Middle) Tier − At this tier reside the application server and the programs that access the database. For a user, this application tier presents an abstracted view of the database. End-users are
unaware of any existence of the database beyond the application. At the other end, the database tier is
not aware of any other user beyond the application tier. Hence, the application layer sits in the middle
and acts as a mediator between the end-user and the database.
User (Presentation) Tier − End-users operate on this tier and they know nothing about any existence
of the database beyond this layer. At this layer, multiple views of the database can be provided by the
application. All views are generated by applications that reside in the application tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are independent and can
be changed independently.
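The separation between the three tiers can be sketched in Python. Everything here is illustrative and assumed for the example: the `account` table, the `AccountService` class and the `render_balance` function are hypothetical names, and a real deployment would usually run the tiers as separate processes or machines rather than in one script.

```python
import sqlite3


# Database (data) tier: holds the relations and their constraints.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (account_no INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
    )
    conn.execute("INSERT INTO account VALUES (101, 'Meera', 500)")
    conn.commit()
    return conn


# Application (middle) tier: the only code that talks to the database.
class AccountService:
    def __init__(self, conn):
        self._conn = conn

    def balance_of(self, account_no):
        """Return the balance, or None if the account does not exist."""
        row = self._conn.execute(
            "SELECT balance FROM account WHERE account_no = ?", (account_no,)
        ).fetchone()
        return None if row is None else row[0]


# User (presentation) tier: sees only the service, never SQL or tables.
def render_balance(service, account_no):
    balance = service.balance_of(account_no)
    if balance is None:
        return f"Account {account_no}: not found"
    return f"Account {account_no}: balance {balance}"


service = AccountService(open_database())
print(render_balance(service, 101))
print(render_balance(service, 999))
```

Because the presentation code depends only on `AccountService`, the storage behind it could be replaced without touching `render_balance`, which is the modifiability the notes describe.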
Data Independence:
A database system normally contains a lot of data in addition to users’ data. For example, it stores data about
data, known as metadata, to locate and retrieve data easily. It is rather difficult to modify or update a set of
metadata once it is stored in the database. But as a DBMS expands, it needs to change over time to satisfy the
requirements of the users. If the entire data is dependent, it would become a tedious and highly complex job.
Metadata itself follows a layered architecture, so that when we change data at one layer, it does not affect the
data at another level. This data is independent but mapped to each other.
Logical Data Independence
Logical data is data about the database, that is, information about how data is managed inside it: for
example, a table (relation) stored in the database and all the constraints applied on that relation.
Logical data independence is a mechanism that decouples this logical description from the actual data stored on the disk.
If we make changes to the table format, the data residing on the disk should not change.
Physical Data Independence
All the schemas are logical, and the actual data is stored in bit format on the disk. Physical data independence is
the power to change the physical data without impacting the schema or logical data.
For example, in case we want to change or upgrade the storage system itself − suppose we want to replace hard-disks with SSD − it should not have any impact on the logical data or schemas.
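Both kinds of independence can be demonstrated on a small scale with `sqlite3`. This is a hedged sketch: adding an index stands in for a physical storage change (far smaller than swapping disks), and the logical part follows the common textbook reading that a table definition can grow without breaking an existing view. That reading is an assumption about what the notes intend. Table, view and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ravi", 15), (2, "Sana", 16), (3, "Tom", 15)],
)
conn.execute("CREATE VIEW student_names AS SELECT student_id, name FROM student")

QUERY = "SELECT name FROM student WHERE age = 15 ORDER BY name"
VIEW_QUERY = "SELECT student_id, name FROM student_names ORDER BY student_id"

# Physical change: add an index. How rows are located changes, not what a query returns.
before_index = conn.execute(QUERY).fetchall()
conn.execute("CREATE INDEX idx_student_age ON student(age)")
after_index = conn.execute(QUERY).fetchall()

# Logical change: add a column to the table. The existing view is unaffected.
view_before = conn.execute(VIEW_QUERY).fetchall()
conn.execute("ALTER TABLE student ADD COLUMN section TEXT")
view_after = conn.execute(VIEW_QUERY).fetchall()

print(before_index == after_index, view_before == view_after)
```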
Database language and interface:
We need a method to create all the logical objects like tables, views, procedures and packages in the database, and we need some interface between the user and the database so that we can access the data stored in it. We also need a standardized method to organize these tables and views in the database.
DBMS is software that defines the different operations to be carried out in the database. These range from creating a database, tables, indexes and constraints to manipulating the data in the database: inserting, deleting, updating, retrieving, sorting etc. In order to perform all these operations, DBMS defines database languages.
Below are the database languages.
DDL Data Definition Language
DML Data Manipulation Language
DCL Data Control Language
TCL Transaction Control Language
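A short `sqlite3` session shows one statement from each language family. This is only a sketch with an invented `marks` table. SQLite has no user accounts, so the DCL statement appears only as a comment. `GRANT` and `REVOKE` are standard SQL, and `some_user` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure.
conn.execute("CREATE TABLE marks (student_id INTEGER, subject TEXT, score INTEGER)")

# DML: insert and update data.
conn.execute("INSERT INTO marks VALUES (1, 'Maths', 80)")
conn.execute("UPDATE marks SET score = 85 WHERE student_id = 1 AND subject = 'Maths'")

# TCL: make the changes permanent, then undo a later change.
conn.commit()
conn.execute("DELETE FROM marks")  # DML again, inside a new transaction
conn.rollback()                    # TCL: the delete is undone

# DCL (not supported by SQLite, shown for completeness):
#   GRANT SELECT ON marks TO some_user;
#   REVOKE SELECT ON marks FROM some_user;

rows = conn.execute("SELECT student_id, subject, score FROM marks").fetchall()
print(rows)
```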
Interfaces are procedures, functions or packages that are used for interacting with the user or with other applications.
We have tables and their records, and we know how to access them. Imagine we have to calculate the total of 5 subjects for a student that the user asks for, and display all the marks as well as the total in a report format.
What would be the steps involved in it?
We will first retrieve the marks of the student whose name/ID the user requested. Then we would calculate the total of the 5 subjects. Then we will display the results for the student in the report format. This can be done by firing individual queries. But the user might not be aware of all these queries.
Imagine the parents of the student want to see their son's or daughter's result. They cannot fire database queries. They will just enter his/her name or ID and expect to see the result. What we would do in such a case is group all these related steps into a block and name it. When parents enter the ID of their son/daughter, we call this named block for that particular ID and get the result. This named block is called a procedure/function. A developer can call it each time there is a requirement to perform the same task.
Procedure
A procedure is a named PL/SQL block which executes one or more related task. Standard syntax for a procedure is as below:
CREATE [OR REPLACE] PROCEDURE procedure_name
[ (parameter [,parameter]) ]
IS
[declaration_section]
BEGIN
executable_section
[EXCEPTION
exception_section]
END [procedure_name];
Where
Parameter is one or more values input to the procedure, on which the task is performed or which are used to perform the task. In our above example, STUDENT_ID is the input parameter. For the ID input to the procedure, the marks and totals are displayed. Parameters are of 3 types: IN, OUT and IN OUT.
IN parameter values cannot be modified by the procedure/function. It is like a read-only value input to the procedure/function.
OUT parameters act as result parameters. They do not carry any input value, but the result of the procedure/function is passed back to the calling program.
IN OUT parameters pass some value in to the procedure/function, and the result of the tasks within is sent back to the calling program using the same parameter.
Note: If the parameter type is not mentioned explicitly, IN is taken as the default parameter type.
Declaration_Section holds the set of variables that are local to the procedure and are used to perform the various tasks.
Executable_Section is the actual body of the procedure, where the one or more steps of the task are performed. In our example above, this section would fetch the marks of the student, calculate the total and display the result in a report.
Exception_Section is the error-handling section of the procedure. Suppose an error occurs while performing the tasks, say no data is found for the entered student; there is then no need to perform the rest of the actions. Without handling, the procedure raises an error and fails, and the user is not happy to see a raw error on the page. A proper error message serves them better. Hence, for pre-defined error situations, we capture the error and define a proper message or an alternative set of actions.
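The idea of a named block with an input parameter, an executable section and an exception section can be sketched outside PL/SQL as well. The following is a minimal sketch in Python using the standard sqlite3 module; SQLite has no stored procedures, so a Python function plays the role of the procedure. The table marks, its columns, the sample scores and the function name show_result are assumptions made for illustration and do not come from these notes.

```python
import sqlite3

# Hypothetical schema: one row per (student, subject) with the marks scored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marks (student_id INTEGER, subject TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO marks VALUES (?, ?, ?)",
    [(100, "Maths", 80), (100, "Physics", 70), (101, "Maths", 65)],
)


def show_result(student_id):
    """Plays the role of the named block; student_id acts like an IN parameter."""
    # Executable section: fetch the marks of the student.
    rows = conn.execute(
        "SELECT subject, score FROM marks WHERE student_id = ? ORDER BY subject",
        (student_id,),
    ).fetchall()
    # Exception section: turn "no data found" into a readable message.
    if not rows:
        return f"No result found for student {student_id}"
    total = sum(score for _, score in rows)
    lines = [f"{subject}: {score}" for subject, score in rows]
    return "\n".join(lines + [f"Total: {total}"])


report = show_result(100)   # a student with marks
missing = show_result(999)  # an ID with no rows
```

The caller only supplies an ID and receives either a report or a friendly message, which is the behaviour the exception section is meant to guarantee.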
CLASSIFICATION OF DISTRIBUTED SYSTEM
NOTE: SAME AS ABOVE
UNIT-3
Data Modeling using E.R. Model (Entity Relationship Model)
1.Data Models Classification
A model is a representation of reality, 'real world' objects and events, associations. It is an abstraction that
concentrates on the essential, inherent aspects of an organization and ignores the accidental properties. A data
model represents the organization itself. It should provide the basic concepts and notations that will
allow database designers and end users unambiguously and accurately to communicate their understanding of
the organizational data.
Data Model can be defined as an integrated collection of concepts for describing and manipulating data,
relationships between data, and constraints on the data in an organization.
A data model comprises three components:
• A structural part, consisting of a set of rules according to which databases can be constructed.
• A manipulative part, defining the types of operation that are allowed on the data (this includes the operations
that are used for updating or retrieving data from the database and for changing the structure of the database).
• Possibly a set of integrity rules, which ensures that the data is accurate.
The purpose of a data model is to represent data and to make the data understandable. There have been many
data models proposed in the literature. They fall into three broad categories:
• Object Based Data Models
• Physical Data Models
• Record Based Data Models
The object based and record based data models are used to describe data at the conceptual and external levels;
the physical data model is used to describe data at the internal level.
Object Based Data Models
Object based data models use concepts such as entities, attributes, and relationships. An entity is a distinct object
(a person, place, concept, or event) in the organization that is to be represented in the database. An attribute is a
property that describes some aspect of the object that we wish to record, and a relationship is an association
between entities.
Some of the more common types of object based data model are:
• Entity-Relationship
• Object Oriented
• Semantic
• Functional
The Entity-Relationship model has emerged as one of the main techniques for modeling database design and
forms the basis for the database design methodology. The object oriented data model extends the definition of an
entity to include, not only the attributes that describe the state of the object but also the actions that are
associated with the object, that is, its behavior. The object is said to encapsulate both state and behavior. Entities
in semantic systems represent the equivalent of a record in a relational system or an object in an OO system but
they do not include behaviour (methods). They are abstractions used to represent real world (e.g. customer) or
conceptual (e.g. bank account) objects. The functional data model is now almost twenty years old. The original
idea was to view the database as a collection of extensionally defined functions and to use a functional language
for querying the database.
Physical Data Models
Physical data models describe how data is stored in the computer, representing information such as record
structures, record ordering, and access paths. There are not as many physical data models as logical data models,
the most common one being the Unifying Model.
Record Based Logical Models
Record based logical models are used in describing data at the logical and view levels. In contrast to object
based data models, they are used to specify the overall logical structure of the database and to provide a higher-
level description of the implementation. Record based models are so named because the database is structured in
fixed format records of several types. Each record type defines a fixed number of fields, or attributes, and each
field is usually of a fixed length.
The three most widely accepted record based data models are:
• Hierarchical Model
• Network Model
• Relational Model
The relational model has gained favor over the other two in recent years. The network and hierarchical models
are still used in a large number of older databases.
2.ENTITY-RELATIONSHIP MODEL
The ER model defines the conceptual view of a database. It works around real-world entities and the
associations among them. At view level, the ER model is considered a good option for designing databases.
Entity
An entity can be a real-world object, either animate or inanimate, that is easily identifiable. For example,
in a school database, students, teachers, classes, and courses offered can be considered as entities. All these
entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain entities whose attributes share
similar values. For example, a Students set may contain all the students of a school; likewise a Teachers set
may contain all the teachers of a school from all faculties. Entity sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have values. For example,
a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a student's name
cannot be a numeric value. It has to be alphabetic. A student's age cannot be negative, etc.
Types of Attributes
Simple attribute − Simple attributes are atomic values, which cannot be divided further. For example,
a student's phone number is an atomic value of 10 digits.
Composite attribute − Composite attributes are made of more than one simple attribute. For example, a student's complete name may have first_name and last_name.
Derived attribute − Derived attributes are the attributes that do not exist in the physical database, but their values are derived from other attributes present in the database. For example, average_salary in a
department should not be saved directly in the database, instead it can be derived. For another example,
age can be derived from date_of_birth.
Single-value attribute − Single-value attributes contain a single value. For example − Social_Security_Number.
Multi-value attribute − Multi-value attributes may contain more than one value. For example, a
person can have more than one phone number, email_address, etc.
These attribute types can come together in a way like −
simple single-valued attributes
simple multi-valued attributes
composite single-valued attributes
composite multi-valued attributes
Entity-Set and Keys
A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set.
For example, the roll_number of a student makes him/her identifiable among students.
Super Key − A set of attributes (one or more) that collectively identifies an entity in an entity set.
Candidate Key − A minimal super key is called a candidate key. An entity set may have more than one candidate key.
Primary Key − A primary key is one of the candidate keys chosen by the database designer to uniquely
identify the entities in the entity set.
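The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is in Python; the student records and attribute names are hypothetical. Note that a key is really a property of the schema (it must hold for every possible set of entities), so a check on one sample can only rule keys out, not prove them.

```python
from itertools import combinations

# Hypothetical student entity set; attribute names and values are illustrative.
students = [
    {"roll_number": 1, "email": "a@x.org", "name": "Mira", "class": "X"},
    {"roll_number": 2, "email": "b@x.org", "name": "Ravi", "class": "X"},
    {"roll_number": 3, "email": "c@x.org", "name": "Mira", "class": "IX"},
    {"roll_number": 4, "email": "d@x.org", "name": "Mira", "class": "X"},
]


def is_super_key(entities, attrs):
    """True if no two entities agree on all attributes in attrs."""
    seen = {tuple(e[a] for a in attrs) for e in entities}
    return len(seen) == len(entities)


def candidate_keys(entities):
    """Minimal super keys: super keys with no proper subset that is a super key."""
    attrs = sorted(entities[0])
    found = []
    # Grow the size step by step so that smaller keys are discovered first.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_smaller_key = any(set(k) < set(combo) for k in found)
            if is_super_key(entities, combo) and not contains_smaller_key:
                found.append(combo)
    return found


keys = candidate_keys(students)
```

In this sample, (name, roll_number) is a super key but not a candidate key, because roll_number alone already identifies every student. A designer would then pick one of the candidate keys as the primary key.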
Relationship
The association among entities is called a relationship. For example, an employee works_at a department, a
student enrolls in a course. Here, Works_at and Enrolls are called relationships.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have
attributes. These attributes are called descriptive attributes.
Degree of Relationship
The number of participating entities in a relationship defines the degree of the relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
Mapping Cardinalities
Cardinality defines the number of entities in one entity set, which can be associated with the number of
entities of other set via relationship set.
One-to-one − One entity from entity set A can be associated with at most one entity of entity set B and vice versa.
One-to-many − One entity from entity set A can be associated with more than one entity of entity set B; however, an entity from entity set B can be associated with at most one entity of entity set A.
Many-to-one − Each entity from entity set A can be associated with at most one entity of entity set B; however, an entity from entity set B can be associated with more than one entity from entity set A.
Many-to-many − One entity from A can be associated with more than one entity from B and vice versa.
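The four mapping cardinalities can be told apart by counting, for each entity, how many partners it has on the other side. The following Python sketch classifies a relationship set given as pairs; the function name and the sample pairs are assumptions for illustration. It describes one concrete set of pairs, whereas in a design the cardinality is a constraint the designer declares.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs from entity sets A and B."""
    pairs = set(pairs)
    per_a = Counter(a for a, _ in pairs)  # number of B partners of each A entity
    per_b = Counter(b for _, b in pairs)  # number of A partners of each B entity
    a_has_many = any(n > 1 for n in per_a.values())
    b_has_many = any(n > 1 for n in per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# Several employees work at the same department, each at only one.
works_at = [("emp1", "sales"), ("emp2", "sales"), ("emp3", "hr")]
# A student enrolls in several courses and a course has several students.
enrolls = [("s1", "db"), ("s1", "os"), ("s2", "db")]

works_at_kind = mapping_cardinality(works_at)
enrolls_kind = mapping_cardinality(enrolls)
```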
Let us now learn how the ER Model is represented by means of an ER diagram. Any object, for example,
entities, attributes of an entity, relationship sets, and attributes of relationship sets, can be represented with the
help of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.
Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every ellipse represents
one attribute and is directly connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree like structure. Every node is then connected to
its attribute. That is, composite attributes are represented by ellipses that are connected with an ellipse.
Multivalued attributes are depicted by double ellipse.
Derived attributes are depicted by dashed ellipse.
Relationship
Relationships are represented by diamond-shaped box. Name of the relationship is written inside the diamond-
box. All the entities (rectangles) participating in a relationship, are connected to it by a line.
Binary Relationship and Cardinality
A relationship where two entities are participating is called a binary relationship. Cardinality is the number of
instances of an entity that can be associated with the relationship.
One-to-one − When only one instance of an entity is associated with the relationship, it is marked as
'1:1'. The following image reflects that only one instance of each entity should be associated with the
relationship. It depicts one-to-one relationship.
One-to-many − When more than one instance of an entity is associated with a relationship, it is marked
as '1:N'. The following image reflects that only one instance of entity on the left and more than one
instance of an entity on the right can be associated with the relationship. It depicts one-to-many
relationship.
Many-to-one − When more than one instance of entity is associated with the relationship, it is marked
as 'N:1'. The following image reflects that more than one instance of an entity on the left and only one
instance of an entity on the right can be associated with the relationship. It depicts many-to-one
relationship.
Many-to-many − The following image reflects that more than one instance of an entity on the left and more than one instance of an entity on the right can be associated with the relationship. It depicts
many-to-many relationship.
Participation Constraints
Total Participation − Each entity is involved in the relationship. Total participation is represented by
double lines.
Partial participation − Not all entities are involved in the relationship. Partial participation is
represented by single lines.
The ER Model has the power of expressing database entities in a conceptual hierarchical manner. As the
hierarchy goes up, it generalizes the view of entities, and as we go deep in the hierarchy, it gives us the detail of
every entity included.
Going up in this structure is called generalization, where entities are clubbed together to represent a more
generalized view. For example, a particular student named Mira can be generalized along with all the students.
The entity shall be a student, and further, the student is a person. The reverse is called specialization where a
person is a student, and that student is Mira.
Generalization
As mentioned above, the process of generalizing entities, where the generalized entity contains the properties
common to all the entities it generalizes, is called generalization. In generalization, a number of entities are brought
together into one generalized entity based on their similar characteristics. For example, pigeon, house sparrow,
crow and dove can all be generalized as Birds.
Specialization
Specialization is the opposite of generalization. In specialization, a group of entities is divided into sub-groups
based on their characteristics. Take a group ‘Person’ for example. A person has name, date of birth, gender, etc.
These properties are common in all persons, human beings. But in a company, persons can be identified as
employee, employer, customer, or vendor, based on what role they play in the company.
Similarly, in a school database, persons can be specialized as teacher, student, or a staff, based on what role
they play in school as entities.
Inheritance
We use all the above features of ER-Model in order to create classes of objects in object-oriented
programming. The details of entities are generally hidden from the user; this process is known as abstraction.
Inheritance is an important feature of Generalization and Specialization. It allows lower-level entities to inherit
the attributes of higher-level entities.
For example, the attributes of a Person class such as name, age, and gender can be inherited by lower-level
entities such as Student or Teacher.
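Since the notes relate this feature to classes in object-oriented programming, the Person example can be sketched as a small class hierarchy. This is only an illustration in Python; the extra attributes roll_number and subject and the sample values are assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Higher-level (generalized) entity."""
    name: str
    age: int
    gender: str


@dataclass
class Student(Person):
    """Lower-level entity: inherits name, age and gender, adds its own attribute."""
    roll_number: int


@dataclass
class Teacher(Person):
    """Another specialization of Person."""
    subject: str


mira = Student(name="Mira", age=15, gender="F", roll_number=7)
student_attributes = [f.name for f in fields(Student)]
```

Student never declares name, age or gender itself, yet student_attributes lists them ahead of roll_number, which is exactly the inheritance of higher-level attributes described above.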
UNIT-4
Relational Model
Relational Model Concepts:
The relational model represents the database as a collection of relations. A relation is nothing but a table of values. Every row in the table represents a collection of related data values. These rows in the table denote a real-world entity or relationship.
The table name and column names are helpful to interpret the meaning of values in each row. The data are represented as a set of relations. In the relational model, data are stored as tables. However, the physical storage of the data is independent of the way the data are logically organized.
Some popular Relational Database management systems are:
DB2 and Informix Dynamic Server - IBM
Oracle and RDB - Oracle
SQL Server and Access - Microsoft
1. Attribute: Each column in a table. Attributes are the properties which define a relation, e.g., Student_Rollno, NAME, etc.
2. Tables: In the relational model, relations are saved in table format. A table is stored along with its entities. A table has two properties, rows and columns. Rows represent records and columns represent attributes.
3. Tuple: A single row of a table, which contains a single record.
4. Relation schema: The name of the relation together with its attributes.
5. Degree: The total number of attributes in the relation.
6. Cardinality: The total number of rows present in the table.
7. Column: The set of values for a specific attribute.
8. Relation instance: A finite set of tuples in the RDBMS. Relation instances never have duplicate tuples.
9. Relation key: One or more attributes whose values uniquely identify each row.
10. Attribute domain: The pre-defined set of values and scope that an attribute can take.
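Several of these terms can be read directly off a query result. The sketch below uses Python's standard sqlite3 module; the student table and its two rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_rollno INTEGER PRIMARY KEY, name TEXT, class TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Mira", "X"), (2, "Ravi", "X")],
)

cursor = conn.execute("SELECT * FROM student ORDER BY student_rollno")
attributes = [column[0] for column in cursor.description]  # attribute names
tuples = cursor.fetchall()                                  # relation instance

degree = len(attributes)   # number of attributes
cardinality = len(tuples)  # number of rows (tuples)
```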
Relational Integrity constraints
Relational integrity constraints are conditions which must hold for a relation to be valid. These integrity constraints are derived from the rules in the mini-world that the database represents.
There are many types of integrity constraints. Constraints in a relational database management system are mostly divided into three main categories:
1. Domain constraints
2. Key constraints
3. Referential integrity constraints
Domain Constraints
Domain constraints are violated if an attribute value does not appear in the corresponding domain or is not of the appropriate data type.
Domain constraints specify that, within each tuple, the value of each attribute must be an atomic value from the domain of that attribute. The domain is specified as a data type, which includes standard data types such as integers, real numbers, characters, Booleans, variable-length strings, etc.
Example:
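CREATE DOMAIN CustomerName AS VARCHAR(100) CHECK (VALUE IS NOT NULL);
The example creates a domain whose constraint ensures that CustomerName is not NULL. CREATE DOMAIN is part of standard SQL but is not supported by every DBMS, and the VARCHAR(100) base type is an assumed choice for illustration.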
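A domain restriction can also be attached directly to a column. SQLite, used here through Python's sqlite3 module, has no CREATE DOMAIN statement, so this sketch substitutes column-level NOT NULL and CHECK constraints for it. The table layout and the allowed status values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        status        TEXT CHECK (status IN ('Active', 'Inactive'))
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Google', 'Active')")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted the row."""
    try:
        conn.execute("INSERT INTO customer VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


null_name = try_insert((2, None, "Active"))       # violates NOT NULL
bad_status = try_insert((3, "Apple", "Unknown"))  # value outside the domain
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Both offending rows are refused by the database itself, so the application does not have to re-check the domain on every insert.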
Key constraints
An attribute that can uniquely identify a tuple in a relation is called the key of the table. The value of this attribute must be different for every tuple in the relation.
Example:
In the given table, Customer ID is a key attribute of the Customer table. Each customer has exactly one key value: Customer ID = 1 belongs only to Customer Name = "Google".
Customer ID Customer Name Status
1 Google Active
2 Amazon Active
3 Apple Inactive
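The key constraint on this table can be tried out with a short sketch in Python's sqlite3 module. The three rows are the ones shown above; the duplicate row with the name 'Other' is a made-up value used only to trigger the violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, customer_name TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Google", "Active"), (2, "Amazon", "Active"), (3, "Apple", "Inactive")],
)

# Customer ID 1 already exists, so a second tuple with the same key is refused.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Other', 'Active')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```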
Referential integrity constraints
Referential integrity constraints are based on the concept of foreign keys. A foreign key is an attribute of a relation that refers to a key attribute of a different relation or of the same relation. The referential integrity constraint requires that the referenced key value must exist in the referenced table.
Example:
In the above example, we have 2 relations, Customer and Billing.
Tuple for CustomerID =1 is referenced twice in the relation Billing. So we know CustomerName=Google has billing amount $300
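A Customer/Billing pair of this kind can be sketched with Python's sqlite3 module. The column names, invoice numbers and amounts below are assumptions for illustration and are not the values of the original example. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT)"
)
conn.execute(
    """
    CREATE TABLE billing (
        invoice_no  INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (customer_id),
        amount      INTEGER
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Google')")
# Customer 1 is referenced twice from billing.
conn.executemany("INSERT INTO billing VALUES (?, ?, ?)", [(1, 1, 100), (2, 1, 200)])


def attempt(sql):
    """Run a statement and report whether referential integrity allowed it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Billing row for a customer that does not exist.
orphan_insert = attempt("INSERT INTO billing VALUES (3, 99, 50)")
# Deleting a customer that billing rows still reference.
parent_delete = attempt("DELETE FROM customer WHERE customer_id = 1")
billing_rows = conn.execute("SELECT COUNT(*) FROM billing").fetchone()[0]
```

Both statements are rejected: the first would create a reference to a missing key, the second would leave existing references dangling.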
Operations in Relational Model
Four basic operations performed on the relational database model are
Insert, update, delete and select.
Insert is used to insert data into the relation.
Delete is used to delete tuples from the table.
Update (modify) allows you to change the values of some attributes in existing tuples.
Select allows you to choose a specific range of data.
Whenever one of these operations is applied, integrity constraints specified on the relational database schema must never be violated.
Insert Operation
The insert operation gives values of the attribute for a new tuple which should be inserted into a relation.
Update Operation
You can see that in the below-given relation table CustomerName= 'Apple' is updated from Inactive to Active.
Delete Operation
To specify deletion, a condition on the attributes of the relation selects the tuple to be deleted.
In the above-given example, CustomerName= "Apple" is deleted from the table.
The Delete operation could violate referential integrity if the tuple which is deleted is referenced by foreign keys from other tuples in the same database.
Select Operation
In the above-given example, CustomerName="Amazon" is selected
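The four operations can be run end to end on the Customer table from the key-constraint example. This is a minimal sketch with Python's sqlite3 module; the lower-case column names are an assumed spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, customer_name TEXT, status TEXT)"
)

# Insert: add new tuples to the relation.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Google", "Active"), (2, "Amazon", "Active"), (3, "Apple", "Inactive")],
)

# Update: change the status of Apple from Inactive to Active.
conn.execute("UPDATE customer SET status = 'Active' WHERE customer_name = 'Apple'")
apple_status = conn.execute(
    "SELECT status FROM customer WHERE customer_name = 'Apple'"
).fetchone()[0]

# Select: pick only the tuple for Amazon.
amazon = conn.execute(
    "SELECT * FROM customer WHERE customer_name = 'Amazon'"
).fetchall()

# Delete: a condition on the attributes selects the tuple to remove.
conn.execute("DELETE FROM customer WHERE customer_name = 'Apple'")
remaining = [
    row[0]
    for row in conn.execute("SELECT customer_name FROM customer ORDER BY customer_id")
]
```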
Comparison between E-R Model and Relational Model
E-R Model and Relational Model both are the types of Data Model. Data Model describes a way to
design database at physical, logical and view level. The main difference between E-R Model and
Relational Model is that E-R Model is entity specific, and Relational Model is table specific. Let us
discuss some differences between E-R Model and Relation model with the help of comparison chart
shown below.
Comparison Chart
BASIS FOR COMPARISON | E-R MODEL | RELATIONAL MODEL
Basic | It represents the collection of objects called entities and relation between those entities. | It represents the collection of Tables and the relation between those tables.
Describe | Entity Relationship Model describes data as Entity set, Relationship set and Attribute. | Relational Model describes data in a table as Domain, Attributes, Tuples.
Relationship | E-R Model is easier to understand the relationship between entities. | Comparatively, it is less easy to derive a relation between tables in Relational Model.
Mapping | E-R Model describes Mapping Cardinalities. | Relational Model does not describe mapping cardinalities.
Key Differences between E-R Model and Relational Model
1. The basic difference between E-R Model and Relational Model is that E-R model specifically deals
with entities and their relations. On the other hand, the Relational Model deals with Tables and
relation between the data of those tables.
2. An E-R Model describes the data with entity set, relationship set and attributes. However, the
Relational model describes the data with the tuples, attributes and domain of the attribute.
3. One can easily understand the relationship among the data in E-R Model as compared to
Relational Model.
4. E-R Model has Mapping Cardinality as a constraint whereas Relational Model does not have such
constraint.
UNIT-5
Normalization
What is Normalization?
Normalization is a database design technique which organizes tables in a manner that reduces redundancy and dependency of data.
It divides larger tables to smaller tables and links them using relationships.
The inventor of the relational model Edgar Codd proposed the theory of normalization with the introduction of First Normal Form, and he continued to extend theory with Second and Third Normal Form. Later he joined with Raymond F. Boyce to develop the theory of Boyce-Codd Normal Form.
The theory of data normalization in SQL is still being developed further. For example, there are discussions even on 6th Normal Form. However, in most practical applications, normalization achieves its best in 3rd Normal Form. The evolution of normalization theories is illustrated below-
Database Normalization Examples -
Assume a video library maintains a database of movies rented out. Without any normalization, all information is stored in one table as shown below.
Table 1
Here you see Movies Rented column has multiple values.
Database Normal Forms
Now let's move into 1st Normal Forms
1NF (First Normal Form) Rules
Each table cell should contain a single value.
Each record needs to be unique.
The above table in 1NF-
1NF Example
Table 1: In 1NF Form
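The step from an unnormalized table to 1NF can be sketched in a few lines of Python. The rows below are hypothetical stand-ins for the video library table (only the member name Robert Phil appears in these notes; addresses and movie titles are made up), with several movies packed into one cell.

```python
# Hypothetical unnormalized rows: the last field packs several movies together.
unnormalized = [
    ("Janet Jones", "First Street Plot No 4", "Movie A, Movie B"),
    ("Robert Phil", "3rd Street 34", "Movie C, Movie D"),
    ("Robert Phil", "5th Avenue", "Movie E"),
]


def to_first_normal_form(rows):
    """Emit one row per movie so that every cell holds a single value."""
    result = []
    for full_name, address, movies in rows:
        for movie in movies.split(","):
            row = (full_name, address, movie.strip())
            if row not in result:  # 1NF also requires each record to be unique
                result.append(row)
    return result


rows_1nf = to_first_normal_form(unnormalized)
```

Each resulting row now holds exactly one movie, at the cost of repeating the name and address; the later normal forms deal with that repetition.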
Before we proceed let's understand a few things --
What is a KEY?
A KEY is a value used to identify a record in a table uniquely. A KEY could be a single column or combination of multiple columns
Note: Columns in a table that are NOT used to identify a record uniquely are called non-key columns.
What is a Primary Key?
A primary key is a single column value used to identify a database record uniquely.
It has the following attributes:
A primary key cannot be NULL
A primary key value must be unique
The primary key values should rarely be changed
The primary key must be given a value when a new record is inserted.
What is Composite Key?
A composite key is a primary key composed of multiple columns used to identify a record uniquely
In our database, we have two people with the same name Robert Phil, but they live in different places.
Hence, we require both Full Name and Address to identify a record uniquely. That is a composite key.
Let's move into second normal form 2NF
2NF (Second Normal Form) Rules
Rule 1- Be in 1NF
Rule 2- Single Column Primary Key (more generally, no non-key column may depend on only part of a composite primary key)
It is clear that we can't move forward to make our simple database in 2nd Normalization form unless we partition the table above.
Table 1
Table 2
We have divided our 1NF table into two tables viz. Table 1 and Table2. Table 1 contains member information. Table 2 contains information on movies rented.
We have introduced a new column called Membership_id which is the primary key for table 1. Records can be uniquely identified in Table 1 using membership id
Database - Foreign Key
In Table 2, Membership_ID is the Foreign Key
Foreign Key references the primary key of another Table!
It helps connect your Tables
A foreign key can have a different name from its primary key
It ensures rows in one table have corresponding rows in another
Unlike the Primary key, they do not have to be unique. Most often they aren't
Foreign keys can be null even though primary keys can not
Why do you need a foreign key?
Suppose someone inserts a record in Table 2 with a membership ID that does not exist in Table 1.
You will only be able to insert values into your foreign key that exist in the unique key in the parent table. This helps in referential integrity.
The above problem can be overcome by declaring membership id from Table2 as foreign key of membership id from Table1
Now, if somebody tries to insert a value in the membership id field that does not exist in the parent table, an error will be shown!
What are transitive functional dependencies?
A transitive functional dependency is when changing a non-key column might cause any of the other non-key columns to change.
Consider Table 1. Changing the non-key column Full Name may change Salutation.
Let's move into 3NF
3NF (Third Normal Form) Rules
Rule 1- Be in 2NF
Rule 2- Has no transitive functional dependencies
To move our 2NF table into 3NF, we again need to divide our table.
3NF Example
TABLE 1
Table 2
Table 3
We have again divided our tables and created a new table which stores Salutations.
There are no transitive functional dependencies, and hence our table is in 3NF
In Table 3, Salutation ID is the primary key, and in Table 1, Salutation ID is a foreign key referencing the primary key of Table 3.
Now our little example is at a level that cannot further be decomposed to attain higher forms of normalization. In fact, it is already in higher normalization forms. Separate efforts for moving into next levels of normalizing data are normally needed in complex databases. However, we will be discussing next levels of normalizations in brief in the following.
Boyce-Codd Normal Form (BCNF)
Even when a database is in 3rd Normal Form, anomalies can still result if it has more than one candidate key.
Sometimes BCNF is also referred to as 3.5 Normal Form.
4NF (Fourth Normal Form) Rules
If no database table instance contains two or more independent and multivalued data describing the relevant entity, then it is in 4th Normal Form.
5NF (Fifth Normal Form) Rules
A table is in 5th Normal Form only if it is in 4NF and it cannot be decomposed into any number of
smaller tables without loss of data.
6NF (Sixth Normal Form) Proposed
6th Normal Form is not yet standardized; however, it has been discussed by database experts for
some time. Hopefully, we will have a clear and standardized definition for 6th Normal Form in the
near future.
That's all to Normalization!!!
Summary
Database designing is critical to the successful implementation of a database management system that meets the data requirements of an enterprise system.
Normalization helps produce database systems that are cost-effective and have better security models.
Functional dependencies are a very important component of the data normalization process.
Most database systems are normalized up to the third normal form.
A primary key uniquely identifies a record in a table and cannot be null.
A foreign key helps connect tables and references a primary key.
TRIVIAL AND NON TRIVIAL DEPENDENCIES
Trivial − If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold.
Non-trivial − If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial
FD.
Functional Dependency
Functional dependency (FD) is a constraint between two sets of attributes in a relation.
Functional dependency says that if two tuples have the same values for attributes A1, A2,..., An,
then those two tuples must also have the same values for attributes B1, B2, ..., Bn.
Functional dependency is represented by an arrow sign (→) that is, X→Y, where X functionally determines Y. The left-hand side attributes determine the values of attributes on
the right-hand side.
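The definition translates directly into a check: group the tuples by their X values and make sure each group has only one Y value. The Python sketch below runs it on the EmpInfo rows used in the decomposition example of these notes; the function names are assumptions. As with keys, a check on sample rows can refute a dependency but cannot prove that it holds for all valid data.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples with the same X values must carry the same Y values.
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_trivial(lhs, rhs):
    """X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)


emp_info = [
    {"Emp_ID": "E001", "Emp_Name": "Jacob", "Emp_Location": "Alabama"},
    {"Emp_ID": "E002", "Emp_Name": "Henry", "Emp_Location": "Alabama"},
    {"Emp_ID": "E003", "Emp_Name": "Tom", "Emp_Location": "Texas"},
]

id_determines_name = fd_holds(emp_info, ["Emp_ID"], ["Emp_Name"])
location_determines_name = fd_holds(emp_info, ["Emp_Location"], ["Emp_Name"])
```

Emp_Location does not determine Emp_Name, because two employees share Alabama but have different names.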
DECOMPOSITION
Decomposition in DBMS removes redundancy, anomalies and inconsistencies from a database by
dividing the table into multiple tables.
The following are the types:
Lossless Decomposition
Decomposition is lossless if it is feasible to reconstruct relation R from the decomposed tables using
joins. This is the preferred choice. No information is lost from the relation when it is decomposed.
The join would result in the same original relation.
Let us see an example:
<EmpInfo>
Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
Decompose the above table into two tables:
<EmpDetails>
Emp_ID Emp_Name Emp_Age Emp_Location
E001 Jacob 29 Alabama
E002 Henry 32 Alabama
E003 Tom 22 Texas
<DeptDetails>
Dept_ID Emp_ID Dept_Name
Dpt1 E001 Operations
Dpt2 E002 HR
Dpt3 E003 Finance
Now, Natural Join is applied on the above two tables:
The result will be:
Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
Therefore, the above relation had lossless decomposition i.e. no loss of information.
Lossy Decomposition
As the name suggests, when a relation is decomposed into two or more relational schemas in a lossy way,
information is lost and the original relation cannot be recovered by joining them.
Let us see an example:
<EmpInfo>
Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
Decompose the above table into two tables:
<EmpDetails>
Emp_ID Emp_Name Emp_Age Emp_Location
E001 Jacob 29 Alabama
E002 Henry 32 Alabama
E003 Tom 22 Texas
<DeptDetails>
Dept_ID Dept_Name
Dpt1 Operations
Dpt2 HR
Dpt3 Finance
Now, you won’t be able to join the above tables, since Emp_ID isn’t part of the DeptDetails relation.
Therefore, the above relation has lossy decomposition.
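Both decompositions can be replayed in a short Python sketch. The data is the EmpInfo table from the examples; project and natural_join are small helper functions written for this illustration, not library calls. When two tables share no attribute, a natural join degenerates to a Cartesian product, which is how the lossy case shows up here.

```python
emp_info = [
    {"Emp_ID": "E001", "Emp_Name": "Jacob", "Emp_Age": 29,
     "Emp_Location": "Alabama", "Dept_ID": "Dpt1", "Dept_Name": "Operations"},
    {"Emp_ID": "E002", "Emp_Name": "Henry", "Emp_Age": 32,
     "Emp_Location": "Alabama", "Dept_ID": "Dpt2", "Dept_Name": "HR"},
    {"Emp_ID": "E003", "Emp_Name": "Tom", "Emp_Age": 22,
     "Emp_Location": "Texas", "Dept_ID": "Dpt3", "Dept_Name": "Finance"},
]


def project(rows, attrs):
    """Projection onto attrs with duplicate rows removed."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Join two tables on every attribute name they have in common."""
    common = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


emp_details = project(emp_info, ["Emp_ID", "Emp_Name", "Emp_Age", "Emp_Location"])

# Lossless: DeptDetails keeps Emp_ID, so the join has an attribute to match on.
dept_with_emp = project(emp_info, ["Dept_ID", "Emp_ID", "Dept_Name"])
lossless_join = natural_join(emp_details, dept_with_emp)

# Lossy: DeptDetails without Emp_ID shares no attribute with EmpDetails.
dept_only = project(emp_info, ["Dept_ID", "Dept_Name"])
lossy_join = natural_join(emp_details, dept_only)
```

The first join returns the original three rows. The second pairs every employee with every department, so it is no longer possible to tell which employee belongs to which department.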
Database Access and Security
Unit-6
Syllabus: Creating and using indexes, creating and using views. Database security, process controls,
database protection, grant and revoke
Creating and using indexes
SQL CREATE INDEX Statement
The CREATE INDEX statement is used to create indexes in tables.
Indexes are used to retrieve data from the database very fast. The users cannot see the indexes, they are just used to speed up searches/queries.
Note: Updating a table with indexes takes more time than updating a table without (because the indexes also need an update). So, only create indexes on columns that will be frequently searched against.
CREATE INDEX Syntax
Creates an index on a table. Duplicate values are allowed:
CREATE INDEX index_name ON table_name (column1, column2, ...);
CREATE UNIQUE INDEX Syntax
Creates a unique index on a table. Duplicate values are not allowed:
CREATE UNIQUE INDEX index_name ON table_name (column1, column2, ...);
Note: The syntax for creating indexes varies among different databases. Therefore: Check the syntax for creating indexes in your database.
CREATE INDEX Example
The SQL statement below creates an index named "idx_lastname" on the "LastName" column in the "Persons" table:
CREATE INDEX idx_lastname ON Persons (LastName);
If you want to create an index on a combination of columns, you can list the column names within the parentheses, separated by commas:
CREATE INDEX idx_pname ON Persons (LastName, FirstName);
DROP INDEX Statement
The DROP INDEX statement is used to delete an index in a table.
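The index statements can be tried out with Python's sqlite3 module. The sketch below uses SQLite's dialect, where an index is dropped with DROP INDEX index_name; as noted above, other databases use different syntax. The PersonID column, the index name idx_person_id and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (PersonID INTEGER, LastName TEXT, FirstName TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?)",
    [(1, "Hansen", "Ola"), (2, "Svendson", "Tove")],
)

conn.execute("CREATE INDEX idx_lastname ON Persons (LastName)")
conn.execute("CREATE UNIQUE INDEX idx_person_id ON Persons (PersonID)")

# Ask SQLite how it would run the query; the last column describes the plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Persons WHERE LastName = 'Hansen'"
).fetchall()
uses_index = any("idx_lastname" in row[-1] for row in plan)

# A unique index refuses a second row with the same PersonID.
try:
    conn.execute("INSERT INTO Persons VALUES (1, 'Pettersen', 'Kari')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

# Remove the index again and list the indexes that are left.
conn.execute("DROP INDEX idx_lastname")
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
    )
]
```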
creating and using views
Creating Views
Database views are created using the CREATE VIEW statement. Views can be created from a
single table, multiple tables or another view.
To create a view, a user must have the appropriate system privilege according to the specific
implementation.
The basic CREATE VIEW syntax is as follows −
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE [condition];
You can include multiple tables in your SELECT statement in a similar way as you use them in
a normal SQL SELECT query.
Example
Consider the CUSTOMERS table having the following records −
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Following is an example to create a view from the CUSTOMERS table. This view would be
used to have customer name and age from the CUSTOMERS table.
SQL > CREATE VIEW CUSTOMERS_VIEW AS
SELECT name, age
FROM CUSTOMERS;
Now, you can query CUSTOMERS_VIEW in a similar way as you query an actual table.
Following is an example for the same.
SQL > SELECT * FROM CUSTOMERS_VIEW;
This would produce the following result.
+----------+-----+
| name | age |
+----------+-----+
| Ramesh | 32 |
| Khilan | 25 |
| kaushik | 23 |
| Chaitali | 25 |
| Hardik | 27 |
| Komal | 22 |
| Muffy | 24 |
+----------+-----+
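A view stores a query rather than a copy of the data, so it always reflects the current contents of the base table. The sketch below illustrates this, assuming SQLite through Python's sqlite3 module as a stand-in database; the helper names make_customers_db and age_in_view are hypothetical. The column types follow the DESC CUSTOMERS output shown later in these notes.

```python
import sqlite3

# Rows copied from the CUSTOMERS table used in these notes.
CUSTOMER_ROWS = [
    (1, "Ramesh", 32, "Ahmedabad", 2000.00),
    (2, "Khilan", 25, "Delhi", 1500.00),
    (3, "kaushik", 23, "Kota", 2000.00),
    (4, "Chaitali", 25, "Mumbai", 6500.00),
    (5, "Hardik", 27, "Bhopal", 8500.00),
    (6, "Komal", 22, "MP", 4500.00),
    (7, "Muffy", 24, "Indore", 10000.00),
]

def make_customers_db():
    """Build the base table and the two-column view on top of it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CUSTOMERS ("
        " ID INT PRIMARY KEY, NAME VARCHAR(20) NOT NULL, AGE INT NOT NULL,"
        " ADDRESS CHAR(25), SALARY DECIMAL(18,2))"
    )
    conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)", CUSTOMER_ROWS)
    conn.execute("CREATE VIEW CUSTOMERS_VIEW AS SELECT name, age FROM CUSTOMERS")
    return conn

def age_in_view(conn, name):
    """Read one customer's age through the view, not the base table."""
    row = conn.execute(
        "SELECT age FROM CUSTOMERS_VIEW WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

conn = make_customers_db()
print(age_in_view(conn, "Ramesh"))
# Change the base table; the view shows the new value on the next query.
conn.execute("UPDATE CUSTOMERS SET AGE = 35 WHERE NAME = 'Ramesh'")
print(age_in_view(conn, "Ramesh"))
```

The update is issued against the base table because SQLite views are read-only; databases that support updatable views would also accept the UPDATE on CUSTOMERS_VIEW itself.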
The WITH CHECK OPTION
The WITH CHECK OPTION is a CREATE VIEW statement option. The purpose of the WITH
CHECK OPTION is to ensure that all UPDATE and INSERT statements satisfy the condition(s)
in the view definition.
If they do not satisfy the condition(s), the UPDATE or INSERT returns an error.
The following code block has an example of creating the same view CUSTOMERS_VIEW with the
WITH CHECK OPTION.
CREATE VIEW CUSTOMERS_VIEW AS
SELECT name, age
FROM CUSTOMERS
WHERE age IS NOT NULL
WITH CHECK OPTION;
The WITH CHECK OPTION in this case should deny the entry of any NULL values in the
view's AGE column, because the view is defined by data that does not have a NULL value in
the AGE column.
Updating a View
A view can be updated under certain conditions which are given below −
The SELECT clause may not contain the keyword DISTINCT.
The SELECT clause may not contain summary functions.
The SELECT clause may not contain set functions.
The SELECT clause may not contain set operators.
The SELECT clause may not contain an ORDER BY clause.
The FROM clause may not contain multiple tables.
The WHERE clause may not contain subqueries.
The query may not contain GROUP BY or HAVING.
Calculated columns may not be updated.
All NOT NULL columns from the base table must be included in the view in order for the
INSERT query to function.
So, if a view satisfies all the above-mentioned rules then you can update that view. The
following code block has an example to update the age of Ramesh.
SQL > UPDATE CUSTOMERS_VIEW
SET AGE = 35
WHERE name = 'Ramesh';
This would ultimately update the base table CUSTOMERS and the same would reflect in the
view itself. Now, try to query the base table and the SELECT statement would produce the
following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 35 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Inserting Rows into a View
Rows of data can be inserted into a view. The same rules that apply to the UPDATE command
also apply to the INSERT command.
Here, we cannot insert rows into CUSTOMERS_VIEW because the view does not include all the
NOT NULL columns of the base table. Otherwise, you can insert rows into a view in the same way as
you insert them into a table.
Deleting Rows from a View
Rows of data can be deleted from a view. The same rules that apply to the UPDATE and
INSERT commands apply to the DELETE command.
Following is an example to delete a record having AGE = 22.
SQL > DELETE FROM CUSTOMERS_VIEW
WHERE age = 22;
This would ultimately delete a row from the base table CUSTOMERS and the same would
reflect in the view itself. Now, try to query the base table and the SELECT statement would
produce the following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 35 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Dropping Views
Obviously, where you have a view, you need a way to drop the view if it is no longer needed.
The syntax is very simple and is given below −
DROP VIEW view_name;
Following is an example to drop the CUSTOMERS_VIEW view.
DROP VIEW CUSTOMERS_VIEW;
Database security
Database security refers to the collective measures used to protect and secure a database or database management software from illegitimate use and malicious threats and attacks.
It is a broad term that includes a multitude of processes, tools and methodologies that ensure security within a database environment.
Database security covers and enforces security on all aspects and components of databases. This includes:
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator and/or other information security professionals.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural disasters
Reviewing existing system for any known or unknown vulnerabilities and defining and implementing a road map/plan to mitigate them.
Process controls
Access control is a security technique that regulates who or what can view or use
resources in a computing environment. It is a fundamental concept in security that
minimizes risk to the business or organization.
There are two types of access control: physical and logical. Physical access control
limits access to campuses, buildings, rooms and physical IT assets. Logical access
control limits connections to computer networks, system files and data.
To secure a facility, organizations use electronic access control systems that rely on
user credentials, access card readers, auditing and reports to track employee access to
restricted business locations and proprietary areas, such as data centers. Some of
these systems incorporate access control panels to restrict entry to rooms and buildings
as well as alarms and lockdown capabilities to prevent unauthorized access or
operations.
Access control systems perform identification, authentication and authorization of users
and entities by evaluating required login credentials that can include passwords,
personal identification numbers (PINs), biometric scans, security tokens or
other authentication factors. Multifactor authentication, which requires two or more
authentication factors, is often an important part of layered defense to protect access
control systems.
These security controls work by identifying an individual or entity, verifying that the
person or application is who or what it claims to be, and authorizing the access level
and set of actions associated with the username or IP address. Directory services and
protocols, including the Lightweight Directory Access Protocol (LDAP) and the Security
Assertion Markup Language (SAML), provide access controls for authenticating and
authorizing users and entities and enabling them to connect to computer resources,
such as distributed applications and web servers.
Organizations use different access control models depending on their compliance
requirements and the security levels of information technology they are trying to
protect.
Types of access control
The main types of access control are:
Mandatory access control (MAC): A security model in which access rights are regulated by
a central authority based on multiple levels of security. It is often used in government and
military environments. Classifications are assigned to system resources, and the operating
system or security kernel grants or denies access to those resource objects based on the
information security clearance of the user or device. For example, Security Enhanced
Linux is an implementation of MAC on the Linux operating system.
Discretionary access control (DAC): An access control method in which owners or
administrators of the protected system, data or resource set the policies defining who or what
is authorized to access the resource. Many of these systems enable administrators to limit the
propagation of access rights. A common criticism of DAC systems is a lack of centralized
control.
Role-based access control (RBAC): A widely used access control mechanism that restricts
access to computer resources based on individuals or groups with defined business functions
(for example, executive level or engineer level 1) rather than the identities of individual users.
The role-based security model relies on a complex structure of role assignments, role authorizations
and role permissions developed using role engineering to regulate employee access to
systems. RBAC systems can be used to enforce MAC and DAC frameworks.
Rule-based access control: A security model in which the system administrator defines the
rules that govern access to resource objects. Often these rules are based on conditions, such
as time of day or location. It is not uncommon to use some form of both rule-based access
control and role-based access control to enforce access policies and procedures.
Attribute-based access control (ABAC): A methodology that manages access rights by
evaluating a set of rules, policies and relationships using the attributes of users, systems and
environmental conditions.
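The idea behind role-based access control can be sketched in a few lines. This is a minimal illustration only, not a production access control system; the users, roles and permission names below are all hypothetical.

```python
# Hypothetical role and permission names, chosen only for illustration.
ROLE_PERMISSIONS = {
    "executive": {"read_reports", "approve_budget"},
    "engineer_level_1": {"read_reports", "deploy_test"},
}

# Hypothetical users; each user is assigned roles, never individual permissions.
USER_ROLES = {
    "alice": {"executive"},
    "bob": {"engineer_level_1"},
}

def is_allowed(user, permission):
    """Return True if any role assigned to the user carries the permission."""
    roles = USER_ROLES.get(user, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)

print(is_allowed("alice", "approve_budget"))
print(is_allowed("bob", "approve_budget"))
```

Because permissions hang off roles rather than users, changing what an engineer may do means editing one role entry instead of every engineer's account.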
Use of access control
The goal of access control is to minimize the risk of unauthorized access to physical and logical
systems. Access control is a fundamental component of security compliance programs that
ensures security technology and access control policies are in place to protect confidential
information, such as customer data. Most organizations have infrastructure and procedures that
limit access to networks, computer systems, applications, files and sensitive data, such as
personally identifiable information and intellectual property.
Access control systems are complex and can be challenging to manage in dynamic IT
environments that involve on-premises systems and cloud services. After some high-profile
breaches, technology vendors have shifted away from single sign-on systems to unified access
management, which offers access controls for on-premises and cloud environments.
Implementing access control
Access control is a process that is integrated into an organization's IT environment. It can
involve identity and access management systems. These systems provide access control
software, a user database, and management tools for access control policies, auditing and
enforcement.
MYSQL/SQL (Structured Query Language)
Unit-7
SQL DDL (Data Definition Language):
Structured Query Language (SQL) is the database language used to perform operations on an existing database and to create a database. SQL uses commands such as CREATE, DROP and INSERT to carry out the required tasks.
These SQL commands are mainly categorized into four categories as discussed below:
DDL (Data Definition Language): DDL consists of the SQL commands that can be used to define the database schema. It deals with descriptions of the database schema and is used to create and modify the structure of database objects in the database. Examples of DDL commands:
CREATE – is used to create the database or its objects (such as tables, indexes, functions, views, stored procedures and triggers).
DROP – is used to delete objects from the database.
ALTER – is used to alter the structure of the database.
TRUNCATE – is used to remove all records from a table, including all space allocated for the records.
COMMENT – is used to add comments to the data dictionary.
RENAME – is used to rename an object existing in the database.
The SQL CREATE TABLE Statement
The CREATE TABLE statement is used to create a new table in a database.
Syntax
CREATE TABLE table_name (
    column1 datatype,
    column2 datatype,
    column3 datatype,
    ....
);
The column parameters specify the names of the columns of the table.
The datatype parameter specifies the type of data the column can hold (e.g. varchar, integer, date, etc.).
SQL CREATE TABLE Example
The following example creates a table called "Persons" that contains five columns: PersonID, LastName, FirstName, Address, and City:
CREATE TABLE Persons (
    PersonID int,
    LastName varchar(255),
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255)
);
The PersonID column is of type int and will hold an integer. The LastName, FirstName, Address, and City columns are of type varchar and will hold characters, and the maximum length for these fields is 255 characters. The empty "Persons" table will now look like this:
PersonID LastName FirstName Address City
Tip: The empty "Persons" table can now be filled with data with the SQL INSERT INTO statement.
Create Table Using Another Table
A copy of an existing table can also be created using CREATE TABLE.
The new table gets the same column definitions. All columns or specific columns can be selected.
If you create a new table using an existing table, the new table will be filled with the existing values from the old table.
Syntax
CREATE TABLE new_table_name AS SELECT column1, column2,... FROM existing_table_name WHERE ....;
The following SQL creates a new table called "TestTable" (which is a copy of two columns of the "customers" table):
Example
CREATE TABLE TestTable AS SELECT customername, contactname FROM customers;
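To check what CREATE TABLE ... AS SELECT actually copies, the statement can be run against a small table. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the city column and the placeholder rows are hypothetical, added only so that the copy has something to leave out.

```python
import sqlite3

def copy_customer_columns():
    """Create customers with placeholder rows, then copy two of its columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (customername TEXT, contactname TEXT, city TEXT)")
    # Placeholder rows, not real data.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [("Customer A", "Contact A", "City A"), ("Customer B", "Contact B", "City B")],
    )
    conn.execute("CREATE TABLE TestTable AS SELECT customername, contactname FROM customers")
    return conn

def column_names(conn, table):
    """Return column names; PRAGMA table_info yields one row per column, name at index 1."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = copy_customer_columns()
print(column_names(conn, "TestTable"))
print(conn.execute("SELECT COUNT(*) FROM TestTable").fetchone()[0])
```

Only the selected columns and their rows are carried over. In SQLite the new table does not inherit constraints or indexes from the old one; other databases may behave differently, so check the documentation for your system.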
Inserting values into a table
The SQL INSERT INTO Statement is used to add new rows of data to a table in the database.
Syntax
There are two basic syntaxes of the INSERT INTO statement which are shown below.
INSERT INTO TABLE_NAME (column1, column2, column3,...columnN) VALUES (value1, value2, value3,...valueN);
Here, column1, column2, column3,...columnN are the names of the columns in the table into
which you want to insert the data.
You do not need to specify the column names in the SQL query if you are adding values for
all the columns of the table. In that case, make sure the values are in the same order as the
columns in the table.
The SQL INSERT INTO syntax will be as follows −
INSERT INTO TABLE_NAME VALUES (value1,value2,value3,...valueN);
Example
The following statements would create six records in the CUSTOMERS table.
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (1, 'Ramesh', 32, 'Ahmedabad', 2000.00 );
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (2, 'Khilan', 25, 'Delhi', 1500.00 );
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (3, 'kaushik', 23, 'Kota', 2000.00 );
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (4, 'Chaitali', 25, 'Mumbai', 6500.00 );
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (5, 'Hardik', 27, 'Bhopal', 8500.00 );
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (6, 'Komal', 22, 'MP', 4500.00 );
You can create a record in the CUSTOMERS table by using the second syntax as shown
below.
INSERT INTO CUSTOMERS VALUES (7, 'Muffy', 24, 'Indore', 10000.00 );
All the above statements would produce the following records in the CUSTOMERS table as
shown below.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
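The two INSERT INTO forms differ in how values are matched to columns. The sketch below shows both, assuming SQLite through Python's sqlite3 module; the function name load_customers is hypothetical, and the ? placeholders are the sqlite3 way of passing values rather than part of the SQL syntax described above.

```python
import sqlite3

def load_customers():
    """Insert three of the CUSTOMERS rows using both INSERT INTO forms."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CUSTOMERS ("
        " ID INT PRIMARY KEY, NAME VARCHAR(20) NOT NULL, AGE INT NOT NULL,"
        " ADDRESS CHAR(25), SALARY DECIMAL(18,2))"
    )
    # First form: explicit column list, values follow the order of that list.
    conn.execute(
        "INSERT INTO CUSTOMERS (ID, NAME, AGE, ADDRESS, SALARY) VALUES (?, ?, ?, ?, ?)",
        (1, "Ramesh", 32, "Ahmedabad", 2000.00),
    )
    # With a column list, the columns may appear in any order.
    conn.execute(
        "INSERT INTO CUSTOMERS (NAME, ID, SALARY, ADDRESS, AGE) VALUES (?, ?, ?, ?, ?)",
        ("Khilan", 2, 1500.00, "Delhi", 25),
    )
    # Second form: no column list, values must follow the table's column order.
    conn.execute(
        "INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)",
        (7, "Muffy", 24, "Indore", 10000.00),
    )
    conn.commit()
    return conn

conn = load_customers()
for row in conn.execute("SELECT ID, NAME, AGE FROM CUSTOMERS ORDER BY ID"):
    print(row)
```

Listing the columns explicitly is usually the safer habit, because the statement keeps working if the table later gains a column or its column order changes.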
updating columns of a Table
The SQL UPDATE Query is used to modify the existing records in a table. You can use the
WHERE clause with the UPDATE query to update the selected rows, otherwise all the rows
would be affected.
Syntax
The basic syntax of the UPDATE query with a WHERE clause is as follows −
UPDATE table_name SET column1 = value1, column2 = value2...., columnN = valueN WHERE [condition];
You can combine any number of conditions using the AND or the OR operators.
Example
Consider the CUSTOMERS table having the following records −
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
The following query will update the ADDRESS for a customer whose ID number is 6 in the
table.
SQL> UPDATE CUSTOMERS
SET ADDRESS = 'Pune'
WHERE ID = 6;
Now, the CUSTOMERS table would have the following records −
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | Pune | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
If you want to modify the ADDRESS and SALARY column values of every row in the CUSTOMERS
table, omit the WHERE clause; the UPDATE query alone is enough, as
shown in the following code block.
SQL> UPDATE CUSTOMERS
SET ADDRESS = 'Pune', SALARY = 1000.00;
Now, CUSTOMERS table would have the following records −
+----+----------+-----+---------+---------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+---------+---------+
| 1 | Ramesh | 32 | Pune | 1000.00 |
| 2 | Khilan | 25 | Pune | 1000.00 |
| 3 | kaushik | 23 | Pune | 1000.00 |
| 4 | Chaitali | 25 | Pune | 1000.00 |
| 5 | Hardik | 27 | Pune | 1000.00 |
| 6 | Komal | 22 | Pune | 1000.00 |
| 7 | Muffy | 24 | Pune | 1000.00 |
+----+----------+-----+---------+---------+
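The difference between an UPDATE with and without a WHERE clause shows up clearly in the number of rows each one touches. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module and a shortened three-row CUSTOMERS table; the helper names are hypothetical.

```python
import sqlite3

def make_db():
    """Build a shortened CUSTOMERS table holding rows 5, 6 and 7 from the notes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CUSTOMERS (ID INT PRIMARY KEY, NAME TEXT, ADDRESS TEXT, SALARY REAL)")
    conn.executemany(
        "INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?)",
        [(5, "Hardik", "Bhopal", 8500.00), (6, "Komal", "MP", 4500.00), (7, "Muffy", "Indore", 10000.00)],
    )
    return conn

def update_address(conn, customer_id, new_address):
    """UPDATE with a WHERE clause; returns how many rows were changed."""
    cur = conn.execute(
        "UPDATE CUSTOMERS SET ADDRESS = ? WHERE ID = ?", (new_address, customer_id)
    )
    return cur.rowcount

def update_all(conn, address, salary):
    """UPDATE without a WHERE clause; every row in the table is changed."""
    cur = conn.execute("UPDATE CUSTOMERS SET ADDRESS = ?, SALARY = ?", (address, salary))
    return cur.rowcount

conn = make_db()
print(update_address(conn, 6, "Pune"))
print(update_all(conn, "Pune", 1000.00))
```

Checking the affected-row count after an UPDATE is a cheap way to notice a missing or overly broad WHERE clause before the change is committed.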
Deleting Rows
The SQL DELETE Query is used to delete the existing records from a table.
You can use the WHERE clause with a DELETE query to delete the selected rows, otherwise
all the records would be deleted.
Syntax
The basic syntax of the DELETE query with the WHERE clause is as follows −
DELETE FROM table_name WHERE [condition];
You can combine any number of conditions using AND or OR operators.
Example
Consider the CUSTOMERS table having the following records −
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
The following query deletes the customer whose ID is 6.
SQL> DELETE FROM CUSTOMERS
WHERE ID = 6;
Now, the CUSTOMERS table would have the following records.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
If you want to DELETE all the records from the CUSTOMERS table, you do not need to use
the WHERE clause and the DELETE query would be as follows −
SQL> DELETE FROM CUSTOMERS;
Now, the CUSTOMERS table would not have any record.
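Both forms of DELETE can be exercised on a small table to see which rows survive. This is a minimal sketch, assuming SQLite through Python's sqlite3 module and a shortened three-row CUSTOMERS table; the helper names are hypothetical.

```python
import sqlite3

def make_db():
    """Build a shortened CUSTOMERS table holding rows 5, 6 and 7 from the notes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CUSTOMERS (ID INT PRIMARY KEY, NAME TEXT, AGE INT)")
    conn.executemany(
        "INSERT INTO CUSTOMERS VALUES (?, ?, ?)",
        [(5, "Hardik", 27), (6, "Komal", 22), (7, "Muffy", 24)],
    )
    return conn

def delete_customer(conn, customer_id):
    """DELETE with a WHERE clause; returns how many rows were removed."""
    cur = conn.execute("DELETE FROM CUSTOMERS WHERE ID = ?", (customer_id,))
    return cur.rowcount

def remaining_ids(conn):
    """List the IDs still present in the table."""
    return [row[0] for row in conn.execute("SELECT ID FROM CUSTOMERS ORDER BY ID")]

conn = make_db()
print(delete_customer(conn, 6))
print(remaining_ids(conn))
# No WHERE clause: every record is removed, but the empty table still exists.
conn.execute("DELETE FROM CUSTOMERS")
print(remaining_ids(conn))
```

After the unconditional DELETE the table definition is still in place and can receive new rows, which is the key difference from dropping the table.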
Dropping a Table
The SQL DROP TABLE statement is used to remove a table definition and all the data,
indexes, triggers, constraints and permission specifications for that table.
NOTE − You should be very careful while using this command because once a table is deleted then all the information available in that table will also be lost forever.
Syntax
The basic syntax of this DROP TABLE statement is as follows −
DROP TABLE table_name;
Example
Let us first verify the CUSTOMERS table and then we will delete it from the database as shown
below −
SQL> DESC CUSTOMERS;
+---------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------------+------+-----+---------+-------+
| ID | int(11) | NO | PRI | | |
| NAME | varchar(20) | NO | | | |
| AGE | int(11) | NO | | | |
| ADDRESS | char(25) | YES | | NULL | |
| SALARY | decimal(18,2) | YES | | NULL | |
+---------+---------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
This means that the CUSTOMERS table is available in the database, so let us now drop it as
shown below.
SQL> DROP TABLE CUSTOMERS;
Query OK, 0 rows affected (0.01 sec)
Now, if you try the DESC command, you will get the following error −
SQL> DESC CUSTOMERS;
ERROR 1146 (42S02): Table 'TEST.CUSTOMERS' doesn't exist
Here, TEST is the database name which we are using for our examples.
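The same check, that a dropped table can no longer be queried, can be scripted. The sketch below assumes SQLite through Python's sqlite3 module, so the error text differs from the MySQL message above; the helper names are hypothetical.

```python
import sqlite3

def table_exists(conn, name):
    """Look the table up in SQLite's catalog (sqlite_master)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None

def drop_and_probe():
    """Create CUSTOMERS, drop it, then try to query it again."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CUSTOMERS (ID INT PRIMARY KEY, NAME VARCHAR(20) NOT NULL)")
    existed_before = table_exists(conn, "CUSTOMERS")
    conn.execute("DROP TABLE CUSTOMERS")
    try:
        conn.execute("SELECT * FROM CUSTOMERS")
        error = None
    except sqlite3.OperationalError as exc:
        # SQLite reports the missing table through OperationalError.
        error = str(exc)
    return existed_before, table_exists(conn, "CUSTOMERS"), error

print(drop_and_probe())
```

Unlike DELETE, which leaves an empty table behind, DROP TABLE removes the definition itself, so any later statement that names the table fails.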
DML (Data Manipulation Language)
The SQL commands that deal with the manipulation of data present in the database belong to DML, or Data Manipulation Language, and this includes most of the SQL statements. Examples of DML:
SELECT – is used to retrieve data from a database.
INSERT – is used to insert data into a table.
UPDATE – is used to update existing data within a table.
DELETE – is used to delete records from a database table.
Database Security and Privileges
DB2 database and functions can be managed by two different modes of security controls:
1. Authentication
2. Authorization
Authentication
Authentication is the process of confirming that users are who they claim to be when they log
in, so that they can perform only the activities they are authorized to perform. User
authentication can be performed at the operating system level or at the database level itself.
Authentication tools, including biometrics such as retina scans and fingerprints, are used to
protect the database from hackers or malicious users.
Database security can be managed from outside the DB2 database system. Here are some
types of security authentication processes:
Based on Operating System authentications.
Lightweight Directory Access Protocol (LDAP)
For DB2, the security service is a part of the operating system or is supplied as a separate product.
For authentication, it requires two credentials: a user ID (or username) and a
password.
Authorization
You can access the DB2 database and its functionality within the DB2 database system, which
is managed by the DB2 database manager. Authorization is a process managed by the DB2
database manager. The manager obtains information about the currently authenticated user
that indicates which database operations the user can perform and which objects the user can access.
Permissions for authorization are available in the following ways:
Primary permission: Granted directly to the authorization ID.
Secondary permission: Granted to the groups and roles of which the user is a member.
Public permission: Granted to all users publicly.
Context-sensitive permission: Granted to the trusted context role.
Authorization can be given to users based on the categories below:
System-level authorization
System administrator [SYSADM]
System Control [SYSCTRL]
System maintenance [SYSMAINT]
System monitor [SYSMON]
These authorities provide control over instance-level functionality. Authorities provide a way
both to group privileges and to control maintenance and utility operations for instances,
databases and database objects.
Database-level authorization
Security Administrator [SECADM]
Database Administrator [DBADM]
Access Control [ACCESSCTRL]
Data access [DATAACCESS]
SQL administrator [SQLADM]
Workload management administrator [WLMADM]
Explain [EXPLAIN]
These authorities provide control within the database. Other database authorities include
LOAD and CONNECT.
Object-Level Authorization: Object-Level authorization involves verifying privileges
when an operation is performed on an object.
Content-based Authorization: A user can have read and write access to individual rows
and columns of a particular table using Label-Based Access Control (LBAC).
DB2 tables and configuration files are used to record the permissions associated with
authorization names. When a user tries to access the data, the recorded permissions are
checked against the following:
Authorization name of the user
Which groups the user belongs to
Which roles are granted directly to the user or indirectly through a group
Permissions acquired through a trusted context.
While working with the SQL statements, the DB2 authorization model considers the
combination of the following permissions:
Permissions granted to the primary authorization ID associated with the SQL
statements.
Secondary authorization IDs associated with the SQL statements.
Granted to PUBLIC
Granted to the trusted context role.
Instance level authorities
Let us discuss some instance related authorities.
System administration authority (SYSADM)
It is the highest level of administrative authority at the instance level. Users with SYSADM authority
can execute some database and database manager commands within the instance. Users
with SYSADM authority can perform the following operations:
Upgrade a Database
Restore a Database
Update Database manager configuration file.
System control authority (SYSCTRL)
It is the highest level of system control authority. It provides the ability to perform maintenance and utility
operations against the database manager instance and its databases. These operations can
affect system resources, but they do not allow direct access to data in the database.
Users with SYSCTRL authority can perform the following actions:
Updating the database, node, or Distributed Connection Services (DCS) directory
Forcing users off the system
Creating or dropping a database
Creating, altering, or dropping a table space
Using any table space
Restoring Database
System maintenance authority (SYSMAINT)
It is the second level of system control authority. It provides the ability to perform maintenance and utility
operations against the database manager instance and its databases. These operations affect
the system resources without allowing direct access to data in the database. This authority is
designed for users to maintain databases within a database manager instance that contains
sensitive data.
Only Users with SYSMAINT or higher level system authorities can perform the following tasks:
Taking backup
Restoring the backup
Roll forward recovery
Starting or stopping instance
Restoring tablespaces
Executing db2trc command
Taking system monitor snapshots of a database manager instance or its
databases.
A user with SYSMAINT can perform the following tasks:
Query the state of a tablespace
Updating log history files
Reorganizing of tables
Using RUNSTATS (collecting catalog statistics)
System monitor authority (SYSMON)
With this authority, the user can monitor or take snapshots of database manager instance or its
database. SYSMON authority enables the user to run the following tasks:
GET DATABASE MANAGER MONITOR SWITCHES
GET MONITOR SWITCHES
GET SNAPSHOT
LIST
o LIST ACTIVE DATABASES
o LIST APPLICATIONS
o LIST DATABASE PARTITION GROUPS
o LIST DCS APPLICATIONS
o LIST PACKAGES
o LIST TABLES
o LIST TABLESPACE CONTAINERS
o LIST TABLESPACES
o LIST UTILITIES
RESET MONITOR
UPDATE MONITOR SWITCHES
Database authorities
Each database authority allows the authorization ID that holds it to perform some action on the database.
These database authorities are different from privileges. Here is the list of some database
authorities:
ACCESSCTRL: Allows the holder to grant and revoke all object privileges and database authorities.
BINDADD: Allows the holder to create a new package in the database.
CONNECT: Allows the holder to connect to the database.
CREATETAB: Allows the holder to create new tables in the database.
CREATE_EXTERNAL_ROUTINE: Allows the holder to create a procedure to be used by applications
and the users of the database.
DATAACCESS: Allows the holder to access data stored in the database tables.
DBADM: Allows the holder to act as a database administrator. It gives all other database authorities except
ACCESSCTRL, DATAACCESS, and SECADM.
EXPLAIN: Allows the holder to explain query plans without holding the privileges to access
the data in the tables.
IMPLICIT_SCHEMA: Allows a user to create a schema implicitly by creating an object using a
CREATE statement.
LOAD: Allows the holder to load data into a table.
QUIESCE_CONNECT: Allows the holder to access the database while it is quiesced (temporarily
disabled).
SECADM: Allows the holder to act as a security administrator for the database.
SQLADM: Allows the holder to monitor and tune SQL statements.
WLMADM: Allows the holder to act as a workload administrator.
Privileges
SETSESSIONUSER
Authorization ID privileges involve actions on authorization IDs. There is only one such privilege,
called the SETSESSIONUSER privilege. It can be granted to a user or a group, and it allows the
session user to switch identities to any of the authorization IDs on which the privilege is
granted. This privilege is granted by a user with SECADM authority.
Schema privileges
These privileges involve actions on schemas in the database. The owner of the schema has all
the permissions to manipulate the schema objects like tables, views, indexes, packages, data
types, functions, triggers, procedures and aliases. A user, a group, a role, or PUBLIC can be
granted any of the following privileges:
CREATEIN: Allows the user to create objects within the schema.
ALTERIN: Allows the user to modify objects within the schema.
DROPIN: Allows the user to delete objects within the schema.
Tablespace privileges
These privileges involve actions on the tablespaces in the database. A user can be granted the
USE privilege for a tablespace, which allows the user to create tables within that
tablespace. The owner of a tablespace is granted the USE privilege WITH GRANT
OPTION on the tablespace when the tablespace is created. Users with SECADM or ACCESSCTRL
authority can grant the USE privilege on the tablespace.
Table and view privileges
The user must have CONNECT authority on the database to be able to use table and view
privileges. The privileges for tables and views are as given below:
CONTROL
It provides all the privileges for a table or a view, including the ability to drop it and to grant and revoke individual
table privileges.
ALTER
It allows user to modify a table.
DELETE
It allows the user to delete rows from the table or view.
INDEX
It allows the user to create an index on a table.
INSERT
It allows the user to insert a row into a table or view. The user can also run the import utility.
REFERENCES
It allows the users to create and drop a foreign key.
SELECT
It allows the user to retrieve rows from a table or view.
UPDATE
It allows the user to change entries in a table or view.
Package privileges
The user must have CONNECT authority on the database. A package is a database object that
contains the information needed by the database manager to access data in the most efficient way for a
particular application.
CONTROL
It provides the user with the privileges of rebinding, dropping or executing packages. A user with
this privilege is also granted the BIND and EXECUTE privileges.
BIND
It allows the user to bind or rebind that package.
EXECUTE
Allows to execute a package.
Index privileges
The creator of an index automatically receives the CONTROL privilege on the index.
Sequence privileges
The creator of a sequence automatically receives the USAGE and ALTER privileges on the sequence.
Routine privileges
These privileges involve actions on routines such as functions, procedures, and methods within a
database.
Grant and Revoke Command
SQL GRANT and REVOKE Commands
DCL commands are used to enforce database security in a multi-user database environment. The two DCL commands are GRANT and REVOKE. Only database administrators or the owners of a database object can provide or remove privileges on that object.
SQL GRANT Command
SQL GRANT is a command used to provide access or privileges on the database objects to the users.
The Syntax for the GRANT command is:
GRANT privilege_name
ON object_name
TO {user_name | PUBLIC | role_name}
[WITH GRANT OPTION];
privilege_name is the access right or privilege granted to the user. Some of the access rights are ALL, EXECUTE, and SELECT.
object_name is the name of a database object such as a TABLE, VIEW, STORED PROC or SEQUENCE.
user_name is the name of the user to whom an access right is being granted.
PUBLIC is used to grant access rights to all users.
ROLES are sets of privileges grouped together.
WITH GRANT OPTION allows a user to grant access rights to other users.
For example: GRANT SELECT ON employee TO user1; This command grants the SELECT permission on the employee table to user1. Use WITH GRANT OPTION carefully: if you grant the SELECT privilege on the employee table to user1 WITH GRANT OPTION, user1 can grant the SELECT privilege on employee to another user, such as user2. If you later REVOKE the SELECT privilege on employee from user1, user2 may still hold the SELECT privilege on the employee table, because whether a revoke cascades to privileges passed on by the grantee depends on the database system.
SQL REVOKE Command:
The REVOKE command removes user access rights or privileges to the database objects.
The Syntax for the REVOKE command is:
REVOKE privilege_name
ON object_name
FROM {user_name | PUBLIC | role_name};
For example: REVOKE SELECT ON employee FROM user1; This command revokes the SELECT privilege on the employee table from user1. When you REVOKE the SELECT privilege on a table from a user, that user can no longer SELECT data from the table. However, if the user received SELECT privileges on that table from more than one user, he/she can still SELECT from the table until everyone who granted the permission revokes it. You cannot REVOKE privileges that you did not originally grant.
Maintaining Database Objects
A database object in a relational database is a data structure used to either store or reference data. The most common object that people interact with is the table. Other objects are indexes, stored procedures, sequences, views and many more.
Creating a database object does not create a new object type: the available object types are fixed by the relational database product being used, such as Oracle, SQL Server or Access. What is created is an instance of an object type, such as a new table, an index on that table or a view on the same table.
Two small but important distinctions in database objects are needed:
An object type is the base concept or idea of an object; for example, the concept of a table or index.
An object instance is an example of an object type. For example, a table called CUSTOMER_MASTER is an instance of the object type TABLE.
Most of the major database engines offer the same set of major database object types:
o Tables
o Indexes
o Sequences
o Views
o Synonyms
Although there are subtle variations in the behavior and the syntax used for the creation of these major database object types, they are almost identical in their concept and what they mean. A table in Oracle behaves almost exactly as a table in SQL Server. This makes work much easier for the database administrator. It is analogous to moving from one car to another made by a different manufacturer; the switches for turning the headlights on may be in different locations, but the overall layout is broadly similar.
When creating an object instance, it is a good idea to follow an easy-to-understand naming convention. This is especially important for database designers whose products are intended to
be used by several people. It is also helpful to make work as simple as possible for in-house database administrators by reducing the number of queries made to the creator later. A simple guideline is to add suffixes. Here are two examples:
Suffix all the master tables using _MASTER:
o CUSTOMER_MASTER
o ACCOUNTS_MASTER
o LOANS_MASTER
Suffix all transactional tables using _TRANS:
o DAILY_TRANS
o LOANS_TRANS
o INTERBANK_TRANS
Commit and Rollback
Transactional control commands are used only with the DML commands INSERT, UPDATE and
DELETE. They cannot be used while creating or dropping tables, because in many database
systems (Oracle and MySQL, for example) these operations are automatically committed in the
database.
The COMMIT Command
The COMMIT command is the transactional command used to save changes invoked by a
transaction to the database. It saves all changes made since the last COMMIT or ROLLBACK
command.
The syntax for the COMMIT command is as follows.
COMMIT;
Example
Consider the CUSTOMERS table having the following records −
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Following is an example which would delete those records from the table which have age = 25
and then COMMIT the changes in the database.
SQL> DELETE FROM CUSTOMERS
WHERE AGE = 25;
SQL> COMMIT;
Thus, two rows from the table would be deleted and the SELECT statement would produce the
following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
The ROLLBACK Command
The ROLLBACK command is the transactional command used to undo transactions that have
not already been saved to the database. This command can only be used to undo transactions
since the last COMMIT or ROLLBACK command was issued.
The syntax for a ROLLBACK command is as follows −
ROLLBACK;
Example
Consider the CUSTOMERS table having the following records −
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Following is an example, which would delete those records from the table which have the age
= 25 and then ROLLBACK the changes in the database.
SQL> DELETE FROM CUSTOMERS
WHERE AGE = 25;
SQL> ROLLBACK;
Thus, the delete operation would not impact the table and the SELECT statement would
produce the following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
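The COMMIT and ROLLBACK behaviour described above can be tried out with a small script. This is only a sketch: it assumes SQLite through Python's standard sqlite3 module rather than the SQL> prompt used in these notes, and the helper name remaining_ids is made up for the illustration.

```python
import sqlite3

# The seven CUSTOMERS rows from the example tables above.
ROWS = [
    (1, "Ramesh", 32, "Ahmedabad", 2000.00),
    (2, "Khilan", 25, "Delhi", 1500.00),
    (3, "kaushik", 23, "Kota", 2000.00),
    (4, "Chaitali", 25, "Mumbai", 6500.00),
    (5, "Hardik", 27, "Bhopal", 8500.00),
    (6, "Komal", 22, "MP", 4500.00),
    (7, "Muffy", 24, "Indore", 10000.00),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (assumption)
conn.execute(
    "CREATE TABLE CUSTOMERS ("
    "ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER, ADDRESS TEXT, SALARY REAL)"
)
conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)", ROWS)
conn.commit()  # make the seven starting rows permanent


def remaining_ids(connection):
    """Hypothetical helper: list the IDs currently in CUSTOMERS."""
    cursor = connection.execute("SELECT ID FROM CUSTOMERS ORDER BY ID")
    return [row[0] for row in cursor]


# DELETE followed by ROLLBACK: the deletion is undone.
conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 25")
conn.rollback()
ids_after_rollback = remaining_ids(conn)

# DELETE followed by COMMIT: the deletion becomes permanent.
conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 25")
conn.commit()
conn.rollback()  # nothing is pending, so the committed delete stays
ids_after_commit = remaining_ids(conn)

print(ids_after_rollback)
print(ids_after_commit)
```

After the rollback all seven IDs are still present, while after the commit the two rows with AGE = 25 (IDs 2 and 4) are gone, even though another ROLLBACK is issued afterwards.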
Various Types of SELECT Commands
The SELECT statement is used to select data from a database.
The data returned is stored in a result table, called the result-set.
SELECT Syntax
SELECT column1, column2, ... FROM table_name;
Here, column1, column2, ... are the field names of the table you want to select data from. If you want to select all the fields available in the table, use the following syntax:
SELECT * FROM table_name;
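As a quick illustration of the two forms, the sketch below runs both against a cut-down CUSTOMERS table. It assumes SQLite via Python's standard sqlite3 module; the three-column table is a simplification made for the example, and ORDER BY ID is added only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (assumption)
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER, NAME TEXT, AGE INTEGER)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?)",
    [(1, "Ramesh", 32), (2, "Khilan", 25), (3, "kaushik", 23)],
)

# SELECT column1, column2, ... returns only the named fields.
some_columns = conn.execute(
    "SELECT NAME, AGE FROM CUSTOMERS ORDER BY ID"
).fetchall()

# SELECT * returns every field of each row.
all_columns = conn.execute("SELECT * FROM CUSTOMERS ORDER BY ID").fetchall()

print(some_columns)
print(all_columns)
```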
Joins
A JOIN clause is used to combine rows from two or more tables, based on a related column between them.
Let's look at a selection from the "Orders" table:
OrderID CustomerID OrderDate
10308 2 1996-09-18
10309 37 1996-09-19
10310 77 1996-09-20
Then, look at a selection from the "Customers" table:
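+------------+------------------------------------+----------------+---------+
| CustomerID | CustomerName                       | ContactName    | Country |
+------------+------------------------------------+----------------+---------+
| 1          | Alfreds Futterkiste                | Maria Anders   | Germany |
| 2          | Ana Trujillo Emparedados y helados | Ana Trujillo   | Mexico  |
| 3          | Antonio Moreno Taquería            | Antonio Moreno | Mexico  |
+------------+------------------------------------+----------------+---------+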
Notice that the "CustomerID" column in the "Orders" table refers to the "CustomerID" in the "Customers" table. The relationship between the two tables above is the "CustomerID" column.
Then, we can create the following SQL statement (that contains an INNER JOIN), that selects records that have matching values in both tables:
Example
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;
and it will produce something like this:
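+---------+------------------------------------+------------+
| OrderID | CustomerName                       | OrderDate  |
+---------+------------------------------------+------------+
| 10308   | Ana Trujillo Emparedados y helados | 9/18/1996  |
| 10365   | Antonio Moreno Taquería            | 11/27/1996 |
| 10383   | Around the Horn                    | 12/16/1996 |
| 10355   | Around the Horn                    | 11/15/1996 |
| 10278   | Berglunds snabbköp                 | 8/12/1996  |
+---------+------------------------------------+------------+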
Different Types of SQL JOINs
Here are the different types of the JOINs in SQL:
(INNER) JOIN: Returns records that have matching values in both tables
LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table
RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table
FULL (OUTER) JOIN: Returns all records when there is a match in either the left or the right table
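The difference between an INNER JOIN and a LEFT JOIN can be seen on just the three "Orders" rows and three "Customers" rows shown in the selections above. The sketch assumes SQLite via Python's standard sqlite3 module; RIGHT and FULL joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (assumption)
conn.executescript("""
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY, CustomerName TEXT,
    ContactName TEXT, Country TEXT);
CREATE TABLE Orders (
    OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT);
INSERT INTO Customers VALUES
    (1, 'Alfreds Futterkiste', 'Maria Anders', 'Germany'),
    (2, 'Ana Trujillo Emparedados y helados', 'Ana Trujillo', 'Mexico'),
    (3, 'Antonio Moreno Taquería', 'Antonio Moreno', 'Mexico');
INSERT INTO Orders VALUES
    (10308, 2, '1996-09-18'),
    (10309, 37, '1996-09-19'),
    (10310, 77, '1996-09-20');
""")

# INNER JOIN keeps only orders whose CustomerID exists in Customers.
inner_rows = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName
    FROM Orders
    INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
    ORDER BY Orders.OrderID
""").fetchall()

# LEFT JOIN keeps every order; unmatched customers come back as NULL (None).
left_rows = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName
    FROM Orders
    LEFT JOIN Customers ON Orders.CustomerID = Customers.CustomerID
    ORDER BY Orders.OrderID
""").fetchall()

print(inner_rows)
print(left_rows)
```

With only these sample rows, the inner join returns the single order 10308, because customers 37 and 77 are not among the three customers listed. The left join returns all three orders, with an empty customer name for the two unmatched ones.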
Subquery
In SQL, a subquery can be simply defined as a query within another query. In other words, a subquery is a query that is embedded in another SQL query, most commonly in its WHERE clause.
Important rules for Subqueries:
o You can place a subquery in a number of SQL clauses: the WHERE clause, the HAVING clause and the FROM clause.
o Subqueries can be used with SELECT, UPDATE, INSERT and DELETE statements together with an expression operator. It can be an equality or comparison operator such as =, >, >=, <, <=, or the LIKE operator.
o The outer query is called the main query and the inner query is called the subquery.
o The subquery generally executes first, and its output is used to complete the query condition for the main or outer query.
o A subquery must be enclosed in parentheses.
o Subqueries are on the right side of the comparison operator.
o ORDER BY is generally not allowed in a subquery (some systems permit it only together with a row-limiting clause). GROUP BY can be used in a subquery.
o Use single-row operators with single-row subqueries. Use multiple-row operators with multiple-row subqueries.
Syntax: There is no general syntax for subqueries. However, subqueries are most frequently used with the SELECT statement, as shown below:
SELECT column_name
FROM table_name
WHERE column_name expression operator (SELECT column_name FROM table_name WHERE ...);
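To make the single-row and multiple-row cases concrete, the sketch below runs two subqueries against the CUSTOMERS table from the COMMIT example. It assumes SQLite via Python's standard sqlite3 module, and the two queries themselves are illustrations that do not appear in the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (assumption)
conn.execute(
    "CREATE TABLE CUSTOMERS ("
    "ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER, ADDRESS TEXT, SALARY REAL)"
)
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ramesh", 32, "Ahmedabad", 2000.00),
        (2, "Khilan", 25, "Delhi", 1500.00),
        (3, "kaushik", 23, "Kota", 2000.00),
        (4, "Chaitali", 25, "Mumbai", 6500.00),
        (5, "Hardik", 27, "Bhopal", 8500.00),
        (6, "Komal", 22, "MP", 4500.00),
        (7, "Muffy", 24, "Indore", 10000.00),
    ],
)

# Single-row subquery: the inner SELECT yields one value (the average
# salary), so a single-row operator such as > can be used.
above_average = [row[0] for row in conn.execute("""
    SELECT NAME FROM CUSTOMERS
    WHERE SALARY > (SELECT AVG(SALARY) FROM CUSTOMERS)
    ORDER BY ID
""")]

# Multiple-row subquery: the inner SELECT yields several ages, so a
# multiple-row operator such as IN is needed.
same_age_as_high_earners = [row[0] for row in conn.execute("""
    SELECT NAME FROM CUSTOMERS
    WHERE AGE IN (SELECT AGE FROM CUSTOMERS WHERE SALARY > 6000)
    ORDER BY ID
""")]

print(above_average)
print(same_age_as_high_earners)
```

The average salary of the seven rows is 5000, so the first query returns Chaitali, Hardik and Muffy. The second also returns Khilan, who shares the age 25 with Chaitali.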
Aggregate functions
The SQL COUNT(), AVG() and SUM() Functions
The COUNT() function returns the number of rows that match a specified criterion.
The AVG() function returns the average value of a numeric column.
The SUM() function returns the total sum of a numeric column.
COUNT() Syntax
SELECT COUNT(column_name) FROM table_name WHERE condition;
AVG() Syntax
SELECT AVG(column_name) FROM table_name WHERE condition;
SUM() Syntax
SELECT SUM(column_name) FROM table_name WHERE condition;
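The three functions can be tried on the AGE and SALARY values of the CUSTOMERS table used earlier. This is a sketch assuming SQLite via Python's standard sqlite3 module; the table is reduced to the columns the queries need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (assumption)
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER, AGE INTEGER, SALARY REAL)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?)",
    [
        (1, 32, 2000.00),
        (2, 25, 1500.00),
        (3, 23, 2000.00),
        (4, 25, 6500.00),
        (5, 27, 8500.00),
        (6, 22, 4500.00),
        (7, 24, 10000.00),
    ],
)

# COUNT(): number of rows that satisfy the WHERE condition.
count_age_25 = conn.execute(
    "SELECT COUNT(ID) FROM CUSTOMERS WHERE AGE = 25"
).fetchone()[0]

# AVG(): average of a numeric column over all rows.
avg_salary = conn.execute("SELECT AVG(SALARY) FROM CUSTOMERS").fetchone()[0]

# SUM(): total of a numeric column over the rows that match.
sum_salary_age_25 = conn.execute(
    "SELECT SUM(SALARY) FROM CUSTOMERS WHERE AGE = 25"
).fetchone()[0]

print(count_age_25, avg_salary, sum_salary_age_25)
```

Two customers are aged 25, the seven salaries total 35000 and so average 5000, and the two salaries for age 25 (1500 and 6500) sum to 8000.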
Challenges of MySQL
Long Development Time
Scaling frameworks that cannot be optimized with master/slave setups require extensive development time. Replication lag further complicates app logic because it disrupts data consistency between the slave and the master. Finally, MySQL server modifications need constant coordination between database teams and apps.
Replication
MySQL servers often run into replication conflicts during a manual failover when multi-master setups are involved.
Database Logging Costs
Database logging is expensive and so it remains disabled most of the time. As a result, organizations lack real-time visibility into slow logs, which delays troubleshooting.
Query Caches
The MySQL server query cache is of little help when handling a high volume of workload, because cache invalidation cannot be controlled.
High Connection Churn
If your apps rely on a LAMP stack, they tend to have a high volume of user sessions running concurrently and, consequently, they experience high connection churn. So most of your valuable server resources are exhausted on connection management.
Some companies consider sharding as a scaling option, but sharding adds significant complexity, cost and its own set of additional challenges. The easiest way to leverage the powerful features of MySQL without making any modifications at the app level or writing a single line of code is to utilize database load balancing software.
Introduction to big data
In order to understand 'Big Data', you first need to know
What is Data?
The quantities, characters, or symbols on which operations are performed by a computer, which may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media.
What is Big Data?
Big Data is also data but with a huge size. Big Data is a term used to describe a collection of data that is huge in size and yet growing exponentially with time. In short such data is so large and complex that none of the traditional data management tools are able to store it or process it efficiently.
In this introduction, you will learn:
o Examples of Big Data
o Types of Big Data
o Characteristics of Big Data
o Advantages of Big Data Processing
Examples Of Big Data
Following are some examples of Big Data:
The New York Stock Exchange generates about one terabyte of new trade data per day.
Social Media
Statistics show that 500+ terabytes of new data are ingested into the databases of the social media site Facebook every day. This data is mainly generated through photo and video uploads, message exchanges, comments, etc.
A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many thousands of flights per day, data generation reaches many petabytes.
Types Of Big Data
'Big Data' can be found in three forms:
1. Structured
2. Unstructured
3. Semi-structured
Structured
Any data that can be stored, accessed and processed in a fixed format is termed 'structured' data. Over time, computer science has become very successful at developing techniques for working with this kind of data (where the format is well known in advance) and for deriving value from it. However, issues now arise when the size of such data grows to a huge extent, with typical sizes in the range of multiple zettabytes.
Examples Of Structured Data
An 'Employee' table in a database is an example of Structured Data
Unstructured
Any data whose form or structure is unknown is classified as unstructured data. In addition to its huge size, unstructured data poses multiple challenges when it is processed to derive value from it. A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos, etc. Nowadays organizations have a wealth of data available to them but, unfortunately, they do not know how to derive value from it because the data is in its raw or unstructured form.
Examples Of Un-structured Data
The output returned by 'Google Search'
Semi-structured
Semi-structured data can contain both forms of data. Semi-structured data appears structured in form, but it is not actually defined by, for example, a table definition in a relational DBMS. An example of semi-structured data is data represented in an XML file.
Examples Of Semi-structured Data
Personal data stored in an XML file
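A small sketch can show what "structured in form but without a table definition" means in practice. The XML snippet, the tag names and the people in it are invented for the illustration, and Python's standard xml.etree.ElementTree module is assumed for parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical personal data: each <person> is tagged, but the two records
# do not share the same set of fields, unlike rows of a relational table.
xml_text = """
<people>
  <person><name>Asha</name><age>35</age></person>
  <person><name>Ravi</name><email>ravi@example.com</email></person>
</people>
"""

root = ET.fromstring(xml_text)

# Turn every <person> element into a dictionary of its child tags.
records = [
    {child.tag: child.text for child in person}
    for person in root.findall("person")
]

for record in records:
    print(record)
```

The tags give each value a label, so the data can be processed automatically, yet nothing forces the second record to carry an age or the first to carry an email address.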
Characteristics Of Big Data
(i) Volume – The name Big Data itself refers to an enormous size. The size of data plays a crucial role in determining the value that can be derived from it. Whether particular data can be considered Big Data at all also depends on its volume. Hence, 'Volume' is one characteristic that needs to be considered when dealing with Big Data.
(ii) Variety – The next aspect of Big Data is its variety.
Variety refers to heterogeneous sources and the nature of data, both structured and unstructured. During earlier days, spreadsheets and databases were the only sources of data considered by most of the applications. Nowadays, data in the form of emails, photos, videos, monitoring devices, PDFs, audio, etc. are also being considered in the analysis applications. This variety of unstructured data poses certain issues for storage, mining and analyzing data.
(iii) Velocity – The term 'velocity' refers to the speed at which data is generated. How fast the data is generated and processed to meet demand determines the real potential of the data.
Big Data velocity deals with the speed at which data flows in from sources like business processes, application logs, networks, social media sites, sensors, mobile devices, etc. The flow of data is massive and continuous.
(iv) Variability – This refers to the inconsistency which can be shown by the data at times, thus hampering the process of being able to handle and manage the data effectively.
Benefits of Big Data Processing
The ability to process Big Data brings multiple benefits, such as:
o Businesses can utilize outside intelligence when making decisions
Access to social data from search engines and sites like Facebook and Twitter is enabling organizations to fine-tune their business strategies.
o Improved customer service
Traditional customer feedback systems are getting replaced by new systems designed with Big Data technologies. In these new systems, Big Data and natural language processing technologies are being used to read and evaluate consumer responses.
o Early identification of risk to the product/services, if any
o Better operational efficiency
Big Data technologies can be used for creating a staging area or landing zone for new data before identifying what data should be moved to the data warehouse. In addition, such integration of Big Data technologies and data warehouse helps an organization to offload infrequently accessed data.
Summary
o Big Data is a term used to describe a collection of data that is huge in size and yet growing exponentially with time.
o Examples of Big Data generation include stock exchanges, social media sites, jet engines, etc.
o Big Data can be 1) structured, 2) unstructured or 3) semi-structured.
o Volume, Variety, Velocity and Variability are some of the characteristics of Big Data.
o Improved customer service, better operational efficiency and better decision making are some of the advantages of Big Data.