adbms project pearl
Team Pearl
Arun Kumar Dash
Divya Rajasri Tadi
Rajeev Reddy Rachamalla
Ramya Kirshna Reddy Vuyyuru
Sophia Benjelloun
Movies, directors, genres
The movie database allows users to search for information about movies, such as directors, casts, fan ratings, fans, genres, awards, theaters, MPAA ratings, shows, prices, and user reviews. In addition, the database provides theater information, including the number of screens at each theater location and the theaters' contact information. Management at local theaters will be able to track a movie at their theater by ticket price, by the time of day the movie is screened, and by type of viewer.
Introduction Fall 2014
Dr. McCart
TABLE OF CONTENTS
Introduction
Requirements
Assumptions
Restrictions
Logical Design
    Logical Database Design
Physical Design
    Physical Database Design
Data Generation and Loading
Performance Tuning
    SQL Tuning
    Querying
    Parallelizing Index
    Function Based Indexing
DBA Scripts
Database Security
    Authentication
        Users Privileges
    Auditing
Data Backup and Recovery
Weightage Table
Requirements:
The movie database should include the name and location of multiple
movie theaters. The data focuses on the type of movie and the type of
theater it is screened at, along with the time of day, cost per ticket, and type
of patron for a specific movie, for both current and historical records.
Add-ons are also stored, since at times the cost of a ticket can be
higher than that of a standard movie ticket. Users will be able to search for
theaters screening a given movie and see the date and time of each showing
per theater, across multiple screens. Additionally, a user can rate a theater
more than once, with the option to write a description.
Assumptions:
• The theatre review is about the theatre and not about a screen. (So, if
a user/reviewer wants to review a particular screen of a specific
theatre, there is no mechanism to support this.)
• Ratings are given on a scale from 1 to 5, where 5 means excellent and 1
means poor.
• Each theater will have many reviews. This is a one-to-many relationship
between theaters and reviews.
• Fans or anonymous users will write the theater ratings. We stored all
anonymous users under a single ID.
• Each theater can have multiple screens, and one movie can be shown
on multiple screens. This is a many-to-many relationship between movies
and theaters (screens).
• We created a dummy user in the “fan” table to incorporate anonymous
users within the database.
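The anonymous-user assumption can be sketched as follows. This is only an illustration against the FANS and USER_REVIEWS tables defined in the physical design: the FAN_ID value 0, the names, and the REVIEW_ID and THEATRE_ID values are hypothetical placeholders, not values taken from the project data.

```sql
-- Hypothetical: reserve FAN_ID 0 for the single shared anonymous user.
INSERT INTO ADB_PEARL.FANS (FAN_ID, FNAME, LNAME)
VALUES (0, 'Anonymous', 'User');

-- An anonymous theatre review is stored against that shared ID.
-- REVIEW_ID 1001 and THEATRE_ID 1 are placeholders; THEATRE_ID 1 is
-- assumed to exist already in THEATRE_INFO.
INSERT INTO ADB_PEARL.USER_REVIEWS
  (REVIEW_ID, THEATRE_ID, FAN_ID, RATING, DESCRIPTION, REVIEW_DATE)
VALUES
  (1001, 1, 0, 4, 'Clean theatre, friendly staff', SYSDATE);
```

Because REVIEW_ID is part of the USER_REVIEWS primary key, the same fan (including the anonymous one) can review the same theatre more than once, which matches the requirement above.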
Restrictions:
• Users cannot review individual screens in a theater.
• There is no provision to distinguish between anonymous users.
Logical Design:
The database is divided into functional areas: Movies, Theaters, Awards, and
Ratings. These areas are discussed further in the database decomposition
section.
[Figure: functional areas of the database: Movies, Theaters, Ratings, Awards]
Logical Database Design:
We built the logical model after the requirements were clearly
identified. The logical model includes the specified entities and the
relationships between them. We detailed the attributes of every entity/table.
The logical data model fully describes the attributes of the data and fulfills
the data requirements from a business point of view. Data attributes define the
data types and lengths, and some also specify nullability.
We created our logical data model based on the Movies database
schema using Oracle Data Modeler.
The steps involved in Oracle Data Modeler for logical database design are:
File -> Data Modeler -> Import -> Data Dictionary.
A screenshot of this design is provided below.
Physical Database Design:
In a sense, logical design is what you draw with a pencil before building
your warehouse and physical design is when you create the database with
SQL statements.
During the physical design process, we converted the data gathered
during the logical design phase into a description of the physical database,
including tables and constraints. Physical design decisions, such as the type of
index or partitioning, have a large impact on query performance.
In the physical database design, objects such as tables and columns are
created based on the entities and attributes that were defined during logical
design of the database system. Other important things like primary keys,
foreign keys, other unique keys and indexes are also defined. All these parts
are integrated to complete the Physical design of the database.
Tables:
AWARDS
CREATE TABLE "ADB_PEARL"."AWARDS"
( "AWARD_ID" NUMBER(10,0),
"AWARD_NAME" VARCHAR2(30 BYTE),
"AWARD_YEAR" NUMBER(5,0),
"AWARD_CATEGORY" VARCHAR2(30 BYTE),
"FILM_ID" NUMBER(10,0),
"AWARDEE_NAME" VARCHAR2(50 BYTE),
PRIMARY KEY ("AWARD_ID")
)
CASTS
CREATE TABLE "ADB_PEARL"."CASTS"
( "FILM_ID" NUMBER(10,0),
"CAST_MEMBER" VARCHAR2(50 BYTE),
"CAST_ROLE" VARCHAR2(100 BYTE),
PRIMARY KEY ("FILM_ID", "CAST_MEMBER", "CAST_ROLE"),
FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
DIRECTORS
CREATE TABLE "ADB_PEARL"."DIRECTORS"
( "FILM_ID" NUMBER(10,0),
"DIRECTOR" VARCHAR2(50 BYTE),
PRIMARY KEY ("FILM_ID", "DIRECTOR"),
FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
FAN_RATINGS
CREATE TABLE "ADB_PEARL"."FAN_RATINGS"
( "FAN_ID" NUMBER(10,0) NOT NULL ENABLE,
"FILM_ID" NUMBER(10,0) NOT NULL ENABLE,
"IMDB_RATING" NUMBER(10,1) NOT NULL ENABLE,
"LAST_RATED" DATE,
"FIRST_RATED" DATE,
PRIMARY KEY ("FILM_ID", "FAN_ID"),
FOREIGN KEY ("FAN_ID")
REFERENCES "ADB_PEARL"."FANS" ("FAN_ID") ENABLE,
FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
FANS
CREATE TABLE "ADB_PEARL"."FANS"
( "FNAME" VARCHAR2(50 BYTE),
"LNAME" VARCHAR2(50 BYTE),
"GENDER" VARCHAR2(10 BYTE),
"STREET" VARCHAR2(100 BYTE),
"CITY" VARCHAR2(50 BYTE),
"STATE" VARCHAR2(20 BYTE),
"ZIP" VARCHAR2(20 BYTE),
"BIRTH_DAY" NUMBER,
"BIRTH_MONTH" NUMBER,
"BIRTH_YEAR" NUMBER,
"FAN_ID" NUMBER(10,0),
PRIMARY KEY ("FAN_ID")
)
GENRES
CREATE TABLE "ADB_PEARL"."GENRES"
( "FILM_ID" NUMBER(10,0),
"GENRE" VARCHAR2(20 BYTE),
PRIMARY KEY ("FILM_ID", "GENRE"),
FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
KEYWORDS
CREATE TABLE "ADB_PEARL"."KEYWORDS"
( "FILM_ID" NUMBER(10,0) NOT NULL ENABLE,
"KEYWORD" VARCHAR2(50 BYTE) NOT NULL ENABLE,
PRIMARY KEY ("FILM_ID", "KEYWORD"),
FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
LOCALES
CREATE TABLE "ADB_PEARL"."LOCALES"
( "FILM_ID" NUMBER(10,0) NOT NULL ENABLE,
"CITY" VARCHAR2(50 BYTE),
"STATE" VARCHAR2(50 BYTE),
"COUNTRY" VARCHAR2(50 BYTE) NOT NULL ENABLE,
"NOTE" VARCHAR2(200 BYTE),
PRIMARY KEY ("FILM_ID", "CITY", "COUNTRY"),
FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
MOVIES
CREATE TABLE "ADB_PEARL"."MOVIES"
( "IMDB_RANK" NUMBER(10,0),
"IMDB_RATING" NUMBER(10,1),
"FILM_TITLE" VARCHAR2(100 BYTE) NOT NULL ENABLE,
"IMDB_VOTES" NUMBER(10,0),
"FILM_YEAR" NUMBER(4,0),
"RUNTIME" NUMBER(10,0),
"BUDGET" NUMBER(12,0),
"WORLDWIDE_GROSS" NUMBER(12,0),
"FILM_ID" NUMBER(10,0) NOT NULL ENABLE,
"USA_GROSS" NUMBER(12,0),
"AFI_RANK" NUMBER(10,0),
"MPAA_RATING" VARCHAR2(10 BYTE),
"RELEASE_DATE" DATE,
"GROSS_DATE" DATE,
PRIMARY KEY ("FILM_ID"),
FOREIGN KEY ("MPAA_RATING")
REFERENCES "ADB_PEARL"."MPAA_RATINGS" ("MPAA_RATING") ENABLE
)
MPAA_RATINGS
CREATE TABLE "ADB_PEARL"."MPAA_RATINGS"
( "MPAA_RATING" VARCHAR2(10 BYTE) NOT NULL ENABLE,
"RATING_DESCRIPTION" VARCHAR2(100 BYTE),
PRIMARY KEY ("MPAA_RATING")
)
PATRON_TYPE
CREATE TABLE "ADB_PEARL"."PATRON_TYPE"
( "TYPE_ID" NUMBER(5,0),
"PATRON" VARCHAR2(25 BYTE),
"MOVIE_TYPE" VARCHAR2(25 BYTE),
"MOVIE_TIME" VARCHAR2(25 BYTE),
PRIMARY KEY ("TYPE_ID")
)
PRICE
CREATE TABLE "ADB_PEARL"."PRICE"
( "PRICE_DATE" DATE NOT NULL ENABLE,
"TYPE_ID" NUMBER(10,0),
"FARE" NUMBER(5,0),
"PRICE_ID" NUMBER(10,0),
PRIMARY KEY ("PRICE_ID"),
CONSTRAINT "PRICE_FK" FOREIGN KEY ("TYPE_ID")
REFERENCES "ADB_PEARL"."PATRON_TYPE" ("TYPE_ID") ENABLE
)
PRODUCERS
CREATE TABLE "ADB_PEARL"."PRODUCERS"
( "FILM_ID" NUMBER(10,0),
"PRODUCER" VARCHAR2(50 BYTE),
PRIMARY KEY ("FILM_ID"),
CONSTRAINT "PR_CONST" FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
SHOWS
CREATE TABLE "ADB_PEARL"."SHOWS"
( "FILM_ID" NUMBER(10,0) NOT NULL ENABLE,
"THEATRE_ID" NUMBER(10,0) NOT NULL ENABLE,
"SCREEN_ID" NUMBER(10,0) NOT NULL ENABLE,
"MOVIE_TIME" TIMESTAMP (6),
"SHOWS_DATE" DATE NOT NULL ENABLE,
"TICKETS_SOLD" NUMBER(10,0),
"TYPE_ID" NUMBER(10,0) NOT NULL ENABLE,
CONSTRAINT "SHOWS_CONST" PRIMARY KEY ("FILM_ID", "THEATRE_ID", "SCREEN_ID", "MOVIE_TIME", "SHOWS_DATE", "TYPE_ID"),
CONSTRAINT "SHOWS_FK" FOREIGN KEY ("TYPE_ID")
REFERENCES "ADB_PEARL"."PATRON_TYPE" ("TYPE_ID") ENABLE,
CONSTRAINT "SHOWS_FK1" FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
TAGLINES
CREATE TABLE "ADB_PEARL"."TAGLINES"
( "FILM_ID" NUMBER(10,0) NOT NULL ENABLE,
"TAGLINE" VARCHAR2(300 BYTE) NOT NULL ENABLE,
PRIMARY KEY ("FILM_ID", "TAGLINE"),
FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
TECHNICIANS
CREATE TABLE "ADB_PEARL"."TECHNICIANS"
( "FILM_ID" NUMBER(10,0),
"TECHNICIAN" VARCHAR2(50 BYTE),
PRIMARY KEY ("FILM_ID"),
FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE
)
THEATRE_INFO
CREATE TABLE "ADB_PEARL"."THEATRE_INFO"
( "THEATRE_ID" NUMBER(10,0),
"THEATRE_NAME" VARCHAR2(50 BYTE) NOT NULL ENABLE,
"THEATRE_TYPE" VARCHAR2(30 BYTE),
"PHONE" VARCHAR2(20 BYTE),
"EMAIL" VARCHAR2(50 BYTE) NOT NULL ENABLE,
"NUMBER_OF_SCREENS" NUMBER(3,0) NOT NULL ENABLE,
"STREET" VARCHAR2(100 BYTE),
"CITY" VARCHAR2(35 BYTE),
"STATE" VARCHAR2(35 BYTE),
"ZIP" VARCHAR2(15 BYTE),
"COUNTRY" VARCHAR2(30 BYTE),
PRIMARY KEY ("THEATRE_ID")
)
THEATRE_MOVIE_INFO
CREATE TABLE "ADB_PEARL"."THEATRE_MOVIE_INFO"
( "SCREEN_ID" NUMBER(10,0),
"THEATRE_ID" NUMBER(10,0),
"FILM_ID" NUMBER(10,0),
"PRICE_ID" NUMBER(10,0),
"MOVIE_TYPE" VARCHAR2(20 BYTE),
"MOVIE_CLASS" VARCHAR2(20 BYTE),
"CAPACITY" NUMBER(4,0),
CONSTRAINT "THEATRE_MOVIE_INFO_CONST1" PRIMARY KEY ("SCREEN_ID", "THEATRE_ID", "FILM_ID", "PRICE_ID"),
CONSTRAINT "THEATRE_MOVIE_INFO_CONST2" FOREIGN KEY ("THEATRE_ID")
REFERENCES "ADB_PEARL"."THEATRE_INFO" ("THEATRE_ID") ENABLE,
CONSTRAINT "THEATRE_MOVIE_INFO_CONST3" FOREIGN KEY ("FILM_ID")
REFERENCES "ADB_PEARL"."MOVIES" ("FILM_ID") ENABLE,
CONSTRAINT "THEATRE_MOVIE_INFO_CONST4" FOREIGN KEY ("PRICE_ID")
REFERENCES "ADB_PEARL"."PRICE" ("PRICE_ID") ENABLE
)
USER_REVIEWS
CREATE TABLE "ADB_PEARL"."USER_REVIEWS"
( "REVIEW_ID" NUMBER(10,0),
"THEATRE_ID" NUMBER(10,0),
"FAN_ID" NUMBER(10,0),
"RATING" NUMBER(2,0),
"DESCRIPTION" VARCHAR2(150 BYTE),
"REVIEW_DATE" DATE,
CONSTRAINT "USER_REVIEW_CONST1" PRIMARY KEY ("REVIEW_ID", "THEATRE_ID", "FAN_ID"),
CONSTRAINT "USER_REVIEW_CONST2" FOREIGN KEY ("THEATRE_ID")
REFERENCES "ADB_PEARL"."THEATRE_INFO" ("THEATRE_ID") ENABLE,
CONSTRAINT "USER_REVIEW_CONST3" FOREIGN KEY ("FAN_ID")
REFERENCES "ADB_PEARL"."FANS" ("FAN_ID") ENABLE
)
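The table definition above does not itself enforce the 1 to 5 rating range stated in the assumptions. A minimal sketch of how that rule could be added is shown below; the constraint name is hypothetical and the constraint is not part of the original schema.

```sql
-- Assumption: theatre ratings are whole numbers from 1 (poor) to 5 (excellent).
-- USER_REVIEW_RATING_CHK is a hypothetical constraint name.
ALTER TABLE ADB_PEARL.USER_REVIEWS
  ADD CONSTRAINT USER_REVIEW_RATING_CHK
  CHECK (RATING BETWEEN 1 AND 5);
```

A CHECK constraint still accepts a NULL rating, so RATING would also need to be declared NOT NULL if every review must carry a rating.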
Data Generation and Loading:
For this section, we have generated appropriate data for the schema of the
database. Apart from the tables already existent in the Movies database, we
have created 6 additional tables:
AWARDS
TECHNICIANS
THEATRE_INFO
THEATRE_MOVIE_INFO
USER_REVIEWS
PATRON_TYPE
Of these tables, we have generated data for the AWARDS and the
TECHNICIANS tables. We used multiple techniques to generate the data. For
the AWARDS table, we gathered data from Wikipedia. We assembled the data
into an Excel table and then loaded it into our AWARDS table using the import
feature of SQL Developer.
There are three steps in the import process:
1. Choose the import method
2. Select the columns
3. Select the definitions of the columns
To populate the TECHNICIANS table, we first gathered data about the
technicians from Wikipedia. Then, we assembled the data into an Excel table
and imported it into the TECHNICIANS table.
The same procedure was repeated for the PRODUCERS table. We gathered
data about the respective producers from Wikipedia, assembled it into an
Excel table, and finally imported it into our PRODUCERS table.
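For comparison, a row loaded through the import wizard is equivalent to a plain INSERT, and a short check after loading can catch awards that point at a film missing from MOVIES (the AWARDS definition above has no foreign key on FILM_ID). This is a sketch only; all values in the INSERT are placeholders and not actual project data.

```sql
-- Placeholder values for illustration only.
INSERT INTO ADB_PEARL.AWARDS
  (AWARD_ID, AWARD_NAME, AWARD_YEAR, AWARD_CATEGORY, FILM_ID, AWARDEE_NAME)
VALUES
  (1, 'Sample Award', 2014, 'Best Picture', 1, 'Sample Awardee');

COMMIT;

-- Post-load check: awards whose FILM_ID has no matching movie.
SELECT A.AWARD_ID, A.FILM_ID
FROM ADB_PEARL.AWARDS A
WHERE NOT EXISTS
  (SELECT 1 FROM ADB_PEARL.MOVIES M WHERE M.FILM_ID = A.FILM_ID);
```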
Performance Tuning:
SQL Tuning
Different SQL statements can be used to retrieve data from the Movies
database. The same results can be obtained by writing different queries, but
choosing the best query is important for performance. Therefore, we
need to tune SQL queries based on the requirements. The list of queries below
shows how queries can be optimized for better performance.
1. A SQL query becomes faster if you list the actual columns in the SELECT
statement instead of “*”.
SELECT FILM_TITLE, FILM_ID FROM MOVIES
INSTEAD OF
SELECT * FROM MOVIES
2. Sometimes you may have more than one subquery in your main query.
Try to minimize the number of subquery blocks in your query.
SELECT FILM_ID, FILM_TITLE FROM MOVIES WHERE (IMDB_VOTES,
USA_GROSS) = (SELECT MAX (IMDB_VOTES), MAX (USA_GROSS) FROM
MOVIES)
INSTEAD OF
SELECT FILM_ID, FILM_TITLE FROM MOVIES WHERE IMDB_VOTES= (SELECT
MAX (IMDB_VOTES) FROM MOVIES) AND USA_GROSS = (SELECT MAX
(USA_GROSS) FROM MOVIES)
3. Use the EXISTS and IN operators and table joins appropriately in your query.
a) Usually, IN has the slowest performance.
b) IN is efficient when most of the filter criteria are in the subquery.
c) EXISTS is efficient when most of the filter criteria are in the main query.
Example: Write the Query as follows:
SELECT FILM_ID FROM SHOWS S WHERE EXISTS (SELECT * FROM
THEATRE_INFO T WHERE S.THEATRE_ID=T.THEATRE_ID)
INSTEAD OF
SELECT FILM_ID FROM SHOWS WHERE THEATRE_ID IN (SELECT
THEATRE_ID FROM THEATRE_INFO)
4. Use UNION ALL in place of UNION
Example: Write the Query as follows:
SELECT FILM_ID, FILM_TITLE FROM MOVIES UNION ALL SELECT AWARD_ID,
AWARD_NAME FROM AWARDS
INSTEAD OF
SELECT FILM_ID, FILM_TITLE FROM MOVIES UNION SELECT AWARD_ID,
AWARD_NAME FROM AWARDS
5. Use where clause appropriately
Example: Write the Query as follows:
SELECT * FROM MOVIES WHERE FILM_YEAR BETWEEN 2000 AND 2014
INSTEAD OF
SELECT * FROM MOVIES WHERE FILM_YEAR>=2000 AND FILM_YEAR<=2014
6. The HAVING clause filters groups after the rows have been selected and
aggregated. Use it only for conditions on aggregates; put conditions on
individual rows in the WHERE clause instead.
SELECT MPAA_RATING, COUNT (M.FILM_TITLE) AS MOVIES FROM MOVIES M
GROUP BY MPAA_RATING HAVING COUNT (M.FILM_TITLE)>1
INSTEAD OF
SELECT MPAA_RATING, COUNT (M.FILM_TITLE) AS MOVIES FROM MOVIES
M GROUP BY MPAA_RATING HAVING MPAA_RATING='R'
7. Use EXISTS instead of DISTINCT when using joins that involve tables
having a one-to-many relationship.
Example: Write the Query as follows:
SELECT FILM_ID, FILM_TITLE FROM MOVIES M WHERE EXISTS (SELECT
CAST_MEMBER FROM CASTS C WHERE M.FILM_ID=C.FILM_ID)
INSTEAD OF
SELECT DISTINCT C.CAST_MEMBER, M.FILM_ID FROM MOVIES M, CASTS C
WHERE M.FILM_ID=C.FILM_ID
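Whether a rewrite actually helps can be checked by comparing execution plans rather than relying on the rule of thumb. The sketch below does this for the EXISTS and IN variants from item 3, using EXPLAIN PLAN and DBMS_XPLAN.DISPLAY; it assumes a PLAN_TABLE is available to the session.

```sql
-- Plan for the EXISTS form.
EXPLAIN PLAN FOR
SELECT FILM_ID FROM SHOWS S
WHERE EXISTS (SELECT 1 FROM THEATRE_INFO T
              WHERE S.THEATRE_ID = T.THEATRE_ID);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan for the IN form.
EXPLAIN PLAN FOR
SELECT FILM_ID FROM SHOWS
WHERE THEATRE_ID IN (SELECT THEATRE_ID FROM THEATRE_INFO);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The optimizer may transform both forms into the same semi-join, in which case the two plans and their costs come out identical and the rewrite makes no difference.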
Querying
Interesting queries for MOVIES database:
1. Display the name of all movies that have an IMDB rating of at least 8.0,
with more than 100,000 IMDB votes, and were released from 2007 to
2013. Show the movies with the highest IMDB ratings first.
SELECT M.FILM_TITLE AS MOVIES FROM RELMDB.MOVIES M WHERE
M.IMDB_VOTES >100000 AND M.IMDB_RATING >=8.0 AND
M.FILM_YEAR>=2007 AND M.FILM_YEAR<=2013 ORDER BY
M.IMDB_RATING DESC;
2. Display each movie’s title and total gross, where total gross is the USA
gross and worldwide gross combined. Exclude any movies that do not
have values for either USA gross or worldwide gross. Show the highest
grossing movies first.
SELECT M.FILM_TITLE AS TITLE, (M.USA_GROSS +
M.WORLDWIDE_GROSS) AS TOTAL_GROSS FROM RELMDB.MOVIES M
WHERE M.USA_GROSS IS NOT NULL AND M.WORLDWIDE_GROSS IS
NOT NULL ORDER BY TOTAL_GROSS DESC;
3. Display how many movies have an MPAA rating of G, PG, PG-13, and R.
Show the results in alphabetical order by MPAA rating.
SELECT MPAA_RATING, COUNT (M.FILM_TITLE) AS MOVIES
FROM RELMDB.MOVIES M WHERE
MPAA_RATING IN ('G','PG','PG-13','R') GROUP BY MPAA_RATING
ORDER BY MPAA_RATING;
4. Display the titles of all movies where Tom Hanks or Tim Allen were cast
members. Each movie title should be shown only once.
SELECT DISTINCT M.FILM_TITLE AS TITLE FROM RELMDB.MOVIES M
INNER JOIN RELMDB.CASTS C ON
M.FILM_ID = C.FILM_ID WHERE C.CAST_MEMBER IN ('Tom Hanks','Tim
Allen')
5. For each movie, display its movie title, year, and how many cast
members were part of the movie. Exclude movies with five or fewer cast
members. Display movies with the most cast members first, followed by
movie year and title.
SELECT COUNT (C.CAST_MEMBER) AS CASTCOUNT, M.FILM_TITLE,
M.FILM_YEAR FROM
RELMDB.CASTS C INNER JOIN RELMDB.MOVIES M ON M.FILM_ID =
C.FILM_ID GROUP BY
M.FILM_TITLE, M.FILM_YEAR HAVING COUNT (C.CAST_MEMBER) > 5
ORDER BY COUNT (C.CAST_MEMBER) DESC, FILM_YEAR DESC, FILM_TITLE DESC;
6. Display the movies whose tagline contains the keyword “people”.
SELECT movies.film_title
FROM movies
INNER JOIN taglines ON movies.film_id = taglines.film_id
WHERE taglines.tagline LIKE '%people%';
Parallelizing Index
Parallelizing an index build enables the database to use multiple processes to create the index, which is quicker than having a single server process create the index sequentially. We can also increase the degree of parallelism of a table so that queries against it use multiple processes, which improves performance.
We first ran a query that displays each film's title, year, and fan IDs where the film year is later than 1990.
SELECT MOVIES.FILM_TITLE,MOVIES.FILM_YEAR,FAN_RATINGS.FAN_ID from MOVIES LEFT JOIN FAN_RATINGS on FAN_RATINGS.FILM_ID = MOVIES.Film_id where MOVIES.FILM_YEAR > 1990
Now, we increased the degree of the tables to allow parallelism
ALTER TABLE MOVIES PARALLEL(DEGREE 4)
ALTER TABLE FAN_RATINGS PARALLEL(DEGREE 4)
We can see the reduction in the cost after enabling parallelism.
Conclusion:
The query cost dropped from 7 to 5. Hence, enabling parallelism helped improve performance. Parallelism also helps in faster access to data.
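The statements above set the degree of parallelism on the tables. A parallel index build, as described at the start of this section, could be sketched as follows. The index name is hypothetical, and parallel execution is assumed to be available in the database edition in use.

```sql
-- Hypothetical index on the column filtered by the test query,
-- built with four parallel server processes.
CREATE INDEX MOVIES_FILM_YEAR_IDX ON MOVIES (FILM_YEAR) PARALLEL 4;

-- Reset the index afterwards so that later queries using it are not
-- parallelized by default.
ALTER INDEX MOVIES_FILM_YEAR_IDX NOPARALLEL;

-- Return the tables to serial execution once the experiment is finished.
ALTER TABLE MOVIES NOPARALLEL;
ALTER TABLE FAN_RATINGS NOPARALLEL;
```

Resetting to NOPARALLEL keeps the faster build without leaving parallel query enabled on small tables, where the coordination overhead can outweigh the benefit.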
Function Based Indexing
A function-based index is an index built on a function or expression, which the optimizer can then use in queries. This capability allows you to have case-insensitive searches or sorts, search on complex equations, and extend the SQL language efficiently by implementing your own functions and operators and then searching on them.
• It is easy and provides immediate value.
• It can be used to speed up existing applications without changing any of their logic or queries.
• It can be used to supply additional functionality to applications at very little cost.
Query:
SELECT FILM_TITLE, (WORLDWIDE_GROSS - BUDGET) AS Earnings FROM MOVIES
WHERE (WORLDWIDE_GROSS - BUDGET) > 1000000
Without Index:
Index Creation:
With Index:
Here we observe that indexing the functional attribute not only
increases the cost but also the performance. In the query above, if we had created indexes on either WORLDWIDE_GROSS or BUDGET, they would not have been used, since we are using WORLDWIDE_GROSS - BUDGET as the filtering criterion.
After creating indexes, statistics should ideally be gathered, since the spread of the data in the database changes the statistics of the database. Currently, however, we do not have the privilege to execute the DBMS_STATS package. It requires the DBA privilege since it will impact database performance.
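A function-based index for the earnings query could be created along the following lines. The index name is hypothetical and the statement the team actually ran may differ; the plan check assumes a PLAN_TABLE is available to the session.

```sql
-- Hypothetical function-based index on the expression used in the filter.
CREATE INDEX MOVIES_EARNINGS_IDX ON MOVIES (WORLDWIDE_GROSS - BUDGET);

-- Check whether the optimizer chooses the new index for the query.
EXPLAIN PLAN FOR
SELECT FILM_TITLE, (WORLDWIDE_GROSS - BUDGET) AS EARNINGS
FROM MOVIES
WHERE (WORLDWIDE_GROSS - BUDGET) > 1000000;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The index can only be considered when the query repeats the indexed expression, which is why the WHERE clause uses WORLDWIDE_GROSS - BUDGET and not a rearranged form of it.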
DBA Scripts:
The queries below display critical and general database information.
SELECT * FROM V$DATABASE;
SELECT * FROM V$INSTANCE;
SELECT * FROM V$VERSION;
ACTIVE SESSIONS IN THE DATABASE:
The query below lists the sessions that are presently active in the database:
SELECT NVL(V$SESSION.USERNAME, '(oracle)') AS USERNAME, V$SESSION.OSUSER, V$SESSION.SID,
V$SESSION.SERIAL#, V$PROCESS.SPID, V$SESSION.LOCKWAIT, V$SESSION.STATUS, V$SESSION.MODULE,
V$SESSION.MACHINE, V$SESSION.PROGRAM, TO_CHAR(V$SESSION.LOGON_TIME,'DD-MON-YYYY HH24:MI:SS') AS
LOGON_TIME FROM V$SESSION, V$PROCESS WHERE V$SESSION.PADDR = V$PROCESS.ADDR AND
V$SESSION.STATUS = 'ACTIVE'
ORDER BY V$SESSION.USERNAME, V$SESSION.OSUSER;
BLOCKING:
The query below lists the sessions that are blocking another session or waiting on a lock. It uses the V$SESSION and V$LOCK views.
SELECT S.SID, S.USERNAME, B.BLOCKER, B.WAITER FROM V$SESSION s,
(SELECT SID, DECODE(BLOCK, 0, 'NO', 'YES' ) BLOCKER, DECODE(REQUEST, 0, 'NO','YES' ) WAITER FROM V$LOCK
WHERE REQUEST > 0 OR BLOCK > 0 ) b
WHERE S.SID = B.SID
There were perhaps no blocked users, which would explain the above output.
PROFILES:
The below query lists out the DBA Profiles:
SELECT * FROM DBA_PROFILES
USERS:
The below query shows the list of users in the database:
SELECT USERNAME, USER_ID, ACCOUNT_STATUS, LOCK_DATE, EXPIRY_DATE, DEFAULT_TABLESPACE, PROFILE
FROM
DBA_USERS
ROLES:
The below query list out the roles in the database:
SELECT * FROM DBA_ROLES
DISPLAYING SYSTEM PARAMETERS ALONG WITH THEIR DESCRIPTION
SET LINESIZE 500
COLUMN name FORMAT A30
COLUMN value FORMAT A60
SELECT SP.NAME, SP.TYPE, SP.VALUE, SP.ISSES_MODIFIABLE, SP.ISSYS_MODIFIABLE, SP.ISINSTANCE_MODIFIABLE
FROM V$SYSTEM_PARAMETER SP
ORDER BY SP.NAME;
CACHE MEMORY USAGE:
To improve performance, a DBA needs to constantly monitor cache memory usage. The query below reports it:
SELECT OWNER, NAMESPACE, TYPE, COUNT(*) OBJECT_COUNT, SUM(SHARABLE_MEM) SHARABLE_MEM,
SUM(LOADS) LOADS, SUM(EXECUTIONS) EXECUTIONS, SUM(LOCKS) LOCKS, SUM(PINS) PINS
FROM V$DB_OBJECT_CACHE
GROUP BY OWNER, NAMESPACE, TYPE;
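The scripts above need access to the V$ and DBA views. For reporting on a schema's own objects, indexes, constraints, and storage, the USER_ dictionary views can be queried without DBA privileges. The following is a sketch of such scripts and was not part of the original set.

```sql
-- Tables owned by the current user (NUM_ROWS is filled in only after
-- statistics have been gathered).
SELECT TABLE_NAME, NUM_ROWS, LAST_ANALYZED
FROM USER_TABLES
ORDER BY TABLE_NAME;

-- Indexes and the tables they belong to.
SELECT INDEX_NAME, TABLE_NAME, UNIQUENESS, STATUS
FROM USER_INDEXES
ORDER BY TABLE_NAME, INDEX_NAME;

-- Constraints: P = primary key, R = foreign key, U = unique, C = check.
SELECT TABLE_NAME, CONSTRAINT_NAME, CONSTRAINT_TYPE, STATUS
FROM USER_CONSTRAINTS
ORDER BY TABLE_NAME, CONSTRAINT_TYPE;

-- Space used by each segment, in megabytes.
SELECT SEGMENT_NAME, SEGMENT_TYPE, ROUND(BYTES / 1024 / 1024, 2) AS SIZE_MB
FROM USER_SEGMENTS
ORDER BY BYTES DESC;
```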
Database Security:
Database security concerns the use of a broad range of information
security controls to protect databases (potentially including the data, the
database applications or stored functions, the database systems, the database
servers and the associated network links) against compromises of their
confidentiality, integrity and availability. It involves various types or
categories of controls such as technical, procedural/administrative, and
physical.
Security risks to database systems:
• Unauthorized or unintended activity or misuse by authorized database
users
• Data corruption and/or loss caused by the entry of invalid data or
commands
• Malware infections causing incidents such as unauthorized access
• Overloads, performance constraints, and capacity issues
• Physical damage to database servers
• Design flaws and programming bugs in databases
Many layers and types of information security controls are appropriate to
databases, including:
• Access control
• Auditing
• Authentication
• Encryption
• Integrity controls
• Backups
• Application security
• Database Security applying Statistical Method
We want to implement authentication and auditing of the database in our
project.
Authentication
A basic security requirement is that you must know your users. You must
identify the users before you can determine their privileges and access rights,
so that you can audit their actions upon the data.
Users can be authenticated in a number of different ways before they
are allowed to create a database session. In database authentication, you can
define users such that the database performs both identification and
authentication of users. In external authentication, you can define users such
that authentication is performed by the operating system or network service.
Alternatively, you can identify users through authentication by the Secure
Sockets Layer (SSL).
For enterprise users, an enterprise directory can be used to authorize
their access to the database through enterprise roles. Finally, you can specify
users who are allowed to connect through a middle-tier server. The
middle-tier server authenticates and assumes the identity of the user and is
allowed to enable specific roles for the user. This is called proxy authentication.
Different privileges are given to different users, so that only a
select number of people can change the data in the tables. In the movies
database, we have defined a set of actors and assigned privileges. Some of
the users and their privileges are listed in the table below.
Users Privileges
A user privilege is a right to execute a particular type of SQL statement
or a right to access another user's object. The types of privileges are defined
by DBMS.
Roles, on the other hand, are created by users (usually administrators) and
are used to group together privileges or other roles. They are a means of
facilitating the granting of multiple privileges or roles to users.
User Roles
• Database Administrator
• Database Programmers
• Moderators
• Users
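As an illustration of how two of these roles might be set up, the sketch below creates a role for ordinary users and one for moderators. The role names, the account name SOME_MODERATOR, and the particular split of privileges are all assumptions made for the example; the privileges the team actually assigned are not reproduced here.

```sql
-- Hypothetical role names.
CREATE ROLE PEARL_USER;
CREATE ROLE PEARL_MODERATOR;

-- Assumed policy: ordinary users may read movie data and add reviews.
GRANT SELECT ON ADB_PEARL.MOVIES TO PEARL_USER;
GRANT SELECT, INSERT ON ADB_PEARL.USER_REVIEWS TO PEARL_USER;

-- Assumed policy: moderators inherit the user role and may also edit
-- or remove reviews.
GRANT PEARL_USER TO PEARL_MODERATOR;
GRANT UPDATE, DELETE ON ADB_PEARL.USER_REVIEWS TO PEARL_MODERATOR;

-- Assign the role to a hypothetical, already existing account.
GRANT PEARL_MODERATOR TO SOME_MODERATOR;
```

Granting one role to another keeps the privilege lists short, because a change to PEARL_USER automatically carries over to moderators.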
We set authentication rules for each user. Users need to create a username and
password to log into the system, which ensures that they have the appropriate privileges.
Some authentication features need to be enabled in the system:
• Passwords have to be changed every three months.
• Make the password between 10 and 30 characters long and include numbers.
• Use mixed case letters and special characters in the password.
• Use the database character set for the password's characters, which can
include the underscore (_), dollar ($), and number sign (#) characters.
• Do not use an actual word for the entire password.
• Entering a password incorrectly for more than five tries will lock the
user account.
• Password encryption.
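The expiry and lockout rules in the list above map onto an Oracle password profile. The sketch below is one possible setup; the profile name and the account name SOME_MODERATOR are hypothetical.

```sql
-- PASSWORD_LIFE_TIME 90: passwords expire after about three months.
-- FAILED_LOGIN_ATTEMPTS 5: the account locks after five consecutive failures.
-- PASSWORD_LOCK_TIME UNLIMITED: it stays locked until a DBA unlocks it.
CREATE PROFILE PEARL_PASSWORD_PROFILE LIMIT
  PASSWORD_LIFE_TIME    90
  FAILED_LOGIN_ATTEMPTS 5
  PASSWORD_LOCK_TIME    UNLIMITED;

-- Attach the profile to a hypothetical, already existing account.
ALTER USER SOME_MODERATOR PROFILE PEARL_PASSWORD_PROFILE;
```

The length and character-mix rules are not covered by these limits. They would need a password verification function attached to the profile through PASSWORD_VERIFY_FUNCTION, which is not shown here.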
The following DBA script will display all the roles and the privileges granted:
SET SERVEROUTPUT ON
SET VERIFY OFF
SELECT a.granted_role "Role",
a.admin_option "Adm"
FROM user_role_privs a;
SELECT a.privilege "Privilege",
a.admin_option "Adm"
FROM user_sys_privs a;
Auditing
Auditing is the monitoring and recording of selected user database
actions. It can be based on individual actions, such as the type of SQL
statement executed or on combinations of factors that can include user name,
application, time, and so on. Security policies can trigger auditing when
specified elements in an Oracle database are accessed or altered, including the
contents within a specified object.
Auditing is typically used to:
• Enable future accountability for current actions taken in a particular
schema, table, or row, or affecting specific content.
• Deter users (or others) from inappropriate actions based on that
accountability.
• Investigate suspicious activity.
• Notify an auditor that an unauthorized user is manipulating or deleting
data and that the user has more privileges than expected which can lead
to reassessing user authorizations.
• Monitor and gather data about specific database activities.
• Detect problems with an authorization or access control
implementation.
Types of auditing:
• Statement auditing
• Privilege auditing
• Schema object auditing
• Fine-grained auditing
Auditing can also be used to monitor data about certain database activities. A
DBA can gather data about all the logical I/O’s being performed. This can give
the DBA a better understanding of what is happening in the system.
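The four auditing types can be sketched with Oracle's traditional AUDIT statement and the DBMS_FGA package. This assumes a release where traditional auditing is enabled (the AUDIT_TRAIL parameter set to DB) and an account with the required audit privileges; the account name SOME_MODERATOR and the policy name are hypothetical.

```sql
-- Statement auditing: table-level SELECT and UPDATE statements issued
-- by a hypothetical account.
AUDIT SELECT TABLE, UPDATE TABLE BY SOME_MODERATOR BY ACCESS;

-- Privilege auditing: any use of the CREATE ANY TABLE system privilege.
AUDIT CREATE ANY TABLE BY ACCESS;

-- Schema object auditing: changes to the reviews table.
AUDIT INSERT, UPDATE, DELETE ON ADB_PEARL.USER_REVIEWS BY ACCESS;

-- Fine-grained auditing: only statements that touch one-star reviews.
BEGIN
  DBMS_FGA.ADD_POLICY(
    object_schema   => 'ADB_PEARL',
    object_name     => 'USER_REVIEWS',
    policy_name     => 'LOW_RATING_AUDIT',
    audit_condition => 'RATING = 1',
    statement_types => 'SELECT,UPDATE');
END;
/

-- Review the standard audit records that were captured.
SELECT USERNAME, OBJ_NAME, ACTION_NAME, TIMESTAMP
FROM DBA_AUDIT_TRAIL
ORDER BY TIMESTAMP DESC;
```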
Data backup and recovery
Data backup is an insurance plan. Important files are accidentally
deleted all the time. Mission-critical data can become corrupt. Natural
disasters can leave your office in ruin. With a solid backup and recovery plan,
you can recover from any of these. Without one, you're left with nothing to fall
back on.
Recovery Manager can be used to back up data and restore it quickly.
Recovery Manager (RMAN) is an Oracle utility that can back up, restore,
and recover database files. The product is a feature of the Oracle database
server and does not require separate installation. Recovery Manager is a
client/server application that uses database server sessions to perform
backup and recovery. It stores metadata about its operations in the control file
of the target database and, optionally, in a recovery catalog schema in an
Oracle database.
Below is an example RMAN script to back up a database to an image copy:
connect target /
run {
shutdown immediate;
startup force DBA;
shutdown immediate;
startup mount;
allocate channel channel1 type DISK;
backup as copy format '/backup/%U' (database) CURRENT CONTROLFILE SPFILE;
BACKUP AS COPY CURRENT CONTROLFILE FORMAT '/backup/control01.ctl';
release channel channel1;
shutdown immediate;
startup;
SQL "ALTER DATABASE BACKUP CONTROLFILE TO TRACE";
}
Weightage Table:
Topic / Section: Logical database design
Description: The logical design section should include entity-relationship diagrams (ERDs) and data dictionaries for your database design, as well as any design assumptions. You might include a few high-level diagrams that highlight interesting sections of your project (include textual descriptions with these). There should also be a complete ERD for your entire project. There is no expectation that you implement all of your design, just indicate the areas built. You are expected to add additional design work as part of the project.
Evaluation: 15

Topic / Section: Physical database design
Description: This section should cover implementation-level issues. For instance, you should discuss predicted usage and indexing strategies that support expected activities. In addition, you may wish to discuss architecture issues, including distributed database issues (even though you may not implement anything in these areas). Artifacts could include capacity planning, storage subsystems, and data placement (e.g., tablespace / file system arrangements), indexing strategies, transaction usage maps, etc.
Evaluation: 15

Topic / Section: Data generation and loading
Description: Though some data was provided, there may have been interesting queries, stored procedures, desktop tools (e.g., MS Excel) that were used to populate the database. You may have used queries with mod function, data arithmetic, number sequences, lookup tables, and even data from the Web. Any / all of these are interesting additions to the project.
Evaluation: 15

Topic / Section: Performance tuning
Description: In this section, highlight any experiments run as part of the project related to performance tuning. Experiments with different indexing strategies, optimizer changes, transaction isolation levels, function-based indexes, and table partitioning can all be interesting. Remember to look at different types of queries (e.g., point, range, scan), execution plans, and I/O burden.
Evaluation: 20

Topic / Section: Querying
Description: You may also choose to focus on writing SQL queries (analytic SQL extensions can also be explored). Include interesting queries that highlight the types of questions that can be answered by the database. These queries may also be used to illustrate performance tuning.
Evaluation: 10

Topic / Section: DBA scripts
Description: Throughout the semester, we looked at example DBA scripts that query the system catalog (a good way to explore the database engine). Provide DBA scripts, and an explanation of how the scripts can be used, that are helpful for reporting on database objects, indexes, constraints, physical storage, data files, etc.
Evaluation: 15

Topic / Section: Database security
Description: Database security is an important area of interest that can also be investigated. Though you are limited on the implementation side, you can develop a security policy and discuss how you might implement various aspects using authentication strategies, roles, profiles, and even auditing features.
Evaluation: 10