1
Chapter 4: Logical & Physical Database Design
2
Logical Data Modeling
Application data models have two primary phases
– Establishing the logical data model
– Translating the logical data model into a physical data model
Normalization & Third Normal Form
– All data in an entity is dependent on the primary key
– There should be no repeating groups of attributes
– No data in an entity is dependent on part of the key
– No data in an entity is dependent on any nonkey attribute
“The key, the whole key, and nothing but the key, so help me Codd”
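A minimal sketch of the "nothing but the key" rule, assuming Python's built-in sqlite3 module as a stand-in for an Oracle database. The employee and department tables and their columns are hypothetical and not taken from the slides. The department name depends on dept_id rather than on the employee key, so it is stored once in its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Third normal form: dept_name depends on dept_id, not on emp_id,
# so it moves to its own table instead of repeating on every employee row.
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (10, 'Sales'), (20, 'Support');
INSERT INTO employee VALUES (1, 'Ann', 10), (2, 'Bob', 10), (3, 'Cy', 20);
""")

# Renaming a department is a single-row update; no employee row changes.
renamed = conn.execute(
    "UPDATE department SET dept_name = 'Field Sales' WHERE dept_id = 10"
).rowcount

rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee e JOIN department d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(renamed, rows)
```

Had dept_name been kept on the employee table, the same rename would have touched one row per employee in that department.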
4
Logical Data Modeling (cont.)
Determining data types for attributes
– Fixed- & variable-length attributes
– Integer & floating-point numbers
– Pictures, video, long character data (LONG, LOB)
– Dates, timestamps, and the like
When determining data types
– Consider the impact on storage
– Consider database fragmentation
– Carefully consider options before using LONG, LOB
5
Logical Data Modeling (cont.)
Keys
– Natural
    – Constructed naturally from the entity
    – Column(s) have meaning
    – Usually will have a longer key length than artificial keys
    – Often are made up of multiple columns
– Artificial
    – Unintelligent, has no meaning
    – Usually a sequential number
    – Generally will perform better than natural keys
    – Never needs updating (easier to manage referential integrity)
– Ongoing debate about the use of natural vs. artificial keys
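The "never needs updating" point can be sketched as follows, again assuming sqlite3 in place of Oracle and using hypothetical customer and orders tables. The email address is a natural candidate key that can change; the sequential customer_id is the artificial key that child rows reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- artificial key: sequential, no meaning
    email       TEXT NOT NULL UNIQUE   -- natural candidate key: has meaning
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL
);
INSERT INTO customer (email) VALUES ('ann@example.com');
INSERT INTO orders (customer_id, amount) VALUES (1, 25.0), (1, 40.0);
""")

# The natural value changes, but the artificial key that the orders
# reference does not, so no child row has to be updated.
conn.execute(
    "UPDATE customer SET email = 'ann.lee@example.com' WHERE customer_id = 1"
)

order_count = conn.execute("""
    SELECT COUNT(*)
    FROM orders o JOIN customer c ON c.customer_id = o.customer_id
    WHERE c.email = 'ann.lee@example.com'
""").fetchone()[0]
print(order_count)
```

If email were the primary key, the same change would have to be propagated to every referencing order row.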
6
Logical Data Modeling (cont.)
Data Warehouse Design
– Has different logical requirements
– Common data models include
    – Star Schema
        – Fact table (generally large)
        – Series of dimension tables (generally small)
    – Snowflake Schema
        – A more complex star schema
    – Hybrid
– Has different physical requirements
    – Partitioning
    – Bitmap indexes
    – Materialized views
7
Logical to Physical
Logical design meets functional requirements
Physical design meets performance requirements
Common mistake is making the physical model an exact copy of the logical model
– Usually means lower performance
– Pay the price upfront to properly determine the physical model
8
Logical to Physical (cont.)
Key steps include
– Mapping entities to tables
– Choosing a table type
– Determining data types
– Precision & optional attributes
Denormalization
– Done for performance
– Can increase overhead
Summary tables
Partitioning tables
9
The Star Schema
Common in data warehousing
Typically shows better performance for warehouses
Fact table
– Contains the detailed information
– Many foreign keys to “dimension” tables
Dimension tables
– Many tables that surround the fact table
– Are reference or “categorized” tables such as time, product, and customer
See Figure 4-3 (p. 94)
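As a rough illustration of the fact/dimension layout, the sketch below builds a tiny star schema in sqlite3 (an assumption; the slides target Oracle). The table names dim_time, dim_product, and fact_sales and the sample rows are hypothetical. The query shows the typical access pattern: join the fact table to its dimensions and aggregate by dimension attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: small reference tables.
CREATE TABLE dim_time (
    time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER
);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY, name TEXT, category TEXT
);
-- Fact table: detailed rows, one foreign key per dimension.
CREATE TABLE fact_sales (
    time_id    INTEGER NOT NULL REFERENCES dim_time(time_id),
    product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
    quantity   INTEGER NOT NULL,
    amount     REAL NOT NULL
);
INSERT INTO dim_time VALUES (1, 2023, 12), (2, 2024, 1);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO fact_sales VALUES (1, 1, 2, 20.0), (2, 1, 1, 10.0), (2, 2, 3, 45.0);
""")

# Typical star query: join the fact table to its dimensions, then aggregate.
totals = conn.execute("""
    SELECT t.year, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_time t    ON t.time_id = f.time_id
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY t.year, p.category
    ORDER BY t.year, p.category
""").fetchall()
print(totals)
```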
10
The Snowflake Schema
An expanded star schema
Dimensions are split into multiple tables
Is a normalization technique
To be used with caution; can degrade performance
Can complicate queries
Can reduce storage requirements
See Figure 4-4 (p. 95)
11
Materialized Views
Also common in the data warehouse
Done to aggregate or join data
Created to simplify access for the end user
Is a physical table (occupies storage)
Provides sophisticated refresh functionality
Refreshes can be complete or just the changes
Query rewrite functionality makes the views themselves transparent to the user
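SQLite has no materialized views, so the sketch below only emulates the idea: a physical summary table plus a hypothetical complete_refresh helper that rebuilds it from the base table. The sales and sales_by_region_mv tables are invented for illustration. Oracle's own refresh and query rewrite machinery is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT NOT NULL, amount REAL NOT NULL);
-- Physical table holding the pre-aggregated result.
CREATE TABLE sales_by_region_mv (region TEXT PRIMARY KEY, total REAL NOT NULL);
INSERT INTO sales VALUES ('East', 10.0), ('East', 5.0), ('West', 7.0);
""")

def complete_refresh(conn):
    """Rebuild the whole summary from the base table (a complete refresh)."""
    with conn:  # one transaction: readers never see a half-built summary
        conn.execute("DELETE FROM sales_by_region_mv")
        conn.execute("""
            INSERT INTO sales_by_region_mv
            SELECT region, SUM(amount) FROM sales GROUP BY region
        """)

def read_summary(conn):
    return dict(conn.execute("SELECT region, total FROM sales_by_region_mv"))

complete_refresh(conn)
conn.execute("INSERT INTO sales VALUES ('West', 3.0)")

stale = read_summary(conn)   # the summary does not see the new row yet
complete_refresh(conn)
fresh = read_summary(conn)   # after the refresh it does
print(stale, fresh)
```

A refresh of "just the changes" would instead apply only the rows added since the last refresh, which avoids rescanning the base table.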
12
Physical Storage Options
Segment space management
– Can be automatic (preferred) or manual
– Affects how Oracle handles block management regarding:
    – Freelists
    – Block-related parameters (PCTUSED, PCTFREE)
Row migration
– Indicated by the “table fetch continued row” statistic (V$SYSSTAT)
INITRANS: transaction slots within a block
– ITL (Interested Transaction List)
13
Physical Storage Options (cont.)
Compression
– Reduces storage & memory requirements
– Makes DML slower
– Select operations can be faster
– In Oracle 10g
    – Done during table creation or reorganization
    – Loads needed to be done “direct-load”
    – Normal DML caused decompression
– In Oracle 11g
    – New advanced compression component
    – Can function within normal DML operations
– Best used with
    – Table scans
    – Character type data
14
Physical Storage Options (cont.)
LOBs (Large Object Data)
– Character data > 4000 bytes (CLOB)
– Binary data (BLOB)
– Stored separately from the remainder of the row data
– Storage mechanism differs from that of row data (chunks)
– Have different storage options
– New security features in Oracle 11g
15
Oracle Partitioning
Breaks a table/index up into logical segments
Each partition can have separate storage characteristics
Benefits include
– Reading only the partitions relevant to a query
– Can help improve parallel processing for DML, select operations, and database maintenance operations
– Deletes can be done much more cheaply
– Can reduce latch contention
16
Oracle Partitioning (cont.)
Partitioning types:
– Range (typically time-based)
– Hash (helps ensure equal-sized partitions)
– List (based on a specific set of values, e.g. state code)
– Composite partitioning (combination of the above)
New Oracle 11g partitioning
– Reference (child tables inherit partitioning from parents)
– Interval (can enable auto-addition of partitions)
– Virtual-column (enables partitioning on expressions)
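To make the range, hash, and list types concrete, here is a small pure-Python sketch of how a row's key could be routed to a partition. It is a conceptual model only, not Oracle's implementation: the partition names, the boundary dates, and the use of CRC32 as the hash function are all assumptions chosen for illustration.

```python
import zlib
from bisect import bisect_right
from datetime import date

# Range: each bound is an exclusive upper limit for the partition before it.
RANGE_BOUNDS = [date(2024, 1, 1), date(2024, 4, 1), date(2024, 7, 1)]
RANGE_NAMES = ["p_2023", "p_2024_q1", "p_2024_q2", "p_max"]

def range_partition(d):
    """Pick the first partition whose upper bound is greater than d."""
    return RANGE_NAMES[bisect_right(RANGE_BOUNDS, d)]

def hash_partition(key, n_partitions=4):
    """Spread keys across n partitions (CRC32 is an illustrative hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions

# List: explicit values map to a partition, with a catch-all default.
LIST_MAP = {"CA": "p_west", "OR": "p_west", "NY": "p_east"}

def list_partition(state_code):
    return LIST_MAP.get(state_code, "p_default")

def prune_range(start, end):
    """Partitions that can hold rows with start <= d <= end."""
    lo = bisect_right(RANGE_BOUNDS, start)
    hi = bisect_right(RANGE_BOUNDS, end)
    return RANGE_NAMES[lo:hi + 1]

print(range_partition(date(2024, 2, 14)))
print(list_partition("OR"), list_partition("TX"))
print(prune_range(date(2024, 2, 1), date(2024, 5, 15)))
```

The prune_range helper mirrors the benefit of reading only the relevant partitions: a query restricted to February through mid-May needs two of the four range partitions.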
17
Oracle Partitioning (cont.)
Choosing a partitioning strategy
– Range (good if data will be purged)
– Hash (helps if parallel operations are needed)
– List (queries based on a small subset of the table’s data)
– Composite partitioning (if several of the above factors are indicated)
– Enterprise Manager partitioning advisor
    – Can help suggest partitioning schemes