database relationship modeling building database relationships
TRANSCRIPT
Database Relationship Modeling
Building Database Relationships
page 204/10/23 Presentation
Database Relationship Modeling
Before you create a database, you must answer the following questions: What is the purpose of this database and who will use it? What tables (data) will this database contain? What queries and reports do the users of this database
need? What forms do you need to create?
Answering the above questions will lead you to a good database design, and help you create a database that is useful and usable.
page 304/10/23 Presentation
Database Relationship Modeling
If you start out with poorly arranged data, you could be in big trouble down the line, trouble from which you might never recover.
Even experienced database administrators have been known to discover, halfway into a project, that their data design is flawed and they have no choice but to throw out all the code they've written so far and start from scratch.
As you tackle your first database, be prepared to make some mistakes.
page 404/10/23 Presentation
Greg’s Database
If Greg’s friend Nigel has asked Greg to help
him find a woman to date with similar interests, can he easily query the data with the table we have currently?
SELECT * FROM my_contacts
WHERE gender = 'F'
AND status = 'single'
AND state='TX'
AND seeking LIKE '%single M%'
AND birthday > '1970-08-28'
AND birthday < '1980-08-28'
AND interests LIKE'%animals%‘
AND interests LIKE '%horse%‘
AND interests LIKE '%movies%';
This query will take too long to write as the database grows Your table design should do the heavy lifting for you. Don’t
write convoluted queries to “get around” a badly designed table.
page 504/10/23 Presentation
Other options for searching
Ignore the interests column – No, Interests ARE important. We shouldn’t ignore them, there’s some valuable information in there.
Query just first interest – Only women who had ‘animals’ listed first in their interests will show up in the results, using SUBSTRING_INDEX:
SELECT * FROM my_contacts
WHERE gender = 'F'
AND status = 'single'
AND state='MA'
AND seeking LIKE '%single M%'
AND birthday > '1950-08-28'
AND birthday < '1960-08-28'
AND SUBSTRING_INDEX(interests,’,’,1) = 'animals';
No, you may miss something from another interest column that might be important to the match.
Create multiple columns to hold one interest in each – No, it makes querying even more complicated. Think about all of the AND/OR statements.
page 604/10/23 Presentation
We need to move the non-atomic columns in our table into new tables.
Schema - A representation of all the structures, such as tables and columns, in your database, along with how they connect.
Creating a visual depiction of your database can help you see how things connect when you’re writing your queries, but your schema can also be written in a text format.
Think outside of the single table
page 704/10/23 Presentation
Every time we looked at a table, we either depicted it with the column names across the top and the data below, or we used a DESCRIBE statement.
Those are both fine for single tables, but they’re not very practical to use when we want to create a diagram of multiple tables.
Here’s a shorthand technique for diagramming the current my_contacts table:
The primary key will be referenced
differently in Visio.
page 804/10/23 Presentation
How to go from one table to two
Here’s our current my_contacts table. Our interest column isn’t atomic, and there’s really only one good way to make it atomic: we need a new table that will hold all the interests. Remove the interests column and put it in its own table. Add columns that will let us identify which interests belong to
which person in the my_contacts table. We need a unique column to connect these. Luckily, since we already
started to normalize it, we have a truly unique column in my_contacts: the primary key.
We can’t just use first and last names, what if someone else has the same name?
page 904/10/23 Presentation
The Foreign Key
We can use the value from the primary key in the my_contacts table as a column in the interests table. Better yet, we’ll know which interests belong to which person in the my_contacts table through this column. It’s called a foreign key
The FOREIGN KEY is a column in a table that references the PRIMARY KEY of another table.
page 1004/10/23 Presentation
Foreign key facts
A foreign key can have a different name than the primary key it comes from.
The primary key used by a foreign key is also known as a parent key. The table where the primary key is from is known as a parent table.
The foreign key can be used to make sure that the rows in one table have corresponding rows in another table.
Foreign key values can be null, even though primary key values can’t.
Foreign keys don’t have to be unique—in fact, they often aren’t.
page 1104/10/23 Presentation
Foreign key facts
A NULL foreign key means that there’s no matching primary key in the parent table.
Although you could simply create a table and put in a column to act as a foreign key, it’s not really a foreign key unless you designate it as one when you CREATE or ALTER a table.
The key is created inside of a structure called a constraint. You will only be able to insert values into your foreign key
that exist in the table the key came from, the parent table. This is called
referential integrity.
page 1204/10/23 Presentation
CREATE a table with a FOREIGN KEY
page 1304/10/23 Presentation
Relationships between tables
In the my_contacts table, our problem is that we need to associate lots of people with lots of interests.
Relationships: one-to-one 1:1, one-to-many 1:M, and many-to-many M:M
page 1404/10/23 Presentation
Types of Relationships
One to one - exactly one row of a parent table is related to one row of a child table.
One to many - A record in Table A can have MANY matching records in Table B, but a record in Table B can only match ONE record in Table A.
Many to many
A student takes many classes and a class has many students
page 1504/10/23 Presentation
Patterns of data: one-to-one
In this pattern a record in Table A can have at most ONE matching record in Table B. Say Table A contains your name, and Table B contains your salary details and Social Security
Numbers, in order to isolate them from the rest of the table to keep them more secure. Both tables will contain your ID number so you get the right paycheck. The employee_id in the parent table is a primary key, the employee_id in the child table is a foreign
key. In the schema, the connecting line is plain to show that we’re linking one thing to one thing.
page 1604/10/23 Presentation
When to use one-to-one tables
It generally makes more sense to leave one-to-one data in your main table, but there are a few advantages you can get from pulling those columns out at times: Pulling the data out may allow you to write faster queries. For
example, if most of the time you need to query the SSN and not much else, you could query just the smaller table.
If you have a column containing values you don’t yet know, you can isolate it and avoid NULL values in your main table.
You may wish to make some of your data less accessible. Isolating it can allow you to restrict access to it. For example, if
you have a table of employees, you might want to keep their salary information out of the main table.
If you have a large piece of data, a BLOB type for example, you may want that large data in a separate table.
page 1704/10/23 Presentation
Patterns of data: one-to-many
One-to-many means that a record in Table A can have many matching records in Table B, but a record in Table B can only match one record in Table A.
The connecting line has a black arrow at the end to show that we’re linking one thing to many things.
Each row in the professions table can have many matching rows in my_contacts, but each row in my_contacts has only one matching row in the professions table.
For example, the prof_id for Programmer may show up more than once in my_contacts, but each person in my_contacts will only have one prof_id.
page 1804/10/23 Presentation
Patterns of data: many-to-many
Many women own many pairs of shoes.
If we created a table containing women and another table containing shoes to keep track of them all, we’d need to link many records to many records since more than one woman can own a particular make of shoe.
Suppose Carrie and Miranda buy both the Old Navy Flops and Prada boots, and Samantha and Miranda both have the Manolo Strappies, and Charlotte has one of each. Here’s how the links between the women and shoes tables would look.
page 1904/10/23 Presentation
Sharpen your Pencil
page 2004/10/23 Presentation
Sharpen your Pencil
page 2104/10/23 Presentation
Patterns of data: a junction table
As you just found, adding either primary key to the other table as a foreign key gives us duplicate data in our table.
We need a table to step in between these two many-to-many tables and simplify the relationships to one-to-many.
This table will hold all the woman_id values along with the shoe_id values. We need what is called a junction table, which will contain the primary key columns of the two tables we want to relate.
page 2204/10/23 Presentation
Patterns of data: many-to-many
The secret of the many-to-many relationship—it’s usually made up of two one-to-many relationships, with a junction table in between.
We need to associate ONE person in the my_contacts table with MANY interests in our new interests table.
Each of the interests values could also map to more than one person, so this relationship fits into the many-to-many pattern.
The interests column can be converted into a many-to-many relationship using this schema. Every person can have more than one interest, and for every interest, there can be more than one person who shares it:
page 2304/10/23 Presentation
Questions
Do I have to create the middle table when I have many-to-many relationship?
Yes, you should. If you have a many-to-many relationship between two tables, you’ll end up with repeating groups, violating first normal form.
What if I still don’t mind repeats? Joining tables helps preserve your data integrity. If you have to delete
someone from my_contacts, you never touch the interests table, just the contact_interest table. Without the separate table, you could accidentally remove the wrong records.
And when it comes to updating info, it’s also nice
page 2404/10/23 Presentation
Name that Relationship
page 2504/10/23 Presentation
Questions?
We will have part 2 of data modeling, then examples and exercises will follow.