Database Normalization Part II
In our last installment, we discussed what database normalization is and how
to get our tables into first normal form.
In this installment, we will talk about normalization in the second and third
normal forms. Since normalization is progressive, that is each normal form
builds on the previous forms, we assume that our tables are already in first
normal form. For a review of first normal form, see the previous tutorial
labeled "Database Normalization."
Second Normal Form
The second normal form applies only to tables with multiple field primary
keys. Take the following table for example.
| Rep ID* | Rep First Name | Rep Last Name | Client ID* | Client | Time With Client |
| TS-89 | Gilroy | Gladstone | 978 | US Corp | 14 hrs |
| TS-89 | Gilroy | Gladstone | 665 | Taggarts | 26 hrs |
| TS-89 | Gilroy | Gladstone | 782 | Kilroy Inc. | 9 hrs |
| RK-56 | Mary | Mayhem | 221 | Italiana | 67 hrs |
| RK-56 | Mary | Mayhem | 982 | Linkers | 2 hrs |
| RK-56 | Mary | Mayhem | 665 | Taggarts | 4 hrs |
This table is already in first normal form. It has a primary key
consisting of Rep ID and Client ID since neither alone can be considered a
unique value.
The second normal form states that each field in a multiple field primary key
table must be directly related to the entire primary key. Or in other words,
each non-key field should be a fact about all the fields in the primary key.
Only fields that are absolutely necessary should show up in our table, all other
fields should reside in different tables. In order to find out which
fields are necessary we should ask a few questions of our database. In our
preceding example, I should ask the question "What information is this
table meant to store?" Currently, the answer is not obvious. It may be
meant to store information about individual clients, or it could be
holding data for employees time cards. As a further example, if my
database is going to contain records of employees I may want a table of
demographics and a table for payroll. The demographics will have all the
employees personal information and will assign them an ID number. I should
not have to enter the data twice, the payroll table on the other hand should
refer to each employee only by their ID number. I can then link the two tables
by a relationship and will then have access to all the necessary
data.
In the table of the preceding example we are devoting three field to the
identification of the employee and two to the identification of the
client. I could identify them with only one field each -- the primary
key. I can then take out the extraneous fields and put them in their own
table. For example, my database would then look like the following.
| Rep ID* | Client ID* | Time With Client |
| TS-89 | 978 | 14 hrs |
| TS-89 | 665 | 26 hrs |
| TS-89 | 782 | 9 hrs |
| RK-56 | 221 | 67 hrs |
| RK-56 | 982 | 2 hrs |
| RK-56 | 665 | 4 hrs |
The above table contains time card information.
| Rep ID* | First Name | Last Name |
| TS-89 | Gilroy | Gladstone |
| RK-56 | Mary | Mayhem |
The above table contains Employee Information.
| Client ID* | Client Name |
| 978 | US Corp |
| 665 | Taggarts |
| 782 | Kilroy Inc. |
| 221 | Italiana |
| 982 | Linkers |
The above table contains Client Information
These tables are now in normal form. By splitting off the unnecessary
information and putting it in its own tables, we have eliminated redundancy and
put our first table in second normal form. These tables are now ready to
be linked through relationship to each other.
Third Normal Form
Third normal form is the same as second normal form except that it only
refers to tables that have a single field as their primary key. In other
words, each non-key field in the table should be a fact about the primary key.
Either of the preceding two tables act as an example of third normal form since
all the fields in each table are necessary to describe the primary key.
Once all the tables in a database have been taken through the third normal
form, we can begin to set up relationships. In out next installment, we
will discuss relationships and how using them allows us greater flexibility in
our database.