Try OpenEdge Now
skip to main content
Database Essentials
Database Design : Normalization : First normal form
 

First normal form

The first rule of normalization is that you must remove duplicate columns or columns that contain more than one value to a new table. The columns of a table in the first normal form have these characteristics:
*They contain only one value
*They occur once and do not repeat
First, examine an un-normalized Customer table, as shown in the following figure.
Table 2. Un-normalized Customer table with several values in a column
Cust Num
Name
Street
Order Number
101
Jones, Sue
2 Mill Ave.
M31, M98, M129
102
Hand, Jim
12 Dudley St.
M56
103
Lee, Sandy
45 School St.
M37, M40
104
Tan, Steve
67 Main St.
M41
Here, the Order Number column has more than one entry. This makes it very difficult to perform even the simplest tasks, such as deleting an order, finding the total number of orders for a customer, or printing orders in sorted order. To perform any of those tasks, you need a complex algorithm to examine each value in the Order Number column for each row. You can eliminate the complexity by updating the table so that each column in a table consists of exactly one value.
The following figure shows the same Customer table in a different un-normalized format which contains only one value per column.
Table 3. Un-normalized Customer table with multiple duplicate columns
Cust Num
Name
Street
Order Number1
Order Number2
Order Number3
101
Jones, Sue
2 Mill Ave.
M31
M98
M129
102
Hand, Jim
12 Dudley St.
M56
Null
Null
103
Lee, Sandy
45 School St.
M37
M140
Null
104
Tan, Steve
67 Main St.
M41
Null
Null
Here, instead of a single Order Number column, there are three separate but duplicate columns for multiple orders. This format is also not efficient. What happens if a customer has more than three orders? You must either add a new column or clear an existing column value to make a new entry. It is difficult to estimate a reasonable maximum number of orders for a customer. If your business is brisk, you might have to create 200 Order Number columns for a row. But if a customer has only 10 orders, the database will contain 190 null values for this customer.
Furthermore, it is difficult and time consuming to retrieve data with repeating columns. For example, to determine which customer has Order Number M98, you must look at each Order Number column individually (all 200 of them) in every row to find a match.
To reduce the Customer table to the first normal form, split it into two smaller tables, one table to store only Customer information and another to store only Order information. Table 4 shows the normalized Customer table, and Table 5 shows the new Order table.
Table 4. Customer table reduced to first normal form
Cust Num (Primary key)
Name
Street
101
Jones, Sue
2 Mill Ave.
102
Hand, Jim
12 Dudley St.
103
Lee, Sandy
45 School St.
104
Tan, Steve
67 Main St.
Table 5. Order table created when normalizing the Customer table
Order Number (Primary key)
Cust Num (Foreign key)
M31
101
M98
101
M129
101
M56
102
M37
103
M140
103
M41
104
There is now only one instance of a column in the Customer and Order tables, and each column contains exactly one value. The Cust Num column in the Order table relates to the Cust Num column in the Customer table.
A table that is normalized to the first normal form has these advantages:
*It allows you to create any number of orders for each customer without having to add new columns.
*It allows you to query and sort data for orders very quickly because you search only one column—Order Number.
*It uses disk space more efficiently because no empty columns are stored.