Normalize data to third normal form (3NF)

Resources | Subject Notes | Information Technology IT

Database Normalization to 3NF - IT 9626

Database and File Concepts - Normalization to 3NF

This document provides detailed notes on normalizing data to Third Normal Form (3NF), a crucial concept in database design. Normalization aims to reduce data redundancy and improve data integrity.

What is Data Normalization?

Data normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing databases into two or more tables and defining relationships between the tables.

Why Normalize?

Normalization helps address several problems associated with unnormalized data:

  • Redundancy: Minimizes duplicate data, saving storage space and reducing update anomalies.
  • Update Anomalies: Prevents inconsistencies that can occur when updating redundant data.
  • Insertion Anomalies: Allows for the insertion of new data without requiring all related information to be present.
  • Deletion Anomalies: Prevents unintended loss of data when deleting records.

Normal Forms

There are several normal forms, each with increasing levels of strictness. We will focus on 3NF.

  1. First Normal Form (1NF): Each cell in the table contains a single value. No repeating groups.
  2. Second Normal Form (2NF): Must be in 1NF and all non-key attributes must be fully functionally dependent on the primary key.
  3. Third Normal Form (3NF): Must be in 2NF and no non-key attribute is transitively dependent on the primary key.

Third Normal Form (3NF)

A table is in 3NF if it meets the following conditions:

  • It is in 2NF.
  • No non-key attribute is transitively dependent on the primary key.

Transitive Dependency: A non-key attribute A is transitively dependent on a key B if there exists another non-key attribute C such that B determines C and C determines A.

How to Normalize to 3NF

The process of normalizing to 3NF involves identifying and removing transitive dependencies by splitting the table into multiple tables.

Steps for Normalization to 3NF

  1. Identify the primary key(s) of the table.
  2. Identify all non-key attributes.
  3. Check for transitive dependencies.
  4. If transitive dependencies exist, create new tables to eliminate them.
  5. Define relationships between the new tables using foreign keys.

Example: Order Details

Consider an `Orders` table with the following attributes:

Order ID Customer ID Customer Name Customer Address Order Date Product ID Product Name Product Price Quantity
1 101 John Doe 123 Main St 2023-10-26 A1 Widget 10.00 2

This table has several issues: customer information is repeated for each order, and product information is also repeated. This violates 3NF.

Primary Key: Order ID

Non-key Attributes: Customer ID, Customer Name, Customer Address, Product ID, Product Name, Product Price

Transitive Dependency: Order ID -> Customer ID -> Customer Name, Customer Address. Order ID -> Product ID -> Product Name, Product Price.

To normalize this table to 3NF, we can create three separate tables:

1. Orders Table

Order ID Customer ID Order Date
1 101 2023-10-26

2. Customers Table

Customer ID Customer Name Customer Address
101 John Doe 123 Main St

3. Order Items Table

Order ID Product ID Quantity
1 A1 2

4. Products Table

Product ID Product Name Product Price
A1 Widget 10.00

Relationships:

  • Orders.Customer ID references Customers.Customer ID
  • Order Items.Order ID references Orders.Order ID
  • Order Items.Product ID references Products.Product ID

This 3NF design eliminates data redundancy and ensures data integrity. Changes to customer information only need to be made in one place (the Customers table).

Conclusion

Normalizing data to 3NF is a fundamental aspect of database design. It leads to more efficient, reliable, and maintainable databases. Understanding the principles of normalization is essential for any IT professional working with databases.