Detailed Explanation of Database Normal Forms
Problem Description: Database normal forms are a series of specifications followed when designing relational databases, aiming to reduce data redundancy and improve data consistency. The main normal forms include 1NF, 2NF, 3NF, BCNF, etc. Each normal form builds upon the previous one with stricter rules.
Problem-Solving Process:
Step 1: Understanding First Normal Form (1NF)
- Definition: Every attribute (column) in a relation must be atomic, i.e., an indivisible, minimal data unit.
- Criteria:
- Each column contains a single value; arrays, sets, or nested tables are not allowed.
- All values in a column must be of the same data type.
- Each column has a unique name.
- Example Analysis:
-
Table violating 1NF (Student Course Selection):
Student ID Student Name Selected Courses 1001 Zhang San Math, English 1002 Li Si Physics, Chemistry -
Converted to 1NF:
Student ID Student Name Selected Course 1001 Zhang San Math 1001 Zhang San English 1002 Li Si Physics 1002 Li Si Chemistry
-
Step 2: Mastering Second Normal Form (2NF)
- Prerequisite: The table must already satisfy 1NF.
- Definition: In addition to satisfying 1NF, all non-key attributes must be fully functionally dependent on the entire primary key (no partial dependency is allowed).
- Steps:
- Identify the table's primary key (which may be composite).
- Check if each non-key attribute is fully dependent on the entire primary key.
- If an attribute depends only on part of the primary key, split the table.
- Example Analysis:
-
Original table (Student Course Grades):
Student ID Course ID Student Name Course Name Grade - Primary key: (Student ID, Course ID)
- Issue: Student Name depends only on Student ID (partial dependency), Course Name depends only on Course ID (partial dependency).
-
Solution: Split into three tables:
- Students(Student ID, Student Name)
- Courses(Course ID, Course Name)
- Enrollments(Student ID, Course ID, Grade)
-
Step 3: Understanding Third Normal Form (3NF)
- Prerequisite: The table must already satisfy 2NF.
- Definition: In addition to satisfying 2NF, eliminate transitive dependencies (non-key attributes must not depend on other non-key attributes).
- Criterion: There should be no situation where non-key attribute A depends on non-key attribute B, and B in turn depends on the primary key.
- Example Analysis:
-
Original table (Student Information):
Student ID Student Name Department Department Address Department Phone - Primary key: Student ID
- Issue: Department Address and Department Phone depend on Department, while Department depends on Student ID (transitive dependency).
-
Solution: Split into two tables:
- Students(Student ID, Student Name, Department)
- Departments(Department Name, Department Address, Department Phone)
-
Step 4: Introduction to BCNF (Boyce-Codd Normal Form)
- Definition: A stricter normal form than 3NF, requiring that every determinant must be a candidate key.
- Core Idea: The left-hand side of every functional dependency in the table must contain a candidate key.
- Difference from 3NF: 3NF allows transitive dependencies where a prime attribute depends on a candidate key, while BCNF does not.
- Example Analysis:
- Consider table: Enrollment(Student, Course, Teacher)
- Assumption: A teacher teaches only one course, but a course can have multiple teachers.
- Functional dependencies: Course → Teacher, (Student, Course) → Teacher
- Issue: Course → Teacher, but Course is not a candidate key.
- Solution: Split into (Student, Course) and (Course, Teacher).
- Consider table: Enrollment(Student, Course, Teacher)
Step 5: Practical Application of Normal Forms
- Trade-offs: Higher normal forms reduce data redundancy but may require more joins for queries.
- Practical Applications:
- Designing up to 3NF or BCNF usually meets most business requirements.
- Data warehouse designs may intentionally use denormalization to improve query performance.
- Balance between normalization and performance based on specific business scenarios is necessary.
Summary: Database normal forms are progressive, with each building upon the previous ones. Understanding normal forms centers on identifying and managing data dependencies, ensuring data integrity and consistency through rational table structure design.