Lesson 7.2: Database Design and Normalisation
Introduction
Welcome to Lesson 7.2 of Foundation Computing! In this lesson, we will dive into database design and normalisation—two critical components in the field of database management. Understanding how to effectively structure data not only helps in efficient data storage but also aids in ensuring that the data remains accurate and consistent.
Learning Outcomes
By the end of this lesson, students will be able to:
- Identify entities, attributes, and relationships from a given scenario.
- Create entity-relationship diagrams (ER diagrams) to model one-to-many and many-to-many relationships.
- Understand data redundancy and anomalies, along with normalisation, particularly to the third normal form (3NF).
- Recognize the importance of referential integrity and validation rules in databases.
- Model a real-world scenario with entities, attributes, and relationships in an ER diagram.
Understanding Entities and Attributes
Let's begin by defining our core concepts: entities and attributes. An entity represents a person, object, concept, or event within the domain being modeled. For instance, in a school database, entities could include Students, Courses, and Teachers.
An attribute describes properties of an entity. For the Students entity, possible attributes could include:
StudentIDNameDateOfBirth
Real-World Example
Imagine a library system where we have entities such as Books and Members. Here’s how we would define them:
- Entity:
Books - Attributes:
ISBNTitleAuthorPublicationYear- Entity:
Members - Attributes:
MemberIDFullNameJoinDate
These definitions help structure our eventual database design, ensuring every piece of data has a clear place.
Relationships Between Entities
Entities often interconnect in various ways. Understanding these relationships is key for database design.
One-to-Many Relationships
A one-to-many relationship occurs when a single entity is related to multiple entities. For example, one Teacher can teach multiple Courses:
- Entities:
Teacher,Course - Relationship: One
Teacherteaches manyCourses
In an ER diagram, this is represented by a line connecting Teacher to Course with a "1" next to Teacher and a "∞" next to Course, indicating the relationship's nature.
graph TD
A[Teacher] -- 1 -->|teaches| B[Course]
B[Course] -- ∞ -->|has| C[Student]
Many-to-Many Relationships
A many-to-many relationship occurs when multiple entities can be associated with multiple other entities. For instance, Students can enroll in many Courses, and each Course can have many Students:
- Entities:
Student,Course - Relationship: Many
Studentsenroll in manyCourses
In an ER diagram, we typically use a junction table to resolve many-to-many relationships. This involves creating a new entity that holds references to both related entities, e.g., Enrollments:
graph TD
A[Student] -- ∞ -->|enrolls| B[Enrollments]
B[Enrollments] -- ∞ -->|includes| C[Course]
Data Redundancy and Anomalies
When designing databases, one challenge we often encounter is data redundancy—the unnecessary duplication of data. For instance, having multiple records for the same Student can lead to inconsistencies, such as differing Addresses for the same StudentID.
Anomalies
Anomalies often arise from poorly designed databases, leading to:
- Insertion Anomaly: Difficulty in adding data due to the absence of other related data.
- Update Anomaly: Inconsistent data following updates across different records.
- Deletion Anomaly: Unintentional loss of data when records are deleted.
To combat these issues, we apply normalisation.
Introduction to Normalisation
Normalisation is the process of organizing the fields (attributes) and tables of a database to minimize redundancy and improve data integrity. We'll focus on achieving Third Normal Form (3NF), which requires:
- The database is in Second Normal Form (2NF).
- All attributes must depend only on the primary key.
Steps to Normalize to 3NF
- Identify Functional Dependencies: Determine how attributes relate to the primary keys.
- Eliminate Redundant Data: Organize data so no non-key attributes depend on other non-key attributes.
- Define New Tables for New Entities: Create new entities as needed to separate the data effectively.
For example, if we find that CourseName depends only on CourseID, we should make sure they are appropriately separated in our tables.
Referential Integrity and Validation Rules
Referential integrity ensures that relationships between tables remain consistent. If a Course exists, it should not relate to a nonexistent Student. This is facilitated by foreign keys.
Validation rules enforce data correctness. For instance, DateOfBirth should always precede JoinDate, and future dates for any attribute should be invalid.
Accommodating the Real World
When designing your database to represent any real-world scenario, be sure to consider how entities interact, their relationships, and the potential for data anomalies. This will help you create a robust and reliable database.
Conclusion
We've covered a lot in this lesson about database design, focusing on entities, relationships, data redundancy, and normalisation. Understanding these concepts will allow students to build effective databases and manage data efficiently.
Study Notes
- Entities are the objects in a database (e.g., Students, Courses).
- Attributes describe properties of those entities (e.g., Name, ISBN).
- Use ER diagrams to visualize relationships: one-to-many and many-to-many.
- Normalisation helps eliminate redundancy and anomalies.
- Referential integrity maintains consistent relationships between tables.
- Apply validation rules to ensure data accuracy.
