6. Topic 6(COLON) Databases and SQL

Lesson 6.2: Database Design And Normalisation

Official syllabus section covering Lesson 6.2: Database Design and Normalisation within Topic 6: Databases and SQL: Identifying entities, attributes and relationships from a scenario.; Entity-relationship diagrams and modelling one-to-many and many-to-many relationships..

Lesson 6.2: Database Design and Normalisation

Introduction

In this lesson, students will explore the core concepts of database design and normalization step-by-step. Understanding how to structure a database is crucial to ensuring that data can be managed efficiently. We will learn how to identify entities, attributes, and relationships within data scenarios, and how to use entity-relationship diagrams (ER diagrams) to map these elements. We'll also cover the importance of normalization to reduce data redundancy and how to maintain referential integrity through validation rules. By the end of this lesson, students should be able to model a scenario using an ER diagram while adhering to best practices in database design.

Learning Objectives:

  • Identify entities, attributes, and relationships from a scenario.
  • Create entity-relationship diagrams and model one-to-many and many-to-many relationships.
  • Understand data redundancy and anomalies, and introduce normalization to the third normal form qualitatively.
  • Explain referential integrity and validation rules.
  • Model a scenario as entities, attributes, and relationships in an ER diagram.

Understanding Entities, Attributes, and Relationships

Entities

An entity is a thing or object in the real world that is distinguishable from other objects. Entities can be singular or plural and include tangible items like a car, a student, or an order, as well as intangible concepts like a course or a project. For example, if we are designing a database for a school, the following might be our entities:

  • Student
  • Course
  • Instructor
  • Enrollment

Example: Identifying Entities

Consider a simple scenario where we have a bookstore. In our case, the entities might include:

  • Book
  • Author
  • Customer
  • Order

Attributes

Attributes are the properties or details of an entity. They give us information about each entity, thus providing the necessary context. Using the bookstore scenario, let’s define attributes for the entity Book:

  • Title
  • ISBN
  • Price
  • Publication Year
  • Author (a reference to the Author entity)

Example: Defining Attributes

In our bookstore scenario, the Customer entity's attributes could be:

  • Customer ID
  • Name
  • Email
  • Phone Number
  • Address

Relationships

A relationship represents how entities interact with one another. Relationships can be classified as one-to-one, one-to-many, or many-to-many. Understanding these relationships helps in effective database design.

  • One-to-One (1:1): A single entity from one set is related to a single entity from another set. Example: A person can have only one passport.
  • One-to-Many (1:M): A single entity from one set is related to multiple entities in another set. Example: One author can write multiple books.
  • Many-to-Many (M:N): Multiple entities on both sides can relate to each other. Example: Students can enroll in multiple courses, and each course can have multiple students.

Example: Identifying Relationships

In the bookstore scenario, we can model the relationships as follows:

  • Customer places Order (1:M)
  • Order contains Book (M:N)
  • Author writes Book (1:M)

Entity-Relationship Diagrams (ER Diagrams)

Entity-Relationship diagrams visually map out the entities, attributes, and relationships in a database. They serve as fundamental tools in the design phase of database creation.

Creating an ER Diagram

To create an ER diagram, follow these steps:

  1. Identify the Entities: Determine the main subjects of your database.
  2. Define the Attributes: Specify important details for each entity.
  3. Establish Relationships: Determine how entities are related.
  4. Draw the Diagram: Use standardized symbols to represent entities, attributes, and relationships.

Example: ER Diagram for Bookstore

For our bookstore example, we would represent our entities and relationships with the following components:

  • Rectangles for entities (Book, Customer, Order, Author).
  • Diamonds for relationships (places, contains, writes).
  • Ovals for attributes (Title, Customer ID, ISBN).

Here is a simple representation:

[Customer] - (places) - [Order] - (contains) - [Book]

[Author] - (writes) - [Book]

Normalization

Normalization is the process of organizing a database to reduce redundancy and improve data integrity. The primary goal is to ensure that data is stored logically and efficiently. Normalization usually involves several stages, called normal forms (NF).

First Normal Form (1NF)

1NF requires that all columns in a table contain atomic values, meaning each value is indivisible. It also requires that each record in a table be unique. In practical terms, no repeating groups or arrays are allowed in the table.

Second Normal Form (2NF)

To achieve 2NF, a database must be in 1NF, and all non-key attributes must depend only on the primary key. Any partial dependency on a composite key should be eliminated.

Third Normal Form (3NF)

A database is in 3NF if it is in 2NF, and all its attributes are not only dependent on the primary key but are also independent of one another (no transitive dependency).

Example: Applying Normalization

Consider a table storing information about students and their courses:

Student IDStudent NameCourse NameInstructor Name
1AliceDatabase SystemsProf. Smith
2BobNetwork SecurityDr. Jones

This design has redundancy, as the instructor's name repeats. Normalize it by separating instructor details into another entity:

  1. Students:
  • Student ID
  • Student Name
  1. Courses:
  • Course Name
  • Instructor Name
  • Course ID
  • Student ID (as a foreign key)

Referential Integrity

Referential integrity ensures that relationships between entities remain consistent. It means if an entity references another entity, the referenced entity must exist.

Example: Enforcing Referential Integrity

In our bookstore, if we link an order to a customer through the Customer ID, we must ensure every order references an existing customer in the database. If Customer ID 3 is deleted, all orders referencing that customer must either be deleted or updated to avoid orphaned records.

Conclusion

In conclusion, understanding how to design and normalize databases is fundamental for efficient data management. By identifying entities, attributes, and relationships, and translating them into ER diagrams, students can create well-structured databases that minimize redundancy and ensure integrity through normalization and referential integrity rules. Successful database design is essential for the smooth operation of information technology applications.

Study Notes

  • An entity is a distinct object, and an attribute provides details about it.
  • Relationships are categorized into one-to-one, one-to-many, and many-to-many.
  • A normalized database reduces redundancy and maintains integrity.
  • Referential integrity is critical for consistent data relationships.
  • An ER diagram visually represents the entities, attributes, and relationships in a database.

Practice Quiz

5 questions to test your understanding