5. Topic 5(COLON) Data Structures and Program Design

Lesson 5.4: Searching, Sorting And Common Data Structures

Official syllabus section covering Lesson 5.4: Searching, Sorting and Common Data Structures within Topic 5: Data Structures and Program Design: Linear search and binary search; the precondition (sorted data) and trade-off of binary search.; Bubble sort and insertion sort as introductory sorting algorithms, traced step by step..

Lesson 5.4: Searching, Sorting and Common Data Structures

Introduction

In this lesson, we will explore the essential concepts of searching algorithms, sorting algorithms, and common data structures as part of your foundation in information technology. By the end of this lesson, students, you will be able to:

  • Understand how linear search and binary search work, including the preconditions and trade-offs associated with each.
  • Describe and trace the bubble sort and insertion sort algorithms step by step.
  • Explain the differences between arrays, stacks, and queues, with realistic use cases for each.
  • Gain a qualitative understanding of hash tables, including how they facilitate quick lookups and what collisions are.
  • Trace and explain the operations of linear search, binary search, and simple sorting algorithms.

Let's begin by diving into searching algorithms!

Searching Algorithms

Searching algorithms are crucial for finding data within collections. Understanding these algorithms is vital for efficiency in programming.

1. Linear Search

Linear search is the simplest searching algorithm. It checks each element in a list until the desired element is found or the list ends. This method works with both sorted and unsorted data.

Example of Linear Search

Imagine you have a list of numbers: [3, 5, 2, 4, 1]. You want to find the number 4.

  1. Start with the first element (3); it is not 4.
  2. Move to the second element (5); it is not 4.
  3. Move to the third element (2); it is not 4.
  4. Move to the fourth element (4); we found it!

Thus, the linear search locates the number 4 at index 3 of the list.

2. Binary Search

Binary search is a more efficient algorithm but requires the data to be sorted beforehand. This method repeatedly divides the search interval in half, significantly reducing the search time compared to linear search.

Precondition: The list must be sorted.

Trade-off: The setup time to sort the list can outweigh the benefits if you are searching very few times, or you may incur a higher cost to keep it sorted during insertion or deletion operations.

Example of Binary Search

Let's say we have a sorted list: [1, 2, 3, 4, 5]. We want to find the number 4.

  1. Start with the entire list. The middle index is (0 + 4) / 2 = 2. The middle element is 3.
  2. Since 4 > 3, we check the right half of the list: [4, 5].
  3. The new middle index is (3 + 4) / 2 = 3. The middle element is 4.

We found the number 4 at index 3! The binary search operates with a time complexity of $O(\log n)$ compared to the linear search's $O(n)$.

Sorting Algorithms

Sorting algorithms are pivotal for data organization. They allow for better search performance and data management. We will discuss two simple sorting algorithms: bubble sort and insertion sort.

1. Bubble Sort

Bubble sort is one of the most straightforward sorting algorithms. It repeatedly steps through the list, compares adjacent pairs, and swaps them if they are in the wrong order. This process is repeated until no swaps are needed.

Example of Bubble Sort

Consider the list [5, 3, 8, 4].

  1. Compare the first two elements (5 and 3): swap them since 5 > 3 → [3, 5, 8, 4].
  2. Compare the next pair (5 and 8): no swap → [3, 5, 8, 4].
  3. Compare (8 and 4): swap them since 8 > 4 → [3, 5, 4, 8].

Repeat the process:

  1. First Pass: [3, 4, 5, 8] (all elements now in order).

Bubble sort has a time complexity of $O(n^2)$. While easy to understand, it is inefficient for larger datasets.

2. Insertion Sort

Insertion sort builds a sorted array one element at a time. It works similarly to sorting playing cards. The algorithm divides the list into sorted and unsorted sections, gradually expanding the sorted section by inserting the next unsorted element in its correct position.

Example of Insertion Sort

Take the list [5, 3, 8, 4].

  1. Start with the first element (5); it is already sorted → [5].
  2. Take the next element (3); compare and insert 3 before 5 → [3, 5].
  3. Next, take 8; it goes after 5 → [3, 5, 8].
  4. Finally, insert 4 appropriately: [3, 4, 5, 8].

Insertion sort also has a time complexity of $O(n^2)$ but performs better than bubble sort in practice due to fewer swaps in many cases.

Common Data Structures

Understanding data structures is essential for effective programming and managing data. Let’s discuss three fundamental data structures: arrays, stacks, and queues.

1. Arrays

Arrays are data structures that store elements in a fixed-size sequential collection. Each element can be accessed using an index, making arrays efficient for accessing elements directly. Arrays require contiguous memory allocation.

Realistic Use: Storing temperature data for a week: [30, 32, 31, 29, 28, 31, 30]. Each index represents a day of the week.

2. Stacks

Stacks are collections of elements that follow the Last In, First Out (LIFO) principle. The last element added to the stack will be the first to be removed. Operations include push (adding an element) and pop (removing the top element).

Realistic Use: Undo functionality in applications; the last actions taken are stored on a stack and can be undone in reverse order.

3. Queues

Queues store elements in a First In, First Out (FIFO) manner, where the first element added will be the first to be removed. The primary operations are enqueue (adding an element) and dequeue (removing the front element).

Realistic Use: A printer queue that manages print jobs, where the first job sent to the printer is the first one printed.

Hash Tables

A hash table is a data structure that offers fast data retrieval and storage. It uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

However, when two keys hash to the same index, a collision occurs. To handle collisions, various techniques can be employed, such as chaining (linking elements at the same index) or open addressing.

Qualitative Understanding of Hash Tables

Hash tables provide average-case time complexity of $O(1)$ for lookups, making them incredibly efficient for searching compared to arrays or linked lists.

Conclusion

In this lesson, we have covered significant concepts around searching and sorting algorithms, as well as fundamental data structures. MASTERING these topics will enhance your programming skills and set a strong foundation for your further studies in information technology. Try practicing these algorithms and data structures to better understand their applications and performance.

Study Notes

  • Linear Search: Sequentially finds an element; $O(n)$ complexity.
  • Binary Search: Divides and conquers searching; requires sorted data; $O(\log n)$ complexity.
  • Bubble Sort: Simple sorting; performs numerous swaps; $O(n^2)$ complexity.
  • Insertion Sort: Builds sorted array gradually; better than bubble sort typically; $O(n^2)$ complexity.
  • Arrays: Fixed-size, indexable collections of data.
  • Stacks: LIFO data structure; used for undo operations.
  • Queues: FIFO data structure; used in task scheduling.
  • Hash Tables: Fast data retrieval; uses hash functions; handles collisions.

Practice Quiz

5 questions to test your understanding