Binary Search Trees 🌳

students, today you will learn one of the most important abstract data structures in Computer Science: the Binary Search Tree (BST). Binary search trees are a powerful way to store data so that searching, inserting, and deleting can often be done efficiently. They are a key part of the IB Computer Science HL extension on Abstract Data Structures because they show how abstraction, organization, and efficiency work together in real systems.

What you will learn

By the end of this lesson, students, you should be able to:

explain the main ideas and vocabulary of binary search trees,
apply BST rules to place and find values,
describe why BSTs can be efficient,
connect BSTs to linked and abstract structures,
use examples to show how BSTs work in real computing tasks.

Imagine a school library where books are arranged in a way that helps you find a title faster than checking every shelf one by one 📚. A binary search tree works in a similar way: it stores values in a pattern that helps the computer decide where to look next. This lesson will show how that pattern works and why it matters.

What is a Binary Search Tree?

A binary search tree is a binary tree with an extra ordering rule.

A binary tree is a tree data structure where each node has at most two children: a left child and a right child.

A node usually stores:

a value,
a link to the left subtree,
a link to the right subtree.

The key BST rule is:

all values in the left subtree are less than the node value,
all values in the right subtree are greater than the node value.

This rule must hold for every node in the tree, not just the root.

For example, if the root contains $50$, then values like $20$ and $40$ may go in the left subtree, while values like $60$ and $80$ may go in the right subtree. If $20$ itself has children, those children must also follow the same rule relative to $20$.

This structure is called abstract because we focus on the behavior and rules of the data structure, not the exact physical storage details. A BST is a good example of abstraction in computer science: it gives us a model for organizing data efficiently.

How Binary Search Trees store data

To build a BST, values are inserted one by one using comparisons.

Suppose we insert the values $50$, $30$, $70$, $20$, $40$, $60$, and $80$.

Start with $50$ as the root.
Insert $30$: since $30 < 50$, it goes left.
Insert $70$: since $70 > 50$, it goes right.
Insert $20$: compare with $50$, then $30$; since $20 < 30$, it goes left of $30$.
Insert $40$: compare with $50$, then $30$; since $40 > 30$, it goes right of $30$.
Insert $60$: compare with $50$, then $70$; since $60 < 70$, it goes left of $70$.
Insert $80$: compare with $50$, then $70$; since $80 > 70$, it goes right of $70$.

A BST can be drawn like this:

$\begin{array}{c}$

50 \\

/ \ \\

$30 \quad 70 \\$

$/ \ \quad / \ \\$

20 \quad 40 \quad 60 \quad 80

$\end{array}$

Real-world idea: think of organizing names in a contact list by repeatedly splitting the list into smaller groups based on alphabetical order 📱. Each comparison tells you which branch to follow.

Searching in a Binary Search Tree

Searching in a BST is similar to playing a guessing game where each answer removes half of the remaining possibilities.

To search for a value:

compare it with the current node,
if it is smaller, go left,
if it is larger, go right,
if it matches, the value has been found.

Example: search for $60$ in the tree above.

Compare $60$ with $50$: $60 > 50$, so go right.
Compare $60$ with $70$: $60 < 70$, so go left.
Compare $60$ with $60$: match found.

This process is efficient because the tree rules avoid checking every node.

If a BST is balanced, its height is about $\log_2(n)$, where $n$ is the number of nodes. That means searching can often be done in $O(\log n)$ time. In a very unbalanced BST, the height can become $n$, making searching as slow as $O(n)$.

The idea is important for IB Computer Science HL reasoning: the performance of an abstract data structure depends on its shape, not just its rules.

Inserting and deleting values

Insertion in a BST follows the same comparison logic as searching.

To insert a new value:

Start at the root.
Compare the new value with the current node.
Move left or right according to the BST rule.
Continue until an empty position is found.
Insert the new node there.

Example: insert $65$ into the earlier tree.

$65 > 50$, so move right.
$65 < 70$, so move left.
$65 > 60$, so move right.
The right child of $60$ is empty, so place $65$ there.

Deletion is more complicated because the BST rule must still hold after removal. There are three main cases:

The node is a leaf: it can be removed directly.
The node has one child: replace the node with its child.
The node has two children: replace it with either its in-order successor or in-order predecessor, then remove that replacement node from its original position.

The in-order successor is the smallest value in the right subtree. The in-order predecessor is the largest value in the left subtree.

Example: if you delete $50$ from the earlier tree, one valid replacement is $60$, the smallest value in the right subtree. After moving $60$ to the root, the tree can still satisfy the BST rules.

This shows why BSTs are useful but also why implementation details matter. Some operations are straightforward, while others require careful restructuring.

Traversals and ordered output

A BST supports different tree traversals, which are ways to visit nodes in a specific order.

The most important traversal for BSTs is in-order traversal:

visit the left subtree,
visit the node,
visit the right subtree.

For a BST, in-order traversal produces the values in ascending order. This is a major advantage.

Using the earlier tree, an in-order traversal gives:

20,\ 30,\ 40,\ 50,\ 60,\ 70,\ 80

That means a BST can store data and also make it easy to retrieve the values in sorted order. This is useful in applications like grade lists, inventory systems, and database indexing.

Other traversals include:

pre-order: node, left, right,
post-order: left, right, node,
level-order: visit nodes by each level from top to bottom.

Each traversal has a purpose, but for ordered output, in-order traversal is the key BST example.

Efficiency, balance, and real systems

Efficiency is a central idea in HL Extension — Abstract Data Structures. A data structure should not only work; it should help the computer solve problems quickly and with manageable memory use.

For a BST:

search can be $O(\log n)$ in a balanced tree,
insertion can be $O(\log n)$ in a balanced tree,
deletion can also be $O(\log n)$ in a balanced tree.

However, if values are inserted in already sorted order, the tree may become like a linked list. For example, inserting $10$, $20$, $30$, $40$ in that order creates a tree where every node has only a right child. Then operations may take $O(n)$ time.

This is why some systems use self-balancing trees, such as AVL trees or red-black trees, to keep the height small. Those are beyond the basic BST, but they show how the BST idea can be improved to maintain efficiency.

Real-world example: search engines, file systems, and data indexing structures often rely on tree-based ideas to access information quickly. BSTs help explain the abstract logic behind that organization 💻.

Conclusion

Binary search trees are a core example of an abstract data structure because they combine a clear rule, a useful organization method, and measurable performance benefits. students, you should now understand that a BST is a binary tree where left values are smaller and right values are larger, and that this rule makes searching, inserting, and traversal efficient when the tree is well balanced.

BSTs connect directly to the HL Computer Science focus on abstraction and efficiency. They show how choosing the right data structure can make programs faster and more organized. They also prepare you for more advanced structures built on the same ideas.

Study Notes

A binary search tree is a binary tree with the ordering rule: left subtree values are less than the node, and right subtree values are greater than the node.
Each node usually contains a value and links to left and right children.
Searching follows comparisons: go left if the value is smaller, go right if it is larger.
Insertion uses the same comparison process until an empty position is found.
Deletion has three main cases: leaf, one child, or two children.
The in-order successor is the smallest value in the right subtree.
The in-order predecessor is the largest value in the left subtree.
In-order traversal of a BST produces values in ascending order.
Balanced BST operations are often $O(\log n)$, while unbalanced trees can degrade to $O(n)$.
BSTs are important in HL Abstract Data Structures because they show how abstraction can improve efficiency and data organization.