5. HL Extension — Abstract Data Structures

Trees

Trees 🌳

Welcome, students. In this lesson, you will learn one of the most important abstract data structures in computer science: trees. Trees help computers organize information in a way that is fast, flexible, and easy to search. They are used in file systems, databases, compilers, search engines, and many other real-world systems. By the end of this lesson, you should be able to explain the main ideas and terminology of trees, connect trees to abstraction and efficiency, and apply IB Computer Science HL reasoning to tree-based problems.

Why Trees Matter

A tree is a structure for organizing data in a hierarchy, meaning there is a top element and then levels below it. This is similar to how a family tree shows relationships, or how a company chart shows a manager, departments, and employees. In computing, trees are useful because they can store data in a way that supports efficient searching, inserting, and deleting when designed well.

A key idea in the HL extension is abstraction. A tree is an abstract data structure because we focus on what it does rather than how it is physically stored in memory. The important question is not only “Where is the data?” but also “How is the data related?” Trees answer that by using parent-child relationships. Each piece of data can be connected to others in a structured way.

Tree Terminology and Main Ideas

A tree is made of nodes connected by edges. A node is a storage unit containing data, and an edge is the link between two nodes. The top node is called the root. The root has no parent. Any node connected below another node is called a child. A node with no children is called a leaf. A node may have one child, two children, or many children depending on the type of tree.

The path between nodes is the sequence of edges you follow to move from one node to another. The level of a node shows how far it is from the root. The root is usually at level $0$ or depth $0$, and each step down increases the depth by $1$. The height of a tree is the length of the longest path from the root to a leaf. These terms are important because they help describe the shape and efficiency of the tree.

For example, imagine a school folder system 📁. The root folder might be “School.” Inside it could be folders for “Math,” “Science,” and “English.” Inside “Math,” there could be “Homework” and “Tests.” The folder “Math” is the parent of those two folders, and they are its children. This is exactly the kind of hierarchical organization that trees model well.

How Trees Support Computation

Trees are valuable because they can make operations more efficient than simple linear structures like arrays or linked lists. In a list, finding an item may require checking many elements one by one. In a tree, especially a balanced tree, you can often skip large parts of the structure because each comparison tells you where to look next.

One common example is the binary search tree, or BST. In a BST, each node has at most two children: a left child and a right child. The left subtree contains values less than the node’s value, and the right subtree contains values greater than the node’s value. This ordering makes searching faster in many cases.

Suppose a BST stores the numbers $10$, $5$, $15$, $3$, $7$, $12$, and $18$. If you search for $12$, you compare it with the root $10$. Because $12 > 10$, you move right. Then compare with $15$. Because $12 < 15$, you move left. Then you find $12$. Instead of checking every number, you used the tree structure to narrow the search. ✅

This is a good example of abstraction and efficiency working together. The abstraction is the tree structure itself. The efficiency comes from the way the structure reduces unnecessary comparisons.

Binary Trees and Balanced Trees

A binary tree is a tree in which each node has at most two children. Binary trees are a foundation for many important structures. A special kind of binary tree is a balanced tree, where the left and right sides stay roughly even in height. Balance matters because a very unbalanced tree can behave almost like a linked list, which reduces performance.

For example, if numbers are inserted into a BST in sorted order like $1$, $2$, $3$, $4$, the tree can become a chain. In that case, searching may take time similar to checking a list from start to end. But if the tree is balanced, the number of steps to find a value is much smaller.

The height of a balanced binary tree with $n$ nodes is about $\log_2 n$. That means the tree grows slowly as the data set gets larger. This is why trees are so useful in large-scale computing. When $n$ becomes very big, small differences in efficiency matter a lot.

Traversing a Tree

Traversal means visiting the nodes in a systematic order. This is important when you want to process every item in the tree. There are several common traversal methods.

In preorder traversal, you visit the root first, then the left subtree, then the right subtree. In inorder traversal, you visit the left subtree, then the root, then the right subtree. In postorder traversal, you visit the left subtree, then the right subtree, then the root. In level-order traversal, you visit nodes level by level from top to bottom.

Consider this binary tree: root $A$, with left child $B$ and right child $C$. If $B$ has children $D$ and $E$, and $C$ has children $F$ and $G$, then the inorder traversal is $D, B, E, A, F, C, G$. The preorder traversal is $A, B, D, E, C, F, G$. The exact order matters in computer science because different traversal methods are useful for different tasks.

For example, inorder traversal of a BST gives values in sorted order. This is a powerful result because it connects tree structure to ordered output. If a teacher wants to print student scores from lowest to highest, a BST can help produce that ordering efficiently.

Trees in the Real World

Trees are everywhere in real systems. File systems use trees to store folders and files. HTML documents can be represented as a tree of elements. Compilers use syntax trees to understand the structure of code. Decision trees are used in some forms of artificial intelligence and data classification. Even website navigation menus often follow tree-like structures.

A common real-world example is a company directory on a phone 📱. At the top, there may be categories like “Contacts,” then subcategories such as “Family,” “Friends,” and “Work.” This makes it easy to organize and find information. The data is not just stored; it is stored in a way that reflects relationships.

In IB Computer Science HL, it is important to connect structures to their purpose. Trees are not just abstract diagrams. They are tools for organizing data so programs can work faster and more logically.

Abstraction, Linked Structures, and Implementation

Trees are often implemented as linked structures. This means each node stores its data and also stores references, or pointers, to other nodes. A node in a binary tree may have a reference to a left child and a right child. This is different from an array, where items are stored in consecutive memory locations.

Linked structures are flexible because nodes can be added without shifting every other element. However, linked structures can use extra memory for the references. That is an important trade-off in computer science: improved flexibility and easier reorganization may come at the cost of additional storage.

Abstraction helps us ignore unnecessary details when discussing a tree at a higher level. For example, you may describe a tree by saying it has a root, children, and leaves, without worrying about the exact memory addresses. At the same time, understanding linked implementation helps you reason about how the tree behaves in code.

Example Reasoning for IB HL

Imagine a bookstore database that stores book titles in a binary search tree. If the store adds titles one by one, the tree may become unbalanced depending on insertion order. If titles are inserted in a random order, the tree may stay more balanced. A balanced tree supports faster searching for book titles, which matters when the store has thousands of entries.

Now consider this question: why might a tree be better than a simple list for a dictionary app? A list can be searched linearly, meaning the program checks one word after another. A tree can use comparisons to move left or right, reducing the number of checks. This is an excellent HL-style explanation because it compares structures using efficiency and abstraction.

You may also be asked to trace operations. For example, if a BST already contains $20$, $10$, $30$, $5$, and $15$, and you insert $12$, you compare $12$ with $20$, then with $10$, then with $15$. Because $12 < 20$, move left; because $12 > 10$, move right; because $12 < 15$, move left. The new node becomes the left child of $15$. Tracing carefully is essential for exam success.

Conclusion

Trees are a core abstract data structure in HL Computer Science because they organize information hierarchically and support efficient computation. They use nodes, edges, roots, parents, children, and leaves to represent relationships clearly. They are especially powerful when balanced and when used for searching, sorting, and organizing data in real-world systems. students, if you understand tree terminology, traversal, and why balanced trees improve efficiency, you have a strong foundation for the HL extension on abstract data structures.

Study Notes

  • A tree is a hierarchical abstract data structure made of nodes and edges.
  • The top node is the root; nodes with no children are leaves.
  • Common terms include parent, child, sibling, path, level, depth, and height.
  • Trees model relationships well and are useful for abstraction.
  • Binary trees have at most two children per node.
  • Binary search trees store smaller values on the left and larger values on the right.
  • Balanced trees usually provide faster search than unbalanced trees.
  • Traversal methods include preorder, inorder, postorder, and level-order.
  • Inorder traversal of a BST returns values in sorted order.
  • Trees are often implemented as linked structures using references or pointers.
  • Real-world uses include file systems, compilers, databases, and navigation menus.
  • Tree structures help reduce unnecessary comparisons and improve efficiency.

Practice Quiz

5 questions to test your understanding