Tree structures are quite common in computer science. They can represent a sorted collection of numbers or names, or a series of decisions that need to be made. Like graphs (described in another section), trees are made up of nodes and directed arcs. Nodes represent some value and arcs point from one node to another. Trees are in many ways like graphs with restrictions on how the arcs are arranged.
In computer science trees grow upside down. The root is at the top and the leaves are at the bottom. Below is an example of a tree that is being used to hold some integers in a particular order:
The nodes in this tree represent integers. In this case, the integers 1, 3, 4, 5, 7, 8 and 10 have been used. The root node is labeled as “5″ and appears at the top of the tree. Several interior nodes (nodes that are neither root nor leaves) labeled 3 and 8 appear in the middle of the tree. Finally, leaf nodes with labels 1, 4, 7 and 10 appear at the bottom of the tree.
A Parent node is one that has at least one node below it. In the above tree, node 3 is a parent and its Child nodes are 1 and 4. (Can you find the other 2 parent nodes in the graph?) Child nodes with a common parent are called sibling nodes. In the example, nodes 1 and 4 are siblings. As a rule, child nodes in a tree may only have one parent. (Can you find the other siblings in the graph?)
Another feature of a tree is the number of levels of depth. In the example, the tree is 3 levels deep. If the number of children is the same for all parent nodes, we say that the tree is balanced. In the above example, all nodes (except the leaf nodes) have 2 children so the tree is balanced.
A tree with 2 children per parent is also called a binary tree. Looking at the above example, one can see that for each parent node, one of its children has a value less than the parent, and the other child has a value greater than that of the parent. For example, the parent node 5 has a left child with value 3 (less than its parent) and a right child with value 8 (greater than its parent).
Trees can be used to assist us in sorting and searching for data of interest. For example, suppose we have the following set of data:
Name | Age |
---|---|
Bailey | 41 |
Brown | 27 |
Green | 21 |
Jones | 38 |
Smith | 35 |
Taner | 63 |
Vespa | 31 |
We would like to be able to store this data in such a way that improves how we can search for a person later on. We can construct a binary tree to store the names and ages in the following fashion. First, find the median value (the value that appears closest to the middle) of the list and make that the root node. In this example, “Jones” appears to be in the middle of the list so we can make Jones the root node.
Next, look at the all of the values that come before Jones in the list: Bailey, Brown and Green. Choose the median of these three and make it the left child of Jones. Repeat this process with the bottom half of the list (after Jones) to establish the right child of Jones.
Finally, repeat the process again for the next level by taking Brown and Taner each as parents. The resulting tree looks like the following:
To search such a tree, start at the root node and:
- Compare the target value (what is being searched for) with the root node.
- If the target value matches the current node, the search is over
- If the target value is less than the current node’s value, move to the current node’s left child and go back to step 2.
- If the target value is greater than the current node’s value, move to the current node’s right child and go back to step 2.
- If we reach a leaf node and the target value is not found, then it is not in the tree.
As an example, assume we are searching for “Vespa”. Starting with the root node, we compare Vespa with Jones. Since Vespa comes after Jones in the alphabet (Vespa is greater than Jones) we move to Jones’ right child. At this node, we compare Vespa with Taner and see that since Vespa is greater than Taner, we move to Taner’s right child node. Finally, we find the target node.
Notice that if we have to search through the list (starting from the first entry), it would have taken us 7 steps to finally locate Vespa. On average, we would expect to find what we are looking for in the list after about 3 or 4 steps. With the binary tree, we will never have to use more than 3 steps.
We can generalize this reasoning with the following points:
- Searching in a list (linear search) of n items will require on average n / 2 steps.
- Searching the same list represented as a binary tree will require no more than log2n steps.
Here is a comparison:
Number of Items n | Linear Search | Binary Search |
---|---|---|
4 | 4/2 = 2 | log2 4 = 2 |
8 | 8/2 = 4 | log2 8 = 3 |
16 | 16/2 = 8 | log2 16 = 4 |
32 | 32/2 = 16 | log2 32 = 5 |
64 | 64/2 = 32 | log2 64 = 6 |
128 | 128/2 = 64 | log2 128 = 7 |
It is clear from the comparison that the binary tree representation can cut down significantly on the number of steps.