# Trees

A tree is a connected acyclic graph. As a result, a tree with n vertices has n-1 edges:

In this example, there are 9 vertices:
{A, B, C, D, E, F, G, H, I}
and 8 edges:
{{A,B},{B,D},{D,C},{D,E},{E,F},{E,H},{F,G},{H,I}}
For any graph, it is possible to find a "spanning tree": a tree which includes all of the vertices of the original graph. We will use the following algorithm to find a spanning tree for an arbitrary graph with n vertices:

1. Label the edges of the graph randomly using small positive integers for the weights.
2. Starting with the edge or edges with the highest weights, delete each edge which does not disconnect the graph.
3. Stop when there are n-1 edges left.

For example, let us begin with the following graph:

We assign a set of random weights to the edges:
The edges with the highest weights are {A,D} and {D,E}. Removal of the edge {D,E} would disconnect the graph, but we can remove the edge {A,D}:
Edges {A,B} and {E,G} have the next highest weights; of these, we can only remove {A,B} without disconnecting the graph:
The edges with the next highest weights are {A,C} and {C,D}. If we were to remove the edge {A,C}, vertex A would be disconnected from the graph, but we are able to remove the edge {C,D}:
The graph now has 6 edges:
{{A,C},{B,C},{B,D},{D,E},{E,F},{E,G}}
Since both this and the original graph have 7 vertices, we have found a spanning tree:
By moving the vertex A, we can represent this spanning tree in planar form:
Click here to view another spanning tree constructed from the same graph. It is different because the weights were assigned differently.
Spanning trees are constructed by network switches (which control data transfers between network segments) in order to prevent data from moving along cycles in the network. The protocol for their cooperation in this regard is called "Spanning Tree Protocol" (or STP).

### Rooted Trees

It is often useful to designate a particular vertex of a tree as the "root"; the tree is then called a "rooted tree". Rooted trees are constructed by programming language compilers (programs which translate other programs from a language such as C into a form which the computer hardware can interpret) in order to represent arithmetic and logical expressions in a more convenient form. For example, consider the arithmetic expression
- ( x + 2 ) / ( y - 3 )
The rooted tree which represents this expression is:
The root vertex of this tree is labeled with the division operator ("/"). Each edge is called a "branch" of the tree, and the vertices
{x,2,y,3}
are called the "leaves" of the tree. Each vertex can be assigned a depth:

1. the two vertices labeled with minus signs are at a depth of one;
2. the vertices labeled {+,y,3} are at a depth of two,
3. and the vertices {x,2} are at a depth of three.

1. x+2 is performed first because it is at the deepest level;
2. the unary minus -(x+2) is performed next, followed by
3. y-3 (operations at the same level are usually performed from left to right),
4. and finally the division is performed.

It is interesting to examine the tree corresponding to a very similar arithmetic expression such as

( - x + 2 ) / y - 3
In this expression, the order of operations is

1. - x
2. - x + 2
3. ( - x + 2 ) / y
4. ( - x + 2 ) / y - 3 (or more clearly, (( - x + 2 ) / y ) - 3 )

The rooted tree corresponding to this expression is:

Rooted trees are also commonly used to represent the structure of file systems in a computer (often called the "directory tree"). Suppose that you wished to create a web page, upload it to a web server and examine it using a web browser. If you were using Windows, you might use the NotePad program to create the web page, the FTP (File Transfer Protocol) program to upload your web page to the server, and Internet Explorer to view the web page you have just created. These programs are located in the files

• C:\Windows\ftp.exe
• C:\Program Files\Internet Explorer\iexplore.exe
The ellipsis ("...") indicate large portions of the tree which have been omitted for space reasons; a typical Windows file system contains several hundred folders (also called directories) and several thousand files. The root vertex of this tree is labeled C:, and C:\ is called the "root directory" of the file system. In the complete directory tree, the vertices which are not leaves represent the directories and the leaves represent the files in the directories (such as msdos.sys). In this truncated tree, the vertices labeled "My Documents" and "Common Files" actually represent directories whose contents were omitted for lack of space.

It is instructive to examine a UNIX directory tree in the same fashion. If you were using Linux (an open source version of UNIX for PCs) instead of Windows to create, upload and view your web page, you would use

In the Linux system, these files have the path names

• /usr/bin/emacs
• /usr/bin/ftp
• /usr/bin/netscape

A portion of the Linux directory tree illustrates their location:

Here, the root vertex is labeled "/"; / is the root directory of a UNIX file system. The vertices labeled emacs, ftp and netscape are actually the only leaves in this tree corresponding to files; the other vertices represent directories whose contents have again been omitted for lack of space. It should now be clear why path names are called path names: they describe the path through the graph between the root vertex and the leaf of interest. For example, the path name /usr/bin/netscape actually designates the path
{/,usr,bin,netscape}
from the root vertex to the leaf vertex corresponding to the program file for netscape.