### Question Description

Just shaky - in traversal BFS and DFS, is the only difference in the algorithm that one uses a stack and the other a queue?

## Final Answer

To traverse means to visit the vertices in some systematic order. You should be familiar with various traversal methods for trees:

· preorder: visit each node before its children.

· postorder: visit each node after its children.

· inorder (for binary trees only): visit left subtree, node, right subtree.

We also saw another kind of traversal, topological ordering, when I talked about shortest paths.

Today, we'll see two other traversals: breadth first search (BFS) and depth first search (DFS). Both of these construct spanning trees with certain properties useful in other graph algorithms. We'll start by describing them in undirected graphs, but they are both also very useful for directed graphs.

**Breadth
First Search**

This can be throught of as being like Dijkstra's algorithm for shortest paths, but with every edge having the same length. However it is a lot simpler and doesn't need any data structures. We just keep a tree (the breadth first search tree), a list of nodes to be added to the tree, and markings (Boolean variables) on the vertices to tell whether they are in the tree or list.

**breadth first search:**

unmark all vertices

choose some starting vertex x

mark x

list L = x

tree T = x

while L nonempty

choose some vertex v from front of list

visit v

for each unmarked neighbor w

mark w

add it to end of list

add edge vw to T

It's very important that you remove vertices from the other end of the list than the one you add them to, so that the list acts as a queue (fifo storage) rather than a stack (lifo). The "visit v" step would be filled out later depending on what you are using BFS for, just like the tree traversals usually involve doing something at each vertex that is not specified as part of the basic algorithm. If a vertex has several unmarked neighbors, it would be equally correct to visit them in any order. Probably the easiest method to implement would be simply to visit them in the order the adjacency list for v is stored in.

Let's prove some basic facts about this algorithm. First, each vertex is clearly marked at most once, added to the list at most once (since that happens only when it's marked), and therefore removed from the list at most once. Since the time to process a vertex is proportional to the length of its adjacency list, the total time for the whole algorithm is O(m).

We also want to know that T is a *spanning
tree*, i.e. that if the graph is connected (every vertex has some path to
the root x) then every vertex will occur somewhere in T. We can prove this by
induction on the length of the shortest path to x. If v has a path of length k,
starting v-w-...-x, then w has a path of length k-1, and by induction would be included
in T. But then when we visited w we would have seen edge vw, and if v were not
already in the tree it would have been added.

Breadth first traversal of G
corresponds to some kind of tree traversal on T. But it isn't preorder,
postorder, or even inorder traversal. Instead, the traversal goes a *level*
at a time, left to right within a level (where a level is defined simply in
terms of distance from the root of the tree). For instance, the following tree
is drawn with vertices numbered in an order that might be followed by breadth
first search:

1

/ | \

2 3 4

/ \ |

5 6 7

| / | \

8 9 10 11

The proof that vertices are in this order by breadth first search goes by induction on the level number. By the induction hypothesis, BFS lists all vertices at level k-1 before those at level k. Therefore it will place into L all vertices at level k before all those of level k+1, and therefore so list those of level k before those of level k+1. (This really is a proof even though it sounds like circular reasoning.)

Breadth first search trees have a nice property: Every edge of G can be classified into one of three groups. Some edges are in T themselves. Some connect two vertices at the same level of T. And the remaining ones connect two vertices on two adjacent levels. It is not possible for an edge to skip a level.

Therefore, the breadth first search tree really is a shortest path tree starting from its root. Every vertex has a path to the root, with path length equal to its level (just follow the tree itself), and no path can skip a level so this really is a shortest path.

**Depth
first search**

Depth first search is another way of traversing graphs, which is closely related to preorder traversal of a tree. Recall that preorder traversal simply visits each node before its children. It is most easy to program as a recursive routine:

preorder(node v)

{

visit(v);

for each child w of v

preorder(w);

}

To turn this into a graph traversal algorithm, we basically replace "child" by "neighbor". But to prevent infinite loops, we only want to visit each vertex once. Just like in BFS we can use marks to keep track of the vertices that have already been visited, and not visit them again. Also, just like in BFS, we can use this search to build a spanning tree with certain useful properties.

dfs(vertex v)

{

visit(v);

for each neighbor w of v

if w is unvisited

{

dfs(w);

add edge vw to tree T

}

}

The overall depth first search algorithm then simply initializes a set of markers so we can tell which vertices are visited, chooses a starting vertex x, initializes tree T to x, and calls dfs(x). Just like in breadth first search, if a vertex has several neighbors it would be equally correct to go through them in any order. I didn't simply say "for each unvisited neighbor of v" because it is very important to delay the test for whether a vertex is visited until the recursive calls for previous neighbors are finished.

Just like we did for BFS, we can use DFS to classify the edges of G into types. Either an edge vw is in the DFS tree itself, v is an ancestor of w, or w is an ancestor of v. (These last two cases should be thought of as a single type, since they only differ by what order we look at the vertices in.) What this means is that if v and w are in different subtrees of v, we can't have an edge from v to w. This is because if such an edge existed and (say) v were visited first, then the only way we would avoid adding vw to the DFS tree would be if w were visited during one of the recursive calls from v, but then v would be an ancestor of w.

As an example of why this property might be useful, let's prove the following fact: in any graph G, either G has some path of length at least k. or G has O(kn) edges.

Proof: look at the longest path in the DFS tree. If it has length at least k, we're done. Otherwise, since each edge connects an ancestor and a descendant, we can bound the number of edges by counting the total number of ancestors of each descendant, but if the longest path is shorter than k, each descendant has at most k-1 ancestors. So there can be at most (k-1)n edges.

This fact can be used as part of an algorithm for finding long paths in G, another subgraph isomorphism problem closely related to the traveling salesman problem. If k is a small constant (like say 5) you can find paths of length k in linear time (measured as a function of n). But measured as a function of k, the time is exponential, which isn't surprising because this problem is closely related to the traveling salesman problem. For more on this particular problem, see Michael R. Fellows and Michael A. Langston, "On search, decision and the efficiency of polynomial-time algorithms", 21st ACM Symp. Theory of Computing, 1989, pp. 501-512.

**Relation
between BFS and DFS**

It may not be clear from the pseudo-code above, but BFS and DFS are very closely related to each other. (In fact in class I tried to describe a search in which I modified the "add to end of list" line in the BFS pseudocode to "add to start of list" but the resulting traversal algorithm was not the same as DFS.)

With a little care it is possible to make BFS and DFS look almost the same as each other (as similar as, say, Prim's and Dijkstra's algorithms are to each other):

bfs(G)

{

list L = empty

tree T = empty

choose a starting vertex x

search(x)

while(L nonempty)

remove edge (v,w) from start of L

if w not yet visited

{

add (v,w) to T

search(w)

}

}

dfs(G)

{

list L = empty

tree T = empty

choose a starting vertex x

search(x)

while(L nonempty)

remove edge (v,w) from end of L

if w not yet visited

{

add (v,w) to T

search(w)

}

}

search(vertex v)

{

visit(v);

for each edge (v,w)

add edge (v,w) to end of L

}

Both of these search algorithms now keep a list of edges to explore; the only difference between the two is that, while both algorithms adds items to the end of L, BFS removes them from the beginning, which results in maintaining the list as a queue, while DFS removes them from the end, maintaining the list as a stack

http://www.ics.uci.edu/~eppstein/161/960215.htmlBrown University

1271 Tutors

California Institute of Technology

2131 Tutors

Carnegie Mellon University

982 Tutors

Columbia University

1256 Tutors

Dartmouth University

2113 Tutors

Emory University

2279 Tutors

Harvard University

599 Tutors

Massachusetts Institute of Technology

2319 Tutors

New York University

1645 Tutors

Notre Dam University

1911 Tutors

Oklahoma University

2122 Tutors

Pennsylvania State University

932 Tutors

Princeton University

1211 Tutors

Stanford University

983 Tutors

University of California

1282 Tutors

Oxford University

123 Tutors

Yale University

2325 Tutors