Algorithmic Thinking: A Problem-Based Introduction Homepage for the book Algorithmic Thinking: A Problem-Based Introduction

Solving the MooTube USACO Problem Using Depth-First Search

Hi everyone,

In a previous post, we solved the MooTube problem using breadth-first search.

I noted there that we didn’t need to keep track of the minimum number of moves (even though I did so to keep close to the breadth-first search code from the book).

If we’re using breadth-first search but we don’t care about shortest paths, then we may also want to consider using depth-first search. Like breadth-first search (BFS), depth-first search (DFS) explores a graph. The difference is in the search strategy: BFS goes level by level, whereas DFS explores as deeply as possible before backtracking. The giveaway that someone is using DFS is recursion.

A few readers have asked me what a DFS solution would look like. Here’s what I came up with… we’ll discuss the dfs function below.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_VIDEOS 5000

typedef struct edge {
  int to_video, relevance;
  struct edge *next;
} edge;

// Return number of videos reachable from from_video
int dfs(edge *adj_list[], int visited[], int relevance, int from_video,  int num_videos) {
  int total = 1;
  edge *e;

  if (visited[from_video])
    return 0;
  visited[from_video] = 1;

  e = adj_list[from_video];

  while (e) {
    if (e->relevance >= relevance)
      total = total + dfs(adj_list, visited, relevance, e->to_video,  num_videos);
    e = e->next;
  }
  return total;
}

int main(void) {
  static edge *adj_list[MAX_VIDEOS + 1] = {NULL};
  static int visited[MAX_VIDEOS + 1];
  int i, j, num_videos, num_queries, from_index, to_index, relevance, video, result;
  edge *e;

  freopen("mootube.in", "r", stdin);
  freopen("mootube.out", "w", stdout);

  scanf("%d%d", &num_videos, &num_queries);

  for (i = 0; i < num_videos - 1	; i++) {
    scanf("%d%d%d", &from_index, &to_index, &relevance);
    e = malloc(sizeof(edge));
    if (e == NULL) {
      fprintf(stderr, "malloc error\n");
      exit(1);
    }
    e->to_video = to_index;
    e->relevance = relevance;
    e->next = adj_list[from_index];
    adj_list[from_index] = e;
    e = malloc(sizeof(edge));
    if (e == NULL) {
      fprintf(stderr, "malloc error\n");
      exit(1);
    }
    e->to_video = from_index;
    e->relevance = relevance;
    e->next = adj_list[to_index];
    adj_list[to_index] = e;
  }

  for (i = 0; i < num_queries; i++) {
    scanf("%d%d", &relevance, &video);
    for (j = 0; j < MAX_VIDEOS + 1; j++)
      visited[j] = 0;
    result = dfs(adj_list, visited, relevance, video, num_videos);
    printf("%d\n", result - 1);
  }

  return 0;
}

The dfs function (13) explores starting from from_video. It returns the number of reachable videos, including from_video. The total variable is used to keep track of this count.

In DFS, we need to keep track of which nodes we’ve already visited. Otherwise, we could cycle around forever, revisiting already visited nodes. I keep track of the visited nodes using the visited array. If dfs is ever invoked with a node that’s already been visited (17), we bail out immediately. If we’re looking at a node that wasn’t visited before… well, it’s visited now (19) :)

Now for the DFS itself. We consider each edge (23); for each edge that meets the relevance threshold, we call DFS on its other endpoint (25). Notice how the total variable is keeping track of all of the new nodes that the recursive call visits.

Way at the bottom (69), there’s a potential off-by-one error. We want the number of videos that are reachable not including video. The dfs function includes the starting video in its count, which is why we subtract 1 here.

BFS? DFS? What Do I Use?

If we want the shortest path distances (which we do in the examples of Chapter 4), then we need BFS. If not, and all we have to do is explore the graph, then BFS is still good, but DFS is also an option. There are situations in which DFS works and BFS doesn’t, especially when what is being asked has to do with determining something about the particular structure of the graph. I haven’t covered such an example here yet, though.