Engineering

Shortest Path Algorithms

From BFS to A*: choosing the right tool for the graph in front of you

Learning Objectives

By the end of this module you will be able to:

  • Explain why BFS finds shortest paths in unweighted graphs but requires modification for weighted ones.
  • Describe Dijkstra's algorithm step by step and state its time complexity with a priority queue.
  • Explain why Dijkstra's algorithm fails on graphs with negative edge weights.
  • Describe Bellman-Ford and explain how it handles negative weights and detects negative cycles.
  • State the time complexity of Bellman-Ford and compare it to Dijkstra.
  • Explain what Floyd-Warshall computes, state its recurrence relation, and give its time complexity.
  • Describe what makes a heuristic admissible for A* and why admissibility guarantees optimality.
  • Choose the appropriate shortest-path algorithm given a description of graph properties and constraints.

Core Concepts

The shortest-path problem

"Shortest path" means different things depending on what your graph is modeling. In an unweighted graph, shortest means fewest edges. In a weighted graph, it means minimum total edge weight. The right algorithm depends on three questions about your graph:

  1. Are edges weighted?
  2. Can edge weights be negative?
  3. Do you need shortest paths from one source, or between all pairs of vertices?

A fourth question applies when you have a known goal and domain-specific distance estimates available: can you use a heuristic to direct the search?


BFS: the zero-cost baseline

Breadth-first search solves the single-source shortest path problem for unweighted graphs in O(V+E) time. Because all edges have equal weight, the first time BFS reaches a node it has found the shortest path to it — the FIFO queue ensures nodes are visited in order of increasing hop count.

This is the most efficient approach for unweighted graphs. Reaching for Dijkstra or Bellman-Ford on an unweighted graph adds complexity with no benefit.

Why BFS breaks on weighted graphs

When edges carry different weights, "fewest hops" and "minimum cost" diverge. A two-hop path with weights 1+1 beats a one-hop path with weight 100. BFS cannot account for this because it treats every edge as identical.


Dijkstra's algorithm

Dijkstra is the standard algorithm for weighted, non-negative graphs. It maintains a min-heap priority queue of tentative distances, always expanding the cheapest unvisited node next. This greedy strategy works precisely because weights are non-negative: once a node is settled, no future path through an unvisited node can make it cheaper.

Dijkstra's algorithm requires all edge weights to be non-negative. This is not merely a practical limitation — it is a correctness requirement. On a graph with negative edges, Dijkstra will produce wrong results without warning.

Complexity. Using a min-heap based priority queue, Dijkstra runs in O((V+E) log V). A naive adjacency-matrix implementation costs O(V²), which is only competitive on very dense graphs where E ≈ V².

Where it appears in practice. Dijkstra powers navigation systems (GPS routing), network routing protocols (finding optimal packet paths), and game AI pathfinding. It has also been applied to circuit layout and speech recognition.


Bellman-Ford: handling negative weights

Bellman-Ford solves the single-source shortest path problem for graphs with negative edge weights, as long as there are no negative-weight cycles. Where Dijkstra relies on a greedy invariant that only holds for non-negative weights, Bellman-Ford takes a more conservative approach: it relaxes every edge V-1 times.

The logic is that a shortest path in a graph with V vertices can have at most V-1 edges (any longer path would revisit a vertex, forming a cycle). Relaxing all edges V-1 times guarantees that shortest distances propagate fully across the graph.

Complexity. Bellman-Ford runs in O(EV). This is slower than Dijkstra for graphs without negative weights. The tradeoff is the ability to handle negative edges.

Negative cycle detection. After the V-1 relaxation passes, Bellman-Ford performs one additional pass. If any edge can still be relaxed, a negative-weight cycle exists. Negative cycles make shortest paths undefined — you can always reduce the path cost by going around the cycle one more time — so detecting them is essential before trusting any distance output.

Negative cycles invalidate shortest paths

A negative cycle is a cycle whose edge weights sum to a negative number. Any path that passes through such a cycle can be made arbitrarily short by looping. If Bellman-Ford detects a negative cycle, the shortest path problem has no finite solution for nodes reachable through that cycle.


Floyd-Warshall: all-pairs shortest paths

BFS, Dijkstra, and Bellman-Ford are all single-source algorithms: run one of them from vertex s, and you learn the shortest path from s to every other vertex. To get shortest paths between every pair of vertices, you could run a single-source algorithm from each vertex — but Floyd-Warshall does it more directly.

Floyd-Warshall is a dynamic programming algorithm that computes all-pairs shortest paths in O(n³) time and O(n²) space. Like Bellman-Ford, it works with negative edge weights.

The recurrence. Floyd-Warshall builds its solution bottom-up using the recurrence:

M[i, j, k] = min{ M[i, j, k−1],  M[i, k, k−1] + M[k, j, k−1] }

M[i, j, k] is the length of the shortest path from vertex i to vertex j using only the first k vertices as intermediates. Starting with k=0 (direct edges only), the algorithm progressively widens the set of allowed intermediate vertices. After k=n, every vertex has been considered as an intermediate, and M holds all-pairs shortest distances.


A*: guided search with a heuristic

Dijkstra is optimal but indiscriminate — it expands in every direction from the source. When you have a specific goal and some way to estimate how far you are from it, you can do better. A* adds a heuristic to Dijkstra's priority queue to bias expansion toward the goal.

The evaluation function. A* uses f(n) = g(n) + h(n), where g(n) is the known cost from the start to node n, and h(n) is an admissible heuristic estimating the remaining cost from n to the goal. The priority queue is ordered by f, not just g, so nodes closer to the goal get expanded sooner.

Efficiency gain. A* reduces the search space compared to Dijkstra by selectively expanding only vertices that have better possibilities of appearing in the shortest path. Dijkstra explores all reachable nodes; A* stops exploring a direction once h(n) signals it is heading away from the goal.

Admissibility and optimality. A* guarantees finding the shortest path when the heuristic is admissible — meaning it never overestimates the actual cost to reach the goal. If h underestimates, A* is still correct, just potentially slower than an ideal heuristic. If h ever overestimates, optimality is no longer guaranteed: A* may settle on a suboptimal path without knowing it.

A classic admissible heuristic for geographic routing is straight-line (Euclidean) distance — you cannot travel between two points in less distance than the straight line between them.


Compare & Contrast

PropertyBFSDijkstraBellman-FordFloyd-WarshallA*
Weighted?NoYesYesYesYes
Negative weights?NoYesYesNo (standard)
Negative cycle detection?NoYesYes (diagonal check)No
ScopeSingle-sourceSingle-sourceSingle-sourceAll-pairsSingle-source, single-target
Time complexityO(V+E)O((V+E) log V)O(EV)O(V³)Depends on heuristic
Space complexityO(V)O(V)O(V)O(V²)O(V)
Requires a goal?NoNoNoNoYes
Floyd-Warshall vs. repeated Dijkstra

For sparse graphs (few edges), running Dijkstra from every vertex costs O(V · (V+E) log V), which is better than O(V³). Floyd-Warshall becomes the better choice on dense graphs where E ≈ V², or when the simplicity of a single algorithm matters more than raw performance.


Worked Example

Scenario. You have five cities connected by roads with the following distances (all weights are positive). You want the shortest route from city A to every other city.

A --4--> B
A --2--> C
C --1--> B
B --3--> D
C --5--> D
D --1--> E

Choosing the algorithm. Weighted graph, non-negative weights, single source: Dijkstra.

Step-by-step.

  1. Initialize distances: {A:0, B:∞, C:∞, D:∞, E:∞}. Priority queue: [(0, A)].
  2. Pop A (cost 0). Relax neighbors:
    • B via A: 0+4=4. Queue: [(2,C),(4,B)]
    • C via A: 0+2=2. Queue: [(2,C),(4,B)]
  3. Pop C (cost 2). Relax neighbors:
    • B via C: 2+1=3 < 4. Update B to 3. Queue: [(3,B),(5,D)]
    • D via C: 2+5=7. Queue: [(3,B),(5,D),(7,D)]
  4. Pop B (cost 3). Relax neighbors:
    • D via B: 3+3=6 < 7. Update D to 6. Queue: [(5,D),(6,D)]
  5. Pop D (cost 6). Relax neighbors:
    • E via D: 6+1=7. Queue: [(7,E)]
  6. Pop E (cost 7). No unvisited neighbors.

Result. {A:0, B:3, C:2, D:6, E:7}. The shortest path to B goes A→C→B, not A→B directly.

What changed at step 3?

The direct edge A→B has weight 4, but the two-hop path A→C→B has total weight 3. Dijkstra caught this because it re-evaluates B's distance when C is settled. This update step is called "relaxation."


Boundary Conditions

When BFS fails on weighted graphs. BFS produces shortest hop-count paths, not shortest weighted paths. As soon as edges carry different weights, BFS is incorrect for the weighted problem.

When Dijkstra fails on negative weights. Dijkstra's greedy guarantee — that a settled node will never receive a shorter path later — breaks the moment negative edges exist. A settled node could be reached more cheaply via a future node connected by a negative edge. The algorithm won't backtrack to fix it, so distances are wrong.

When Bellman-Ford is not enough. Bellman-Ford handles negative weights, but not negative cycles. If the graph contains a negative cycle reachable from the source, there is no finite shortest path, and Bellman-Ford's detection step will flag it. At that point, the problem is ill-defined.

Floyd-Warshall on large graphs. O(V³) becomes prohibitive once V reaches the thousands. For V=10,000, you're looking at 10¹² operations. Floyd-Warshall is suited to dense graphs of moderate size, not large sparse graphs.

A with an inadmissible heuristic.* If h(n) ever exceeds the true remaining cost, A* may prematurely close a node that could have been reached more cheaply, producing a suboptimal path. The algorithm will still terminate, but correctness is not guaranteed. This is a subtle failure mode because the resulting path looks plausible.

A without a goal.* A* requires a known goal to compute h(n). Without one, it degenerates to Dijkstra. If you need shortest paths to all nodes, use Dijkstra or Floyd-Warshall instead.


Common Misconceptions

"BFS finds the shortest path." — Only for unweighted graphs. On weighted graphs, BFS finds the path with the fewest edges, which may not be the cheapest.

"Dijkstra handles negative weights if I just add a constant to all edges." — This does not work. Adding a constant to all edges changes the relative cost of paths with different hop counts: a two-edge path gets +2c, a one-edge path gets +1c. Path costs are no longer shifted uniformly, and the optimal path can change.

"Bellman-Ford is always worse than Dijkstra." — It's slower in the no-negative-weights case, but it is the only correct choice when negative weights exist. It also detects negative cycles, which Dijkstra cannot do.

"Floyd-Warshall cannot handle negative weights." — It can, as long as there are no negative cycles. This is a common conflation with Dijkstra's constraint.

"A is always faster than Dijkstra."* — A* is faster when a good heuristic is available. With a constant heuristic h(n)=0, A* is identical to Dijkstra. With a perfect heuristic, it expands only nodes on the optimal path. The quality of h determines the speedup.


Active Exercise

Work through the following problems. Commit to an answer before checking the rationale at the end.

Problem 1. You are writing a route planner for a city bike-share system. The graph has thousands of intersections (vertices) and streets (edges). All travel times are positive. You want the fastest path between two specific intersections.

Which algorithm do you choose, and why?

Problem 2. A financial system models currency exchange as a directed graph. Each edge (A→B) has weight equal to the log of the exchange rate. Negative weights appear naturally (some conversions lose value). You need to detect whether any sequence of exchanges results in a net gain (an arbitrage opportunity).

Which algorithm do you use for detection, and what condition do you check?

Problem 3. You are building a feature that precomputes a "travel time matrix" between every pair of airports in a regional network. The network has 80 airports. Edges represent direct flights; all weights are positive.

Which algorithm is most appropriate, and why might you prefer it over running Dijkstra from each airport?

Problem 4. An admissible heuristic for a grid-based pathfinder uses the straight-line distance to the goal. A teammate proposes changing it to straight-line distance multiplied by 1.5 to "make A* faster."

Explain what this change does to correctness and why.


Rationale.

  1. Dijkstra with a priority queue. Weighted graph, positive weights, single source-target pair. A* would also be valid if you have spatial coordinates to define an admissible heuristic (Euclidean distance). For a large sparse graph, the O((V+E) log V) priority queue implementation is appropriate.

  2. Bellman-Ford. After V-1 relaxation passes, perform one more pass. If any edge (A→B) can still reduce dist[B], a negative-weight cycle exists — that cycle corresponds to a profitable arbitrage loop. The problem requires negative cycle detection, which only Bellman-Ford (among these algorithms) provides.

  3. Floyd-Warshall. With 80 airports, O(V³) = 512,000 operations — trivial. Running Dijkstra from each of the 80 airports also works, but Floyd-Warshall produces the full matrix in a single conceptually simple algorithm. For dense graphs at this scale, the tradeoff favors simplicity.

  4. Correctness breaks. Multiplying by 1.5 means h(n) will often exceed the true remaining cost, making it inadmissible. A* no longer guarantees the optimal path. It may find a path faster, but that path may not be the shortest one. This is the defining tradeoff: accepting inadmissibility can speed up search but loses the optimality guarantee.

Key Takeaways

  1. Match the algorithm to the graph. BFS for unweighted graphs. Dijkstra for weighted, non-negative graphs. Bellman-Ford when negative weights are present or negative cycle detection is needed. Floyd-Warshall for all-pairs on moderate-size graphs.
  2. Dijkstra's non-negative requirement is a correctness constraint, not a performance suggestion. Applying it to negative-weight graphs produces silently wrong answers.
  3. Bellman-Ford costs O(EV) but buys two things Dijkstra cannot provide: correct results on negative-weight graphs and reliable negative cycle detection.
  4. Floyd-Warshall's DP recurrence answers one question at each step: is the path from i to j better going directly, or through intermediate vertex k? Repeat for all k.
  5. A* is only as good as its heuristic. An admissible heuristic (never overestimates) guarantees the optimal path. An inadmissible heuristic may find a path faster, but optimality is not guaranteed.

Further Exploration

Core Lectures

Reference Materials