## Graph distance using matrix multiplication in Octave

If A is a matrix of non-negative edge weights between the nodes of a graph (from 0 to infinity, with Inf meaning no edge), then power-iteration, i.e. repeated multiplication of that matrix by itself in the {R,min,+} semi-ring, will converge on the shortest-path distances in that graph. For reference (mostly my own), this blog entry gives the GNU Octave code for it and an example.
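To make the (min,+) product concrete before the Octave code, here is a minimal plain-Python sketch; the function name and the 3-node example are my own, not part of the Octave code below:

```python
INF = float('inf')

def min_plus(A, B):
    # C[i][j] = min over k of (A[i][k] + B[k][j]): "+" plays the role of
    # "*" and "min" the role of "+" from ordinary matrix multiplication.
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Initial distance matrix for a 3-node path a-b-c (Inf = no direct edge):
D = [[0, 1, INF],
     [1, 0, 1],
     [INF, 1, 0]]

# A single product already closes the two-hop gap between a and c:
# min_plus(D, D)[0][2] == 2
```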

```
# Convert a standard edge-adjacency matrix (1 for an edge to another node,
# 0 otherwise) into an initial distance matrix (i.e. Inf for >1-hop nodes).
function A = adj_to_dist (A)
  A(A == 0) = Inf;
  A(logical(eye(size(A)))) = 0;
endfunction

# Less memory-intensive, but slower, row-at-a-time version for very large
# graphs.
function A = adj_to_dist_rows (A)
  for i = 1:rows(A)
    B = A(i,:);
    B(B == 0) = Inf;
    A(i,:) = B;
    A(i,i) = 0;
  endfor
endfunction

# Matrix multiplication (squaring) in the (R,min,+) semi-ring.
#
# For A being an initial distance matrix for a graph with no negative
# cycles, iterating this will converge on the shortest distances in the
# graph.
function [D, changed] = min_plus_mul (A)
  changed = false;
  n = columns(A);
  D = zeros(n);
  # as noted by mcgrant on #octave on Freenode IRC:
  # for non-sparse matrices this could be optimised with bsxfun,
  # in some way similar to:
  #   reshape(sum(bsxfun(@times,A,
  #                      reshape(X,[1,size(X)])),
  #               2),size(A,1),size(X,2))
  #
  # for sparse matrices, it's possible bsxfun could still
  # be used to optimise the inner loop.
  for i = 1:n
    for j = 1:n
      val = A(i,j);
      D(i,j) = min(A(i,:) + A(:,j)');
      if (D(i,j) != val)
        changed = true;
      endif
    endfor
  endfor
endfunction

# Power-iterate until convergence, or for the specified number of iterations.
function [A, iterations] = min_plus_mul_power (A, n = columns(A))
  for i = 1:n
    [A, c] = min_plus_mul (A);
    if (c == false)
      break;
    endif
  endfor
  iterations = i;
endfunction
```

With this, you can calculate shortest-path distance matrices for standard graph adjacency matrices in Octave. E.g. for the graph of:

```
A--B----C--F
|  |\   |
D--E G  H
```

The adjacency matrix A and resulting distances would be:

```
octave:4> A
A =

0   1   0   1   0   0   0   0
1   0   1   0   1   0   1   0
0   1   0   0   0   1   0   1
1   0   0   0   1   0   0   0
0   1   0   1   0   0   0   0
0   0   1   0   0   0   0   0
0   1   0   0   0   0   0   0
0   0   1   0   0   0   0   0

ans =

0   1   2   1   2   3   2   3
1   0   1   2   1   2   1   2
2   1   0   3   2   1   2   1
1   2   3   0   1   4   3   4
2   1   2   1   0   3   2   3
3   2   1   4   3   0   3   2
2   1   2   3   2   3   0   3
3   2   1   4   3   2   3   0
```
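As a quick cross-check of the example, here is the same computation in plain Python; the edge list, the index order A..H = 0..7, and the helper names are mine, not part of the Octave code:

```python
INF = float('inf')
n = 8  # nodes A..H numbered 0..7
edges = [(0, 1), (0, 3), (1, 2), (1, 4), (1, 6), (2, 5), (2, 7), (3, 4)]

# Initial distance matrix: 0 on the diagonal, 1 for edges, Inf otherwise.
D = [[0 if i == j else INF for j in range(n)] for i in range(n)]
for i, j in edges:
    D[i][j] = D[j][i] = 1

def min_plus_square(A):
    m = len(A)
    return [[min(A[i][k] + A[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

# Repeatedly square in the (min,+) semi-ring until a fixed point.
while True:
    D2 = min_plus_square(D)
    if D2 == D:
        break
    D = D2

# The row for node A matches the first row of the Octave output:
# D[0] == [0, 1, 2, 1, 2, 3, 2, 3]
```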

Just in case it’s useful to anyone…

Update: Regular matrix multiplication (i.e. the standard arithmetic {R,+,*} semi-ring) can also be used to calculate the distances, though it requires some additional processing. If an adjacency matrix is raised to the k-th power, then each entry gives the number of walks of length k between the two nodes (walks may revisit nodes, hence the extra book-keeping). By noting down the first k that gives a positive value for a cell, the shortest-path distances can be determined. Further, this method can also say how many such shortest paths there are (note: my example doesn’t record this). The Octave code is:

```
function [D, k] = k_paths_mul (A)
  D = A;  # distance 1 for direct edges, 0 elsewhere (and on the diagonal)
  C = A;  # C = A^k, counting the length-k walks between each pair of nodes
  k = 1;
  do
    k++;
    C *= A;
    fin = true;
    for i = 1:rows(C)
      T = C(i,:);
      T(i) = 0;  # ignore walks back to the start node
      # cells first reached at this power get distance k
      S = (T > 0) & (D(i,:) == 0);
      if (any(S))
        D(i,S) = k;
        fin = false;
      endif
    endfor
  until (fin == true)
endfunction
```
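The same idea restated as a plain-Python sketch (my own restatement, not a line-for-line translation of the Octave version):

```python
def k_paths_distances(A):
    # A: 0/1 adjacency matrix as a list of lists.
    n = len(A)
    D = [row[:] for row in A]   # distance 1 for direct edges, 0 elsewhere
    C = [row[:] for row in A]   # C = A^k: counts length-k walks
    k = 1
    while True:
        k += 1
        # Ordinary matrix product: C = C * A.
        C = [[sum(C[i][m] * A[m][j] for m in range(n)) for j in range(n)]
             for i in range(n)]
        changed = False
        for i in range(n):
            for j in range(n):
                # The first power at which a positive walk count appears
                # gives the shortest-path distance (diagonal ignored).
                if i != j and C[i][j] > 0 and D[i][j] == 0:
                    D[i][j] = k
                    changed = True
        if not changed:
            # Unreachable pairs stay 0, as in the Octave code.
            return D, k

A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]            # 3-node path a-b-c
D, k = k_paths_distances(A)
# D == [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

One practical caveat: the walk counts in C grow very quickly with k, which is harmless with Python's arbitrary-precision integers but can overflow or lose precision in fixed-width floating-point arithmetic.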

I don’t know if this is any faster than the min-plus method. On the one hand, it uses regular, built-in matrix multiplication, which may use fast libraries. On the other hand, it requires a bit more manipulation of the results.

Update 2: The k-paths approach seems to be significantly faster, at least on smaller graphs:

```
> G = scale_free (400, 2);
> T1 = time (); [D, k] = min_plus_mul_power (G); T2 = time ();
> disp (T2 - T1);
22.405
> T1 = time (); [D, k] = k_paths_mul (G); T2 = time ();
> disp (T2 - T1);
1.4274
```

Update 3: It seems that as the graphs get bigger and bigger, the speed advantage of the k-paths approach gets smaller and smaller:

- 800 nodes: 128.51 vs 13.343 seconds → k-paths 9.6× faster
- 1600 nodes: 703.29 vs 121.62 → 5.8×
- 3200 nodes: 4198.3 vs 1214.6 → 3.5×
- 6400 nodes: 28004 vs 11830 → 2.4×

So it’s very possible the min_plus_mul approach scales better to larger graphs.

Update 4: 12800 nodes: 188750 vs 78226 → 2.4× – maybe not?

Note: My understanding is that, all else being equal, graph-exploration all-pairs shortest-path algorithms, such as Floyd-Warshall, Johnson’s algorithm, or all-sources Dijkstra, should generally perform better than a matrix-based approach. E.g., my Java implementation of Floyd-Warshall, part of a Java MultiGraph library, significantly outperforms the above Octave code on large graphs. However, the matrix approach can still be useful sometimes, particularly with smaller graphs.
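For comparison, here is a minimal sketch of textbook Floyd-Warshall in plain Python (the standard algorithm, not the Java implementation mentioned above):

```python
INF = float('inf')

def floyd_warshall(D):
    # D: initial distance matrix (0 diagonal, edge weight or Inf).
    # Updated in place, allowing each node in turn as an intermediate hop.
    n = len(D)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

# 3-node path a-b-c:
D = floyd_warshall([[0, 1, INF],
                    [1, 0, 1],
                    [INF, 1, 0]])
# D == [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

This does O(n³) work in total, versus O(n³) per multiplication (times the number of iterations) for the semi-ring approach, which is one reason it tends to win as graphs grow.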

Update5: On a different machine, with larger graphs again, I’ve found that the k-paths-mul approach with GNU Octave 3.8.1 significantly out-performed the Java Floyd-Warshall approach. So it seems mileage can vary significantly. So I can’t say which approach is generally the fastest! It seems to depend a lot on data-sets and platforms.