# Modeling graphs using Python

## Introduction

In Graph Theory 101, we reviewed basic concepts from graph theory including **direction**, **weight**, **cycles**, **connectivity** and **density**. Be sure to read it first in order to be familiar with the vocabulary.

In this post, we discuss how to model nodes and edges using Python. Two data structures are commonly used in Computer Science to model graphs: **adjacency matrices** and **adjacency lists**. Both structures can be used to represent any type of graph, but they come with specific trade-offs.

Below is the graph we will use throughout the article. It is unweighted, undirected, connected and neither really sparse nor dense.

## The adjacency matrix

### Definition

An adjacency matrix is a square matrix where `matrix[i][j]`

represents the cost of moving from node i to node j.

If the graph is unweighted, we assume the default cost to be 1. So if nodes i and j are connected, `matrix[i][j] = 1`

, otherwise `matrix[i][j] = 0`

. If the graph is weighted, `matrix[i][j]`

contains the edge's weight.

If the graph is undirected, i.e. all edges are bi-directional, the matrix is symmetric: each entry is equal to its transpose.

graph type | matrix characteristics |
---|---|

unweighted | filled with 0s and 1s |

weighted | filled with 0s and weights |

undirected | symmetric |

directed | not symmetric |

### Complexity

Assuming the graph has \(n\) nodes: the space complexity of storing the matrix is \(O(n^2)\) as we need memory for all possible edges, but the time complexity of querying for an edge is only \(O(1)\) since we can directly look it up.

**more memory**but

**less computation**.

### Python implementation

In Python, we can model the adjacency matrix using a **list of lists** or a **list of dictionaries**.

In the most simple case, we label the nodes as a list of integers starting at zero. We can then use a Python list of lists to model the adjacency matrix, where each index of the list is also a node.

We can build the adjacency matrix for Figure 1 from a list of edges as follows:

```
# Inputs
nodes = [0, 1, 2, 3, 4]
edges = [[0, 1], [0, 2], [1, 2], [3, 0], [4, 2]]
# Adjacency matrix
graph = [[0 for i in range(5)] for j in range(5)]
for edge in edges:
graph[edge[0]][edge[1]] = 1
graph[edge[1]][edge[0]] = 1
```

Result:

```
>>> graph
[[0, 1, 1, 1, 0],
[1, 0, 1, 0, 0],
[1, 1, 0, 0, 1],
[1, 0, 0, 0, 0],
[0, 0, 1, 0, 0]]
```

To check if there is an edge between nodes 1 and 2 we simply lookup the value of `graph[1][2]`

.

## The adjacency list

### Definition

An adjacency list only represents **existing **edges. It is a list of lists where each first list represents a node and each second list is the list of neighboring nodes.

### Complexity

Assuming the graph has \(n\) nodes and \(m\) edges, the space complexity of storing the adjacency list is \(O(m + n)\) since we only store existing edges, but the time complexity of finding an edge is \(O(n)\) since, for a given node, we need to loop through all the neighbors to see whether the edge exists.

**less memory**but

**more computation**.

**sparse**graphs. If the graph is

**dense**, the adjacency list becomes less efficient since it will consume close to \(O(n^2)\) space (just like the adjacency matrix) but still require \(O(n)\) time to find an edge.

### Python implementation

Using the same example as before, we can implement the adjacency list in Python using a *dictionary* of *sets:*

```
# Inputs
nodes = [0, 1, 2, 3, 4]
edges = [[0, 1], [0, 2], [1, 2], [3, 0], [4, 2]]
# Adjacency list
graph = defaultdict(set)
for edge in edges:
graph[edge[0]].add(edge[1])
graph[edge[1]].add(edge[0])
```

Result:

```
>>> graph
defaultdict(<class 'set'>,
{
0: {1, 2, 3},
1: {0, 2},
2: {0, 1, 4},
3: {0},
4: {2}
}
)
```

In order to see whether there is an edge between nodes 1 and 2, we either loop through `graph[1]`

in search of 2 or loop through `graph[2]`

in search of 1.

## Conclusion

In this post we reviewed both adjacency matrices and adjacency lists. Adjacency matrices are best suited for dense graphs whereas adjacency lists are more efficient for sparse graphs.

In the next post, we will look at how to traverse a graph using breadth-first search algorithm.

Although useful for general understanding (and for programming interviews!), it is recommended not to reinvent the wheel in production code. Instead, it is better to use existing libraries such as networkx to model and analyze graphs.

References: