Today we’ll be talking about graphs. Just the basics.
Table of Contents
What are Graphs?
So, what is a graph? Have a look at the following picture:
It might represent a new road system for a new means of transportation for example. It’s very intuitive. Just by looking at the picture we can say that you can go from Charleston to Detroit through Cincinnati or through Washington.
You can also see that if you want to go from Denver to Dallas, you have to go through Kansas City and Cincinnati at least, but if you enjoy traveling, you can choose to go through Kansas City, Chicago, Detroit, Boston, New York, Washington, Charleston and Cincinnati for example.
Graph Terminology
What you can see in the picture is an example of a graph. The red circles are called nodes or vertices and the blue lines connecting the nodes are called edges.
The nodes in our example are strings (the names of the cities), but they could be anything.
Some nodes are connected by edges, like for example New York and Boston, others are not connected, like for example Charleston and Dallas.
In our case we can move in both directions along the edges, so for example from Chicago to Kansas City and then back to Chicago. We call such edges undirected edges and the whole graph an undirected graph. If we could move only in one direction, we would have a directed graph.
A Simplified Example
Well, this article is just about the basics. Let’s keep it simple and let’s simplify our graph by reducing the number of nodes and changing the names of the cities to their lowercase initials. Here’s the new image:
So, after a year the new means of transportation turned out not to be very successful. It was reduced.
In this graph we have 6 nodes, one of which, k, is not connected with any other one with an edge. The numbers of edges from the other nodes is also differentiated. There’s one edge from c, 2 edges from d, b and n and 3 edges from w.
Neighboring Nodes and Isolated Nodes
In order to represent the graph in code we can simply use a dictionary. The keys will be the nodes and the values will be lists of neighboring nodes. By neighboring nodes I mean nodes directly connected by an edge with the node being the key.
Here’s the dictionary:
graph = {'b': ['d', 'n'],
'c': ['w'],
'd': ['b', 'w'],
'n': ['b', 'w'],
'k': [],
'w': ['c', 'd', 'n']}
As you can see, the k node isn’t connected with any other node, so the value corresponding to it in the graph dictionary is an empty list. We call such nodes isolated nodes.
All the other nodes are connected by edges. An edge can be represented by a 2-tuple containing the two nodes on either side. So, the edge between b and w could represented in code like so: (‘b’, ‘w’).
With our graph dictionary in place, we can generate the edges. The dictionary contains all the information that is necessary to do that. Here’s the code:
def edges(graph):
return [(node, neighbor)
for node in graph
for neighbor in graph[node]]
Let’s use the function and print all the edges:
print("The following edges have been generated:")
for edge in edges(graph):
print(edge)
I’m using a list comprehension here to create the list of edges, but you could use a for loop instead. If you want to learn more about list comprehensions, I have an article about them, so feel free to read it.
And here’s the output:
The following edges have been generated:
('b', 'd')
('b', 'n')
('c', 'w')
('d', 'b')
('d', 'w')
('n', 'b')
('n', 'w')
('w', 'c')
('w', 'd')
('w', 'n')
There’s just one isolated node in our example, but there could be more. So, let’s write a function that returns a list of all isolated nodes:
def isolated_nodes(graph):
return [node for node in graph if not graph[node]]
I’m using another list comprehension here. Note that instead of writing:
if graph[node] == []
I wrote:
if not graph[node]
This is possible because an empty list is understood as False in Python. Let’s try it out:
print("The graph contains the following isolated nodes:")
for node in isolated_nodes(graph):
print(node)
The output is:
The graph contains the following isolated nodes:
k
The Graph Class
We already have two functions and we’re going to have even more, so it seems reasonable to wrap the whole functionality in a class.
Inside the class I’ll be using dictionary views. If you want to learn more about dictionary views, I have an article about them, so feel free to read it.
And now let’s create the Graph class:
class Graph:
# We can initialize an object with an existing graph. If we don't,
# an empty graph is created, which is represented by an empty
# dictionary.
def __init__(self, graph = {}):
self.__graph = graph
# Here's our edges function turned into a method. It returns
# the list of all the edges.
def edges(self):
return [(node, neighbor)
for node in self.__graph
for neighbor in self.__graph[node]]
# Now that we have a method that returns all the edges, let's
# create a method that returns all the nodes.
# We must convert the keys view that is returned by the keys
# method to a list.
def nodes(self):
return list(self.__graph.keys())
# And here's the isolated_nodes method, which returns all the isolated nodes.
def isolated_nodes(self):
return [node for node in self.__graph if not self.__graph[node]]
# We'll define two more methods. One for adding new nodes and the
# other for adding new edges.
# Let's start with the nodes:
def add_node(self, node):
# If we try to add a node that already exists in the graph,
# nothing will happen. If the node doesn't exist, it will
# be created as an isolated node.
if node not in self.__graph:
self.__graph[node] = []
# And now the edges.
def add_edge(self, node1, node2):
# We assume that by calling this method, 2 edges are
# created: one from node1 to node2 and one from node2 to node1.
# There are 4 possible scenarios here:
# 1. The two nodes may both exist in the graph or
# 2. node1 exists in the graph but not node2 or
# 3. node2 exists in the graph but not node1 or
# 4. neither node1 nor node2 exists in the graph
# The method has to take care of each of these cases. To
# make it easier, let's first check if the two nodes exist
# in the graph and if they don't, let's add them:
if node1 not in self.__graph:
self.add_node(node1)
if node2 not in self.__graph:
self.add_node(node2)
# Now we know for sure that the two nodes exist in the graph.
# Let's add the edges between them.
self.__graph[node1].append(node2)
self.__graph[node2].append(node1)
# With the class in place, let's test it.
# We'll be making use of the graph defined before.
# First let's create a graph object and initialize it with
# our graph dictionary.
g = Graph(graph)
# Now let's print all the nodes, all the isolated nodes and all the edges.
print("The nodes:")
for node in g.nodes():
print(node)
print("\nThe isolated nodes:")
for isolated_node in g.isolated_nodes():
print(isolated_node)
print("\nThe edges:")
for edge in g.edges():
print(edge)
# Now the transportation company is doing better. You can now travel
# to Miami. Let's create a new node, m.
g.add_node('m')
# Now m is an isolated node. Let's print all the nodes and all the
# isolated nodes again.
print("\nThe nodes after adding 'm':")
for node in g.nodes():
print(node)
print("\nThe isolated nodes after adding 'm':")
for isolated_node in g.isolated_nodes():
print(isolated_node)
# We can go from Kansas City and from Washington to Miami, so we'll need
# two connections, or edges.
g.add_edge('k', 'm')
g.add_edge('w', 'm')
# Now two things have changed: First, we should be able to see the new
# edges. Second, k and m are no longer isolated nodes because they are
# connected, so there should be no isolated nodes at all.
# Let's check it out:
print("\nThe edges after connecting 'k' with 'm' and 'w' with 'm':")
for edge in g.edges():
print(edge)
print("\nThe isolated nodes after connecting 'k' with 'm' and 'w' with 'm':")
if not g.isolated_nodes():
print("There are no isolated nodes.")
else:
for isolated_node in g.isolated_nodes():
print(isolated_node)
# Finally, let's add a totally new edge between two totally new
# nodes. Let's say there's a second line between Sacramento ('s')
# and Vegas ('v'):
g.add_edge('s', 'v')
# Let's see if the new edge was successfully created. Let's print all
# the nodes and edges one more time.
print("The nodes after adding an edge between 's' and 'v':")
for node in g.nodes():
print(node)
print("\nThe edges after adding an edge between 's' and 'v':")
for edge in g.edges():
print(edge)
Here’s the output:
The nodes:
b
c
d
n
k
w
The isolated nodes:
k
The edges:
('b', 'd')
('b', 'n')
('c', 'w')
('d', 'b')
('d', 'w')
('n', 'b')
('n', 'w')
('w', 'c')
('w', 'd')
('w', 'n')
The nodes after adding 'm':
b
c
d
n
k
w
m
The isolated nodes after adding 'm':
k
m
The edges after connecting 'k' with 'm' and 'w' with 'm':
('b', 'd')
('b', 'n')
('c', 'w')
('d', 'b')
('d', 'w')
('n', 'b')
('n', 'w')
('k', 'm')
('w', 'c')
('w', 'd')
('w', 'n')
('w', 'm')
('m', 'k')
('m', 'w')
The isolated nodes after connecting 'k' with 'm' and 'w' with 'm':
There are no isolated nodes.
The nodes after adding an edge between 's' and 'v':
b
c
d
n
k
w
m
s
v
The edges after adding an edge between 's' and 'v':
('b', 'd')
('b', 'n')
('c', 'w')
('d', 'b')
('d', 'w')
('n', 'b')
('n', 'w')
('k', 'm')
('w', 'c')
('w', 'd')
('w', 'n')
('w', 'm')
('m', 'k')
('m', 'w')
('s', 'v')
('v', 's')
Creating Graphs from Scratch
And now let’s create an empty graph, add a couple of nodes and edges to it and see what we get:
g = Graph()
# Now let's print all the nodes, all the isolated nodes and all the edges.
# There should be nothing:
print("The nodes:")
for node in g.nodes():
print(node)
print("\nThe isolated nodes:")
for isolated_node in g.isolated_nodes():
print(isolated_node)
print("\nThe edges:")
for edge in g.edges():
print(edge)
# Let's add a node. This time our nodes will be numbers.
g.add_node(1)
# Now 1 is an isolated node. Let's print all the nodes and all the
# isolated nodes again.
print("\nThe nodes after adding 1:")
for node in g.nodes():
print(node)
print("\nThe isolated nodes after adding 1:")
for isolated_node in g.isolated_nodes():
print(isolated_node)
# Let's add some edges:
g.add_edge(1, 3)
g.add_edge(2, 3)
g.add_edge(4, 5)
# Let's print the nodes and edges now:
print("\nThe nodes after adding some edges:")
for node in g.nodes():
print(node)
print("\nThe edges after adding some edges:")
for edge in g.edges():
print(edge)
Here’s the output:
The nodes:
The isolated nodes:
The edges:
The nodes after adding 1:
1
The isolated nodes after adding 1:
1
The nodes after adding some edges:
1
3
2
4
5
The edges after adding some edges:
(1, 3)
(3, 1)
(3, 2)
(2, 3)
(4, 5)
(5, 4)
Here’s the video version of the article: