Essential Algorithms For A Level Computer Science Algo
Computer Science
D Hillyard
C Sargent
Published by
CRAIGNDAVE LTD
12 Tweenbrook Avenue
Gloucester
Gloucestershire
GL1 5JY
United Kingdom
admin@craigndave.co.uk
Craig 'n' Dave
www.craigndave.org
2018
Every effort has been made to trace copyright holders and to obtain their permission for the use of
copyrighted material. We apologise if any have been overlooked. The authors and publisher will gladly
receive information enabling them to rectify any reference or credit in future editions.
ISBN: 978-1-7943594-2-0
About the authors
David Hillyard
David is a post-graduate qualified teacher with a Bachelor of Science (Honours) in Computing with Business
Information Technology. David has twenty years’ teaching experience in ICT and computing in three large
comprehensive schools in Cheltenham, Gloucestershire. He is an assistant headteacher, former head of
department, chair of governors and founding member of a multi-academy trust of primary schools.
Formerly subject leader for the Gloucestershire Initial Teacher Education Partnership (GITEP) at the University
of Gloucestershire, David has successfully led a team of PGCE/GTP mentors across the county.
His industry experience includes programming for the Ministry of Defence. A self-taught programmer, he
wrote his first computer game at ten years old.
Craig Sargent
Craig is a post-graduate qualified teacher with a Bachelor of Science (Honours) in Computer Science with
Geography. Craig has fourteen years’ teaching experience in ICT and computing in a large comprehensive
school in Stroud, Gloucestershire and as a private tutor. A head of department, examiner and moderator for
awarding bodies in England, Craig has authored many teaching resources for major publishers.
He also previously held roles as a Computing at School (CAS) Master Teacher and regional coordinator for the
Computing Network of Excellence.
His industry experience includes freelance contracting for a large high street bank and programming for the
Ministry of Defence. He also wrote his first computer game in primary school.
Preface
The aim of this book is to provide students and teachers of A level Computer Science with a comprehensive
guide to the algorithms students need to understand for examinations. Each chapter examines a data
structure or algorithm and includes an explanation of how it works, real-world applications, a step-by-step
example, pseudocode, actual code in two languages and a description of the space and time complexity.
Coded solutions are provided in Python 3 because it is the most popular language taught at GCSE, though we
also provide them in Visual Basic Console (2015 onwards) because it most closely resembles pseudocode
and coded examples that students will need to work with in examinations. These coded solutions can be
downloaded from craigndave.org/product/algorithms.
For those students studying other languages such as C++/C# or Java, it would be a great exercise to
translate the code presented in this book into those languages.
Each chapter has been carefully considered to ensure it matches the needs of students and the
requirements of examining bodies without being unnecessarily complex. Wherever possible, a consistent
approach has been adopted to make it easier to see how algorithms expressed in English and pseudocode
relate to real coded solutions. Therefore, some conventions have been adopted:
- ++, += and -= for incrementing and decrementing variables have been avoided, instead favouring x = x + 1.
The result is not necessarily the most efficient code but an implementation that is most suitable for the level
of study.
There are many ways to code these algorithms. Taking the depth-first search as an example, it can be coded
using iteration or recursion, with a dictionary, objects or arrays. That is six different implementations, but
even these are not exhaustive. When combined with a programmer’s own approach and the available
commands in the language, the number of possibilities for coding these algorithms is huge. It is important
that students recognise the underlying data structures, understand the way an algorithm works and can
determine the output from a piece of code. Therefore, the approaches and solutions presented in this book
are one solution, not the only solution.
Examination board mapping
[Table: a grid of ticks showing which topics each specification requires. Topics: arrays, lists, dictionaries, linked lists, stacks, queues, binary trees, trees/graphs, hash tables, linear search, binary search, bubble sort, insertion sort, merge sort, quick sort, Dijkstra’s shortest path, A* algorithm. Specifications: OCR GCSE (J276), (H046), (H446); AQA GCSE (8520), (7516), (7517); WJEC GCSE (C00/1157/9), (601/5391/X), (601/5345/3); Cambridge IGCSE (0984); (1CP1), (4CP0); (601/7343/9), (603/0445/5), (601/7342/7); (603/0472/8), (603/0471/6).]
Contents
FUNDAMENTAL DATA STRUCTURES
Array
Dictionary
List
In summary
SEARCHING ALGORITHMS
Binary search
Hash table search
Linear search
In summary
FUNDAMENTAL DATA
STRUCTURES
A feature of imperative programming languages, these abstractions of memory allow for the
creation of higher order data structures.
Essential algorithms for A level Computer Science
Array
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
An array is a collection of items (called elements) of the same data type. An array is used to hold a collection
of data that would otherwise be stored in multiple variables. For example, to store the names of four people,
you could have four variables: name1, name2, name3 and name4. The problem with this approach is that it
would not be possible to refer to a specific name or loop through all the names with a for or while command.
That is because the number or index is part of the identifier (name of the variable). Instead, we need to
declare name1 as name[1] and so on. Note how the index is now enclosed in brackets. Some programming
languages use curved brackets while others use square brackets.
Assigning data to the array can be done using syntax similar to:
name[0] = "Craig"
name[1] = "Dave"
name[2] = "Sam"
name[3] = "Carol"
Notice with four names, the maximum index is 3. That is because arrays are usually zero-indexed — i.e., the
first item is stored at index zero. It is not necessary to do this; you could start at index 1, but why waste
memory unnecessarily?
Now it is possible to use an iteration to output all the data in the array:
For index = 0 to 3
Print(name[index])
Next
This is an extremely useful algorithm that you need to understand at all levels of study. The alternative code
would be:
Print(name1)
Print(name2)
Print(name3)
Print(name4)
It is a misconception that using an array instead of individual variables is more memory-efficient. The same
amount of data still needs to be stored, but an array makes the algorithm scalable. That means we only
need to change the number 3 in the iteration to change the number of names output. You can see how
implementing code with a thousand names without an iteration would not only be time-consuming but also
impractical.
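The iteration above can be sketched in Python as follows. This is a minimal illustration rather than the book's downloadable solution: it assumes a Python list stands in for the array, and the helper name output_names is made up for the example.

```python
# A Python list stands in for the array of names (an assumption:
# Python lists are dynamic, but they are indexed in the same way).
name = ["Craig", "Dave", "Sam", "Carol"]


def output_names(names):
    """Print every element by index and return them in the order printed."""
    printed = []
    # range(len(names)) scales with the array, so no hard-coded 3 is needed.
    for index in range(len(names)):
        print(names[index])
        printed.append(names[index])
    return printed


result = output_names(name)
```

Adding a fifth name to the list requires no change to the loop, which is the scalability the text describes.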
1. All the data elements of an array must contain the same data type. For example, an array of strings
cannot contain an integer.
2. The size of an array is determined when it is declared — i.e., when the algorithm is written — and
cannot be changed when the program is running. This is known as a static data structure.
Arrays can be declared larger than they need to be so that data can be inserted into an empty index later.
This is the case when implementing stacks, queues and trees using arrays. It is not memory-efficient, and
you would need to use either a search or a variable called a pointer to keep track of the last item in the
structure.
The static restriction of arrays is due to memory needing to be allocated for the data. An index register is
used by the CPU to store the index of an array to be accessed, and by incrementing this register, you can
access the next item in the array. Therefore, all items in an array must occupy a contiguous memory space in
RAM in what is called the heap, a place in memory where data structures are stored.
Modern programming languages allow for the re-declaration of the size of an array at run-time. They do this
by finding a new place in the heap for the entire data structure and moving it to that location. Therefore, the
lines are somewhat blurred between arrays being static or dynamic data structures. During examinations,
you should always assume arrays to be static.
Arrays may have multiple indexes in what is known as a multi-dimensional array. An array of two dimensions is
referred to as a two-dimensional array. By using two dimensions, we can store a grid of data.
Applications of an array
Arrays are used in situations where you have many related data items that require the same data type, to be
accessed either by an index or a searching algorithm. Not all programming languages support arrays, and
lists have become a popular alternative due to their dynamic use of memory.
Operations on an array
Typical operations that can be performed on an array include:
• Re-declare: change the size of the array at run-time (not universally supported)
Craig’n’Dave videos
https://www.craigndave.org/algorithm-array
An array illustrated
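[Figure: the one-dimensional array Name holding Craig, Dave, Sam and Carol at indexes 0 to 3, and the two-dimensional array Board, a grid indexed 0 to 2 holding O and X values.]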
Notice that in the example of the two-dimensional array, the first index was chosen to represent the x position
in the grid and the second index represents the y position. This is a useful approach when representing
game-boards in memory, but it is not necessary. It is acceptable for the first index to represent the y position
and the second index to represent the x position. In memory, the grid doesn’t exist at all. Indexes are simply
memory addresses, so the structure itself is abstracted. It is up to the programmer how they visualise the
data structure in their own mind.
Efficiency of an array
[Table: time complexity (best case, average case, worst case) and space complexity]
Since an array is considered a static data structure, its memory footprint is always known at run-time and,
therefore, it has a constant space complexity of O(1).
Calculating the time complexity with an array is dependent on the algorithms being implemented with it.
Simply assigning or returning an element from an index can be done immediately without searching through
the array — so, at best, it has a constant time complexity of O(1).
However, when implementing a linear search on an array, the average time complexity becomes linear: O(n).
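The difference between the two cases can be seen in a short Python sketch. The function linear_search and its comparison counter are illustrative assumptions, added only to make the number of steps visible.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 if target is absent."""
    comparisons = 0
    for index in range(len(items)):
        comparisons = comparisons + 1
        if items[index] == target:
            return index, comparisons
    return -1, comparisons


name = ["Craig", "Dave", "Sam", "Carol"]

# Direct access by index: one step regardless of the array size, O(1).
third = name[2]

# Linear search: the number of comparisons grows with the position, O(n).
found_at, steps = linear_search(name, "Carol")
missing_at, missing_steps = linear_search(name, "Zoe")
```

Finding "Carol" takes four comparisons because she is the last of four items, and a missing name also costs four, one for every element.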
Dictionary
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
A dictionary is a data structure used for storing related data. It is often referred to as an associative
array, as it stores two sets of data that are associated with each other by mapping keys to values. It is
also known as a key-value pair or hashmap. A dictionary is an implementation of a hash table search
but is usually a fundamental data type in high-level languages. Instead of a programmer having to
code a hash table search, high-level languages provide the dictionary data type within the language.
The single value in a key-value pair can also be a list of items, making this a very versatile data
structure for a programmer.
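A minimal Python sketch of these ideas, using the built-in dict type. The variable names and the extra place names are illustrative only.

```python
# Keys (countries) map to values (capital cities).
capitals = {"England": "London", "France": "Paris", "Germany": "Berlin"}

# Add a new key-value pair, then look up a value by its key.
capitals["Canada"] = "Ottawa"
berlin = capitals["Germany"]

# The value in a key-value pair can itself be a list of items.
cities = {"England": ["London", "Gloucester"], "France": ["Paris"]}
cities["England"].append("Stroud")

# Looking up a missing key with [] raises KeyError;
# get() returns a default value instead.
unknown = capitals.get("Spain", "not found")
```

No searching code is written by the programmer here: the language performs the hash table look-up behind the square brackets.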
Applications of a dictionary
Dictionaries are used as an extremely efficient alternative to searching algorithms. They are also
suitable for databases. For example, NoSQL — an acronym for “not only SQL” — makes extensive use
of dictionaries. Symbol tables in the syntax analysis stage of compilation make use of dictionaries, as
do dictionary encoding compression algorithms. Objects in JavaScript and JSON, a data-interchange
standard, also make use of the data structure. Dictionaries are also ideal for implementing depth-first and
breadth-first searching algorithms with graph data structures, and are useful in any situation where an
efficient value look-up or search is required, such as telephone numbers in contact lists.
Operations on a dictionary
Typical operations that can be performed on a dictionary include:
Craig’n’Dave video
https://www.craigndave.org/algorithm-dictionary
3. If the key exists, return the matching value; if not, return an error
Dictionary illustrated
If illustrations help you remember, this is how you can picture a dictionary:
Key        Value
England    London
France     Paris
Germany    Berlin
USA        Washington DC
Canada     Ottawa

Find key: Germany
Return value: Berlin
It is worth noting that, unlike an array or list, the position of an item in a dictionary is determined by the
hashing function rather than by the order in which items were added. Even though England was entered first
into the dictionary in the code, it is not necessarily the first item in the structure. Therefore, it doesn’t make
sense to try to return an item from an index. Some languages support such commands, but the returned value
often seems like a random item. Some languages, including Python from version 3.7, do preserve insertion
order, but this should not be assumed in general.
[Table: time complexity (best case, average case, worst case) and space complexity]
The dictionary data type aims to be extremely efficient, returning a value without having to search for the key
in the data set. The implementation of the dictionary data type is abstracted from the programmer. That
means we cannot know exactly how the programming language is facilitating the actual storage and retrieval
of items from the dictionary. Therefore, it is not possible to accurately determine the efficiency. However, the
aim with all key-value pair approaches is to achieve an average constant time complexity — O(1) — for
searching, adding and deleting data from a dictionary.
List
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
A list is a collection of elements of different data types. In practice, a list is either an array in disguise or an
implementation of a linked list (see higher order data structures). An array is typically used if the limits of the
structure are known when the program is written — for example, we know chess is played on an 8x8 board.
Conversely, a list is used when the number of items is more variable such as a player’s inventory in an RPG
game. In reality, either an array or a list can usually be used for the same purpose. Some programming
languages support both arrays and lists while some only support one or the other.
Although a list should support elements of different data types, some programming languages require items
in a list to be of the same data type, largely because the list is implemented as an array with methods to add
and delete data from the structure. Two-dimensional arrays can also be implemented as a list of lists:
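A minimal Python sketch of a list of lists, assuming a 3x3 noughts-and-crosses style board. The board size and its contents are illustrative only.

```python
# A 3x3 grid built as a list of three row lists ("" marks an empty cell).
board = [["" for column in range(3)] for row in range(3)]

# The first index selects a row list, the second an item within that row.
board[0][1] = "O"
board[0][2] = "X"
board[1][1] = "O"
board[2][0] = "X"

# A nested loop visits every cell in the grid.
filled = 0
for row in range(3):
    for column in range(3):
        if board[row][column] != "":
            filled = filled + 1
```

Building the rows with a comprehension gives three separate row lists. Writing [[""] * 3] * 3 would instead repeat the same row object three times, so a change to one row would appear in all of them.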
Unlike arrays, a list is considered to be a dynamic data structure because the data it holds increases and
decreases the size of the structure at run-time. As a result, lists are usually considered more memory-
efficient than arrays, although this does depend entirely on the purpose and use of the structure.
Unlike arrays, which use an index with contiguous memory space, lists can occupy any free memory space
and do not need to be stored contiguously. Instead, each item in the list points to the memory location of the
next item in the structure. A truly dynamic list takes memory from the heap (a place in memory where data
structures are stored) as items are added to the list and returns memory to the heap when items are deleted.
However, in some programming languages, lists are actually arrays in disguise, so it is possible to use an
index to refer to a particular element too. Equally, a searching algorithm such as a linear search can be
used to retrieve a specific element. On a true list, reaching an element at a given position means following
the pointers from the start, which is less efficient than using an index to specify a relative address for an
element of an array.
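To show why retrieval by position is slower on a true list, here is a minimal Python sketch of a linked list. The class name ListNode and the function get_item are assumptions made for the example, not part of any library.

```python
class ListNode:
    """One item in a linked list: some data plus a pointer to the next item."""

    def __init__(self, data):
        self.data = data
        self.next = None  # None means there is no next item


def get_item(start, position):
    """Follow the pointers from the start to reach the item at a position."""
    current = start
    steps = 0
    while current is not None and steps < position:
        current = current.next
        steps = steps + 1
    if current is None:
        raise IndexError("position is beyond the end of the list")
    return current.data


# Build the list Craig -> Dave -> Sam by linking nodes together.
start = ListNode("Craig")
start.next = ListNode("Dave")
start.next.next = ListNode("Sam")

third = get_item(start, 2)
```

Reaching position 2 means visiting positions 0 and 1 first, whereas an array can calculate the address of any element directly from its index.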
Applications of a list
Lists are used in situations where a set of related data items need to be stored, especially when the number
of items is variable and not known in advance. Some programming languages may not support lists —
instead, a programmer would need to create their own linked list structure.
Operations on a list
Typical operations that can be performed on a list include:
Due to the fact that lists are sometimes implemented with arrays, some programming languages facilitate
the use of indexes with lists. Therefore, it might be possible to retrieve or assign a specific item in a list by
using the index.
Craig’n’Dave videos
A list illustrated
Mutable means the size and data in a structure can change at run-time.
Efficiency of a list
[Table: time complexity (best case, average case, worst case) and space complexity]
Since a list is considered a dynamic data structure, its memory footprint is not known in advance, so it has a
linear space complexity: O(n).
Calculating the time complexity with a list depends on the algorithms being implemented with it. Adding an
item to the start of a list can be done immediately, so at best, it has a constant time complexity: O(1). Adding
an item into the list or removing an item requires the item to be found. Therefore, a linear search would be
required, resulting in linear complexity: O(n).
With lists of lists, if a nested loop is required to iterate over all the elements, it becomes polynomial: O(n²).
However, this is an uncommon use of a list, so its worst-case time complexity is usually stated as linear: O(n).
In summary
Array | Dictionary | List
Elements are all of the same data type | Stores key-value pairs | Elements can be different data types in some languages
Elements are stored in contiguous memory space, making use of an index register | Values are stored in a location determined by a hashing function | Elements are stored in any free memory space, making use of pointers to link elements together
Uses an index to retrieve an element at a specific location | Uses a hash table search to retrieve a value from a key | Uses a linear search to retrieve an element at a specific location
Indexes are not deleted, so when | Key-value pairs can be deleted | Elements can be deleted
HIGHER ORDER DATA
STRUCTURES
Supported by some programming languages within the command set, programmers often
implement these structures using the fundamental data structures as building blocks.
Binary Tree
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
A binary tree is a data structure consisting of nodes and pointers. It is a special case of a graph where each
node can only have zero, one or two pointers. Each pointer connects to a different node.
The first node is known as the root node. The connected nodes are known as child nodes, while the nodes at
the very bottom of the structure are referred to as leaf nodes.
[Figure: a binary tree with the root node and its child nodes B and G labelled.]
Since a binary tree is essentially a graph, the nodes and pointers can also be described as vertices and
edges, respectively. Binary trees can be represented in memory with dictionaries:
tree = {"E":["B","G"],"B":["A","C"],"G":["F","H"],"A":[],"C":[],"F":[],"H":[]}
See the “Graphs” chapter for an alternative implementation with associated breadth and depth-first
algorithms that can also be applied to binary trees.
It is more common to see a binary tree represented as objects consisting of a node with a left and right
pointer. Typically, when implementing the binary tree, a node has its left and/or right pointer set to be
nothing or null if there is no child node.
• Breadth-first search: traverses the tree, starting at the root node, visiting each node at the same level
before going deeper into the structure
A traversal refers to the process of visiting each node in a binary tree only once. It is also a term commonly
used with graphs. A traversal is required to find, update and output the data in a node.
Craig’n’Dave videos
https://www.craigndave.org/algorithm-binary-tree
a. The new node becomes the first item. Create a start pointer to it.
b. If the new node should be placed before the current node, follow the left pointer.
c. If the new node should be placed after the current node, follow the right pointer.
e. If the new node should be placed before the current node, set the left pointer to be the new
node.
f. If the new node should be placed after the current node, set the right pointer to be the new
node.
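The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class name Node and the function add are chosen for the example, and the function returns the root so that it works on an empty tree without a separate start variable.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left_pointer = None
        self.right_pointer = None


def add(start, item):
    """Insert item into the tree and return the (possibly new) root node."""
    new_node = Node(item)
    if start is None:
        # Tree is empty: the new node becomes the first item.
        return new_node
    current_node = start
    previous = None
    # Walk down the tree until there is no pointer left to follow.
    while current_node is not None:
        previous = current_node
        if item < current_node.data:
            current_node = current_node.left_pointer
        else:
            current_node = current_node.right_pointer
    # Attach the new node to the last node visited.
    if item < previous.data:
        previous.left_pointer = new_node
    else:
        previous.right_pointer = new_node
    return start


root = None
for letter in ["E", "B", "G", "A", "C", "F", "H", "D"]:
    root = add(root, letter)
```

Adding D last follows the left pointer from E, the right pointer from B and then the right pointer from C, where it is attached as a new leaf.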
[Figure: the binary tree with root node E, child nodes B and G, and leaf nodes A, C, F and H.]
Start at the root node. D is less than E; follow the left pointer. D is more than B; follow the right pointer. D is
A self-balancing operation decreases the efficiency of adding and deleting items but
increases the efficiency of searching the structure. In this case, it is a very efficient way
to implement a dictionary.
Else
previous_node.right_pointer = new_node
End If
End If
End If
2. While the current node exists and it is not the one to be deleted:
b. If the item to be deleted is less than the current node, follow the left pointer.
c. If the item to be deleted is greater than the current node, follow the right pointer.
Assuming the node exists and therefore has been found, there are three possibilities that need to be
considered when deleting a node from a binary tree:
3. If the previous node is greater than the current node, the previous node’s left pointer is set to null.
4. If the previous node is less than the current node, the previous node’s right pointer is set to null.
a. Set the previous node’s left pointer to the current node’s left child.
a. Set the previous node’s right pointer to the current node’s right child.
In this situation, we can make use of the fact that the data in a binary tree can be represented in a different
way. For example:
[Figure: the same data arranged as binary trees in different ways.]
One approach to deleting node G is to find the smallest value in the right sub-tree (known as the successor)
and move it to the position occupied by G. The smallest value in the right sub-tree will always be the left-most
node in the right sub-tree. The approach is known as the Hibbard deletion algorithm. In the example above,
there is a special case because there is no left sub-tree from node H. Therefore, we can move H into the
position occupied by G.
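[Figure: node E with right child G, which has children F and H, becomes node E with right child H, which has left child F.]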
Notice the tree has become unbalanced, as F could now be the root node with E to the left and H to the right.
Therefore, there is an impact on the efficiency of algorithms on binary trees when nodes are added and
deleted over time.
As with all algorithms, there are alternative approaches, depending on how the structure is implemented by
the programmer. One alternative could be to use the predecessor node (the right-most node in the left sub-
tree) instead of the successor node. A simple alternative would be to introduce another attribute to each
node to flag whether the node is deleted or not and skip deleted nodes when traversing the tree. However,
this approach does increase the space complexity as nodes are added and deleted.
Hibbard deletion
7. If a right node exists, find the smallest node in the right sub-tree.
[Figure: deleting a leaf node from the binary tree.]
The node can simply be removed. The previous node’s left or right pointer is set to null.
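[Figure: the binary tree with root E and child nodes B and G, where B has a single child, C, and G has children F and H.]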
Here, B is being deleted. The previous node’s left or right pointer is set to the left or right pointer of the node
being deleted. In this case, E’s left pointer becomes C.
[Figure: the tree with root E, children B and G, and leaves A, C, F and H becomes a tree in which H has taken the place of G.]
In this special case, there is no left sub-tree from H, so G is replaced with H and its right pointer is set to
null. An example with a left sub-tree is shown below.
[Figure: the tree before and after F is deleted; H, the left child of J, takes the place of F.]
Here F is illustrated as being deleted. The lowest node in the right sub-tree is H. F is replaced with H, and J’s
left pointer is set to null. However, it is a mistake to assume that it’s always the first left node from the right
sub-tree that is found. That is not the case. It is the left-most node that is swapped. Therefore, if H had
a left pointer, that would be followed until a node with no left pointer is found. If that node has a right child,
the child takes its place rather than being lost.
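The three deletion cases can be sketched compactly in Python. Note the assumptions: this version is recursive, which is a different style from an iterative approach that tracks a previous node, and the names Node, delete and inorder are chosen for the example.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left_pointer = left
        self.right_pointer = right


def delete(node, item):
    """Delete item from the sub-tree rooted at node; return the new sub-tree root."""
    if node is None:
        return None  # item not found: nothing to change
    if item < node.data:
        node.left_pointer = delete(node.left_pointer, item)
    elif item > node.data:
        node.right_pointer = delete(node.right_pointer, item)
    elif node.left_pointer is None:
        # Leaf node, or only a right child: the right child (or None) replaces it.
        return node.right_pointer
    elif node.right_pointer is None:
        # Only a left child: the left child replaces it.
        return node.left_pointer
    else:
        # Two children: the successor is the left-most node of the right sub-tree.
        successor = node.right_pointer
        while successor.left_pointer is not None:
            successor = successor.left_pointer
        # Copy the successor's value up, then remove the successor itself.
        node.data = successor.data
        node.right_pointer = delete(node.right_pointer, successor.data)
    return node


def inorder(node):
    """Return the data in the tree in sorted (left, node, right) order."""
    if node is None:
        return []
    return inorder(node.left_pointer) + [node.data] + inorder(node.right_pointer)


root = Node("E", Node("B", Node("A"), Node("C")), Node("G", Node("F"), Node("H")))

root = delete(root, "G")      # special case: H has no left sub-tree
after_g = inorder(root)

root = delete(root, "E")      # successor F is the left-most node under H
after_e = inorder(root)
```

Because the successor is removed by the same function, a successor that happens to have a right child keeps that child in the tree.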
3. Follow the left pointer and repeat from step 2 recursively until there is no pointer to follow.
4. Follow the right pointer and repeat from step 2 recursively until there is no pointer to follow.
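[Figure: the binary tree (root E; children B and G; leaves A, C, F and H) with a marker on the left side of each node.]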
Note the markers on the left side of each node. As you traverse the tree, starting from the root, the nodes
are only output when you pass the marker: E, B, A, C, G, F, H. You can illustrate like this in exams to
demonstrate your understanding of the algorithm.
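A pre-order traversal can be sketched in Python using the dictionary representation of this tree. It assumes each key maps to a list holding the left child followed by the right child, and the function name preorder is illustrative.

```python
# Each key maps to [left child, right child]; an empty list marks a leaf.
tree = {"E": ["B", "G"], "B": ["A", "C"], "G": ["F", "H"],
        "A": [], "C": [], "F": [], "H": []}


def preorder(tree, node, visited=None):
    """Output the node first, then its left sub-tree, then its right (NLR)."""
    if visited is None:
        visited = []
    visited.append(node)          # output the node
    for child in tree[node]:      # left child first, then right child
        preorder(tree, child, visited)
    return visited


order = preorder(tree, "E")
```

The returned list is in the same order as the nodes are output in the walk-through above.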
2. Follow the left pointer and repeat from step 2 recursively until there is no pointer to follow.
4. Follow the right pointer and repeat from step 2 recursively until there is no pointer to follow.
[Figure: the binary tree (root E; children B and G; leaves A, C, F and H) with a marker on the bottom of each node.]
Note the markers on the bottom of each node. As you traverse the tree, starting from the root, the nodes are
only output when you pass the marker: A, B, C, E, F, G, H. You can illustrate like this in exams to demonstrate
your understanding of the algorithm.
To output the nodes in reverse order, you simply reverse the algorithm by following the right pointers before
outputting the node, and then following the left pointers.
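Both the in-order traversal and its reverse can be sketched in Python with one function. This assumes a dictionary in which every node has either two children, listed left then right, or none. The reverse parameter is an illustrative addition.

```python
# Each key maps to [left child, right child]; an empty list marks a leaf.
tree = {"E": ["B", "G"], "B": ["A", "C"], "G": ["F", "H"],
        "A": [], "C": [], "F": [], "H": []}


def inorder(tree, node, reverse=False):
    """Left sub-tree, node, right sub-tree (LNR); reverse=True gives RNL."""
    children = tree[node]
    if not children:
        return [node]
    left, right = children
    if reverse:
        # Follow the right pointers before the left pointers.
        left, right = right, left
    return inorder(tree, left, reverse) + [node] + inorder(tree, right, reverse)


ascending = inorder(tree, "E")
descending = inorder(tree, "E", reverse=True)
```

Swapping the order in which the two sub-trees are followed is the only change needed to output the data in descending order.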
2. Follow the left pointer and repeat from step 2 recursively until there is no pointer to follow.
3. Follow the right pointer and repeat from step 2 recursively until there is no pointer to follow.
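[Figure: the binary tree (root E; children B and G; leaves A, C, F and H) with a marker on the right side of each node.]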
Note the markers on the right side of each node. As you traverse the tree, starting from the root, the nodes
are only output when you pass the marker: A, C, B, F, H, G, E. You can illustrate like this in exams to
demonstrate your understanding of the algorithm.
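A post-order traversal can be sketched in Python in the same way, again assuming a dictionary whose values list the left child before the right child. The function name postorder is illustrative.

```python
# Each key maps to [left child, right child]; an empty list marks a leaf.
tree = {"E": ["B", "G"], "B": ["A", "C"], "G": ["F", "H"],
        "A": [], "C": [], "F": [], "H": []}


def postorder(tree, node):
    """Left sub-tree, then right sub-tree, then the node itself (LRN)."""
    visited = []
    for child in tree[node]:      # left child first, then right child
        visited = visited + postorder(tree, child)
    visited.append(node)          # output the node last
    return visited


order = postorder(tree, "E")
```

The root is always the last node output, because both of its sub-trees must be finished first.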
b. If the current node has a left child, enqueue the left node.
c. If the current node has a right child, enqueue the right node.
[Figure: the binary tree (root E; children B and G; leaves A, C, F and H).]
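The enqueue steps above can be sketched in Python with a queue from the standard library. The dictionary representation, with the left child listed before the right child, and the function name breadth_first are assumptions made for the example.

```python
from collections import deque

# Each key maps to [left child, right child]; an empty list marks a leaf.
tree = {"E": ["B", "G"], "B": ["A", "C"], "G": ["F", "H"],
        "A": [], "C": [], "F": [], "H": []}


def breadth_first(tree, root):
    """Visit the nodes level by level, using a queue of nodes still to visit."""
    visited = []
    queue = deque([root])
    while queue:
        current = queue.popleft()     # dequeue the next node
        visited.append(current)       # output it
        for child in tree[current]:
            queue.append(child)       # enqueue the left child, then the right
    return visited


order = breadth_first(tree, "E")
```

Because children join the back of the queue, every node on one level is output before any node on the level below.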
class node:
    data = None
    left_pointer = None
    right_pointer = None
start = None
def delete(self,item):
#Using Hibbard's algorithm (leftmost node of right sub-tree is the successor)
                    previous.left_pointer = current_node.left_pointer
                else:
                    previous.right_pointer = current_node.left_pointer
            elif current_node.left_pointer == None:
                #Node has one right child
                #The deleted node is the left child if its parent holds a greater value
                if previous.data > current_node.data:
                    previous.left_pointer = current_node.right_pointer
                else:
                    previous.right_pointer = current_node.right_pointer
            else:
                #Node has two children
                right_node = current_node.right_pointer
                if right_node.left_pointer != None:
                    #Find the smallest value in the right sub-tree (successor node)
                    smallest = right_node
                    while smallest.left_pointer != None:
                        previous = smallest
                        smallest = smallest.left_pointer
                    #Change the deleted node value to the smallest value
                    current_node.data = smallest.data
                    #Remove the successor node, keeping any right child it has
                    previous.left_pointer = smallest.right_pointer
                else:
                    #Handle special case of no left sub-tree from right node
                    current_node.data = right_node.data
                    #Keep any right sub-tree of the node that moved up
                    current_node.right_pointer = right_node.right_pointer
    def preorder(self,current_node):
        if current_node != None:
    def inorder(self,current_node):
        if current_node != None:
            #Visit each node: LNR
            if current_node.left_pointer != None:
                self.inorder(current_node.left_pointer)
            print(current_node.data)
            if current_node.right_pointer != None:
                self.inorder(current_node.right_pointer)
    def postorder(self,current_node):
        if current_node != None:
            #Visit each node: LRN
            if current_node.left_pointer != None:
bt.inorder(bt.start)
bt.postorder(bt.start)
            new_node.right_pointer = Nothing
            'Tree is empty
            If IsNothing(current_node) Then
                start = new_node
            Else
                'Find correct position in the tree
                Dim previous As node
                While Not IsNothing(current_node)
                    previous = current_node
                    If item < current_node.data Then
                        current_node = current_node.left_pointer
                    Else
                        current_node = current_node.right_pointer
                    End If
                End While
                If item < previous.data Then
                    previous.left_pointer = new_node
                Else
                    previous.right_pointer = new_node
                End If
            End If
            'Report success for both the empty and the non-empty tree
            Return True
        Catch
            Return False
        End Try
    End Function
                Else
                    'Handle special case of no left sub-tree from right node
                    current_node.data = right_node.data
                    'Keep any right sub-tree of the node that moved up
                    current_node.right_pointer = right_node.right_pointer
                End If
            End If
        End If
    End Sub
    End Sub
End Class
        bt.delete("Delaware")
        bt.inorder(bt.start)
        Console.ReadLine()
    End Sub
End Module
Space complexity
Object implementation: O(n), linear
Array implementation: O(1), constant
A binary tree is a dynamic data structure, meaning its memory footprint grows and shrinks as data is added
and deleted from the structure when implemented using object-oriented techniques. Using only as much
memory as needed is the most efficient way of implementing a binary tree. Alternatively, you could
implement a binary tree using a dictionary or an array. With an array, you would be using a static data
structure, so the memory footprint would be static, but the size of the structure would also be limited.
Assuming the binary tree is not pre-populated, to establish the tree from a list of data items will require each
item to be added sequentially. However, the input order of the items does not matter — therefore, the number
of operations depends on the number of data items: O(n).
Adding, deleting and searching nodes on a binary tree uses a technique called divide and conquer. With a
balanced tree, the number of items to consider is halved each time a node is visited, providing logarithmic
complexity: O(log n). If the tree is unbalanced, it becomes the same as a linear search to determine the
location of an item to add or delete, demoting it to linear complexity: O(n).
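Balanced tree where each level is complete:
[Figure: root E; children B and G; leaves A, C, F and H.]
Unbalanced tree holding the same data:
[Figure: root A with right child B, which has right child C, and so on down a single branch.]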
In the best-case scenario, the node to be found is the root node, so it can always be found first: O(1).
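The effect of balance on searching can be seen by counting the nodes visited. The Python sketch below is illustrative only: the names Node, add, build and nodes_visited are assumptions, and the two trees are built simply by adding the same letters in different orders.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left_pointer = None
        self.right_pointer = None


def add(root, item):
    """Add item to the tree and return the root of the tree."""
    if root is None:
        return Node(item)
    if item < root.data:
        root.left_pointer = add(root.left_pointer, item)
    else:
        root.right_pointer = add(root.right_pointer, item)
    return root


def build(items):
    root = None
    for item in items:
        root = add(root, item)
    return root


def nodes_visited(root, item):
    """Count the nodes visited while searching for item."""
    count = 0
    current = root
    while current is not None:
        count = count + 1
        if item == current.data:
            return count
        if item < current.data:
            current = current.left_pointer
        else:
            current = current.right_pointer
    return count


balanced = build(["E", "B", "G", "A", "C", "F", "H"])
unbalanced = build(["A", "B", "C", "E", "F", "G", "H"])

balanced_steps = nodes_visited(balanced, "H")      # visits E, G, H
unbalanced_steps = nodes_visited(unbalanced, "H")  # visits every node
best_case = nodes_visited(balanced, "E")           # the root itself
```

With seven items, finding H takes three visits in the balanced tree but seven in the unbalanced one, which behaves like a linear search.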
The traversal algorithms all require each node to be visited. As the number of nodes in the tree increases, so
too does the execution time. Therefore, traversals are of linear complexity: O(n).
Graph
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
A graph is a data structure consisting of nodes (vertices) and pointers (edges). It differs from a linked list and
binary tree because each vertex can have more than one or two edges and point to any vertex in the data
structure. The edges can either point in one direction, known as a directed graph, or without specifying a
direction (bidirectional), referred to as an undirected graph. Graphs can also be weighted, with each edge
given a value representing a relationship such as distance between the vertices. Although it is common to
refer to vertices and edges when discussing graphs, this is largely due to their application in mathematics. A
vertex can also be referred to as a node and an edge as a pointer.
[Figure: the graph with vertices A to G drawn twice, once undirected and once directed, with the labels "Node (Vertex)" and "Pointer (Edge)".]
Typically, the undirected graph illustrated can be represented with the following syntax in Python, although
this will differ from language to language:
graph = {"A":["B","C","D"],"B":["A","E"],"C":["A","D"],"D":["A","C","F"],"E":["B","G"],"F":["D"],"G":["E"]}
Edges { (A,B), (A,C), (A,D), (B,A), (B,E), (C,A), (C,D), (D,A), (D,C), (D,F), (E,B), (E,G), (F,D), (G,E) }
In the directed graph, each edge is stored in one direction only:
graph = {"A":["B","C","D"],"B":["E"],"C":["D"],"D":["F"],"E":["G"],"F":[],"G":[]}
It is also possible for each edge to have an edge value associated with it, typically also referred to as a cost.
This is necessary for some algorithms using graph data structures such as Dijkstra’s algorithm.
graph = {"A":{"B":2,"C":6,"D":3}}
This would mean the edge value between A and B is two, the edge value between A and C is six, and the edge
value between A and D is three. Edge values can represent many things between two vertices, including
distance, time or bandwidth, depending on what the data structure is being used for.
While it is typical to store graphs as objects or using a dictionary, an approach known as an adjacency list,
it is also possible to store a graph using an array, or a list of lists. This implementation is known as an
adjacency matrix, in which both the rows and the columns represent vertices and each cell records whether an
edge exists between the corresponding pair of vertices.
An example of an adjacency matrix for the undirected graph is shown below. The rows and columns are not
usually labelled with the vertices when implementing this method, but they are shown here to aid
understanding. A “1” represents an edge existing between the two vertices.
  A B C D E F G
A 0 1 1 1 0 0 0
B 1 0 0 0 1 0 0
C 1 0 0 1 0 0 0
D 1 0 1 0 0 1 0
E 0 1 0 0 0 0 1
F 0 0 0 1 0 0 0
G 0 0 0 0 1 0 0
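The relationship between the two representations can be sketched in Python. This is a minimal illustration, not the book's own listing: the helper name to_adjacency_matrix is hypothetical, and it assumes the rows and columns are ordered alphabetically by vertex.

```python
# Adjacency list for the undirected graph used in this section
graph = {"A": ["B", "C", "D"], "B": ["A", "E"], "C": ["A", "D"],
         "D": ["A", "C", "F"], "E": ["B", "G"], "F": ["D"], "G": ["E"]}

def to_adjacency_matrix(graph):
    """Hypothetical helper: build a list of lists with a 1 wherever an edge exists."""
    vertices = sorted(graph)                          # assumed row/column order: A to G
    index = {vertex: i for i, vertex in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for vertex, neighbours in graph.items():
        for neighbour in neighbours:
            matrix[index[vertex]][index[neighbour]] = 1
    return vertices, matrix

vertices, matrix = to_adjacency_matrix(graph)
for vertex, row in zip(vertices, matrix):
    print(vertex, *row)
```

Because the graph is undirected, the resulting matrix is symmetrical about its leading diagonal.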
Applications of a graph
Graphs have many uses in computer science — e.g., mapping road networks for navigation systems, storing
social network data, resource allocation in operating systems, representing molecular structures and
geometry.
Abstraction
It is worth noting that graphs and trees are closely related data structures, with a binary tree being a
special type of graph. Therefore, operations on a binary tree, such as a traversal, can also be performed on
graph structures.
In the illustrations below, we see how graph data can remain the same, even though the structure looks
different. What is important is not what the graph looks like but which vertices are connected. Using only the
necessary detail and discarding unnecessary detail is known as abstraction.
[Figure: the same graph drawn in two different layouts, labelled "Is the same as:". The vertices and the connections between them are identical in both drawings.]
This becomes more apparent when you consider how navigation systems store map data. They simply need
to know the type of each road, how long they are and which ones are connected. An obvious example of
abstraction is the map of the London Underground. It bears no resemblance to where the stations are
located within the city. All that is important is which stations are connected on which line.
Similarly, it is important that students don’t concern themselves with how a graph actually looks or what
operation is being asked for in an examination. Simply follow the algorithm on the structure given — e.g., a
pre-order traversal on a graph, when you might expect that only to be relevant to a binary tree.
Abstraction can often help to simplify a problem. This makes it easier for humans to understand and
communicate data. Abstractions are very useful in explaining complex situations. Sometimes links between
seemingly unrelated problems can be seen with abstractions. Computers do not benefit from abstract
illustrations.
Operations on a graph
Typical operations that can be performed on a graph include:
• Get edge value: returns the value of an edge between two vertices
• Set edge value: sets the value of an edge between two vertices
• Breadth-first search: traverses the graph, starting at the root node, visiting each neighbouring node
before moving to the vertices at the next depth
• Pre-order traversal: a type of depth-first search (see binary tree for an example)
• In-order traversal: a type of depth-first search (see binary tree for an example)
• Post-order traversal: a type of depth-first search (see binary tree for an example)
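The first two operations can be sketched in Python using a dictionary of dictionaries. This is a minimal sketch under stated assumptions: the function names get_edge_value and set_edge_value are hypothetical, only the edge values from A (2, 6 and 3) come from the text, and the graph is treated as undirected so each value is stored in both directions.

```python
# Weighted, undirected graph: only the A edges and their values come from the text
graph = {"A": {"B": 2, "C": 6, "D": 3},
         "B": {"A": 2}, "C": {"A": 6}, "D": {"A": 3}}

def get_edge_value(graph, start, end):
    """Return the value of the edge between two vertices, or None if no edge exists."""
    return graph.get(start, {}).get(end)

def set_edge_value(graph, start, end, value):
    """Set the edge value in both directions (assumes an undirected graph)."""
    graph.setdefault(start, {})[end] = value
    graph.setdefault(end, {})[start] = value

print(get_edge_value(graph, "A", "C"))
set_edge_value(graph, "A", "C", 4)
print(get_edge_value(graph, "C", "A"))
```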
Craig’n’Dave videos
https://www.craigndave.org/algorithm-graph
b. Repeat from step 2 until all the edges have been visited.
4. Dequeue the queue and set the item removed as the current vertex.
[Figure sequence: stepping through the breadth-first search on the undirected graph, starting at A. Each panel shows the list of visited vertices, the queue with its front (f) and back (b) pointers, and the graph. Captioned steps:]
Step 3 Enqueue C. Visited: A.
Step 4 Enqueue D. Visited: A.
Step 6 Enqueue E. Visited: AB.
Step 9 Enqueue F. Visited: ABCD.
Step 11 Enqueue G. Visited: ABCDE.
Final panel. Visited: ABCDEFG.
It is worth noting that there is more than one valid output from a breadth-first search. This implementation
examined the edges from A in the order B, C, D, from left to right. However, it would be perfectly valid to
examine them in reverse order too, from right to left. To achieve this, the edges would need to be enqueued
in reverse order.
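The effect of the enqueue order can be sketched in Python. This is an illustrative sketch rather than the book's listing: it uses collections.deque as the queue, the reverse parameter is a hypothetical addition, and it records a vertex as visited when it is enqueued, which produces the same output order as recording it when it is dequeued.

```python
from collections import deque

graph = {"A": ["B", "C", "D"], "B": ["A", "E"], "C": ["A", "D"],
         "D": ["A", "C", "F"], "E": ["B", "G"], "F": ["D"], "G": ["E"]}

def breadth_first_search(graph, start, reverse=False):
    """Return the vertices in breadth-first order, starting from start."""
    visited = [start]
    queue = deque([start])
    while queue:
        current_vertex = queue.popleft()          # dequeue from the front
        neighbours = graph[current_vertex]
        if reverse:
            neighbours = reversed(neighbours)     # examine edges right to left
        for neighbour in neighbours:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)           # enqueue at the back
    return visited

print("".join(breadth_first_search(graph, "A")))                 # ABCDEFG
print("".join(breadth_first_search(graph, "A", reverse=True)))   # ADCBFEG
```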
ABCDEFG: This is the most common output shown in examples of this algorithm, so it is the one you should
illustrate in exams unless the question specifically states otherwise.
ADCBFEG: Reverse traversal.
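#The outer class line is assumed from the call queue.current_vertex() below
class queue:
    class current_vertex:
        data = None
        pointer = None

    front_pointer = None
    back_pointer = None

    def enqueue(self,item):
        #Check queue overflow
        try:
            #Push the item
            new_current_vertex = queue.current_vertex()
            new_current_vertex.data = item
            #Empty queue
            if self.back_pointer is None:
                self.front_pointer = new_current_vertex
            #The rest of enqueue is cut off in the source; the lines below are an assumed completion
            else:
                self.back_pointer.pointer = new_current_vertex
            self.back_pointer = new_current_vertex
            return True
        except MemoryError:
            return False

    def dequeue(self):
        #Check queue underflow
        if self.front_pointer is not None:
            #Dequeue the item
            popped = self.front_pointer.data
            self.front_pointer = self.front_pointer.pointer
            #When the last item is dequeued reset the pointers
            if self.front_pointer is None:
                self.back_pointer = None
            return popped
        else:
            return None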
Function dequeue()
'Check queue underflow
If Not IsNothing(front_pointer) Then
'Dequeue the item
Dim popped As String = front_pointer.data
front_pointer = front_pointer.pointer
'When the last item is dequeued reset the pointers
If IsNothing(front_pointer) Then
back_pointer = Nothing
End If
Return popped
Else
Return Nothing
End If
End Function
End Class
Sub main()
Dim graph = New Dictionary(Of String, List(Of String)) From {
{"A", New List(Of String) From {"B", "C", "D"}},
{"B", New List(Of String) From {"A", "E"}},
{"C", New List(Of String) From {"A", "D"}},
{"D", New List(Of String) From {"A", "C", "F"}},
{"E", New List(Of String) From {"B", "G"}},
{"F", New List(Of String) From {"D"}},
{"G", New List(Of String) From {"E"}}
}
Dim visited As New List(Of String)
Dim q As New queue
Dim current_vertex As String = "A"
Do While current_vertex <> Nothing
4. Pop the stack and set the item removed as the current vertex.
It is worth noting that there is more than one valid output from a depth-first search. You will typically see the
left-most path being traversed first in mark schemes and other sources. However, it is perfectly valid to
follow the right-most path first or to choose any random edge from a vertex to follow. In doing so, the results
of the algorithm will be different, but nonetheless, it is still a depth-first search.
Providing the algorithm follows edges to the bottom of the structure for any vertex, the output is valid, as
shown below.
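A stack-based depth-first search can be sketched in Python as follows. This is a minimal sketch, not the book's listing: it uses a Python list as the stack and assumes that neighbours are pushed in reverse order so that the left-most path is followed first. A vertex may be pushed more than once, as happens with D here, and the duplicate is simply ignored when it is popped.

```python
graph = {"A": ["B", "C", "D"], "B": ["A", "E"], "C": ["A", "D"],
         "D": ["A", "C", "F"], "E": ["B", "G"], "F": ["D"], "G": ["E"]}

def depth_first_search(graph, start):
    """Return the vertices in depth-first order using an explicit stack."""
    visited = []
    stack = [start]
    while stack:
        current_vertex = stack.pop()              # pop from the top of the stack
        if current_vertex not in visited:
            visited.append(current_vertex)
            # Push in reverse so the left-most neighbour ends up on top
            for neighbour in reversed(graph[current_vertex]):
                if neighbour not in visited:
                    stack.append(neighbour)
    return visited

print("".join(depth_first_search(graph, "A")))    # ABEGCDF
```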
[Figure sequence: stepping through the depth-first search using a stack, starting at A. Each panel shows the list of visited vertices, the stack with an arrow marking the stack pointer, and the graph. Captioned steps:]
Step 2 Push D. Visited: A.
Step 3 Push C. Visited: A.
Step 4 Push B. Visited: A.
Step 6 Push E. Visited: AB.
Step 8 Push G. Visited: ABE.
Step 11 Push D. Although D is already on the stack, it has not been visited. This is resolved in the final step. Visited: ABEGC.
Step 13 Push F. Visited: ABEGCD.
Final panel. Visited: ABEGCDF.
Stepping through the depth-first search using recursion and the call
stack
Although the use of a user-defined stack delivers the correct output, using only the procedure call stack with
recursion is somewhat simpler. This approach is most likely to feature in mark schemes of depth-first
search-related questions.
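The recursive approach can be sketched in Python as follows. This is an illustrative sketch: the function name dfs matches the call used in the Visual Basic listing, but passing the visited list as a parameter is an assumption made here to keep the example self-contained.

```python
graph = {"A": ["B", "C", "D"], "B": ["A", "E"], "C": ["A", "D"],
         "D": ["A", "C", "F"], "E": ["B", "G"], "F": ["D"], "G": ["E"]}

def dfs(graph, current_vertex, visited=None):
    """Return the vertices in depth-first order using the call stack."""
    if visited is None:
        visited = []
    visited.append(current_vertex)
    for neighbour in graph[current_vertex]:
        if neighbour not in visited:
            # Each recursive call adds a frame to the call stack
            dfs(graph, neighbour, visited)
    return visited

print("".join(dfs(graph, "A")))    # ABEGCDF
```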
Step 1 Start at a root vertex. Push A to the stack and list of visited vertices.
[Figure sequence: stepping through the recursive depth-first search. Each panel shows the list of visited vertices, the call stack with an arrow marking the current call, and the graph. The visited list grows as follows: A, AB, ABE, ABEG, ABEGC, ABEGCD, ABEGCDF.]
#The outer class line is assumed from the call stack.current_vertex() below
class stack:
    class current_vertex:
        data = None
        pointer = None

    stack_pointer = None

    def push(self,item):
        #Check stack overflow
        try:
            #Push the item
            new_current_vertex = stack.current_vertex()
            new_current_vertex.data = item
            new_current_vertex.pointer = self.stack_pointer
            self.stack_pointer = new_current_vertex
            return True
        except MemoryError:
            return False

    def pop(self):
        #Check stack underflow
        if self.stack_pointer is not None:
            #Pop the item
            popped = self.stack_pointer.data
            self.stack_pointer = self.stack_pointer.pointer
            return popped
        else:
            return None
Function pop()
'Check stack underflow
If Not IsNothing(stack_pointer) Then
'Pop the item
Dim popped As String = stack_pointer.data
stack_pointer = stack_pointer.pointer
Return popped
Else
Return Nothing
End If
End Function
End Class
Sub main()
Dim graph = New Dictionary(Of String, List(Of String)) From {
{"A", New List(Of String) From {"B", "C", "D"}},
{"B", New List(Of String) From {"E", "A"}},
{"C", New List(Of String) From {"A", "D"}},
{"D", New List(Of String) From {"A", "C", "F"}},
{"E", New List(Of String) From {"B", "G"}},
{"F", New List(Of String) From {"D"}},
{"G", New List(Of String) From {"E"}}
}
Console.ReadLine()
End Sub
End Module
Sub main()
Dim graph = New Dictionary(Of String, List(Of String)) From {
{"A", New List(Of String) From {"B", "C", "D"}},
{"B", New List(Of String) From {"E", "A"}},
{"C", New List(Of String) From {"A", "D"}},
{"D", New List(Of String) From {"A", "C", "F"}},
{"E", New List(Of String) From {"B", "G"}},
{"F", New List(Of String) From {"D"}},
{"G", New List(Of String) From {"E"}}
}
dfs(graph, "A")
For Each current_vertex In visited
Console.WriteLine(current_vertex)
Next
Console.ReadLine()
End Sub
End Module
Efficiency of a graph
Space complexity of a graph
Object implementation    Array implementation
O(n) Linear              O(1) Constant
When implementing a graph as an array, the memory footprint remains constant, but operations such as
adding new vertices would require the matrix to be recreated. Graphs would not be stored using arrays if this
were a requirement, as the operation would be too slow. Therefore, we can assume a constant space
complexity at the expense of data structure flexibility.
Operation            Object implementation (adjacency list)    Array implementation (adjacency matrix)
Storing the graph    O(V+E) Linear                             O(V²) Polynomial
Add a vertex         O(1) Constant                             O(V²) Polynomial
Add an edge          O(1) Constant                             O(1) Constant
Remove a vertex      O(E) Linear                               O(V²) Polynomial
Remove an edge       O(V) Linear                               O(1) Constant
When expressing the time complexity of operations on a graph, V represents the number of vertices and E
represents the number of edges. Storing a graph using object-oriented techniques is far superior to using an
array in most cases, the only exceptions being removing edges and checking the adjacency of two vertices.
Both of these operations are constant, O(1), with an adjacency matrix, as the relevant element of the array
can be accessed directly by its index.
However, they are linear, O(V), with an object-oriented approach since it is necessary to follow the edges
to find a particular vertex.
The breadth-first search can be used to find a single vertex in the structure or output all the data stored in it.
At best, the graph contains just one vertex, so it can be found immediately, O(1), but if that were the case,
there wouldn't be any point in having the data structure at all.
Usually, many vertices need to be visited. A full traversal requires visiting every vertex and considering
every edge from each vertex, so with an adjacency list it is of linear complexity: O(V+E), where V represents
the number of vertices and E the number of edges. In the worst case, where every vertex is connected to every
other vertex, the number of edges approaches V², giving polynomial complexity: O(V²). Notice the nested loop
in the iteration and recursion examples, which is an indication of polynomial complexity.
Linked list
Specifications: OCR (AS Level, A Level), AQA (AS Level, A Level), WJEC (AS Level, A Level), BTEC (Nationals, Highers)
A linked list is a data structure that provides a foundation upon which other data structures can be built such
as stacks, queues, graphs and trees. A linked list is constructed from nodes and pointers. A start pointer
identifies the first node. Each node contains data and a pointer to the next node in the linked list. Many
programming languages support lists in addition to arrays. Data in lists can be stored anywhere in memory,
with pointers indicating the address of the next item in the structure.
[Figure: a linked list, with the labels "Node" and "Data".]
While a linked list can be implemented using a static array, its true benefit becomes evident when
implemented using object-oriented techniques.
Arrays, being static data structures, are stored contiguously in memory, requiring the use of an index register
By adding an extra pointer, nodes can point to the previous and next items, known as a doubly linked list:
By making the last node point to the first node, a circular linked list can be created:
A circular linked list can also have an additional pointer for each node pointing to the previous item, turning it
into a doubly circular linked list.
Craig’n’Dave videos
https://www.craigndave.org/algorithm-linked-list
a. The new node becomes the first item. Create a start pointer to it.
a. The new node becomes the first node. Change the start pointer to it.
b. If the data in the current node is less than the value of the new node:
ii. Repeat from step 5b until the correct position is found or the end of the linked list is
reached.
c. The new node is set to point where the previous node pointed.
[Figure: a linked list holding the items A, C, F, G and H.]
2. If the first item is the item to delete, set the start pointer to point to the next node.
d. Repeat from step 3b until the item is found or the end of the linked list is reached.
[Figure: a linked list holding the items A, C, F, G and H.]
5. Repeat from step 3 until the end of the linked list is reached.
Searching a linked list with commands such as if x in list abstracts the actual
process happening internally, which may be a linear search.
#The outer class line is assumed from the call linkedlist.node() below
class linkedlist:
    class node:
        data = None
        pointer = None

    start = None

    def add(self,item):
        #Check memory overflow
        try:
            new_node = linkedlist.node()
            new_node.data = item
            current_node = self.start
            #List is empty
            if current_node is None:
                new_node.pointer = None
                self.start = new_node
            else:
                #Item becomes the new start item
                #(<= rather than < so previous_node is always set in the loop below)
                if item <= current_node.data:
                    self.start = new_node
                    new_node.pointer = current_node
                else:
                    #Find correct position in the list
                    while current_node is not None and current_node.data < item:
                        previous_node = current_node
                        current_node = current_node.pointer
                    new_node.pointer = previous_node.pointer
                    previous_node.pointer = new_node
            return True
        except MemoryError:
            return False

    def delete(self,item):
        current_node = self.start
        #Check the list is not empty
        if current_node is not None:
            #Item is the start node
            if item == current_node.data:
                self.start = current_node.pointer
            else:
                #Find item in the list
                while current_node is not None and item != current_node.data:
                    previous = current_node
                    current_node = current_node.pointer
                #Unlink the node only if the item was found
                if current_node is not None:
                    previous.pointer = current_node.pointer

    def output(self):
        items = []
        current_node = self.start
        #Visit each node
        while current_node is not None:
            items.append(current_node.data)
            current_node = current_node.pointer
        return items
For example, if asked to show the steps of a sorting algorithm, do not write
“list.sort()” even though this works in languages such as Python.
new_node.pointer = Nothing
start = new_node
Else
'Item becomes the new start item
If item < current_node.data Then
start = new_node
new_node.pointer = current_node
Else
'Find correct position in the list
Dim previous_node As node
Do While Not IsNothing(current_node) AndAlso current_node.data < item
previous_node = current_node
current_node = current_node.pointer
Loop
new_node.pointer = previous_node.pointer
previous_node.pointer = new_node
End If
End If
Return True
Catch
Return False
End Try
End Function
Function output()
Dim items As New List(Of String)
Dim current_node As node = start
If Not IsNothing(current_node) Then
End Module
Object implementation    Array implementation
O(n) Linear              O(1) Constant
A linked list is a dynamic data structure. When implemented using object-oriented techniques, its memory
footprint grows and shrinks as data is added and deleted from the structure. Using only as much memory as
needed is the most efficient way of implementing a linked list.
However, a linked list could also be implemented using an array. In this case, the dynamic structure is
created upon a static data structure and the memory footprint remains constant. This method is inefficient
because the linked list will be reserving more memory than it needs unless it is full.
If the order of items is important, you can find the correct position to add or delete a node using a linear
search. At best, a new node becomes the first node, or the first node is the one to be deleted: O(1).
Typically, this will not be the case, resulting in linear complexity when finding the correct position to add
or delete a node: O(n).
If the order of the items in a linked list is not important, adding a new item will always have a time complexity:
O(1). In this situation, a pointer to the last item would be used to prevent having to follow the pointers from
node to node to find the position of the last item in the structure.
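The constant-time case can be sketched in Python. This is a minimal sketch under stated assumptions: the class names Node and UnorderedLinkedList and the attribute name last are hypothetical, chosen only to illustrate keeping a pointer to the last item.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.pointer = None

class UnorderedLinkedList:
    """Hypothetical unordered list that keeps a pointer to its last node."""
    def __init__(self):
        self.start = None
        self.last = None

    def append(self, item):
        # O(1): no need to follow the pointers from node to node
        new_node = Node(item)
        if self.start is None:
            self.start = new_node
        else:
            self.last.pointer = new_node
        self.last = new_node

    def output(self):
        # O(n): every node is visited
        items = []
        current_node = self.start
        while current_node is not None:
            items.append(current_node.data)
            current_node = current_node.pointer
        return items

states = UnorderedLinkedList()
for state in ["Florida", "Georgia", "Delaware"]:
    states.append(state)
print(states.output())
```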
Queue
Specifications: OCR (AS Level, A Level), AQA (AS Level, A Level), WJEC (AS Level, A Level), BTEC (Nationals, Highers)
A queue is a linear data structure. Items are “enqueued” at the back of the queue and “dequeued” from the
front of the queue. It is also possible to “peek” at the front item without deleting the item.
Imagine a queue at a checkout. The person at the front of the queue is served first, and people join the back
of the queue. This strict process can also allow for people to jump the queue — when implemented in
computer science, this is known as a priority queue. In special circumstances, new items can join the front or
the back of the queue.
A queue has a “back pointer” that always points to the last item in the queue, sometimes referred to as a tail
pointer. A queue also has a “front pointer” that always points to the first item in the queue, sometimes
referred to as a head pointer.
An attempt to enqueue an item to an already-full queue is called a queue overflow, while trying to dequeue
an item from an empty queue is called a queue underflow. Both should be considered before proceeding to
enqueue or dequeue the item in question.
[Figure: a queue holding the items F, G, D, A and C, shown both as linked nodes and as an array with indices 0 to 6.]
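An array-based queue with overflow and underflow checks can be sketched in Python as follows. This is a sketch under stated assumptions, not the book's listing: the class name ArrayQueue is hypothetical, a fixed-size Python list stands in for a static array of seven elements, and the pointers wrap around to reuse space (a circular queue), which the text above does not specify.

```python
class ArrayQueue:
    """Hypothetical fixed-capacity queue built on a list used as a static array."""
    def __init__(self, capacity=7):
        self.items = [None] * capacity
        self.front_pointer = 0
        self.back_pointer = -1
        self.size = 0

    def enqueue(self, item):
        # Check queue overflow
        if self.size == len(self.items):
            return False
        self.back_pointer = (self.back_pointer + 1) % len(self.items)
        self.items[self.back_pointer] = item
        self.size += 1
        return True

    def dequeue(self):
        # Check queue underflow
        if self.size == 0:
            return None
        item = self.items[self.front_pointer]
        self.items[self.front_pointer] = None
        self.front_pointer = (self.front_pointer + 1) % len(self.items)
        self.size -= 1
        return item

    def peek(self):
        # Return the front item without removing it
        if self.size == 0:
            return None
        return self.items[self.front_pointer]

q = ArrayQueue()
for item in ["F", "G", "D", "A", "C"]:
    q.enqueue(item)
print(q.peek())      # F
print(q.dequeue())   # F
print(q.dequeue())   # G
```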
Applications of a queue
Queues are used for process scheduling, transferring data between processors and printer spooling. They are
also used to perform breadth-first searches on graph data structures.
Operations on a queue
Typical operations that can be performed on a queue include:
• Peek: returning the value from the front of the queue without removing it
Craig’n’Dave videos
https://www.craigndave.org/algorithm-queue
4. If this is the first node in the list, the front pointer is set to point to the new node.
#The outer class line is assumed from the call queue.node() below
class queue:
    class node:
        data = None
        pointer = None

    front_pointer = None
    back_pointer = None

    def enqueue(self,item):
        #Check queue overflow
        try:
            #Push the item
            new_node = queue.node()
            new_node.data = item
            #Empty queue
            if self.back_pointer is None:
                self.front_pointer = new_node
            #The rest of enqueue is cut off in the source; the lines below are an assumed completion
            else:
                self.back_pointer.pointer = new_node
            self.back_pointer = new_node
            return True
        except MemoryError:
            return False

    def dequeue(self):
        #Check queue underflow
        if self.front_pointer is not None:
            #Pop the item
            popped = self.front_pointer.data
            self.front_pointer = self.front_pointer.pointer
            #When the last item is popped reset the pointers
            if self.front_pointer is None:
                self.back_pointer = None
            return popped
        else:
            return None

    def peek(self):
        #Check queue underflow
        if self.front_pointer is not None:
            #Peek the item
            return self.front_pointer.data
        else:
            return None
Function dequeue()
'Check queue underflow
If Not IsNothing(front_pointer) Then
'Dequeue the item
Function peek()
'Check queue underflow
If Not IsNothing(front_pointer) Then
'Peek the item
Return front_pointer.data
Else
Return Nothing
End If
Efficiency of a queue
Space complexity of a queue
Object implementation    Array implementation
O(n) Linear              O(1) Constant
A queue is a dynamic data structure. When implemented using object-oriented techniques, its memory
footprint grows and shrinks as data is enqueued and dequeued from the structure. Using only as much
memory as needed is the most efficient way of implementing a queue.
However, a queue could also be implemented using an array. In this case, the dynamic structure has been
created upon a static data structure, so the memory footprint remains constant. This method is inefficient
because the queue will be reserving more memory than it needs unless it is full.
Items are enqueued at the back of the queue and dequeued from the front. Therefore, the time complexity of
all operations on a queue is constant: O(1).
Stack
Specifications: OCR (AS Level, A Level), AQA (AS Level, A Level), WJEC (AS Level, A Level), BTEC (Nationals, Highers)
A stack is an essential data structure for the operation of a computer. Items are “pushed” onto the top of the
stack when added and “popped” off the top of the stack when deleted. It is also possible to “peek” at the top
item without deleting it.
Imagine a stack of coins. A coin can be added or removed from the top but not the middle. The only way of
accessing items in the stack is from the top.
A stack is known as a last-in, first-out or LIFO structure. The last item pushed must be the first item popped.
A stack has a “stack pointer” that always points to the node at the top of the stack.
An attempt to push an item to an already-full stack is called a stack overflow, while attempting to pop an item
from an empty stack is called a stack underflow. Both should be considered before proceeding to push or
pop the item in question.
A stack is often implemented using an array but can also be created using object-oriented techniques.
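[Figure: a stack holding the items F, G, D, A and C, with F at the bottom and C at the top, shown both as linked nodes and as an array with indices 0 to 6.]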
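An array-based stack with overflow and underflow checks can be sketched in Python as follows. This is a sketch under stated assumptions: the class name ArrayStack is hypothetical, and a fixed-size Python list stands in for a static array of seven elements, with the stack pointer holding the index of the top item.

```python
class ArrayStack:
    """Hypothetical fixed-capacity stack built on a list used as a static array."""
    def __init__(self, capacity=7):
        self.items = [None] * capacity
        self.stack_pointer = -1          # -1 means the stack is empty

    def push(self, item):
        # Check stack overflow
        if self.stack_pointer == len(self.items) - 1:
            return False
        self.stack_pointer += 1
        self.items[self.stack_pointer] = item
        return True

    def pop(self):
        # Check stack underflow
        if self.stack_pointer == -1:
            return None
        item = self.items[self.stack_pointer]
        self.items[self.stack_pointer] = None
        self.stack_pointer -= 1
        return item

    def peek(self):
        # Return the top item without removing it
        if self.stack_pointer == -1:
            return None
        return self.items[self.stack_pointer]

s = ArrayStack()
for item in ["F", "G", "D", "A", "C"]:
    s.push(item)
print(s.peek())   # C
print(s.pop())    # C
print(s.pop())    # A
```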
Applications of a stack
Stacks are used by processors to keep track of program flow. When a procedure or function (subroutine) is
called, it changes the value of the program counter to the first instruction of the subroutine. When the
subroutine ends, the processor must return the program counter to its value from before the subroutine was
called. We can achieve this using a stack, allowing subroutine calls to be nested. Local variables are also
held on the stack, which is why their value is lost when a subroutine ends — because they are popped off the
stack. Interrupts can also be handled with a stack in the same way.
Stacks are used to perform depth-first searches on graph data structures, undo operations when keeping
track of user inputs and backtracking algorithms — for example, pathfinding maze solutions. Stacks are also
used to evaluate mathematical expressions without brackets, using a shunting yard algorithm and reverse
Polish notation.
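Evaluating an expression that is already in reverse Polish notation can be sketched in Python using a list as the stack. This is an illustrative sketch only: the function name evaluate_rpn is hypothetical, tokens are assumed to be separated by spaces, and converting an infix expression with the shunting yard algorithm is not shown.

```python
def evaluate_rpn(expression):
    """Evaluate a space-separated reverse Polish notation expression."""
    operations = {
        "+": lambda left, right: left + right,
        "-": lambda left, right: left - right,
        "*": lambda left, right: left * right,
        "/": lambda left, right: left / right,
    }
    stack = []
    for token in expression.split():
        if token in operations:
            # The last item pushed is the right-hand operand
            right = stack.pop()
            left = stack.pop()
            stack.append(operations[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()

print(evaluate_rpn("3 4 + 2 *"))   # (3 + 4) * 2 = 14.0
```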
Operations on a stack
Typical operations that can be performed on a stack include:
• Peek: returning the value from the top of the stack without removing it
Craig’n’Dave videos
https://www.craigndave.org/algorithm-stack
In addition to pushing, popping and peeking items from a stack, a further operation called
“rotate” or “roll” can be used to move items around a stack. “A-B-C” on a stack
becomes “B-C-A,” where the first item has “rolled” to become the last item.
#The outer class line is assumed from the call stack.node() below
class stack:
    class node:
        data = None
        pointer = None

    stack_pointer = None

    def push(self,item):
        #Check stack overflow
        try:
            #Push the item
            new_node = stack.node()
            new_node.data = item
            new_node.pointer = self.stack_pointer
            self.stack_pointer = new_node
            return True
        except MemoryError:
            return False

    def pop(self):
        #Check stack underflow
        if self.stack_pointer is not None:
            #Pop the item
            popped = self.stack_pointer.data
            self.stack_pointer = self.stack_pointer.pointer
            return popped
        else:
            return None

    def peek(self):
        #Check stack underflow
        if self.stack_pointer is not None:
            #Peek the item
            return self.stack_pointer.data
        else:
            return None
Function pop()
'Check stack underflow
If Not IsNothing(stack_pointer) Then
'Pop the item
Dim popped As String = stack_pointer.data
stack_pointer = stack_pointer.pointer
Return popped
Else
Return Nothing
End If
End Function
Function peek()
'Check stack underflow
If Not IsNothing(stack_pointer) Then
'Peek the item
Return stack_pointer.data
Else
Return Nothing
End If
End Function
End Class
End Module
Pushing items to a stack object in Visual Basic Console
Dim items() As String = {"Florida", "Georgia", "Delaware", "Alabama", "California"}
Dim s As New stack
While these additional operations do not feature in the specification, showing examples
of them can make great exam questions to test your understanding of how a stack
works.
Efficiency of a stack
Space complexity of a stack
Object implementation    Array implementation
O(n) Linear              O(1) Constant
A stack is a dynamic data structure. When implemented using object-oriented techniques, its memory
footprint grows and shrinks as data is pushed onto and popped from the structure. Using only as much
memory as needed is the most efficient way of implementing a stack.
However, a stack could also be implemented using an array. In this case, the dynamic structure has been
created upon a static data structure, so the memory footprint remains constant. This method is inefficient
because the stack will be reserving more memory than it needs unless it is full.
Items are always pushed onto and popped from the top of the stack. Therefore, the time complexity of all
operations on a stack is constant: O(1).
In summary
Stack | Queue
Items are inserted (pushed) onto and deleted (popped) from the top of the structure | Items are inserted (enqueued) to the back of the structure.
One pointer at the top of the structure | Two pointers: one at the back of the structure and one at the front

Graph | Binary tree
Nodes can have more than two child nodes | A special case of a graph where nodes can only have up to two child nodes
Any node can be a root node: a starting point for breadth and depth-first searches | One root node: the starting point for all operations
Nodes often referred to as vertices; pointers referred to as edges | Nodes at the end of the structure referred to as leaf nodes

Linked list
SEARCHING ALGORITHMS
Routines that find data within a data structure.
Binary search
Specifications: OCR (AS Level, A Level), AQA (AS Level, A Level), WJEC (AS Level, A Level), BTEC (Nationals, Highers)
The binary search is an efficient algorithm for finding an item in a sorted list. To perform a binary search, start
at the middle item in the list and repeatedly divide the list in half.
Craig’n’Dave videos
https://www.craigndave.org/algorithm-binary-search
Binary search in simple-structured English
1. Start at the middle item in the list.
3. If the item to be found is lower than the middle item, discard all items to the right.
4. If the item to be found is higher than the middle item, discard all items to the left.
5. Repeat from step 2 until you have found the item or there are no more items in the list.
6. If the item has been found, output item data. If it has not, output “not found.”
Index:  0        1           2         3        4
Item:   Alabama  California  Delaware  Florida  Georgia
Step 1 Calculate the middle as the first index (0) + the last index (4) integer division by 2 = 2: Delaware
Step 2 Delaware is not the item to be found. California is lower. Discard all items to the right.
Step 3 Calculate the middle as the first index (0) + the last index (1) integer division by 2 = 0: Alabama
Step 4 Alabama is not the item to be found. California is higher. Discard all items to the left.
Step 5 Calculate the middle as the first index (1) + the last index (1) integer division by 2 = 1: California
Note how the number of items to be checked is halved after each comparison. This is what makes the binary
search so efficient, but it also explains why the items must be in order for the algorithm to work.
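The midpoints examined in this walkthrough can be reproduced with a short Python sketch. The function name binary_search_trace is hypothetical; it returns the position of the item together with the list of midpoints checked, so the halving can be seen directly.

```python
def binary_search_trace(items, item_to_find):
    """Return (position, midpoints checked); position is None if not found."""
    first = 0
    last = len(items) - 1
    midpoints = []
    while first <= last:
        midpoint = (first + last) // 2
        midpoints.append(midpoint)
        if items[midpoint] == item_to_find:
            return midpoint, midpoints
        if items[midpoint] < item_to_find:
            first = midpoint + 1      # discard all items to the left
        else:
            last = midpoint - 1       # discard all items to the right
    return None, midpoints

items = ["Alabama", "California", "Delaware", "Florida", "Georgia"]
position, midpoints = binary_search_trace(items, "California")
print(position, midpoints)   # 1 [2, 0, 1]
```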
Else
Print "Item not found"
End If
items = ["Alabama","California","Delaware","Florida","Georgia"]
item_to_find = input("Enter the state to find: ")
found = False
first = 0
last = len(items) - 1
while first <= last and not found:
    midpoint = (first + last) // 2
    if items[midpoint] == item_to_find:
        found = True
    else:
        if items[midpoint] < item_to_find:
            #Discard all items to the left
            first = midpoint + 1
        else:
            #Discard all items to the right
            last = midpoint - 1
if found:
    print("Item found at position",midpoint)
else:
    print("Item not found")
End If
End If
End While
If found = True Then
Console.WriteLine("Item found at position " & midpoint)
Else
Console.WriteLine("Item not found")
End If
Console.ReadLine()
End Sub
End Module
Time complexity
A binary search will usually be more efficient than a linear search, although it does require the data set to be
sorted if a binary tree is not used as the underlying data structure.
In the best case, the item to be found is either in the middle position of an array or at the root node of a
binary tree. In this special case, the algorithm has a time complexity of O(1) since the item to be found will
always be the first item checked. However, this is not usually the case.
In most cases, the time it takes to find an item increases with the size of the data set. However, because half
of the items can be discarded at a time, the algorithm is usually logarithmic: O(log n).
If a binary tree is used to store the data, it is possible that an unbalanced tree could be created:
[Figure: an unbalanced binary tree. 12 is the root, 36 is its right child and 78 is the right child of 36.]
In this worst case, finding the number 78 would require checking every item in the data set, degrading the
time complexity of the algorithm to O(n). This does not usually happen, and the goal is to use a binary tree
only when the data is likely to create a reasonably balanced structure. Alternatively, you may use a binary
tree to avoid sorting the data before applying the binary search.
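A minimal sketch of why insertion order matters, assuming a simple binary search tree with no rebalancing. The names Node, insert, build and comparisons_to_find are hypothetical helpers written for this illustration.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    """Insert value into a binary search tree (no rebalancing); return the root."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

def build(values):
    root = None
    for value in values:
        root = insert(root, value)
    return root

def comparisons_to_find(root, value):
    """Count the nodes visited while searching for value."""
    count = 0
    node = root
    while node is not None:
        count += 1
        if value == node.value:
            return count
        node = node.left if value < node.value else node.right
    return count

skewed = build([12, 36, 78])      # ascending order: every node goes right
balanced = build([36, 12, 78])    # middle value first: one node each side
print(comparisons_to_find(skewed, 78))    # 3
print(comparisons_to_find(balanced, 78))  # 2
```

With only three items the difference is small, but the skewed tree visits every node while the balanced tree halves the remaining items at each step.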
Hash table search
OCR OCR AQA AQA
The goal with a hash table is to find an item immediately, in a sorted or unsorted list, without the need to
compare other items in the data set. It is how many programming languages implement a dictionary data
structure. A hashing function is used to calculate the position of an item in a hash table.
Craig’n’Dave videos
https://www.craigndave.org/algorithm-hash-table-search
How hash tables work
Creating a hash table
A hashing function is applied to an item to determine a hash value — the position of the item in a hash table.
There are many different hashing functions in use today. A simple example might be to add up the ASCII
values of all the characters in a string and calculate the modulus of that value by the size of the hash table.
A hash table needs to be at least large enough to store all the data items but is usually significantly larger to
reduce the chance of the hashing function returning the same value for more than one item, known as a
collision. For example, with a table size of 10, the simple hashing function described above gives both
Florida and Delaware a hash value of 5.
Since two items of data cannot occupy the same position in the hash table, a collision has occurred.
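The simple hashing function described above (add up the character codes, then take the modulus by the table size) might be sketched as follows. The name calculate_hash matches the call used in the book's own listings; the body here is an illustration rather than the book's exact code.

```python
def calculate_hash(key, table_size):
    """Sum the character codes of key, then take the remainder."""
    total = 0
    for character in key:
        total += ord(character)
    return total % table_size

for state in ["Florida", "Delaware", "California"]:
    print(state, calculate_hash(state, 10))
# Florida 5
# Delaware 5   (a collision with Florida)
# California 6
```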
1. Be calculated quickly.
Resolving collisions
There are many strategies to resolve collisions generated from hashing functions. A simple solution is to
repeatedly check the next available space in the hash table until an empty position is found and store the
item in that location. This is known as open addressing. To find the item later, the hashing function delivers
the start position from which a linear search can then be applied until the item is found.
[Figure: a hash table with positions 1 to 10, showing Florida at position 5, Delaware at 6 and California at 7.]
In this example, we can see that Delaware has a hash value of 5, but that is occupied by Florida. Delaware is
therefore placed at 6, the next available position. California has a hash value of 6 but cannot occupy its
intended position, so it must be stored at the next available position, 7.
A disadvantage of probing each position in turn, known as linear probing, is that it can prevent other items
from being stored at their correct location in the hash table. It also results in what is known as clustering:
several positions being filled around common collision values.
Notice that with a table size of 10, two collisions occur. With a table size of 5, three collisions occur, resulting
in a less efficient algorithm but a reduced memory footprint. If the table size is increased to 11, no collisions
occur. With hashing algorithms, there is often a trade-off between the efficiency of the algorithm and the size
of the hash table.
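The trade-off between table size and collisions can be checked with a short sketch. It assumes the sum-of-character-codes hashing function, the insertion order Florida, Georgia, Delaware, Alabama, California, and wrap-around to the start of the table when the end is reached; build_table is a hypothetical helper.

```python
def calculate_hash(key, table_size):
    return sum(ord(character) for character in key) % table_size

def build_table(items, table_size):
    """Insert items using linear probing; return (table, collision count)."""
    table = [None] * table_size
    collisions = 0
    for item in items:
        h = calculate_hash(item, table_size)
        if table[h] is not None:
            collisions += 1
            while table[h] is not None:
                h = (h + 1) % table_size   # wrap-around is an added assumption
        table[h] = item
    return table, collisions

states = ["Florida", "Georgia", "Delaware", "Alabama", "California"]
for size in (5, 10, 11):
    table, collisions = build_table(states, size)
    print(size, collisions)
# 5 3
# 10 2
# 11 0
```

The sketch assumes the table is never asked to hold more items than it has positions; otherwise the probing loop would not terminate.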
A potential solution to clustering is to skip several positions before storing the item, resulting in more even
distribution throughout the table. A simple approach would be to skip to every third item. A variation of this,
known as quadratic probing, increases the number of items skipped with each jump — e.g., 1, 4, 9, 16.
It may also be necessary to increase the size of the hash table in the future and recalculate the new position
of all the items in the table.
The process of finding an alternative position for items in the hash table is known as rehashing.
An alternative method of handling collisions is to use a two-dimensional hash table. It is then possible for
more than one item to be placed at the same position. This is known as chaining:
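[Figure: a two-dimensional hash table with positions 1 to 10; Delaware is stored in row 1 of the position already holding Florida.]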
In this example, we can see that both Florida and Delaware can occupy the same position but a different
element of the 2D array.
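Chaining can be sketched with a list of lists, where each position holds every item that hashes to it. This assumes the same sum-of-character-codes hashing function; build_chained_table and chained_search are hypothetical names.

```python
def calculate_hash(key, table_size):
    return sum(ord(character) for character in key) % table_size

def build_chained_table(items, table_size):
    """Each position holds a list, so colliding items share a position."""
    table = [[] for _ in range(table_size)]
    for item in items:
        table[calculate_hash(item, table_size)].append(item)
    return table

def chained_search(table, key):
    """Return True if key is in the list its hash value selects."""
    return key in table[calculate_hash(key, len(table))]

states = ["Florida", "Georgia", "Delaware", "Alabama", "California"]
table = build_chained_table(states, 10)
print(table[5])                           # ['Florida', 'Delaware']
print(chained_search(table, "Delaware"))  # True
```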
Another possibility would be to use a second table for all collisions, known as an overflow table:
[Figure: a main hash table with positions 1 to 10 and an overflow table with positions 0 to 9 holding Delaware.]
4. Repeat from step 2 until the item is found or there are no more items in the hash table.
5. If the item has been found, output item data. If it has not, output “not found.”
Else
Print "Item not found"
End If
                h = h + 1
        hash_table[h] = items[index]
        print("Inserted",items[index],"at position",h)
    return hash_table
'Place data in hashing table
For index = 0 To items.Length - 1
h = calculate_hash(items(index), table_size)
If hash_table(h) <> "" Then
Console.WriteLine("Collision inserting " & items(index) & " at " & h)
'On collision insert in next available space
Do While hash_table(h) <> ""
h = h + 1
Loop
End If
hash_table(h) = items(index)
Console.WriteLine("Inserted " & items(index) & " at position " & h)
Next
Return hash_table
End Function
Else
found = True
End If
Loop
End If
End If
If found Then
Console.WriteLine("Item found at position " & h)
Else
Console.WriteLine("Item not found")
End If
End Sub
Hash searching can be used on a hash table stored in memory but is often used with
large data sets and is, therefore, more commonly used on files. Since it is necessary to
move directly to the item to be found and not search through the other items first, a
hash search can only be performed on direct-access storage such as a hard disk,
solid-state drive or RAM, and not on serial-access media such as magnetic tape.
Time complexity
The hash table search is usually the most efficient of all the searching algorithms, out-performing a linear
search and binary search on average. It achieves this by finding the position of an item immediately using a
hashing function, resulting in a constant time complexity of O(1) in both best and average cases.
However, if the item cannot be found immediately because of a collision, a variation of a linear search known
as linear probing must be used instead. In the worst case, this results in a time complexity of O(n) since each
item must be checked in turn until either the item is found or the end of the table is reached.
With hash searching, it is important to have an efficient hashing function that will calculate quickly but also
produce as few collisions as possible. This is often achieved by using a larger hash table than is needed to
store the data, increasing the number of unique positions and reducing the number of possible collisions.
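A minimal sketch of the search itself, assuming the table was filled using linear probing with the sum-of-character-codes hashing function and that no items have been deleted. The function hash_search and the hand-built table are illustrative.

```python
def calculate_hash(key, table_size):
    return sum(ord(character) for character in key) % table_size

def hash_search(table, key):
    """Return the position of key, or None if it is not in the table."""
    size = len(table)
    h = calculate_hash(key, size)
    for _ in range(size):          # never probe more than size positions
        if table[h] is None:
            return None            # empty position: the key was never stored
        if table[h] == key:
            return h
        h = (h + 1) % size         # linear probing (wrap-around is assumed)
    return None

table = [None] * 10
table[1], table[2] = "Alabama", "Georgia"
table[5], table[6], table[7] = "Florida", "Delaware", "California"
print(hash_search(table, "Florida"))     # 5, found immediately: O(1)
print(hash_search(table, "California"))  # 7, found after one extra probe
print(hash_search(table, "Texas"))       # None
```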
Linear search
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
The linear search finds an item in a sorted or unsorted list. To perform a linear search, start at the first item
in the list and check each item one by one. Think about searching for a card in a shuffled deck, starting with
the top card and checking each one until you find the card you want.
Craig’n’Dave videos
https://www.craigndave.org/algorithms-linear-search
1. Start at the first item in the list.
2. If the item in the list is the one to be found, the search is complete.
3. If it is not, move to the next item in the list.
4. Repeat from step 2 until the item is found or there are no more items in the list.
5. If the item has been found, output item data. If it has not, output “not found.”
Index: 0 1 2 3 4
Step 1 Start at the first item in the list. Compare California to Florida.
Florida is not the item to be found. Move to the next item in the list.
Step 4 Compare California to Alabama.
Alabama is not the item to be found. Move to the next item in the list.
items = ["Florida","Georgia","Delaware","Alabama","California"]
item_to_find = input("Enter the state to find: ")
index = 0
found = False
# Check each item in turn until it is found or the list ends
while not found and index < len(items):
    if items[index] == item_to_find:
        found = True
    else:
        index = index + 1
if found:
    print("Item found at position", index)
else:
    print("Item not found")
End If
Console.ReadLine()
End Sub
End Module
Beyond this, it would be better to use an alternative algorithm. If certain data items are
likely to be searched for more frequently, it would be better to place them towards the
beginning of the list to maximise the efficiency of the algorithm.
When programming a linear search, use a “while” statement if only one occurrence of
an item needs to be found and a “for” statement if all occurrences need to be found.
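The difference between the two loop styles can be sketched as follows. The function names find_first and find_all, and the list with a repeated state, are made up for this illustration.

```python
def find_first(items, target):
    """A while loop stops as soon as the first occurrence is found."""
    index = 0
    while index < len(items):
        if items[index] == target:
            return index
        index += 1
    return None

def find_all(items, target):
    """A for loop visits every item, so every occurrence is recorded."""
    positions = []
    for index in range(len(items)):
        if items[index] == target:
            positions.append(index)
    return positions

states = ["Florida", "Georgia", "Florida", "Alabama"]
print(find_first(states, "Florida"))  # 0
print(find_all(states, "Florida"))    # [0, 2]
```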
Time complexity
In the best case, the item to be found is the first on the list. In this situation, the linear search performs as
well as a binary search does when the item to be found is in the middle of the list: O(1).
That means the linear search can perform better than a binary search on very small lists.
In the worst case, the item to be found is last on the list or not in the list at all, so all the items need to be
checked: O(n). Typically, the item to be found will be somewhere in the data set and, as the data set grows,
more searching must be performed. Therefore, the algorithm also has an average-case complexity of O(n).
In summary
Binary search | Hash table search | Linear search
Items must be in order for the algorithm to work | Items are not stored in any useful order | Items do not need to be stored in order
Start at the middle item | Use a hashing function on the key | Start at the first item
Halve the set of items to search after each comparison until the item is found or there are no more items to check | Hashing function delivers the location of the item to be found unless there is a collision or it doesn’t exist | Search each item in sequence until the item is found or there are no more items to check
New items must be added in the correct place to maintain the order of the items (can be slow) | New items can be added with the hashing function determining the location (usually quick) | New items are added at the end (quick)
Suitable for a large number of items | Suitable for a large number of items | Suitable for a small number of items
SORTING ALGORITHMS
Routines that organise data in fundamental data structures.
Bubble sort
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
The bubble sort orders an unordered list of items by comparing each item with the next one and swapping
them if they are out of order. The algorithm is finished when a complete pass through the list is made with no
swaps. Each pass effectively “bubbles up” the largest (or smallest) remaining item to the end of the list.
Craig’n’Dave video
https://www.craigndave.org/algorithm-bubble-sort
Bubble sort in simple-structured English
1. Start at the first item in the list.
5. Repeat from step 2 until all the unsorted items have been compared.
6. If any items were swapped, repeat from step 1. Otherwise, the algorithm is complete.
    If items[index] > items[index+1] Then
      Swap(items[index], items[index+1])
      swapped = True
    End If
  End For
End While
'Output
Console.WriteLine(String.Join(", ", items))
Console.ReadLine()
End Sub
End Module
Time complexity: best case O(n), average case O(n²), worst case O(n²)
Space complexity: O(1)
In the best case, the data set is already ordered — in which case, only one pass is required to check all the
items in the list and no swaps will be made. Therefore, it is of linear complexity: O(n). This is not usually the
case, as the purpose of the algorithm is to sort data.
As the algorithm contains a nested loop, it usually has polynomial complexity, O(n²), because the time taken
by both the inner and the outer iteration increases with the size of the data set. However, it does not require
any additional memory. It can be performed on the data structure containing the data set, so the space
complexity is O(1).
The most efficient method of implementing a bubble sort stops the algorithm when no swaps are made.
Although this is not necessary, it is preferable and reduces execution time.
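A minimal sketch of a bubble sort that stops when a pass makes no swaps. Returning the number of passes is an addition made here to show the effect of the early exit; the variable name swapped follows the pseudocode.

```python
def bubble_sort(items):
    """Sort items in place; return the number of passes made."""
    passes = 0
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        for index in range(len(items) - 1):
            if items[index] > items[index + 1]:
                # Out of order: swap the adjacent items
                items[index], items[index + 1] = items[index + 1], items[index]
                swapped = True
    return passes

states = ["Florida", "Georgia", "Delaware", "Alabama", "California"]
print(bubble_sort(states))  # 4
print(states)  # ['Alabama', 'California', 'Delaware', 'Florida', 'Georgia']
print(bubble_sort(["Alabama", "California", "Delaware"]))  # 1 (already sorted)
```

An already sorted list needs a single pass, which is the O(n) best case described above.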
Insertion sort
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
The insertion sort inserts each item into its correct position in a data set one at a time. It is a useful
algorithm for small data sets.
Craig’n’Dave video
https://www.craigndave.org/algorithm-insertion-sort
2. Compare the current item with the first item in the sorted list.
3. If the current item is greater than the item in the list, move to the next item in the sorted list.
4. Repeat from step 3 until the current item is less than or equal to the item in the sorted list, or the end of the sorted list is reached.
5. Move all the items in the list from the current position up one place to create a space for the current item.
7. Repeat from step 2 with the next item in the list until all the items have been inserted.
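A minimal sketch of an insertion sort. Note that it is a common variant that scans the sorted part from its right-hand end, shifting items as it goes, rather than scanning from the first item as the steps above describe; the result is the same. The function name insertion_sort is an assumption; current and index2 follow the book's listing.

```python
def insertion_sort(items):
    """Sort items in place by inserting each item into the sorted part on its left."""
    for index in range(1, len(items)):
        current = items[index]
        index2 = index
        # Shift larger items one place right to open a gap for current
        while index2 > 0 and items[index2 - 1] > current:
            items[index2] = items[index2 - 1]
            index2 = index2 - 1
        items[index2] = current
    return items

print(insertion_sort(["Florida", "Georgia", "Delaware", "Alabama", "California"]))
# ['Alabama', 'California', 'Delaware', 'Florida', 'Georgia']
```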
Did you know?
The insertion sort can out-perform a quicksort on small data sets, so it is often used as an
optimisation of a quicksort. Once the number of items in a list becomes small, an
efficient quicksort will execute an insertion sort on the smaller list instead.
Index: 0 1 2 3 4
Step 2 Georgia is greater than Florida. It is in the correct place. Move to the next item: Delaware.
Compare Delaware with the first item in the list: Florida
Step 3 Delaware is less than Florida. Insert before Florida. Move to the next item: Alabama.
Compare Alabama with the first item in the list: Delaware.
Step 4 Alabama is less than Delaware. Insert before Delaware. Move to the next item: California.
Step 5 California is greater than Alabama. Check the next item in the list: Delaware
        items[index2] = items[index2 - 1]
        index2 = index2 - 1
    items[index2] = current
#Output the result
print(items)
Time complexity: best case O(n), average case O(n²), worst case O(n²)
Space complexity: O(1)
In the best case, the data set is already ordered, so no data needs to be moved. The algorithm has a linear
time complexity — O(n) — since each item must still be checked. This is not usually the case, as the purpose
of the algorithm is to sort data.
The algorithm contains a nested iteration: one loop to check each item and another to move items so that
an item can be slotted into place. Due to these iterations, the algorithm has polynomial complexity: O(n²).
The algorithm does not require any additional memory, as it can be performed on the data structure
containing the data set, so the space complexity is O(1).
An alternative version of the algorithm puts sorted data into a new list instead of working on the original list.
This does not change the time complexity of the algorithm but would increase the space complexity to O(n).
Another optimisation uses a linked list instead of an array, negating the need to move
items within the data structure. (See higher order data structures).
Merge sort
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
A merge sort can sort a data set extremely quickly using divide and conquer. The principle of divide and
conquer is to create two or more similar, smaller sub-problems from the larger problem, solving them
individually and combining their solutions to solve the bigger problem. With the merge sort, the data set is
repeatedly split in half until each item is in its own list. Adjacent lists are then merged back together, with
each item in the sub-list being entered into the correct place in the new, combined list.
Craig’n’Dave videos
https://www.craigndave.org/algorithm-merge-sort
Merge sort in simple-structured English
1. Repeatedly divide the list in half into two smaller lists until each item is in its own list.
2. Take two adjacent lists and start with the first item in each one.
4. Insert the lowest item into a new list. Move to the next item in the list it was taken from.
5. Repeat step 4 until all the items from one of the lists have been put into the new list.
6. Append all the items from the list still containing items to the new list.
When programming this algorithm using iteration, step 1 can be achieved by putting each item into a new list
one at a time. This is a simple optimisation that is worth implementing. However, examiners will expect you
to understand that the data set is repeatedly split until each item is in its own list.
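The iterative approach described above might be sketched like this. The names merge, newlist and listofitems follow the book's listings; merge_sort and the handling of an odd list out are assumptions made for this illustration.

```python
def merge(list1, list2):
    """Merge two sorted lists into one new sorted list."""
    newlist = []
    index1 = index2 = 0
    while index1 < len(list1) and index2 < len(list2):
        if list1[index1] <= list2[index2]:
            newlist.append(list1[index1])
            index1 += 1
        else:
            newlist.append(list2[index2])
            index2 += 1
    # One list is exhausted; append whatever remains of the other
    newlist.extend(list1[index1:])
    newlist.extend(list2[index2:])
    return newlist

def merge_sort(items):
    # Step 1: put each item into its own list
    listofitems = [[item] for item in items]
    # Merge adjacent pairs of lists until only one list remains
    while len(listofitems) > 1:
        merged = []
        for index in range(0, len(listofitems) - 1, 2):
            merged.append(merge(listofitems[index], listofitems[index + 1]))
        if len(listofitems) % 2 == 1:
            merged.append(listofitems[-1])   # odd list out waits for the next round
        listofitems = merged
    return listofitems[0] if listofitems else []

print(merge_sort(["Florida", "Georgia", "Delaware", "Alabama", "California"]))
# ['Alabama', 'California', 'Delaware', 'Florida', 'Georgia']
```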
Index: 0 1 2 3 4
Split
Florida Georgia Delaware Alabama California
Step 2 Divide the two lists in half to create four smaller lists.
Split
Florida Georgia Delaware Alabama California
Split
Florida Georgia Delaware Alabama California
Merge
Florida Georgia Delaware Alabama California
Step 5 Compare Delaware to Alabama and merge to one list.
Merge
Florida Georgia Alabama Delaware California
Function merge(list1, list2)
  newlist = []
  index1 = 0
  index2 = 0
  While index1 < list1.Length and index2 < list2.Length
    If list1[index1] > list2[index2] Then
      newlist.append list2[index2]
      index2 = index2 + 1
    ElseIf list1[index1] < list2[index2] Then
      newlist.append list1[index1]
      index1 = index1 + 1
    ElseIf list1[index1] == list2[index2] Then
      newlist.append list1[index1]
      newlist.append list2[index2]
      index1 = index1 + 1
      index2 = index2 + 1
    End If
  End While
  If index1 < list1.Length Then
    For item = index1 to list1.Length - 1
      newlist.append list1[item]
    Next item
  ElseIf index2 < list2.Length Then
    For item = index2 to list2.Length - 1
      newlist.append list2[item]
    Next item
  End If
  Return newlist
End Function
    if index1 < len(list1):
        for item in range(index1, len(list1)):
            newlist.append(list1[item])
    elif index2 < len(list2):
        for item in range(index2, len(list2)):
            newlist.append(list2[item])
    return newlist
#Output
print(listofitems[0])
End If
Loop
Return newlist
End Function
listofitems.Add(item)
Next
Efficiency of the merge sort
Time complexity: best case O(n log n), average case O(n log n), worst case O(n log n)
Space complexity: O(n)
In all cases, the data in a merge sort needs to be manipulated so that each item is in its own list. The time
this takes will increase with more data, and so too will the memory requirements: O(n). However, by using a
divide and conquer algorithm, the data set can be repeatedly halved, so only O(log n) rounds of splitting and
merging are needed.
When the lists are merged back together, it is possible to merge more than one list simultaneously, although
each item in the list needs to be considered in turn to determine its position in the new list. Each of the
O(log n) rounds therefore handles all n items, so the algorithm has a linearithmic time complexity, O(n log n),
and a space complexity of O(n).
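def mergeSort(items):
    if len(items) > 1:
        # Split the list into two halves and sort each half recursively
        mid = len(items) // 2
        lefthalf = items[:mid]
        righthalf = items[mid:]
        mergeSort(lefthalf)
        mergeSort(righthalf)
        i=0
        j=0
        k=0
        print("Merging ",items)
        # Merge the two sorted halves back into items
        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                items[k]=lefthalf[i]
                i=i+1
            else:
                items[k]=righthalf[j]
                j=j+1
            k=k+1
        # Copy across whatever remains in either half
        while i < len(lefthalf):
            items[k]=lefthalf[i]
            i=i+1
            k=k+1
        while j < len(righthalf):
            items[k]=righthalf[j]
            j=j+1
            k=k+1
items = ["Florida","Georgia","Delaware","Alabama","California"]
mergeSort(items)
print(items)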
End If
Loop
Return newlist
End Function
listofitems.Add(item)
Next
Quicksort
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
Perhaps unsurprisingly given the name of the algorithm, the quicksort orders a data set extremely quickly
using divide and conquer. The principle of divide and conquer is to create two or more similar, smaller
sub-problems from the larger problem, before solving them individually and combining their solutions to solve
the bigger problem. The algorithm makes use of a pivot value from the data set, against which other items are
compared to determine their position. It is often considered more efficient than a merge sort because it
requires less memory than a typical recursive merge sort implementation, but this is not always the case. Its
performance depends on factors such as the data set being sorted and the pivot chosen.
Applications of a quicksort
The quicksort is suitable for any data set but shines with larger data sets. It is ideal for parallel processing
environments where the concept of divide and conquer can be used. Typically found in real-time situations
due to its efficiency, the quicksort has applications in medical monitoring, life support systems, aircraft
controls and defence systems.
Craig’n’Dave video
https://www.craigndave.org/algorithms-quick-sort
3. While the second pointer is greater than or equal to the first pointer:
a. Whilst the first pointer is less than or equal to the second pointer and the item at the first
pointer is less than or equal to the pivot value, increase the first pointer by one.
b. Whilst the second pointer is greater than or equal to the first pointer and the item at the
second pointer is greater than or equal to the pivot, decrease the second pointer by one.
c. If the second pointer is greater than the first pointer, swap the items.
4. Swap the pivot value with the item at the second pointer.
5. Repeat from step 1 on the list of items to the left of the second pointer.
6. Repeat from step 1 on the list of items to the right of the second pointer.
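Steps 3 to 6 can be sketched as follows, assuming (as in the walk-through) that the pivot is the first item, the first pointer starts at index 1 and the second pointer starts at the last index. The names pointer1 and pointer2 follow the book's listing; working on a copy of the list is a choice made for this sketch.

```python
def quicksort(items):
    """Return a sorted list, using the first item as the pivot."""
    if len(items) <= 1:
        return items
    items = list(items)            # work on a copy
    pivot = items[0]
    pointer1 = 1
    pointer2 = len(items) - 1
    while pointer2 >= pointer1:
        # Step 3a: move the first pointer right past items no larger than the pivot
        while pointer1 <= pointer2 and items[pointer1] <= pivot:
            pointer1 += 1
        # Step 3b: move the second pointer left past items no smaller than the pivot
        while pointer2 >= pointer1 and items[pointer2] >= pivot:
            pointer2 -= 1
        # Step 3c: both pointers are stuck on out-of-place items, so swap them
        if pointer2 > pointer1:
            items[pointer1], items[pointer2] = items[pointer2], items[pointer1]
    # Step 4: the second pointer now marks the pivot's final position
    items[0], items[pointer2] = items[pointer2], items[0]
    # Steps 5 and 6: sort the items either side of the pivot
    left = quicksort(items[0:pointer2])
    right = quicksort(items[pointer2 + 1:])
    return left + [items[pointer2]] + right

print(quicksort(["Florida", "Georgia", "Delaware", "Alabama", "California"]))
# ['Alabama', 'California', 'Delaware', 'Florida', 'Georgia']
```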
Quicksort illustrated
If illustrations help you remember, this is how you can picture a quicksort:
When back in England, Tony Hoare was asked to write code for a Shellsort but instead
placed a bet that his sorting algorithm was the fastest. He won a sixpence! The
quicksort is now a widely adopted sorting algorithm in most programming language
libraries today.
Index: 0 1 2 3 4
Step 1 Set the pivot to the first item. Set pointers on indexes 1 and 4.
Step 2 Consider first pointer: Georgia is not less than Florida. First pointer finished.
Step 4 Consider first pointer: California is less than Florida. Increment the first pointer.
Florida California →Delaware Alabama Georgia
Step 5 Consider first pointer: Delaware is less than Florida. Increment the first pointer.
Step 6 Consider first pointer: Alabama is less than Florida. Increment the first pointer.
Step 9 Consider first pointer: California is not less than Alabama. First pointer finished.
Step 10 Consider second pointer: Delaware is greater than Alabama. Decrement second pointer.
Step 11 Consider second pointer: California is greater than Alabama. Decrement second pointer.
Step 14 Consider first pointer: Delaware is not less than California. First pointer finished.
Step 15 Consider second pointer: Delaware is greater than Alabama. Decrement second pointer.
End Function
These include Tony Hoare’s method — using a pointer that starts at each end of the data
set and moves inwards — and the Nico Lomuto method, which starts with a pivot at the
end of the data set. Both are valid quicksort algorithms.
You may be more likely to learn the Hoare method in A level computer science and the
Lomuto method in A level mathematics.
    temp = items[0]
    items[0] = items[pointer2]
    items[pointer2] = temp
    left = quicksort(items[0:pointer2])
    right = quicksort(items[pointer2+1:len(items)])
    return left+[items[pointer2]]+right
items(pointer2) = temp
End If
End While
temp = items(0)
items(0) = items(pointer2)
items(pointer2) = temp
Return newlist
End Function
Time complexity: best case O(n log n), average case O(n log n), worst case O(n²)
Space complexity: O(log n)
Once the position of the pivot is found, the quicksort can divide the list into two sub-lists and recursively
apply the quicksort to each of them. This makes the algorithm ideal for parallel processing. At best, the data
set will always be divided equally in half, resulting in linearithmic time complexity: O(n log n). It is unlikely
that the data set will be divided exactly in half every time. In the worst case, the pivot will always end up as
the first or last item, and the time complexity becomes polynomial: O(n²). On average, the position of the
pivot item will be somewhere between the first and last item in the data set, and the time complexity averages
out at O(n log n).
The code examples presented are not memory-efficient because they create new lists for each recursive call.
Implementing the algorithm in this way increases the space complexity but makes the algorithm easier to
code and understand. An alternative approach would use the same data set, passing the pointers as
parameters and not the data structure to the quicksort function.
However, any recursive algorithm — even if it does not create new data sets — will require a call stack, so the
space complexity is still considered O(log n) and not O(1).
One method of further optimising the quicksort is to recognise when the number of items in the data set is
small and switch to a more optimal algorithm for small data sets such as an insertion sort.
To illustrate the point, this approach uses the last item as the pivot:
a. If the item at the first pointer is greater than the item at the second pointer, swap the items
and the pointers.
b. Move the first pointer one item towards the second pointer. (Note: this could be a move
towards the left or right in the data set. Only one pointer ever moves in this method.)
Stepping through an alternative quicksort
Index: 0 1 2 3 4
Step 2 Compare Florida and California. Florida is greater than California. Swap items and pointers.
Step 4 Compare Alabama with California. Alabama is less than California. Swap items and pointers.
Step 6 Compare Georgia with California. Georgia is greater than California. Swap items and pointers.
Step 8 Compare Delaware with California. Delaware is greater than California. No swap needed.
Step 12 Compare Delaware with Florida. Delaware is less than Florida, no swap needed.
Step 14 Compare Georgia with Florida. Georgia is greater than Florida. Swap items and pointers.
items = ["Florida","Georgia","Delaware","Alabama","California"]
Function quicksort(items)
  If items.Length <= 1 Then Return items
  pointer1 = 0
  pointer2 = items.Length - 1
  While pointer1 != pointer2
    If (items[pointer1] > items[pointer2] and pointer1 < pointer2) or (items[pointer1] < items[pointer2] and pointer1 > pointer2) Then
      Swap(items[pointer1], items[pointer2])
      Swap(pointer1, pointer2)
    End If
    If pointer1 < pointer2 Then
      pointer1 = pointer1 + 1
    Else
      pointer1 = pointer1 - 1
    End If
  End While
  left = quicksort(items[0 to pointer1 - 1])
  right = quicksort(items[pointer1 + 1 to items.Length - 1])
  Return left + items[pointer1] + right
End Function
            pointer1 = pointer1 - 1
    left = quicksort(items[0:pointer1])
    right = quicksort(items[pointer1+1:len(items)])
    return left+[items[pointer1]]+right
Else
pointer2 = pointer2 + 1
End If
End While
Return newlist
End Function
1. The processing architecture. Parallel processors usually execute more efficiently than single
processors. If a data set can be divided and the same algorithm applied to each sub-set of data, it is
well suited to parallelism. Good examples of this are the merge sort and quicksort, which take
advantage of divide and conquer and work best with large, unsorted data sets.
2. The memory footprint. The amount of memory an algorithm has to work with can be a limiting factor.
In computer science, there is often a balance between the efficiency of processing and the efficiency
of memory. In the past, memory was a significant limiting factor: computers had only kilobytes of
memory to store the operating system, program and data, and programmers would strive to save
every byte they could. Today, we take for granted the fact that we have gigabytes of memory.
Algorithms that perform well in a defined memory space include the bubble sort and insertion sort.
3. The volume of code. More efficient algorithms usually require more complex code. This is difficult to
write for novice programmers and requires more memory to store. Sometimes, the speed of
execution is not the most important factor. A good example of this is the bubble sort, which is easy to
write and requires very few lines of code.
4. The state of the data set. Some algorithms that are often less efficient can outperform others if the
data set is already partially sorted. A good example of this is the bubble sort. It can outperform a
quicksort on a partially sorted list, but it is quickly beaten if the data set is large and random.
It would be a great exercise to take the code for each algorithm in this book and apply it to different types of
data sets to see how the speed of execution is affected. Different data sets might include ordered, reverse
ordered, random, small and large volumes of data.
In summary
Bubble sort | Insertion sort | Merge sort | Quicksort
Adjacent items are compared and swapped if they are out of order | Each item is compared to every other item from the start until its place is found | Items in one list are compared to items in another list to create a new list | Items are compared to a pivot
Uses a nested iteration | Uses a nested iteration | Uses either a nested iteration or recursion | Uses either a nested iteration or recursion
Suitable for a small number of items | Suitable for a small number of items | Suitable for a large number of items | Suitable for a large number of items
Known memory footprint | Known memory footprint | Memory footprint can increase as the algorithm executes | Memory footprint can increase as the algorithm executes
OPTIMISATION ALGORITHMS
Routines that find satisfactory or optimum paths between vertices on a graph.
Dijkstra’s shortest path
OCR OCR AQA AQA
Dijkstra’s shortest path algorithm finds the shortest path between one node and all other nodes on a
weighted graph. It is sometimes considered a special case of the A* algorithm with no heuristics, although
Edsger Dijkstra developed his algorithm first. It can also be seen as a generalisation of breadth-first search to
weighted graphs. A limitation of Dijkstra’s shortest path is that it doesn’t work for edges with a negative
weight value. The Bellman–Ford algorithm provides a solution to that problem.
For AQA, candidates do not need to be able to code or recall the steps of Dijkstra’s shortest path. They need
to understand and trace the algorithm and be aware of its applications.
Groningen. It can be used for many purposes where the shortest path between two points needs to be
established. Maps, IP routing and the telephone network make use of Dijkstra’s algorithm.
Craig’n’Dave videos
https://www.craigndave.org/algorithm-dijkstras-shortest-path
a. Find the node with the shortest distance from the start that has not been visited.
ii. If the distance from the start is lower than the currently recorded distance from the
start:
1. Set the connected node’s shortest distance from the start to the newly
calculated distance.
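The steps above can be sketched in Python as follows. This is a minimal illustration rather than the book's own listing: it assumes the graph is stored as a dictionary of dictionaries (as in the example graph used in this chapter), and the names `GRAPH`, `dijkstra`, `distance` and `previous` are chosen here for clarity.

```python
import math

# Example graph from this chapter: vertex -> {neighbour: edge weight}
GRAPH = {
    "A": {"B": 4, "C": 3, "D": 2},
    "B": {"A": 4, "E": 4},
    "C": {"A": 3, "D": 1},
    "D": {"A": 2, "C": 1, "F": 2},
    "E": {"B": 4, "G": 2},
    "F": {"D": 2, "G": 5},
    "G": {"E": 2, "F": 5},
}


def dijkstra(graph, start):
    """Return shortest distances from start and the previous-node table."""
    distance = {vertex: math.inf for vertex in graph}
    previous = {vertex: None for vertex in graph}
    distance[start] = 0
    unvisited = set(graph)
    while unvisited:
        # Find the unvisited node with the shortest distance from the start.
        current = min(unvisited, key=lambda v: distance[v])
        if distance[current] == math.inf:
            break  # remaining nodes cannot be reached
        for neighbour, weight in graph[current].items():
            if neighbour in unvisited:
                candidate = distance[current] + weight
                # Update only if the new distance is strictly lower.
                if candidate < distance[neighbour]:
                    distance[neighbour] = candidate
                    previous[neighbour] = current
        unvisited.remove(current)
    return distance, previous


distance, previous = dijkstra(GRAPH, "A")
print(distance)
```

Run from A, the sketch gives the same distances as the final table in the walkthrough that follows: B 4, C 3, D 2, E 8, F 4 and G 9.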
Step 1 Set the distance from the start for all nodes to infinity.
Set the distance from the start for node A to zero.
Select the node with the lowest distance from the start that has not been visited: A.
Consider each edge from A that has not been visited: B, C, D.
[Figure: weighted graph with edges A-B 4, A-C 3, A-D 2, B-E 4, C-D 1, D-F 2, E-G 2, F-G 5]
Node  Distance from start  Visited  Previous node
A     0                    No
B     ∞                    No
C     ∞                    No
D     ∞                    No
E     ∞                    No
F     ∞                    No
G     ∞                    No
Step 2 Distance from the start = A’s distance from the start + edge weight.
B: 0 + 4 = 4 (lower than ∞) – update. C: 0 + 3 = 3 (lower than ∞) – update.
D: 0 + 2 = 2 (lower than ∞) – update.
Set A to visited.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    No       A
C     3                    No       A
D     2                    No       A
E     ∞                    No
F     ∞                    No
G     ∞                    No
Step 3 Select the node with the lowest distance from the start that has not been visited: D.
Consider each edge from D that has not been visited: C, F.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    No       A
C     3                    No       A
D     2                    No       A
E     ∞                    No
F     ∞                    No
G     ∞                    No
Step 4 Distance from the start = D’s distance from the start + edge weight.
C: 2 + 1 = 3 could be updated since the distance to C from A or D is the same, but as the distance from
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    No       A
C     3                    No       A
D     2                    Yes      A
E     ∞                    No
F     4                    No       D
G     ∞                    No
Step 5 Select the node with the lowest distance from the start that has not been visited: C.
Consider each edge from C that has not been visited: none.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    No       A
C     3                    No       A
D     2                    Yes      A
E     ∞                    No
F     4                    No       D
G     ∞                    No
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    No       A
C     3                    Yes      A
D     2                    Yes      A
E     ∞                    No
F     4                    No       D
G     ∞                    No
Step 7 Select the node with the lowest distance from the start that has not been visited: B.
This could also have been F — it doesn’t matter which is chosen first.
Consider each edge from B that has not been visited: E.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    No       A
C     3                    Yes      A
D     2                    Yes      A
E     ∞                    No
F     4                    No       D
G     ∞                    No
Step 8 Distance from the start = B’s distance from the start + edge weight.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    Yes      A
C     3                    Yes      A
D     2                    Yes      A
E     8                    No       B
F     4                    No       D
G     ∞                    No
Step 9 Select the node with the lowest distance from the start that has not been visited: F.
Consider each edge from F that has not been visited: G.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    Yes      A
C     3                    Yes      A
D     2                    Yes      A
E     8                    No       B
F     4                    No       D
G     ∞                    No
Step 10 Distance from the start = F’s distance from the start + edge weight.
G: 4 + 5 = 9 (lower than ∞) – update.
Set F to visited.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    Yes      A
C     3                    Yes      A
D     2                    Yes      A
E     8                    No       B
F     4                    Yes      D
G     9                    No       F
Step 11 Select the node with the lowest distance from the start that has not been visited: E.
Consider each edge from E that has not been visited: G.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    Yes      A
C     3                    Yes      A
D     2                    Yes      A
E     8                    No       B
F     4                    Yes      D
G     9                    No       F
Step 12 Distance from the start = E’s distance from the start + edge weight.
G: 8 + 2 = 10 (not lower than 9) – don’t update.
Set E to visited.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    Yes      A
C     3                    Yes      A
D     2                    Yes      A
E     8                    Yes      B
F     4                    Yes      D
G     9                    No       F
Step 13 Select the node with the lowest distance from the start that has not been visited: G.
Consider each edge from G that has not been visited: none.
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    Yes      A
C     3                    Yes      A
D     2                    Yes      A
E     8                    Yes      B
F     4                    Yes      D
G     9                    No       F
Node  Distance from start  Visited  Previous node
A     0                    Yes
B     4                    Yes      A
C     3                    Yes      A
D     2                    Yes      A
E     8                    Yes      B
F     4                    Yes      D
G     9                    No       F
To find the shortest path from one node to any other, start with the goal node and follow the previous
nodes back to the start, inserting the new node at the front of the list, e.g.:
Notice how Dijkstra’s algorithm finds the shortest path from the start node to every other node. This is a
difference between Dijkstra and the A* algorithm.
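Following the previous nodes back to the start can be sketched as below. The `PREVIOUS` dictionary is copied from the previous-node column of the final table in the walkthrough; the function name `build_path` is an assumption made for this illustration.

```python
# Previous-node column from the final table of the walkthrough
PREVIOUS = {"A": None, "B": "A", "C": "A", "D": "A",
            "E": "B", "F": "D", "G": "F"}


def build_path(previous, start, goal):
    """Follow previous nodes from goal back to start, inserting at the front."""
    path = []
    vertex = goal
    while vertex is not None:
        path.insert(0, vertex)  # the new node goes at the front of the list
        if vertex == start:
            return path
        vertex = previous[vertex]
    return []  # the chain ended without reaching the start


print(build_path(PREVIOUS, "A", "G"))  # ['A', 'D', 'F', 'G']
print(build_path(PREVIOUS, "A", "E"))  # ['A', 'B', 'E']
```

Because the table holds a previous node for every vertex, the same table answers the question for any goal without running the algorithm again.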
Next
graph.pop(shortest)
End While
vertex = goal
While vertex != start
    shortest_path.insert(vertex)
    vertex = previous_vertex[vertex]
End While
Return shortest_path
End Function
#Set the shortest distance from the start for all vertices to infinity
for vertex in graph:
    distance[vertex] = infinity
#Set the shortest distance from the start for the start vertex to 0
distance[start] = 0
    if shortest == None:
        shortest = vertex
    elif distance[vertex] < distance[shortest]:
        shortest = vertex
#The vertex has now been visited, remove it from the vertices to consider
graph.pop(shortest)
'Set the shortest distance from the start for all vertices to infinity
For Each vertex In graph
    distance.Add(vertex.Key, infinity)
Next
'Set the shortest distance from the start for the start vertex to 0
distance(start) = 0
'The vertex has now been visited, remove it from the vertices to consider
graph.Remove(shortest)
End While
Sub main()
Dim graph = New Dictionary(Of String, Dictionary(Of String, Integer)) From {
{"A", New Dictionary(Of String, Integer) From {{"B", 4}, {"C", 3}, {"D", 2}}},
{"B", New Dictionary(Of String, Integer) From {{"A", 4}, {"E", 4}}},
{"C", New Dictionary(Of String, Integer) From {{"A", 3}, {"D", 1}}},
{"D", New Dictionary(Of String, Integer) From {{"A", 2}, {"C", 1}, {"F", 2}}},
{"E", New Dictionary(Of String, Integer) From {{"B", 4}, {"G", 2}}},
{"F", New Dictionary(Of String, Integer) From {{"D", 2}, {"G", 5}}},
{"G", New Dictionary(Of String, Integer) From {{"E", 2}, {"F", 5}}}
}
Time complexity
Best case    Average case    Worst case
A “for” loop is used to set the shortest distance of all the nodes. This part of the algorithm is linear: O(n). The
main algorithm makes use of a graph stored as a dictionary, and we assume a time complexity of O(1) for
looking up data about each node. A “for” loop is nested in a “while” loop when the shortest distance for each
neighbouring vertex is calculated from every connected edge. This means the algorithm has a polynomial
complexity at worst, O(V²), but different implementations are able to reduce this to O(E log V), where E is
the number of edges and V the number of vertices.
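One implementation that is commonly used to approach O(E log V) keeps the candidate vertices in a priority queue instead of scanning every vertex for the lowest distance. The sketch below is an illustration under that assumption, using Python's standard `heapq` module; the names `GRAPH` and `dijkstra_heap` are chosen here and do not come from the book's listings.

```python
import heapq
import math

# Example graph from this chapter: vertex -> {neighbour: edge weight}
GRAPH = {
    "A": {"B": 4, "C": 3, "D": 2},
    "B": {"A": 4, "E": 4},
    "C": {"A": 3, "D": 1},
    "D": {"A": 2, "C": 1, "F": 2},
    "E": {"B": 4, "G": 2},
    "F": {"D": 2, "G": 5},
    "G": {"E": 2, "F": 5},
}


def dijkstra_heap(graph, start):
    """Shortest distances from start using a binary heap as a priority queue."""
    distance = {vertex: math.inf for vertex in graph}
    distance[start] = 0
    queue = [(0, start)]  # (distance from start, vertex)
    while queue:
        dist, vertex = heapq.heappop(queue)
        if dist > distance[vertex]:
            continue  # out-of-date entry; a shorter route was already found
        for neighbour, weight in graph[vertex].items():
            candidate = dist + weight
            if candidate < distance[neighbour]:
                distance[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return distance


print(dijkstra_heap(GRAPH, "A"))
```

Each push or pop on the heap costs a logarithmic amount of work, and an entry is pushed at most once per edge relaxation, which is where the log factor in the improved bound comes from.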
A* pathfinding
OCR AS Level, OCR A Level, AQA AS Level, AQA A Level, WJEC AS Level, WJEC A Level, BTEC Nationals, BTEC Highers
The A* pathfinding algorithm is a development of Dijkstra’s shortest path. Unlike Dijkstra’s algorithm, A*
finds the shortest path between two vertices on a weighted graph using heuristics. It usually performs better
than Dijkstra’s algorithm because not every vertex is considered. Instead, only the most promising path is
followed to the goal. A* is known as a best-first search algorithm: a heuristic estimates the cost of the path
between the next vertex and the goal, and the algorithm follows the vertex with the lowest estimated total
cost. It is important that the heuristic does not over-estimate the cost, otherwise an incorrect vertex may be
chosen to move to next. The vertices being considered are referred to as “the fringe”. The usefulness of the
A* algorithm is determined by the suitability of the heuristic.
About heuristics
In figure 1, in addition to the cost of each edge, you might be able to calculate the distance between a
vertex and the goal, shown by dotted lines. This additional data, called the heuristic, allows you to determine
that B is closer to the goal than C or D. Therefore, B is the vertex that should be followed, even though the
cost from A to B is higher than the cost from A to C or D. When determining the shortest path, the cost is
added to the heuristic to determine the best path. It is worth noting that the calculation of the heuristic does
not need to be mathematically accurate; it just needs to deliver a useful result.
[Figure 1: the weighted graph, with dotted lines showing the distance from each vertex to the goal]
It is easier to appreciate this if we consider a situation in a video game where we want to maximise the
frames per second by reducing the number of operations to be performed in a given amount of time. In
figure 2, a character at position S (100,20 on the screen) is required to travel to E (80,160) via waypoint B or
D. A heuristic measuring the distance between S and E could be calculated using Pythagoras’ theorem on
the right-angled triangle: √(20² + 140²) ≈ 141. However, simply adding the two sides (a and o) requires fewer
processing cycles but still delivers a useful result (20 + 140 = 160). We have used the Manhattan distance
(the sum of the opposite and adjacent) to estimate the distance instead of the accurate Euclidean distance
(hypotenuse). Estimating results in this way, where only a best guess matters rather than a precise result, is
called a heuristic in computer science.
[Figure 2: right-angled triangle between S (x = 100, y = 20) and E (x = 80, y = 160), with sides a and o, hypotenuse h and waypoints B and D]
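The two estimates for the video game example can be compared with a short sketch. The function names `euclidean` and `manhattan` are chosen for this illustration; the coordinates are those given for S and E.

```python
import math


def euclidean(x1, y1, x2, y2):
    """Straight-line distance (the hypotenuse), using Pythagoras' theorem."""
    return math.hypot(x2 - x1, y2 - y1)


def manhattan(x1, y1, x2, y2):
    """Sum of the horizontal and vertical distances (no square root needed)."""
    return abs(x2 - x1) + abs(y2 - y1)


start = (100, 20)   # position S
end = (80, 160)     # position E
print(round(euclidean(*start, *end)))  # 141
print(manhattan(*start, *end))         # 160
```

The Manhattan version uses only subtraction, absolute value and addition, which is why it is cheaper to compute many times per frame.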
Although the heuristic could potentially be calculated in advance for each vertex, in practice it is usually only
calculated for a vertex when needed, to maximise the efficiency of the algorithm. In both examples, we used
the distance between a start and end-point to determine the heuristic,
but this could be any meaningful data relevant to the context. We could make our game character appear
more intelligent by changing the heuristic depending on other factors in the game.
The A* algorithm has some notation that programmers of the algorithm will be familiar with:
• The distance from the start of a vertex plus the edge value is often referred to as the ‘g’ value.
• The distance from the start of a vertex plus the edge value plus the heuristic is often referred to as
the ‘f’ value.
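As a small worked illustration of this notation, the sketch below calculates g and f for the three vertices connected to A, using the edge weights and heuristic values from the worked example in this chapter. The variable names are assumptions made for this example.

```python
# Heuristic values used in the worked example in this chapter
HEURISTIC = {"A": 12, "B": 6, "C": 9, "D": 12, "E": 3, "F": 9, "G": 0}
EDGES_FROM_A = {"B": 4, "C": 3, "D": 2}

g_of_a = 0  # A is the start, so its distance from the start is zero

# g = distance from the start of the current vertex + edge value
g = {vertex: g_of_a + weight for vertex, weight in EDGES_FROM_A.items()}
# f = g + heuristic
f = {vertex: g[vertex] + HEURISTIC[vertex] for vertex in g}

print(g)                  # {'B': 4, 'C': 3, 'D': 2}
print(f)                  # {'B': 10, 'C': 12, 'D': 14}
print(min(f, key=f.get))  # B
```

Although D has the lowest g value, B has the lowest f value, so B is the vertex that A* follows next.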
Applications of A* pathfinding
Although useful in travel routing systems, A* is generally outperformed by other algorithms that pre-process
the graph to attain a better performance. A* was originally developed as part of the Shakey project to build a
robot with artificial intelligence. Frequently found in video games, the algorithm is used to move non-playable
characters so that they appear to move intelligently. It can also be found in network packet routing, in
financial modelling for trading assets and goods (arbitrage opportunity), solving puzzles like word ladders,
and social networking analysis — calculating degrees of separation between individuals to suggest friends, for
example.
Craig’n’Dave video
https://www.craigndave.org/algorithm-astar
1. Until the goal node has the lowest f value of all the nodes and nodes do not have an f value of infinity:
a. Find the node with the lowest f value that has not been visited.
i. Calculate the relative distance from the start by adding the edge and the heuristic.
ii. If the distance from the start plus the heuristic is lower than the currently recorded f
value:
1. Set the f value of the connected node to the newly calculated distance.
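The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's listing: the graph and heuristic values are taken from the worked example in this chapter, the names `a_star`, `g`, `f` and `previous` are chosen here, the start vertex is given an f value equal to its heuristic, and the search stops as soon as the goal is the unvisited vertex with the lowest f value.

```python
import math

GRAPH = {
    "A": {"B": 4, "C": 3, "D": 2},
    "B": {"A": 4, "E": 4},
    "C": {"A": 3, "D": 1},
    "D": {"A": 2, "C": 1, "F": 2},
    "E": {"B": 4, "G": 2},
    "F": {"D": 2, "G": 5},
    "G": {"E": 2, "F": 5},
}
HEURISTIC = {"A": 12, "B": 6, "C": 9, "D": 12, "E": 3, "F": 9, "G": 0}


def a_star(graph, heuristic, start, goal):
    """Return (path, cost) from start to goal, or ([], inf) if unreachable."""
    g = {vertex: math.inf for vertex in graph}
    f = {vertex: math.inf for vertex in graph}
    previous = {vertex: None for vertex in graph}
    g[start] = 0
    f[start] = heuristic[start]
    unvisited = set(graph)
    while unvisited:
        # Find the unvisited vertex with the lowest f value.
        current = min(unvisited, key=lambda v: f[v])
        if f[current] == math.inf:
            return [], math.inf  # the goal cannot be reached
        if current == goal:
            break
        unvisited.remove(current)
        for neighbour, weight in graph[current].items():
            if neighbour in unvisited:
                candidate = g[current] + weight
                if candidate + heuristic[neighbour] < f[neighbour]:
                    g[neighbour] = candidate
                    f[neighbour] = candidate + heuristic[neighbour]
                    previous[neighbour] = current
    # Follow the previous vertices back from the goal to the start.
    path = []
    vertex = goal
    while vertex is not None:
        path.insert(0, vertex)
        vertex = previous[vertex]
    return path, g[goal]


print(a_star(GRAPH, HEURISTIC, "A", "G"))  # (['A', 'B', 'E', 'G'], 10)
```

With these heuristic values the sketch returns A, B, E, G at a cost of 10. The Dijkstra walkthrough earlier in the chapter found A, D, F, G at a cost of 9, so it appears that the heuristic for D (12) over-estimates the remaining cost to G (7). This is a useful reminder of why the heuristic should not over-estimate.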
A* pathfinding illustrated
If illustrations help you remember, this is how you can picture A* pathfinding:
Stepping through A* pathfinding
Note that many authors do not show the starting values for the distance from the start or f in their
illustrations, but they are infinity for all vertices — except for the start, which has a value of zero. The
heuristic for the goal should also be zero.
Step 1 Set the distance from the start to infinity for all nodes.
Set the distance from the start to zero for vertex A.
Select the node with the lowest f value that has not been visited: A.
Consider each edge from A that has not been visited: B, C, D.
Node  Distance from start  Heuristic  f   Visited  Previous node
A     0                    12         0   No
B     ∞                    6          ∞   No
C     ∞                    9          ∞   No
D     ∞                    12         ∞   No
E     ∞                    3          ∞   No
F     ∞                    9          ∞   No
G     ∞                    0          ∞   No
Vertex  Distance from start  Heuristic  f   Visited  Previous vertex
A       0                    12         0   Yes
B       4                    6          10  No       A
C       3                    9          12  No       A
D       2                    12         14  No       A
E       ∞                    3          ∞   No
F       ∞                    9          ∞   No
G       ∞                    0          ∞   No
Step 3 Select the node with the lowest f value that has not been visited: B.
Consider each edge from B that has not been visited: E.
Vertex  Distance from start  Heuristic  f   Visited  Previous vertex
A       0                    12         0   Yes
B       4                    6          10  No       A
C       3                    9          12  No       A
D       2                    12         14  No       A
E       ∞                    3          ∞   No
F       ∞                    9          ∞   No
G       ∞                    0          ∞   No
Node  Distance from start  Heuristic  f   Visited  Previous node
A     0                    12         0   Yes
B     4                    6          10  Yes      A
C     3                    9          12  No       A
D     2                    12         14  No       A
E     8                    3          11  No       B
F     ∞                    9          ∞   No
G     ∞                    0          ∞   No
Step 5 Select the node with the lowest f value that has not been visited: E.
Consider each edge from E that has not been visited: G.
Node  Distance from start  Heuristic  f   Visited  Previous node
A     0                    12         0   Yes
B     4                    6          10  Yes      A
C     3                    9          12  No       A
D     2                    12         14  No       A
E     8                    3          11  No       B
F     ∞                    9          ∞   No
G     ∞                    0          ∞   No
Node  Distance from start  Heuristic  f   Visited  Previous node
A     0                    12         0   Yes
B     4                    6          10  Yes      A
C     3                    9          12  No       A
D     2                    12         14  No       A
E     8                    3          11  Yes      B
F     ∞                    9          ∞   No
G     10                   0          10  No       E
Step 7 Select the node with the lowest f value that has not been visited: G.
Node  Distance from start  Heuristic  f   Visited  Previous node
A     0                    12         0   Yes
B     4                    6          10  Yes      A
C     3                    9          12  No       A
D     2                    12         14  No       A
E     8                    3          11  Yes      B
F     ∞                    9          ∞   No
G     10                   0          10  No       E
Node  Distance from start  Heuristic  f   Visited  Previous node
A     0                    12         0   Yes
B     4                    6          10  Yes      A
C     3                    9          12  No       A
D     2                    12         14  No       A
E     8                    3          11  Yes      B
F     15                   9          24  No       G
G     10                   0          10  Yes      E
Step 9 At this point, although vertices C, D and F have not been visited, the goal node G has the lowest f value,
and no node has an f value of infinity, so the optimal path has been found. The algorithm is complete.
Node  Distance from start  Heuristic  f   Visited  Previous node
A     0                    12         0   Yes
B     4                    6          10  Yes      A
C     3                    9          12  No       A
D     2                    12         14  No       A
E     8                    3          11  Yes      B
F     15                   9          24  No       G
G     10                   0          10  Yes      E
To find the optimal path, start with the goal node and follow the previous nodes back to the start,
inserting the new node at the front of the list, e.g.:
Notice how the best path was followed immediately and that it was not necessary to visit C, D or F. This
makes A* more efficient than Dijkstra’s shortest path because fewer nodes need to be considered. However,
A* is only able to find the optimal path between two nodes and not one node to all other nodes. It is also
hugely reliant on having a suitable heuristic.
End If
Next
graph.pop(shortest)
End While
vertex = goal
While vertex != start
    optimal_path.insert(vertex)
    vertex = previous_vertex[vertex]
End While
Return optimal_path
End Function
#Find the vertex with the shortest f from the start
shortest = None
for vertex in graph:
    if shortest == None:
        shortest = vertex
    elif f[vertex] < f[shortest]:
        shortest = vertex
#The vertex has now been visited, remove it from the vertices to consider
graph.pop(shortest)
return optimal_path
'The vertex has now been visited, remove it from the vertices to consider
graph.Remove(shortest)
End While
Return shortest_path
End Function
Sub main()
Dim graph = New Dictionary(Of String, Dictionary(Of String, Integer)) From {
{"A", New Dictionary(Of String, Integer) From {{"B", 4}, {"C", 3}, {"D", 2}}},
{"B", New Dictionary(Of String, Integer) From {{"A", 4}, {"E", 4}}},
{"C", New Dictionary(Of String, Integer) From {{"A", 3}, {"D", 1}}},
{"D", New Dictionary(Of String, Integer) From {{"A", 2}, {"C", 1}, {"F", 2}}},
{"E", New Dictionary(Of String, Integer) From {{"B", 4}, {"G", 2}}},
{"F", New Dictionary(Of String, Integer) From {{"D", 2}, {"G", 5}}},
{"G", New Dictionary(Of String, Integer) From {{"E", 2}, {"F", 5}}}
}
Efficiency of A* pathfinding
Time complexity
Best case    Average case    Worst case
Determining the efficiency of the A* algorithm is not simple since there are several optimisations that can be
implemented and different perspectives on how to calculate the time complexity.
If you study the code carefully, you will see that the algorithm doesn’t stop when the goal node has the lowest
f value but rather, when there are no nodes left to check. The implementation presented is the same as
Dijkstra’s shortest path, with an added heuristic to determine an optimal path. This is a great example of
how a coded solution can exemplify the characteristics of a well-known algorithm while having its own
nuance, due to the way it is implemented. This is very common in programming. There are always multiple
ways to solve the same problem, some being more efficient than others, in terms of complexity of approach,
Depending on the purpose of A* pathfinding within a larger program, it may only be necessary to compute
one path, not necessarily the optimal path. Therefore, once a solution has been found between the
start and the goal, the algorithm can stop. This reduces execution time considerably, but the algorithm would
not backtrack to consider other, potentially better, routes.
A common optimisation with A* pathfinding is to pre-calculate the shortest distance or f values of some
vertices, significantly increasing its efficiency. A simple example of this would be to store the path from A to
G in another data structure once it has been calculated. If it is needed again, assuming the heuristic has not
changed, instead of being calculated all over again to return the same result, it can simply be looked up.
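Storing a calculated path so that it can be looked up later can be sketched with a dictionary keyed on the start and goal. Everything named here (`path_cache`, `cached_path`, `slow_search`) is hypothetical and only stands in for a real A* search; as the text notes, the stored result is only valid while the graph and heuristic stay the same.

```python
path_cache = {}  # (start, goal) -> previously calculated path


def cached_path(start, goal, find_path):
    """Return a stored path if one exists, otherwise calculate and store it."""
    key = (start, goal)
    if key not in path_cache:
        path_cache[key] = find_path(start, goal)
    return path_cache[key]


calls = []  # records how many times the search really runs


def slow_search(start, goal):
    """Stand-in for a real A* search; returns the path from the walkthrough."""
    calls.append((start, goal))
    return ["A", "B", "E", "G"]


cached_path("A", "G", slow_search)
cached_path("A", "G", slow_search)
print(len(calls))  # 1
```

The second request is answered from the dictionary, so the search itself runs only once.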
Having a suitable heuristic that can be calculated quickly is essential. The complexity of this is usually
considered to be O(1), although that may not be the case. Where it makes sense to use a Manhattan
distance, the heuristic is as optimal as it can be. However, A* is often used in situations where the heuristic
is just a computation to deliver a best guess.
Computer science theorists usually consider the execution time of A* as the result of the number of nodes
and edges in the graph. However, those working with artificial intelligence consider what is known as the
branching factor (b). In AI processing, the number of edges to consider can be extremely large, so an
optimisation that avoids having to consider all the nodes would be used. In this case, the number of nodes
and edges has less relevance. Therefore, time complexity is more a measurement of the depth to the goal
node (d). The time complexity of A* pathfinding is often given as O(b^d), which is exponential in the depth. The
values of the branching factor, the goal node and the heuristic affect the efficiency of the algorithm so
significantly that it can either be almost linear at best or exponential at worst.
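To get a feel for how quickly the search space can grow with depth, the sketch below counts the nodes in a search tree where every node has the same number of branches. It is a simplification made for illustration, with a function name (`nodes_expanded`) chosen here, and it assumes the bound is read as b raised to the power d with no help from the heuristic.

```python
def nodes_expanded(branching_factor, depth):
    """Count the nodes in a full tree: b^0 + b^1 + ... + b^depth."""
    return sum(branching_factor ** level for level in range(depth + 1))


for depth in (2, 4, 8):
    print(depth, nodes_expanded(3, depth))
# 2 13
# 4 121
# 8 9841
```

Doubling the depth from 4 to 8 multiplies the number of nodes by roughly eighty, which is why a good heuristic that prunes branches matters so much.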
In summary
Dijkstra’s shortest path
• Finds the shortest path from one node to all other nodes
• All nodes are expanded during the search
• Generally slower than A* pathfinding
• Fails with negative edge values
A* pathfinding
• Finds the shortest path between two nodes
• Only promising nodes are expanded during the search
• Heuristic helps find a solution more quickly
• Fails with negative edge values and if the heuristic is over-estimated
• Typical heuristic might find the estimated distance from the final node