Chapter 6

Algorithm Complexity

6.1 Introduction

The big-O notation gives an indication of the growth rate of a function. It identifies the dominant term of the function. In this chapter we shall use the big-O notation to classify the complexity of algorithms. The motivation for this is to help programmers analyse and evaluate the efficiency of algorithms.

Here is a reminder of the kinds of dominant terms that we meet in computing. They are given in order of growth rate, from slowest to fastest.

O(1)   O(log₂n)   O(n)   O(n log₂n)   O(n²)   O(2ⁿ)   O(n!)

6.2 The complexity of an algorithm.

Algorithm complexity means how much time and storage the program requires to execute. (It does not mean how difficult the algorithm is to understand.) In this chapter we confine the discussion to the time of execution. When we say an algorithm has complexity O(n²), say, we are giving a relationship between the total number of operations executed and the size of the data set. For example, if algorithm A is O(n²) and algorithm B is O(n log₂n) then for a large data set we would prefer algorithm B, since the number of operations required to process the data grows more slowly as the data set increases. If we increase the size of the data set by a factor of ten (i.e. we have 10n data items), then algorithm A will take about one hundred times as long as before, whereas algorithm B may only take about thirteen times as long.
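These growth rates can be checked numerically. The sketch below (Python, purely illustrative; the function names are our own) compares the work done by an O(n²) algorithm and an O(n log₂n) algorithm when the data set grows tenfold:

```python
import math

def ops_quadratic(n):
    # Work done by an O(n^2) algorithm (proportional to n squared).
    return n * n

def ops_nlogn(n):
    # Work done by an O(n log2 n) algorithm.
    return n * math.log2(n)

n = 1000
# Grow the data set by a factor of ten and compare how the work grows.
print(ops_quadratic(10 * n) / ops_quadratic(n))  # 100.0
print(ops_nlogn(10 * n) / ops_nlogn(n))          # about 13.3
```

The quadratic algorithm does a hundred times the work; the n log₂n algorithm only about thirteen times.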

Of course there are external factors that affect the execution time: e.g. speed of hardware, and the compiler used, but we shall assume a level playing field. A good estimate of the time-complexity of an algorithm is the number of high-level operations that are required.
 
 

6.3 Examples

Below we look at some classic algorithms and determine their complexity. We shall see that there are some quick ways to estimate the big-O term. The examples have been chosen to illustrate some of the orders of complexity listed above.

Linear search

Binary search

Selection Sort

Quicksort

Travelling Salesman Problem

We assume that there are n items in the data set. The searching and sorting algorithms assume the following declarations:

TYPE KeyType = CARDINAL;

DataType = .......;

Item = RECORD

key : KeyType;

data : DataType

END;
 
 

6.4 Linear Search.

This procedure searches the first n components of an array of Items for a record with a given key. If it is found then the position of the item is returned. If no record with that key is found then the procedure returns n. The array index starts at 0, so the first n components are in positions 0..n-1.

--------------------------------------------------------------------------------------------------------------------------------------

(*pseudo code*)

Start at the first component

WHILE still checking, and x does not match with the current key DO

move to next item

END

Return position of match, or n if not found

--------------------------------------------------------------------------------------------------------------------------------------

PROCEDURE LinearSearch (VAR a : ARRAY OF Item;

n : CARDINAL;

x : KeyType) : CARDINAL;

(* pre- : n - 1 ≤ HIGH(a) : there are at least n components in the array;

post- : RETURNs i, where (0 ≤ i < n and a[i].key=x) or (x not in a[0..n-1] and i=n) *)

VAR i :CARDINAL;

BEGIN

  1. i:=0;
  2. WHILE (i<n) AND (x<>a[i].key) DO
  3. i:=i+1
  4. END;
  5. RETURN i
END LinearSearch;

--------------------------------------------------------------------------------------------------------------------------------------

6.5 Analysis

Line No.    Number of times executed
            x found             x not found
1           1                   1
2           ½n (on average)     n
3           ½n (on average)     n
5           1                   1
Total       n + 2               2n + 2

Whether the item is found or not, the number of operations is a linear function of n. Both of these functions are O(n). The time taken to execute the algorithm is proportional to the amount of data in the array. If you double the size of the data set, then the algorithm takes about twice as long to execute.
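The counts above can be checked mechanically. The sketch below is an illustrative Python rendering of LinearSearch (not part of the Modula-2 module) that also counts how many times the loop body executes:

```python
def linear_search(a, x):
    # Illustrative Python version of LinearSearch.
    # Returns (position of x, number of loop iterations);
    # the position is len(a) when x is absent.
    i = 0
    steps = 0
    while i < len(a) and a[i] != x:
        i += 1
        steps += 1
    return i, steps

data = list(range(100))          # n = 100
print(linear_search(data, 50))   # (50, 50): found after n/2 steps
print(linear_search(data, -1))   # (100, 100): not found after n steps
```

Doubling the size of `data` roughly doubles the step counts, as expected of an O(n) algorithm.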

6.6 Discussion

In the analysis above we estimated how many times each line of code is executed, and then summed these terms to find the total number of high level operations. Since we are just looking for the dominant term of the function it is often unnecessary to carry out such a detailed analysis. In most algorithms it is fairly easy to decide whether the dominant term is n, n2, n!, 2n, log2n or nlog2n. We only need to do sufficient work to be able to classify the algorithm complexity with the big-O notation.
 
 

6.7 Some rules of thumb

A loop that processes each data item in turn is O(n).

A loop within a loop e.g.

do n times

do m times

process

end-do

end-do

is O(n*m). The process is executed m times for each iteration of the outer loop. The outer loop is executed n times, so process is executed n*m times.
 

If the data set is halved each time through a loop then there will be O(log₂n) iterations.
(Recall that log₂n is the number of times you can repeatedly halve n to reach 1.)
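This rule can be demonstrated directly by counting the halvings (an illustrative Python sketch):

```python
def halvings(n):
    # Count how many times n can be halved before reaching 1.
    count = 0
    while n > 1:
        n = n // 2
        count += 1
    return count

print(halvings(8))     # 3   (8 -> 4 -> 2 -> 1)
print(halvings(1024))  # 10  (1024 = 2^10)
print(halvings(1000))  # 9, close to log2(1000) = 9.97
```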

An algorithm that processes every subset of n data items has complexity O(2ⁿ).

An algorithm that processes every permutation of n data items has complexity O(n!).
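The last two rules can be checked for a small data set by actually generating the subsets and permutations (a Python sketch using the standard itertools module):

```python
from itertools import combinations, permutations

def all_subsets(items):
    # Each item is either in or out of a subset, so there are 2^n subsets.
    n = len(items)
    return [s for r in range(n + 1) for s in combinations(items, r)]

items = [1, 2, 3, 4]
print(len(all_subsets(items)))              # 16 = 2^4 subsets
print(sum(1 for _ in permutations(items)))  # 24 = 4! permutations
```

Even at n = 20 there are already over a million subsets, which is why O(2ⁿ) and O(n!) algorithms are impractical for all but tiny data sets.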

If the number of operations is the same whatever the number of data items then the complexity function does not depend on n. We say the operation has complexity O(1). An example of such an operation is the procedure to return the top of a stack: it doesn't matter how big the stack is, only one data item is processed. It takes the same effort to return the top of a stack with 2 items as of a stack with 1 000 000 items.

6.8 Binary Chop Search

This search requires the array to be sorted (in ascending order). By looking at the middle component we can decide whether the target key is in the lower or upper half of the array. So we reduce the search space by half. This process is repeated for the reduced portion of the array.

--------------------------------------------------------------------------------------------------------------------------------------

(*pseudo code*)

Set search space to the whole of the array

WHILE the search space contains more than one item DO

Locate the middle item

IF middle item has a smaller key than the target key

THEN reduce the search space to the upper portion

ELSE reduce the search space to the lower portion

END

END

(*assert: searchspace reduced to one item *)

IF this item has the required key THEN return its position ELSE return n END

--------------------------------------------------------------------------------------------------------------------------------------

PROCEDURE BinarySearch (VAR a : ARRAY OF Item;

n : CARDINAL;

x : KeyType) : CARDINAL;

(* pre- : a is in ascending order ;

n - 1 ≤ HIGH(a) : there are at least n components in the array;

post- : RETURNs i, where (0 ≤ i < n and a[i].key=x) or (x not in a[0..n-1] and i=n) *)

VAR i, lo, hi, mid : CARDINAL;

BEGIN

  1. lo:=0; hi:=n-1;
  2. WHILE lo<hi DO
  3. mid:=(lo+hi) DIV 2;
  4. IF a[mid].key<x
  5. THEN lo:=mid+1
  6. ELSE hi:=mid
  7. END
  8. END;
  9. (* assert: lo = hi *)
  10. IF x=a[lo].key THEN i:=lo ELSE i:=n END;
  11. RETURN i
END BinarySearch;

 
 
 
 

The trace below illustrates how the search space is reduced on each iteration of the loop. The search space is the portion of the array a[lo..hi]. Only the keys are shown.

The target, x, is 34.

     0    1    2    3    4    5    6    7    8    9   10   11   12   13   14
a    2    3    5    7   11   13   17   19   23   29   31   37   41   43   47

Iteration 1:  lo=0,  hi=14, mid=7.   a[7]=19 < 34, so lo:=8.
Iteration 2:  lo=8,  hi=14, mid=11.  a[11]=37 is not < 34, so hi:=11.
Iteration 3:  lo=8,  hi=11, mid=9.   a[9]=29 < 34, so lo:=10.
Iteration 4:  lo=10, hi=11, mid=10.  a[10]=31 < 34, so lo:=11.

Now lo = hi = 11 and a[11] = 37 <> 34, so the procedure returns n (not found).
Only four iterations were required to reduce the search space to one component; the search space is halved on each iteration.

Binary Chop Search is O(log₂n).
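The iteration count can be verified mechanically. Below is an illustrative Python rendering of BinarySearch (not part of the Modula-2 module; it assumes a non-empty array) that also counts the loop iterations:

```python
def binary_search(a, x):
    # Illustrative Python version of BinarySearch; a must be sorted
    # and non-empty. Returns (position of x, or len(a) if absent,
    # together with the number of loop iterations).
    lo, hi = 0, len(a) - 1
    iterations = 0
    while lo < hi:
        iterations += 1
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1        # target is in the upper portion
        else:
            hi = mid            # target is in the lower portion
    i = lo if a[lo] == x else len(a)
    return i, iterations

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(binary_search(primes, 34))  # (15, 4): not found, four iterations
```

For the fifteen keys above and target 34 it performs four iterations, matching the trace.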






6.8.2 Exercise

Approximately how many iterations are required to reduce the search space to one item if n is

  1. 250
  2. 1 000
  3. 1 000 000

 
 

The following algorithms are not analysed in such detail as the previous two. You are, however, encouraged to spend some time convincing yourself of the correctness of the analysis.
 
 

6.9 Sorting Algorithms

There are a number of sorting algorithms and they provide a rich source for comparative study. Most sorts have some particular behaviour which gives them the edge in the right circumstances. Here we only consider their time-complexity. The formula giving the number of moves may have a different order of magnitude from the formula giving the number of comparisons; unfortunately, the overall complexity of the algorithm is the higher of the two orders.
 
 

6.9.1 Selection Sort

--------------------------------------------------------------------------------------------------------------------------------------

(*pseudo code*)

FOR i := 0 TO n-2 DO

assign to k the index of the smallest element in a[i..n-1]

swap a[i] and a[k]

END;

--------------------------------------------------------------------------------------------------------------------------------------

PROCEDURE SelectSort(VAR a :ARRAY OF Item; n : CARDINAL);

VAR i, j, k : CARDINAL;

x : Item;

BEGIN

  1. FOR i:=0 TO n-2 DO
  2. k:=i;
  3. FOR j:=i+1 TO n-1 DO
  4. IF a[j].key<a[k].key THEN k:=j END;
  5. END;
  6. x:=a[k]; a[k]:=a[i]; a[i]:=x;
  7. END;
END SelectSort;

--------------------------------------------------------------------------------------------------------------------------------------
 
 

6.9.2 Analysis

The comparison on line 4 is executed (n-1) + (n-2) + ... + 1 = n(n-1)/2 times, so the number of comparisons is O(n²). The swap on line 6 is executed n-1 times, so the number of moves is only O(n). The higher order dominates: Selection Sort is O(n²).
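Selection Sort always makes (n-1) + (n-2) + ... + 1 = n(n-1)/2 key comparisons, whatever the data. An illustrative Python translation of SelectSort (not part of the Modula-2 module) that counts them:

```python
def select_sort(a):
    # Illustrative Python version of SelectSort; sorts a in place and
    # returns the number of key comparisons (line 4 of the procedure).
    n = len(a)
    comparisons = 0
    for i in range(n - 1):
        k = i
        for j in range(i + 1, n):
            comparisons += 1
            if a[j] < a[k]:
                k = j
        a[i], a[k] = a[k], a[i]   # swap, as on line 6
    return comparisons

data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]   # n = 10
print(select_sort(data))  # 45 = 10*9/2 comparisons
print(data)               # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```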

6.9.3 Quicksort

This sort, devised by C.A.R. Hoare in the early sixties, out-performs all the other sorts most of the time. The only drawback is that there is a worst-case scenario which reduces quicksort to something as chronically slow as bubble sort. Quicksort is a recursive procedure, so instead of counting how many times a loop iterates, we determine how many recursive calls are made.

--------------------------------------------------------------------------------------------------------------------------------------

(*pseudo code*)

Procedure Sort ( L, R : INTEGER);

(* quicksort the portion of the array a[L..R] *)

Choose a random item, x, from a[L..R]

Partition the array about x.

(* Assert: left partition has values ≤ x.key, and the right partition has values ≥ x.key *)

IF left partition has more than one element THEN Sort(left partition) END

IF right partition has more than one element THEN Sort(right partition) END

END Sort;

(* The first call to Sort has the first and last index *)

Sort(0, n-1)

--------------------------------------------------------------------------------------------------------------------------------------

PROCEDURE QuickSort (VAR a :ARRAY OF Item; n : INTEGER);

(* - - - - - - - - - - - - - - - - - - - - - - - - - - - *)

PROCEDURE Sort(L,R :INTEGER);

VAR i,j : INTEGER;

w,x : Item;

BEGIN

  1. i:=L; j:=R;
  2. x:=a[(L+R) DIV 2];
  3. REPEAT
  4. WHILE a[i].key < x.key DO i:=i+1 END;
  5. WHILE x.key < a[j].key DO j:=j-1 END;
  6. IF i<=j THEN
  7. w:=a[i]; a[i]:=a[j]; a[j]:=w;
  8. i:=i+1; j:=j-1
  9. END
  10. UNTIL i>j;
  11. IF L<j THEN Sort(L,j) END;
  12. IF i<R THEN Sort(i,R) END
END Sort;

(* - - - - - - - - - - - - - - - - - - - - - - - - - - - *)

BEGIN (* QuickSort *)

Sort(0, n-1);

END QuickSort;
 
 

6.9.4 Analysis

The recursive nature of this algorithm can be visualised as a tree structure. Each level of the tree contains the array partitioned into subarrays; each subarray is further partitioned into two subarrays on the next level. If the partitioning is balanced then there will be log₂n levels in the tree, i.e. the depth of recursion is O(log₂n). The partitioning work across each level is O(n) in total. So Quicksort is roughly O(n log₂n). Warning: this analysis is rather simplistic, but it is about right in the 'normal' case.
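One way to see the tree structure is to instrument the recursion. The sketch below (an illustrative Python rendering of Sort, not part of the Modula-2 module) reports the maximum recursion depth for a shuffled array of 1024 items; with perfectly balanced partitions it would be log₂1024 = 10, and on random data it is typically a small multiple of that:

```python
import random

def quicksort(a, lo=0, hi=None, depth=0):
    # Illustrative Python version of Sort; sorts a in place and returns
    # the maximum recursion depth reached (levels in the tree).
    if hi is None:
        hi = len(a) - 1
    i, j = lo, hi
    x = a[(lo + hi) // 2]          # pivot: the middle item
    while i <= j:
        while a[i] < x:
            i += 1
        while x < a[j]:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    d1 = quicksort(a, lo, j, depth + 1) if lo < j else depth
    d2 = quicksort(a, i, hi, depth + 1) if i < hi else depth
    return max(d1, d2)

random.seed(1)
data = [random.randrange(1000) for _ in range(1024)]
depth = quicksort(data)
print(data == sorted(data))  # True
print(depth)                 # a small multiple of log2(1024) = 10
```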

6.10 Travelling Salesman Problem

This is a classic problem that has many similarities with other problems. It is often used in examples, since its solution transfers to other domains (e.g. graphs and networks).

A salesman has to visit each town in his area at least once in a given day and then go home. Find the shortest route that will allow him to do this. Here are two approaches:

Brute Force Algorithm:

List all possible routes

Find the shortest
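The brute force approach can be sketched as follows (Python; the four-town distance matrix is made-up illustrative data, and town 0 is taken as home):

```python
from itertools import permutations

def route_length(dist, route):
    # Length of: home -> towns in the given order -> home (home is town 0).
    tour = (0,) + route + (0,)
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

def brute_force_tsp(dist):
    # Try all (n-1)! orderings of the other towns: O(n!) work.
    towns = range(1, len(dist))
    return min(permutations(towns), key=lambda r: route_length(dist, r))

# Made-up symmetric distances between four towns.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
best = brute_force_tsp(dist)
print(best, route_length(dist, best))  # (1, 3, 2) 18
```

Four towns means only 3! = 6 routes to check, but the count explodes factorially as towns are added.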

An algorithm based on the following heuristic (a heuristic is a set of rules which gives an answer that solves the problem "more or less"):

REPEAT

Go to the nearest town that you have not visited yet

UNTIL you have been everywhere

Go home
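The heuristic, as an illustrative Python sketch (the distance matrix is made-up data; town 0 is home):

```python
def nearest_neighbour_tsp(dist):
    # Greedy heuristic: repeatedly go to the nearest unvisited town.
    # Each of the n steps scans the remaining towns, so O(n^2) work.
    n = len(dist)
    route = [0]                       # start at home (town 0)
    unvisited = set(range(1, n))
    while unvisited:
        here = route[-1]
        nearest = min(unvisited, key=lambda t: dist[here][t])
        route.append(nearest)
        unvisited.remove(nearest)
    route.append(0)                   # go home
    return route

# Made-up symmetric distances between four towns.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
print(nearest_neighbour_tsp(dist))  # [0, 1, 3, 2, 0]
```

On this small example the greedy route happens to be optimal; in general the heuristic gives a good route, not necessarily the shortest.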






6.10.1 Analysis of Brute Force Algorithm

With n towns there are n! orderings in which they can be visited (or (n-1)! if the home town is fixed as the starting point), and computing the length of each route takes n additions. Listing every route and finding the shortest is therefore O(n!). The algorithm guarantees the shortest route, but it is unusable for all but very small n.

6.10.2 Analysis of Heuristic Algorithm

At each step the salesman scans the towns not yet visited to find the nearest: n-1 candidates on the first step, n-2 on the second, and so on. The total work is (n-1) + (n-2) + ... + 1 = n(n-1)/2, so the heuristic is O(n²). It is fast, but it does not guarantee the shortest route.

6.10.3 Exercise

(You will need a calculator)

Your computer carries out one million instructions per second. If n = 25, how long will it take to execute (a) n² instructions, (b) n! instructions?