Asymptotic Notation

Most of us had a hard time learning asymptotic notation. I fairly skipped it, like my other classmates until recently I found that it was quite interesting to understand.

Asymptote is a curve on a graph that is approached but not reached. Performance of algorithms is an important algorithmic study, only after it’s correctness. It is measured in terms of size of the input. As the input size increases boundlessly how does the running time of the algorithm vary ? We plot the graph of the input size versus the running time of the algorithm. The curve of the graph essentially tells us the performance of the algorithm.

The first sorting procedure we learnt in school is insertion sort. The worst-case time of this algorithm is Θ(n2). So, what is this?

The mathematical definition of Θ is as follows: Given a function g(n) , Θ(g(n) ) is defined as a set of functions:

Θ(g(n) ) = {f(n): there exist positive constants c1 and c2, and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0 }.

So, we are trying to club f(n) between c1g(n) and c2g(n) for a large value of n. So, since Θ(g(n) ) is a set of function and f(n) belongs to it, it means f(n) ε Θ(g(n) ). Instead we write f(n) = Θ(g(n) ), to mean the same (and we shall stick to that). The graph for f(n) = Θ(g(n) ) is:

thetaThe figure says that the value of function f(n) lies between an upper bound , c2g(n) , and a lower bound c1g(n). f(n) is hence a tightly bound by g(n).

The asymptotic running time ignores machine constants and looks for the growth of running time. So, as n approaches infinity, a Θ(n2) algorithm beats a Θ(n3). There will a point after which no matter how much step up you give to a Θ(n3) algorithm in terms of processing power – Θ(n2) will beat it. This point is n0 .

The O-notation gives an upper bound to the function. When it is used for bounding the worst running time of the algorithm, it gives a bound on every input to the algorithm. Ω-notation gives a lower bound to the function. When it is used to describe the best running time for an algorithm, it gives a bound to any random input.

For example, consider insertion sort on an array A.

1. for i = 2 to length(A)
do k = A[ i ]
j = i – 1;
while j > 0 and A[ j ] > k
do A[ j+1 ] = A[ j ]
j = j – 1;
A[ j+1 ] = k;

The running time of insertion sort is Θ(n2), when the input is reverse sorted. However, if the input is already sorted, it gives a running time of Θ(n).

Saying that running time is O(n2), means that there is a function f(n)=O(n2) so that no matter how large the input is or what the input is, running time will never go beyond f(n).

Saying that running time is Ω(n), means that there is a function
f(n)=Ω(n) so that no matter how large the input is or what the input is , running time will never fall below f(n).

To say running time (of insertion sort) is Ω(n2) means that there is a function f(n) = Ω(n2), such that no matter what size input is chosen it will never go below f(n). This is, of course, incorrect in case of insertion sort.

Thus, insertion sort will take a minimum of Ω(n) and a maximum of O(n2).

Worst case running time is a guarantee of performance, while best-case can cheat since some algorithms work faster on some input.


About this entry