Asymptotic Analysis

Introduction

definition

Algorithm: A finite sequence of “well-defined” instructions to solve a given computational problem.

Algorithm design is an art that demands a lot of creativity, intuition, and perseverance.

There is generally a trade-off between simplicity and efficiency of algorithms (i.e., efficient algorithms are generally more complex)

We don’t want to analyze running time of an algorithm using a wall clock because that depends on the machine, programming language, etc. Instead, we use mathematical analysis.

Word-RAM Model

A word is the basic storage unit of RAM, consisting of a few bytes. Each input item (number, name) is stored in binary format.

RAM can be viewed as a huge 1D array of words.

Any arbitrary location of RAM can be accessed in approximately the same time, irrespective of its position in the array (hence the name “Random Access”).

Data as well as Program reside fully in RAM (otherwise you’d have to load each instruction from the disk which is super inefficient - think of CS2106).

Each arithmetic or logical operation (+, -, *, /, AND, OR, NOT) involving a constant number of words takes a constant number of CPU cycles. (However, note that exponentiation takes logarithmic time if done right)
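As a side note on “done right”: here is a minimal sketch (my own, in Python; not part of the notes) of exponentiation by repeated squaring, which uses only $O(\log n)$ multiplications instead of the naive $n - 1$:

```python
def power(a: int, n: int) -> int:
    """Compute a**n for n >= 0 using O(log n) multiplications (repeated squaring)."""
    result = 1
    while n > 0:
        if n & 1:        # lowest bit of the exponent is set
            result *= a
        a *= a           # square the base
        n >>= 1          # move on to the next bit of the exponent
    return result


assert power(3, 13) == 3 ** 13  # 1594323
```

(Of course, this only counts multiplications; once the numbers grow beyond a single word, each multiplication is itself no longer a constant-time operation in the Word-RAM model.)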

So, an instruction is executed by fetching it first from RAM, decoding it, fetching operands, performing arithmetic or logical operations, and then storing the results back into RAM. Each instruction takes a few clock cycles to be executed (about a few nanoseconds)

In this model, we count the number of instructions executed as a measure of the time taken for a program to run (so we abstract away actual running time). There is an assumption here that all operations (+, *, etc.) take the same time - although multiplication is obviously much harder (and hence, takes longer) than addition.
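As a toy illustration of this counting (my own sketch, in Python; the exact constants are arbitrary, only the growth rate matters):

```python
def sum_with_op_count(arr: list) -> tuple:
    """Sum an array while also counting a rough number of 'abstract instructions'."""
    ops = 1          # initialising the accumulator
    total = 0
    for x in arr:
        total += x
        ops += 2     # roughly: one memory read + one addition per element
    return total, ops


print(sum_with_op_count(list(range(10))))  # (45, 21): ops is about 2n + 1, i.e. linear in n
```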

The reason we analyze the algorithm running time mathematically is that it is independent of the machine, hardware, etc. We are only concerned with the analysis of algorithms, not the running time on actual machines.

Running Time

Running time depends on the size of the input. So, we parameterize the running time by the size of the input. Generally, we seek upper bounds on the running time because, well, everybody likes a guarantee (we want to know for sure that no matter the input, our algorithm cannot take longer than a specific amount of time).

There are 3 kinds of analysis depending on how the inputs are fed to the algorithm:

important
  1. Worst case: $T(n)$ = maximum number of instructions of an algorithm on any input of size $n$ (we assume that an adversary will try to find the worst possible input for our program such that it takes the longest time)
  2. Average case: $T(n)$ = expected number of instructions of an algorithm over all inputs of size $n$. We need some assumption of the statistical distribution of inputs to calculate this properly.
  3. Best case: $T(n)$ = minimum number of instructions of an algorithm on an input of size $n$ (we can just cheat with some naive slow algorithm that works fast on some input)

Example: worst case of bubble sort (reversed array) is $\approx n^2$ instructions, while best case of bubble sort is $\approx n$ instructions (already sorted array).

Example: Best case, worst case, and average case of merge sort are all $\approx n\log n$ instructions.
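To make the bubble sort example concrete, here is a short sketch (my own, in Python; not part of the notes) that counts comparisons. Note that it uses the standard early-exit optimization, which is what gives the $\approx n$ best case:

```python
def bubble_sort(arr: list) -> int:
    """Sort arr in place and return the number of comparisons performed."""
    comparisons = 0
    n = len(arr)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:   # no swaps in a full pass: the array is already sorted
            break
    return comparisons


print(bubble_sort(list(range(1000))))         # sorted input: 999 comparisons (~n)
print(bubble_sort(list(range(1000, 0, -1))))  # reversed input: 499500 comparisons (~n^2/2)
```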

note

Note that for average-case analysis there is no nondeterminism in the algorithm itself: the uncertainty lies in the input, NOT the algorithm. This is different from the expected time complexity of randomised algorithms, where the expectation is over the statistical distribution of randomness internal to the algorithm/program.

Comparing Efficiency

When given two algorithms with their running times, we compare them only for asymptotically large input sizes, because time complexity really matters only for large inputs (for small inputs, even a very inefficient algorithm will terminate in reasonable time).

Different machines/languages, etc. have different running times. So, we don’t measure actual run-time.

We estimate the rate of growth of runtime by asymptotic analysis.

Asymptotic Notations

note

Note that $O(f(n)), \Omega(f(n))$, etc. are sets of functions, i.e., they represent a family of functions that are asymptotically upper bounded (or lower bounded) by $f(n)$.

definition

Big-O notation (Upper Bound): We write $f(n) = O(g(n))$ if there exist constants $c > 0, n_0 > 0$ such that $0 \leq f(n) \leq cg(n)$ for all $n \geq n_0$

In set notation, $O(g(n)) = \{f(n) : \exists c > 0, n_0 > 0 \ s.t. \ \forall n \geq n_0, 0 \leq f(n) \leq cg(n) \}$

(So, actually $f(n) = O(g(n))$ is an abuse of notation! We should be writing $f(n) \in O(g(n))$ since $O(g(n))$ is a set of functions of which $f(n)$ is a member)
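For example (a worked check of my own, not from the notes): to show $3n^2 + 5n \in O(n^2)$, pick $c = 4$ and $n_0 = 5$. Then for all $n \geq 5$ we have $5n \leq n \cdot n = n^2$, so $0 \leq 3n^2 + 5n \leq 3n^2 + n^2 = 4n^2 = c \cdot n^2$, which is exactly what the definition requires.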

definition

Big-$\Omega$ notation (Lower Bound): We write $f(n) = \Omega(g(n))$ if there exist constants $c > 0, n_0 > 0$ such that $0 \leq cg(n) \leq f(n)$ for all $n \geq n_0$

definition

Big-$\Theta$ (Tight Bound): $\Theta(g(n)) = \{ f(n) : \exists c_1, c_2, n_0 > 0 \ s.t. \ \forall n \geq n_0, 0 \leq c_1 g(n) \leq f(n) \leq c_2 g(n) \}$

Note: $\Theta(g(n)) = O(g(n)) \cap \Omega(g(n))$ (proof under Appendix)

There are 2 other notations - little-o and little-$\omega$ (omega) - which are even stricter definitions of their counterparts:

  1. Little-o: $o(g(n)) = \{ f(n) : \forall c > 0, \exists n_0 > 0 \ s.t. \ \forall n \geq n_0, 0 \leq f(n) < cg(n) \}$ (i.e., $g$ is strictly faster-growing than $f$)
  2. Little-$\omega$: $\omega(g(n)) = \{ f(n) : \forall c > 0, \exists n_0 > 0 \ s.t. \ \forall n \geq n_0, 0 \leq cg(n) < f(n) \}$ (i.e., $g$ is strictly slower-growing than $f$)

There are some important points to note about these little-o and little-omega definitions:

  1. You need to find an $n_0$ for every positive constant $c$. In other words, you need some sort of relation between $n_0$ and $c$ so that given a $c$, you can find $n_0$. It is a universal statement in terms of $c$. So, one example of $c$ and $n_0$ cannot be used to prove that a function is $o(g(n))$ or $\omega(g(n))$. However, if you can find one value of $c$ for which there is no $n_0$ that satisfies the above conditions, then we can say that it is not $o(g(n))$ or $\omega(g(n))$. (See the worked example after this list.)
  2. Notice the strict inequality between $cg(n)$ and $f(n)$ as opposed to the non-strict inequality for Big-O and Big-$\Omega$.
  3. It should be obvious that:
    1. $f(n) \in o(g(n)) \implies f(n) \in O(g(n))$ (if you can find an $n_0$ for every positive value of $c$, you can pick any combination to satisfy the Big-O definition)
    2. $f(n) \in \omega(g(n)) \implies f(n) \in \Omega(g(n))$ (same as above)
  4. If $f(n) \in o(g(n))$ or $f(n) \in \omega(g(n))$, then $f(n) \notin \Theta(g(n))$ (since $g(n)$ either grows faster or slower than $f(n)$, it doesn’t grow at the same rate as $f(n)$, which is a requirement for $f(n)$ to be tightly bounded by $g(n)$).
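A worked example of point 1 (my own, not from the notes): to show $n \in o(n^2)$, we must handle every $c > 0$. Given any $c > 0$, choose $n_0 = \lceil 1/c \rceil + 1$. Then for all $n \geq n_0$ we have $n > 1/c$, so $cn > 1$ and hence $n = 1 \cdot n < (cn) \cdot n = cn^2$, i.e. $0 \leq n < cn^2$. Since an $n_0$ can be produced for every $c$, we conclude $n \in o(n^2)$.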

Limits

Assume $f(n), g(n) > 0$. Then, the following results are true:

  1. $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0 \implies f(n) \in o(g(n))$
  2. $\lim_{n \to \infty} \frac{f(n)}{g(n)} < \infty \implies f(n) \in O(g(n))$
  3. $0 < \lim_{n \to \infty} \frac{f(n)}{g(n)} < \infty \implies f(n) \in \Theta(g(n))$
  4. $\lim_{n \to \infty} \frac{f(n)}{g(n)} > 0 \implies f(n) \in \Omega(g(n))$
  5. $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty \implies f(n) \in \omega(g(n))$

We prove the first one using the epsilon-delta definition of limits:

  1. Since $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$, by definition, we have:
    • For all $\epsilon > 0$, there exists $\delta > 0$ such that $\frac{f(n)}{g(n)} < \epsilon$ for $n > \delta$ (as you increase $n$ beyond $\delta$, the ratio goes below $\epsilon$)
  2. Setting $c = \epsilon$ and $n_0 = \delta$, we have:
    • For all $c > 0$, there exists $n_0 > 0$ such that $\frac{f(n)}{g(n)} < c$ for $n > n_0$
    • Hence, for all $c > 0$, there exists $n_0 > 0$ such that $f(n) < cg(n)$ for $n > n_0$.
    • So, by definition, $f(n) \in o(g(n))$
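As a quick application of these rules (an example of my own, not from the notes): take $f(n) = 2n^2 + 3n$ and $g(n) = n^2$. Then $\lim_{n \to \infty} \frac{2n^2 + 3n}{n^2} = \lim_{n \to \infty} \left(2 + \frac{3}{n}\right) = 2$, which is finite and strictly positive, so by rule (3), $2n^2 + 3n \in \Theta(n^2)$. Similarly, $\lim_{n \to \infty} \frac{\log n}{n} = 0$, so $\log n \in o(n)$ by rule (1).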

Other Important Properties

Transitivity

Should be obvious intuitively (e.g. if $g$ upper bounds $f$, and $h$ upper bounds $g$, then obviously $h$ upper bounds $f$). All of the following are true because $<, >, \leq, \geq$ are transitive. (A short sketch for the Big-O case is given after the list.)

  1. $f(n) = \Theta(g(n)) \land g(n) = \Theta(h(n)) \implies f(n) = \Theta(h(n))$
  2. $f(n) = O(g(n)) \land g(n) = O(h(n)) \implies f(n) = O(h(n))$
  3. $f(n) = \Omega(g(n)) \land g(n) = \Omega(h(n)) \implies f(n) = \Omega(h(n))$
  4. $f(n) = o(g(n)) \land g(n) = o(h(n)) \implies f(n) = o(h(n))$
  5. $f(n) = \omega(g(n)) \land g(n) = \omega(h(n)) \implies f(n) = \omega(h(n))$
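A quick sketch of (2) using the definitions (my own, not from the notes): if $f(n) \leq c_1 g(n)$ for all $n \geq n_1$ and $g(n) \leq c_2 h(n)$ for all $n \geq n_2$, then $f(n) \leq c_1 c_2 h(n)$ for all $n \geq \max(n_1, n_2)$, so $f(n) = O(h(n))$ with constants $c = c_1 c_2$ and $n_0 = \max(n_1, n_2)$. The other cases follow the same pattern.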

Reflexivity

  1. $f(n) = \Theta(f(n))$
  2. $f(n) = O(f(n))$
  3. $f(n) = \Omega(f(n))$

In particular, note that $f(n) \neq o(f(n))$ and $f(n) \neq \omega(f(n))$ because $f$ cannot grow faster or slower than $f$. (So, if $f(n) = \Theta(g(n))$, then $f$ cannot be $o(g(n))$ or $\omega(g(n))$ → here, $g(n)$ = $f(n)$ itself)

Symmetry

  1. $f(n) = \Theta(g(n)) \iff g(n) = \Theta(f(n))$ (if $f$ grows at the same rate as $g$, then $g$ grows at the same rate as $f$ → since equality is symmetrical)

Complementarity

  1. $f(n) = O(g(n)) \iff g(n) = \Omega(f(n))$ (saying that $f$ is upper bounded by $g$ is the same as saying that $g$ is lower bounded by $f$; see the sketch below)
  2. $f(n) = o(g(n)) \iff g(n) = \omega(f(n))$ (if $f$ grows strictly slower than $g$, then $g$ grows strictly faster than $f$)
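A quick check of (1) (my own sketch, not from the notes): if $0 \leq f(n) \leq c \, g(n)$ for all $n \geq n_0$, then dividing by $c > 0$ gives $0 \leq \frac{1}{c} f(n) \leq g(n)$ for all $n \geq n_0$, so $g(n) = \Omega(f(n))$ with constant $1/c$ and the same $n_0$; reading the argument backwards gives the reverse direction.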

Appendix

Useful Mathematical Facts

Properties of logarithms and exponents

  1. $a^x = m \iff x = \log_a(m)$ → definition of logarithm
  2. $\log_a(mn) = \log_a(m) + \log_a(n)$ → product property
  3. $\log_a(m/n) = \log_a(m) - \log_a(n)$ → quotient property
  4. $\log(m^n) = n\log(m)$ → power property
  5. $\log_a(b) = 1/\log_b(a)$
  6. $\log_b a = \log_c(a) \times \log_b(c) = \log_c(a)/\log_c(b)$ → change of base property
  7. $a^{\log_a(x)} = x$ → number raised to log
  8. $\log_a(a) = 1$
  9. $\log_a(1) = 0$
  10. $\log_a(1/b) = -\log_a(b)$
  11. $a^{\log_b(x)} = x^{\log_b(a)}$ (when $x, a > 0$) → can take $\log$ on both sides to verify it’s true
  12. $a^m a^n = a^{m + n}$
  13. $a^m/a^n = a^{m - n}$
  14. $1/a^m = a^{-m}$
  15. $(a^m)^n = a^{mn}$
  16. $(ab)^m = a^m b^m$
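A quick numerical sanity check of a few of these identities (my own sketch, in Python; the values of $a$, $b$, $x$ are arbitrary):

```python
import math

a, b, x = 2.0, 3.0, 7.0

# Property 6 (change of base): log_b(a) = log_c(a) / log_c(b), taking c = e
assert math.isclose(math.log(a, b), math.log(a) / math.log(b))

# Property 7: a^(log_a(x)) = x
assert math.isclose(a ** math.log(x, a), x)

# Property 11: a^(log_b(x)) = x^(log_b(a))
assert math.isclose(a ** math.log(x, b), x ** math.log(a, b))

print("all checks passed")
```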

Arithmetic series:

$$a + (a + d) + (a + 2d) + \dots + (a + (n-1)d) = \dfrac{n}{2}(2a + (n-1)d)$$
$$1 + 2 + 3 + \dots + n = \dfrac{n(n+1)}{2} = \Theta(n^2)$$

Geometric series:

$$a + ar + ar^2 + \dots + ar^{n-1} = \dfrac{a(r^n - 1)}{r - 1}$$
$$a + ar + ar^2 + \cdots = \dfrac{a}{1 - r} \quad \text{if } |r| < 1$$

Sum of Squares:

$$1^2 + 2^2 + \dots + n^2 = \dfrac{n(n+1)(2n + 1)}{6} = \Theta(n^3)$$

Sum of Cubes:

$$1^3 + 2^3 + 3^3 + \dots + n^3 = \left(\dfrac{n(n+1)}{2}\right)^2 = \Theta(n^4)$$

Harmonic Series:

$$\dfrac{1}{1} + \dfrac{1}{2} + \dfrac{1}{3} + \dots + \dfrac{1}{n} = \Theta(\log n)$$

Stirling’s approximation: $\left(\frac{n}{e}\right)^n \leq n! \leq n^n$

$$\log 1 + \log 2 + \log 3 + \dots + \log n = \Theta(n\log n)$$

Other important series:

$$\dfrac{1}{2^1} + \dfrac{2}{2^2} + \dfrac{3}{2^3} + \dots + \dfrac{n}{2^n} \leq \sum_{i=1}^\infty \dfrac{i}{2^i} = 2 = O(1)$$
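A quick numerical check of two of the facts above (my own sketch, in Python):

```python
import math

# Harmonic series: H_n grows like ln(n), i.e. Theta(log n)
n = 10**6
H = sum(1 / k for k in range(1, n + 1))
print(H, math.log(n))  # about 14.39 vs 13.82 (they differ by roughly the constant 0.577)

# The series 1/2 + 2/4 + 3/8 + ... converges to 2
print(sum(i / 2**i for i in range(1, 60)))  # about 2.0
```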

Proof of $\Theta(g(n)) = O(g(n)) \cap \Omega(g(n))$

Since all the above terms are sets (families of functions), we have to show that any arbitrary element of LHS is also in RHS, and vice versa. (Recall that proving that 2 sets are equal is the same as showing that each set is a subset of the other)

  1. So, let $f(n) \in \Theta(g(n))$ be an arbitrary element of $\Theta(g(n))$.
  2. Then, by definition, $0 \leq c_1 g(n) \leq f(n) \leq c_2 g(n)$ for some constants $c_1, c_2 > 0$ and for all $n > n_0$ for some $n_0$.
  3. The above inequality can be rewritten as: $0 \leq c_1 g(n) \leq f(n)$ for all $n > n_0$ and $0 \leq f(n) \leq c_2 g(n)$ for all $n > n_0$.
  4. By definition of Big-Omega and Big-O notation, we can rewrite (3) as: $f(n) \in \Omega(g(n))$ and $f(n) \in O(g(n))$, i.e., $f(n) \in O(g(n)) \cap \Omega(g(n))$.
  5. So, we have proved one direction of the equality.
  6. Now, we show that if $f(n)$ (an arbitrary but particular function) is $O(g(n))$ and $\Omega(g(n))$, then $f(n) \in \Theta(g(n))$.
  7. Since $f(n) \in O(g(n))$, by definition: there exist some constants $c_1, n_1$ such that $\forall n > n_1, 0 \leq f(n) \leq c_1 g(n)$.
  8. Similarly, since $f(n) \in \Omega(g(n))$, by definition: there exist some constants $c_2, n_2$ such that $\forall n > n_2, 0 \leq c_2 g(n) \leq f(n)$.
  9. We pick $n_0 = \max(n_1, n_2)$ (to ensure the above inequalities hold simultaneously, we have to pick the intersection of $n > n_1$ and $n > n_2$, which is given by $n > \max(n_1, n_2)$).
  10. Then, combining the above inequalities, we get: $\forall n > n_0, \ 0 \leq c_2 g(n) \leq f(n) \leq c_1 g(n)$, which means that $f(n) \in \Theta(g(n))$ by definition of Big-Theta.
  11. Hence, $\Theta(g(n)) = O(g(n)) \cap \Omega(g(n))$.