Complexity: why and where it matters.

I started this blog for those interested in computer science, and coming from that background, I expect you’ll be interested in how an algorithm’s complexity really makes a difference. I also expect that you know what I’m talking about when I say complexity. If you don’t, it would really be a bother to have to explain it all over again, but luckily I don’t have to!
Here are a few links from Wikipedia and Cornell University to get you started.

Once you’re through with that, check out this table I made, which shows how some of the most commonly used computational complexity functions stack up against each other as the number of inputs ‘n’ increases:

Note: this table contains exact values, but it’s the order of magnitude that really matters. The same algorithm might be coded in different ways by different programmers, so the exact function of ‘n’ that describes the algorithm’s complexity may differ from one code to another. However, if the method of approaching the problem is the same in all of those codes, the asymptotic notations O(n), O(n.logn) etc. will be the same as well.
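
To make that concrete, here’s a tiny Java sketch (the class and method names are my own, purely for illustration): two ways of finding the maximum of an array that do slightly different amounts of work per element, and so have different exact step counts, yet are both O(n).

```java
// Illustrative sketch: two implementations with different constant factors,
// but the same O(n) asymptotic complexity.
class MaxFinder {
    // Version 1: one comparison per element.
    static int maxV1(int[] a) {
        int best = a[0];
        for (int i = 1; i < a.length; i++) {
            if (a[i] > best) best = a[i];
        }
        return best;
    }

    // Version 2: tracks the index of the maximum instead, doing a little more
    // work per element. The exact step count is higher, but it is still O(n).
    static int maxV2(int[] a) {
        int bestIndex = 0;
        for (int i = 1; i < a.length; i++) {
            if (a[i] > a[bestIndex]) bestIndex = i;
        }
        return a[bestIndex];
    }
}
```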


Observing function values at various values of ‘n’:

| n | n! | 2^n | n^3 | n^2.log(n) [base 2] | n^2 | n.log(n) [base 2] | n | log(n) [base 2] |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 2 | 1 | 0 | 1 | 0 | 1 | 0 |
| 2 | 2 | 4 | 8 | 4 | 4 | 2 | 2 | 1 |
| 3 | 6 | 8 | 27 | 14.265 | 9 | 4.755 | 3 | 1.585 |
| 6 | 720 | 64 | 216 | 93.059 | 36 | 15.51 | 6 | 2.585 |
| 10 | **3.6288 E6** | 1,024 | 1,000 | 332.193 | 100 | 33.219 | 10 | 3.322 |
| 15 | 1.308 E12 | 32,768 | 3,375 | 879.5 | 225 | 58.603 | 15 | 3.907 |
| 21 | 5.1091 E19 | **2.097 E6** | 9,261 | 1,937.012 | 441 | 92.239 | 21 | 4.392 |
| 28 | 3.048 E29 | 2.684 E8 | 21,952 | 3,768.966 | 784 | 134.606 | 28 | 4.807 |
| 36 | 3.72 E41 | 6.872 E10 | 46,656 | 6,700.223 | 1,296 | 186.117 | 36 | 5.17 |
| 45 | 1.962 E56 | 3.5124 E13 | 91,125 | 11,121.003 | 2,025 | 247.133 | 45 | 5.492 |
| 55 | 1.26964 E73 | 3.603 E16 | 166,375 | 17,488.613 | 3,025 | 317.975 | 55 | 5.781 |
| 66 | 5.443 E92 | 7.739 E19 | 287,496 | 26,329 | 4,356 | 398.93 | 66 | |
| 78 | 1.132 E115 | 3.022 E23 | 474,552 | 38,240 | 6,084 | 490.26 | 78 | |
| 91 | 1.352 E140 | 2.476 E27 | 753,571 | 53,891 | 8,281 | 592.21 | 91 | |
| 105 | 1.081 E168 | 4.056 E31 | **1.157 E6** | 74,024 | 11,025 | 704.996 | 105 | 6.714 |
| 120 | 6.69 E198 | 1.329 E36 | | | | | 120 | |
| 136 | | 8.711 E40 | | 131,089 | 18,496 | 963.895 | 136 | |
| 153 | | 1.142 E46 | 3.58 E6 | | | | 153 | |
| 171 | | 2.993 E51 | | 216,905 | 29,241 | 1,268 | 171 | |
| 190 | 9.68 E351 | 1.569 E57 | | 273,271 | | | 190 | |
| 210 | | 1.6455 E63 | 9.261 E6 | 340,198 | 44,100 | 1,620 | 210 | 7.714 |
| … | … | … | … | … | … | … | … | … |
| 300 | 3.0606 E614 | 2.037 E90 | 2.7 E7 | 740,793 | 90,000 | 2,468 | 300 | 8.229 |
| … | … | … | … | … | … | … | … | … |
| 351 | | | 4.324 E7 | **1.042 E6** | 123,201 | 2,968 | 351 | |
| … | … | … | … | … | … | … | … | … |
| 496 | | | 1.22 E8 | 2.203 E6 | 246,016 | 4,441 | 496 | 8.954 |
| … | … | … | … | … | … | … | … | … |
| 666 | 1.0106 E1593 | 3.062 E200 | 2.954 E8 | 4.16 E6 | 443,556 | 6,246 | 666 | 9.379 |
| … | … | … | … | … | … | … | … | … |
| 820 | | | 5.5139 E8 | 6.5085 E6 | 672,400 | 7,937 | 820 | 9.6795 |
| … | … | … | … | … | … | … | … | … |
| 1,035 | | 3.682 E311 | 1.109 E9 | 1.073 E7 | **1.071 E6** | 10,365 | 1,035 | 10.0154 |
| … | … | … | … | … | … | … | … | … |
| 1,540 | | | 3.652 E9 | 2.51 E7 | 2.372 E6 | 16,306 | 1,540 | 10.589 |
| … | … | … | … | … | … | … | … | … |
| 2,415 | | | 1.4084 E10 | 6.554 E7 | 5.83 E6 | 27,139 | 2,415 | |
| … | … | … | … | … | … | … | … | … |
| 4,005 | | | 6.424 E10 | 1.92 E8 | 1.604 E7 | 47,930 | 4,005 | 11.968 |
| … | … | … | … | … | … | … | … | … |
| 6,670 | | | 2.9674 E11 | 5.65 E8 | 4.45 E7 | 84,732 | 6,670 | 12.7035 |
| … | … | … | … | … | … | … | … | … |
| 10,011 | 2.865 E37 | 4.086 E3013 | 1.0033 E12 | 1.332 E9 | 1.002 E8 | 133,039 | 10,011 | 13.289 |
| 25,000 | | | 1.5625 E13 | 9.131 E9 | 6.25 E8 | 365,241 | 25,000 | 14.61 |
| 100,000 | | | 1 E15 | 1.661 E11 | 1 E10 | **1.661 E6** | 100,000 | 16.61 |
| 250,000 | | | 1.5625 E16 | 1.121 E12 | 6.25 E10 | 4.483 E6 | 250,000 | 17.93 |
| 1 E6 | 8.264 E(5,565,708) | 9.9 E(301,029) | 1 E18 | 1.993 E13 | 1 E12 | 1.993 E7 | 1 E6 | 19.932 |


I really love this table because it shows just how massive n! and 2^n really are in comparison with the other functions.

I’ve also put in bold the point in each column where the number of steps for that function first exceeds a million:

  • For n^2, it happens when the input is a thousand. For n.log(n) (base 2), it is around a hundred thousand. But for 2^n it happens at n=20, and for n!, at just n=10. This just goes to show why complexity is key when writing your algorithms. There is a set of problems called NP-complete problems (like the Travelling Salesman) for which no polynomial-time algorithm is known: even with dynamic programming, the best exact algorithms take either O(2^n) or O(n!) time. Whether they can be done in polynomial time is the famous million-dollar P vs NP question, and a polynomial-time solution to any one of them would indirectly solve all the others.
  • For log(n) (base 2), it isn’t shown, but basic logarithm maths should point you to the last row of the 2^n column. See that number (9.9 E(301,029))? That’s how many inputs we would need before our O(logn) program takes even a million steps.
    Putting that into perspective, the distance from the Earth to the Sun is around 1.5 E23 ….. picometers.
    This gives you an idea of just how efficient binary search, inserting into a binary search tree, etc., really are. If we have 100 billion sorted integers (which would take about 400 GB of hard drive space at 4 bytes apiece), binary search will find the right number in just 37 steps (see the sketch just after this list).
  • The simple logarithmic function is just about as close to O(1) as you can get, which is why a binary search tree is so very, very efficient at finding an element, inserting an element, deleting an element, and so on. The heap, the structure underlying heap-sort (an O(n.log(n)) sorting algorithm), is itself a binary tree.
    In fact, there are very few applications in which the complexity of using a balanced Binary Search Tree exceeds O(n). That’s why the BST is my favorite data structure.
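
Here’s what that binary search looks like in code: a minimal Java sketch of my own (the class name and demo data are just illustrative; the standard library already provides Arrays.binarySearch). The search range halves every iteration, which is exactly where the ~37 steps for 10^11 elements comes from.

```java
import java.util.Arrays;

// Minimal iterative binary search sketch (illustration only). Each iteration
// halves the remaining range, so it runs about log2(n) times:
// roughly 37 iterations for n = 10^11 sorted elements.
class BinarySearchDemo {
    static int binarySearch(long[] sorted, long key) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // avoids overflow of (lo + hi)
            if (sorted[mid] == key) return mid;
            if (sorted[mid] < key) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        long[] data = {2, 3, 5, 7, 11, 13, 17, 19};
        System.out.println(binarySearch(data, 13));         // prints 5
        System.out.println(Arrays.binarySearch(data, 13));  // library equivalent
    }
}
```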

I’ve often wondered whether it’s worthwhile making a ternary search tree, so I drew up a graph using that great graphing tool, the Desmos graphing calculator.

In the big picture (see below), you can see what I found: it doesn’t make much difference which base you use; logarithmic time looks pretty similar either way.

[Desmos graph: log(n) plotted with different bases]

Maths note: log(2) ≈ 0.3 (base 10)
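
And a quick bit of maths on why the base hardly matters (just the change-of-base rule, nothing more):

log(n) [base 3] = log(n) [base 2] / log(3) [base 2] ≈ 0.63 × log(n) [base 2]

So a ternary tree would only ever shave a constant factor off the depth of a binary one, a bit over a third of the levels; the shape of the curve stays exactly the same.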

I think I’ve made my point, but if you’re still unconvinced, check this out:

You probably know that merge and heap sort are very efficient, with their O(n.log(n)) worst-case time complexity, but I think the numbers do the real talking:

  • To reach a million steps, the O(n.log(n))-complexity merge/heap sort requires around n=100,000 elements as input.
  • Bubble/insertion/quick sort with O(n^2) complexity require only an input of n=1,000 to reach 1 million steps.

That’s a hundredfold gap . . . but then you notice that non-comparative sorts like radix, bucket and counting sort can sort in O(n), i.e. time proportional to the number of elements you have in the input array . . . whereas it has been proven that comparison-based sorting algorithms can never do better than O(n.log n).
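
To give a flavour of how a non-comparative sort dodges that barrier, here is a rough counting-sort sketch in Java (my own illustration, assuming non-negative integers with a small, known maximum value): it never compares one element with another, it just tallies how often each value occurs, so it runs in O(n + k) for n elements with values up to k.

```java
// Rough counting sort sketch (illustration): sorts non-negative ints whose
// values are at most maxValue. No element-to-element comparisons are made;
// we simply count occurrences, so the running time is O(n + maxValue).
class CountingSortDemo {
    static int[] countingSort(int[] a, int maxValue) {
        int[] counts = new int[maxValue + 1];
        for (int x : a) counts[x]++;               // tally each value
        int[] sorted = new int[a.length];
        int i = 0;
        for (int v = 0; v <= maxValue; v++) {      // emit values in order
            for (int c = 0; c < counts[v]; c++) sorted[i++] = v;
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 1, 3, 0, 5};
        System.out.println(java.util.Arrays.toString(countingSort(data, 5)));
        // prints [0, 1, 1, 3, 4, 5]
    }
}
```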

So why do the other sorting algorithms even exist?

It’s worth noting that even the best O(n.log(n)) comparison-based and O(n) non-comparative sorts have a space complexity of O(n), as compared to the O(1) (or near-O(1)) auxiliary space of sorting algorithms like bubble/insertion/quick sort, which take O(n^2) time. Also, they all have their own special sets of weaknesses.

Which is pretty bad. I mean, okay, all sorting algorithms take at least O(n) space, to store their data structure of ‘n’ elements, but the auxiliary (i.e. ‘extra’) space required for most O(n.log(n)) and O(n) sorts is also O(n). And a recursive implementation of a sorting algorithm adds stack space proportional to the depth of the recursion (O(log n) for merge sort, and up to O(n) for quicksort in its worst case), because the previous states are saved on the system’s function call stack.

The exception is heap-sort: if you build the heap in place (i.e. you modify the original array you are given), the space complexity becomes O(1), i.e. you only need a fixed number of variables to implement the sort, and your time complexity still doesn’t exceed O(n.log(n)) steps.
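
Here’s roughly what that in-place heap sort looks like in Java (a textbook-style sketch of my own, not any particular library’s code): the given array itself is rearranged into a max-heap, then the maximum is repeatedly swapped to the end, so the only extra storage is a handful of local variables.

```java
// Sketch of in-place heap sort: O(n log n) time, O(1) auxiliary space.
// The input array is first turned into a max-heap, then the maximum is
// repeatedly swapped to the end of the shrinking heap.
class HeapSortDemo {
    static void heapSort(int[] a) {
        int n = a.length;
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, i, n);  // build the heap
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);          // move current max to its final place
            siftDown(a, 0, end);      // restore the heap over a[0..end-1]
        }
    }

    static void siftDown(int[] a, int root, int size) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= size) return;
            if (child + 1 < size && a[child + 1] > a[child]) child++;
            if (a[root] >= a[child]) return;
            swap(a, root, child);
            root = child;
        }
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] data = {7, 2, 9, 4, 1};
        heapSort(data);
        System.out.println(java.util.Arrays.toString(data)); // [1, 2, 4, 7, 9]
    }
}
```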

So my Completely Unrequested Recommendation (CUR) is: use Heap Sort! 😛

In actuality, there exist so many different sorting algorithms mostly because, when you consider things like the average case, the limitations of different systems, and functionality…

We need to use different sorts in different situations.

Quicksort may have quadratic time… however, that is only in the worst case. In the average case, it is O(n.logn). Heapsort, meanwhile, will be O(n.logn) always, because of the indirect route it takes towards sorting an array. In real life (the Arrays.sort() function for primitive arrays in Java, for example), quicksort is what gets used most often.

Then there’s stability.

Merge, radix, bubble and insertion sorts are stable, but heap sort is not. So for specific applications, you need a different kind of sort.

And another thing:

People keep finding new ways to improve them.

For example: quicksort gives an average-case complexity of n.log(n) (base 2), which is optimal, and achievable by selecting your ‘pivot’ sensibly. Then someone came along and realized that if you choose your pivot as the median of your first, last and middle elements, you get slightly better results. And if you select your pivot randomly, it gets safer still, because you foil the plans of someone who purposely wants to attack your system by feeding it a worst-case input every time (it’s happened… people are weird).
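
For instance, a median-of-three quicksort might look something like this Java sketch (my own illustration; real library implementations, such as the dual-pivot quicksort behind Java’s Arrays.sort for primitives, are considerably more refined):

```java
// Sketch of quicksort with a median-of-three pivot (illustration only).
class QuickSortDemo {
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }

    // Pick the median of a[lo], a[mid], a[hi] as the pivot, stash it at the
    // end, then do a standard Lomuto partition around it.
    static int partition(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] < a[lo]) swap(a, mid, lo);
        if (a[hi]  < a[lo]) swap(a, hi, lo);
        if (a[hi]  < a[mid]) swap(a, hi, mid);   // now a[mid] holds the median
        swap(a, mid, hi);                        // move pivot to the end
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) swap(a, i++, j);
        }
        swap(a, i, hi);
        return i;
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] data = {9, 3, 7, 1, 8, 2};
        quickSort(data, 0, data.length - 1);
        System.out.println(java.util.Arrays.toString(data)); // [1, 2, 3, 7, 8, 9]
    }
}
```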

Still, if you know anything about algorithms, you know that calculating average case correctly is as difficult as teaching a dog how to play a piano . . . and not half as rewarding!

Welp, that’s all folks! I’m going to be trying to leaf around with all the concepts of trees in my next post, so keep your thumbs pointed to the stars and your browsers open!

 

Abhishek Divekar,

is a student at Veermata Jijabai Technological Institute. When he’s not juggling submissions and inhaling canteen food, he is an avid reader and spends his time learning about the world in general and computers in particular.