BREAKING NEWS:
Genius mathematicians and Algorithms 101 students concur: sorting is “not that easy”.
Well, I guess that clears things up.
Thanks for reading.
One Abe in a bar is worth two Homers in a bush.
But seriously, there are a couple of things that make sorting really hard:
- You can’t just get it done in one go; you have to pass through your array/tuple multiple times.
- Every time you move an element, you must make sure you don’t accidentally push it into the wrong place and overwrite some other data.
- Worse, you have to get a machine to do it. And computers are really strange sometimes.
- And you want your sort to be really fast, too.
These are all reasons why sorting is taught in Algo 101 classes and why brilliant mathematicians are still working on the problem.
We still haven’t found one perfect way of doing it; sorting remains really difficult.
Okay, maybe not that difficult.
So, maybe it’s a good idea to start this post with why and how we sort data.
Disclaimer: this post begins with the following long, boring, definition on sorting.
The main purpose of sorting information is to optimise its usefulness for specific tasks. In general, there are two ways of grouping information: by category e.g. a shopping catalogue where items are compiled together under headings such as ‘home’, ‘sport & leisure’, ‘women’s clothes’ etc. (nominal scale) and by the intensity of some property, such as price, e.g. from the cheapest to most expensive (ordinal scale).
Richard Saul Wurman, in his book Information Anxiety, proposes that the most common ways of sorting are by location, alphabetically, by time, by category, and by hierarchy (the first three are really special cases of category and hierarchy). Together these give the acronym LATCH (Location, Alphabetical, Time, Category, Hierarchy), which can be used to describe just about every type of ordered information.
The opposite of sorting, rearranging a sequence of items in a random or meaningless order, is called shuffling.
For sorting, either a weak order (“should not come after”) or a strict weak order (“should come before”) can be specified. For the sorting to be unique, these must be restricted to a total order and a strict total order, respectively.
Generally objects can be sorted based on a property. Such a component or property is called a sort key. For example, the items are books, the sort key is the title, subject or author, and the order is alphabetical. A new sort key can be created from two or more sort keys by lexicographical order. The first is then called the primary sort key, the second the secondary sort key, etc. For example, addresses could be sorted using the city as primary sort key, and the street as secondary sort key…
…Basically, the small thing should come before the big thing. What the small thing is, depends on the situation.
Sometimes it is an integer. Sometimes it is a character. Sometimes it is a whole string of numbers, or even the addresses of houses.
Sometimes, we even put the big thing before the small thing. This is called sorting in descending order. By default though, when we sort, we sort in ascending order, because it makes a nice sound.
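To make the sort-key idea concrete, here is a small Python sketch (all the data is invented): we sort addresses with the city as primary key and the street as secondary key, and then sort some prices in descending order.

```python
# Hypothetical addresses as (city, street) pairs.
addresses = [("Springfield", "Oak St"), ("Albany", "Main St"), ("Springfield", "Elm St")]

# Primary key: city; secondary key: street. Tuples compare
# lexicographically, so a tuple key gives us both at once.
addresses.sort(key=lambda a: (a[0], a[1]))
print(addresses)
# [('Albany', 'Main St'), ('Springfield', 'Elm St'), ('Springfield', 'Oak St')]

# Ascending is the default; reverse=True gives descending order.
prices = [530, 270, 1450]
print(sorted(prices, reverse=True))
# [1450, 530, 270]
```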
Types of sorting algorithms:

Comparative sorting algorithms:
these include (but are not restricted to) insertion, bubble, Shell, merge, heap and quicksort. In these algorithms, every step involves comparing entire keys with one another and rearranging the elements (either in place or using auxiliary, i.e. extra, memory).
It has been proven that the lower bound for comparison-based sorting, both on average and in the worst case, is Ω(n log n). There are modifications that can be made to comparison-based sorting algorithms so that they sort in O(n) time when the array is already sorted or reverse sorted, but for any random input array, a comparison-based sorting algorithm will not do much better than Θ(n log n).
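As a minimal sketch of what “comparison-based” means, here is insertion sort instrumented to count its comparisons; note how a pre-sorted input needs only n − 1 of them, while a shuffled one needs many more:

```python
def insertion_sort(arr):
    """Plain insertion sort; returns (sorted copy, number of comparisons)."""
    a = list(arr)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements right until key's place is found;
        # every decision the algorithm makes is a key-vs-key comparison.
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comparisons

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # random input: many comparisons
print(insertion_sort([1, 2, 3, 4, 5, 6]))  # already sorted: n - 1 comparisons
```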
Non-comparative sorting algorithms:
the major ones are bucket sort, counting sort, and radix sort. Here, we don’t compare the elements directly; instead we do something else. All the non-comparative sorting algorithms use auxiliary memory, i.e. memory other than the input array. All of them resemble hashing, and most of them run in linear time. This is yet another example of the space-time trade-off that exists in the development of algorithms.
Space and time complexity of all sorting algorithms
I really encourage you to see the above link, which is about the space and time complexities of all the various sorting algorithms. It helps a lot to see how they stack up.
Also, I really, really recommend this short, informative, fun YouTube video about how fast different sorting algorithms run in real time. Every algorithms student and programmer should see it. Here is also a playlist of all of them individually and slowed down.
Other than time and space complexity, there is one more criterion that you need to consider while choosing which sorting algorithm to use…
Stability:
Now, an important class of sorting algorithms is the stable sorting algorithms, i.e. those that maintain the relative order of objects with equal keys. A key is the property of the objects with respect to which you wish to sort them.
For example, suppose you are given a list of students, pre-sorted by roll-call number, and we now sort them by name (first name only). Then Alice who is #14 will come before Alice who is #27, and Robert Ludlum who is #60 will come after Robert Stark who is #35 (remember, we are sorting by first name only), because that is the relative order they were in before we decided to sort the children by name.
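Python’s built-in sort happens to be stable, so the roll-call example can be checked directly (the surnames and the extra student are invented for illustration):

```python
# (roll number, full name) pairs, pre-sorted by roll number.
students = [(5, "Zara Quinn"), (14, "Alice Jones"), (27, "Alice Smith"),
            (35, "Robert Stark"), (60, "Robert Ludlum")]

# Sort by first name only. Because the sort is stable, Stark (#35)
# stays before Ludlum (#60) even though "Ludlum" < "Stark":
# the surnames are never looked at, and ties keep their old order.
by_first_name = sorted(students, key=lambda s: s[1].split()[0])
print(by_first_name)
# [(14, 'Alice Jones'), (27, 'Alice Smith'),
#  (35, 'Robert Stark'), (60, 'Robert Ludlum'), (5, 'Zara Quinn')]
```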
Stability is more important than you think.
Tower of Hanoi: advanced mode.
Take another example:
You are buying a new cellphone. You check out various candidates, weighing different features and operating systems, but in the end it all comes down to one thing: money.
You make a table of all your possible candidates with their respective prices. Initially, you’d probably make a list that looks like this:
Samesung DS4  $530
Samesung NotDifferentAtAll  $750
Samesung Not  $650
Samesung Not3  $750
Samesung DS5  $630
macromin Easel  $270
Mapple myPhone 6.0.0.0.2  $1450
Mapple myPhone 5  $900
Mapple myPhone 6  $1150
Karcoal Uranium  $330
Voogle Xenus 5  $650
HTTPS Two  $890
HTTPS Desire 982  $800
Okai Candella L432 – 320 uberpixel Zion Lens  $1000
Blueberry 3.1416  $490
As you can see, these are all grouped by device manufacturer, but they are not sorted otherwise.
Suppose you want to sort this data by price. You could use a stable or unstable sorting algorithm to do this, and you would get a table like this:
macromin Easel  $270
Karcoal Uranium  $330
Blueberry 3.1416  $490
Samesung DS4  $530
Samesung DS5  $630
Voogle Xenus 5  $650
Samesung Not  $650
Samesung Not3  $750
Samesung NotDifferentAtAll  $750
HTTPS Desire 982  $800
HTTPS Two  $890
Mapple myPhone 5  $900
Okai Candella L432 – 320 uberpixel Zion Lens  $1000
Mapple myPhone 6  $1150
Mapple myPhone 6.0.0.0.2  $1450
Now, suppose you want to group them back into company name, but you don’t want them to be jumbled back into how they were in the first one. In other words, you still want them to be sorted by price. This means when you sort a second time, you need to make sure that phones with the same company name retain their relative positions. Meaning:
When sorting with respect to multiple keys in succession, stable sorting algorithms are required.
We first sort with respect to the least important parameter, and then successively the more important parameters.
This requires you to run two sorts in succession, first by price (the less important key) and then by company name (the more important one), to get an arrangement like this:
Blueberry 3.1416  $490
HTTPS Desire 982  $800
HTTPS Two  $890
Karcoal Uranium  $330
macromin Easel  $270
Mapple myPhone 5  $900
Mapple myPhone 6  $1150
Mapple myPhone 6.0.0.0.2  $1450
Okai Candella L432 – 320 uberpixel Zion Lens  $1000
Samesung DS4  $530
Samesung DS5  $630
Samesung Not  $650
Samesung Not3  $750
Samesung NotDifferentAtAll  $750
Voogle Xenus 5  $650
You can tell which keys are considered most important by looking at which form the broadest groups: company name, for example, is the broadest here. Only when company names clash do we need to fall back on further keys. Keeping data sorted with respect to multiple keys helps searching tremendously, as we can use binary search on each key in turn. Companies like Amazon let you shop for items in much the same way.
Using stable sorting algorithms hence makes searching faster.
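The two-pass trick is short enough to show in Python, using a handful of the (fictional) phones from the tables above. The least important key goes first; stability then guarantees that phones of the same company stay in price order after the second pass.

```python
# A few of the (fictional) phones from the tables above.
phones = [("Samesung DS4", 530), ("Mapple myPhone 5", 900),
          ("Samesung Not", 650), ("Voogle Xenus 5", 650),
          ("Samesung DS5", 630)]

# Pass 1 - least important key: price.
phones.sort(key=lambda p: p[1])
# Pass 2 - most important key: company (first word of the name).
# Because Python's sort is stable, equal companies keep their price order.
phones.sort(key=lambda p: p[0].split()[0])
print(phones)
# [('Mapple myPhone 5', 900), ('Samesung DS4', 530), ('Samesung DS5', 630),
#  ('Samesung Not', 650), ('Voogle Xenus 5', 650)]
```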
Stable sorting algorithms also have more uses.
- Radix sort, for example, completely relies on stability to sort (go ahead, try running radix sort while purposely not maintaining stability; the data will never end up sorted). So do bucket sort and ProxMap sort, as well as many other non-comparative sorting algorithms.
- More than that, we usually like stable sorting algorithms because they make sense: when they are executed one after the other, using different keys, you get a unique and useful result. Suppose you have a list of folders on your computer and you sort them first by ‘name’ and then by ‘date modified’. At the top you will get your most recently modified files/folders, not in some random order but neatly arranged by name.
- Another important thing about stable sorting algorithms is that you can run them on linked lists.
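To see why radix sort depends on stability, here is a minimal LSD (least-significant-digit-first) radix sort sketch. Each pass buckets by one digit and appends in encounter order, which is exactly what keeps the pass stable; an unstable pass would scramble the work done by the earlier, less significant digits.

```python
def radix_sort(nums, base=10):
    """LSD radix sort for non-negative integers. Each digit pass must be
    stable, otherwise the ordering from earlier (less significant)
    digit passes is destroyed."""
    if not nums:
        return []
    digits = len(str(max(nums)))
    for d in range(digits):
        buckets = [[] for _ in range(base)]
        for x in nums:
            # Appending preserves encounter order: this is the stable step.
            buckets[(x // base ** d) % base].append(x)
        nums = [x for b in buckets for x in b]
    return nums

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```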
Here is a list of all the well-known stable sorting algorithms. They are mostly varieties of insertion sort, bubble sort, merge sort, radix sort, and bucket sort.
Conclusion: sorting may be hard, but it is important. And sometimes fun, too.
I am now going to break my notes on sorting algorithms into three parts, based on what I think are the important categories they fall into, spread across three separate posts:
- Stable comparative sorts
- Unstable comparative sorts
- Non-comparative sorts
Let’s get on with part 1…