Sorting algorithms : learn different method to sort your data

Gourine ayoub
11 min readApr 4, 2021

Introduction :

  • Working with different data mean that you will apply to that data many many operations for different purposes and outputs, one of the famous operation applied on the data is sorting. sorting a list of number or a list of names or even sorting different customized objects based on a specific condition.
  • In general sorting can be either a finishing step where you want to represent the data in a sorted form or the sorting will be used as a intermediate state for the data so it can be used in other operations like sorting the data before searching for a value inside that data.
  • Sorting is usually talked over with searching so if you are interested in searching methods and concept also you can check this two articles: https://gourineayoub-prf.medium.com/searching-algorithms-learn-different-searching-techniques-437ce0602463
  • In this article we will go through different sorting techniques how they work, and the theory behind them and also there time complexity.

Definition :

A sorting algorithms is a function that you give it a set of data as an example a list of number as an input and it will return a set of data in this example also a list of numbers that has the same size but the element are arranged in a different way from bigger to smaller or the inverse.

This algorithms that we are going to discuss need an understanding of basic data structure and also the recursing technique you can check this article to get familiar with those concepts : https://gourineayoub-prf.medium.com/data-structure-and-complexity-a-programmer-best-tow-weapons-part-1-introduction-a941cda2f557

Different Sorting methods :

Quick sort:

  • quick sort is a recursive algorithm based on the pivot concept where a pivot is a value from the list of values that we want to sort, than what we do is to try to make all the values smaller than our pivot in the left of our pivot position and all the value greater than it in the right of its position.
  • The steps will be like this :

1/ choose a pivot (an index of the value that we chose).

2/ move the pivot that we chose to the last place in our list.

3/ get the first item from left that is greater than our pivot value.

4/ get the first item from right that is smaller than our pivot value

5/ switch the values that we got from step 3 and step 4

6/ redo the steps 3 , 4 and 5 until the index of the right item is grater than the left items which means that we looped over all the items.

7/ redo all the steps from 1 to 6 as long as the values are not sorted.

How to choose the pivot ?

  • Choosing the pivot has a great impact on the time complexity of your search method in a way that some pivot will make the process faster than others.
  • one of the method used to choose the picot is the median of three where we look at the 1 middle and last value in the values that we have then we sort those three values and take the middle one.
  • the pivot that we want to choose should be as close as it can be to the median , the closer its the better performance and time complexity.
  • the time complexity of the quick method is at worse O(n2) but in average case it’s O(n log2(n)).

Merge sort:

  • Merge sort is a recursive method that is based on the divide and conquer principal, basically what you do is instead of trying to sort the big list of values that you have, you divide that data to smaller data and redo that again and again until you reach one value where you can no longer divide that you start going up and sort the data since sorting tow value is just comparing which value is greater , you continue going up resulting in the first big list of values sorted .
  • The steps of the algorithm :

1/ split the values that you have into to parts.

2/ redo the same for the result of the step 1.

3/ stop when you reach 1 value (a list that has 1 value)

4/ start going back forming a list of tow items sorted choosing 2 values from the bottom that those tow will combined with other 2 to from 4 values … etc until you reach the full size data that you started with.

  • The merge sort complexity will be O(n log2(n)) at worst.

Selection sort:

  • The concept of this sorting method is that in each iteration you find the minimum value put it in the first place then go through to the next iteration where the previous minimum is no longer apart of the data that we have to sort and we can achieve that with the help of two pointers or indexes one for the current value an another for the min values.
  • The steps :

1/ set the current item pointer and the min pointer to the first value.

2/ search for a values smaller that the current minimum where the search will start from the current min+1 until the end.

3/ swap the new min with the current min place (if non found no swap).

4/ set the current item to the next item after the current min then redo the same steps 1 2 and 3 until we reach the end of the list.

  • This algorithms use a nested loop so the complexity will be O(n²) since we have to go through the elements n times.

Insertion sort:

  • this concept of this method is to go through the elements and try to find a proper place for each value , so we start from the beginning item and compare it with the item next to it ,then if necessary each time we find that this value should no be here we change it’s place by putting in in the proper place to form a sorted list.
  • Steps :

1/ start from left or right (based on the sort ascending or descending).

2/ check the next value in the list of values then if we find that the value should not be here based on the comparison we change it’s place in the list.

3/do this until we reach the last item of the list.

  • This method has worst case time complexity of O(n²).

Bubble Sort:

  • This method is a simple sorting method based on comparing one item with the items next to it and swap if necessary and after each iteration the biggest (or smallest based on what you want) item will be at the last case in the list of values.
  • the method Steps :

1/ start from I =0 to I = n-1 compare list[i] with list[i+1]if the they are not sorted then swap there values.

2/ go through all the values and each time do the step1

3/ after the first iteration is done the result will be that the largest(or smallest) item is in the end.

4/ repeat the same steps 1 for the next iteration but this time from i=0 until I = n-2 cause we already now that the last items is in its right place.

5/ we stop when the we reach n-n (basically you loop over all the items n times).

  • This method time complexity will be O(n²) since we area dealing with nested loop and each time we are going through all the elements (I know that the inner loop is getting smaller bu 1 value each time but that will not effect the complexity so much).

Note : bubble sort and insertion sort may seem the same but the difference is that in the insertion sort each time we find a value not in it’s place we search for the correct place through all the previous elements and till we find the proper place. but in bubble sort we just compare the values with the adjacent values and swap them.

Heap Sort:

  • For this method you need to know what are heaps and for that you can check this article : https://gourineayoub-prf.medium.com/data-structure-and-complexity-part-4-priority-queues-and-heaps-7fffee1f1377
  • basically a heap is a binary sorted tree. We can have a max heap where the root of the tree is the biggest item and each parent is bigger that it’s child's or the inverse in a min heap.
  • For using this method we have different part to do
  • part one building the heap:
  • we first start to build the heap from the list of values that we have (we can use the heapify method).
  • We first create a tree from the list of values that we create a heap from that tree turning it into a max heap.
  • To get the inverse we will each time remove the root item from the heap put it in the last positing that we create a new max heap from the new heap until we are done with all the values.
  • This method complexity is O(n log(n)) sins building the max heap will take O(n) and heapify method will take O(log(n)) cause each time we go through the half of items and since the heapify is called each time we remove an item which is n time the result is O(n log(n)).

Shell Sort:

  • You can say that this method is a variation of the insertion sort method, this method where you try to take values from the left and right extremities and swap them over, the shell sort methods works with a fixed number called the gap numbers that indicates the gap between the number that are you comparing. The method follow this steps.

1/ identify a gap number lets say that in our case we have gap= 3.

2/ start from the first value and compare it to the value that is after it with x steps where x = gap value. So in our case we will take the first element(index 0) and compare it with the the 4th element (index 5)(starting from 0) . and swap there values if they are non sorted.

3/ repeat the 2 step for the other values until all the number has entered in a comparison. This called the first round.

4/ now we go to a 2end round where the we have a new gap value where new gap = old gap /2.

5/ redo this steps until you reach a gap value of 1 where this method will be a normal insertion sort method where you compare each element to the one next to it.

  • This process (the shell sort) will make the insertion process more faster cause the process will bring the element more closer to it’s proper place.
  • The time complexity will change based on the gap value and the shell sort method is not stable but the worst case will be O(n2).

Radix Sort:

  • This sorting method is used to sort numeric values and it’s based on comparing the digits of all the numbers that you wan to sort, we will start with the fact that any number must be formed from the 10 basic digits from 0 to 9. and we follow this steps:

- 1/ find the number with the maximum digits on it.

-2/ add X zeros to all the other numbers at the left side (so it will not effect the number) until all the numbers has the same number of digits.

-3/ the max number of digits that we found will be the number of round the algorithm will take to sort the values.

- for each iteration we have 10 buckets each bucket is labeled with a digit from 0 to 9.

  • in each iteration n (n from 1 to the max number of digits that we found first) we take the n digit from the right for each number and based on that we choose a bucket to put in that number .
  • after all the numbers are in there buckets we take out those number starting from the bucket 0 to 9 , the number in the same bcket will get out from bottom to top (in case you want a descending order do the inverse) .
  • the new ordered list of values will go now to then next round (also called pass) and this time we will choose buckets based on the next digit from right.
  • this round will repeat n (max digits) time and the end result will be a sorted list of values after you remove the zeroes that you added from each number.

Bucket Sort:

  • The basic of this method is to break the list of values that we have into sub list than we use a sorting algorithm to sort those sub list.
  • This method is useful since breaking your data into sub-list will make the time complexity lower when you use the sorting algorithms for those sub list then combine the sorted sub-list together.
  • one implementation of bucket sort is to go through all the items first to get the min and the max so you can know how u will divide the values .
  • next step would be how many buckets you will use ? Remember that a bigger number will take more space ! And also more comparison to decide which bucket to put in your value.
  • the complexity for this method depend on how you implement it and how you decide to put the values in the buckets and in general the average can be O(n+k) where k is the number of buckets.

Note : some images used in this article are not my creation.

Conclusion :

  • Sorting your data is always a big part in manipulating or filtering your data to be suitable for use and knowing the different methods to do that will allow you to write a more better and fast code depending on what data type you are working on.
  • You can always customize this methods for your own need , the most important thing is that you know how they worked and what they can give you the rest is up to you to make the best use out of it.

--

--