Quick sort Cost Analysis

Best Case

The best case for the quick sort occurs when each partition splits the array into two equal halves. Then, the overall cost for sorting $n$ items is given by:

  • The partitioning cost for $n$ items, which is $n$ comparisons (and a smaller number of exchanges), and
  • The cost for recursively sorting two half-size arrays

Since the size of each half is $n/2$, this leads directly to the recurrence relation:

$$C(n) = 2 C(\frac{n}{2}) + n \quad ; \quad C(1) = 0$$

This is the same recurrence relation that governs the merge sort. We know that it results in a cost function that is $O(n \ln n)$.

Worst Case

The worst case for the quick sort occurs when the partition does not split the array (i.e., when one set has no elements at all). Ironically, this happens when the array is sorted!

In this situation, the cost of the sort can be resolved into:

  • The partitioning cost for $n$ items, which is $n$ comparisons (and a smaller number of exchanges), as before, and
  • The cost for recursively sorting the remaining $n-1$ items

Thus, the recurrence relation in this case is slightly different:

$$C(n) = C(n-1) + n \quad ; \quad C(1) = 0$$

Solving the recurrence relation is straight forward:

$$\begin{array}{rclc} C(n) &=& C(n-1) + n & \\\\ &=& C(n-2) + (n-1) + n &\\\\ &=& C(n-3) + (n-2) + (n-1) + n &\\\\ &=& \cdots\\\\ &=& C(1) + 2 + \cdots + (n-2) + (n-1) + n &\\\\ &=& \displaystyle{\frac{n(n+1)}{2} - 1} &\\\\ \end{array}$$

Thus, $C(n)$ is $O(n^2)$.

Average Case and Improvements

Since the worst cases occurs when the array is ordered (or highly ordered), we can give ourselves a probabilistic guarantee that this doesn't happen by purposefully shuffling the elements of the array before using the quick sort algorithm. This comes at a slight extra cost, but is worth it to minimize to the point of negligibility the possibility of a $O(n^2)$ cost.

While we won't derive it here, on average, the quick sort results in a number of comparisons $\sim 1.39 n \ln n$.

That means one has $39$ percent more comparisons than merge sort. However, quick sort is faster than merge sort in practice, because there is less data movement during the sort.

Amazingly, we can improve on the quick sort algorithm with a few modifications:

  • Quick sort has a high recursive overhead when the arrays being considered are tiny. In the process of recursively calling quick sort on smaller and smaller arrays -- when the arrays are around 10 elements long, we could switch to an insertion sort to improve the overall efficiency. Alternatively, we could leave the sub-arrays that are around 10 elements long or shorter completely untouched by quick sort, and finish with an insertion sort in one big pass at the end.

  • The basic quick sort uses the first (or the last) element as the pivot value. The best choice for the pivot would be the median of the array -- but we can't find the median without sorting the array first -- which "puts the cart before the horse" in a certain sense. What we can do is generate a reasonable approximation of the overall median by taking the median of the first, last, and center elements. If we then swap this element to the front of the list in question, we improve the overall efficient of the algorithm by a fair degree.