# Understanding QuickSort

## TL;DR

QuickSort is a **divide-and-conquer** sorting algorithm with an average time complexity of \( O( n \cdot log(n) ) \). Its in-place version needs only \( O(1) \) extra memory for variables, plus \( O( log(n) ) \) on average for the call stack. The list is traversed and rearranged recursively until each pivot value finds its correct position.

## Partitioning

Before we dive into QuickSort, we need to understand what **partitioning** is.

Given an array of \( n \) elements, we pick one element that will serve as **pivot** and place all the smaller elements on its left and all the bigger elements on its right. By doing so, we position the pivot where it should be.

In the following example, we consider an unordered array of integers from 1 to 10. We choose the first value as the pivot. We then rearrange the array so that the smaller values are on its left and the bigger values are on its right.

Note that when moving smaller and bigger elements, their relative order may change: we say that QuickSort is **not stable**.

A simple way to partition would be to initialize three subsets, go through the array and allocate each element to the correct subset:
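For instance, a minimal sketch of that approach could look like this (the function and variable names are my own; the original may differ):

```python
def partition_with_lists(arr):
    """Partition arr around its first element using three intermediary lists."""
    pivot = arr[0]
    smaller, equal, bigger = [], [], []
    for value in arr:
        if value < pivot:
            smaller.append(value)
        elif value == pivot:
            equal.append(value)
        else:
            bigger.append(value)
    # The pivot (and any duplicates of it) sits between the two other subsets.
    return smaller + equal + bigger
```

For example, `partition_with_lists([5, 8, 2, 9, 1])` returns `[2, 1, 5, 8, 9]`: the pivot `5` lands at its final sorted position, even though the rest is not yet sorted.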

This function does the job but is not memory-efficient, since it uses three intermediary lists and returns the partitioned array in a **new** list.

As you've probably guessed, we can be smarter than that and partition the array **in-place**, i.e. without creating new lists. To do so, we traverse the array once and swap elements so that all elements smaller than the pivot end up "towards the left" and all the elements bigger than the pivot end up "towards the right". We then put the pivot value at its correct position.
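A sketch of such an in-place partition, using the first element as pivot (a Lomuto-style scheme; the exact code in the original article may differ):

```python
def partition(arr):
    """Partition arr in place around its first element.

    Returns the final index of the pivot: everything smaller ends up
    to its left, everything bigger to its right.
    """
    pivot = arr[0]
    boundary = 0  # last index of the region holding values < pivot (starts at the pivot)
    for i in range(1, len(arr)):
        if arr[i] < pivot:
            boundary += 1
            arr[boundary], arr[i] = arr[i], arr[boundary]
    # Move the pivot to its final position.
    arr[0], arr[boundary] = arr[boundary], arr[0]
    return boundary
```

For example, `partition([5, 8, 2, 9, 1])` rearranges the list into `[1, 2, 5, 9, 8]` and returns `2`, the pivot's final index.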

I encourage you to try this function in a Python debugger and see how the list is evolving at each iteration. Otherwise, you can follow this link where I go through each step in detail.

## QuickSort

QuickSort works by recursively partitioning the "smaller" and "bigger" subsets. By doing so, we place each pivot at its correct position given the subset we are in, then process the new "smaller" and "bigger" subsets, etc.

### A readable implementation

Below is a simple Python implementation of QuickSort, often used for educational purposes. As you can see, it is not memory-efficient, since each call to `quicksort()` returns a new list. But it has the benefit of being more readable than the in-place implementation.

Usage:
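A commonly seen sketch of this readable version, together with an example call (exact naming may differ from the original):

```python
def quicksort(arr):
    """Return a new sorted list (readable, but allocates new lists at each call)."""
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    smaller = [x for x in arr[1:] if x < pivot]
    bigger = [x for x in arr[1:] if x >= pivot]
    # Sort each side recursively and glue the pieces back together.
    return quicksort(smaller) + [pivot] + quicksort(bigger)

print(quicksort([9, 3, 7, 1, 10, 2, 8, 5, 6, 4]))
# → [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```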

### The in-place implementation

Let's now look at QuickSort using in-place partitioning.

For that, we need to modify our `partition()` function. Since the function will be used on different subsets of the **same list**, we need to add a start and an end to its signature so that it knows where to begin and where to stop. This way, the same function can operate on different segments of the same list from different layers of the call stack. We also need to have it return the position of the pivot: this way, we can call it again on the "smaller" subset (i.e. up until the pivot position) and on the "bigger" subset (i.e. from the pivot position until the end).

We can rewrite it as below:
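A sketch of the modified function, with `start` and `end` both inclusive (names and details are my own):

```python
def partition(arr, start, end):
    """Partition arr[start..end] in place around arr[start].

    Returns the final index of the pivot within that segment.
    """
    pivot = arr[start]
    boundary = start  # last index of the region holding values < pivot
    for i in range(start + 1, end + 1):
        if arr[i] < pivot:
            boundary += 1
            arr[boundary], arr[i] = arr[i], arr[boundary]
    # Move the pivot to its final position within the segment.
    arr[start], arr[boundary] = arr[boundary], arr[start]
    return boundary
```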

We can therefore write our QuickSort algorithm as below:
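One way to write it, shown together with the modified `partition()` so the snippet runs on its own (a sketch; the original may differ in details):

```python
def partition(arr, start, end):
    """Partition arr[start..end] in place around arr[start]; return the pivot's index."""
    pivot = arr[start]
    boundary = start
    for i in range(start + 1, end + 1):
        if arr[i] < pivot:
            boundary += 1
            arr[boundary], arr[i] = arr[i], arr[boundary]
    arr[start], arr[boundary] = arr[boundary], arr[start]
    return boundary

def quicksort(arr, start=0, end=None):
    """Sort arr in place between indices start and end (inclusive)."""
    if end is None:
        end = len(arr) - 1
    if start < end:  # base condition: subsets of 0 or 1 element are left alone
        pivot_index = partition(arr, start, end)
        quicksort(arr, start, pivot_index - 1)   # the "smaller" subset
        quicksort(arr, pivot_index + 1, end)     # the "bigger" subset
```

Calling `quicksort(lst)` sorts `lst` in place and returns nothing.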

The **base condition** of our recursive algorithm is that it only runs when `start` is strictly less than `end`. The reason is the following: when we reach a subset of one element, the partitioning function will not do anything and will return the position of the pivot, which is equal to `start`. At this point, the next two calls to `quicksort()` further up the call stack will return nothing, and ultimately this branch of the recursion will be closed.

## Complexity analysis

For the purpose of complexity analysis, we only examine the **in-place** version of QuickSort since it is the most efficient.

### Time complexity analysis

Let's first look at the time complexity. On average, the number of times the array will be divided into subsets is \( log(n) \), and at each pass we traverse the whole array, which takes \( n \) operations.

The average time complexity is therefore:

\[ O(n \cdot log(n)) \]

The worst time complexity happens when the array is already sorted (and the first element is chosen as pivot). In that case, each partition produces subsets of sizes \( 1 \) and \( n-1 \), and the array has to be partitioned \( n \) times.

The worst-case time complexity is therefore:

\[ O(n^2) \]

### Space complexity analysis

There are two things to consider when it comes to space complexity:

- the actual memory used for variables
- the memory used by the call stack

Regarding variables, the footprint is minimal since we swap all elements in-place, so that part is \( O(1) \). Regarding the call stack, though, it is on average \( O( log(n) ) \) deep, since we need to divide the array \( log(n) \) times, and it can reach \( O(n) \) in the worst case.

## Conclusion

In this article we presented the QuickSort algorithm as well as a variety of Python implementations. QuickSort time efficiency is due to the partitioning, and its space efficiency is due to the partitioning being done in-place.

There is some flexibility with the algorithm: so far we have always selected the first element of the list as pivot, but it is possible to choose the last element, the middle element, a random element, etc. In fact, choosing a **random element** usually mitigates the risk of encountering a worst-case scenario.

As usual with sorting algorithms, the best algorithm is the one most suited for the data. Although QuickSort is often the fastest algorithm, it can't beat good old BubbleSort on an already-sorted list for instance (but then again, who would want to sort a sorted list 🤔?).
