The Lomuto partition scheme is a way to partition an array around a pivot for Quick sort. It’s simpler to write and easier to teach than the original Hoare partition, and is the version most students see first. Tony Hoare’s original scheme is faster on real workloads, but Lomuto’s clarity has made it the default in most pedagogy.

The idea: scan left to right with a single pointer that maintains an invariant about the elements already seen. Pick the last element as the pivot, scan everything before it, and at the end swap the pivot into its final position.

Algorithm

int lomuto_partition(int arr[], int lo, int hi) {
    int pivot = arr[hi];
    int i = lo - 1;          // boundary: arr[lo..i] are all <= pivot
    for (int j = lo; j < hi; j++) {
        if (arr[j] <= pivot) {
            i++;
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[hi]);
    return i + 1;            // final position of the pivot
}

i is the boundary between “seen and ≤ pivot” (positions lo..i) and “seen and > pivot” (positions i+1..j-1). j is the scanning pointer.

The invariant at every loop iteration:

  • arr[lo..i] ≤ pivot.
  • arr[i+1..j-1] > pivot.
  • arr[j..hi-1] not yet examined.
  • arr[hi] = pivot, untouched until the end.

When j finds an element ≤ pivot, increment i and swap that element into the ”≤ pivot” region. Elements > pivot stay in place — the gap between i and j widens.

After the loop, the swap arr[i+1] ↔ arr[hi] places the pivot at position i+1, exactly the boundary between the two regions. The pivot’s final index is returned.

Worked example

Array: [3, 8, 2, 5, 1, 4, 7, 6], partition over the entire range. Pivot = arr[hi] = 6.

Initial: i = -1, j = 0.

jarr[j]≤ 6?ActionArray after
03Yesi=0, swap arr[0]↔arr[0] (no-op)[3, 8, 2, 5, 1, 4, 7, 6]
18Noskip[3, 8, 2, 5, 1, 4, 7, 6]
22Yesi=1, swap arr[1]↔arr[2][3, 2, 8, 5, 1, 4, 7, 6]
35Yesi=2, swap arr[2]↔arr[3][3, 2, 5, 8, 1, 4, 7, 6]
41Yesi=3, swap arr[3]↔arr[4][3, 2, 5, 1, 8, 4, 7, 6]
54Yesi=4, swap arr[4]↔arr[5][3, 2, 5, 1, 4, 8, 7, 6]
67Noskip[3, 2, 5, 1, 4, 8, 7, 6]

Final swap: arr[i+1] = arr[5] = 8arr[hi] = arr[7] = 6:

Result: [3, 2, 5, 1, 4, 6, 7, 8]. Pivot at index 5. Left half [3, 2, 5, 1, 4] all ≤ 6; right half [7, 8] all > 6. Correct.

Quicksort with Lomuto

void quicksort(int arr[], int lo, int hi) {
    if (lo < hi) {
        int p = lomuto_partition(arr, lo, hi);
        quicksort(arr, lo, p - 1);     // left half excludes pivot
        quicksort(arr, p + 1, hi);     // right half excludes pivot
    }
}

The pivot is at its final position after partitioning, so it’s excluded from both recursive calls — lo, p-1 for the left, p+1, hi for the right. The recursive subarrays are strictly smaller than the original, so the recursion terminates.

Complexity

Per call: comparisons (one per j-iteration), swaps in the worst case. Same asymptotic complexity as Hoare partition.

The constant factor is worse than Hoare’s: Lomuto does about comparisons and up to swaps per partition, while Hoare averages roughly comparisons and only swaps. On large arrays this matters.

Pitfalls

  • Worst case on already-sorted input. With the pivot always taken from the end, an already-sorted (or reverse-sorted) array makes every partition produce a maximally lopsided split (one side has 0 elements, the other has ). Recursion depth becomes , total work .

    Fix: randomise the pivot before calling lomuto_partition, or take the median of three.

  • Many equal elements. Lomuto handles equal elements poorly: every element equal to the pivot ends up on the ”≤ pivot” side, so duplicates cluster on one side rather than splitting evenly. With many duplicates the splits are unbalanced, costing . The fix is 3-way partitioning (Dutch national flag): split into <, =, > regions.

Why it’s preferred for teaching

Lomuto’s invariants are simple — one boundary pointer i, one scanning pointer j, one straightforward inequality. Students can trace by hand and prove correctness by inspection.

Hoare’s invariants are subtler: two pointers walking inward, the pivot doesn’t end up at the returned index, the recursion calls aren’t symmetric (the left call must include the split point). Easy to write a one-character bug that breaks for some inputs.

So Lomuto wins for clarity; Hoare wins for performance. Most production sort routines (introsort in std::sort, dual-pivot quicksort in Java) use Hoare-style partitioning with extra refinements (median-of-three, 3-way for equal keys, switch to insertion sort for small subarrays).

In context

Lomuto and Hoare partition are the two textbook partitioning schemes. Both are subroutines of Quick sort — partition behaviour is what makes quicksort either fast or pathological.