Array (data structure)

An array is the simplest aggregate data structure: a fixed-size, contiguous block of memory holding a sequence of elements of the same type. Each element is reachable in $O (1)$ by computing its address from the starting address and the index.

The contiguity is the whole point. Because every element is exactly sizeof(element) bytes from its neighbour, the address of element $i$ is

$addr (i) = base + i \cdot sizeof(element)$

That single multiplication and addition is what makes indexed access constant-time. No traversal, no pointer chasing.

Declaration and layout

In C:

int a[8];           // 8 ints, uninitialised
int b[5] = {1,2,3,4,5};

a is a name for the address of the first element. a[i] is shorthand for *(a + i) — the compiler does the address arithmetic for you. The size is fixed at compile time (or at allocation time, for variable-length arrays); it can’t grow. To grow, use a Dynamic array.

In memory, the eight ints sit back-to-back. On a typical 32-bit int, an int a[8] occupies 32 contiguous bytes. The CPU’s cache loads adjacent elements together, so iterating an array is one of the fastest things a program can do.

Operations and their costs

Operation	Cost	Why
Read/write at index $i$	$O (1)$	one address computation
Find an element by value	$O (n)$	must scan; or $O (lo g n)$ if sorted, see Binary search
Insert at index $i$	$O (n)$	shift everything from $i$ rightward
Delete at index $i$	$O (n)$	shift everything from $i + 1$ leftward
Append (if room)	$O (1)$	write to next slot
Append (no room)	not supported	size is fixed

The asymmetry — fast indexed access, slow middle insertion — is the defining trade-off. Choose an array when you index a lot and rarely change the structure; choose a Linked list when you insert and delete in the middle.

Multi-dimensional arrays

A 2D array int m[3][4] is laid out in memory as one contiguous run of 12 ints. C uses row-major order: row 0 first, then row 1, then row 2. The address formula generalises:

$addr (i, j) = base + (i \cdot ncols + j) \cdot sizeof(element)$

Fortran and MATLAB use column-major instead. The choice matters for performance: iterating in the order that matches the memory layout is much faster because of cache locality. (See Cache locality.)

Bounds and safety

C arrays do no bounds checking. Reading or writing past the end of an array silently corrupts adjacent memory or reads garbage — the source of many security bugs. Higher-level languages (Java, Python, Rust) bounds-check every access at the cost of one extra comparison per read.

When to use an array

Random access by integer index dominates the workload.
The size is known and stable.
You want maximum cache-friendly throughput.

When the size needs to grow, switch to a Dynamic array. When you need cheap front insertion, switch to a Linked list. When you need fast lookup by key rather than index, switch to a Hash table.

Idriss Rami — Notes

Explorer

Array (data structure)

Declaration and layout

Operations and their costs

Multi-dimensional arrays

Bounds and safety

When to use an array

Graph View

Table of Contents

Backlinks