A data structure is a way to organize and store data so that it can be efficiently accessed and modified. The choice of data structure determines what operations are fast, what operations are slow, and how much memory the data takes — and that choice usually decides whether an algorithm built on top of it is practical.
Every data structure supports some subset of these operations:
- Insert — add a new element.
- Delete — remove an element.
- Search — find an element matching some criterion.
- Modify — change an existing element.
Different structures make different operations fast. An array is fast for indexed access (O(1)), slow for insertion in the middle (O(n)). A Linked list is fast for insertion at the front (O(1)), slow for indexed access (O(n)). A Hash table is fast for keyed lookup on average (O(1)), but unordered. A BST keeps order and supports search/insert/delete in O(log n) when balanced. Picking the right one for the access pattern is engineering, not academia.
Two perspectives
Data structures are categorized along two axes:
Physical structure — how the data is physically arranged in memory:
- Array — contiguous block.
- Linked list — nodes scattered in memory linked by pointers.
- Matrix — 2D array.
Logical structure — how the data is conceptually used:
- Linear: stack, Queue — elements form a sequence.
- Non-linear: tree, Graph — elements have hierarchical or arbitrary relationships.
- Tabular: Hash table — keyed access by computed index.
Logical structures are implemented using physical structures. A stack can be implemented on top of an array or a linked list — same logical behavior (LIFO push/pop), different physical layout. The choice affects performance characteristics and memory usage.
Algorithm vs. data structure
An algorithm is a step-by-step procedure for performing a task. A data structure is the organization of the data the algorithm operates on. Efficient algorithms require well-chosen data structures — the two are inseparable.
For example, the algorithm “find the smallest element in a collection” runs in O(n) over an unsorted array, but in O(1) over a min-heap. Same data, same algorithm conceptually, but the structure changes the cost.
Where this matters
Real-world software lives or dies on data structure choice:
- Databases organize gigabytes of records. Indexes use B-trees or hash indexes depending on access pattern.
- Search engines invert documents into hash tables and use trees for ranked retrieval.
- Operating systems use priority queues for process scheduling, hash tables for the file cache, and linked lists for everything from process tables to free-block lists.
- Compilers parse source into abstract syntax trees, use symbol tables for name resolution, and control-flow graphs for optimization.
The bulk of practical programming is “I have data with these access patterns; which structure should I use?”