Indexes are the single most impactful performance tool available to database users. They trade write speed and storage space for dramatically faster reads — and understanding their internal data structures is essential to using them well.
What Is an Index, Really?
An index is a separate data structure that the database maintains alongside your table. It stores a subset of your columns in a format that allows fast lookups, and holds pointers back to the full rows in the original table (the "heap file"). Think of it like the index of a textbook — you do not read the whole book to find a topic, you consult the index, get a page number, and jump there.
B-Tree Indexes
The default index type in PostgreSQL, MySQL, SQLite, and most relational databases. A B-Tree (Balanced Tree) keeps its keys in sorted order across a multi-level tree where every leaf node is at the same depth. This gives O(log n) lookup, insert, and delete performance, and crucially supports:
- Equality lookups:
WHERE email = 'a@b.com' - Range queries:
WHERE created_at BETWEEN '2025-01-01' AND '2025-06-01' - Sorting without a sort step:
ORDER BY last_name - Prefix matching:
WHERE name LIKE 'Ada%'
-- B-Tree index (default) on a frequently queried column
CREATE INDEX idx_users_email
ON users (email);
-- Composite index: useful when you always filter by
-- both columns together. Column ORDER matters.
CREATE INDEX idx_orders_user_date
ON orders (user_id, created_at DESC);
Hash Indexes
Hash indexes compute a hash of the key and store it in a hash table. Average O(1) lookup — faster than B-Tree for exact equality. The trade-off: they cannot support range queries or sorting. Use them when your query pattern is purely equality-based on a high-cardinality column.
-- Hash index: O(1) equality, no range support
CREATE INDEX idx_sessions_token
ON sessions USING hash (token);
-- Only useful for: WHERE token = '...'
-- Useless for: WHERE token > '...'
-- ORDER BY token
When NOT to Add an Index
Every index has costs that compound as you add more:
- Write overhead — every INSERT, UPDATE, and DELETE must also update every index on the table. A table with 8 indexes on it effectively does 9 writes per row mutation.
- Storage — indexes can consume as much or more space than the table itself.
- Small tables — if a table fits in memory, a sequential scan is often faster than an index lookup (no pointer indirection).
- Low-cardinality columns — an index on a boolean column (true/false) is almost always useless.
The query planner knows all of this. Use EXPLAIN ANALYSE to see whether your index is actually being used — and what the cost estimate is with and without it.
Covering Indexes
A covering index includes all columns that a query needs, so the database can answer entirely from the index without touching the heap. This is the highest-performance scenario because it eliminates all I/O on the main table.