Choosing the right PostgreSQL index
PostgreSQL ships six index access methods, and CREATE INDEX without a USING clause always gives you a B-tree. That default is right most of the time — this guide is about recognizing the cases where it isn't, and about the three index features (partial, expression, covering) that matter more than the access method in day-to-day work.
B-tree: the default for a reason
A B-tree index supports equality and range comparisons (=, <, >, BETWEEN, IN), IS NULL, prefix LIKE 'abc%' (with the right collation or text_pattern_ops), and — often forgotten — sorting: an index on (created_at) can satisfy ORDER BY created_at DESC LIMIT 20 without sorting anything.
For multicolumn B-trees, column order is the whole game. An index on (customer_id, created_at) is excellent for WHERE customer_id = ? ORDER BY created_at, but close to useless for a query filtering only on created_at. Rule of thumb: equality columns first, then the range or sort column.
When B-tree is the wrong tool
| Index type | Reach for it when… | Watch out for |
|---|---|---|
| GIN | The value is a container and you search inside it: jsonb @> '{"status":"active"}', array overlap &&, full-text tsvector @@ tsquery, trigram search with pg_trgm for LIKE '%term%'. |
Slower to update than B-tree; index size can be large. Fine for read-heavy data. |
| GiST | Geometric data, range types (tstzrange overlaps), nearest-neighbor ORDER BY location <-> point, and exclusion constraints (e.g. "no overlapping bookings for the same room"). |
Lossy for some operator classes; usually bigger and slower than a B-tree for plain equality. |
| BRIN | Huge append-only tables (logs, events) where the column correlates with physical row order, typically timestamps. Tiny — megabytes where a B-tree would be gigabytes. | Useless if the data isn't physically ordered by that column; it only narrows the scan to block ranges. |
| Hash | Pure equality on long values where a B-tree gets fat. Rarely needed; crash-safe and replicated since PostgreSQL 10. | Equality only: no ranges, no sorting, no uniqueness enforcement. |
| SP-GiST | Space-partitioned structures: prefix searches on text, IP ranges (inet), quadtrees for points. |
Niche; benchmark against GiST/B-tree for your actual data before committing. |
Partial indexes: index only what you query
CREATE INDEX orders_pending_idx
ON orders (created_at)
WHERE status = 'pending';
If 99% of your orders are completed but every poll asks for pending ones, a partial index stays small, stays hot in cache, and is cheap to maintain — completed rows never touch it. The query must include the WHERE condition (or something the planner can prove implies it) for the index to be considered.
Expression indexes: match what the query computes
CREATE INDEX users_email_lower_idx ON users (lower(email));
A query filtering on lower(email) = lower($1) cannot use a plain index on email — the index stores raw values, not lowercased ones. The expression in the index must match the expression in the query. The same technique works for date_trunc('day', created_at), JSON field extraction, and any immutable function.
Covering indexes and index-only scans
CREATE INDEX orders_cust_idx
ON orders (customer_id) INCLUDE (total, created_at);
If the index contains every column the query needs, PostgreSQL can skip the table entirely — an index-only scan. INCLUDE adds payload columns to the index leaves without making them part of the key. Two caveats: the index gets bigger, and index-only scans also require the visibility map to be current — on a heavily updated, rarely vacuumed table you'll see Heap Fetches climb and much of the benefit disappear.
Practical rules
- Create indexes on production with
CREATE INDEX CONCURRENTLY— it doesn't block writes (it can't run inside a transaction and takes longer, that's the price). - Every index taxes every
INSERT/UPDATE/DELETE. Checkpg_stat_user_indexes.idx_scanperiodically and drop what's never read. - An index the planner ignores isn't necessarily missing statistics — check that types match, that functions in the predicate match the index expression, and that the filtered fraction is small enough to beat a sequential scan.
- Verify, don't assume: run
EXPLAIN (ANALYZE, BUFFERS)before and after. The proof is in the plan.
Rows Removed by Filter counts.
