Choosing the right PostgreSQL index

PostgreSQL ships six index access methods, and CREATE INDEX without a USING clause always gives you a B-tree. That default is right most of the time — this guide is about recognizing the cases where it isn't, and about the three index features (partial, expression, covering) that matter more than the access method in day-to-day work.

B-tree: the default for a reason

A B-tree index supports equality and range comparisons (=, <, >, BETWEEN, IN), IS NULL, prefix LIKE 'abc%' (with the right collation or text_pattern_ops), and — often forgotten — sorting: an index on (created_at) can satisfy ORDER BY created_at DESC LIMIT 20 without sorting anything.

For multicolumn B-trees, column order is the whole game. An index on (customer_id, created_at) is excellent for WHERE customer_id = ? ORDER BY created_at, but close to useless for a query filtering only on created_at. Rule of thumb: equality columns first, then the range or sort column.

When B-tree is the wrong tool

Index typeReach for it when…Watch out for
GIN The value is a container and you search inside it: jsonb @> '{"status":"active"}', array overlap &&, full-text tsvector @@ tsquery, trigram search with pg_trgm for LIKE '%term%'. Slower to update than B-tree; index size can be large. Fine for read-heavy data.
GiST Geometric data, range types (tstzrange overlaps), nearest-neighbor ORDER BY location <-> point, and exclusion constraints (e.g. "no overlapping bookings for the same room"). Lossy for some operator classes; usually bigger and slower than a B-tree for plain equality.
BRIN Huge append-only tables (logs, events) where the column correlates with physical row order, typically timestamps. Tiny — megabytes where a B-tree would be gigabytes. Useless if the data isn't physically ordered by that column; it only narrows the scan to block ranges.
Hash Pure equality on long values where a B-tree gets fat. Rarely needed; crash-safe and replicated since PostgreSQL 10. Equality only: no ranges, no sorting, no uniqueness enforcement.
SP-GiST Space-partitioned structures: prefix searches on text, IP ranges (inet), quadtrees for points. Niche; benchmark against GiST/B-tree for your actual data before committing.

Partial indexes: index only what you query

CREATE INDEX orders_pending_idx
    ON orders (created_at)
    WHERE status = 'pending';

If 99% of your orders are completed but every poll asks for pending ones, a partial index stays small, stays hot in cache, and is cheap to maintain — completed rows never touch it. The query must include the WHERE condition (or something the planner can prove implies it) for the index to be considered.

Expression indexes: match what the query computes

CREATE INDEX users_email_lower_idx ON users (lower(email));

A query filtering on lower(email) = lower($1) cannot use a plain index on email — the index stores raw values, not lowercased ones. The expression in the index must match the expression in the query. The same technique works for date_trunc('day', created_at), JSON field extraction, and any immutable function.

Covering indexes and index-only scans

CREATE INDEX orders_cust_idx
    ON orders (customer_id) INCLUDE (total, created_at);

If the index contains every column the query needs, PostgreSQL can skip the table entirely — an index-only scan. INCLUDE adds payload columns to the index leaves without making them part of the key. Two caveats: the index gets bigger, and index-only scans also require the visibility map to be current — on a heavily updated, rarely vacuumed table you'll see Heap Fetches climb and much of the benefit disappear.

Practical rules

The EXPLAIN Visualizer flags the two classic signs of a missing or wrong index: sequential scans that read far more than they return, and large Rows Removed by Filter counts.