How do databases count rows?
Counting sounds simple: ask for a number, get a number. In databases, counting can range from instant to painfully slow depending on what you count, how data is stored, and what the engine must scan to answer the query.
What COUNT(*) really means
COUNT(*) returns the number of rows that match a query. For a plain table count:
Sql
many engines must confirm how many rows exist. If the engine cannot trust a stored total, it will scan data pages or an index to count entries. This is reliable but can be costly on large tables, especially when combined with filters:
Sql
Now the database must find all rows with status='PAID' and count them. The best case is a selective index on status so the engine counts matching index entries instead of reading full rows.
COUNT(column) vs COUNT(*)
COUNT(*)counts rows, including rows where some columns areNULL.COUNT(column)counts non-NULLvalues in that column.
If a column contains many NULLs, COUNT(column) may be smaller than COUNT(*). Performance depends on whether the engine can use an index that covers the needed column and predicate.
Why counting can be slow
Counting is slow when the engine must touch a lot of data:
- Full scans: No usable index for the predicate, so the engine reads many pages.
- Low selectivity: An index exists but matches a huge fraction of rows, so the engine still processes many entries.
- Joins and grouping:
COUNT(*)after joins orGROUP BYcan multiply work because intermediate row sets can be large. - MVCC and visibility rules: In systems using multi-version concurrency control, the engine may need extra checks to decide whether a row version is visible to your transaction, which can add overhead during scans.
Indexes: the main tool for faster counts
Indexes can help counts in two primary ways:
1) Count through an index instead of the table
If the query predicate is supported by an index, the engine may count index entries. Index pages are often narrower than table pages, so fewer bytes are read.
Example:
Sql
2) Use covering indexes
A covering index contains all columns needed to satisfy the query without reading the table. For a count with a filter, an index on the filter column is often enough.
For multi-column filters, a composite index can be better:
Sql
Approximate counts and pre-aggregation
Exact counts can be expensive. If the goal is a dashboard number or a quick estimate, approximate approaches can be far cheaper:
Table statistics
Many databases keep statistics used by the optimizer, including estimated row counts. Some systems can expose these estimates. They are not exact, but they are quick.
For frequently requested counts, store them:
- Maintain a
order_counts_by_statustable. - Update it on insert/update/delete, or via scheduled jobs.
This trades write complexity for fast reads.
Materialized views
A materialized view precomputes aggregates. Refresh strategies vary (incremental vs full refresh). This is useful when many users query the same counts.
Query patterns that help
Count only what you need
Avoid counting after expanding rows with joins if a distinct entity count is desired:
Sql
This can still be heavy, but it prevents counting duplicates introduced by joins.
Filter early
Place predicates so the engine can use indexes and reduce scanned rows. Keep functions off indexed columns when possible, since expressions can block index usage:
Bad for index use:
Sql
Better:
Sql
Choosing the right strategy
- Need an exact total once in a while:
COUNT(*)with proper indexes is fine. - Need exact totals repeatedly on large tables: consider summary tables or materialized views.
- Need quick UI numbers: prefer approximate counts or cached results.
Efficient counting is less about the COUNT function and more about reducing the data the engine must read and verify.












