How do you search keywords in a database?
Keyword search is one of the most common tasks in software: finding rows where a word or phrase appears in a text field. It sounds simple, but the “right” method depends on your database, the size of your data, and what kind of matching you need (exact, partial, case-insensitive, multi-word, ranked results, and so on). This article walks through practical ways to implement keyword search, with tips that keep queries accurate and performant.
Start with the goal: what kind of match do you need?
Before writing SQL, decide what “keyword search” means for your use case:
- Exact match: “apple” matches only “apple”.
- Substring match: “app” matches “apple”, “application”.
- Word match: “cat” matches “black cat” but not “concatenate”.
- Case-insensitive match: “Apple” matches “apple”.
- Multi-keyword match: “red shoes” matches rows containing both words.
- Phrase match: “red shoes” matches that exact phrase in that order.
- Ranked results: show the most relevant matches first.
This choice drives both query style and indexing strategy.
The basic tool: LIKE and pattern matching
For many small-to-medium tables, LIKE is the simplest option.
Substring search
Sql
- Works everywhere.
- Easy to reason about.
- Can be slow on large tables, especially with a leading wildcard (
%shoe%), because many engines can’t use a normal index effectively.
Prefix search (often faster)
Sql
Prefix matching can often use an index on name, because the engine can jump to the “shoe…” range.
Case-insensitive matching
Different databases handle this differently:
- Some collations are case-insensitive by default.
- Others require a function or operator.
Common patterns:
Sql
This is easy but can block index usage unless you have a functional index (if supported).
Multi-column keyword search
Often you want to search across multiple fields, such as title, description, and tags.
Sql
This works, but watch for:
- Performance issues on large datasets.
- Duplicate logic if you repeat it across endpoints.
A common improvement is to create a combined searchable column (or generated column) and index it where appropriate.
Escaping special characters safely
When users type search terms, they may include % or _, which are wildcards in LIKE. If you treat those literally, you must escape them.
Example idea (syntax varies by engine):
Sql
In application code, it’s best to:
- Use parameterized queries (to prevent SQL injection).
- Escape
%and_if you intend literal search.
Token-based search: matching whole words
Substring search can produce noisy matches (“cat” matches “concatenate”). To match whole words, you have a few options:
1) Store tokens separately (normalized approach)
Create a table that stores tokens for each record:
article(id, title, body)article_token(article_id, token)
Then query:
Sql
Pros:
- Precise word matching.
- Indexing
article_token(token)is very effective.
Cons:
- More write complexity.
- Must generate tokens on insert/update.
2) Use regex (if supported)
Some databases support regex matching. This can match word boundaries, but performance varies and indexing may not help.
Conceptually:
- Match
\bcat\bto get word boundaries.
Use with care on large tables.
Full-text search: better relevance and speed for text-heavy fields
When you need high-quality keyword search on long text (articles, comments, product descriptions), full-text search features are usually the best fit. Full-text indexing typically:
- Breaks text into tokens.
- Ignores common stop words (configurable).
- Supports stemming in some setups (so “run” can match “running”).
- Produces relevance scores for ranking.
What you gain
- Faster searching on large text fields.
- Multi-word queries without writing complex SQL.
- Ranking and sometimes phrase search.
What to plan for
- A full-text index must be built and updated.
- Different languages require proper tokenization settings.
- Some queries behave differently than
LIKE(especially around punctuation and short words).
Even if full-text search is available, LIKE can still be useful for small lookup fields (like SKU prefix searches).
Indexing strategies that matter
Indexes are what make search scale. The right index depends on query shape.
B-tree indexes
Standard indexes work well for:
- Exact matches (
=). - Prefix matches (
LIKE 'term%'). - Range filters (dates, numeric ranges).
They usually do not help with:
- Leading wildcard searches (
LIKE '%term%').
Functional indexes
If you search with LOWER(column), consider an index on that expression (if your database supports it). Then case-insensitive lookups can remain fast.
Specialized text indexes
Full-text search typically uses its own index type. Use it for large text, multi-keyword search, and relevance ranking.
Handling multi-keyword input
Users often type more than one word. Decide whether you want AND behavior (must contain all words) or OR behavior (any word).
AND-style matching with LIKE
If the input is split into tokens ["red", "shoe"]:
Sql
Pros: simple. Cons: can be slow and may produce odd results (matches anywhere, not necessarily as words).
OR-style matching
Sql
This broadens results, useful for discovery searches.
A practical approach is:
- Default to AND for short lists of tokens.
- Add filters and sorting to help narrow results.
Sorting and pagination
Search results are often paginated. Use a stable sort order.
Example:
Sql
For large offsets, keyset pagination is faster (using “where published_at < last_seen…”), but it requires more logic.
Common pitfalls and how to avoid them
1) Slow queries on large tables
- Avoid
LIKE '%term%'on big unindexed text fields. - Use full-text indexing or token tables for large-scale keyword search.
2) Case and accent mismatches
- Confirm your collation rules.
- If users expect accent-insensitive matching, pick the right collation or store a normalized form.
3) Poor-quality results
- Substring search may match too broadly.
- Word-based token search or full-text search improves relevance.
4) Security issues
- Always use parameterized queries.
- Never concatenate raw user input into SQL strings.
A practical decision guide
- Small table, quick feature:
LIKE '%term%'may be acceptable. - Prefix search (names, codes):
LIKE 'term%'plus a normal index. - Whole-word search with strict control: token table with indexes.
- Text-heavy search with ranking: full-text search.
Keyword search in a database isn’t a single technique; it’s a toolbox. Pick the simplest method that meets your matching needs and performance targets, then upgrade to indexing and full-text features when the dataset grows or search quality becomes a product feature.












