An Elasticsearch cheat sheet is a single-page reference of the REST endpoints, Query DSL fragments, and admin commands that get you from "I have a running cluster" to "I have a useful search experience" without re-reading the manual every time. This sheet covers Elasticsearch 9.x as of 2026, including the bits that newcomers trip over: index lifecycle, the difference between match and term, aggregations, the newer ESQL query language, and vector / kNN search for embedding-based retrieval.
I run Elasticsearch behind a few production search experiences and an analytics pipeline. The sections below are organized in the order you actually hit them: first connect to the cluster, then create an index and load documents, then search, then aggregate, then tune.
What this cheat sheet covers
Use this page as a reference card. Each section is a self-contained chunk: skip to Search DSL if you already have a populated index, skip to Aggregations if you need facet counts, skip to Vector / kNN search if you are wiring up semantic search. The end-of-page Common mistakes and FAQ sections answer the questions I get asked most.
What this sheet does NOT cover:
- Elastic's commercial features (Watcher, machine learning UI, cross-cluster replication). Those need a paid subscription. The free / open AGPL+ELv2 stack covers everything below.
- Kibana dashboards. Kibana is its own product with its own docs.
- The legacy Transport client. It was removed in Elasticsearch 8.0. Use the REST clients (Java, Python elasticsearch, Node @elastic/elasticsearch) or raw HTTP.
Connecting to Elasticsearch
Every command in this sheet is an HTTP request. I show them in curl form because that is the lowest common denominator. The same calls work in Kibana Dev Tools (paste the method, path, and JSON body without the curl prefix and host), in httpie, or through any official client.
A local single-node 9.x cluster started fresh requires basic auth (the elastic user) and HTTPS by default. The first-time setup prints the password to the console; reset it any time with:
bin/elasticsearch-reset-password -u elastic
A typical authenticated request:
curl -k -u elastic:CHANGEME https://localhost:9200/_cluster/health
The -k skips self-signed cert verification. For production, use a real certificate and pass --cacert instead. Most of the URLs in the rest of this sheet are shown without the host prefix; assume https://localhost:9200 (or wherever your cluster lives).
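If you would rather script the same health check than shell out to curl, the sketch below builds the equivalent request with Python's standard library only. The helper names build_request and insecure_context are made up for illustration, and the host and CHANGEME password are the same placeholders used above.

```python
import base64
import ssl
import urllib.request


def build_request(path, user="elastic", password="CHANGEME",
                  host="https://localhost:9200"):
    """Return a urllib Request carrying a Basic auth header for an API path."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        host + path, headers={"Authorization": f"Basic {token}"}
    )


def insecure_context():
    """TLS context that skips certificate checks, like curl -k. Local use only."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False      # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


req = build_request("/_cluster/health")
print(req.full_url)

# To actually send it against a running cluster:
# with urllib.request.urlopen(req, context=insecure_context()) as resp:
#     print(resp.read().decode())
```

For production, drop insecure_context and pass a context created with ssl.create_default_context(cafile=...) instead, which mirrors curl's --cacert.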
Quick reference
Search DSL: a worked example
The Query DSL is more verbose than the URI query string but composable. The idiom I reach for first is a bool query with must for text relevance and filter for binary conditions:
curl -k -u elastic:CHANGEME -X POST "https://localhost:9200/products/_search" \
  -H 'Content-Type: application/json' -d '{
  "size": 20,
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "wireless headphones" } }
      ],
      "filter": [
        { "term": { "in_stock": true } },
        { "range": { "price": { "lte": 200 } } }
      ]
    }
  },
  "sort": [
    "_score",
    { "rating": "desc" }
  ],
  "_source": ["name", "price", "rating", "image_url"]
}'
What each part does:
- must runs the user query through the name field's analyzer and contributes to _score.
- filter enforces "in stock and under $200" without touching the score; filter clauses can also be cached.
- sort ranks by relevance first, then by rating as a tiebreaker.
- _source restricts the response to the four fields the UI actually renders, saving bandwidth.
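In application code the same request body is usually assembled from user input rather than typed by hand. A minimal sketch, assuming the products index and field names from the example above; product_search_body is a hypothetical helper, not part of any client library:

```python
import json


def product_search_body(text, max_price, size=20):
    """Build the bool query: scored match in must, unscored conditions in filter."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Analyzed full-text clause; contributes to _score.
                "must": [{"match": {"name": text}}],
                # Yes/no conditions; do not affect _score.
                "filter": [
                    {"term": {"in_stock": True}},
                    {"range": {"price": {"lte": max_price}}},
                ],
            }
        },
        "sort": ["_score", {"rating": "desc"}],
        "_source": ["name", "price", "rating", "image_url"],
    }


body = product_search_body("wireless headphones", 200)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after -d in the curl call, or passed as the request body through whichever client you use.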
Aggregations: facets and analytics
Aggregations turn search results into facet counts, time series, and summary statistics. A typical product-search facet request:
curl -k -u elastic:CHANGEME -X POST "https://localhost:9200/products/_search" \
  -H 'Content-Type: application/json' -d '{
  "size": 0,
  "query": { "match": { "name": "headphones" } },
  "aggs": {
    "by_brand": { "terms": { "field": "brand.keyword", "size": 10 } },
    "price_bucket": { "histogram": { "field": "price", "interval": 50 } },
    "stats_price": { "stats": { "field": "price" } }
  }
}'
"size": 0 skips the hits and just returns the aggregations. That is the right pattern when you only need counts.
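On the way back, each aggregation appears under aggregations keyed by the name you gave it; bucket aggregations such as terms and histogram return a buckets list with key and doc_count. The sketch below reads that shape. The sample response is hand-written with made-up values and is trimmed to the fields used here, and facet_counts is a hypothetical helper.

```python
def facet_counts(response, agg_name):
    """Return [(key, doc_count), ...] for a terms or histogram aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


# Hand-written sample in the shape of a search response; values are invented.
sample = {
    "hits": {"total": {"value": 3, "relation": "eq"}, "hits": []},
    "aggregations": {
        "by_brand": {"buckets": [
            {"key": "Acme", "doc_count": 2},
            {"key": "Zenith", "doc_count": 1},
        ]},
        "price_bucket": {"buckets": [
            {"key": 50.0, "doc_count": 1},
            {"key": 100.0, "doc_count": 2},
        ]},
        "stats_price": {"count": 3, "min": 59.0, "max": 149.0,
                        "avg": 112.0, "sum": 336.0},
    },
}

print(facet_counts(sample, "by_brand"))
print(facet_counts(sample, "price_bucket"))
# Metric aggregations such as stats return plain values, not buckets.
print(sample["aggregations"]["stats_price"]["avg"])
```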
Aggregations can nest. To get the average price by brand:
"aggs": {
  "by_brand": {
    "terms": { "field": "brand.keyword", "size": 10 },
    "aggs": {
      "avg_price": { "avg": { "field": "price" } }
    }
  }
}
Vector / kNN search
Elasticsearch 8.x added native vector search; 9.x stabilized it as the recommended path for semantic retrieval. Define a dense_vector field in the mapping, then run a knn query with a query vector you compute client-side (or use semantic_text to skip the client step).
Mapping:
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "embedding": { "type": "dense_vector", "dims": 768, "index": true, "similarity": "cosine" }
    }
  }
}
Query:
{
  "knn": {
    "field": "embedding",
    "query_vector": [0.012, -0.041, ...],
    "k": 10,
    "num_candidates": 100
  }
}
num_candidates controls recall vs latency. A common ratio is 10× k. For hybrid search that combines lexical relevance with vector similarity, put both knn and a bool/match query into the same request:
{
  "query": { "match": { "title": "noise cancelling" } },
  "knn": { "field": "embedding", "query_vector": [...], "k": 10, "num_candidates": 100, "boost": 0.7 }
}
For the broader context on how vector search fits with embeddings, document chunking, and a working app, see how to build RAG with embeddings and vector search and how to add semantic search to a MySQL app.
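To make the hybrid request and the 10× k ratio concrete, here is a small sketch that assembles the body from a query string and an embedding. It assumes the title and embedding fields from the mapping above; hybrid_search_body is a hypothetical helper, and the zero vector only stands in for a real 768-dimension embedding from your model.

```python
def hybrid_search_body(text, query_vector, k=10, candidates_per_k=10, knn_boost=0.7):
    """Combine a lexical match query with a kNN clause in one search body."""
    if k < 1 or candidates_per_k < 1:
        raise ValueError("k and candidates_per_k must be positive")
    return {
        "query": {"match": {"title": text}},
        "knn": {
            "field": "embedding",
            "query_vector": list(query_vector),
            "k": k,
            # More candidates per shard means better recall and higher latency;
            # Elasticsearch caps num_candidates at 10,000.
            "num_candidates": min(k * candidates_per_k, 10_000),
            "boost": knn_boost,
        },
    }


placeholder_vector = [0.0] * 768   # stand-in; use your model's embedding here
body = hybrid_search_body("noise cancelling", placeholder_vector)
print(body["knn"]["k"], body["knn"]["num_candidates"])
```

The vector length has to match the dims value in the mapping, otherwise the request is rejected.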
ESQL: the SQL-like query language
Introduced in 8.11 and refined through the 8.x and 9.x lines, ESQL (officially styled ES|QL) gives Elasticsearch a piped query language that reads like SQL with a Unix-pipeline twist. For analytics, dashboards, and log exploration it is often easier than building the equivalent Query DSL by hand.
FROM logs-2026.05.*
| WHERE status >= 400 AND status < 500
| STATS count = count() BY host, status
| SORT count DESC
| LIMIT 20
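Over HTTP, a query like this is sent as a JSON body with a single query string to POST /_query. A minimal sketch of building that body, assuming the same index pattern and field names as the query above; esql_request_body is a hypothetical helper:

```python
import json


def esql_request_body(index_pattern, min_status, max_status, limit=20):
    """Build the JSON body for POST /_query from the pipeline stages."""
    stages = [
        f"FROM {index_pattern}",
        # int() keeps the numeric inputs from injecting arbitrary query text.
        f"| WHERE status >= {int(min_status)} AND status < {int(max_status)}",
        "| STATS count = count() BY host, status",
        "| SORT count DESC",
        f"| LIMIT {int(limit)}",
    ]
    return {"query": "\n".join(stages)}


body = esql_request_body("logs-2026.05.*", 400, 500)
print(json.dumps(body))
```

Send the result with the same curl flags as the other examples, pointing at https://localhost:9200/_query.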
When to use which:
| Use case | DSL | ESQL |
|---|---|---|
| Search-relevance tuning, scoring | Yes | Limited (recent releases add full-text functions such as MATCH and a _score metadata field, but DSL gives finer control) |
| Time-series analytics, log queries | Possible but verbose | Yes |
| Joins across multiple indices | No | Yes (LOOKUP JOIN, added as a technical preview in 8.18 / 9.0) |
| Aggregations | Yes | Yes (cleaner syntax) |
| Vector / kNN search | Yes | Not yet |
Version compatibility
| Feature | Available since | Notes |
|---|---|---|
| Single type per index (no _type) | 7.0 | Multi-type indices are gone. References to types in older tutorials are obsolete. |
| Composable index templates | 7.8 | Legacy _template API replaced by _index_template + _component_template. |
| Searchable snapshots | 7.10 | Mount a snapshot as a read-only index in the frozen tier. |
| Runtime fields | 7.11 | Compute fields at query time from _source or other fields. Cheap mapping additions. |
| dense_vector indexed for kNN | 8.0 | Required for the knn query. |
| ESQL | 8.11 | Piped query language. Technical preview in 8.11, generally available from 8.14. Major addition. |
| semantic_text field type | 8.15 | Embedding inference at index and query time, no external embedding step. |
| Elasticsearch 9.0 | 2025 | Major release. Source is available under AGPL or ELv2 (the AGPL option was added in 2024, during the 8.x line), alongside the existing free Basic tier. |
For specific upgrade walkthroughs, the official Elasticsearch upgrade documentation is the source of truth. Always snapshot before upgrading.
Common mistakes
The bugs I have shipped or seen in code review.
Searching a keyword field with match, or a text field with term. A match runs the query value through the field's analyzer (tokenize, lowercase); a term does not. A keyword field has no analyzer, so { "match": { "category": "Footwear" } } against a keyword category field is compared as the exact string: it finds Footwear but not footwear or partial matches. Use term there so the exact-match intent is explicit. Conversely, term against a text field looks for the exact indexed token (often lowercased), so { "term": { "title": "Headphones" } } misses documents containing Headphones if the analyzer lowercased the token to headphones.
Using from/size to paginate past the 10,000-result wall. The default index.max_result_window caps from + size at 10,000 because deep pagination forces every shard to maintain a deep priority queue. Raising the setting is almost always the wrong fix. Use search_after with a sort that ends in a unique tiebreaker (a unique keyword field, or the implicit _shard_doc tiebreaker you get with a Point In Time), and open a Point In Time (PIT) so the cursor stays stable while the index changes.
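The search_after loop is simple once written down: take the sort values of the last hit on a page and pass them as search_after on the next request. The sketch below shows that loop against an in-memory stand-in so it runs without a cluster. The sku tiebreaker field, next_page_body, paginate, and fake_search are all hypothetical names; in real code the search callable would POST the body to /products/_search.

```python
def next_page_body(base_body, last_hit=None, page_size=100):
    """Copy the base query and add the cursor taken from the previous page."""
    body = dict(base_body)
    body["size"] = page_size
    body.pop("from", None)                      # search_after replaces from
    if last_hit is not None:
        body["search_after"] = last_hit["sort"]  # sort values of the last hit
    return body


def paginate(search, base_body, page_size=100):
    """Yield every hit, one page at a time, until a page comes back empty."""
    last_hit = None
    while True:
        response = search(next_page_body(base_body, last_hit, page_size))
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        last_hit = hits[-1]


def fake_search(body):
    """In-memory stand-in for the search endpoint, pre-sorted by price then sku."""
    docs = [{"_id": str(i), "sort": [10 * i, str(i)]} for i in range(1, 6)]
    after = body.get("search_after")
    remaining = [d for d in docs if after is None or d["sort"] > after]
    return {"hits": {"hits": remaining[: body["size"]]}}


base = {"query": {"match_all": {}}, "sort": [{"price": "asc"}, {"sku": "asc"}]}
ids = [hit["_id"] for hit in paginate(fake_search, base, page_size=2)]
print(ids)
```

Against a live cluster you would also open a PIT first and include its id in each request body, so that every page reads the same snapshot of the index.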
Sorting on a text field. Text fields do not support sorting unless you enable fielddata: true (which loads every term into memory). The correct fix is to add a keyword subfield in the mapping (fields: { keyword: { type: "keyword" } }) and sort on name.keyword instead.
Forgetting to refresh after indexing in a test. Elasticsearch refreshes every second by default. If your test indexes a doc and immediately queries it back, the doc is not yet searchable. Append ?refresh=true to the index call, or call POST /:index/_refresh. Do NOT set refresh_interval to a tiny value in production; the cost in segment count is brutal.
Letting unmapped fields run wild. Without an explicit mapping, Elasticsearch auto-detects types from the first document. A field that holds "42" on doc 1 and 42 on doc 2 ends up mapped as text, and term queries against it fail in confusing ways. Define mappings up front for anything that matters.
Treating shards as cheap. Each shard is a Lucene index with overhead. Hundreds of small shards per node will starve heap. The general guidance: 10-50GB per shard for search workloads, fewer larger shards over more smaller ones, and total shards per node at or below roughly 20 × heap_in_GB.
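As a worked example of that guidance, the sketch below turns an index size and a heap size into a primary shard count and a per-node shard ceiling. The 30 GB target is an assumption (a midpoint of the 10-50GB range), and shard_plan is a hypothetical helper; treat the output as a starting point for testing, not a sizing rule.

```python
import math


def shard_plan(index_size_gb, heap_gb, target_shard_gb=30):
    """Apply the rule of thumb: ~10-50 GB per shard, <= 20 shards per GB of heap."""
    primaries = max(1, math.ceil(index_size_gb / target_shard_gb))
    return {
        "primaries": primaries,
        "gb_per_shard": index_size_gb / primaries,
        "max_shards_per_node": 20 * heap_gb,
    }


plan = shard_plan(index_size_gb=600, heap_gb=16)
print(plan)
```

For a 600 GB index on nodes with 16 GB of heap this suggests 20 primaries of 30 GB each, well under the ceiling of 320 shards per node.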
Running a single-node cluster in production. New indices default to number_of_replicas: 1, and a one-node cluster cannot allocate the replica anywhere, so cluster status sits at yellow. Either set replicas to 0 (knowing you have no redundancy) or run at least two data nodes. Don't ignore the yellow.
Snapshotting without a registered repository. Snapshots require a repository registered ahead of time; PUT /_snapshot/:repo once, then PUT /_snapshot/:repo/:snapshot. Many teams discover at recovery time that they never set this up. Test the restore path before you need it.
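The three calls involved (register, snapshot, restore) can be laid out as data, which makes them easy to replay in a restore drill. This is a sketch only: my_backup, snap_1, and /mnt/es_backups are hypothetical names, the shared-filesystem (fs) repository type is just one option, and an fs location must also be listed under path.repo in elasticsearch.yml on every node.

```python
import json


def snapshot_calls(repo, snapshot, index, location):
    """Return (method, path, body) for register, snapshot, and restore."""
    return [
        # 1. Register the repository once.
        ("PUT", f"/_snapshot/{repo}",
         {"type": "fs", "settings": {"location": location}}),
        # 2. Take a snapshot of one index and wait for it to finish.
        ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         {"indices": index, "include_global_state": False}),
        # 3. Restore under a new name so the live index is left untouched.
        ("POST", f"/_snapshot/{repo}/{snapshot}/_restore",
         {"indices": index,
          "rename_pattern": "(.+)",
          "rename_replacement": "restored_$1"}),
    ]


for method, path, body in snapshot_calls("my_backup", "snap_1", "products", "/mnt/es_backups"):
    print(method, path, json.dumps(body))
```

Restoring under a renamed index is a convenient way to rehearse recovery without closing or deleting the original.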
Frequently asked questions
See also
- MySQL Cheat Sheet: the companion reference for relational queries and the primary store whose data you typically index into Elasticsearch
- How to Build RAG with Embeddings and Vector Search: end-to-end semantic search pipeline that uses the dense_vector mapping above
- How to Add Semantic Search to a MySQL App: the lighter-weight alternative when you don't want to run a full Elasticsearch cluster
- Regex Cheat Sheet: for the regex queries Elasticsearch supports on keyword and wildcard fields
- curl Cheat Sheet: the HTTP client used in every example above, including the flags for self-signed certs and basic auth
External references: Elasticsearch official documentation is the source of truth for endpoint behavior and version-specific changes. The Elastic Search Labs blog covers the newer ESQL, vector search, and semantic_text features in depth.





