A zero-downtime Elasticsearch reindex is the technique I use in production to change an index's mapping without taking the search experience offline. It rests on two Elasticsearch primitives: aliases (so the application queries products, not products_v3) and the _reindex API (so a new index is rebuilt in parallel from an existing one). When the new index is ready, a single atomic alias swap moves traffic over. The old index is preserved as an instant rollback target until you're confident the new one is correct.
This article covers the pattern step-by-step, gives you a parameterized bash script that runs the reindex and rolls back if needed, and is verified against Elasticsearch 8.x and 9.x. The technique itself has been stable since the _reindex API landed in Elasticsearch 2.3 (2016), and Elastic's own Changing Mapping with Zero Downtime post describes the same idea in pre-_reindex form.
The problem: changing mapping rebuilds the index
I work on large sites that hit Elasticsearch for product search, log search, and content discovery. Whenever the schema needs to change, which happens more often than people expect, you hit the same wall:
- A field's analyzer needs to change (standard to english, or adding a custom synonym filter)
- A keyword field needs to become a dense_vector for semantic search
- A field's text/keyword type needs to flip the other way
- A text field needs a sibling keyword subfield for sorting and aggregations
Elasticsearch does not let you change the type or analyzer of an existing field. Once a field is mapped, that's its shape for the life of the index. The first three changes above are outright forbidden in place.
The last one is a softer case worth being precise about: you can add a keyword multi-field to an existing text field via the update-mapping API. What you cannot do is backfill it. Documents indexed before the mapping change have no value in the new subfield, so sorting and aggregations on it are wrong until every document is rewritten. A reindex (or _update_by_query) is what populates it. So it still leads here, just for the backfill rather than because the mapping edit is rejected.
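As a rough sketch of that softer case, the two requests look something like the following. The function names and the index and field names passed to them are made up for illustration; the endpoints themselves (PUT /<index>/_mapping and POST /<index>/_update_by_query) are the standard ones. The sketch assumes the text field uses the default analyzer; if yours does not, the mapping body has to repeat the existing analyzer or the update is rejected as a conflict.

```shell
#!/usr/bin/env bash
# Sketch only: add a keyword multi-field to an existing text field, then
# backfill it in place. Helper names are illustrative assumptions.

# Build the update-mapping body that adds <field>.keyword to a text field.
# Assumes the field uses the default analyzer (see the note above).
multifield_mapping_body() {
  local field="$1"
  printf '{"properties":{"%s":{"type":"text","fields":{"keyword":{"type":"keyword"}}}}}' "$field"
}

# Step 1: the mapping edit is accepted in place because it only adds a subfield.
add_keyword_subfield() {
  local host="$1" index="$2" field="$3"
  curl -sk -u "${ES_AUTH:-elastic:changeme}" -X PUT "$host/$index/_mapping" \
    -H 'Content-Type: application/json' \
    -d "$(multifield_mapping_body "$field")"
}

# Step 2: rewrite every document so the new subfield gets populated.
# conflicts=proceed keeps going when a document changes mid-run.
backfill_subfield() {
  local host="$1" index="$2"
  curl -sk -u "${ES_AUTH:-elastic:changeme}" -X POST \
    "$host/$index/_update_by_query?conflicts=proceed&wait_for_completion=false"
}
```

Calling add_keyword_subfield and then backfill_subfield against the same index covers the whole in-place path. On a large index the backfill is as heavy as a reindex, which is why the alias-swap route is often the cleaner choice anyway.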
For the forbidden changes, the only way to apply a different mapping is to put the documents into a different index. Without aliases that means: drop the index, recreate it with the new mapping, reingest from your source of truth. While that's happening, the application's search endpoint either errors (the index is gone), returns empty (the index is empty), or both.
For anything that gets real traffic, that downtime window is unacceptable. The alias-swap pattern below is how I avoid it.
The pattern: applications never query the index directly
The prerequisite that makes everything else possible: the application never refers to an Elasticsearch index by its real name. It refers to an alias. The alias is what points to the real index. As long as the alias exists, the application can be ignorant of which physical index is actually serving its requests.
[ application ]
       |
       | queries "products"
       v
[ alias: products ] ----> [ index: products_v3 ]
When you need a new mapping, you don't touch products_v3. You build a brand-new index (products_v4) with the new mapping, populate it via _reindex, and then atomically move the alias from products_v3 to products_v4. The application sees no interruption. It always sees "products".
For new applications I always introduce the alias from day one (even if the alias and the index have the same name and feel pointless), because retrofitting an alias onto an application that hardcodes index names is its own small migration. If your application is already querying the index name directly, the first step is to add an alias to the existing index and update the application to use the alias. That alone is zero-downtime.
Setup: the placeholder variables to substitute
The walkthrough below uses placeholder names: :host for the Elasticsearch host, :old_index and :new_index for the current and new index names, and :alias for the alias the application uses. Substitute your own values wherever they appear in the curl commands.
I'll assume basic auth with the elastic user in the commands. Replace elastic:changeme with the credentials you actually use (or drop -u entirely if your cluster is unauthenticated and behind a VPN).
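If you would rather not hand-edit every command, one option is to keep the values in shell variables and expand the placeholders with a small helper. This is only a convenience sketch: fill_placeholders is a hypothetical function name, and the default values are the example names used in this article.

```shell
#!/usr/bin/env bash
# Sketch only: hold the four values once and expand the :placeholders
# used in the walkthrough. Defaults are the article's example names.
ES_HOST="${ES_HOST:-https://localhost:9200}"
OLD_INDEX="${OLD_INDEX:-products_v3}"
NEW_INDEX="${NEW_INDEX:-products_v4}"
ALIAS="${ALIAS:-products}"

# Replace :host, :old_index, :new_index and :alias in a template string.
fill_placeholders() {
  local s="$1"
  s="${s//:host/$ES_HOST}"
  s="${s//:old_index/$OLD_INDEX}"
  s="${s//:new_index/$NEW_INDEX}"
  s="${s//:alias/$ALIAS}"
  printf '%s\n' "$s"
}
```

For example, curl -sk -u elastic:changeme "$(fill_placeholders ':host/_alias/:alias')" runs the alias check from step 1 against whatever the variables currently hold.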
Step 1: Make sure the alias exists on the current index
Check first. If the alias is already there, skip to step 2.
curl -sk -u elastic:changeme ":host/_alias/:alias"
If it returns {} or 404, create it pointing at the current index:
curl -sk -u elastic:changeme -X POST ":host/_aliases" \
  -H 'Content-Type: application/json' \
  -d '{"actions":[{"add":{"index":":old_index","alias":":alias"}}]}'
At this point the application can continue querying the alias and nothing has changed for users. The rest of the work happens behind that alias.
Step 2: Create the new index with the new mapping
Define the mapping for the new index. This is the only step where the shape of the request depends on what you're changing.
The mapping below is an illustrative example, not something to copy as-is. It is a made-up products schema that happens to add a keyword subfield to name and a dense_vector field. Do not paste it into your own migration. Replace the entire request body with the actual mapping your index needs. The point of this step is the shape of the PUT request; the fields inside it are yours to define.
curl -sk -u elastic:changeme -X PUT ":host/:new_index" \
  -H 'Content-Type: application/json' \
  -d '{
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 0,
      "refresh_interval": "60s"
    },
    "mappings": {
      "properties": {
        "name": { "type": "text", "fields": { "keyword": { "type": "keyword" } } },
        "description": { "type": "text" },
        "price": { "type": "double" },
        "in_stock": { "type": "boolean" },
        "embedding": { "type": "dense_vector", "dims": 768, "index": true, "similarity": "cosine" }
      }
    }
  }'
Two tuning choices worth flagging:
- number_of_replicas: 0 during reindex. Replicas force every write to be acknowledged by N nodes. Set replicas to 0 for the reindex, then bump back up after. On a meaningful corpus this can be 2x to 3x faster.
- refresh_interval: 60s (or -1 to disable entirely). Default is 1 second. Reindex throughput improves significantly when Lucene doesn't have to write and open a new searchable segment that often.
Bump both back to production values after the reindex completes and before the alias swap.
Step 3: Reindex from old to new
The _reindex API copies documents from source to dest. The two flags I always set:
- wait_for_completion=false runs the reindex as a background task and returns immediately with a task id. You can poll the task for progress and the request can't time out on a long reindex.
- slices=auto parallelizes the reindex. Elasticsearch chooses a reasonable number of slices, using the shard count as a guide rather than a strict one-slice-per-shard guarantee. This is the single biggest performance win for large indices. One caveat: with sliced reindex the parent task's progress counts only fully completed slices, so the created number below can jump in steps rather than climbing smoothly.
curl -sk -u elastic:changeme -X POST ":host/_reindex?wait_for_completion=false&slices=auto" \
  -H 'Content-Type: application/json' \
  -d '{
    "source": { "index": ":old_index", "size": 1000 },
    "dest": { "index": ":new_index" }
  }'
The response includes {"task":"node_id:task_id"}. Save that. Poll progress with:
curl -sk -u elastic:changeme ":host/_tasks/<task_id>"
The response includes task.status with counts of total, created, updated, version_conflicts, and throttled_millis. When the top-level completed field flips to true, the reindex is done.
If you need to throttle the reindex (because it's competing with production reads), pass requests_per_second=500 to the original _reindex call. You can also adjust it on a running task via POST /_reindex/<task_id>/_rethrottle?requests_per_second=N.
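A minimal sketch of both throttling routes, assuming the same host and task id conventions as the commands above. The helper names are hypothetical; the requests_per_second parameter and the _rethrottle endpoint are the ones named in the paragraph above. Passing -1 as the rate removes the throttle.

```shell
#!/usr/bin/env bash
# Sketch only: build the throttled _reindex URL and the _rethrottle URL.
# Helper names are illustrative assumptions.

# URL for starting a reindex, with an optional requests_per_second cap.
reindex_url() {
  local host="$1" rps="${2:-}"
  local url="$host/_reindex?wait_for_completion=false&slices=auto"
  if [[ -n "$rps" ]]; then
    url+="&requests_per_second=$rps"
  fi
  printf '%s\n' "$url"
}

# URL for changing the rate of a reindex that is already running.
rethrottle_url() {
  local host="$1" task_id="$2" rps="$3"
  printf '%s\n' "$host/_reindex/$task_id/_rethrottle?requests_per_second=$rps"
}

# Apply a new rate to a running task (use -1 to remove the throttle).
rethrottle() {
  curl -sk -u "${ES_AUTH:-elastic:changeme}" -X POST "$(rethrottle_url "$@")"
}
```

For example, rethrottle "$ES_HOST" "<task_id>" 200 slows a running reindex to roughly 200 requests per second without restarting it.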
Step 4: Verify the document counts match
Before swapping the alias, confirm the new index ended up with the same number of documents as the old one. A count mismatch usually means:
- New writes hit the old index during the reindex (more on this below)
- A field in the new mapping silently rejected some documents (parse failures during reindex log into the task status)
curl -sk -u elastic:changeme ":host/:old_index/_count"
curl -sk -u elastic:changeme ":host/:new_index/_count"
If you raised refresh_interval on the new index, run POST :host/:new_index/_refresh first so that _count reflects everything the reindex wrote.
Expect the new count to be equal to or slightly lower than the old one. _reindex works from a point-in-time snapshot taken when the task starts, so any writes that hit the old index after that snapshot are not copied. Meanwhile those writes do land on the old index, because the alias still points there. The result is the new index trailing the old one by however many documents were written during the reindex window. That gap is expected, not a bug. You close it by replaying the delta after the alias swap (covered under Common mistakes below), or by pausing writes for the duration.
A large unexplained shortfall is the real warning sign: it usually means a field in the new mapping rejected some documents. Parse failures during reindex are recorded in the task status, which is exactly what the script further down checks before it lets you swap. A new count that is higher than the old one points at a different problem, most often that the new index was not empty when the reindex started or that documents were deleted from the old index mid-run.
If the gap is non-trivial and not explained by in-flight writes, investigate before swapping the alias. The old index is still serving traffic; nothing breaks yet.
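One way to make "explained by in-flight writes" concrete is to decide up front how many documents you expect to be written during the reindex window and treat anything beyond that as a failure. The sketch below assumes you have already fetched the two counts; count_gap_ok and its max_gap argument are hypothetical names, not part of any Elasticsearch tooling.

```shell
#!/usr/bin/env bash
# Sketch only: decide whether the old/new count gap is acceptable.
# Returns 0 when new trails old by at most max_gap documents, 1 otherwise.
count_gap_ok() {
  local old="$1" new="$2" max_gap="$3"
  local gap=$(( old - new ))
  if (( gap < 0 )); then
    echo "new index has $(( -gap )) more document(s) than old: investigate" >&2
    return 1
  fi
  if (( gap > max_gap )); then
    echo "new index trails old by $gap (allowed: $max_gap): investigate" >&2
    return 1
  fi
  echo "gap of $gap document(s) is within the allowed $max_gap"
}
```

For example, count_gap_ok "$OLD_COUNT" "$NEW_COUNT" 50 passes when the new index is at most 50 documents behind. The threshold is a judgment call based on your write rate and how long the reindex ran.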
Once the count is satisfactory, restore the production settings on the new index:
curl -sk -u elastic:changeme -X PUT ":host/:new_index/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index":{"number_of_replicas":1,"refresh_interval":"1s"}}'
Step 5: Atomic alias swap
One _aliases request with both a remove and an add. Elasticsearch executes the actions atomically: at no point in time does the alias resolve to nothing.
curl -sk -u elastic:changeme -X POST ":host/_aliases" \
  -H 'Content-Type: application/json' \
  -d '{
    "actions": [
      { "remove": { "index": ":old_index", "alias": ":alias" } },
      { "add": { "index": ":new_index", "alias": ":alias" } }
    ]
  }'
After this command, the application is querying the new index. There was no outage and no errors.
Step 6: Delete the old index (when you're confident)
Keep the old index around for at least a few hours, or a day, as a rollback target. When you're sure the new index is healthy, delete the old one:
curl -sk -u elastic:changeme -X DELETE ":host/:old_index"
I usually wait until the next deploy cycle to do this. Disk is cheap; an unexpected outage is not.
A complete bash script
The script below glues steps 3 to 5 into a single command. Steps 1 and 2 (the alias and the new index) are prerequisites it checks for, and step 6 (deleting the old index) is left to you. It accepts the four parameters as flags, polls the reindex task until completion, inspects the finished task for errors and document-level failures (a reindex can report completed and still have failed on individual documents), verifies counts, and prompts before the alias swap. Save it as es-zero-downtime-reindex.sh, chmod +x, and run.
#!/usr/bin/env bash
# es-zero-downtime-reindex.sh, reindex an Elasticsearch index without downtime
# via the alias-swap pattern. Verified against Elasticsearch 8.x and 9.x.
# Source: https://techearl.com/elasticsearch-zero-downtime-reindex
# Site: https://techearl.com/
#
# Usage:
# ./es-zero-downtime-reindex.sh --old products_v3 --new products_v4 --alias products
# ./es-zero-downtime-reindex.sh --old products_v3 --new products_v4 --alias products --rollback
#
# Requires: curl, jq.
#
# Prerequisites:
# - The new index already exists with your desired mapping.
# - The alias already points to the old index.
# - ES_HOST and ES_AUTH are set in the environment, or you pass --host.
set -euo pipefail
HOST="${ES_HOST:-https://localhost:9200}"
AUTH="${ES_AUTH:-elastic:changeme}"
OLD_INDEX=""
NEW_INDEX=""
ALIAS=""
ACTION="reindex"
YES=0
usage() {
  cat <<EOF
Usage: $0 --old <old_index> --new <new_index> --alias <alias> [--rollback] [--yes] [--host <host>]

Required:
  --old       Current index name (alias currently points here)
  --new       New index name (must already exist with the new mapping)
  --alias     Alias the application queries

Optional:
  --rollback  Swap the alias back from new to old (use when the new index is bad)
  --yes       Don't prompt before the alias swap
  --host      Elasticsearch host (default: \$ES_HOST or https://localhost:9200)

Auth: set ES_AUTH=user:pass in the environment (default: elastic:changeme).
EOF
  exit 1
}

while [[ $# -gt 0 ]]; do
  case "$1" in
    --old)      OLD_INDEX="$2"; shift 2 ;;
    --new)      NEW_INDEX="$2"; shift 2 ;;
    --alias)    ALIAS="$2"; shift 2 ;;
    --rollback) ACTION="rollback"; shift ;;
    --yes)      YES=1; shift ;;
    --host)     HOST="$2"; shift 2 ;;
    -h|--help)  usage ;;
    *)          echo "Unknown arg: $1"; usage ;;
  esac
done

[[ -z "$OLD_INDEX" || -z "$NEW_INDEX" || -z "$ALIAS" ]] && usage

# Thin curl wrapper: auth, JSON content type, and errors shown on stderr.
es() {
  curl -sS -k -u "$AUTH" -H 'Content-Type: application/json' "$@"
}

# --- Rollback: swap the alias back from new to old, then exit ---
if [[ "$ACTION" == "rollback" ]]; then
  echo "Rolling back: pointing '$ALIAS' from '$NEW_INDEX' back to '$OLD_INDEX'..."
  es -X POST "$HOST/_aliases" -d @- <<JSON
{ "actions": [
  { "remove": { "index": "$NEW_INDEX", "alias": "$ALIAS" } },
  { "add":    { "index": "$OLD_INDEX", "alias": "$ALIAS" } }
] }
JSON
  echo "Done. The application is now reading from '$OLD_INDEX' again."
  exit 0
fi
# --- Sanity checks ---
echo "Verifying old index exists..."
es -o /dev/null -w "%{http_code}\n" "$HOST/$OLD_INDEX" | grep -q "^200$" \
  || { echo "Error: '$OLD_INDEX' not found"; exit 1; }

echo "Verifying new index exists with mapping..."
es -o /dev/null -w "%{http_code}\n" "$HOST/$NEW_INDEX" | grep -q "^200$" \
  || { echo "Error: '$NEW_INDEX' not found. Create it with the new mapping first."; exit 1; }

echo "Verifying alias points to old index..."
es "$HOST/_alias/$ALIAS" | jq -e ".\"$OLD_INDEX\".aliases.\"$ALIAS\"" > /dev/null \
  || { echo "Error: alias '$ALIAS' does not point to '$OLD_INDEX'"; exit 1; }

# --- Tune the new index for fast reindex ---
echo "Tuning new index for reindex (replicas=0, refresh=60s)..."
es -X PUT "$HOST/$NEW_INDEX/_settings" \
  -d '{"index":{"number_of_replicas":0,"refresh_interval":"60s"}}' > /dev/null

# --- Trigger the reindex as a background task ---
echo "Starting reindex from '$OLD_INDEX' to '$NEW_INDEX' (slices=auto)..."
TASK=$(es -X POST "$HOST/_reindex?wait_for_completion=false&slices=auto" -d @- <<JSON | jq -r .task
{ "source": { "index": "$OLD_INDEX", "size": 1000 },
  "dest":   { "index": "$NEW_INDEX" } }
JSON
)
# If the request was rejected there is no task id; stop instead of polling forever.
if [[ -z "$TASK" || "$TASK" == "null" ]]; then
  echo "Error: reindex did not return a task id. The alias still points to '$OLD_INDEX'."
  exit 1
fi
echo "Task: $TASK"

# --- Poll until completion ---
# Note: with slices=auto the parent task counts only completed slices, so
# the progress number can jump in steps rather than climbing smoothly.
while true; do
  RESP=$(es "$HOST/_tasks/$TASK")
  COMPLETED=$(echo "$RESP" | jq -r .completed)
  CREATED=$(echo "$RESP" | jq -r '.task.status.created // 0')
  TOTAL=$(echo "$RESP" | jq -r '.task.status.total // 0')
  printf "\r  Progress: %s / %s   " "$CREATED" "$TOTAL"
  [[ "$COMPLETED" == "true" ]] && break
  sleep 5
done
printf "\n  Reindex task finished. Inspecting result...\n"

# --- Fail fast on task errors, document failures, and version conflicts ---
# A _reindex task can report "completed": true and still have failed on
# individual documents. Counts alone do not catch this.
FINAL=$(es "$HOST/_tasks/$TASK")

if echo "$FINAL" | jq -e 'has("error")' > /dev/null; then
  echo "ERROR: the reindex task itself failed:"
  echo "$FINAL" | jq '.error'
  echo "Aborted. The alias still points to '$OLD_INDEX'."
  exit 1
fi

FAILURES=$(echo "$FINAL" | jq -r '(.response.failures // []) | length')
VCONFLICTS=$(echo "$FINAL" | jq -r '.response.version_conflicts // 0')

if [[ "$FAILURES" -gt 0 ]]; then
  echo "ERROR: reindex completed with $FAILURES document failure(s). First few:"
  echo "$FINAL" | jq '.response.failures[0:5]'
  echo "Aborted. The alias still points to '$OLD_INDEX'."
  exit 1
fi

if [[ "$VCONFLICTS" -gt 0 ]]; then
  echo "WARNING: $VCONFLICTS version conflict(s) during reindex."
  echo "This is normal only if live writes were updating the same documents."
fi
echo "  No document failures reported."

# --- Refresh the new index so _count is current ---
es -X POST "$HOST/$NEW_INDEX/_refresh" > /dev/null

# --- Verify counts ---
OLD_COUNT=$(es "$HOST/$OLD_INDEX/_count" | jq -r .count)
NEW_COUNT=$(es "$HOST/$NEW_INDEX/_count" | jq -r .count)
echo "Document counts: old=$OLD_COUNT new=$NEW_COUNT"

if [[ "$OLD_COUNT" != "$NEW_COUNT" ]]; then
  echo "WARNING: counts differ. Investigate before continuing."
  if [[ $YES -ne 1 ]]; then
    # -r keeps backslashes in the answer literal.
    read -r -p "Swap the alias anyway? [y/N] " ans
    [[ "$ans" =~ ^[Yy]$ ]] || { echo "Aborted. The alias still points to '$OLD_INDEX'."; exit 1; }
  fi
fi
# --- Restore production settings on new index ---
echo "Restoring production settings (replicas=1, refresh=1s)..."
es -X PUT "$HOST/$NEW_INDEX/_settings" \
  -d '{"index":{"number_of_replicas":1,"refresh_interval":"1s"}}' > /dev/null

# --- Atomic alias swap ---
if [[ $YES -ne 1 ]]; then
  read -r -p "Ready to swap alias '$ALIAS' from '$OLD_INDEX' to '$NEW_INDEX'? [y/N] " ans
  [[ "$ans" =~ ^[Yy]$ ]] || { echo "Aborted. The alias still points to '$OLD_INDEX'."; exit 1; }
fi

es -X POST "$HOST/_aliases" -d @- <<JSON
{ "actions": [
  { "remove": { "index": "$OLD_INDEX", "alias": "$ALIAS" } },
  { "add":    { "index": "$NEW_INDEX", "alias": "$ALIAS" } }
] }
JSON

echo "Done. The application is now reading from '$NEW_INDEX' via '$ALIAS'."
echo
echo "Old index '$OLD_INDEX' is preserved. Delete when you're confident:"
# Print the variable name rather than its value so the password stays out of the terminal.
echo "  curl -sk -u \"\$ES_AUTH\" -X DELETE \"$HOST/$OLD_INDEX\""
echo
echo "If something goes wrong, roll back with:"
echo "  $0 --old $OLD_INDEX --new $NEW_INDEX --alias $ALIAS --rollback"
Rollback if something goes wrong
If the new index turns out to be wrong (queries return weird results, the mapping was misconfigured, a synonym filter exploded), the old index is still there. The rollback is the same alias swap, in reverse:
./es-zero-downtime-reindex.sh \
  --old products_v3 \
  --new products_v4 \
  --alias products \
  --rollback
Run it as soon as you notice the issue. The application is back on the old index in seconds. The new index stays where it is, so you can investigate it offline, fix the mapping, and try again with _v5.
This is why I always keep the old index around for at least a day. The rollback path is the entire reason this technique is safer than dropping and recreating: at any point during or after the swap, you have an instant escape hatch.
Elasticsearch 8.x and 9.x compatibility
I've verified this pattern against the current 8.x and 9.x lines. Both behave identically for the commands above. A few notes on what changed and what stayed the same:
| Feature | 8.x | 9.x | Notes |
|---|---|---|---|
| _reindex API | Yes | Yes | Stable since 2.3 (2016). |
| wait_for_completion=false | Yes | Yes | Same task semantics. |
| slices=auto | Yes | Yes | Since 6.3. Elasticsearch chooses the slice count, guided by (not strictly equal to) the shard count. |
| Atomic _aliases actions | Yes | Yes | Stable since 1.x. Multi-action body executes in a single transaction. |
| Mapping format | Flat (no _type) | Flat (no _type) | _type was deprecated in 6.x, removed in 8.0. Any pre-7.x example with {"properties": {...}} nested under "_doc" won't run on 8 or 9. |
| number_of_shards change | Requires reindex | Requires reindex | The whole reason this technique exists. |
| is_write_index on aliases | Yes | Yes | Useful for rolling indices and ILM, not needed for a simple mapping change. |
| Authentication | Required by default | Required by default | The first-time setup prints the elastic password; reset with bin/elasticsearch-reset-password -u elastic. |
One thing genuinely worth flagging: in 8.0 the typed URL form was removed. Note that /products/_doc/123 is not the legacy form. On 8.x and 9.x, /<index>/_doc/<id> is the normal, current document API endpoint, where _doc is the typeless endpoint name. The form that was removed is the old typed URL like /products/product/123, with a custom type name between the index and the id. If your application or tooling still builds typed URLs like that, those are what break on 8 and 9 and need cleaning up. Mapping bodies must also not include a _type key.
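To make the distinction concrete, here is a rough sketch that classifies a document path as typeless or typed. It is a hypothetical helper for auditing URLs your application builds, it only understands the /<index>/<segment>/<id> shape, and it assumes (as was the rule for custom types) that a type name never starts with an underscore.

```shell
#!/usr/bin/env bash
# Sketch only: tell the current typeless document URL apart from the
# typed form removed in 8.0. Handles /<index>/<segment>/<id> paths only.
classify_doc_path() {
  local path="$1"
  # Endpoint names such as _doc sit where the type name used to be.
  local typeless_re='^/[^/_][^/]*/(_doc|_create|_update|_source)/[^/]+$'
  # Anything else in that position that does not start with "_" is a type.
  local typed_re='^/[^/_][^/]*/[^/_][^/]*/[^/]+$'
  if [[ "$path" =~ $typeless_re ]]; then
    echo "typeless"
  elif [[ "$path" =~ $typed_re ]]; then
    echo "typed"
  else
    echo "other"
  fi
}
```

Running the helper over the paths found in application logs or a proxy access log is one way to find callers that still need cleaning up before an upgrade.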
The pattern described in Elastic's Reindex Is Coming blog post (which announced the API in 2.3) is exactly what's documented above. The API has gained slicing, throttling, and better task reporting since, but the contract is the same.
Common mistakes
The bugs I've shipped or seen in code review on this pattern.
Application queries the index name directly, not the alias. This is the prerequisite, and the most common reason teams can't do zero-downtime reindexes. Step zero is finding every place in the application that names an index and switching them to the alias. Once you have the alias working in production, the rest of this pattern is trivial.
Forgetting to set wait_for_completion=false. On a large index _reindex can take minutes or hours. Without wait_for_completion=false, curl will sit there until either the reindex finishes or the proxy/LB in front of Elasticsearch kills the connection. Run it as a task and poll.
Skipping the count verification before the alias swap. Together with the failures list in the finished task, this is what catches a botched mapping where some documents failed to reindex (because a value didn't fit the new type, for example). Always verify counts before swapping.
Deleting the old index immediately after the swap. Treat the old index as your rollback target. Keep it for at least a day. Disk is cheap. The next time the new mapping turns out to be subtly wrong, you'll be glad you waited.
Not pausing writes (or not making them idempotent) during reindex. If your application is writing to the alias during the reindex, those writes land on the OLD index (because the alias still points there). When the alias swaps, those writes still exist on the old index but not on the new one. Two options: pause writes during reindex (acceptable for short windows), or make writes idempotent and replay the deltas from your source of truth after the swap. For a search index built from a primary store (Postgres, MySQL, Kafka log), the second option is straightforward.
Reindexing into an index with replicas turned on. Every write has to be acknowledged by the replica nodes, which slows the reindex noticeably. Set number_of_replicas: 0 during reindex, bump back up before the swap.
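For the replay option mentioned under the writes-during-reindex mistake above, a minimal sketch is to re-send the changed documents through the _bulk API with explicit ids, so that replaying the same document twice simply overwrites it. The helper names are hypothetical, the document source must be single-line JSON, and the id is not JSON-escaped here, so this assumes plain ids such as numeric primary keys.

```shell
#!/usr/bin/env bash
# Sketch only: build idempotent _bulk "index" actions for a delta replay.
# Helper names are illustrative assumptions.

# Emit the two NDJSON lines _bulk expects for one document: an action line
# with an explicit _id, then the document source on a single line.
bulk_index_lines() {
  local index="$1" id="$2" source_json="$3"
  printf '{"index":{"_index":"%s","_id":"%s"}}\n%s\n' "$index" "$id" "$source_json"
}

# Send a prepared NDJSON file to the cluster. --data-binary preserves the
# newlines that the _bulk format depends on.
replay_delta() {
  local host="$1" ndjson_file="$2"
  curl -sk -u "${ES_AUTH:-elastic:changeme}" -X POST "$host/_bulk" \
    -H 'Content-Type: application/x-ndjson' \
    --data-binary "@$ndjson_file"
}
```

In practice you would loop over the rows changed since the reindex started (selected by an updated-at column or a log offset in your source of truth), append bulk_index_lines output for each to a file, and pass that file to replay_delta once the alias points at the new index.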
Frequently asked questions
See also
- Elasticsearch Cheat Sheet: the full reference for index operations, search DSL, aggregations, and vector / kNN search
- How to Build RAG with Embeddings and Vector Search: the workflow that most often forces a mapping change (adding a dense_vector field for semantic retrieval)
- How to Add Semantic Search to a MySQL App: the lighter-weight alternative when the corpus is small enough to live in your application database
- MySQL Cheat Sheet: the source-of-truth side of a typical Elasticsearch deployment, where the documents originate before being indexed
External references: Changing Mapping with Zero Downtime (the Elastic blog post that introduced this pattern in 2014), Reindex Is Coming (the announcement of the _reindex API in 2.3), and the official Reindex API documentation for parameter details on slicing and throttling.





