TechEarl

How to Use Regex in .htaccess (Apache mod_rewrite)

Use regex in .htaccess with Apache mod_rewrite: how RewriteRule and RewriteCond patterns work, the per-directory quirk that breaks everyone, and copy-paste rules for HTTPS, www, trailing slashes, 301s, clean URLs, and access blocking.

Ishan Karunaratne · 17 min read

Almost everything useful in an Apache .htaccess file runs on regex. RewriteRule takes a regex. RewriteCond takes a regex. RedirectMatch takes a regex. If you have ever pasted a redirect rule from a forum, watched it do nothing, and had no idea why, the answer is almost always that you did not understand what the regex was actually being matched against.

This article fixes that. I cover how mod_rewrite regex works, the one quirk that breaks more rules than anything else, and then a copy-paste pattern for every common job: HTTPS, www, trailing slashes, 301 redirects, clean URLs, blocking bad traffic.

The quirk that breaks most rules: the per-directory path

This is the single most important thing on the page, so it goes first.

When mod_rewrite runs inside a .htaccess file, the string your RewriteRule regex matches against is the request path relative to the directory the .htaccess file sits in, with the leading slash removed.

A request for https://example.com/blog/post-1 evaluated by a .htaccess in the document root does not match against /blog/post-1. It matches against blog/post-1. No leading slash.

This is why a rule copied from an Apache <VirtualHost> example (server context, where the path does have a leading slash) silently fails in .htaccess. A pattern like ^/blog/(.*)$ will never match in .htaccess because the string never starts with a slash there.

The fix is to write .htaccess patterns without a leading slash:

```apache
# WRONG in .htaccess: the leading slash never matches
RewriteRule ^/blog/(.*)$ /articles/$1 [R=301,L]

# RIGHT in .htaccess: no leading slash in the pattern
RewriteRule ^blog/(.*)$ /articles/$1 [R=301,L]
```

The substitution (the second argument) is different: a substitution that starts with a slash, or a full https:// URL, is treated as a real path or URL. Only the pattern loses its leading slash. Keep that distinction in your head and half of all .htaccess confusion disappears.
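As a sketch of the same idea one level down, assume a second .htaccess file lives in a /blog/ directory (a hypothetical layout, not one of the examples above). The pattern there is matched against the path relative to /blog/, so the blog/ prefix disappears as well:

```apache
# Hypothetical file: /blog/.htaccess
# Request: https://example.com/blog/post-1
# String the pattern sees here: post-1  (not blog/post-1, not /blog/post-1)
RewriteEngine On
RewriteRule ^post-1/?$ /articles/post-1 [R=301,L]
```

The substitution still starts with a slash, so the redirect target is /articles/post-1 from the site root, regardless of which directory the rule lives in.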

How RewriteRule regex works

A RewriteRule has three parts:

```apache
RewriteRule  PATTERN  SUBSTITUTION  [FLAGS]
```

  • PATTERN is a PCRE regular expression matched against the per-directory path described above.
  • SUBSTITUTION is what the path becomes. It can be a path, a full URL, or - (meaning "do not change the path, just apply the flags").
  • FLAGS in square brackets control behavior: status code, case sensitivity, whether to stop processing, and more.

Capture groups in the pattern become backreferences in the substitution: $1 is the first (...) group, $2 the second, up to $9.

```apache
# (.*) captures everything after "products/"; $1 puts it back
RewriteRule ^products/(.*)$ /shop/$1 [R=301,L]
```

mod_rewrite uses PCRE, the same regex engine as PHP's preg_* functions. Every token from the regex cheat sheet works here: character classes, quantifiers, anchors, alternation, lookarounds. The anchors ^ and $ matter more here than almost anywhere else, because an unanchored pattern matches a substring and will fire on URLs you did not intend.
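A minimal sketch of multiple capture groups and anchoring, assuming a hypothetical archive URL scheme (/archive/2024/05 moving to /posts/2024-05) that is not part of the examples above:

```apache
RewriteEngine On
# $1 = four-digit year, $2 = two-digit month
# ^ and $ keep the rule from firing on longer paths such as archive/2024/05/extra
RewriteRule ^archive/([0-9]{4})/([0-9]{2})/?$ /posts/$1-$2 [R=301,L]
```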

How RewriteCond regex works

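RewriteRule only sees the path. To make a decision based on anything else (the hostname, the protocol, the query string, the user-agent), you put one or more RewriteCond lines directly above the RewriteRule. The rule's substitution is applied only if its pattern matches and every condition passes. (Apache tests the rule's pattern first and the conditions second, which is why a condition can refer to the rule's $1.)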

```apache
RewriteCond  TEST-STRING  CONDITION-PATTERN  [FLAGS]
RewriteRule  PATTERN      SUBSTITUTION       [FLAGS]
```

The TEST-STRING is usually a server variable in %{...} form. The ones you will use most:

  • %{HTTP_HOST} holds the hostname from the request, such as www.example.com.
  • %{HTTPS} holds on or off.
  • %{REQUEST_URI} holds the path portion of the URL, with the leading slash.
  • %{QUERY_STRING} holds everything after the ?.
  • %{HTTP_USER_AGENT} holds the browser or bot identification string.
  • %{HTTP_REFERER} holds the page that linked to this request.
  • %{REMOTE_ADDR} holds the client IP address.
  • %{REQUEST_FILENAME} holds the full filesystem path Apache mapped the request to.

Capture groups inside a RewriteCond pattern are backreferenced with %1 to %9 (percent, not dollar). Capture groups inside the RewriteRule pattern stay $1 to $9. Mixing those two up is a classic bug:

```apache
# %1 here is the captured group from the RewriteCond above
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
```

In that example %1 is whatever the host regex captured (the domain without www.) and $1 is whatever the rule regex captured (the requested path). Two different sources, two different prefixes.

The flags that actually matter

Flags go in square brackets after the substitution, comma-separated. The ones worth memorizing:

| Flag | Effect |
| --- | --- |
| [L] | Last. Stop processing further rules if this one matched. Use it on almost every rule. |
| [R=301] | Redirect with this HTTP status. 301 is a permanent redirect, 302 temporary. Without a number it defaults to 302. |
| [NC] | No case. Makes the pattern case-insensitive. |
| [QSA] | Query String Append. Keeps the original ?query and adds it to the substitution. |
| [NE] | No Escape. Stops Apache from URL-encoding special characters in the substitution. |
| [F] | Forbidden. Return a 403 and stop. Used for blocking. |
| [END] | Like [L] but stronger: stops the rewrite engine completely, even across .htaccess re-runs. Apache 2.4+. |

For a permanent move you almost always want [R=301,L]. For an internal rewrite the visitor should not see, you want [L] alone (no R).

One caveat on [L] in .htaccess: it stops the current rewrite pass, but an internal rewrite can cause Apache to run the whole .htaccess ruleset again from the top. If a rule keeps re-triggering across those re-runs, use [END] instead. [END] stops the rewrite engine completely and is the safer choice when a rule must run exactly once.
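To illustrate how flags combine, here is a sketch of an internal rewrite that uses [QSA] and [END]. The script name product.php and its id parameter are hypothetical:

```apache
RewriteEngine On
# /product/42?ref=mail is served internally by /product.php?id=42&ref=mail
# QSA keeps the visitor's ?ref=mail; without it the original query string
# is dropped because the substitution supplies its own
# END stops all further rewriting, so the rewritten /product.php request
# is not run through the ruleset again (Apache 2.4+)
RewriteRule ^product/([0-9]+)/?$ /product.php?id=$1 [QSA,END]
```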

Force HTTPS

Problem: visitors hitting http:// get an insecure connection, and search engines see two versions of every page.

Solution: redirect every HTTP request to the HTTPS equivalent.

```apache
RewriteEngine On
RewriteCond %{HTTPS} off
# Replace example.com with your own domain
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]
```

The RewriteCond checks that HTTPS is currently off. The RewriteRule captures the whole path with ^(.*)$ and rebuilds the URL on https://. If you are behind a proxy or load balancer that terminates TLS, %{HTTPS} may always read off; in that case test %{HTTP:X-Forwarded-Proto} against !https instead.
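A sketch of that proxy variant, assuming the proxy or load balancer sets an X-Forwarded-Proto header on every request (header names vary, so check your provider's documentation) and that example.com stands in for your domain:

```apache
RewriteEngine On
# Redirect unless the proxy reports that the client connected over HTTPS
RewriteCond %{HTTP:X-Forwarded-Proto} !https
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]
```

If the proxy does not send the header at all, the condition is always true and every request redirects, so confirm the header is present before deploying this.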

Redirect www to non-www (or the reverse)

Problem: www.example.com and example.com both resolve, splitting your SEO signals and confusing analytics.

Solution, www to non-www:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
```

The condition captures everything after www. into %1. The rule captures the path into $1. Note the escaped dot: \. matches a literal period, while a bare . matches any character. Every literal dot in a host pattern should be escaped this way.

Solution, non-www to www: this is the mirror image. Replace example\.com with your own domain, keeping the dot escaped as \.:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
```

Pick one canonical form and redirect the other. It does not matter which, as long as you are consistent.
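If you also force HTTPS, the two redirects can be folded into one hop. This is a sketch rather than a drop-in rule: it assumes example.com (non-www) is the canonical host, and it uses the [OR] flag on RewriteCond, which is not covered elsewhere in this article. Conditions are ANDed by default; [OR] makes the rule run when either one matches:

```apache
RewriteEngine On
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} ^www\. [NC]
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]
```

A visitor on http://www.example.com/page then reaches https://example.com/page in a single redirect instead of two.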

Add or remove the trailing slash

Problem: /about and /about/ serve the same content at two URLs.

Solution, force a trailing slash (skip real files, which should not get one):

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(/$|\.)
RewriteRule ^(.*)$ /$1/ [R=301,L]
```

The first condition skips anything that maps to a real file (!-f means "not a file"). The second skips paths that already end in a slash or contain a dot (a file extension). What is left gets a slash appended.

Solution, remove the trailing slash:

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [R=301,L]
```

Here !-d skips real directories (which legitimately end in a slash), and the pattern ^(.*)/$ captures everything before a trailing slash and rebuilds the path without it.

Redirect an old URL to a new one

Problem: you moved or renamed a page and the old URL is indexed, bookmarked, and linked from other sites.

Solution, a single page:

```apache
RewriteEngine On
# Replace old-page and /new-page with your own paths
RewriteRule ^old-page/?$ /new-page [R=301,L]
```

The /?$ at the end makes the trailing slash optional, so both /old-page and /old-page/ are caught. Note the RewriteRule pattern has no leading slash, because in .htaccess the path is matched relative to the directory (see the quirk section above). For a one-to-one move with no pattern matching, the simpler mod_alias Redirect directive also works and needs no RewriteEngine. Here the path does start with a slash, because Redirect matches against the full URL path:

```apache
Redirect 301 /old-page /new-page
```

Solution, a whole section where the path structure is preserved:

```apache
RewriteEngine On
RewriteRule ^blog/(.*)$ /articles/$1 [R=301,L]
```

Every URL under /blog/ lands at the same path under /articles/. (.*) captures the remainder and $1 puts it back.
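The introduction mentions RedirectMatch, which can do the same job without mod_rewrite. A sketch of the equivalent follows; like Redirect, it belongs to mod_alias and matches against the full URL path, so the pattern keeps its leading slash:

```apache
# mod_alias equivalent of the blog-to-articles rule; no RewriteEngine needed
RedirectMatch 301 ^/blog/(.*)$ /articles/$1
```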

Remove the file extension from URLs

Problem: your URLs expose .php (or .html), which looks dated and ties the URL to the implementation.

Solution: serve /about but have Apache load /about.php internally, and redirect anyone who requests the extension directly.

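```apache
RewriteEngine On

# Redirect direct .php requests to the clean URL
# (replace php with html or whatever extension you use)
RewriteCond %{THE_REQUEST} \s/+(.+?)\.php[\s?] [NC]
RewriteRule ^ /%1 [R=301,L]

# Internally serve the .php file for the clean URL
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+?)/?$ $1.php [L]
```

The first block uses %{THE_REQUEST}, the raw HTTP request line as the client sent it (for example GET /about.php HTTP/1.1). It never changes during rewriting. %{REQUEST_URI} does: after the second block rewrites /about to /about.php, the ruleset runs again with the rewritten path, the redirect would match it, and you would get a loop. The lazy (.+?) stops at the first .php, so a .php that appears later in the query string is not swallowed into the capture. The second block checks that a file with the extension actually exists (-f) before rewriting to it, so missing pages still 404 correctly.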

Pretty URLs: route everything to a front controller

Problem: a custom PHP application needs every request that is not a real file or directory to go to index.php so the application can route it.

Solution:

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [L]
```

The two conditions say "if the request is not a real file and not a real directory". The rule then sends everything else to index.php as an internal rewrite (no R flag, so the browser URL does not change). This is essentially the pattern WordPress, Laravel, and most PHP frameworks ship in their default .htaccess.

Redirect an entire domain

Problem: you are migrating to a new domain and need every URL to follow.

Solution. Replace example\.com with your old domain (dot escaped as \.) and newsite.com with the new one:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteRule ^(.*)$ https://newsite.com/$1 [R=301,L]
```

The (www\.)? makes the www. optional so both forms of the old domain are caught. Both literal dots in the host pattern are escaped as \.. Every path is preserved through $1. After the move, confirm the new domain resolves and is configured correctly: I run every migrated domain through the DNS Inspector at dnschkr.com, and the wider process is covered in my DNS health check walkthrough.

Block by user-agent, referer, or IP

Problem: a scraper, a bad bot, or a spam referer is hammering the site.

Solution, block a user-agent:

```apache
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (badbot|scraperthing|evilcrawler) [NC]
RewriteRule ^ - [F]
```

The condition matches any of the listed strings (alternation with |) anywhere in the user-agent. The rule substitution - means "do not rewrite the path", and [F] returns a 403. Keep the list specific: matching bot alone would block Googlebot.

Solution, block hotlinking (other sites embedding your images). Replace example\.com with your domain, dot escaped:

```apache
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
RewriteRule \.(jpg|jpeg|png|gif|webp)$ - [F,NC]
```

The first condition allows an empty referer (direct visits, some privacy tools). The second allows requests that came from your own domain. Anything else requesting an image file gets a 403.

Solution, block an IP: this does not need regex at all. In Apache 2.4 a Require not directive cannot stand alone: it must sit inside a <RequireAll> block alongside a positive requirement, or Apache rejects it.

```apache
<RequireAll>
    Require all granted
    Require not ip 203.0.113.45
</RequireAll>
```

Match on the query string

Problem: you need to redirect or block based on something after the ?, which RewriteRule cannot see (it only matches the path).

Solution: match the query string with a RewriteCond against %{QUERY_STRING}, then use the captured value in the rule.

```apache
# Redirect /search?q=something to /find/something
RewriteEngine On
RewriteCond %{QUERY_STRING} ^q=(.+)$
RewriteRule ^search/?$ /find/%1? [R=301,L]
```

The %1 is the captured query value. The trailing ? on the substitution is deliberate: it discards the original query string so it is not appended again. Without it you would get /find/something?q=something.
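On Apache 2.4 the [QSD] flag (Query String Discard) does the same job as the trailing ? and is arguably easier to read. A sketch of the same redirect written that way:

```apache
RewriteEngine On
RewriteCond %{QUERY_STRING} ^q=(.+)$
# QSD drops the original ?q=... so it is not appended to /find/...
RewriteRule ^search/?$ /find/%1 [QSD,R=301,L]
```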

Quirks and gotchas

The things that cost people an afternoon.

RewriteEngine On is required. None of the Rewrite* directives do anything until the engine is switched on. It only needs to appear once per .htaccess.

.htaccess must be allowed. The server needs AllowOverride All (or at least AllowOverride FileInfo) for the directory in its main config. On many managed hosts this is already set; on your own server it is not the default in Apache 2.4. If your .htaccess is being ignored entirely, this is the first thing to check.

Infinite redirect loops. A rule that redirects to a URL that also matches the rule will loop until the browser gives up. The fix is a RewriteCond that excludes the destination, or matching %{THE_REQUEST} (the original request line) instead of %{REQUEST_URI} (which reflects earlier rewrites).
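A sketch of the first fix, using a hypothetical /shop/ section that is being collapsed into a single /shop/sale page. Without the condition, the destination itself matches ^shop/ and the redirect loops:

```apache
RewriteEngine On
# Skip the destination so the redirect does not match its own target
RewriteCond %{REQUEST_URI} !^/shop/sale/?$
RewriteRule ^shop/ /shop/sale [R=301,L]
```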

RewriteBase. When a substitution is a relative path and the .htaccess is not in the document root, Apache can guess the base path wrong. Setting RewriteBase / (or the correct subdirectory) makes it explicit.
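A sketch, assuming the application lives in a hypothetical /app/ subdirectory with its own .htaccess:

```apache
# Hypothetical file: /app/.htaccess
RewriteEngine On
RewriteBase /app/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Relative substitution: resolved against RewriteBase, so this is /app/index.php
RewriteRule ^ index.php [L]
```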

Escape the dot. In a regex, . matches any character. A literal period is \.. example.com as a pattern technically matches exampleXcom too. That rarely causes trouble in host matching, but escape it there anyway, and always escape it in file-extension patterns, where it absolutely matters.

Anchor your patterns. An unanchored pattern matches a substring. A rule with the unanchored pattern old fires on /folder/old-stuff as well as /old. Use ^ and $ unless you have a specific reason not to. See regex anchors for why this matters so much in validation and rewriting.

.htaccess has a performance cost. Apache reads and parses every .htaccess file in the path of every request. On your own server, moving the rules into the main <VirtualHost> config and setting AllowOverride None is faster. .htaccess is for when you cannot edit the main config, which is most shared hosting.
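A sketch of what that move looks like, with a hypothetical server name and document root. In server context the pattern is matched against the full URL path, so the leading slash comes back:

```apache
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/example

    <Directory /var/www/example>
        # Stop Apache from looking for .htaccess files under this tree
        AllowOverride None
        Require all granted
    </Directory>

    RewriteEngine On
    # Server context: the pattern DOES start with a slash
    RewriteRule ^/blog/(.*)$ /articles/$1 [R=301,L]
</VirtualHost>
```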

Order matters. Rules run top to bottom. A rule with [L] stops the cascade. Put your most specific rules first and your catch-all front-controller rule last.

Test before you break production

A bad .htaccess can take a site down with a 500 error. Test deliberately.

  • Syntax check locally. apachectl configtest validates the main config. .htaccess itself is parsed per request, so the real test is loading a page.
  • Turn on rewrite logging. In Apache 2.4, add LogLevel alert rewrite:trace3 to the server or virtual-host config. The error log then shows, step by step, what each rule matched and rewrote. This is the single best debugging tool for mod_rewrite.
  • Test the regex in isolation. Paste just the pattern into regex101.com with the PCRE flavor selected, and test it against the per-directory path string (no leading slash).
  • Keep a backup. Before editing, copy the working file. If a rule causes a 500, restoring the backup is faster than debugging under pressure.
  • Use a staging copy. Apply rules on a staging site or a test path first whenever the change is non-trivial.

See also

External references: the Apache mod_rewrite documentation is the authoritative source for directive syntax and flags. The Apache RewriteRule flags reference lists every flag. Test patterns interactively at regex101.com with the PCRE flavor selected.
