The practical regex for matching a domain name: ^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$. It accepts example.com, sub.example.co.uk, and mail.example-domain.io while rejecting example (no TLD), -example.com (leading hyphen), and example.c (TLD too short). For Internationalised Domain Names like 例え.テスト, match the punycode form (xn--r8jz45g.xn--zckzah): encoded labels fit the same character class and the same hyphen rules, but an encoded TLD such as xn--zckzah contains digits and hyphens, so the letters-only TLD class has to be widened before it is accepted. Below I walk through the basic pattern, the strict RFC 1035 form with full length checks, runnable code in JavaScript, Python, and PHP, engine notes, the common bugs, and the case where the right call is to skip regex and run a DNS check.
The reason this comes up so often is that domain validation lives in two places: form inputs (where users type their company domain) and log scanning (where you extract every domain mentioned in a stream of text). The same pattern works for both, with anchors for full-string match in forms and unanchored in scans.
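As a rough sketch of the two uses, the Python below applies the pattern once anchored (form validation) and once unanchored (log scanning). The names LABEL, FORM_RE, SCAN_RE, and find_domains are hypothetical, and the lookarounds that replace the anchors in the scanning version are an assumption about what "unanchored" needs in practice, so that a match cannot start or stop in the middle of a longer token.

```python
import re

# One DNS label: starts and ends with a letter or digit, at most 63 characters.
LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"

# Form validation: anchored, the whole input must be a domain.
FORM_RE = re.compile(rf"^(?:{LABEL}\.)+[a-zA-Z]{{2,63}}$")

# Log scanning: no anchors. The lookarounds (an assumption of this sketch)
# keep a match from starting or ending inside a longer token.
SCAN_RE = re.compile(
    rf"(?<![a-zA-Z0-9.-])(?:{LABEL}\.)+[a-zA-Z]{{2,63}}(?![a-zA-Z0-9-])"
)


def find_domains(text: str) -> list[str]:
    """Return every domain-shaped substring, in order of appearance."""
    return [m.group(0) for m in SCAN_RE.finditer(text)]


line = "GET https://mail.example.co.uk/status from host-7.internal.example.com, code 200"
print(find_domains(line))
print(bool(FORM_RE.fullmatch("example.com")))
```

Scanning is looser than validating: anything shaped like name.letters is reported, so a file name such as index.html also comes back as a hit and may need filtering afterwards.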
Quick reference
The practical pattern, ready to paste:
^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$
The same pattern with the 255-character total length enforced:
^(?=.{1,255}$)(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$
Single-label hostname (e.g., localhost, printer-01):
^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?$
The practical pattern
^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$
Breaking it down:
- ^ and $ anchor to the full string for form validation. Drop them for log scanning.
- [a-zA-Z0-9] means a label cannot start with a hyphen.
- (?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? allows middle characters including hyphens, but the last character cannot be a hyphen. Together with the first-character rule, this enforces "no leading or trailing hyphen". The 61 limits the middle so the total label length stays at most 63.
- \. is a literal dot.
- The whole label pattern is repeated with + so you can have multiple labels (mail.example.co.uk).
- [a-zA-Z]{2,63} is the TLD: letters only, between 2 and 63 characters. Modern TLDs include .museum, .london, .amazon.
This pattern rejects all-numeric TLDs (which don't exist in real DNS) and localhost (single-label). To accept single-label hostnames for internal use, see the variant in the Quick reference above.
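The breakdown above can be written straight into the pattern using Python's re.VERBOSE flag, which ignores whitespace and allows comments. This is only a sketch: PRACTICAL_VERBOSE is a hypothetical name, and the regex is meant to be the same practical pattern, just spread over several lines.

```python
import re

PRACTICAL_VERBOSE = re.compile(
    r"""
    ^                              # start of string
    (?:                            # one label followed by a dot
        [a-zA-Z0-9]                #   first character: never a hyphen
        (?:[a-zA-Z0-9-]{0,61}      #   up to 61 middle characters, hyphens allowed
           [a-zA-Z0-9])?           #   last character: never a hyphen
        \.                         #   literal dot
    )+                             # one or more labels
    [a-zA-Z]{2,63}                 # TLD: letters only, 2 to 63 characters
    $                              # end of string
    """,
    re.VERBOSE,
)

for candidate in ("example.com", "-example.com", "a" * 64 + ".com", "example.123"):
    print(candidate[:20], bool(PRACTICAL_VERBOSE.fullmatch(candidate)))
```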
The strict pattern (RFC 1035 length limits)
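RFC 1035 imposes two length rules: each label is at most 63 octets, and the entire domain name is at most 255 octets. The 255 figure counts the wire encoding, which spends one length octet per label plus a terminating zero octet, so the dotted text form of a valid name tops out at 253 characters; the {1,255} bound used below is slightly generous. Regex can enforce the per-label limit naturally; the total length is easier to check with a lookahead.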
^(?=.{1,255}$)(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$
The added piece is (?=.{1,255}$), a lookahead that asserts the entire string is between 1 and 255 characters. Combined with the rest, this enforces both per-label and total-length limits.
For more on lookahead patterns like this, see how to use regex lookaheads and lookbehinds.
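To see the two limits act separately, here is a small Python sketch. The names PRACTICAL_RE, STRICT_RE, and strict_without_lookahead are hypothetical; the last one shows the same rule with the length check moved into code, which is an assumption about how you would port it to an engine without lookahead.

```python
import re

LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
PRACTICAL_RE = re.compile(rf"^(?:{LABEL}\.)+[a-zA-Z]{{2,63}}$")
STRICT_RE = re.compile(rf"^(?=.{{1,255}}$)(?:{LABEL}\.)+[a-zA-Z]{{2,63}}$")


def strict_without_lookahead(value: str) -> bool:
    """Same rule as STRICT_RE, with the total-length check done in code."""
    return len(value) <= 255 and PRACTICAL_RE.fullmatch(value) is not None


ok_name = ".".join(["a" * 63] * 3) + ".com"    # 195 characters, every label legal
too_long = ".".join(["a" * 63] * 4) + ".com"   # 259 characters, every label legal
fat_label = "a" * 64 + ".com"                  # short overall, one 64-character label

for name in (ok_name, too_long, fat_label):
    print(len(name), bool(PRACTICAL_RE.fullmatch(name)), bool(STRICT_RE.fullmatch(name)))
```

Only the 259-character name separates the two patterns: the practical pattern accepts it because each label is legal, while the lookahead rejects it on total length.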
Subdomains and depth control
If you want to limit how deep the subdomain tree can go (for example, allow only something.example.com, not a.b.c.example.com):
Exactly one subdomain: ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.[a-zA-Z]{2,63}$
Up to 2 subdomains (1 to 3 labels before the TLD): ^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.){1,3}[a-zA-Z]{2,63}$
Just root domain (no subs): ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.[a-zA-Z]{2,63}$
The {1,3} quantifier on the label group controls how many "name." segments precede the TLD. Use this when your application has a specific subdomain policy.
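If the policy varies per application, the quantifier can be generated rather than hand-written. The sketch below is one way to do it in Python; depth_limited_pattern is a hypothetical helper, and it assumes the registrable name (the "example" in example.com) counts as one label, so allowing N subdomains means 1 to N + 1 labels before the TLD.

```python
import re

LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"


def depth_limited_pattern(max_subdomains: int) -> re.Pattern:
    """Build the domain pattern allowing at most max_subdomains subdomains."""
    if max_subdomains < 0:
        raise ValueError("max_subdomains must be zero or greater")
    # One label for the registrable name, plus one per permitted subdomain.
    max_labels = max_subdomains + 1
    return re.compile(rf"^(?:{LABEL}\.){{1,{max_labels}}}[a-zA-Z]{{2,63}}$")


one_sub = depth_limited_pattern(1)
for name in ("example.com", "www.example.com", "a.b.example.com"):
    print(name, bool(one_sub.fullmatch(name)))
```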
Internationalised domain names (IDN / punycode)
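A domain like 例え.テスト is internationalised. DNS itself does not speak Unicode, so internationalised domains are encoded as punycode (xn--r8jz45g.xn--zckzah) for the wire format. The good news: punycode is ASCII-safe, so every encoded label fits the label part of the same regex. The catch is the TLD: an encoded TLD such as xn--zckzah contains digits and hyphens, which the letters-only [a-zA-Z]{2,63} class rejects. One way to handle it is to widen the TLD with an xn-- alternative:
^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+(?:[a-zA-Z]{2,63}|xn--[a-zA-Z0-9-]{0,58}[a-zA-Z0-9])$
Try it against xn--r8jz45g.xn--zckzah. It matches because punycode uses only a-z, 0-9, and hyphen. The unmodified practical pattern still accepts punycode labels under an ASCII TLD, such as xn--r8jz45g.com.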
If you need to accept the Unicode form directly in user input (and convert to punycode later), Python has the third-party idna package (idna.encode()), Node has url.domainToASCII(), and PHP has idn_to_ascii(). Run the input through the IDN encoder first, then validate the punycode result with the regex.
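A minimal sketch of the encode-then-validate flow, using only the Python standard library. Two assumptions to flag: it uses the built-in "idna" codec (which implements the older IDNA 2003 rules, unlike the third-party idna package), and IDN_AWARE_RE adds an xn-- alternative to the TLD, because the letters-only TLD class rejects an encoded TLD. The names to_ascii and is_valid_idn are hypothetical.

```python
import re
from typing import Optional

LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
PRACTICAL_RE = re.compile(rf"^(?:{LABEL}\.)+[a-zA-Z]{{2,63}}$")
# Assumption of this sketch: the TLD may also be a punycode label (xn--...).
IDN_AWARE_RE = re.compile(
    rf"^(?:{LABEL}\.)+(?:[a-zA-Z]{{2,63}}|[xX][nN]--[a-zA-Z0-9-]{{0,58}}[a-zA-Z0-9])$"
)


def to_ascii(value: str) -> Optional[str]:
    """Encode a possibly-Unicode domain to its ASCII form, or None on failure."""
    try:
        return value.encode("idna").decode("ascii")
    except UnicodeError:
        return None


def is_valid_idn(value: str) -> bool:
    """Encode first, then validate the ASCII result."""
    ascii_form = to_ascii(value)
    if ascii_form is None or len(ascii_form) > 255:
        return False
    return IDN_AWARE_RE.fullmatch(ascii_form) is not None


encoded = to_ascii("例え.テスト")
print(encoded)
print(bool(PRACTICAL_RE.fullmatch(encoded)), bool(IDN_AWARE_RE.fullmatch(encoded)))
```

The last line prints one result per pattern, which makes the effect of the widened TLD class visible on the same encoded input.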
Examples in JavaScript, Python, and PHP
JavaScript:
const domainPattern = /^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$/;

function isValidDomain(input) {
  // Total-length check in code, so the pattern itself needs no lookahead
  if (input.length > 255) return false;
  return domainPattern.test(input);
}

isValidDomain("example.com"); // true
isValidDomain("mail.example.co.uk"); // true
isValidDomain("xn--r8jz45g.com"); // true (punycode label under an ASCII TLD)
isValidDomain("xn--r8jz45g.xn--zckzah"); // false (the punycode TLD contains digits and hyphens)
isValidDomain("-example.com"); // false (leading hyphen)
isValidDomain("example"); // false (no TLD)

// For Unicode input, encode first
const punycode = require("url").domainToASCII("例え.com"); // "xn--r8jz45g.com"
isValidDomain(punycode); // true
Python:
import re

import idna  # third-party package: pip install idna

DOMAIN_RE = re.compile(
    r"^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$"
)

def is_valid_domain(value: str) -> bool:
    if len(value) > 255:
        return False
    # fullmatch, because $ on its own also matches just before a trailing newline
    return bool(DOMAIN_RE.fullmatch(value))

is_valid_domain("example.com")  # True
is_valid_domain("mail.sub.co.uk")  # True
is_valid_domain("example..com")  # False (empty label)

# For Unicode input, use the idna package
punycode = idna.encode("例え.com").decode("ascii")  # "xn--r8jz45g.com"
is_valid_domain(punycode)  # True
PHP:
function isValidDomain(string $value): bool {
    if (strlen($value) > 255) return false;
    // D modifier: $ matches only at the very end, not before a trailing newline
    $pattern = '/^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$/D';
    return (bool) preg_match($pattern, $value);
}

isValidDomain("example.com"); // true
isValidDomain("xn--r8jz45g.com"); // true (punycode label under an ASCII TLD)
isValidDomain("xn--r8jz45g.xn--zckzah"); // false (the punycode TLD contains digits and hyphens)

// For Unicode input
$punycode = idn_to_ascii("例え.com", IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46); // "xn--r8jz45g.com"
isValidDomain($punycode); // true
Engine compatibility
The strict-length variant uses a lookahead (?=.{1,255}$). Most engines support it; a few do not. The practical pattern is lookahead-free and runs everywhere.
| Engine | Practical pattern | Strict (lookahead) | IDN encoder in stdlib |
|---|---|---|---|
| JavaScript | Works | Works | url.domainToASCII (Node), or the punycode package |
| Python (re) | Works | Works | encodings.idna or third-party idna |
| PHP (PCRE) | Works | Works | idn_to_ascii (intl extension) |
| Java | Works | Works | java.net.IDN.toASCII |
| .NET | Works | Works | System.Globalization.IdnMapping |
| Go (RE2) | Works | Not supported | golang.org/x/net/idna |
| Rust (regex crate) | Works | Not supported | idna crate |
| Ruby | Works | Works | Addressable::URI (gem) |
| POSIX ERE (grep -E) | Works once (?:...) is rewritten as (...); no \d/\w | Not supported | None |
For Go and Rust, where lookahead is unavailable, enforce the total 255-character length in code as a separate check alongside the regex (the JavaScript example above does this before the pattern runs).
When to skip regex and use a DNS check
A regex confirms the string is shaped like a domain. It does not confirm the domain exists, resolves, or points anywhere. For applications where that matters (webhook URLs, OAuth callbacks, transactional email domains), pair the regex with a DNS lookup.
JavaScript (Node):
const dns = require("node:dns").promises;

async function domainExists(domain) {
  if (!isValidDomain(domain)) return false;
  try {
    // resolve() queries A records unless a record type is passed
    await dns.resolve(domain);
    return true;
  } catch {
    return false;
  }
}
Python:
import socket

def domain_exists(domain: str) -> bool:
    if not is_valid_domain(domain):
        return False
    try:
        # getaddrinfo covers IPv4 and IPv6; gethostbyname is IPv4-only
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False

For deeper checks (does this domain have valid MX records, DNSSEC, working SPF and DKIM, a current SSL certificate), the DNS Inspector at dnschkr.com runs 25+ automated tests against a domain in one request and returns a health score. It's also what I cover in detail in the DNS health check walkthrough on this site.
Common mistakes
The bugs I see most often, with the fix for each.
Allowing labels to start or end with a hyphen. A naive [a-zA-Z0-9-]+ accepts -bad.com and bad-.com. The practical pattern in this article uses a first-character class without -, a middle class with -, and a last-character class without -, which enforces the rule.
Accepting all-numeric labels in the TLD position. 123.456 parses as "shape OK" but no real TLD is numeric. The pattern uses [a-zA-Z]{2,63} for the TLD, which excludes digits.
Capping the TLD at 4 or 6 characters. Old patterns used [a-zA-Z]{2,4} which rejects .museum, .amazon, .travel, and .london. Use {2,63} to match the RFC limit.
Forgetting punycode is the wire format. Unicode domains (例え.テスト) never appear in a DNS query. They're encoded to xn--... before the lookup. Validate the encoded form, or run the IDN encoder first and then validate.
Letting IPv4 literals through as domains. A looser pattern that allows digits in the final label accepts 192.168.1.1 as a four-label domain. The practical pattern rejects it, because its TLD must be letters only. If your input might legitimately be an IP literal, route it through the IP regex first (see IPv4 and IPv6 matching).
Treating a regex pass as proof the domain resolves. It doesn't. Pair the regex with a DNS lookup whenever "is this real" matters.
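For the IP-literal case in the list above, one option in Python is to let the standard ipaddress module claim IP literals before the domain regex runs. This is a sketch, not a full host parser: classify_host is a hypothetical name, and the three-way result is an assumption about what a form handler would want back.

```python
import ipaddress
import re

PRACTICAL_RE = re.compile(
    r"^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,63}$"
)


def classify_host(value: str) -> str:
    """Return "ip", "domain", or "invalid" for the contents of a host field."""
    try:
        ipaddress.ip_address(value)  # accepts IPv4 and IPv6 literals
        return "ip"
    except ValueError:
        pass  # not an IP literal, fall through to the domain check
    return "domain" if PRACTICAL_RE.fullmatch(value) else "invalid"


for host in ("192.168.1.1", "2001:db8::1", "example.com", "999.999.1.1"):
    print(host, classify_host(host))
```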
Test cases
| Input | Practical pattern | RFC 1035 strict |
|---|---|---|
| example.com | Match | Match |
| mail.example.co.uk | Match | Match |
| sub.sub.sub.example.com | Match | Match |
| xn--r8jz45g.com | Match | Match |
| xn--r8jz45g.xn--zckzah | No match | No match |
| example-domain.io | Match | Match |
| 123abc.com | Match | Match |
| -example.com | No match | No match |
| example-.com | No match | No match |
| example..com | No match | No match |
| example | No match | No match |
| example.c | No match | No match |
| a.b | No match | No match |
| localhost | No match | No match |
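The plain-ASCII rows of the table can be kept as an executable check, so a later edit to the pattern cannot silently change a verdict. A small Python sketch follows; CASES and failures are hypothetical names, and the expected values are copied from the table.

```python
import re

LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
PRACTICAL_RE = re.compile(rf"^(?:{LABEL}\.)+[a-zA-Z]{{2,63}}$")
STRICT_RE = re.compile(rf"^(?=.{{1,255}}$)(?:{LABEL}\.)+[a-zA-Z]{{2,63}}$")

# (input, should_match), taken from the plain-ASCII rows of the table
CASES = [
    ("example.com", True),
    ("mail.example.co.uk", True),
    ("sub.sub.sub.example.com", True),
    ("example-domain.io", True),
    ("123abc.com", True),
    ("-example.com", False),
    ("example-.com", False),
    ("example..com", False),
    ("example", False),
    ("example.c", False),
    ("a.b", False),
    ("localhost", False),
]


def failures(pattern: re.Pattern) -> list:
    """Return the cases whose actual verdict differs from the expected one."""
    return [
        (text, expected)
        for text, expected in CASES
        if (pattern.fullmatch(text) is not None) != expected
    ]


print(failures(PRACTICAL_RE), failures(STRICT_RE))
```

An empty list for each pattern means every row behaves as the table says.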
See also
- How to Match a URL with Regex: the parent pattern that contains the domain plus protocol, path, query, and fragment
- How to Match an Email Address with Regex: the part after the @ in an email is a domain
- How to Match an IPv4 and IPv6 Address with Regex: the parser fallback when you need to handle both domain names and raw IPs in the same field
- How to Run a DNS Health Check on Your Domain: the wider DNS-validation picture
- Regex Anchors: why ^ and $ matter so much for validation patterns
- Regex Lookaheads and Lookbehinds: how the 255-character lookahead works
- Regex Cheat Sheet: the wider syntax and engine compatibility reference
External references: RFC 1035 section 2.3.1 defines the preferred name syntax, and section 2.3.4 sets the size limits (63 octets per label, 255 octets per name). The dnschkr.com DNS Inspector runs DNS health tests against any domain to verify it actually resolves and is configured correctly. Test the regex interactively at regex101.com.





