TechEarl

Build State and City Location URLs in WordPress (from the Root)

Serve geographic URLs like /california/ and /california/los-angeles/ from the WordPress root: a location CPT, a state/city taxonomy, two add_rewrite_rule patterns mapping path segments to query vars, and pre_get_posts to load the right listings without thin pages.

Ishan Karunaratne · 13 min read

To serve geographic URLs straight from the root, like /california/ for a state and /california/los-angeles/ for a city, you need three pieces working together: a place to store the location data (a location custom post type plus a hierarchical state/city taxonomy), two add_rewrite_rule patterns that capture one or two path segments into query vars, and a pre_get_posts hook that reads those vars and loads the right term and listings. Here is the routing core in isolation, the part most snippets get wrong (the data-model section below folds these same rules into one assembled plugin file):

php
add_action( 'init', function () {
    // Two segments: /california/los-angeles/ -> state + city.
    add_rewrite_rule(
        '^([^/]+)/([^/]+)/?$',
        'index.php?te_state=$matches[1]&te_city=$matches[2]',
        'top'
    );
    // One segment: /california/ -> state only.
    add_rewrite_rule(
        '^([^/]+)/?$',
        'index.php?te_state=$matches[1]',
        'top'
    );
} );

add_filter( 'query_vars', function ( $vars ) {
    $vars[] = 'te_state';
    $vars[] = 'te_city';
    return $vars;
} );

add_action( 'pre_get_posts', function ( $query ) {
    if ( is_admin() || ! $query->is_main_query() ) {
        return;
    }

    $state = $query->get( 'te_state' );
    if ( ! $state ) {
        return; // Not one of our location URLs.
    }

    $city = $query->get( 'te_city' );

    $query->set( 'post_type', 'location' );
    $query->set( 'location', $city ? $city : $state );
    $query->set( 'posts_per_page', 24 );
} );

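That pre_get_posts callback is the whole trick. It runs on the main query, checks whether the request carried a te_state value, and if so rewrites the query to pull location posts filtered to the right taxonomy term. No template-redirect hacks, no second WP_Query, no query_posts(). WordPress builds the page off its own main loop, so the loop, paged queries, and is_paged() keep working as usual. Two caveats: pretty pagination URLs such as /california/page/2/ have three segments and need a rewrite rule of their own, and because WordPress sets its conditional flags before pre_get_posts fires, the request is still flagged as the posts index (is_home()) rather than a taxonomy archive, so the theme's home or index template renders it unless you route the template yourself.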

The rest of this article fills in the data model behind it, explains why root-level segments are the dangerous part, and shows how to keep these pages from being thin SEO filler.

The geo URL rule, plainly

A location directory wants two URL shapes:

  • /[state]/ lists every location in a state.
  • /[state]/[city]/ lists (or shows) the locations in one city.

The reason to serve them from the root, with no /state/ or /locations/ prefix, is that short, keyword-clean URLs read better and tend to do marginally better in search. /california/los-angeles/ is the URL a human would guess. That is the upside.

The catch is that a root-level ^([^/]+)/?$ rule matches literally any single-segment path: /about/, /contact/, /california/, all of them. You are claiming the entire root namespace for your location router. That collision risk is the price of the clean URL, and the back half of this article is about guarding it. If you cannot guard it cleanly, prefixing the rules with a static segment (^locations/([^/]+)/?$) trades the prettiness for safety, and that is often the right call.
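
If the prefix route is the one you take, the two rules might look like the following. This is a minimal sketch: the locations prefix and the helper name te_location_prefixed_rules() are assumptions for illustration, and the query vars are the same te_state and te_city used above.

```php
<?php
/**
 * Hypothetical helper: returns the prefixed rewrite rules as pattern => query.
 * The "locations" prefix is an assumption; any static segment works.
 */
function te_location_prefixed_rules() {
    return array(
        // /locations/california/los-angeles/ -> state + city.
        '^locations/([^/]+)/([^/]+)/?$' => 'index.php?te_state=$matches[1]&te_city=$matches[2]',
        // /locations/california/ -> state only.
        '^locations/([^/]+)/?$'         => 'index.php?te_state=$matches[1]',
    );
}

// Register only when WordPress is loaded, so the file can also be run standalone.
if ( function_exists( 'add_action' ) ) {
    add_action( 'init', function () {
        foreach ( te_location_prefixed_rules() as $pattern => $query ) {
            add_rewrite_rule( $pattern, $query, 'top' );
        }
    } );
}
```

With the prefix in place, /about/ no longer matches either pattern and resolves through WordPress's own rules. The rule table still has to be flushed once after the rules are added.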

The data model: a CPT plus a state/city taxonomy

Store each physical location as a location custom post type, and model the geography as a single hierarchical taxonomy where states are top-level terms and cities are their children. A hierarchical taxonomy is the natural fit because a city genuinely belongs to a state, the same parent/child relationship WordPress categories use.

This is the file I drop into wp-content/mu-plugins/te-location-urls.php so it loads automatically and survives theme switches. The docblock header is the standard plugin metadata:

php
<?php
/**
 * Plugin Name: TE Location URLs
 * Plugin URI:  https://techearl.com/wordpress-state-city-location-urls
 * Description: Serves /state/ and /state/city/ location URLs from the WordPress root.
 * Version:     1.0.0
 * Author:      Ishan Karunaratne
 * Author URI:  https://techearl.com
 * License:     GPL-2.0-or-later
 * Text Domain: te-location-urls
 */

add_action( 'init', 'te_location_register' );

function te_location_register() {
    register_post_type( 'location', array(
        'labels'       => array( 'name' => __( 'Locations', 'te-location-urls' ) ),
        'public'       => true,
        'has_archive'  => false,
        'rewrite'      => false, // We own the rewrites by hand; see below.
        'supports'     => array( 'title', 'editor', 'thumbnail', 'custom-fields' ),
        'show_in_rest' => true,
    ) );

    register_taxonomy( 'location', 'location', array(
        'labels'            => array( 'name' => __( 'Places', 'te-location-urls' ) ),
        'public'            => true,
        'hierarchical'      => true, // States are parents, cities are children.
        'rewrite'           => false, // Same: no auto rewrite, we route ourselves.
        'show_admin_column' => true,
        'show_in_rest'      => true,
    ) );

    // Two segments: /california/los-angeles/ -> state + city.
    add_rewrite_rule(
        '^([^/]+)/([^/]+)/?$',
        'index.php?te_state=$matches[1]&te_city=$matches[2]',
        'top'
    );
    // One segment: /california/ -> state only.
    add_rewrite_rule(
        '^([^/]+)/?$',
        'index.php?te_state=$matches[1]',
        'top'
    );
}

The rewrite rules from the top of the article live inside this same te_location_register() function, so the activation flush (further down) regenerates the exact rules this function declares. The query_vars filter and the pre_get_posts callback from the routing core belong in the same file. Two things to flag. First, both the post type and the taxonomy register with 'rewrite' => false. If you let WordPress generate its own permalink structure for them, it will add a second set of rules (such as /location/california/) that compete with the clean root rules, and you will spend an evening figuring out which rule won. Own the routing or let WordPress own it, not both.

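Second, I named the taxonomy location to match the post type. That works, but both objects then share the location query var, which is worth remembering if a ?location= request ever behaves oddly. The slug a term gets (california, los-angeles) is what your URL segment matches against in pre_get_posts. Note that WordPress keeps term slugs unique across the whole taxonomy, not just within a parent. Term names can repeat, so two states can each have a child named Springfield, but the second one gets a disambiguated slug (typically with the parent's slug appended, such as springfield-missouri). A real directory has to plan for that: the second city's URL segment is its suffixed slug, not a bare springfield.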

If a single taxonomy feels heavy, the lighter alternative is two flat query vars and no taxonomy at all: store the state and city as post meta on each location, and have pre_get_posts build a meta_query. That is simpler to reason about for a small dataset, but you lose the term archives, the admin term UI, and clean term-based caching, so I reach for the taxonomy on anything that will grow.
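
The post-meta variant could be sketched like this. It is only an illustration under stated assumptions: the meta keys te_state_slug and te_city_slug and the helper te_location_meta_query() are hypothetical names, and each location post is assumed to store its state and city slugs under those keys.

```php
<?php
/**
 * Hypothetical helper: builds a meta_query for the no-taxonomy variant.
 * Assumes each location post stores its slugs in post meta under the
 * (assumed) keys "te_state_slug" and "te_city_slug".
 */
function te_location_meta_query( $state, $city = '' ) {
    $clauses = array(
        'relation' => 'AND',
        array(
            'key'   => 'te_state_slug',
            'value' => $state,
        ),
    );
    if ( '' !== $city ) {
        $clauses[] = array(
            'key'   => 'te_city_slug',
            'value' => $city,
        );
    }
    return $clauses;
}

if ( function_exists( 'add_action' ) ) {
    add_action( 'pre_get_posts', function ( $query ) {
        if ( is_admin() || ! $query->is_main_query() ) {
            return;
        }
        $state = $query->get( 'te_state' );
        if ( ! $state ) {
            return; // Not one of our location URLs.
        }
        $query->set( 'post_type', 'location' );
        $query->set( 'meta_query', te_location_meta_query(
            sanitize_title( $state ),
            sanitize_title( (string) $query->get( 'te_city' ) )
        ) );
    } );
}
```

Keeping the clause builder separate from the hook makes it easy to check in isolation. The trade-offs described above still apply: there are no term archives and no admin term UI behind it.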

Routing the one and two segment cases

Order matters in the rewrite table. WordPress evaluates rules top to bottom and takes the first match, so register the two-segment rule before the one-segment rule. With these two exact patterns the order happens not to change the outcome: the single-segment ^([^/]+)/?$ is anchored with $, so it cannot match /california/los-angeles/ at all. Ordering does bite with looser patterns (an unanchored or greedier first rule would swallow the two-segment path), so register specific-before-general as a habit.
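
To see first-match-wins concretely, here is a small standalone simulation. It is not WordPress's actual matcher, only a simplified stand-in; te_first_match() and the unanchored $loose pattern are made up for the illustration.

```php
<?php
/**
 * Simplified stand-in for how a rule table is consulted: return the value
 * of the first pattern that matches, or null. Illustration only.
 */
function te_first_match( array $rules, $path ) {
    foreach ( $rules as $pattern => $query ) {
        if ( preg_match( '#' . $pattern . '#', $path ) ) {
            return $query;
        }
    }
    return null;
}

$two   = '^([^/]+)/([^/]+)/?$';
$one   = '^([^/]+)/?$';
$loose = '^([^/]+)'; // Hypothetical sloppy rule with no end anchor.

// Anchored rules: order does not change which one wins.
$a = te_first_match( array( $two => 'state+city', $one => 'state' ), 'california/los-angeles/' ); // 'state+city'
$b = te_first_match( array( $one => 'state', $two => 'state+city' ), 'california/los-angeles/' ); // 'state+city'

// Loose rule first: it swallows the two-segment path.
$c = te_first_match( array( $loose => 'state', $two => 'state+city' ), 'california/los-angeles/' ); // 'state'
```

The third case is the one the specific-before-general habit protects against.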

The 'top' priority argument pushes both rules above WordPress's built-in rules. That is deliberate and it is also where the danger lives: a top rule for ^([^/]+)/?$ will intercept /sample-page/ before WordPress gets to resolve it as a real page. Read the collision section before you ship this.

When the request comes in as /california/los-angeles/, the matched rule rewrites it internally to index.php?te_state=california&te_city=los-angeles. WordPress parses that into query vars (because you whitelisted them in the query_vars filter), and pre_get_posts reads them. The $matches[1] and $matches[2] tokens are the captured regex groups, in order. Keep them paired with the right query var: swapping $matches[1] and $matches[2] is a classic ten-minute bug.
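
The same capture-to-query-var mapping extends to paged URLs, which the two rules above do not cover because /california/page/2/ has three segments. A sketch of companion rules, assuming the default page pagination base; paged is WordPress's own query var for the page number, while the helper name te_location_paged_rules() is hypothetical:

```php
<?php
/**
 * Hypothetical helper: pagination companions for the two location rules.
 * Assumes the default "page" pagination base.
 */
function te_location_paged_rules() {
    return array(
        // /california/los-angeles/page/2/ -> state + city + page number.
        '^([^/]+)/([^/]+)/page/([0-9]+)/?$' => 'index.php?te_state=$matches[1]&te_city=$matches[2]&paged=$matches[3]',
        // /california/page/2/ -> state + page number.
        '^([^/]+)/page/([0-9]+)/?$'         => 'index.php?te_state=$matches[1]&paged=$matches[2]',
    );
}

if ( function_exists( 'add_action' ) ) {
    add_action( 'init', function () {
        foreach ( te_location_paged_rules() as $pattern => $query ) {
            add_rewrite_rule( $pattern, $query, 'top' );
        }
    } );
}
```

These patterns claim root-level paths just like the originals (a paged blog page at /blog/page/2/ would match the second one), so the collision guard discussed later applies to them too.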

One detail on the captured value: WordPress passes the captured segment through as the query var value, so a hyphenated city slug (los-angeles) arrives unchanged and matches the term slug directly. If requests could carry characters your slugs never contain, [^/]+ is looser than you need. Tighten the pattern (for example ([a-z0-9-]+)) so a malformed request fails the rule instead of running a junk query.

Make each location page genuinely useful (the thin-content trap)

This is the part that decides whether the cluster ranks or gets ignored. A location directory is, by construction, a programmatic-SEO pattern: one template, hundreds or thousands of generated URLs. Search engines are openly hostile to that pattern when the pages are thin, near-duplicate, or exist only to capture a "[service] in [city]" query with nothing behind it.

So the rule is: every location URL you let resolve must be a page a human would find useful on its own. Concretely:

  • Real content per page. A state page should summarize what is in that state, link to its cities, and show actual listings, not just echo "Locations in California" above an empty loop. A city page needs the locations themselves plus something specific: hours, addresses, a short human description. If the only difference between /california/los-angeles/ and /texas/austin/ is the two place names swapped in, Google will treat them as duplicates.
  • One canonical URL per location. Emit a self-referential <link rel="canonical"> on each location page pointing at its own clean URL, so the /california/ archive and any ?location=california fallback do not split signals. If you killed the taxonomy's auto-rewrite as shown above, you have already closed the most common duplicate-URL leak.
  • Do not generate pages for empty places. A /wyoming/cheyenne/ URL that resolves to zero listings is the textbook thin page. Let those 404 (return early in pre_get_posts and let the query come up empty, then send a real 404) rather than serving an empty template.
  • Interlink the hierarchy. State pages link down to their cities; city pages link up to their state and across to sibling cities. That internal linking is both a usability win and how the crawler discovers the whole tree.
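
Two of the points above, the real 404 for empty places and the self-referential canonical, can be sketched as follows. This assumes te_state is only set on requests that really are location pages (that is, a collision guard has already run), and te_location_path() is a hypothetical helper. An SEO plugin may already emit its own canonical tag, in which case the second hook would be redundant.

```php
<?php
/**
 * Hypothetical helper: builds the clean root-level path for a location page.
 */
function te_location_path( $state, $city = '' ) {
    $path = '/' . trim( $state, '/' ) . '/';
    if ( '' !== $city ) {
        $path .= trim( $city, '/' ) . '/';
    }
    return $path;
}

if ( function_exists( 'add_action' ) ) {
    // Send a real 404 when a location URL resolves to zero listings.
    add_action( 'template_redirect', function () {
        global $wp_query;
        if ( get_query_var( 'te_state' ) && 0 === (int) $wp_query->post_count ) {
            $wp_query->set_404();
            status_header( 404 );
            nocache_headers();
        }
    } );

    // Emit a self-referential canonical on location pages that do resolve.
    add_action( 'wp_head', function () {
        $state = get_query_var( 'te_state' );
        if ( ! $state || is_404() ) {
            return;
        }
        $url = home_url( te_location_path( $state, (string) get_query_var( 'te_city' ) ) );
        printf( '<link rel="canonical" href="%s" />' . "\n", esc_url( $url ) );
    } );
}
```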

This exact pattern, root-level geographic URLs backed by a CPT and a place taxonomy, is what sits under most rehab-directory, real-estate, and local-services sites. The technique is sound. It earns its keep only when each generated page clears the "would a person bookmark this" bar.

Collisions and the flush gotcha

Two failure modes will eat your afternoon.

The root-namespace collision. Because ^([^/]+)/?$ is a top rule, it intercepts single-segment requests before WordPress resolves them as pages or other post types. Ship this as-is and /about/, /contact/, /privacy-policy/ all route into your location handler, hit pre_get_posts, find no matching te_state term, and (if you are not careful) render an empty location archive instead of the real page. Guard it: in pre_get_posts, confirm the captured segment actually maps to a top-level term before you take over the query.

php
add_action( 'pre_get_posts', function ( $query ) {
    if ( is_admin() || ! $query->is_main_query() ) {
        return;
    }

    $state = $query->get( 'te_state' );
    if ( ! $state ) {
        return;
    }

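    // Only claim the request if the segment is a real top-level place term.
    $term = get_term_by( 'slug', sanitize_title( $state ), 'location' );
    if ( ! $term || 0 !== (int) $term->parent ) {
        // Not a state: leave the query alone. The rewrite rule has already
        // matched by this point, so /about/, /contact/, etc. still need to be
        // re-routed to the real page (for example from the request filter).
        return;
    }

    $city = $query->get( 'te_city' );
    $query->set( 'post_type', 'location' );
    $query->set( 'location', sanitize_title( $city ? $city : $state ) );
    $query->set( 'posts_per_page', 24 ); // Same page size as the routing core.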
} );

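That get_term_by check is what keeps a root-level router from hijacking every page: it declines the request whenever the first segment is not one of your states. Declining is only half the job, though. The rewrite rule has already matched by the time pre_get_posts runs, and WordPress does not go back and try its page rules, so a bailed-out request carries no pagename and falls through to the posts index. To actually serve /about/, translate the segment back into a page lookup earlier in the request, for example in the request filter. If you cannot tolerate even the risk, prefix the rewrite rules with a static segment (^locations/...) and the whole collision class disappears, at the cost of a longer URL. Pick based on how much you trust your slug space; a marketing site with dozens of top-level pages should lean toward the prefix.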
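
One way to hand non-state segments back to WordPress is to rewrite the query vars in the request filter, before the main query is built. The sketch below assumes the colliding URLs are pages; single posts served at the root (a /%postname%/ permalink structure) would need name rather than pagename. te_location_reroute() is a hypothetical helper, and the state check is injected as a callable so the decision logic can be exercised without WordPress.

```php
<?php
/**
 * Hypothetical helper: decide what the request should become.
 *
 * @param array    $vars     Query vars produced by the matched rewrite rule.
 * @param callable $is_state Returns true when a slug is a top-level place term.
 * @return array Query vars to hand to WordPress.
 */
function te_location_reroute( array $vars, callable $is_state ) {
    if ( empty( $vars['te_state'] ) || $is_state( $vars['te_state'] ) ) {
        return $vars; // Not ours, or a genuine state: leave untouched.
    }

    // Not a state: rebuild the original path and ask for it as a page instead.
    $path = $vars['te_state'];
    if ( ! empty( $vars['te_city'] ) ) {
        $path .= '/' . $vars['te_city'];
    }
    unset( $vars['te_state'], $vars['te_city'] );
    $vars['pagename'] = $path;

    return $vars;
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'request', function ( $vars ) {
        return te_location_reroute( $vars, function ( $slug ) {
            $term = get_term_by( 'slug', sanitize_title( $slug ), 'location' );
            return $term && 0 === (int) $term->parent;
        } );
    } );
}
```

With this in place, /about/ becomes a normal pagename lookup (and a normal 404 if no such page exists), while /california/ keeps its te_state var and continues into pre_get_posts.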

The flush. Rewrite rules are cached in the database, not recomputed on every request. A freshly added add_rewrite_rule does nothing until the rule table is regenerated, so a new install of this code makes /california/ return a 404 and you assume the regex is wrong when it is fine. Flush once, on plugin activation, never on init:

php
register_activation_hook( __FILE__, 'te_location_flush_rules' );

function te_location_flush_rules() {
    te_location_register(); // Your function that registers the CPT, taxonomy, and rules.
    flush_rewrite_rules();   // Then regenerate the rule table once.
}

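The registration lives in a named te_location_register() function (rather than an anonymous init closure) so activation can call the exact same code before flushing. Calling flush_rewrite_rules() unconditionally on init (a tempting "just make it work" move) runs that expensive regeneration on every page load and will quietly tank your performance. One catch: activation hooks do not fire for must-use plugins, so if the file lives in mu-plugins as described earlier, register_activation_hook never runs. Either install it as a regular plugin or trigger the one-time flush another way (a version check stored in an option, wp rewrite flush from WP-CLI, or the Permalinks screen). Flush on activation, and again by hand (Settings, then Permalinks, and hit Save) any time you edit the rules during development. There is a fuller treatment of the flush-once discipline in the foundation article linked below.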
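
For a must-use install, where no activation hook is available, one common pattern is to gate the flush on a stored version so it still runs only once per rule change. A minimal sketch, assuming a hypothetical TE_LOCATION_RULES_VERSION constant and te_location_rules_version option name:

```php
<?php
// Hypothetical constant: bump it whenever the rewrite rules change.
const TE_LOCATION_RULES_VERSION = '1.0.0';

/**
 * Hypothetical helper: true when the stored version differs from the current one.
 */
function te_location_needs_flush( $stored, $current = TE_LOCATION_RULES_VERSION ) {
    return $stored !== $current;
}

if ( function_exists( 'add_action' ) ) {
    // Priority 20 so this runs after the rules are registered on init (default priority 10).
    add_action( 'init', function () {
        $option = 'te_location_rules_version'; // Assumed option name.
        if ( te_location_needs_flush( get_option( $option ) ) ) {
            flush_rewrite_rules();
            update_option( $option, TE_LOCATION_RULES_VERSION );
        }
    }, 20 );
}
```

This does hook init, but the expensive call sits behind the version check, so on a normal request the cost is a single option read.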

Verify it with curl

[Figure: curl output showing the root-level /california/los-angeles/ URL resolving to the location content.]

Before you trust the routing, hit it from the command line and read the headers. A 200 with your location template is right; a 404 means the rule did not match (or you forgot to flush).

bash
# State page should resolve.
curl -I https://example.com/california/

# City page should resolve.
curl -I https://example.com/california/los-angeles/

# A non-place single segment must still 404 or hit the real page, not your handler.
curl -I https://example.com/about/

# See which registered rewrite rules match a URL (WP-CLI, run on the server;
# WordPress core has no debug query parameter for this).
wp rewrite list --match=/california/los-angeles/ --format=table

If you want to inspect the rule table itself rather than guess, WP-CLI prints every registered rewrite rule and what it maps to:

bash
wp rewrite list --format=table | grep -E 'te_state|te_city'

Seeing your two patterns in wp rewrite list confirms they are registered and flushed; not seeing them means the registration did not run or the flush has not happened yet.

See also


Ishan Karunaratne

Tech Architect · Software Engineer · AI/DevOps

Tech architect and software engineer with 20+ years building software, Linux systems, and DevOps infrastructure, and lately working AI into the stack. Currently Chief Technology Officer at a healthcare tech startup, which is where most of these field notes come from.
