ALTER TABLE on a billion-row table is the canonical "this is going to be a long night" moment for a DBA. MySQL 5.6 added online DDL for many operations. MySQL 8.0 added ALGORITHM=INSTANT for ADD/DROP COLUMN and a few other cases, making them effectively free. But some operations still rewrite the entire table: changing a column type, converting character sets, modifying a primary key. For those, the right tool is one of three: native online DDL, gh-ost (built by GitHub), or pt-online-schema-change (Percona Toolkit). Each has a different mechanism, a different failure mode, and a different operational profile. Below is the comparison from running each in production over the years, with the cases where each is the right answer.
Which online schema migration tool should I use for MySQL?
Use native online DDL for any operation MySQL 8.0+ can do with ALGORITHM=INSTANT (ADD COLUMN, DROP COLUMN, RENAME COLUMN since 8.0.29, change column default) — and for INPLACE operations like adding a secondary index. These complete in milliseconds (INSTANT) or run online (INPLACE) and need no external tool. Use gh-ost for any heavy ALTER that rebuilds the table (column type change, character set conversion) on a topology with replicas. gh-ost is triggerless: instead of triggers it captures ongoing changes by tailing the binary log, so it adds no synchronous overhead to application writes. The row copy and cutover still run on the primary — gh-ost is lower-impact than a trigger-based tool, not zero-impact. Use pt-online-schema-change when you're on a single-instance MySQL with no replicas, or when you need to ALTER something gh-ost doesn't support (a foreign-key parent table, or a table whose only key is a non-PRIMARY unique index). pt-osc is trigger-based, which is operationally simpler but adds synchronous load to the primary. Never use any of these without first checking if the operation can run with ALGORITHM=INSTANT — that's always faster and safer.
Jump to:
- What MySQL 8.0 made instant
- When native online DDL is enough
- How gh-ost works
- How pt-online-schema-change works
- The decision matrix
- Concrete examples: same ALTER, three tools
- Replication considerations
- Foreign-key handling
- Throttling and back-off
- What to do next
- FAQ
What MySQL 8.0 made instant
ALGORITHM=INSTANT adds or removes the column from the table metadata only. No table rewrite, no data movement, no per-row work. The operation is O(1) regardless of table size.
8.0 supported INSTANT for these operations:
| Operation | 8.0 |
|---|---|
| ADD COLUMN (as the last column) | ✓ since 8.0.0 |
| ADD COLUMN (in any position) | ✓ since 8.0.29 |
| DROP COLUMN | ✓ since 8.0.29 |
| RENAME COLUMN | ✓ since 8.0.29 |
| Modify column default | ✓ since 8.0 |
| SET / DROP DEFAULT | ✓ since 8.0 |
| Change column to NULL | Sometimes (depends on storage format) |
| Rename table | ✓ always was metadata-only |
For these, run the ALTER unmodified. MySQL picks INSTANT automatically when possible. To make sure:
ALTER TABLE my_table ADD COLUMN new_col VARCHAR(255) DEFAULT NULL, ALGORITHM=INSTANT;A non-NULL default does not rule out INSTANT — MySQL stores the instant-added column's default in table metadata rather than writing it to every row. The real reasons an ALTER can't be INSTANT are an unsupported operation, an incompatible row format, exceeding the row-size limit, certain generated-column or FULLTEXT cases, or a table that has used up its budget of pending INSTANT changes before a forced rebuild. When INSTANT can't apply, the ALTER fails fast and tells you which algorithm would actually work.
When native online DDL is enough
Beyond INSTANT, MySQL's ALGORITHM=INPLACE, LOCK=NONE covers many heavier operations without a table rewrite or without a long lock:
| Operation | INPLACE without rewrite | INPLACE with rewrite |
|---|---|---|
| Add secondary index | ✓ | n/a |
| Drop secondary index | ✓ | n/a |
| Rename index | ✓ | n/a |
| Change column type (compatible) | ✓ | Sometimes |
| Add PRIMARY KEY | n/a | ✓ |
| Change PRIMARY KEY | n/a | ✓ |
| Change column type (incompatible: VARCHAR→TEXT) | n/a | ✓ |
| Convert character set | n/a | ✓ |
| Change ROW_FORMAT | n/a | ✓ |
| Add FULLTEXT index | n/a | ✓ |
For "INPLACE without rewrite" operations, native is the right answer. They run concurrent with traffic, with only a brief metadata lock at the start and end. One caveat: building a secondary index is "INPLACE without rewrite" but still scans the whole table and consumes real I/O — on a very large, very busy table you may still reach for an external tool to pace it, even though native online DDL can technically do it.
For "INPLACE with rewrite," native is workable for small-to-medium tables (under 50 GB or so) but the operation can still take hours and consumes I/O. This is where external tools start to matter.
The diagnostic: name the algorithm explicitly. EXPLAIN does not work on ALTER TABLE (it covers SELECT, INSERT, UPDATE, DELETE, REPLACE, and TABLE only), so you can't dry-run an ALTER that way. Instead, add an explicit ALGORITHM= / LOCK= clause: MySQL will refuse the statement up front rather than silently fall back to a slower algorithm, which tells you immediately what it can and can't do. Or test the ALTER on a staging copy of the table.
ALTER TABLE my_table MODIFY COLUMN status VARCHAR(50), ALGORITHM=INPLACE, LOCK=NONE;
-- ERROR 1845 (0A000): ALGORITHM=INPLACE is not supported for this operation.
-- Try ALGORITHM=COPY.That error means native online won't work — pick gh-ost or pt-osc.
How gh-ost works
gh-ost (GitHub's online schema migration tool) is triggerless. The flow:
- Create a shadow table (
_tablename_gho) with the desired schema, on the same server as the original. - Copy rows from the original table to the shadow table in chunks. This row copy runs on the primary, where the table lives.
- Tail the binary log to capture every INSERT/UPDATE/DELETE on the original and replay it onto the shadow table. gh-ost can read the binlog from a replica so it isn't competing with the primary's own binlog-dump threads, but the replayed writes still land on the shadow table on the primary.
- When the copy completes and replay catches up, run an atomic table rename: original →
_tablename_del, shadow →tablename. - Drop the
_deltable (or keep it, gh-ost has a flag).
The key property: gh-ost is triggerless. Where pt-osc installs triggers that add synchronous work to every application write, gh-ost captures changes asynchronously from the binary log. That makes it gentler on application latency and easy to throttle. But the row copy, the change replay, and the cutover all run against the primary — gh-ost is lower-impact than a trigger-based tool, not zero-impact.
Run on a typical schema:
gh-ost \
--host=replica.db.example.com \
--user=root \
--password=YOUR_PASSWORD \
--database=mydb \
--table=users \
--alter="MODIFY COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT" \
--max-load='Threads_running=25' \
--critical-load='Threads_running=50' \
--chunk-size=1000 \
--switch-to-rbr \
--executeThe --max-load and --critical-load flags throttle gh-ost when the primary is busy, so the migration backs off automatically under load.
What gh-ost does not support:
- Foreign keys as parent (a table referenced by FK on another table). It can handle child tables with FKs pointing out, but not parents.
- Tables with no primary key or unique key.
- Triggers on the original table (gh-ost won't run; remove them, run gh-ost, recreate triggers via gh-ost's
--hooksmechanism).
When those constraints aren't an issue, gh-ost is what I reach for first on any large-table ALTER.
How pt-online-schema-change works
pt-online-schema-change (from Percona Toolkit) is trigger-based. The flow:
- Create a shadow table with the desired schema.
- Install three triggers on the original (
AFTER INSERT,AFTER UPDATE,AFTER DELETE) that mirror every change into the shadow. - Copy existing rows from original to shadow in chunks.
- Atomic rename: original → old, shadow → original. Drop old.
The triggers run synchronously on every write to the original table during the migration, which:
- Adds latency to every INSERT/UPDATE/DELETE on the original.
- Doubles the write load (every write also writes to the shadow).
- Risks deadlocks if your application does concurrent transactions on overlapping rows.
Run:
pt-online-schema-change \
--alter "MODIFY COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT" \
--execute \
h=db.example.com,D=mydb,t=usersWhat pt-osc does support that gh-ost doesn't:
- Parent tables with foreign keys. pt-osc has flags (
--alter-foreign-keys-method) to handle the FK reconnection — though Percona flags this as the riskiest part of the tool (see Foreign-key handling). - Tables that have a unique index but no declared
PRIMARY KEY— pt-osc can use a suitable unique index as the chunk key. It cannot operate on a table with no primary key and no unique index at all. - Some weirder schema shapes that gh-ost rejects.
When pt-osc is the right pick:
- You have only one MySQL instance (no replicas). gh-ost prefers to read from a replica's binary log.
- The table is a parent of foreign keys you can't or won't drop.
- The application can tolerate trigger overhead on every write.
The decision matrix
Operation possible with ALGORITHM=INSTANT?
Yes → native, ALTER TABLE ... ALGORITHM=INSTANT
No → continue
Operation possible with ALGORITHM=INPLACE, LOCK=NONE, no rewrite?
Yes → native, ALTER TABLE ... ALGORITHM=INPLACE, LOCK=NONE
No → continue
Have replicas + table is not an FK parent + has primary key?
Yes → gh-ost
No → continue
Single instance, or table is an FK parent, or no primary key?
→ pt-online-schema-change
Run the matrix top-to-bottom. Don't skip levels because of habit; native is always faster than external when it works.
Concrete examples: same ALTER, three tools
The example: a users table is running out of INT primary-key space, so we widen id from INT to BIGINT UNSIGNED. 50 million rows, 12 GB on disk. A primary-key column type change genuinely rebuilds the table on every MySQL version and storage format — there is no INSTANT shortcut and no INPLACE path, so this is exactly the kind of ALTER the three tools exist for.
Native (MySQL 8.0):
ALTER TABLE users MODIFY COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
ALGORITHM=COPY, LOCK=SHARED;A column data-type change is ALGORITHM=COPY only — there is no INPLACE option. LOCK=SHARED permits concurrent reads but blocks all writes for the duration. Wall-clock time: ~45-90 minutes on SSD, during which the table is read-only. That is acceptable only inside a maintenance window, which is the whole reason the external tools exist.
gh-ost:
gh-ost \
--host=replica.db.example.com \
--user=root --password=$DB_PASS \
--database=mydb --table=users \
--alter="MODIFY COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT" \
--executeWall-clock time: ~60-120 minutes on the same hardware, fully online — reads and writes continue throughout. The primary still does the row copy and the cutover, but there are no triggers, so application writes are never slowed synchronously, and gh-ost throttles the copy when the primary is under load. --host points at a replica; gh-ost tails that replica's binlog and applies the migration to the primary it discovers.
pt-online-schema-change:
pt-online-schema-change \
--alter "MODIFY COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT" \
--execute \
h=primary.db.example.com,D=mydb,t=usersWall-clock time: ~75-150 minutes, also fully online. The primary sees doubled write load (every write also fires the shadow-table triggers) for the duration, and application write latency typically rises 20-50%.
For this operation, native COPY is off the table for a live service — it blocks writes. The real choice is gh-ost (if you have replicas) or pt-osc (if you don't). Native is the right answer only when the operation qualifies for INSTANT or for INPLACE without a long lock.
Replication considerations
| Tool | Primary impact | Replica impact | Replication safety |
|---|---|---|---|
| Native ALTER | Direct — COPY blocks writes, INPLACE allows them | Replays via binlog as one big statement | Brief lock on the replica end-to-end |
| gh-ost | Real — row copy and cutover run here — but no trigger overhead, and throttleable | Light — only the binlog tail, if you point gh-ost at a replica | Naturally throttles via replication lag detection |
| pt-osc | Heavy — synchronous trigger overhead on every write | Replays trigger DML statements | Can cause replica lag if writes pile up |
For replicated environments, gh-ost is the friendliest: change capture comes from the binary log instead of triggers, so it never adds synchronous write overhead, and it backs off automatically when replication lag climbs. The work itself still runs on the primary — "friendliest" means lowest-overhead, not zero-overhead.
For an instance with no replicas, native or pt-osc are the only options.
Foreign-key handling
The case that bites people. If orders.user_id is a foreign key to users.id, and you want to ALTER the users table:
- Native handles this transparently. The FK is just metadata; rewriting the parent table doesn't disturb the constraint.
- gh-ost refuses to run. The shadow table doesn't have the same FK references and gh-ost won't bridge them. The workaround is dropping the FK before migration, running gh-ost, recreating the FK.
- pt-osc handles this through the
--alter-foreign-keys-methodflag. Two strategies:rebuild_constraints(rebuilds the FK constraints after the rename — slower, but the safer choice) ordrop_swap(faster, but not atomic: the table can briefly disappear from the schema during the swap, and FK enforcement is dropped for that window).
For FK-heavy schemas, pt-osc is the only one of the two that can migrate a parent table at all — gh-ost simply refuses. But that is not the same as easy: Percona's own documentation flags foreign keys as the riskiest part of pt-osc. Prefer rebuild_constraints, and treat an FK-parent migration as a maintenance-window operation rather than a routine online change.
Throttling and back-off
Both gh-ost and pt-osc have load-based throttling so the migration backs off when the primary is busy.
# gh-ost throttle by Threads_running on the primary
gh-ost ... \
--max-load='Threads_running=25,Threads_connected=200' \
--critical-load='Threads_running=50,Threads_connected=500'
# gh-ost throttle by replication lag on a specific replica
gh-ost ... \
--throttle-control-replicas='replica1.db,replica2.db' \
--max-lag-millis=1500
# pt-osc throttle by Threads_running
pt-online-schema-change ... \
--max-load Threads_running=25 \
--critical-load Threads_running=50Set conservative thresholds. The migration will take longer but the application stays responsive.
What to do next
After your ALTER:
- MySQL Cheat Sheet — for the diagnostic queries.
- How to Change a MySQL Column Type — the foundational ALTER TABLE patterns.
- How to Add a Column in MySQL — for the common ADD COLUMN case (which is INSTANT on 8.0+).
- utf8 to utf8mb4 Migration in MySQL — a classic case where you need an online tool.
External references:
- gh-ost on GitHub
- pt-online-schema-change documentation
- MySQL Reference Manual: Online DDL Operations
- The original gh-ost announcement (GitHub Engineering, 2016)
FAQ
Sources
Authoritative references this article was fact-checked against.
- GitHubgithub.com





