Store JSON in MySQL: The JSON Type vs TEXT (2026)

Use the native JSON column type, not TEXT, whenever you actually query into the document. MySQL has had a real JSON data type since 5.7.8 (October 2015): it validates the document on insert (bad JSON is rejected), stores a parsed binary format so path lookups do not re-parse the string, and gives you operators like -> and ->> to pull values out. Reach for TEXT only when the JSON is an opaque blob you store and read back whole and never query into. Below are the differences, a comparison against normalizing into real columns, a worked schema with an indexed generated column, and the MariaDB gotcha that trips people up.

Short answer: attributes JSON for a column you query into; TEXT for a blob you only ever read whole. You cannot index a JSON column directly, so to make a path searchable, add a generated column over the path and index that. And know that MariaDB's JSON is not the same thing as MySQL's: it is an alias for LONGTEXT with a validity check, stored as text, not the binary format. More on that below.

Why the native JSON type beats TEXT

Before 5.7.8 the only way to put a JSON document in MySQL was to dump the string into a TEXT or VARCHAR column and parse it in your application. That still works, and for some cases it is still the right call. But the native JSON type buys you three things a text column cannot:

Validation on insert. A JSON column rejects malformed input at write time. INSERT ... VALUES('[1, 2,') fails with ERROR 3140 (22032): Invalid JSON text. A TEXT column will happily store [1, 2, and you will not find out until something downstream tries to parse it.
A parsed binary format. MySQL converts the document to an internal binary representation on insert. That format is structured so the server can look up a key or array index directly without scanning the whole document, and reads do not re-parse the text every time. A TEXT column stores the raw string, so every read hands your application a string to parse from scratch.
Path operators. JSON_EXTRACT(), plus the -> operator (shorthand for JSON_EXTRACT(col, path)) and the ->> operator (shorthand for JSON_UNQUOTE(JSON_EXTRACT(col, path)), which strips the surrounding quotes). You query into the document in SQL instead of pulling the whole blob into the app and parsing it there.

The cost is small. A JSON value is limited like LONGBLOB (up to about 4 GB, capped in practice by max_allowed_packet), and most of the bytes are stored off-page. The binary encoding carries some overhead, so a JSON column can sit a little larger on disk than the equivalent raw text. In exchange you get validation, in-place path access, and indexable extraction. For anything you query into, that is a trade worth making.

When TEXT is still the right choice

TEXT is the better pick when the JSON is opaque: you write it, you read it back whole, and you never ask the database about its contents. A cached API response, a serialized event payload you replay elsewhere, a third-party webhook body you keep for audit. There is nothing to validate against (you do not control the shape), nothing to index into (you never filter on a field), and TEXT skips the binary-encoding step so the insert is marginally cheaper. Parsing happens in the app, which is where the blob is going anyway.

The line is simple: do you ever write WHERE or ORDER BY against a field inside the document? If yes, JSON. If no, TEXT is fine and a touch leaner.

JSON vs normalizing into real columns

The harder question is not JSON versus TEXT. It is whether the data should be JSON at all, or whether it belongs in ordinary columns and tables.

If the shape is known and stable, normal columns are almost always better. You get typed columns, NOT NULL and foreign-key constraints, straightforward B-tree indexes, and clean joins. A product has a price, a SKU, and a weight: those are columns, not keys in a JSON blob. Burying stable fields in JSON throws away everything a relational database is good at and makes every query clumsier.

JSON earns its place when the data is genuinely sparse, dynamic, or ad-hoc: attributes that differ per row, optional metadata you cannot enumerate up front, a settings document whose keys you do not control. A products catalog where a book has author and pages, a shirt has size and color, and a cable has length and gauge is the textbook case. Modelling that as columns gives you a wide table full of NULLs, or an entity-attribute-value mess. A single attributes JSON column holds whatever each product needs.

My rule: stable, shared fields are columns; the long tail of per-row variation is one JSON column alongside them. You do not have to choose globally, you choose per field.

Comparison: JSON vs TEXT vs normalized columns

Aspect	`JSON` type	`TEXT`	Normalized columns
Validation	Rejects invalid JSON on insert	None (stores any string)	Per-column types and constraints
Query operators	`->`, `->>`, `JSON_EXTRACT`, etc.	None (parse in the app)	Native SQL on every column
Indexing	Via a generated column or functional index	Prefix index only, useless for fields	Direct B-tree index per column
Best for	Sparse / dynamic data you query into	Opaque blobs read whole	Known, stable, shared fields

A worked schema: products with a JSON attributes column

Here is a products table. The stable fields (sku, name, price) are real columns. The per-product variation lives in a single attributes JSON column.

sql

CREATE TABLE products (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  sku        VARCHAR(64) NOT NULL,
  name       VARCHAR(255) NOT NULL,
  price      DECIMAL(10,2) NOT NULL,
  attributes JSON NOT NULL,
  created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY uq_products_sku (sku)
) ENGINE=InnoDB;

A live MySQL session using the native JSON column, showing the create statement and the real query output. — A live MySQL 8 session running this schema: the real output, not illustrative.

Insert a document. MySQL validates it on the way in, so a typo in the JSON fails loudly here, not three reads later:

sql

INSERT INTO products (sku, name, price, attributes)
VALUES (
  'TSHIRT-RED-L',
  'Cotton T-Shirt',
  19.99,
  '{"color": "red", "size": "L", "material": "cotton", "tags": ["summer", "sale"]}'
);

Pull a single field out with the ->> operator. It returns the unquoted scalar, so you get red, not "red":

sql

SELECT name, attributes->>'$.color' AS color
FROM products
WHERE attributes->>'$.size' = 'L';

The $.color and $.size are JSON path expressions: $ is the document root, .color is the key. ->>'$.color' is exactly JSON_UNQUOTE(JSON_EXTRACT(attributes, '$.color')), just shorter to type. Use -> instead of ->> when you want the raw JSON form (quoted strings, intact arrays).

This is all standard on MariaDB too, as far as the SQL goes. JSON_EXTRACT, the -> and ->> operators, and JSON path syntax behave the same. What differs is the storage, and that matters enough to spell out.

Indexing a JSON column

You cannot put an index directly on a JSON column. A B-tree wants a single comparable value, and a whole JSON document is not that. The query above (WHERE attributes->>'$.size' = 'L') works, but with no index it scans every row and evaluates the extraction each time. To make a path searchable, you index the extracted value, and there are two ways.

Generated column (works on 5.7+ and MariaDB). Add a column defined as an extraction over the JSON path, then index that column:

sql

ALTER TABLE products
  ADD COLUMN size VARCHAR(8)
    AS (attributes->>'$.size') STORED,
  ADD INDEX idx_size (size);

Now WHERE size = 'L' uses idx_size directly. The column can be VIRTUAL (computed on read, materialized only in the index records) or STORED (computed on write, kept in the row).

A MySQL 8.0 session extracting attrs->>'$.color' from a JSON column and running EXPLAIN on a query filtered by that path, where the plan shows key: idx_color and Extra: Using index, proving the generated-column index is used. — Extracting attrs->>'$.color' and running EXPLAIN shows the query uses key idx_color with Using index, confirming the generated-column index is hit. Real output from MySQL 8.0.

I default to VIRTUAL for a sparse attribute so the row stays lean; STORED is worth it when the same value is read far more than written, or when you need it as part of a more complex index. Either way the optimizer will use the index for a WHERE clause that repeats the exact extraction expression, even without naming the generated column. This is the same generated-column trick used to index a long column by its hash, applied to a JSON path instead of a digest.

Functional index (MySQL 8.0.13+). From 8.0.13 you can index the expression directly, no separate column:

sql

CREATE INDEX idx_size ON products ((CAST(attributes->>'$.size' AS CHAR(8))));

And for arrays inside the document, multi-valued indexes (MySQL 8.0.17+) let one row contribute several index entries, so MEMBER OF and JSON_CONTAINS against an array field like $.tags can use an index. MariaDB does not have multi-valued indexes; there you index a single extracted scalar through the generated-column route above.

The MariaDB difference (do not gloss over this)

Here is the part that surprises people migrating between the two. MariaDB's JSON is not a native binary type. It is an alias for LONGTEXT (specifically LONGTEXT COLLATE utf8mb4_bin) with an automatic JSON_VALID() CHECK constraint bolted on for validation (that automatic check arrived in MariaDB 10.4.3; on 10.2 and 10.3 the alias existed but did not validate, so older schemas may have a JSON column accepting invalid documents). MariaDB introduced it purely for MySQL compatibility, and it stores the document as text, parsing on access, with no compact binary representation.

So the SQL surface is largely compatible (the functions and operators line up), but the storage model is fundamentally different: MySQL keeps a parsed binary blob and reads subfields without re-parsing; MariaDB keeps the raw string and parses it on every access. The performance characteristics diverge accordingly, and binary-level features that depend on the encoding (and row-based replication of JSON columns between the two engines) do not carry over. If you are writing for both, treat MariaDB's JSON as "validated LONGTEXT" and do not assume the MySQL binary-format optimizations are there.

What to do next

For exact byte sizes and limits of every column type, see MySQL data types and sizes.
For the conversion and extraction functions in one place, the MySQL cheat sheet covers the JSON path operators alongside the rest of the DDL.
For the same indexed generated-column pattern applied to a hash instead of a JSON path, see how to store a SHA-256 hash in MySQL.
PostgreSQL handles this with jsonb, a true binary type with GIN indexing rather than a generated-column workaround; see JSON vs JSONB in PostgreSQL.

FAQ

Use the native JSON type whenever you query into the document, that is, whenever any field inside it appears in a WHERE or ORDER BY. It validates the JSON on insert, stores a parsed binary format so reads do not re-parse, and gives you path operators like ->>.

Use TEXT only when the JSON is an opaque blob you store and read back whole and never filter on. TEXT skips the binary encoding so the insert is marginally leaner, but you get no validation and no operators.

Not directly. A B-tree needs a single comparable value and a whole document is not one. Instead, index an extracted path: add a generated column defined as the extraction (for example size VARCHAR(8) AS (attributes->>'$.size') STORED) and index that column, or on MySQL 8.0.13+ create a functional index on the expression directly. For arrays inside the JSON, multi-valued indexes (8.0.17+) index each element.

Both extract a value at a JSON path. col->'$.x' is shorthand for JSON_EXTRACT(col, '$.x') and returns the JSON form (strings stay quoted). col->>'$.x' is shorthand for JSON_UNQUOTE(JSON_EXTRACT(col, '$.x')) and returns the unquoted scalar, so a string comes back as red, not "red". Use ->> when comparing against a plain SQL string.

No. MySQL stores JSON in a parsed binary format and reads subfields without re-parsing. MariaDB's JSON is an alias for LONGTEXT with a JSON_VALID() CHECK constraint, stored as text and parsed on access. The SQL functions and operators are largely compatible, but the storage model and performance characteristics differ, so do not assume MySQL's binary optimizations apply on MariaDB. The automatic JSON_VALID() CHECK constraint was added in MariaDB 10.4.3; earlier versions left the alias unvalidated.

When the shape is known and stable. Stable, shared fields belong in typed columns where you get constraints, foreign keys, plain indexes, and clean joins. Reserve a JSON column for genuinely sparse, dynamic, or per-row data you cannot enumerate up front. A common pattern is both: real columns for the shared fields and one JSON column for the long tail of variation.

Use JSON_SET() in an UPDATE and assign the result back to the column: UPDATE products SET attributes = JSON_SET(attributes, '$.color', 'blue') WHERE id = 1. It updates the value at the path if the key exists, or adds it if it does not. JSON_REPLACE() only touches existing keys, JSON_INSERT() only adds missing ones, and JSON_REMOVE() deletes a path. MySQL also does a partial, in-place update for these on the binary format when the change fits, so you are not always rewriting the full document on disk. These functions work the same on MariaDB.

Yes. Inserting malformed JSON into a JSON column fails immediately with ERROR 3140 (22032): Invalid JSON text. A TEXT column performs no such check and stores any string, valid or not, which means corruption surfaces later at read time instead of at write time. MariaDB enforces the same validity through its automatic JSON_VALID() CHECK constraint.

How to Store JSON in MySQL: The JSON Type vs TEXT

Why the native JSON type beats TEXT

When TEXT is still the right choice

JSON vs normalizing into real columns

Comparison: JSON vs TEXT vs normalized columns

A worked schema: products with a JSON attributes column

Indexing a JSON column

The MariaDB difference (do not gloss over this)

What to do next

FAQ

See also

Sources

Ishan Karunaratne

Related posts

How to Store JSON in PostgreSQL: json vs jsonb

How to Store Money in MySQL: DECIMAL vs Integer Cents

How to Store a Boolean in MySQL: TINYINT(1) vs BIT(1)

Should I use the JSON type or TEXT to store JSON in MySQL?

Can I index a JSON column in MySQL?

What is the difference between the -> and ->> operators?

Is MySQL's JSON type the same as MariaDB's?

When should I normalize into columns instead of using JSON?

How do I update a single field inside a JSON column without rewriting the whole document?

Does the JSON type validate data on insert?

Sources

Ishan Karunaratne