MySQL has 30+ built-in data types across seven families: numeric, string, date/time, JSON, spatial, binary, and (in MySQL 9.0+) vector. Picking the right one matters at scale because smaller fixed-size columns mean smaller indexes, fewer I/O pages, and faster joins. This is the complete reference I keep open whenever I design a new schema: exact storage size in bytes, value range, and a one-line "when to use it" for every type. It's current to MySQL 9.x (May 2026).
What are MySQL data types?
A MySQL data type is the storage format for a column. It controls how many bytes each row spends on that column, what values are valid, and how MySQL compares and sorts them. Choosing well affects four things at once: disk usage (smaller types pack more rows per page), index performance (B-tree depth scales with row width), query correctness (TIMESTAMP wraps at 2038, INT overflows at ~2.1 billion), and application code (TEXT can't have a default value, JSON validates on insert). Use the smallest type that fits the actual range of values you'll store, prefer fixed-width over variable-width when sizes are predictable, and reach for VECTOR only with MySQL 9.0+ where the embedding workload genuinely needs in-database similarity search.
Jump to:
- Numeric data types
- String data types
- Date and time data types
- JSON data type
- VECTOR data type (MySQL 9.0+)
- Spatial data types
- Binary data types
- CHAR vs VARCHAR: when to use which
- DATETIME vs TIMESTAMP: when to use which
- INT vs BIGINT: picking the right integer size
- UUID storage: BINARY(16) vs CHAR(36)
- NDB storage engine considerations
- Notes on storage overhead
- FAQ
Numeric data types
| Type | Storage | Signed range | Unsigned range |
|---|---|---|---|
TINYINT | 1 byte | -128 to 127 | 0 to 255 |
SMALLINT | 2 bytes | -32,768 to 32,767 | 0 to 65,535 |
MEDIUMINT | 3 bytes | -8,388,608 to 8,388,607 | 0 to 16,777,215 |
INT / INTEGER | 4 bytes | -2,147,483,648 to 2,147,483,647 | 0 to 4,294,967,295 |
BIGINT | 8 bytes | -9.22×10^18 to 9.22×10^18 | 0 to 1.84×10^19 |
DECIMAL(M,D) / NUMERIC | ≈ M/2 + 1 bytes | fixed-point, exact precision | — |
FLOAT | 4 bytes | ±3.4×10^38, ~7 decimal digits | — |
DOUBLE / REAL | 8 bytes | ±1.8×10^308, ~15 decimal digits | — |
BIT(M) | (M+7)/8 bytes | 1 to 64-bit bitfield | — |
Rules of thumb:
- For monetary values, never use
FLOAT/DOUBLE— they round. UseDECIMAL(M,D)with explicit precision (typicallyDECIMAL(19,4)for currency). - Unsigned only buys you a doubled positive range. Use it for counters and IDs that can never be negative; the signed/unsigned mismatch is a common source of join bugs.
INT(11)does not mean "11 digits". The number in parentheses is a display width hint that most clients ignore in MySQL 8.0+. Just writeINT.
String data types
| Type | Storage | Notes |
|---|---|---|
CHAR(M) | M × charset bytes, padded | Fixed length, right-padded with spaces. 0-255 chars. |
VARCHAR(M) | actual length + 1-2 bytes for length prefix | Variable length, M from 0 to 65,535 chars (subject to row size limit). |
TINYTEXT | length + 1 byte | Up to 255 bytes. Stored off-page. |
TEXT | length + 2 bytes | Up to 65,535 bytes (64KB). Stored off-page. |
MEDIUMTEXT | length + 3 bytes | Up to 16,777,215 bytes (16MB). |
LONGTEXT | length + 4 bytes | Up to 4,294,967,295 bytes (4GB). |
ENUM(...) | 1 or 2 bytes | One of up to 65,535 predefined string values. |
SET(...) | 1, 2, 3, 4, or 8 bytes | A subset of up to 64 predefined string values, stored as a bitmask. |
Multibyte charsets matter for sizing. With utf8mb4 (the default since MySQL 8.0 and required for full Unicode including emoji), each character can take up to 4 bytes. A VARCHAR(255) column in utf8mb4 reserves up to 1,020 bytes per row plus the 1-2 byte length prefix. The same column in latin1 reserves up to 255 bytes. Choose your charset before sizing string columns.
TEXT columns can't have a default value (other than NULL). If your app needs an empty-string default, use VARCHAR instead.
Date and time data types
| Type | Storage | Range | Notes |
|---|---|---|---|
YEAR | 1 byte | 1901 to 2155 | Stored as YYYY |
DATE | 3 bytes | 1000-01-01 to 9999-12-31 | Stored as YYYY-MM-DD |
TIME | 3 bytes + fractional | -838:59:59 to 838:59:59 | Stored as HH:MM:SS[.fraction] |
DATETIME | 5 bytes + fractional | 1000-01-01 00:00:00 to 9999-12-31 23:59:59 | Calendar local time, no timezone |
TIMESTAMP | 4 bytes + fractional | 1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTC | Converted to UTC on store, back on read |
The "+ fractional" overhead is 0-3 bytes depending on the precision: DATETIME(0) adds 0 bytes, DATETIME(3) (millisecond) adds 2 bytes, DATETIME(6) (microsecond) adds 3 bytes.
For the difference between DATETIME and TIMESTAMP, see the DATETIME vs TIMESTAMP decision section below — short version: TIMESTAMP is half the storage and timezone-aware but capped at 2038; DATETIME is wider, dumber, and safe past 2038.
For formatting date/time columns at query time, the full set of DATE_FORMAT() specifiers is in How to Format DATETIME in MySQL.
JSON data type
| Type | Storage | Notes |
|---|---|---|
JSON | ≈ LONGBLOB + binary-encoded structure overhead | Validated on insert. Supports JSON_EXTRACT, ->, ->>, generated columns. |
MySQL stores JSON in a parsed binary format (not as text), so reads are fast but inserts are slightly slower than equivalent LONGTEXT. Use JSON when you genuinely need ad-hoc schema, indexing via generated columns, or path-based queries. For everything else, normalise.
JSON columns count toward the row size limit indirectly — the bulk of the JSON sits off-page like TEXT/BLOB, but each row has a pointer + small metadata in the primary record.
VECTOR data type (MySQL 9.0+)
| Type | Storage | Notes |
|---|---|---|
VECTOR(N) | 4 × N bytes + small header | Stores a fixed-dimension vector of single-precision floats. N is the dimension count (e.g., 1024 for Voyage-3, 1536 for OpenAI text-embedding-3-small). |
Added in MySQL 9.0 as the native primitive for semantic search and RAG workloads. Pair with VEC_DISTANCE_COSINE() and VEC_FROM_TEXT() for similarity queries against text or image embeddings stored alongside source data.
A 1024-dimension vector occupies ~4KB per row. A 3072-dimension vector occupies ~12KB. Plan storage accordingly — a 1M-row table with VECTOR(1024) adds ~4GB just for the embedding column.
For the full integration walkthrough (embedding model choice, backfill scripts, similarity query syntax), see How to Add Semantic Search to a MySQL App.
Spatial data types
| Type | Storage | Notes |
|---|---|---|
GEOMETRY | 25+ bytes | Generic container for any spatial type below. |
POINT | 25 bytes | A single (x, y) coordinate pair. |
LINESTRING | 33+ bytes | A sequence of points forming a line. |
POLYGON | 65+ bytes | A closed ring of points forming an area. |
MULTIPOINT / MULTILINESTRING / MULTIPOLYGON / GEOMETRYCOLLECTION | varies | Collections of the above. |
The "25+" overhead is the SRID (4 bytes) + WKB header + the actual coordinate data. For GIS workloads, add a SPATIAL INDEX for fast bounding-box and proximity queries.
Binary data types
| Type | Storage | Notes |
|---|---|---|
BINARY(M) | M bytes, zero-padded | Fixed-length raw bytes. |
VARBINARY(M) | actual length + 1-2 bytes | Variable-length raw bytes, up to 65,535. |
TINYBLOB | length + 1 byte | Up to 255 bytes. |
BLOB | length + 2 bytes | Up to 65,535 bytes (64KB). |
MEDIUMBLOB | length + 3 bytes | Up to 16,777,215 bytes (16MB). |
LONGBLOB | length + 4 bytes | Up to 4,294,967,295 bytes (4GB). |
For hashes (SHA-256, BLAKE3), use BINARY(32) — it's exactly the right size and indexes cleanly.
For file storage, don't. Store files in object storage (S3, R2) and keep only the key/URL in MySQL. BLOB columns bloat backups and replication.
CHAR vs VARCHAR: when to use which
| Question | Pick CHAR | Pick VARCHAR |
|---|---|---|
| Are all values the same length? | yes (e.g., country code, US state, ISBN-13) | no |
| Will the column be in a covering index hot path? | yes (fixed-width rows pack denser in B-trees) | not necessarily |
| Is the value tiny (1-10 chars)? | yes (the 1-2 byte VARCHAR length prefix is overhead) | — |
| Is the value variable and over 30 chars? | — | yes |
| Is the column UTF8 with mixed-length content? | — | yes (CHAR wastes padding bytes) |
Common right call: CHAR(2) for ISO country codes, CHAR(36) for stringified UUIDs (or better: BINARY(16) — see below), VARCHAR(255) for everything else short, TEXT for "user-typed paragraphs".
DATETIME vs TIMESTAMP: when to use which
| Concern | DATETIME | TIMESTAMP |
|---|---|---|
| Storage | 5 bytes | 4 bytes |
| Range | 1000-01-01 to 9999-12-31 | 1970-01-01 to 2038-01-19 |
| Timezone behavior | Stored as-is, no conversion | Converted to UTC on store, back to session timezone on read |
| Auto-update on row change | manual | ON UPDATE CURRENT_TIMESTAMP available |
| 2038 problem | unaffected | affected (Y2038) |
Use TIMESTAMP for created_at / updated_at audit columns where you want auto-updating and timezone awareness. Use DATETIME for user-facing dates that have semantic meaning ("event start time", "subscription renewal") — they shouldn't shift if the server timezone changes.
For the row size calculation in How to Change a MySQL Column Type, TIMESTAMP to DATETIME is a +1 byte schema change.
INT vs BIGINT: picking the right integer size
Cost of using BIGINT instead of INT for a primary key on a 1B-row table: 4 bytes × 1 billion = ~4GB extra in the primary index, plus every secondary index that includes the PK. That's real money on a hot table.
When INT is enough: anything where the maximum count won't exceed 2.1 billion. Most user tables, order tables, transaction tables.
When BIGINT is required: distributed ID generators (Snowflake, ULID-as-int), high-volume event/log streams (Kafka offsets, click events), anything tracking nanosecond timestamps as integers.
When TINYINT(1) or BIT(1): boolean flags. TINYINT(1) is the more portable convention; MySQL's BOOLEAN is an alias for TINYINT(1).
UUID storage: BINARY(16) vs CHAR(36)
| Storage option | Bytes per row | Index size for 1B rows | Notes |
|---|---|---|---|
CHAR(36) (stringified) | 36 bytes (latin1) or 144 bytes (utf8mb4) | 36-144 GB | Human-readable, easy debugging |
BINARY(16) (raw bytes) | 16 bytes | 16 GB | 2.25-9× smaller, no readable in queries |
For high-cardinality tables, store UUIDs as BINARY(16) and use UUID_TO_BIN() / BIN_TO_UUID() (MySQL 8.0+) to convert at the SQL layer. The swap_flag argument to UUID_TO_BIN(uuid, 1) rearranges UUID v1 bytes so the high-entropy bits sort time-locally, which dramatically improves B-tree insert performance.
CREATE TABLE orders (
id BINARY(16) PRIMARY KEY,
-- ...
);
INSERT INTO orders (id, ...) VALUES (UUID_TO_BIN(UUID(), 1), ...);
SELECT BIN_TO_UUID(id, 1) AS uuid FROM orders;For more on schema-design trade-offs in MySQL, see the MySQL Cheat Sheet.
NDB storage engine considerations
The NDB Cluster storage engine has its own alignment and overhead rules that don't apply to InnoDB:
- All column data is aligned to 4 bytes.
- Columns allowing
NULLrequire an additional 4 bytes per row for the first 32 nullable columns. BIT(M)values are stored in 4-byte chunks (1-32 bits per chunk).- The maximum row size in NDB is 14,000 bytes (lower than InnoDB's 65,535).
NDB is rarely the right choice for typical web workloads. If you're not running NDB, ignore this section.
Notes on storage overhead
- Multibyte characters.
utf8mb4columns reserve up to 4 bytes per character. Size string columns based on character count, but plan disk based on character × 4 bytes. - InnoDB row overhead. Each row carries a 6-byte transaction ID, a 7-byte rollback pointer, and per-column header bytes. The headline byte counts above are payload-only.
- Compression. InnoDB tables can use
ROW_FORMAT=COMPRESSEDto halve disk usage (at CPU cost). Not commonly used; faster storage has made it less attractive. - Maximum row size. 65,535 bytes per row, excluding BLOB/TEXT columns which are stored off-page (just an off-page pointer counts toward the row size).
- Off-page storage. When a row would exceed the page size (default 16KB), InnoDB moves large variable-length columns off-page. This adds an indirect read for those columns at query time.
What to do next
- For day-to-day schema changes (adding columns, changing types), the MySQL ADD COLUMN and MySQL CHANGE COLUMN TYPE walkthroughs cover the safe-online patterns.
- For datetime formatting at query time, MySQL DATE_FORMAT() reference.
- For the broader SQL syntax reference, the MySQL Cheat Sheet covers commands, types, and functions on one page.
- For semantic search using the new
VECTORtype, How to Add Semantic Search to a MySQL App.





