TechEarl

MySQL Data Types and Sizes: Complete Reference

Every MySQL data type with exact storage size, value range, and when to use it. Numeric, string, date/time, JSON, spatial, binary, and the new VECTOR type added in MySQL 9.0.

Ishan KarunaratneIshan Karunaratne⏱️ 15 min readUpdated
Every MySQL data type with exact storage size, range, and use case. INT, VARCHAR, DATETIME, JSON, VECTOR (MySQL 9.0+), plus CHAR-vs-VARCHAR and DATETIME-vs-TIMESTAMP decision guides.

MySQL has 30+ built-in data types across seven families: numeric, string, date/time, JSON, spatial, binary, and (in MySQL 9.0+) vector. Picking the right one matters at scale because smaller fixed-size columns mean smaller indexes, fewer I/O pages, and faster joins. This is the complete reference I keep open whenever I design a new schema: exact storage size in bytes, value range, and a one-line "when to use it" for every type. It's current to MySQL 9.x (May 2026).

What are MySQL data types?

A MySQL data type is the storage format for a column. It controls how many bytes each row spends on that column, what values are valid, and how MySQL compares and sorts them. Choosing well affects four things at once: disk usage (smaller types pack more rows per page), index performance (B-tree depth scales with row width), query correctness (TIMESTAMP wraps at 2038, INT overflows at ~2.1 billion), and application code (TEXT can't have a default value, JSON validates on insert). Use the smallest type that fits the actual range of values you'll store, prefer fixed-width over variable-width when sizes are predictable, and reach for VECTOR only with MySQL 9.0+ where the embedding workload genuinely needs in-database similarity search.

Jump to:

Numeric data types

TypeStorageSigned rangeUnsigned range
TINYINT1 byte-128 to 1270 to 255
SMALLINT2 bytes-32,768 to 32,7670 to 65,535
MEDIUMINT3 bytes-8,388,608 to 8,388,6070 to 16,777,215
INT / INTEGER4 bytes-2,147,483,648 to 2,147,483,6470 to 4,294,967,295
BIGINT8 bytes-9.22×10^18 to 9.22×10^180 to 1.84×10^19
DECIMAL(M,D) / NUMERIC≈ M/2 + 1 bytesfixed-point, exact precision
FLOAT4 bytes±3.4×10^38, ~7 decimal digits
DOUBLE / REAL8 bytes±1.8×10^308, ~15 decimal digits
BIT(M)(M+7)/8 bytes1 to 64-bit bitfield

Rules of thumb:

  • For monetary values, never use FLOAT/DOUBLE — they round. Use DECIMAL(M,D) with explicit precision (typically DECIMAL(19,4) for currency).
  • Unsigned only buys you a doubled positive range. Use it for counters and IDs that can never be negative; the signed/unsigned mismatch is a common source of join bugs.
  • INT(11) does not mean "11 digits". The number in parentheses is a display width hint that most clients ignore in MySQL 8.0+. Just write INT.

String data types

TypeStorageNotes
CHAR(M)M × charset bytes, paddedFixed length, right-padded with spaces. 0-255 chars.
VARCHAR(M)actual length + 1-2 bytes for length prefixVariable length, M from 0 to 65,535 chars (subject to row size limit).
TINYTEXTlength + 1 byteUp to 255 bytes. Stored off-page.
TEXTlength + 2 bytesUp to 65,535 bytes (64KB). Stored off-page.
MEDIUMTEXTlength + 3 bytesUp to 16,777,215 bytes (16MB).
LONGTEXTlength + 4 bytesUp to 4,294,967,295 bytes (4GB).
ENUM(...)1 or 2 bytesOne of up to 65,535 predefined string values.
SET(...)1, 2, 3, 4, or 8 bytesA subset of up to 64 predefined string values, stored as a bitmask.

Multibyte charsets matter for sizing. With utf8mb4 (the default since MySQL 8.0 and required for full Unicode including emoji), each character can take up to 4 bytes. A VARCHAR(255) column in utf8mb4 reserves up to 1,020 bytes per row plus the 1-2 byte length prefix. The same column in latin1 reserves up to 255 bytes. Choose your charset before sizing string columns.

TEXT columns can't have a default value (other than NULL). If your app needs an empty-string default, use VARCHAR instead.

Date and time data types

TypeStorageRangeNotes
YEAR1 byte1901 to 2155Stored as YYYY
DATE3 bytes1000-01-01 to 9999-12-31Stored as YYYY-MM-DD
TIME3 bytes + fractional-838:59:59 to 838:59:59Stored as HH:MM:SS[.fraction]
DATETIME5 bytes + fractional1000-01-01 00:00:00 to 9999-12-31 23:59:59Calendar local time, no timezone
TIMESTAMP4 bytes + fractional1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTCConverted to UTC on store, back on read

The "+ fractional" overhead is 0-3 bytes depending on the precision: DATETIME(0) adds 0 bytes, DATETIME(3) (millisecond) adds 2 bytes, DATETIME(6) (microsecond) adds 3 bytes.

For the difference between DATETIME and TIMESTAMP, see the DATETIME vs TIMESTAMP decision section below — short version: TIMESTAMP is half the storage and timezone-aware but capped at 2038; DATETIME is wider, dumber, and safe past 2038.

For formatting date/time columns at query time, the full set of DATE_FORMAT() specifiers is in How to Format DATETIME in MySQL.

JSON data type

TypeStorageNotes
JSON≈ LONGBLOB + binary-encoded structure overheadValidated on insert. Supports JSON_EXTRACT, ->, ->>, generated columns.

MySQL stores JSON in a parsed binary format (not as text), so reads are fast but inserts are slightly slower than equivalent LONGTEXT. Use JSON when you genuinely need ad-hoc schema, indexing via generated columns, or path-based queries. For everything else, normalise.

JSON columns count toward the row size limit indirectly — the bulk of the JSON sits off-page like TEXT/BLOB, but each row has a pointer + small metadata in the primary record.

VECTOR data type (MySQL 9.0+)

TypeStorageNotes
VECTOR(N)4 × N bytes + small headerStores a fixed-dimension vector of single-precision floats. N is the dimension count (e.g., 1024 for Voyage-3, 1536 for OpenAI text-embedding-3-small).

Added in MySQL 9.0 as the native primitive for semantic search and RAG workloads. Pair with VEC_DISTANCE_COSINE() and VEC_FROM_TEXT() for similarity queries against text or image embeddings stored alongside source data.

A 1024-dimension vector occupies ~4KB per row. A 3072-dimension vector occupies ~12KB. Plan storage accordingly — a 1M-row table with VECTOR(1024) adds ~4GB just for the embedding column.

For the full integration walkthrough (embedding model choice, backfill scripts, similarity query syntax), see How to Add Semantic Search to a MySQL App.

Spatial data types

TypeStorageNotes
GEOMETRY25+ bytesGeneric container for any spatial type below.
POINT25 bytesA single (x, y) coordinate pair.
LINESTRING33+ bytesA sequence of points forming a line.
POLYGON65+ bytesA closed ring of points forming an area.
MULTIPOINT / MULTILINESTRING / MULTIPOLYGON / GEOMETRYCOLLECTIONvariesCollections of the above.

The "25+" overhead is the SRID (4 bytes) + WKB header + the actual coordinate data. For GIS workloads, add a SPATIAL INDEX for fast bounding-box and proximity queries.

Binary data types

TypeStorageNotes
BINARY(M)M bytes, zero-paddedFixed-length raw bytes.
VARBINARY(M)actual length + 1-2 bytesVariable-length raw bytes, up to 65,535.
TINYBLOBlength + 1 byteUp to 255 bytes.
BLOBlength + 2 bytesUp to 65,535 bytes (64KB).
MEDIUMBLOBlength + 3 bytesUp to 16,777,215 bytes (16MB).
LONGBLOBlength + 4 bytesUp to 4,294,967,295 bytes (4GB).

For hashes (SHA-256, BLAKE3), use BINARY(32) — it's exactly the right size and indexes cleanly.

For file storage, don't. Store files in object storage (S3, R2) and keep only the key/URL in MySQL. BLOB columns bloat backups and replication.

CHAR vs VARCHAR: when to use which

QuestionPick CHARPick VARCHAR
Are all values the same length?yes (e.g., country code, US state, ISBN-13)no
Will the column be in a covering index hot path?yes (fixed-width rows pack denser in B-trees)not necessarily
Is the value tiny (1-10 chars)?yes (the 1-2 byte VARCHAR length prefix is overhead)
Is the value variable and over 30 chars?yes
Is the column UTF8 with mixed-length content?yes (CHAR wastes padding bytes)

Common right call: CHAR(2) for ISO country codes, CHAR(36) for stringified UUIDs (or better: BINARY(16) — see below), VARCHAR(255) for everything else short, TEXT for "user-typed paragraphs".

DATETIME vs TIMESTAMP: when to use which

ConcernDATETIMETIMESTAMP
Storage5 bytes4 bytes
Range1000-01-01 to 9999-12-311970-01-01 to 2038-01-19
Timezone behaviorStored as-is, no conversionConverted to UTC on store, back to session timezone on read
Auto-update on row changemanualON UPDATE CURRENT_TIMESTAMP available
2038 problemunaffectedaffected (Y2038)

Use TIMESTAMP for created_at / updated_at audit columns where you want auto-updating and timezone awareness. Use DATETIME for user-facing dates that have semantic meaning ("event start time", "subscription renewal") — they shouldn't shift if the server timezone changes.

For the row size calculation in How to Change a MySQL Column Type, TIMESTAMP to DATETIME is a +1 byte schema change.

INT vs BIGINT: picking the right integer size

Cost of using BIGINT instead of INT for a primary key on a 1B-row table: 4 bytes × 1 billion = ~4GB extra in the primary index, plus every secondary index that includes the PK. That's real money on a hot table.

When INT is enough: anything where the maximum count won't exceed 2.1 billion. Most user tables, order tables, transaction tables.

When BIGINT is required: distributed ID generators (Snowflake, ULID-as-int), high-volume event/log streams (Kafka offsets, click events), anything tracking nanosecond timestamps as integers.

When TINYINT(1) or BIT(1): boolean flags. TINYINT(1) is the more portable convention; MySQL's BOOLEAN is an alias for TINYINT(1).

UUID storage: BINARY(16) vs CHAR(36)

Storage optionBytes per rowIndex size for 1B rowsNotes
CHAR(36) (stringified)36 bytes (latin1) or 144 bytes (utf8mb4)36-144 GBHuman-readable, easy debugging
BINARY(16) (raw bytes)16 bytes16 GB2.25-9× smaller, no readable in queries

For high-cardinality tables, store UUIDs as BINARY(16) and use UUID_TO_BIN() / BIN_TO_UUID() (MySQL 8.0+) to convert at the SQL layer. The swap_flag argument to UUID_TO_BIN(uuid, 1) rearranges UUID v1 bytes so the high-entropy bits sort time-locally, which dramatically improves B-tree insert performance.

sql
CREATE TABLE orders (
  id BINARY(16) PRIMARY KEY,
  -- ...
);

INSERT INTO orders (id, ...) VALUES (UUID_TO_BIN(UUID(), 1), ...);

SELECT BIN_TO_UUID(id, 1) AS uuid FROM orders;

For more on schema-design trade-offs in MySQL, see the MySQL Cheat Sheet.

NDB storage engine considerations

The NDB Cluster storage engine has its own alignment and overhead rules that don't apply to InnoDB:

  • All column data is aligned to 4 bytes.
  • Columns allowing NULL require an additional 4 bytes per row for the first 32 nullable columns.
  • BIT(M) values are stored in 4-byte chunks (1-32 bits per chunk).
  • The maximum row size in NDB is 14,000 bytes (lower than InnoDB's 65,535).

NDB is rarely the right choice for typical web workloads. If you're not running NDB, ignore this section.

Notes on storage overhead

  • Multibyte characters. utf8mb4 columns reserve up to 4 bytes per character. Size string columns based on character count, but plan disk based on character × 4 bytes.
  • InnoDB row overhead. Each row carries a 6-byte transaction ID, a 7-byte rollback pointer, and per-column header bytes. The headline byte counts above are payload-only.
  • Compression. InnoDB tables can use ROW_FORMAT=COMPRESSED to halve disk usage (at CPU cost). Not commonly used; faster storage has made it less attractive.
  • Maximum row size. 65,535 bytes per row, excluding BLOB/TEXT columns which are stored off-page (just an off-page pointer counts toward the row size).
  • Off-page storage. When a row would exceed the page size (default 16KB), InnoDB moves large variable-length columns off-page. This adds an indirect read for those columns at query time.

What to do next

FAQ

TagsMySQLDatabaseData TypesVARCHARINTDATETIMEJSONVECTORSchema Design
Share
Ishan Karunaratne

Ishan Karunaratne

Tech Architect · Software Engineer · AI/DevOps

Tech architect and software engineer with 20+ years building software, Linux systems, and DevOps infrastructure, and lately working AI into the stack. Currently Chief Technology Officer at a healthcare tech startup, which is where most of these field notes come from.

Keep reading

Related posts