TechEarl

MySQL Data Types and Sizes: Complete Reference

Every MySQL data type with exact storage size, value range, and when to use it. Numeric, string, date/time, JSON, spatial, binary, and the new VECTOR type added in MySQL 9.0.

Ishan Karunaratne⏱️ 23 min readUpdated
Share thisCopied
Every MySQL data type with exact storage size, range, and use case. INT, VARCHAR, DATETIME, JSON, VECTOR (MySQL 9.0+), plus CHAR-vs-VARCHAR and DATETIME-vs-TIMESTAMP decision guides.

MySQL has 30+ built-in data types across seven families: numeric, string, date/time, JSON, spatial, binary, and (in MySQL 9.0+) vector. Picking the right one matters at scale because smaller fixed-size columns mean smaller indexes, fewer I/O pages, and faster joins. This is the complete reference I keep open whenever I design a new schema: exact storage size in bytes, value range, and a one-line "when to use it" for every type. It's current to MySQL 9.x (May 2026).

What are MySQL data types?

A MySQL data type is the storage format for a column. It controls how many bytes each row spends on that column, what values are valid, and how MySQL compares and sorts them. Choosing well affects four things at once: disk usage (smaller types pack more rows per page), index performance (B-tree depth scales with row width), query correctness (TIMESTAMP wraps at 2038, INT overflows at ~2.1 billion), and application code (TEXT can't have a default value, JSON validates on insert). Use the smallest type that fits the actual range of values you'll store, prefer fixed-width over variable-width when sizes are predictable, and reach for VECTOR only with MySQL 9.0+ where the embedding workload genuinely needs in-database similarity search.

Every MySQL data type by storage size

The fastest way to size a schema is to read the byte cost straight off a list. The smallest MySQL types take 1 byte (TINYINT, BOOL, YEAR); the largest fixed types take 8 bytes (BIGINT, DOUBLE); and the variable-length TEXT/BLOB family runs from 256 bytes up to 4 GB, stored off-page. Here is every built-in type ordered smallest to largest, with the exact storage and what it holds:

TypeStorage (bytes)Holds
TINYINT / BOOL1Integer -128 to 127 (0 to 255 unsigned). BOOL is an alias for TINYINT(1).
YEAR1Year 1901 to 2155
ENUM(...)1 to 2One of up to 65,535 predefined labels
SMALLINT2Integer -32,768 to 32,767 (0 to 65,535 unsigned)
MEDIUMINT3Integer -8.4M to 8.4M (0 to 16.7M unsigned)
DATE31000-01-01 to 9999-12-31
TIME3 (+0 to 3 fractional)-838:59:59 to 838:59:59
INT / INTEGER4Integer -2.1B to 2.1B (0 to 4.29B unsigned)
FLOAT4Approximate float, ~7 significant digits
TIMESTAMP4 (+0 to 3 fractional)1970-01-01 to 2038-01-19 UTC
DATETIME5 (+0 to 3 fractional)1000-01-01 to 9999-12-31, no timezone
BIGINT8Integer -9.2×10^18 to 9.2×10^18
DOUBLE / REAL8Approximate float, ~15 significant digits
BIT(M)(M+7)/8 (1 to 8)Bitfield of 1 to 64 bits
SET(...)1, 2, 3, 4, or 8Subset of up to 64 labels, stored as a bitmask
DECIMAL(M,D)≈ M/2 + 1Exact fixed-point number
CHAR(M)M × charset bytesFixed-length string, 0 to 255 chars
VARCHAR(M)length + 1 to 2Variable string up to the 65,535-byte row limit
TINYTEXT / TINYBLOBlength + 1Up to 255 bytes
TEXT / BLOBlength + 2Up to 65,535 bytes (64 KB)
MEDIUMTEXT / MEDIUMBLOBlength + 3Up to 16,777,215 bytes (16 MB)
LONGTEXT / LONGBLOBlength + 4Up to 4,294,967,295 bytes (4 GB)
JSON≈ document + small overheadBinary-encoded JSON, bulk stored off-page
POINT / GEOMETRY / …25+Spatial geometry (SRID + WKB)
VECTOR(N)4 × N + headerN single-precision floats (MySQL 9.0+)

The exact ranges, charset effects, and per-family caveats are broken out in the sections below. The headline numbers above are payload-only; InnoDB row overhead adds a fixed per-row cost on top.

Jump to:

Numeric data types

TypeStorageSigned rangeUnsigned range
TINYINT1 byte-128 to 1270 to 255
SMALLINT2 bytes-32,768 to 32,7670 to 65,535
MEDIUMINT3 bytes-8,388,608 to 8,388,6070 to 16,777,215
INT / INTEGER4 bytes-2,147,483,648 to 2,147,483,6470 to 4,294,967,295
BIGINT8 bytes-9.22×10^18 to 9.22×10^180 to 1.84×10^19
DECIMAL(M,D) / NUMERIC≈ M/2 + 1 bytesfixed-point, exact precision,
FLOAT4 bytes±3.4×10^38, ~7 decimal digits,
DOUBLE / REAL8 bytes±1.8×10^308, ~15 decimal digits,
BIT(M)(M+7)/8 bytes1 to 64-bit bitfield,

Rules of thumb:

  • For monetary values, never use FLOAT/DOUBLE, they round. Use DECIMAL(M,D) with explicit precision (typically DECIMAL(19,4) for currency). The full breakdown of the DECIMAL-vs-integer-cents trade-off for storing money covers when each is the right call.
  • Unsigned only buys you a doubled positive range. Use it for counters and IDs that can never be negative; the signed/unsigned mismatch is a common source of join bugs.
  • INT(11) does not mean "11 digits". The number in parentheses is a display width hint that most clients ignore in MySQL 8.0+. Just write INT.

Those ranges are hard limits, not suggestions. With strict mode on (the default since MySQL 8.0), writing one past the top of a type is an error, not a silent wrap:

MySQL session showing TINYINT accepting 127 but rejecting 128 with ERROR 1264 Out of range, TINYINT UNSIGNED accepting 255, and INT rejecting 2147483648 (one past its signed maximum).
Strict mode turns an out-of-range value into ERROR 1264 rather than a silent truncation. TINYINT caps at 127 (255 unsigned); INT caps at 2,147,483,647. Real output from MySQL 8.0.

String data types

TypeStorageNotes
CHAR(M)M × charset bytes, paddedFixed length, right-padded with spaces. 0-255 chars.
VARCHAR(M)actual length + 1-2 bytes for length prefixVariable length, M from 0 to 65,535 chars (subject to row size limit).
TINYTEXTlength + 1 byteUp to 255 bytes. Stored off-page.
TEXTlength + 2 bytesUp to 65,535 bytes (64KB). Stored off-page.
MEDIUMTEXTlength + 3 bytesUp to 16,777,215 bytes (16MB).
LONGTEXTlength + 4 bytesUp to 4,294,967,295 bytes (4GB).
ENUM(...)1 or 2 bytesOne of up to 65,535 predefined string values.
SET(...)1, 2, 3, 4, or 8 bytesA subset of up to 64 predefined string values, stored as a bitmask.

Multibyte charsets matter for sizing. With utf8mb4 (the default since MySQL 8.0 and required for full Unicode including emoji), each character can take up to 4 bytes. A VARCHAR(255) column in utf8mb4 reserves up to 1,020 bytes per row plus the 1-2 byte length prefix. The same column in latin1 reserves up to 255 bytes. Choose your charset before sizing string columns.

Two of the most common short-string columns have their own length-and-collation walkthroughs: storing an email address in MySQL (why VARCHAR(255) and which collation to index it under), and storing a phone number in MySQL (the E.164 VARCHAR pattern and why you never store a phone number as an integer).

TEXT columns can't have a default value (other than NULL). If your app needs an empty-string default, use VARCHAR instead. For the choice between a TEXT family type and a BLOB for bulkier content, see storing large text vs BLOB in MySQL.

Date and time data types

TypeStorageRangeNotes
YEAR1 byte1901 to 2155Stored as YYYY
DATE3 bytes1000-01-01 to 9999-12-31Stored as YYYY-MM-DD
TIME3 bytes + fractional-838:59:59 to 838:59:59Stored as HH:MM:SS[.fraction]
DATETIME5 bytes + fractional1000-01-01 00:00:00 to 9999-12-31 23:59:59Calendar local time, no timezone
TIMESTAMP4 bytes + fractional1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTCConverted to UTC on store, back on read

The "+ fractional" overhead is 0-3 bytes depending on the precision: DATETIME(0) adds 0 bytes, DATETIME(3) (millisecond) adds 2 bytes, DATETIME(6) (microsecond) adds 3 bytes.

For the difference between DATETIME and TIMESTAMP, see the DATETIME vs TIMESTAMP decision section below: short version: TIMESTAMP is half the storage and timezone-aware but capped at 2038; DATETIME is wider, dumber, and safe past 2038.

For formatting date/time columns at query time, the full set of DATE_FORMAT() specifiers is in How to Format DATETIME in MySQL.

JSON data type

TypeStorageNotes
JSON≈ LONGBLOB + binary-encoded structure overheadValidated on insert. Supports JSON_EXTRACT, ->, ->>, generated columns.

MySQL stores JSON in a parsed binary format (not as text), so reads are fast but inserts are slightly slower than equivalent LONGTEXT. Use JSON when you genuinely need ad-hoc schema, indexing via generated columns, or path-based queries. For everything else, normalise. The full case for and against the native type, indexing, generated columns, when to reach for it over a normalised table, is in storing JSON in MySQL.

JSON columns count toward the row size limit indirectly, the bulk of the JSON sits off-page like TEXT/BLOB, but each row has a pointer + small metadata in the primary record.

VECTOR data type (MySQL 9.0+)

TypeStorageNotes
VECTOR(N)4 × N bytes + small headerStores a fixed-dimension vector of single-precision floats. N is the dimension count (e.g., 1024 for Voyage-3, 1536 for OpenAI text-embedding-3-small).

Added in MySQL 9.0 as the native primitive for semantic search and RAG workloads. You load and read vectors with STRING_TO_VECTOR() (alias TO_VECTOR()) to turn a [1.05, -17.8, 32]-style string into the binary column value, VECTOR_TO_STRING() (alias FROM_VECTOR()) to read it back, and VECTOR_DIM() to get the dimension count.

The catch for similarity search: the DISTANCE() function (with COSINE, DOT, or EUCLIDEAN) is only available in HeatWave MySQL on OCI, it is not part of MySQL Community or Commercial. On a self-managed community server you can store and retrieve vectors natively, but the distance computation has to happen in your application (or via HeatWave). Don't plan a community-MySQL RAG pipeline around an in-database DISTANCE() that isn't there.

A 1024-dimension vector occupies ~4KB per row. A 3072-dimension vector occupies ~12KB. Plan storage accordingly, a 1M-row table with VECTOR(1024) adds ~4GB just for the embedding column.

For the full integration walkthrough (embedding model choice, backfill scripts, similarity query syntax), see How to Add Semantic Search to a MySQL App.

Spatial data types

TypeStorageNotes
GEOMETRY25+ bytesGeneric container for any spatial type below.
POINT25 bytesA single (x, y) coordinate pair.
LINESTRING33+ bytesA sequence of points forming a line.
POLYGON65+ bytesA closed ring of points forming an area.
MULTIPOINT / MULTILINESTRING / MULTIPOLYGON / GEOMETRYCOLLECTIONvariesCollections of the above.

The "25+" overhead is the SRID (4 bytes) + WKB header + the actual coordinate data. For GIS workloads, add a SPATIAL INDEX for fast bounding-box and proximity queries.

Binary data types

TypeStorageNotes
BINARY(M)M bytes, zero-paddedFixed-length raw bytes.
VARBINARY(M)actual length + 1-2 bytesVariable-length raw bytes, up to 65,535.
TINYBLOBlength + 1 byteUp to 255 bytes.
BLOBlength + 2 bytesUp to 65,535 bytes (64KB).
MEDIUMBLOBlength + 3 bytesUp to 16,777,215 bytes (16MB).
LONGBLOBlength + 4 bytesUp to 4,294,967,295 bytes (4GB).

For hashes, use BINARY(N) at the digest's exact byte width, it indexes cleanly and halves the size of the equivalent hex CHAR column. The two common cases each have a full walkthrough: storing a SHA-256 hash in BINARY(32), and storing an MD5 hash in BINARY(16), both covering the UNHEX()/HEX() round-trip and the indexed generated-column pattern for hashing a long column.

Password hashes are the exception, don't use BINARY(N). A bcrypt or Argon2 hash is a self-describing ASCII string (algorithm, cost, salt, and digest all in one), so it goes in a VARCHAR, not a fixed binary column. The right column width is covered in the password column type for MySQL, with per-algorithm detail in storing a bcrypt hash and storing an Argon2 hash.

For file storage, don't. Store files in object storage (S3, R2) and keep only the key/URL in MySQL. BLOB columns bloat backups and replication.

CHAR vs VARCHAR: when to use which

QuestionPick CHARPick VARCHAR
Are all values the same length?yes (e.g., country code, US state, ISBN-13)no
Will the column be in a covering index hot path?yes (fixed-width rows pack denser in B-trees)not necessarily
Is the value tiny (1-10 chars)?yes (the 1-2 byte VARCHAR length prefix is overhead),
Is the value variable and over 30 chars?,yes
Is the column UTF8 with mixed-length content?,yes (CHAR wastes padding bytes)

Common right call: CHAR(2) for ISO country codes, CHAR(36) for stringified UUIDs (or better: BINARY(16), see below), VARCHAR(255) for everything else short, TEXT for "user-typed paragraphs".

DATETIME vs TIMESTAMP: when to use which

ConcernDATETIMETIMESTAMP
Storage5 bytes4 bytes
Range1000-01-01 to 9999-12-311970-01-01 to 2038-01-19
Timezone behaviorStored as-is, no conversionConverted to UTC on store, back to session timezone on read
Auto-update on row changemanualON UPDATE CURRENT_TIMESTAMP available
2038 problemunaffectedaffected (Y2038)

Use TIMESTAMP for created_at / updated_at audit columns where you want auto-updating and timezone awareness. Use DATETIME for user-facing dates that have semantic meaning ("event start time", "subscription renewal"): they shouldn't shift if the server timezone changes. For the end-to-end pattern (which type to pick, and how to handle time zones at insert and read time), see storing dates, times, and time zones in MySQL.

The 2038 cap is the gotcha that bites in production. TIMESTAMP accepts 2038-01-19 03:14:07 but rejects one second later; DATETIME happily stores dates into the year 9999:

MySQL session showing TIMESTAMP accepting 2038-01-19 03:14:07 but rejecting 2038-01-19 03:14:08 with ERROR 1292 Incorrect datetime value, while DATETIME accepts 9999-12-31 23:59:59.
TIMESTAMP hits its Y2038 wall one second after 2038-01-19 03:14:07 UTC (ERROR 1292); DATETIME stores dates to 9999-12-31. Real output from MySQL 8.0.

For the row size calculation in How to Change a MySQL Column Type, TIMESTAMP to DATETIME is a +1 byte schema change.

INT vs BIGINT: picking the right integer size

Cost of using BIGINT instead of INT for a primary key on a 1B-row table: 4 bytes × 1 billion = ~4GB extra in the primary index, plus every secondary index that includes the PK. That's real money on a hot table.

When INT is enough: anything where the maximum count won't exceed 2.1 billion. Most user tables, order tables, transaction tables.

When BIGINT is required: distributed ID generators (Snowflake, ULID-as-int), high-volume event/log streams (Kafka offsets, click events), anything tracking nanosecond timestamps as integers.

When TINYINT(1) or BIT(1): boolean flags. TINYINT(1) is the more portable convention; MySQL's BOOLEAN is an alias for TINYINT(1). The two options are weighed side by side in storing a boolean in MySQL: TINYINT(1) vs BIT.

UUID storage: BINARY(16) vs CHAR(36)

Storage optionBytes per rowIndex size for 1B rowsNotes
CHAR(36) (stringified)36 bytes (latin1) or 144 bytes (utf8mb4)36-144 GBHuman-readable, easy debugging
BINARY(16) (raw bytes)16 bytes16 GB2.25-9× smaller, no readable in queries

For high-cardinality tables, store UUIDs as BINARY(16) and use UUID_TO_BIN() / BIN_TO_UUID() (MySQL 8.0+) to convert at the SQL layer. The swap_flag argument to UUID_TO_BIN(uuid, 1) rearranges UUID v1 bytes so the high-entropy bits sort time-locally, which dramatically improves B-tree insert performance. The full pattern, including the v1-vs-v4 sort-order gotcha and the generated-column trick for keeping a readable copy, is in storing a UUID in MySQL as BINARY(16).

sql
CREATE TABLE orders (
  id BINARY(16) PRIMARY KEY,
  -- ...
);

INSERT INTO orders (id, ...) VALUES (UUID_TO_BIN(UUID(), 1), ...);

SELECT BIN_TO_UUID(id, 1) AS uuid FROM orders;

For more on schema-design trade-offs in MySQL, see the MySQL Cheat Sheet.

Worked example: picking the right type for every column

Rules in isolation are easy to forget; here is a realistic users table where every column is the smallest correct type, with the reasoning inline:

sql
CREATE TABLE users (
  id            BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  public_id     BINARY(16) NOT NULL UNIQUE,        -- a UUID as 16 raw bytes, not a 36-byte string
  email         VARCHAR(255) NOT NULL UNIQUE,       -- emails vary in length; index needs the real value
  display_name  VARCHAR(80) NOT NULL,               -- short, but not a fixed width -> VARCHAR
  country_code  CHAR(2) NOT NULL,                   -- always exactly 2 chars (ISO 3166-1) -> CHAR is denser
  is_active     TINYINT(1) NOT NULL DEFAULT 1,      -- boolean flag, 1 byte
  plan          ENUM('free','pro','team') NOT NULL DEFAULT 'free', -- fixed vocabulary, 1 byte
  balance_cents BIGINT NOT NULL DEFAULT 0,          -- money as integer cents, never FLOAT/DOUBLE
  settings      JSON DEFAULT NULL,                  -- sparse per-user prefs, queried by path
  created_at    TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at    TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  trial_ends_at DATETIME NULL                       -- user-facing date, must survive past 2038
) ENGINE=InnoDB;

Why each pick:

  • id is BIGINT UNSIGNED, not INT. A surrogate primary key on a table you expect to grow should not cap at 4.2 billion. The 4 extra bytes per row are worth never having to migrate a hot PK.
  • public_id is BINARY(16), not CHAR(36). The id you expose in URLs is a UUID, but stored raw it is 16 bytes and indexes cleanly. See UUID storage.
  • country_code is CHAR(2), display_name is VARCHAR(80). Fixed-width value goes in CHAR; genuinely variable value goes in VARCHAR. Mixing them up either wastes padding bytes or pays a needless length prefix.
  • balance_cents is BIGINT, never FLOAT/DOUBLE. Money in a binary float rounds and drifts; store integer cents (or DECIMAL(19,4) if you need fractional units).
  • created_at/updated_at are TIMESTAMP; trial_ends_at is DATETIME. Audit columns want UTC normalization and ON UPDATE; a user-facing future date must not hit the 2038 wall.
  • plan is ENUM, settings is JSON. A closed set of labels costs 1 byte as an ENUM; open-ended sparse data goes in JSON and is queried by path. ENUM is not always the right call for a fixed vocabulary, though, ENUM vs a lookup table in MySQL walks through when the labels change often enough that a reference table wins.

NDB storage engine considerations

The NDB Cluster storage engine has its own alignment and overhead rules that don't apply to InnoDB:

  • All column data is aligned to 4 bytes.
  • Columns allowing NULL require an additional 4 bytes per row for the first 32 nullable columns.
  • BIT(M) values are stored in 4-byte chunks (1-32 bits per chunk).
  • The maximum row size in NDB is 14,000 bytes (lower than InnoDB's 65,535).

NDB is rarely the right choice for typical web workloads. If you're not running NDB, ignore this section.

Notes on storage overhead

  • Multibyte characters. utf8mb4 columns reserve up to 4 bytes per character. Size string columns based on character count, but plan disk based on character × 4 bytes.
  • InnoDB row overhead. Each row carries a 6-byte transaction ID, a 7-byte rollback pointer, and per-column header bytes. The headline byte counts above are payload-only.
  • Compression. InnoDB tables can use ROW_FORMAT=COMPRESSED to halve disk usage (at CPU cost). Not commonly used; faster storage has made it less attractive.
  • Maximum row size. 65,535 bytes per row, excluding BLOB/TEXT columns which are stored off-page (just an off-page pointer counts toward the row size).
  • Off-page storage. When a row would exceed the page size (default 16KB), InnoDB moves large variable-length columns off-page. This adds an indirect read for those columns at query time.

What to do next

FAQ

Sources

Authoritative references this article was fact-checked against.

TagsMySQLDatabaseData TypesVARCHARINTDATETIMEJSONVECTORSchema Design

Found this useful? Pass it on.

Copied

Ishan Karunaratne

Software Systems Architect · Senior Software Engineer · Engineering Leadership

Software systems architect and senior software engineer with more than two decades designing, building, and running production software, Linux systems, and DevOps infrastructure, and lately working AI into the stack. Now a CTO, though what I write here is drawn from the full arc of that work, across architecture, engineering, and operations, not any single job.

Keep reading

Related posts