What’s the Most a QR Code Can Actually Hold? (And the ECC Trade-Off Nobody Explains)

Published on May 22, 2026 by The Kestrel Tools Team • 9 min read

A project called ShadowCat hit the Hacker News front page this week — a single HTML file that streams arbitrary files between two browsers using nothing but QR codes. One window flickers through frames at 10 FPS. The other points a webcam at the screen and reassembles the file from base64 chunks. It works. People in the comments started asking the obvious question and finding out almost nobody can answer it cleanly: how much data does a single QR code actually hold, and what does it cost to make it survive a smudge?

The QR code data capacity vs error correction trade-off is one of those topics that’s easy to half-explain and hard to get right. Most blog posts quote a single number (“4,296 alphanumeric characters!”) and stop. The real answer involves 40 versions, four error correction levels, four character modes, and a Reed–Solomon math budget that gets divided between your payload and your damage tolerance. Pick the wrong combination and your QR code is either too fragile to scan or too dense to fit on a business card.

This post walks through the actual numbers, explains why higher error correction eats payload but tolerates more camera distortion, shows the exact capacity table for the four ECC levels, and ends with the practical rule for picking one in 2026.

A direct answer: what is the maximum data a QR code can hold?

A single QR code can hold up to 7,089 numeric digits, 4,296 alphanumeric characters, 2,953 raw bytes, or 1,817 Kanji characters — but only at the largest version (40, a 177×177 module grid) with the lowest error correction level (L, ~7% recovery). Crank error correction up to H (~30% recovery) and the same version-40 grid drops to 3,057 numeric digits, 1,852 alphanumeric characters, or 1,273 raw bytes — roughly 43% of the L-level payload. Most real-world QR codes are version 1–10 and carry a URL of 20–80 bytes, which is why you almost never hit the ceiling.

The key insight is that those four numbers aren’t independent. The QR symbol is a fixed-size grid of modules (the black-and-white squares). Every grid has a total module budget. That budget gets split between three things: the data payload, the Reed–Solomon error correction codewords, and a small amount of fixed overhead (finder patterns, timing patterns, format and version info). When you raise the error correction level, you’re not getting a free upgrade — you’re moving codewords from the payload column into the recovery column.

The four error correction levels

The ISO/IEC 18004 spec defines four error correction levels, each rated by the approximate fraction of a symbol's codewords it can recover after damage:

Level | Recovery capacity | Use case
L (Low) | ~7% of codewords | Clean digital displays, screen-to-screen transfer, minimal payload overhead
M (Medium) | ~15% of codewords | Default for most generators. Print on clean paper, scanned indoors.
Q (Quartile) | ~25% of codewords | Industrial labels, packaging, mild dirt or curvature
H (High) | ~30% of codewords | Heavily branded codes (logo overlay), outdoor signage, restaurant tables

A few details that the marketing pages skip:

  • The percentages are codeword recovery, not bit recovery. A codeword is 8 bits. If a codeword is partially correct, Reed–Solomon still treats the whole thing as one error to repair.
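  • The percentages are maximum recovery, not guaranteed under all damage patterns. An error at an unknown position costs two ECC codewords to repair, while an erasure (a codeword the decoder already knows is unreadable) costs one, so a continuous smudge and a patch of known-missing modules consume the budget at different rates.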
  • The level you pick must be readable from the QR code itself — the format information block in the corner near the finder patterns encodes the ECC level so the scanner knows how to decode the rest.

The “H means 30%” claim deserves a footnote: H provides up to ~30% recovery of the codewords that carry data and ECC. The fixed overhead modules (timing patterns, finder patterns) aren’t recoverable — if those are damaged, no ECC level helps.

Why higher ECC eats your payload

Reed–Solomon error correction works by treating your data as a polynomial over a finite field (GF(256), specifically) and appending redundant symbols so the decoder can detect and reconstruct corrupted ones. The math says: to correct t symbol errors at unknown positions, you need 2t ECC symbols. To correct 30% of the codewords in a block, roughly 60% of that block has to be ECC, leaving about 40% for data, which is why H-level is so payload-expensive.
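A quick way to sanity-check that budget is to compute it for a single Reed–Solomon block. The sketch below assumes only the textbook 2t rule and ignores the handful of codewords the QR spec reserves against misdecoding in small versions; the function name and the 100-codeword block are illustrative, not taken from the spec.

```python
def rs_budget(block_size: int, correctable_percent: int) -> dict:
    """Split one Reed-Solomon block into data and parity symbols.

    Assumes the textbook bound: correcting t symbol errors at unknown
    positions needs 2 * t parity symbols.
    """
    t = block_size * correctable_percent // 100  # errors we want to survive
    parity = 2 * t                               # parity symbols required
    data = block_size - parity                   # what is left for payload
    if data <= 0:
        raise ValueError("no room left for data at this correction level")
    return {"t": t, "parity": parity, "data": data}


# A hypothetical 100-codeword block at roughly H-level tolerance.
print(rs_budget(100, 30))  # {'t': 30, 'parity': 60, 'data': 40}
```

At 30% tolerance only 40 of every 100 codewords carry payload, which lines up with H holding roughly 43% of what L holds.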

Here’s the version-40 picture in concrete numbers. A version-40 QR code has 3,706 total codewords available for data + ECC once the function patterns and the format and version info modules are set aside. Then:

  • L-level spends 750 codewords on ECC, leaving 2,956 for data → 2,953 raw bytes after a 3-byte mode header.
  • M-level spends 1,372 codewords on ECC, leaving 2,334 for data → 2,331 raw bytes.
  • Q-level spends 2,040 codewords on ECC, leaving 1,666 for data → 1,663 raw bytes.
  • H-level spends 2,430 codewords on ECC, leaving 1,276 for data → 1,273 raw bytes.

The pattern repeats at every version. Going from L to H costs you roughly 57% of your payload in exchange for damage tolerance. There’s no in-between — the spec only defines those four levels.
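The arithmetic in that list can be reproduced in a few lines. This sketch hard-codes the version-40 figures quoted above and models the byte-mode header as a 4-bit mode indicator plus a 16-bit length field, which rounds up to the 3 bytes mentioned; the constant and function names are illustrative.

```python
# Version-40 figures quoted in the list above.
TOTAL_CODEWORDS_V40 = 3706
ECC_CODEWORDS_V40 = {"L": 750, "M": 1372, "Q": 2040, "H": 2430}

MODE_INDICATOR_BITS = 4   # byte-mode indicator
CHAR_COUNT_BITS = 16      # byte-mode length field for versions 10-40


def byte_capacity_v40(level: str) -> int:
    """Whole bytes of payload that fit in a version-40 symbol in byte mode."""
    data_codewords = TOTAL_CODEWORDS_V40 - ECC_CODEWORDS_V40[level]
    payload_bits = data_codewords * 8 - MODE_INDICATOR_BITS - CHAR_COUNT_BITS
    return payload_bits // 8


capacities = {level: byte_capacity_v40(level) for level in "LMQH"}
print(capacities)  # {'L': 2953, 'M': 2331, 'Q': 1663, 'H': 1273}

loss = 1 - capacities["H"] / capacities["L"]
print(f"L -> H payload loss: {loss:.0%}")  # L -> H payload loss: 57%
```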

The four character modes (and why mode matters as much as level)

QR codes don’t store “characters” — they store bits. The four standard encoding modes pack different character sets at different bit-densities:

Mode | Bits per character | Character set | Version-40 H capacity
Numeric | 3.33 bits/char (10 bits per 3 digits) | 0–9 | 3,057 digits
Alphanumeric | 5.5 bits/char (11 bits per 2 chars) | 0–9, A–Z (uppercase only), space, $%*+-./: | 1,852 chars
Byte | 8 bits/char | Any 8-bit value (UTF-8 by convention) | 1,273 bytes
Kanji | 13 bits/char | Shift-JIS double-byte chars | 784 chars

The practical implication: if you’re encoding a phone number, numeric mode gets you almost 2.4× the capacity of byte mode for the same QR code. If you’re encoding a URL, you’re probably in byte mode, but uppercase-only URLs (or URLs that fit alphanumeric mode’s restricted set) can encode in alphanumeric mode and gain ~45% capacity (8 bits per character versus 5.5). Most generators auto-detect the most efficient mode for your input. A few don’t, and on those you’ll see your QR code grow several versions larger than necessary.
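To make the bit-density difference concrete, here is a minimal sketch of mode selection and bit counting for a single-mode payload. It covers numeric, alphanumeric, and byte mode only (Kanji is left out), counts the character data without the mode and length header, and uses function names invented for this example; real generators may also mix modes within one code.

```python
ALPHANUMERIC_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def pick_mode(text: str) -> str:
    """Return the densest single mode that can represent the whole string."""
    if text and all(ch in "0123456789" for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC_CHARS for ch in text):
        return "alphanumeric"
    return "byte"


def payload_bits(text: str) -> int:
    """Bits used by the character data alone (no mode or length header)."""
    mode = pick_mode(text)
    n = len(text)
    if mode == "numeric":
        # 10 bits per 3 digits; a trailing group of 2 takes 7 bits, of 1 takes 4.
        return (n // 3) * 10 + (0, 4, 7)[n % 3]
    if mode == "alphanumeric":
        # 11 bits per pair; a trailing single character takes 6 bits.
        return (n // 2) * 11 + (n % 2) * 6
    return len(text.encode("utf-8")) * 8


for sample in ("4155550123", "HTTPS://EXAMPLE.COM", "https://example.com"):
    print(pick_mode(sample), payload_bits(sample))
# numeric 34
# alphanumeric 105
# byte 152
```

The same 19-character URL costs 105 bits in uppercase and 152 bits in lowercase, which is the 8-versus-5.5 ratio from the table.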

The version-to-capacity table (the part nobody tabulates)

A QR code’s “version” is its size: version 1 is 21×21 modules, version 40 is 177×177. Every version adds 4 modules per side. Here’s the byte-mode capacity at each ECC level for the versions you’ll actually see in the wild:

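Version | Size | L (bytes) | M (bytes) | Q (bytes) | H (bytes)
1 | 21×21 | 17 | 14 | 11 | 7
2 | 25×25 | 32 | 26 | 20 | 14
5 | 37×37 | 106 | 84 | 60 | 44
10 | 57×57 | 271 | 213 | 151 | 119
20 | 97×97 | 858 | 666 | 482 | 382
30 | 137×137 | 1,732 | 1,370 | 982 | 742
40 | 177×177 | 2,953 | 2,331 | 1,663 | 1,273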

A few things to read out of this table:

  • A version-1 QR code at H-level holds 7 bytes. Seven. That’s why heavily branded codes (the ones with a logo overlay in the middle) need to be physically larger — the generator picks H to survive the logo, then bumps the version up to fit your payload.
  • A typical short URL (https://kestreltools.com, 24 bytes) at M-level fits in version 2. That’s the QR code on the back of a business card.
  • A 280-character tweet (~280 bytes) at M-level needs version 12 (65×65 modules), since version 11 tops out at 251 bytes at M. That’s the practical ceiling for codes you’d point a phone at from across a conference room.
  • Beyond ~2 KB you’re in version-40 territory, which is too dense for most phone cameras to scan reliably from more than ~30 cm away. This is the wall that ShadowCat hits — and the reason it streams frames instead of trying to cram a 100 KB file into a single code.
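The table lookup above is easy to script. The sketch below hard-codes the byte-mode capacities for the seven versions listed, plus the 21-modules-plus-4-per-version size rule. Because it only knows those seven versions, it can overshoot: it reports the next listed version, not necessarily the true minimum. All names are illustrative.

```python
# Byte-mode capacities for the versions listed in the table above.
BYTE_CAPACITY = {
    1:  {"L": 17,   "M": 14,   "Q": 11,   "H": 7},
    2:  {"L": 32,   "M": 26,   "Q": 20,   "H": 14},
    5:  {"L": 106,  "M": 84,   "Q": 60,   "H": 44},
    10: {"L": 271,  "M": 213,  "Q": 151,  "H": 119},
    20: {"L": 858,  "M": 666,  "Q": 482,  "H": 382},
    30: {"L": 1732, "M": 1370, "Q": 982,  "H": 742},
    40: {"L": 2953, "M": 2331, "Q": 1663, "H": 1273},
}


def modules_per_side(version: int) -> int:
    """Version 1 is 21x21 and each version adds 4 modules per side."""
    return 17 + 4 * version


def smallest_listed_version(n_bytes: int, level: str):
    """Smallest version from the table that holds n_bytes, or None."""
    for version in sorted(BYTE_CAPACITY):
        if BYTE_CAPACITY[version][level] >= n_bytes:
            return version
    return None


print(smallest_listed_version(24, "M"))    # 2
print(smallest_listed_version(24, "H"))    # 5
print(smallest_listed_version(3000, "L"))  # None
print(modules_per_side(40))                # 177
```

The same 24-byte URL that fits version 2 at M jumps to the next listed version at H, which is the logo-overlay effect described in the first bullet.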

How ShadowCat (and other QR file-transfer tools) get past the single-frame limit

The single-code limit is ~3 KB at L-level. To transfer a real file you need either: (1) a much larger code, which the camera can’t resolve at distance; or (2) chunking across multiple frames. ShadowCat picks chunking.

The approach is straightforward and worth understanding because it’s the same pattern used by tools like qrcp, qrss, and the academic TXQR project:

  1. Base64-encode the file (so the binary content fits inside a string-safe payload).
  2. Split the encoded payload into chunks of N bytes (ShadowCat defaults to ~1,500 bytes per frame at L-level — comfortably under the version-40 ceiling).
  3. Prefix each chunk with a sequence number and a CRC32 checksum.
  4. Render the chunks as QR codes at a configurable FPS (10–30 fps in ShadowCat).
  5. The receiver runs continuous video decode, dedupes frames by sequence number, validates each chunk via CRC32, and reassembles the file once all chunks arrive.
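The five steps above can be sketched without any QR library by treating each frame as the text that would be rendered into one code. The frame layout here ("seq/total:crc32:chunk") is a hypothetical format chosen for readability, not ShadowCat's actual wire format, and the function names are made up for this example.

```python
import base64
import zlib
from typing import Dict, Iterable, List, Optional


def make_frames(data: bytes, chunk_size: int = 1500) -> List[str]:
    """Base64-encode data and split it into self-describing text frames."""
    encoded = base64.b64encode(data).decode("ascii")
    chunks = [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]
    frames = []
    for seq, chunk in enumerate(chunks):
        crc = zlib.crc32(chunk.encode("ascii"))
        frames.append(f"{seq}/{len(chunks)}:{crc:08x}:{chunk}")
    return frames


def reassemble(frames: Iterable[str]) -> Optional[bytes]:
    """Rebuild the file from frames seen in any order, possibly repeated.

    Returns None while chunks are still missing.
    """
    received: Dict[int, str] = {}
    total = None
    for frame in frames:
        try:
            header, crc_hex, chunk = frame.split(":", 2)
            seq_text, total_text = header.split("/")
            seq, total_seen = int(seq_text), int(total_text)
            crc = int(crc_hex, 16)
        except ValueError:
            continue  # not one of our frames, or a garbled decode
        if not 0 <= seq < total_seen:
            continue  # sequence number out of range
        if zlib.crc32(chunk.encode("ascii", "replace")) != crc:
            continue  # corrupted chunk: wait for it to come around again
        total = total_seen
        received.setdefault(seq, chunk)  # dedupe by sequence number
    if total is None or len(received) < total:
        return None
    encoded = "".join(received[i] for i in range(total))
    return base64.b64decode(encoded)


original = bytes(range(256)) * 20
frames = make_frames(original)
# Simulate a camera that sees frames out of order and more than once.
seen = frames[::-1] + frames[:2]
print(len(frames), reassemble(seen) == original)  # 5 True
```

Because each frame carries its own index and checksum, the receiver never needs to catch frames in order; it only needs to see every index once, so the sender can simply loop.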

Throughput math: at 10 FPS with 1,500 bytes per frame and ~33% base64 overhead, the best case is roughly 11 KB/s of original file content (10 × 1,500 × 3/4), assuming every frame decodes on the first pass. At that rate a 1 MB file takes about a minute and a half, and every missed frame stretches that out. It’s slow next to any network link, but it works through any air-gap that has a screen and a camera, which is the actual point.
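The best-case version of this calculation can be sketched as follows. It assumes every frame decodes exactly once and ignores the sequence and CRC prefix, so it is an upper bound; dropped or repeated frames pull real transfers below it. The function names are illustrative.

```python
def ideal_throughput(fps: float, bytes_per_frame: int) -> float:
    """Best-case bytes of original file content delivered per second.

    Base64 turns every 3 raw bytes into 4 characters, so only 3/4 of each
    frame is original content.
    """
    return fps * bytes_per_frame * 3 / 4


def transfer_seconds(file_bytes: int, fps: float, bytes_per_frame: int) -> float:
    """Best-case time to move file_bytes at the given frame rate and size."""
    return file_bytes / ideal_throughput(fps, bytes_per_frame)


print(ideal_throughput(10, 1500))                    # 11250.0
print(round(transfer_seconds(1_000_000, 10, 1500)))  # 89
```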

The per-frame ECC choice matters here too. ShadowCat defaults to L because its target is screen-to-screen transfer where the “damage” is camera blur, not paper smudges. On a clean LCD with a steady webcam, L’s 7% recovery is plenty. If you tried the same protocol with paper printouts under fluorescent light, you’d want at least M, and you’d take a 21% throughput hit.

The practical rule for picking an ECC level in 2026

For 95% of cases, the right answer is: use M-level and pick the smallest version that fits your data. M is the default in mainstream libraries such as python-qrcode and node-qrcode for a reason: it’s the right balance for printed-on-paper, scanned-with-a-phone reality.

The exceptions:

  • Use L when the code lives on a clean digital display, the scanner is a fixed device (kiosk, POS terminal, your own webcam), and you want the smallest possible code or the most data per frame.
  • Use Q when the code will be printed and exposed to mild wear — restaurant menus, conference badges, packaging that gets handled.
  • Use H when the code has a logo overlay covering up to ~25% of its area, or when it’ll be in a hostile environment (outdoor signage, dirty industrial surfaces, anywhere with weather). H is also the right pick for any code you can’t reprint cheaply.
  • Don’t pick H by default “just to be safe.” You’re paying for damage tolerance you don’t need with a much denser code that ironically scans worse on cheap cameras because the modules are smaller.
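If you are generating codes programmatically, the rule and its exceptions can be written down as a small helper. This is one possible reading of the guidance above, not a standard algorithm; the function and parameter names are hypothetical.

```python
def pick_ecc_level(*, on_screen: bool = False, fixed_scanner: bool = False,
                   has_logo: bool = False, harsh_environment: bool = False,
                   handled_often: bool = False) -> str:
    """Map the rule of thumb above onto a single ECC letter."""
    if has_logo or harsh_environment:
        return "H"  # logo overlays, outdoor or dirty surfaces
    if handled_often:
        return "Q"  # menus, badges, packaging that gets handled
    if on_screen and fixed_scanner:
        return "L"  # clean display and a fixed reader
    return "M"      # the default for print scanned with a phone


print(pick_ecc_level())                                    # M
print(pick_ecc_level(on_screen=True, fixed_scanner=True))  # L
print(pick_ecc_level(has_logo=True))                       # H
```

Note that the helper never returns H unless something specific calls for it, matching the last bullet.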

A working example

If you want to see the trade-off in your browser, try the Kestrel Tools QR Code Generator. Paste a URL or block of text, toggle through L → M → Q → H, and watch the version number climb (the code gets denser) as the ECC level rises for the same payload.

A concrete pair to try: paste your Twitter bio (typically ~150 chars) at L-level and a 150-byte input lands on version 7. Switch to H-level and the same input pushes you to version 12. Same content, denser code, ~30% damage tolerance instead of ~7%. The generator runs entirely client-side, so the URLs and text you’re testing never leave your machine, which is useful when the QR code is for an internal dashboard or a Wi-Fi password.

The takeaway

The QR code data capacity ceiling is real but bigger than most people think: up to 2,953 bytes in a single version-40 code at the lowest error correction level. The trade-off most people don’t see is that error correction costs roughly 57% of that payload going from L to H, and the version of the code (the module count of the grid) has to grow to compensate.

  • L-level: ~7% recovery, max payload, only safe on clean digital displays.
  • M-level: ~15% recovery, sensible default for paper printouts.
  • Q-level: ~25% recovery, for industrial labels and packaging.
  • H-level: ~30% recovery, for logo-overlaid codes and outdoor signage.

For file transfer, single-frame QR codes are the wrong tool past ~2 KB. ShadowCat-style chunked streaming with sequence numbers and CRC32 tops out around 11 KB/s on a clean screen-to-camera link at 10 FPS and 1,500 bytes per frame, which is enough for configs and small documents and not enough for anything else.

Next time someone asks how much data fits in a QR code, the honest answer is “it depends on the version, the error correction level, and the character mode — and the most useful number is usually 80 bytes, because that’s what a sensible URL shortener spits out.” If you want to see the exact numbers for your input, run it through Kestrel Tools and watch the version count climb in real time as you turn ECC up.