What is CSV and why convert JSON to it?
CSV (Comma-Separated Values) is the lingua franca of tabular data. Defined informally for decades and standardized in RFC 4180 (2005), CSV is a flat-text format where each line is a row and each row contains comma-separated values. It opens natively in Excel, Google Sheets, Numbers, every analytics tool, and every database import wizard. Its simplicity is its superpower — you can read a CSV in any text editor.
JSON is hierarchical (nested objects, arrays, mixed types); CSV is flat (rows × columns of strings). Converting between them is one of the most common data tasks in 2026:
- API responses → spreadsheets — fetch from a JSON API, hand a CSV to your finance team.
- Database exports → analysis — most databases export to JSON; analysts want CSV.
- Bulk imports — migrating 10,000 records into a new system that only accepts CSV uploads.
- Reporting — JSON event logs → CSV for stakeholders to filter in Excel.
- Data science prep — Pandas, R, and BigQuery all happily ingest CSV; nested JSON requires preprocessing.
The conversion is conceptually simple — extract the keys, write headers, write each object as a row — but the devil is in the details: nested objects, mixed schemas, commas inside values, Unicode, and Excel's many CSV quirks.
The CSV format — RFC 4180 in plain English
The "informal" CSV spec is full of dialects, but RFC 4180 codifies the most-compatible rules. Follow these and your CSV will work in 99% of consumers:
- Each record on its own line, terminated by CRLF (
\r\n). Lone\nworks in most modern parsers but RFC says CRLF. - Fields separated by commas. (Some "CSV" files actually use semicolons, tabs, or pipes — those are TSV/SSV, not strict CSV.)
- Optional header line as the first record.
- Fields containing commas, double-quotes, or line breaks must be quoted with double quotes.
Smith, John→"Smith, John". - Double-quotes inside quoted fields must be escaped by doubling them.
She said "hi"→"She said ""hi""". - No trailing comma at end of row (some parsers tolerate it; many don't).
- UTF-8 is the modern encoding. But Microsoft Excel on Windows defaults to system encoding (Windows-1252) unless you prepend a UTF-8 BOM.
Flattening nested JSON — three approaches
The hard problem in JSON→CSV is what to do with nested objects and arrays. CSV is flat; JSON isn't. Three common strategies:
1. Dot-notation flattening (default for this tool)
Walk the JSON tree, joining keys with dots. {user: {name: "Alice", address: {city: "NYC"}}} becomes columns user.name and user.address.city. Works for moderately nested data; column names get unwieldy past 3 levels deep.
2. Stringify-arrays
For arrays, serialize the entire array as a JSON string in a single cell: {tags: ["a","b","c"]} → tags column with value ["a","b","c"]. Loses tabular nature but preserves data.
3. Cartesian-product expansion (one-to-many)
For arrays of objects, emit one row per array item. {user: "Alice", orders: [{id: 1}, {id: 2}]} becomes 2 rows, both with user=Alice, but different orders.id values. Useful for joining order line items to orders.
| Strategy | Best for | Trade-off |
|---|---|---|
| Dot notation | Nested objects (configs, profile fields) | Long column names; lossy on arrays |
| Stringify arrays | Mixed nested data, machine-only consumers | Cell contains JSON, not human-readable |
| Cartesian expansion | One-to-many relationships (orders → line items) | Repeats parent data on every row |
The Excel CSV gotchas — what trips everyone up
1. UTF-8 without BOM = corrupted accents
Open a UTF-8 CSV in Microsoft Excel on Windows: accented characters like é show up as é. The fix is a 3-byte BOM (0xEF 0xBB 0xBF) at the start of the file. Mac Excel and Google Sheets handle UTF-8 natively; Windows Excel needs the BOM. Most online JSON-to-CSV tools forget this. This one prepends the BOM by default.
2. Date columns lose precision
Dates that look like 2026-05-02 get auto-converted by Excel to its native serial number. Dates with leading zeros ("01-02-2026") might be reformatted as "1/2/2026". Workaround: prepend dates with a tab character (\t2026-05-02) or wrap in ="2026-05-02" formula syntax to force string interpretation.
3. Number formatting
Long numbers like phone numbers or credit card numbers get scientific-notation'd in Excel. 4117234567890 → 4.11723E+12. Same workaround: ="4117234567890" in the cell forces string.
4. Locale-specific delimiter conflicts
European Excel installs default to semicolon as the CSV delimiter (because comma is the decimal separator in those locales). A US-format CSV with commas will appear as a single mangled column. Workaround: use the SEP indicator: prepend the file with sep=,\n on the first line, and Excel will respect it regardless of locale.
5. Leading equals sign
Excel treats values starting with =, +, -, @ as formulas — including dangerous ones. A CSV cell containing =cmd|'/c calc'!A1 can launch the Calculator app on Windows when the user opens the file. This is CSV injection, a real attack vector. Workaround: prefix any string starting with these chars with a single tick: '=username.
JSON to CSV in 8 programming languages
JavaScript / Node.js
// Vanilla JS for simple cases
function jsonToCsv(arr) {
const headers = [...new Set(arr.flatMap(o => Object.keys(o)))];
const rows = arr.map(o =>
headers.map(h => JSON.stringify(o[h] ?? '')).join(',')
);
return [headers.join(','), ...rows].join('\n');
}
// Or use a library: csv-stringify, papaparse, json-2-csv
import { stringify } from 'csv-stringify/sync';
const csv = stringify(arr, { header: true });
Python
import csv, json
with open('data.json') as f:
data = json.load(f)
# Auto-detect headers from first object
fieldnames = list(data[0].keys()) if data else []
with open('out.csv', 'w', newline='', encoding='utf-8-sig') as f: # 'utf-8-sig' adds BOM
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(data)
# Pandas approach for nested data
import pandas as pd
df = pd.json_normalize(data) # auto-flattens dot notation
df.to_csv('out.csv', index=False, encoding='utf-8-sig')
jq (command-line JSON processor)
# Convert array of objects to CSV
jq -r '(.[0] | keys_unsorted) as $keys | $keys, map([.[ $keys[] ]])[] | @csv' data.json
# Custom column selection
jq -r '.[] | [.name, .age, .email] | @csv' data.json
# With headers
echo '"name","age","email"'; jq -r '.[] | [.name,.age,.email] | @csv' data.json
Go
import (
"encoding/csv"
"encoding/json"
"os"
)
var data []map[string]interface{}
json.Unmarshal(jsonBytes, &data)
f, _ := os.Create("out.csv")
defer f.Close()
f.WriteString("\xEF\xBB\xBF") // UTF-8 BOM for Excel
w := csv.NewWriter(f)
defer w.Flush()
// Write headers
headers := []string{"name", "age", "email"}
w.Write(headers)
// Write rows
for _, item := range data {
row := []string{
fmt.Sprint(item["name"]),
fmt.Sprint(item["age"]),
fmt.Sprint(item["email"]),
}
w.Write(row)
}
PHP
$data = json_decode(file_get_contents('data.json'), true);
$fp = fopen('out.csv', 'w');
fwrite($fp, "\xEF\xBB\xBF"); // UTF-8 BOM
// Headers
fputcsv($fp, array_keys($data[0]));
// Rows
foreach ($data as $row) {
fputcsv($fp, $row);
}
fclose($fp);
Ruby
require 'json'
require 'csv'
data = JSON.parse(File.read('data.json'))
CSV.open('out.csv', 'w', write_headers: true,
headers: data.first.keys, encoding: 'bom|utf-8') do |csv|
data.each { |row| csv << row.values }
end
Bash / awk
# With jq + miller (if installed)
jq -r 'map([.name, .age, .email] | @csv)[]' data.json > out.csv
# Excel-friendly: prepend BOM
printf '\xEF\xBB\xBF' > out.csv
jq -r '(map(keys_unsorted) | add | unique) as $k |
($k | @csv), (.[] | [.[$k[]]] | @csv)' data.json >> out.csv
Excel / Power Query (no code)
# Excel 365 / 2019+:
# Data tab → Get Data → From File → From JSON
# → Select your .json file
# → Power Query opens with auto-flattened columns
# → Click "Close & Load" — JSON is now a sheet
# This handles nested objects via Power Query's transformations,
# without manually flattening
Common JSON-to-CSV mistakes
- Forgetting the BOM for Excel. The #1 reported issue. Always prepend
\xEF\xBB\xBFfor Windows Excel users. - Not escaping commas/quotes inside cells. A field
Hello, worldwithout quoting splits into 2 cells. Always use a real CSV writer, not string concatenation. - Mixed schema across array. Object 1 has
{a, b}, object 2 has{a, c}. Naive flattening uses object 1's keys and dropsc. Compute headers as the union of all keys. - Date / number coercion in Excel. Phone numbers as scientific notation. Dates reformatted. Pre-quote with
=-syntax or tab prefix. - Treating
nulland""the same. CSV has no null. Pick a convention: empty cell,NULLstring, or distinguish via separate flag column. - Nested arrays serialized inconsistently. Sometimes
JSON.stringify, sometimes;-separated. Pick one. - CSV injection. A user-supplied field starting with
=can execute formulas in Excel. Prefix with'for safety. - Wrong line endings. Mixed
\nand\r\nacross rows. Use the platform's CSV writer to pick consistently. - Forgotten encoding. Windows-1252 vs UTF-8 vs ISO-8859-1. Always use UTF-8 with BOM for cross-platform compatibility.
Best practices
- Always emit UTF-8 with BOM when the file might open in Excel.
- Use a real CSV library, not string concatenation. The escaping rules are subtle.
- Compute headers from the union of all keys, not just the first object's keys. Mixed schemas are common in real data.
- Document your nested-data strategy. Dot notation? Stringify arrays? Cartesian expansion? Pick one and document so consumers know what to expect.
- Quote dates and IDs. Excel auto-converts otherwise. Wrap in
="..."formula syntax for hard preservation. - Sanitize formula triggers. Prefix any string starting with
=,+,-,@with a tick to prevent CSV injection. - For huge datasets, stream don't accumulate. Build CSV row-by-row to avoid loading everything into memory.
- Test the round-trip. Convert JSON → CSV → JSON and verify equivalence. Catches escaping and encoding bugs early.
- Consider Parquet or NDJSON for large datasets. CSV's flat schema and lack of types make it a poor archive format. CSV for human consumption; binary formats for machines.