CSV (Comma-Separated Values)

Spreadsheets as text files, with commas doing all the heavy lifting.

What is CSV?

CSV is a text format for tabular data. Each line is a row, and commas separate the values (columns). It's been around since the 1970s and refuses to die because it just works.

csv
name,email,age
Alice,alice@example.com,28
Bob,bob@example.com,34
Charlie,charlie@example.com,22

That's it. No schema, no types, no fancy features. Just rows and commas.
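
As a rough illustration of what "no types" means in practice, here is a minimal sketch using Python's standard csv module on the sample above. The inline `sample` string is an assumption standing in for a real file. Every value comes back as a string, so any conversion is the caller's job.

```python
import csv
import io

# The sample data from above, inlined so the snippet is self-contained.
sample = "name,email,age\nAlice,alice@example.com,28\nBob,bob@example.com,34\n"

rows = list(csv.DictReader(io.StringIO(sample)))

# Every field is a str, including the numeric-looking age column.
first_age = rows[0]["age"]
print(type(first_age).__name__, repr(first_age))

# Converting to real types has to be done explicitly.
ages = [int(row["age"]) for row in rows]
print(ages)
```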

Why CSV Exists

  • Universal - Every spreadsheet app can read and write it
  • Human readable - Open it in any text editor
  • Lightweight - No overhead, just your data
  • Easy to generate - Trivial to create programmatically

The trade-off? No data types, no nested structures, and a surprising number of edge cases that will haunt you.

The "Standard" (RFC 4180)

RFC 4180 writes down the common conventions, but it is an informational memo rather than a binding standard, and many CSV files ignore it:

Rule             Description
Delimiter        Comma (,) separates fields
Line ending      CRLF (\r\n) ends each row
Quoting          Fields with commas, quotes, or newlines must be quoted
Escaping quotes  Double the quote: "She said ""hello"""
Header row       Optional but recommended

In practice, you'll see tabs, semicolons, pipes, and chaos.
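
The rules in the table can be seen in action with Python's csv module, whose default writer settings (comma delimiter, quote only when needed, doubled quotes, CRLF line endings) happen to match them. This is a small sketch that writes to an in-memory buffer; the rows are made up for illustration.

```python
import csv
import io

# newline="" keeps the buffer from touching the line endings the writer emits.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)  # defaults: comma, minimal quoting, "" escaping, CRLF

writer.writerow(["title", "quote"])
writer.writerow(["Hamlet", "To be, or not to be"])                 # contains a comma
writer.writerow(["Wisdom", 'He said "always quote your fields"'])  # contains quotes

rfc_text = buffer.getvalue()
print(repr(rfc_text))
```

Only the fields that need quoting get quoted, and each embedded quote is doubled.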

Basic Examples

Simple Data

csv
product,price,quantity
Widget,9.99,100
Gadget,24.99,50

Quoted Fields (Commas in Values)

csv
name,address,city
"Smith, John","123 Main St",Boston
"Doe, Jane","456 Oak Ave",Chicago

Escaped Quotes

csv
title,quote
Hamlet,"To be, or not to be"
Wisdom,"He said ""always quote your fields"""

Multiline Values

csv
name,notes
Alice,"Line one
Line two
Line three"
Bob,"Single line"
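
To show why quoted fields and multiline values need a real parser, the sketch below feeds the multiline example to Python's csv module and compares the number of physical lines with the number of logical records. The data is inlined as a string here; with a real file, the same `newline=""` argument would go to `open()`.

```python
import csv
import io

text = 'name,notes\nAlice,"Line one\nLine two\nLine three"\nBob,"Single line"\n'

# Splitting on line breaks sees five lines and tears Alice's note apart.
physical_lines = text.splitlines()

# newline="" passes line breaks through untouched so the parser can keep
# the ones that sit inside quoted fields.
reader = csv.reader(io.StringIO(text, newline=""))
records = list(reader)

print(len(physical_lines), len(records))
print(repr(records[1][1]))
```

The parser returns three records (header, Alice, Bob), and Alice's note keeps its two embedded line breaks.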

Where You'll See This

  • Spreadsheet exports - Excel, Google Sheets, Numbers
  • Database dumps - Quick and dirty data export
  • Data imports - Bulk uploads to web apps
  • Log analysis - Server logs, analytics exports
  • Financial data - Bank statements, trading data
  • ETL pipelines - Moving data between systems

Common Gotchas

⚠️ Excel's Encoding Trap

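When a CSV has no byte order mark, Excel on Windows typically assumes the system's legacy code page (Windows-1252 on most Western setups), not UTF-8. If your CSV has non-ASCII characters and looks garbled in Excel, add a UTF-8 BOM (the bytes \xEF\xBB\xBF) at the start of the file.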
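
One way to add the BOM from Python is the built-in `utf-8-sig` codec, which writes the three BOM bytes on output and strips them on input. The file name and sample rows below are hypothetical.

```python
import csv
import os
import tempfile

rows = [["name", "city"], ["Zoë", "Kraków"]]

# Hypothetical output path inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "excel_friendly.csv")

# "utf-8-sig" emits the BOM bytes EF BB BF before the first character.
with open(path, "w", newline="", encoding="utf-8-sig") as f:
    csv.writer(f).writerows(rows)

with open(path, "rb") as f:
    raw = f.read()
print(raw[:3])

# Reading with "utf-8-sig" removes the BOM again, so the first header stays clean.
with open(path, newline="", encoding="utf-8-sig") as f:
    round_trip = list(csv.reader(f))
print(round_trip)
```

Reading such a file with plain `utf-8` would leave an invisible `\ufeff` glued to the first header name, which is a common source of "column not found" bugs.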

ℹ️ Delimiter Wars

Many European locales use semicolons (;) as the field delimiter because the comma is already their decimal separator (3,14 instead of 3.14). Always check your locale settings.
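
When the delimiter is not known in advance, Python's `csv.Sniffer` can make a guess from a sample of the text. It is a heuristic and can guess wrong on small or messy samples, so treat this as a sketch; if you know the delimiter, pass it explicitly instead. The sample data is invented for illustration.

```python
import csv
import io

# A semicolon-delimited export with decimal commas.
semicolon_text = "name;price\nWidget;9,99\nGadget;24,99\n"

# Restrict the guess to a few plausible delimiters.
dialect = csv.Sniffer().sniff(semicolon_text, delimiters=",;\t|")
print(repr(dialect.delimiter))

parsed = list(csv.reader(io.StringIO(semicolon_text), dialect))

# Decimal commas still need converting by hand before float() accepts them.
prices = [float(row[1].replace(",", ".")) for row in parsed[1:]]
print(prices)
```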

  • No data types - Everything is a string. "123" and 123 are indistinguishable. Leading zeros disappear when Excel "helps" (007 becomes 7).
  • Inconsistent quoting - Some tools quote everything, some quote nothing, some quote only when needed. Be liberal in what you accept.
  • Null values - Is an empty field null, an empty string, or "NULL"? Nobody agrees.
  • Newlines in values - Perfectly valid, but many parsers choke on them.
  • Trailing commas - Does a,b,c, have 3 or 4 columns? Depends on the parser.
  • Large files - CSV has no index or row count, so a parser cannot skip ahead. A naive parser that reads the whole file at once needs 10GB of memory for a 10GB file; process rows as a stream instead.
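
For the large-file case, a streaming approach keeps memory use flat. The sketch below uses hypothetical helper names (`iter_rows`, `total_quantity`) and a small in-memory sample standing in for a big file. It parses one row at a time and keeps only a running total.

```python
import csv
import io
from typing import Iterable, Iterator


def iter_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed row at a time instead of building a full list."""
    yield from csv.DictReader(lines)


def total_quantity(lines: Iterable[str]) -> int:
    # Only the current row and the running total are held in memory.
    return sum(int(row["quantity"]) for row in iter_rows(lines))


# Stand-in for a large file. With a real one, pass the object returned by
# open(path, newline="") so rows are read from disk lazily.
big_file = io.StringIO("product,price,quantity\nWidget,9.99,100\nGadget,24.99,50\n")
total = total_quantity(big_file)
print(total)  # 150
```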

CSV vs Alternatives

Format   Best For                    Drawback
CSV      Tabular data, spreadsheets  No types, quoting edge cases
TSV      Data with commas            Tabs in data still break it
JSON     Nested/typed data           Larger, harder to edit manually
Parquet  Big data, analytics         Binary, not human-readable
Excel    Rich spreadsheets           Proprietary, large files

In Code

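javascript
import Papa from 'papaparse';

// Example input; in real code this comes from a file or a request
const csvString = 'name,email,age\nAlice,alice@example.com,28';

// Naive parsing: breaks on quoted commas, embedded newlines, and CRLF
// (don't use in production)
const naiveRows = csvString.split('\n').map(row => row.split(','));

// Proper parsing with a library (Papa Parse)
const result = Papa.parse(csvString, {
  header: true,         // First row is the header
  dynamicTyping: true,  // Convert numeric and boolean strings
  skipEmptyLines: true
});
// result.data = [{name: "Alice", email: "alice@example.com", age: 28}]

// Quote a field only when it contains a comma, quote, or line break
function escapeCSV(value) {
  const s = String(value ?? '');  // null/undefined become empty fields
  return /[,"\n\r]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Generate CSV: escape every field and join rows with CRLF (RFC 4180)
const data = [
  ['name', 'email'],
  ['Alice', 'alice@example.com'],
  ['Bob', 'bob@example.com']
];
const output = data.map(row => row.map(escapeCSV).join(',')).join('\r\n');

python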
# Python's csv module handles edge cases
import csv

# Read
with open('data.csv', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row['name'], row['email'])

# Write
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['name', 'email'])
    writer.writerow(['Alice', 'alice@example.com'])

"CSV: the file format that's one misplaced comma away from ruining your entire afternoon."