[New-bugs-announce] [issue23178] csv.reader does not handle BOM

Jon Dufresne report at bugs.python.org
Tue Jan 6 20:05:05 CET 2015

New submission from Jon Dufresne:

The following test script demonstrates that Python's csv library does not handle a BOM. I would expect the returned row to equal expected and the script to print 'True' to stdout.

In the wild, it is typical for other CSV writers to add a BOM. MS Excel is especially picky about the BOM when reading a UTF-8 encoded file, so many writers add a BOM for interoperability with MS Excel.

If a Python program accepts a CSV file as input (often the case in web apps), these files will not be handled correctly without preprocessing. In my opinion, this should "just work" when reading the file.

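import codecs
import csv

# Write a UTF-8 file with a leading BOM, as many CSV writers do.
# The with block closes the file so the bytes are flushed before reading.
with open('foo.csv', 'wb') as f:
    f.write(codecs.BOM_UTF8 + b'a,b,c')

expected = ['a', 'b', 'c']
# Decode explicitly as UTF-8 so the result does not depend on the locale.
with open('foo.csv', encoding='utf-8', newline='') as f:
    r = csv.reader(f)
    row = next(r)

print(row == expected)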

$ ./python ~/test.py
['\ufeffa', 'b', 'c']
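
For comparison, here is a minimal sketch of the kind of preprocessing currently needed, assuming the input is known to be UTF-8. Opening the file with the standard 'utf-8-sig' codec drops a leading BOM if one is present and otherwise behaves like 'utf-8'. The helper name read_first_row and the temporary file path are hypothetical, chosen only for illustration.

```python
import codecs
import csv
import os
import tempfile


def read_first_row(path):
    # Hypothetical helper: 'utf-8-sig' decodes UTF-8 and skips a leading BOM.
    with open(path, encoding='utf-8-sig', newline='') as f:
        return next(csv.reader(f))


# Write the same BOM-prefixed content as the test script, in a temp directory.
path = os.path.join(tempfile.mkdtemp(), 'foo.csv')
with open(path, 'wb') as f:
    f.write(codecs.BOM_UTF8 + b'a,b,c')

row = read_first_row(path)
print(row == ['a', 'b', 'c'])
```

This only helps when the caller controls how the file is opened and knows the encoding in advance, which is why it is a workaround rather than the "just work" behavior requested here.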

components: Library (Lib)
messages: 233549
nosy: jdufresne
priority: normal
severity: normal
status: open
title: csv.reader does not handle BOM
type: behavior
versions: Python 3.5

Python tracker <report at bugs.python.org>
