[New-bugs-announce] [issue7185] csv reader utf-8 BOM error

Istvan Szirtes report at bugs.python.org
Thu Oct 22 12:46:06 CEST 2009


New submission from Istvan Szirtes <istvan.szirtes at gmail.com>:

The csv module is reading a .csv file that is encoded in UTF-8 and starts
with a UTF-8 BOM.

The first row in the csv file is 
["value","vocal","vocal","vocal","vocal"]

in hex:
"value","vocal","vocal","vocal","vocal"

The reader does not read the first row correctly, and if I seek back to
position 0 partway through the file, I get a result like this:

['\ufeff"value"', 'vocal', 'vocal', 'vocal', 'vocal']

I think the csv reader does not handle seeking correctly.

I attached a test file for the bug and here is my code:

import codecs
import csv

InDistancesFile = codecs.open( '..\\distances.csv', 'r', encoding='utf-8' )
InDistancesObj = csv.reader( InDistancesFile )

for Row in InDistancesObj:
    if Row[0] == '20':
        print(Row)
        break

InDistancesFile.seek(0)

for Row in InDistancesObj:
    print(Row)
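
For comparison, here is a minimal sketch of how the BOM shows up in the
first field and how Python's standard 'utf-8-sig' codec strips it when
decoding. This is an illustration under assumptions, not the attached file:
the helper names (make_bom_csv, read_first_row_twice) and the sample rows
are hypothetical, and the built-in open() is used in place of codecs.open().

```python
import csv
import os
import tempfile


def make_bom_csv():
    """Write a small UTF-8 CSV file that starts with a BOM; return its path.

    The rows are made-up sample data, not the contents of distances.csv.
    """
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "wb") as f:
        # b'\xef\xbb\xbf' is the UTF-8 encoding of U+FEFF (the BOM).
        f.write(b'\xef\xbb\xbf"value","vocal","vocal"\r\n"20","a","b"\r\n')
    return path


def read_first_row_twice(path, encoding):
    """Return the first CSV row before and after seeking back to offset 0."""
    with open(path, "r", encoding=encoding, newline="") as f:
        first = next(csv.reader(f))
        f.seek(0)
        # A fresh reader is created after the seek so no reader state carries over.
        again = next(csv.reader(f))
    return first, again


if __name__ == "__main__":
    path = make_bom_csv()
    try:
        # Plain 'utf-8' decodes the BOM as an ordinary character (U+FEFF).
        print(read_first_row_twice(path, "utf-8"))
        # 'utf-8-sig' drops a leading BOM while decoding.
        print(read_first_row_twice(path, "utf-8-sig"))
    finally:
        os.remove(path)
```

With plain 'utf-8' the first field comes back as '\ufeff"value"', because
the BOM character precedes the opening quote and the csv parser then treats
the field as unquoted. With 'utf-8-sig' the first field is 'value' on both
passes.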

----------
components: Unicode
files: distances.csv
messages: 94340
nosy: W00D00
severity: normal
status: open
title: csv reader utf-8 BOM error
type: compile error
versions: Python 3.1
Added file: http://bugs.python.org/file15182/distances.csv

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7185>
_______________________________________

