csv module and NULL data byte
Tim Chase
python.list at tim.thechases.com
Wed Feb 28 20:35:35 EST 2018
While inelegant, I've "solved" this with a wrapper/generator:

  import csv

  f = open(fname, …)
  # drop NUL characters before the csv module ever sees the line
  g = (line.replace('\0', '') for line in f)
  reader = csv.reader(g, …)
  for row in reader:
      process(row)

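For anyone wanting to try this without a corrupted file on hand, here is a self-contained sketch of the same idea. The helper name strip_nul and the sample data are made up for illustration, and io.StringIO merely stands in for the opened file.

```python
import csv
import io


def strip_nul(lines):
    """Yield each line with NUL characters removed (hypothetical helper)."""
    for line in lines:
        yield line.replace('\0', '')


# Made-up sample data: the second data row contains a stray NUL.
sample = io.StringIO('a,b,c\n1,2\0,3\n4,5,6\n', newline='')

# csv.reader accepts any iterable of strings, so the generator can sit
# between the file object and the reader.
rows = list(csv.reader(strip_nul(sample)))
print(rows)
```

Because the generator is lazy, the file is still read one line at a time; nothing gets slurped into memory up front.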
My actual use at $DAYJOB cleans out a few other things
too, particularly non-breaking spaces coming from client data
that .strip() doesn't catch in Py2.x ("hello\xa0".strip())
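
Roughly what that broader cleanup might look like in Py3, as a sketch only: the CLEANUP table and clean_lines name are my own inventions here, not the code I run at work. Note that in Py3, str.strip() already treats U+00A0 as whitespace, so mapping it to a plain space mostly matters when it turns up mid-field.

```python
import csv
import io

# Hypothetical cleanup table for str.translate():
# drop NUL entirely, turn a non-breaking space into a plain space.
CLEANUP = {0x00: None, 0xA0: ' '}


def clean_lines(lines, table=CLEANUP):
    """Yield each line with the unwanted characters translated away."""
    for line in lines:
        yield line.translate(table)


# Made-up sample: a trailing NBSP in one field, a NUL inside another.
sample = io.StringIO('name,note\nhello\xa0,wor\0ld\n', newline='')

cleaned = [[field.strip() for field in row]
           for row in csv.reader(clean_lines(sample))]
print(cleaned)
```

Adding another character to scrub is then just one more entry in the table.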
-tkc
On 2018-02-28 23:40, John Pote wrote:
> I have a csv data file that may become corrupted (already happened)
> resulting in a NULL byte appearing in the file. The NULL byte
> causes an _csv.Error exception.
>
> I'd rather like the csv reader to return csv lines as best it can
> and subsequent processing of each comma separated field deal with
> illegal bytes. That way as many lines from the file may be
> processed and the corrupted ones simply dumped.
>
> Is there a way of getting the csv reader to accept all 256 possible
> bytes (with \r, \n and ',' bytes delimiting lines and fields)?
>
> My test code is,
>
> with open( fname, 'rt', encoding='iso-8859-1' ) as csvfile:
>     csvreader = csv.reader(csvfile, delimiter=',',
>                            quoting=csv.QUOTE_NONE, strict=False )
>     data = list( csvreader )
>     for ln in data:
>         print( ln )
>
> Result
>
> >>python36 csvTest.py
> Traceback (most recent call last):
>   File "csvTest.py", line 22, in <module>
>     data = list( csvreader )
> _csv.Error: line contains NULL byte
>
> strict=False or True makes no difference.
>
> Help appreciated,
>
> John