csv module and NULL data byte
Tim Chase
python.list at tim.thechases.com
Thu Mar 1 19:15:50 EST 2018
On 2018-03-01 23:57, John Pote wrote:
> On 01/03/2018 01:35, Tim Chase wrote:
> > While inelegant, I've "solved" this with a wrapper/generator
> >
> > f = open(fname, …)
> > g = (line.replace('\0', '') for line in f)
> I wondered about something like this but thought if there's a way
> of avoiding the extra step it would keep the execution speed up.
There shouldn't be any noticeable performance cost from using a
generator. It's also lazy, so it never pulls the entire file into
memory; it holds no more than one line at a time.
> My next thought was to pass a custom encoder to the open() that
> translates NULLs to, say, 0x01. It won't make any difference to
> change one corrupt value to a different corrupt value.
> > reader = csv.reader(g, …)
> > for row in reader:
> >     process(row)
...which is pretty much exactly what my generator solution does:
putting a translating encoder between the open() and the
csv.reader() call.
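
For reference, here is a minimal self-contained sketch of that wrapper
idea, assuming Python 3 and text-mode input. The name strip_nuls and
the sample data are made up for illustration; an io.StringIO stands in
for the real open() call.

```python
import csv
import io


def strip_nuls(lines):
    """Lazily yield each line with NUL characters removed."""
    for line in lines:
        yield line.replace('\x00', '')


# Hypothetical sample data standing in for a file with stray NUL bytes.
f = io.StringIO('a,b\x00,c\n1,\x002,3\n')

# The generator sits between the file object and csv.reader, so the
# reader only ever sees cleaned lines, one at a time.
rows = list(csv.reader(strip_nuls(f)))
```

With a real file, the StringIO would be replaced by something like
open(fname, newline=''), and the rows would normally be consumed in a
for loop rather than collected into a list.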
-tkc