[Python-Dev] 3.1a2
R. David Murray
rdmurray at bitdance.com
Wed Apr 1 07:17:05 CEST 2009
On Tue, 31 Mar 2009 at 14:09, Benjamin Peterson wrote:
> I haven't looked at #4847 in depth, but appears that the csv module
> will need some API changes to deal with encodings. Perhaps somebody
> would like to sprint on it?
First we have to figure out what should be done.
http://bugs.python.org/4847
Having read through the ticket, it seems that a CSV file must be (and
in 2.6 was) treated as a binary file, and part of the CSV module's job
is to convert that binary data to and from strings. That is, the CSV
module is at the same layer of the input stack as the TextIOWrapper.
So IMO it should have an encoding parameter, and the defaults should be
handled the same way they are for TextIOBase.
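To make the idea concrete, here is a rough sketch of what "csv sits at the
same layer as TextIOWrapper" could look like from the caller's side. The
helper name bytes_csv_reader and its encoding parameter are hypothetical;
nothing like this exists in the csv module, and the default-encoding choice
is only my assumption about what "same as TextIOBase" would mean.

```python
import csv
import io
import locale


def bytes_csv_reader(binary_file, encoding=None, **fmtparams):
    """Hypothetical helper: read CSV rows from a binary file object."""
    if encoding is None:
        # Assumption: fall back to the same locale-based default that
        # text files use when no encoding is given.
        encoding = locale.getpreferredencoding(False)
    # newline="" leaves line endings untouched so the csv parser sees them.
    text = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    return csv.reader(text, **fmtparams)


# Usage: the caller hands over bytes and names the encoding once.
raw = io.BytesIO("name,city\r\nZoë,Zürich\r\n".encode("utf-8"))
rows = list(bytes_csv_reader(raw, encoding="utf-8"))
```

The point of the sketch is only that the decoding step and the CSV parsing
step live behind one call, so the caller never touches a text layer.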
As the initial error report indicates, _csv in py3k expects to read
strings from the iterator passed to it, which IMO is wrong. It should
be expecting bytes. The problem with this solution is that those people
currently passing it string iterators would have to change their code.
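For illustration, a small sketch of the behaviour being discussed, assuming
a Python 3 interpreter in which csv.reader wants str lines: a bytes iterator
is rejected, while the same data decoded to text parses fine. The variable
names are mine, and the exact wording of the error message may differ
between releases.

```python
import csv
import io

data = b"a,b\r\n1,2\r\n"

# Feeding bytes lines to the reader fails once iteration starts.
try:
    rows_from_bytes = list(csv.reader(io.BytesIO(data)))
except csv.Error as exc:
    rows_from_bytes = None
    error_message = str(exc)

# The same content as text (str lines) is accepted.
text_source = io.StringIO(data.decode("ascii"), newline="")
rows_from_text = list(csv.reader(text_source))
```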
The documentation says "If csvfile is a file object, it must be opened
with the 'b' flag on platforms where that makes a difference."
With the advent of unicode strings, it now makes a difference on all
platforms.
--
R. David Murray http://www.bitdance.com