[Python-Dev] csv module TODO list

M.-A. Lemburg mal at egenix.com
Wed Jan 5 11:16:50 CET 2005


Andrew McNamara wrote:
>>>Yes, although it would be nice to also retain the 8-bit versions as well.
>>
>>You can do so by using latin-1 as default encoding. Works great !
> 
> Yep, although that means we wear the cost of decoding and encoding for
> all 8 bit input.

Right, but it makes the code very clean and straightforward.
Again, it depends on what you need. If performance is critical
then you probably need a C version written using the same trick
as _sre.c...
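
A minimal sketch of the latin-1 approach, assuming Python 3 and the
standard csv module (the helper name parse_8bit_csv is made up for
illustration). Latin-1 maps each byte value to the code point with the
same number, so 8-bit data survives the decode/encode round trip
unchanged:

```python
import csv
import io

def parse_8bit_csv(raw):
    """Parse 8-bit CSV data with a text-only parser (hypothetical helper)."""
    # latin-1 maps bytes 0x00-0xFF one-to-one onto U+0000-U+00FF, so the
    # decode cannot fail and the encode restores the original bytes.
    text = raw.decode("latin-1")
    reader = csv.reader(io.StringIO(text, newline=""))
    return [[field.encode("latin-1") for field in row] for row in reader]

rows = parse_8bit_csv(b"name,city\r\nJos\xe9,K\xf6ln\r\n")
```

The cost mentioned above is visible here: every field is decoded once
and encoded once, even though the caller only ever wanted bytes.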

> What does the _sre.c code do?

It comes in two versions: one for 8-bit strings, the other for Unicode.
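
The effect is visible from Python through the re module. The sketch
below assumes Python 3, where the 8-bit case is spelled as bytes. It
shows only the user-facing behaviour, not the C source itself:

```python
import re

# The same pattern text, once as 8-bit (bytes) and once as Unicode (str).
WORD_8BIT = re.compile(rb"[a-z]+")
WORD_UNICODE = re.compile(r"[a-z]+")

bytes_hits = WORD_8BIT.findall(b"csv, sre")
text_hits = WORD_UNICODE.findall("csv, sre")

# Mixing the two kinds is rejected rather than silently converted.
try:
    WORD_8BIT.findall("csv, sre")
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

Each kind of input is matched natively, so neither path pays for a
conversion to the other.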

>>Depends on your needs: CSV files tend to be small enough
>>to do the decoding in one call in memory.
> 
> We are routinely dealing with multi-gigabyte csv files - which is why the
> original 2001 vintage csv module was written as a C state machine. 

I see, but are you sure that the typical Python user will have
the same requirements to make it worth the effort (and
complexity) ?
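
To make the two workloads concrete, here is a rough sketch, again
assuming Python 3 and the standard csv module (both function names are
invented for illustration). The first function decodes the whole input
in one call; the second decodes incrementally, so memory use does not
grow with file size:

```python
import csv
import io

def read_all_at_once(raw, encoding="latin-1"):
    """Small input: decode everything in one call, then parse."""
    return list(csv.reader(io.StringIO(raw.decode(encoding), newline="")))

def iter_rows_streaming(binary_stream, encoding="latin-1"):
    """Large input: decode chunk by chunk while yielding rows."""
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline="")
    try:
        yield from csv.reader(text_stream)
    finally:
        # Hand the binary stream back so the caller decides when to close it.
        text_stream.detach()

data = b"a,b\r\n1,2\r\n"
small = read_all_at_once(data)
streamed = list(iter_rows_streaming(io.BytesIO(data)))
```

Both return the same rows; they differ only in how much of the input is
held in memory at once.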

I've written a few CSV parsers and writers myself over the years
and the requirements were different every time, in terms
of being flexible in the parsing phase, the interfaces and
the performance needs. I haven't yet found a one-size-fits-all
solution and don't really expect to any more :-)

-- 
Marc-Andre Lemburg
eGenix.com


