[Python-Dev] csv module TODO list
M.-A. Lemburg
mal at egenix.com
Wed Jan 5 10:44:40 CET 2005
Andrew McNamara wrote:
>>>Andrew McNamara wrote:
>>>
>>>>There's a bunch of jobs we (CSV module maintainers) have been putting
>>>>off - attached is a list (in no particular order):
>>>>* unicode support (this will probably uglify the code considerably).
>>>
>>Martin v. Löwis wrote:
>>
>>>Can you please elaborate on that? What needs to be done, and how is
>>>that going to be done? It might be possible to avoid considerable
>>>uglification.
>
>
> I'm not altogether sure there. The parsing state machine is all written in
> C, and deals with signed chars - I expect we'll need two versions of that
> (or one version that's compiled twice using pre-processor macros). Quite
> a large job. Suggestions gratefully received.
>
> M.-A. Lemburg wrote:
>
>>Indeed. The trick is to convert to Unicode early and to use Unicode
>>literals instead of string literals in the code.
>
>
> Yes, although it would be nice to also retain the 8-bit versions as well.
You can do so by using latin-1 as the default encoding. Works great!
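
To illustrate why latin-1 works here: it maps each byte value 0-255 to the code point with the same number, so 8-bit data survives a round trip through Unicode unchanged. The following is only a sketch in present-day Python 3, not code from the csv module; the helper names bytes_to_text and text_to_bytes are made up for the example.

```python
import csv


def bytes_to_text(data: bytes) -> str:
    """Hypothetical helper: treat 8-bit input as latin-1 (never raises)."""
    return data.decode("latin-1")


def text_to_bytes(text: str) -> bytes:
    """Hypothetical helper: inverse of bytes_to_text for latin-1 text."""
    return text.encode("latin-1")


# Every possible byte value survives the round trip unchanged.
all_bytes = bytes(range(256))
round_tripped = text_to_bytes(bytes_to_text(all_bytes))

# An 8-bit CSV line can therefore be parsed as Unicode text.
row = next(csv.reader([bytes_to_text(b"caf\xe9,na\xefve\r\n")]))
```

The parser only ever sees Unicode text, and callers who want 8-bit strings back can re-encode the resulting fields with the same codec.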
>>Note that the only real-life Unicode format in use is UTF-16
>>(with BOM mark) written by Excel. Note that there's no standard
>>for specifying the encoding in CSV files, so this is also the only
>>feasible format.
>
> Yes - that's part of the problem I hadn't really thought about yet - the
> csv module currently interacts directly with files as iterators, but it's
> clear that we'll need to decode as we go.
Depends on your needs: CSV files tend to be small enough
to do the decoding in one call in memory.
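
A rough sketch of that approach in present-day Python 3, assuming the whole file has already been read into a bytes object: look for a UTF-16 BOM, decode everything in one call, and hand the text to the csv reader. The function name read_csv_bytes and the latin-1 fallback are assumptions made for the example, not part of the csv module.

```python
import codecs
import csv
import io


def read_csv_bytes(data: bytes, fallback_encoding: str = "latin-1",
                   delimiter: str = ","):
    """Hypothetical helper: decode a whole CSV file in memory, then parse it.

    If the data starts with a UTF-16 BOM, decode it as UTF-16; otherwise
    fall back to an 8-bit encoding (an assumption for this sketch).
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec consumes the BOM and uses it to pick the byte order.
        text = data.decode("utf-16")
    else:
        text = data.decode(fallback_encoding)
    # newline="" leaves line endings untouched so the csv parser handles them.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# Usage: a UTF-16 file with BOM and an 8-bit file yield the same rows.
utf16_rows = read_csv_bytes("name,city\r\nJos\u00e9,K\u00f6ln\r\n".encode("utf-16"))
latin1_rows = read_csv_bytes(b"name,city\r\nJos\xe9,K\xf6ln\r\n")
```

For files too large to hold in memory, an incremental decoder would be needed instead; this sketch deliberately ignores that case.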
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Jan 05 2005)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/