[Python-Dev] Generalised String Coercion

Guido van Rossum gvanrossum at gmail.com
Mon Aug 8 02:07:49 CEST 2005


[Reinhold Birkenfeld]
> > FWIW, I've already drafted a patch for the former. It lets you write to
> > file.encoding and honors this when writing Unicode strings to it.

[Martin v. Löwis]
> I don't like that approach. You shouldn't be allowed to change the
> encoding mid-stream (except perhaps under very specific circumstances).

Right. IMO the encoding is something you specify when opening the
file, just like buffer size and text mode.
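
A minimal sketch of "encoding chosen at open time", assuming the
`encoding` keyword that the built-in open() accepts in Python 3 (that
API postdates this message and is used here only as an illustration,
not as the patch under discussion). The helper name write_then_read is
hypothetical.

```python
import os
import tempfile


def write_then_read(text, encoding):
    """Write `text` with an encoding fixed at open time, then read it back.

    Returns the raw bytes on disk and the decoded text.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # The encoding is given once, when the file is opened for writing.
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        # Binary mode shows the bytes that were actually written.
        with open(path, "rb") as f:
            raw = f.read()
        # The reader names the same encoding, again at open time.
        with open(path, "r", encoding=encoding) as f:
            decoded = f.read()
        return raw, decoded
    finally:
        os.remove(path)


raw, decoded = write_then_read("h\u00e9llo", "utf-8")
```

The stream never changes encoding after it is opened, so every write
goes through the same codec.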

> Another issue is seeking: given the many different kinds of buffers,
> seeking becomes fairly complex. Ideally, seeking should apply to
> application-level positions, i.e. when you tell the current position,
> it should be in terms of data already consumed by the application.
> Perhaps seeking in an encoded stream should not be supported at all.

I'm not sure if it works for all encodings, but if possible I'd like
to extend the seeking semantics on text files: seek positions are byte
counts, and the application should consider them as "magic cookies".
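
The "magic cookie" idea can be sketched as follows, assuming Python 3's
io.TextIOWrapper (a later API, used here only for illustration). The
application stores whatever tell() returns and passes it back to seek()
unchanged; it never does arithmetic on the value. The sample data is
made up.

```python
import io

# A UTF-8 byte stream whose first line contains a two-byte character,
# so character counts and byte counts differ.
data = "a\u00e9b\nsecond\n".encode("utf-8")
f = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8")

first = f.readline()   # consume the first line
cookie = f.tell()      # opaque position: store it, do not interpret it
rest = f.read()        # read to the end of the stream

f.seek(cookie)         # hand the cookie back exactly as received
again = f.read()       # the same text is read a second time
```

Only values previously returned by tell() are passed to seek(), which
is what keeps the scheme independent of how a codec maps characters to
bytes.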

> Finally, you also have to consider Universal Newlines: you can apply
> them either on the byte stream, or on the character stream. I think
> conceptually right would be to do universal newlines on the character
> stream.

Is there any reason not to do Universal Newline processing on *all*
text files? I can't think of a use case where you'd like text file
processing but you want to see the bare \r characters.
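
A small sketch of universal newlines applied to the character stream
rather than the byte stream, assuming Python 3's io.TextIOWrapper and
its `newline` parameter (again a later API, shown only as an
illustration). UTF-16 is used because a "\r" there is not a single
byte, so the translation can only happen after decoding.

```python
import io

# Mixed line endings, encoded so that each character takes two bytes.
raw = "a\r\nb\rc\n".encode("utf-16-le")

# newline=None: decode first, then translate "\r\n" and "\r" to "\n".
translated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="utf-16-le", newline=None
).read()

# newline="": decode only, leaving the original line endings visible.
untranslated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="utf-16-le", newline=""
).read()
```

With translation enabled, the application sees only "\n" regardless of
which convention the file used.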

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

