[Python-3000] Google Sprint Ideas
talin at acm.org
Mon Aug 21 03:34:28 CEST 2006
Guido van Rossum wrote:
> On 8/20/06, Talin <talin at acm.org> wrote:
>> Guido van Rossum wrote:
>> > On 8/20/06, Paul Moore <p.f.moore at gmail.com> wrote:
>> > Without endorsing every detail of his design, tomer filiba has written
>> > several blog (?) entries about this, the latest being
>> > http://sebulba.wikispaces.com/project+iostack+v2 . You can also look
>> > at sandbox/sio/sio.py in svn.
>> One comment after reading this: If we're going to re-invent the Java/C#
>> i/o library, could we at least use the same terminology? In particular,
>> the term "Layer" has connotations which may be confusing in this context
>> - I would prefer something like "Adapter" or "Filter".
> That's an example of what I meant when I said "without endorsing every
> detail of his design".
> I don't know which terminology C++ uses beyond streams. I think Java
> uses Streams for the lower-level stuff and Reader/Writer for the
> higher-level stuff -- or is it the other way around?
Well, the situation with Java is kind of complex. There are two sets of
stream classes, but rather than classifying them as "low-level" and
"high-level", a better classification is "old" and "new". The old
classes (InputStream/OutputStream) are byte-oriented, whereas the newer
ones (Reader/Writer) are character-oriented. It is not the case,
however, that the character-oriented interface sits on top of the
byte-oriented interface - rather, both interfaces are implemented by a
number of different back ends.
For purposes of Python, it probably makes more sense to look at the .Net
System.IO.Stream. (As a general rule, the .Net classes are refactored
versions of the Java classes, which is both good and bad. It's best to
study both if one is looking for inspiration.)
Hmmm, apparently the .Net documentation *does* use the term 'layer' to
describe one stream wrapping another - which I still find strange. To my
mind, the term 'layer' can either describe a particular design stratum
within an architecture - such as the 'device layer' of an operating
system - or it can describe a portion of a document, such as a drawing
layer in a CAD program. I don't normally think of a single instance of a
class wrapping another instance as constituting a "layer" - I usually
use the term "adapter" or "proxy" to describe that case.
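A minimal sketch of the wrapping case described above, where a single
instance of one class wraps another instance and alters its behavior.
UpperCaseAdapter is a hypothetical name invented for this illustration; it
does not come from Java, .Net, or tomer's proposal.

```python
import io


class UpperCaseAdapter:
    """Hypothetical adapter: wraps any object that has a read() method."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def read(self, size=-1):
        # Delegate to the wrapped stream, then transform what it returned.
        return self._wrapped.read(size).upper()


source = io.StringIO("hello, stream")
adapted = UpperCaseAdapter(source)
result = adapted.read()
print(result)
```

The caller only ever talks to the outer object, which is why "adapter" or
"proxy" seems to fit this arrangement better than "layer".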
(OK, so I'm pedantic about naming. Now you know why one of my side
projects is writing an online programmer's thesaurus -- using
Python/TurboGears of course!)
>> Also, I notice that this proposal removes what I consider to be a nice
>> feature of Python, which is that you can take a plain file object and
>> iterate over the lines of the file -- it would require a separate line
>> buffering adapter to be created. I think I understand the reasoning
>> behind this - in a world with multiple text encodings, the definition of
>> "line" may not be so simple. However, I would assume that the "built-in"
>> streams would support the most basic, least-common-denominator encodings
>> for convenience.
> First time I noticed that. But perhaps it's the concept of "plain file
> object" that changed? My own hierarchy (which I arrived at without
> reading tomer's proposal) is something like this:
> (1) Basic level (implemented in C) -- open, close, read, write, seek,
> tell. Completely unbuffered, maps directly to system calls. Does
> binary I/O only.
> (2) Buffering. Implements the same API as (1) but adds buffering. This
> is what one normally uses for binary file I/O. It builds on (1), but
> can also be built on raw sockets instead. It adds an API to inquire
> about the amount of buffered data, a flush() method, and ways to
> change the buffer size.
> (3) Encoding and line endings. Implements a somewhat different API,
> for reading/writing text files; the API resembles Python 2's I/O
> library more. This is where readline() and next() giving the next line
> are implemented. It also does newline translation to/from the
> platform's native convention (CRLF or LF, or perhaps CR if anyone
> still cares about Mac OS <= 9) and Python's convention (always \n). I
> think I want to put these two features (encoding and line endings) in
> the same layer because they are both text related. Of course you can
> specify ASCII or Latin-1 to effectively disable the encoding part.
> Does this make more sense?
I understood that much -- this is pretty much the way everyone does
things these days (our own custom stream library at work looks pretty
much like this too.)
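For illustration, the three levels can be stacked by hand. This is only a
sketch: it assumes the class names from the io module of present-day
Python 3 (FileIO, BufferedReader, TextIOWrapper), which are not named
anywhere in this thread, and the file contents are made up.

```python
import io
import os
import tempfile

# Write a small scratch file with CRLF line endings to read back.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"first\r\nsecond\r\n")

# (1) Basic level: unbuffered, binary only.
raw = io.FileIO(path, "r")
# (2) Buffering: wraps (1) and offers the same binary API.
buffered = io.BufferedReader(raw)
# (3) Encoding and line endings: decodes bytes and translates newlines.
text = io.TextIOWrapper(buffered, encoding="ascii", newline=None)

lines = list(text)   # iterating by line is a level-3 feature
text.close()         # closing the top object closes the ones beneath it
os.remove(path)
print(lines)
```

Iterating over the level-3 object yields lines with the CRLF endings already
translated to \n; the two lower levels only ever see bytes.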
What I was wondering is whether the built-in 'file' function will
return an object of level 3.
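As a point of reference rather than anything decided in this thread: in the
Python 3 releases that eventually shipped, the 'file' builtin is gone and
the built-in open() picks the level from its arguments. The sketch below
assumes a current CPython 3 and uses a throwaway temporary file.

```python
import io
import os
import tempfile

# Create an empty scratch file to open in different modes.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "r") as f_text:                # text mode
    text_type = type(f_text)
with open(path, "rb") as f_bin:                # binary mode, buffered
    bin_type = type(f_bin)
with open(path, "rb", buffering=0) as f_raw:   # binary mode, unbuffered
    raw_type = type(f_raw)

os.remove(path)
print(text_type.__name__, bin_type.__name__, raw_type.__name__)
```

Text mode gives a level-3 object, so iterating over the lines of a freshly
opened file keeps working without creating a separate adapter by hand.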