[Python-3000] iostack, second revision
tomerfiliba at gmail.com
Thu Sep 7 19:30:45 CEST 2006
> As long as the state of the decoder is "neutral" at the start of a
> line, it should be possible to do this. I like the idea that tell()
> returns a "cookie" which is really a byte offset. If one wants to be
> able to seek to positions with a non-neutral decoder state, the cookie
> would have to be more abstract. It shouldn't matter; text apps should
> not do arithmetic on seek/tell positions.
> In all my programming days I don't believe I've written to and read from
> the same file handle even once. Use cases exist, like if you're
> implementing a DBMS, or adding to a zip file in-place, but they're the
> exception, and by separating that functionality out in a dedicated
> class like FileBytes, you avoid having the complexities of mixed input
> and output affect your typical use cases.
> Watch out! There's an essential difference between files and
> bidirectional communications channels that you need to take into
> account. For a TCP connection, input and output can be seen as
> isolated from one another, with each their own stream position, and
> each their own contents. For read/write files, it's a whole different
> ballgame, because stream position and data are shared.
> Now, I'm not saying that you can't stick additional layers in-between
> TextReader and FileStream if you want to. An example might be the
> "resync" layer that you mentioned, or a journaling layer that ensures
> that all writes are recoverable. I'm merely saying that for the specific
> issue of buffering, I think that the choice of buffer type is
> complicated, and requires knowledge that might not be accessible to the
> person assembling the stack.
lots of things have been discussed, lots of new ideas came:
it's time to rethink the design of iostack; i'll try to look into it.
there are several key issues:
* splitting streams to separate reading and writing sides.
* the underlying OS resource can be separated into some very low
level abstraction layer, over which streams would operate.
* stateful-seek-cookies sound like the perfect solution
issues with seeking:
since seek cookies are opaque, there's no sense in having the long-debated
position property (although i really liked it :)). i.e., there's no sense
in doing s.position += some_opaque_cookie
on the other hand, since streams are byte-oriented and the data
abstraction layer (text, etc.) sits on top of them, maybe it makes sense
to split these into two distinct APIs:
* tell()/seek() for the byte-level stream position: a stream is just a
sequence of bytes in which you can seek.
* data-abstraction-layer "pointers": pointers will be stateful stream
locations of encoded *objects*.
you will not be able to "forge" pointers: you'll first have to come across
a valid object location, and only then can you get a "pointer" pointing to it.
of course these pointers should be kept cheap, and for most situations,
plain integers would suffice.
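
to illustrate why the two APIs differ, here's a minimal sketch using only
the stdlib (io.BytesIO stands in for the byte stream; char_at is a made-up
helper, not part of any proposed API). with a fixed-width encoding like
utf-32, every byte offset is a legal seek() target, but only some offsets
are the start of an encoded character:

```python
import io

# utf-32-le: exactly 4 bytes per character, no BOM
data = "h\u00e9llo".encode("utf-32-le")
stream = io.BytesIO(data)

# byte-level API: any offset inside the stream is a legal seek target
stream.seek(5)
assert stream.tell() == 5


def char_at(stream, byte_offset):
    """Hypothetical helper: decode the one character stored at byte_offset."""
    stream.seek(byte_offset)
    return stream.read(4).decode("utf-32-le")


# offsets that are multiples of 4 are valid object locations
assert char_at(stream, 4) == "\u00e9"

# a "forged" offset lands in the middle of a character and fails to decode
try:
    char_at(stream, 5)
    forged_ok = True
except UnicodeDecodeError:
    forged_ok = False
assert not forged_ok
```

so for this encoding a pointer could be a plain integer, as long as it is
only ever handed out at real character boundaries.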
f = TextAdapter(BufferingLayer(FileStream(...)), encoding = "utf-32")
p = f.get_pointer()
or using a property:
p = f.pointer
f.pointer = p
something like that... though i would like to receive comments on
that first, before i go into deeper meditation :)
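
a rough sketch of what the pointer property might look like, just to have
something concrete to argue about. everything here is an assumption: the
Pointer class, the fixed 4-byte width, and this toy TextAdapter (which
wraps a plain io.BytesIO instead of a BufferingLayer/FileStream stack) are
made up for illustration and are not the iostack design:

```python
import io


class Pointer:
    """Opaque location of an encoded object; only an adapter creates these."""
    __slots__ = ("_owner", "_offset")

    def __init__(self, owner, offset):
        self._owner = owner
        self._offset = offset


class TextAdapter:
    """Toy fixed-width text layer over a seekable byte stream (hypothetical)."""
    WIDTH = 4  # utf-32: every character occupies 4 bytes

    def __init__(self, stream, encoding="utf-32-le"):
        self._stream = stream
        self._encoding = encoding

    def read(self, nchars):
        # always consume whole characters, so the stream stays on a boundary
        return self._stream.read(nchars * self.WIDTH).decode(self._encoding)

    @property
    def pointer(self):
        # hand out a pointer to the current (valid) object location
        return Pointer(self, self._stream.tell())

    @pointer.setter
    def pointer(self, p):
        # refuse anything that was not obtained from this adapter
        if not isinstance(p, Pointer) or p._owner is not self:
            raise ValueError("pointer was not obtained from this adapter")
        self._stream.seek(p._offset)


f = TextAdapter(io.BytesIO("hello world".encode("utf-32-le")))
f.read(6)                      # skip "hello "
p = f.pointer                  # remember where "world" starts
assert f.read(5) == "world"
f.pointer = p                  # jump back
assert f.read(5) == "world"
```

internally the pointer is still just an integer here; wrapping it only
keeps callers from doing arithmetic on it or making one up.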