pread/pwrite (was Re: files: direct access - NOT sequential)
aleax at aleax.it
Tue Apr 15 19:25:57 CEST 2003
On Tuesday 15 April 2003 06:19 pm, Jeff Epler wrote:
> On Tue, Apr 15, 2003 at 12:33:54PM +0000, Alex Martelli wrote:
> > Patrick Carabin wrote:
> > > I am looking for a way to get "direct access" inside files, e.g. mixing
> > > read & write operations on the same file.
> > > Example:
> > > given a file with contents "qwerty", positioning on the "w" and
> > > writing an "N", so the file becomes "qNerty".
> > > How can I achieve this in Python?
> > thefile = open('thefile', 'r+') # must be already-existing
> > thefile.seek(1)
> > thefile.write('N')
> > thefile.close()
> > Here, in fact, we're NOT mixing read and write operations -- the
> > only _operation_ is a write -- but, if we wanted to, we might (that
> > is what mode 'r+' means).
> Speaking of which, (if a patch were supplied,) is it remotely likely
> that the os module would get pread/pwrite functions? These make
> lseek+read or lseek+write atomic, which is useful if two threads share
> the same file descriptor and want to use random-access to it in a
> threadsafe way.
I know Guido is reconsidering the whole structure of fileobjects, so
it is quite possible he might entertain the idea (for Python 2.4 -- I
think it's too late for such major surgery in Python 2.3!) -- I think a
clear suggestion to python-dev, with examples and use cases, might
be more useful in this case than a patch, given the prospect of
pensioning off the whole current implementation of file objects.
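
To make the semantics under discussion concrete, here is a minimal sketch of what positional reads and writes on a file descriptor look like. It assumes an os module that exposes pread and pwrite, as later Python 3 releases do on Unix-like systems; the temporary file and the variable names are only for illustration.

```python
import os
import tempfile

# Create a scratch file holding "qwerty"; mkstemp returns an open descriptor.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"qwerty")          # ordinary write: offset is now 6

    # Positional write: put b"N" at offset 1 without a separate lseek.
    os.pwrite(fd, b"N", 1)

    # Positional read: fetch 3 bytes starting at offset 1.
    middle = os.pread(fd, 3, 1)

    # Neither call moved the descriptor's own offset.
    position = os.lseek(fd, 0, os.SEEK_CUR)

    whole = os.pread(fd, 6, 0)
finally:
    os.close(fd)
    os.remove(path)
```

Because each call carries its own offset and leaves the shared file position alone, two threads using the same descriptor cannot disturb each other's positioning, which is the point Jeff raises.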
> (nicest would be if file.read and file.write took an optional pos=
> argument, but pread/pwrite are for file-descriptors, not FILE*s so stdio
> buffering would louse things up)
I think the positioning argument[s] should be the same as for .seek --
ONE argument (unless it's a tuple specifying origin and offset) would
not suffice. Consider a frequent case: "I want to write these bytes
at the END of the file", no matter how long it is now or where it's
currently positioned; or, ditto, but "at the START of the file". I don't
think the Pythonic convention of -1 to mean the end would work here,
i.e. to write _after the end_ -- I think that, like for seek, an offset and
an indicator of origin (start, end, current) would in general be needed.
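
A rough sketch of why both an offset and an origin are needed. The helper write_at is hypothetical (it is not an existing file method), and an in-memory io.BytesIO stands in for a real file to keep the example self-contained.

```python
import io
import os

def write_at(f, data, offset=0, whence=os.SEEK_SET):
    """Hypothetical helper: position exactly as f.seek(offset, whence)
    would, then write data there. Returns the number of bytes written."""
    f.seek(offset, whence)
    return f.write(data)

buf = io.BytesIO(b"qwerty")

# "At the END of the file", regardless of length or current position.
write_at(buf, b"!", 0, os.SEEK_END)

# "At the START of the file".
write_at(buf, b"Q", 0, os.SEEK_SET)

# Two bytes before the end: a negative offset is only meaningful
# together with an explicit origin.
write_at(buf, b"Y", -2, os.SEEK_END)

contents = buf.getvalue()
```

With a single integer argument there would be no way to tell "offset 0 from the start" apart from "offset 0 from the end", which is why the sketch mirrors seek's two-argument form.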
I don't know if adding optional arguments to write and read is the
right way, or whether new methods should be used instead -- I do think that
the key issue is defining the syntax, semantics and use cases, as
opposed to implementing via a patch (buffering might not be a
problem, if such operations implied previous flushing of any
pending buffers or semantics equivalent thereto, for example).
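
One possible reading of "implied previous flushing", sketched under stated assumptions: the helper positional_write is hypothetical, and it relies on os.pwrite being available (Unix-like systems, later Python 3 releases). It flushes the file object's pending buffer before writing through the underlying descriptor, so an earlier buffered write cannot land afterwards and clobber the positional one.

```python
import os
import tempfile

def positional_write(f, data, offset):
    """Hypothetical helper: flush f's pending buffer, then write data
    at offset via the underlying file descriptor."""
    f.flush()                        # buffered writes reach the file first
    return os.pwrite(f.fileno(), data, offset)

# Scratch file holding "qwerty".
fd, path = tempfile.mkstemp()
os.write(fd, b"qwerty")
os.close(fd)

with open(path, "r+b") as f:
    f.seek(1)
    f.write(b"N")                    # still sitting in the buffer
    positional_write(f, b"X", 1)     # flushes b"N", then overwrites it

with open(path, "rb") as f:
    final = f.read()
os.remove(path)
```

Without the flush, the buffered b"N" would only be written out when the file is closed, after the positional write, and the file would end up as "qNerty" instead.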