[Python-ideas] Atomic file.get(offset, length)
Antoine Pitrou
solipsis at pitrou.net
Mon Jul 23 11:52:16 CEST 2012
On Sun, 22 Jul 2012 17:16:57 -0700
Guido van Rossum <guido at python.org> wrote:
> On Sun, Jul 22, 2012 at 4:47 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> > On Sun, 22 Jul 2012 15:36:44 -0700
> > Guido van Rossum <guido at python.org> wrote:
> >> On Sun, Jul 22, 2012 at 2:25 PM, Victor Stinner
> >> <victor.stinner at gmail.com> wrote:
> >> >> "man pread" on OS/X suggests it exists there too
> >> >
> >> > "man pread" or "import os; help(os.pread)" ;-) pread() and pwrite()
> >> > have been added to Python 3.3.
> >>
> >> Awesome. :-) But does the io module offer an API that uses it? It's
> >> kind of awkward to have to call os.pread() with stream.fileno() as an
> >> argument.
> >
> > It doesn't. I guess we could add an "offset" keyword-only argument to
> > read() and write(), but then we need to provide a Windows
> > implementation as well (it seems using overlapped I/O with
> > ReadFile() / WriteFile() could make it possible). Also, I'm not sure it
> > makes sense for buffered I/O, or only unbuffered.
>
> Given that the use case is to avoid race conditions when more than one
> thread is doing random-access reads on the same open file, I think it
> makes some sense to implement it for both buffered and unbuffered
> streams -- and even for text streams, since those support seek() as
> well, so the race condition exists for those too.
>
> But note that the pread() man page (at least the one I checked :-)
> specifies that pread() doesn't affect the file pointer. So I suppose
> it should also not affect the buffer. That may make it hard to
> implement it for text streams (which IIRC rely quite heavily on
> buffering for their implementation), but it should be easy for
> buffered streams: it should just be passed on to the underlying
> unbuffered stream.
Indeed, it should not affect the buffer. That's why I'm questioning the
addition of this feature to buffered streams (whose whole point is their
implicit buffer management). Also, there are implementation subtleties
when e.g. reading from an area which overlaps the current buffer :-)
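
To illustrate the unbuffered case, here is a minimal sketch of a positional
read built on os.pread() (available on Unix since Python 3.3). The helper
name read_at is hypothetical and not part of the io module; the sketch
assumes a raw file opened with buffering=0, so there is no user-space
buffer to keep consistent.

```python
import os
import tempfile


def read_at(raw, offset, length):
    """Read up to `length` bytes at `offset` without moving the file pointer.

    `read_at` is a hypothetical helper for illustration only.
    """
    # os.pread takes (fd, number_of_bytes, offset).
    return os.pread(raw.fileno(), length, offset)


with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"0123456789")
    tmp.flush()
    # buffering=0 returns an unbuffered FileIO object, so no user-space
    # buffer can disagree with the positional read.
    with open(tmp.name, "rb", buffering=0) as raw:
        raw.seek(2)
        chunk = read_at(raw, 5, 3)
        # The positional read should leave the current position untouched.
        position_after = raw.tell()
```

With a buffered stream the same call would bypass the buffer entirely,
which is exactly where the overlap questions above come from.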
As you pointed out, I think a reasonable solution to the race condition
problem is to use several file descriptors. It may not work so well if
you also write to the file, though.
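
A rough sketch of the several-descriptors approach, assuming read-only
access: each thread lazily opens its own file object, so seek() and read()
in one thread cannot disturb another thread's position. The class
PerThreadFile and its read_at method are hypothetical names used only for
this illustration.

```python
import os
import tempfile
import threading


class PerThreadFile:
    """Hypothetical helper: lazily opens one file object per thread."""

    def __init__(self, path):
        self._path = path
        self._local = threading.local()
        self._lock = threading.Lock()
        self._opened = []

    def read_at(self, offset, length):
        f = getattr(self._local, "f", None)
        if f is None:
            # First use in this thread: open a private file object.
            f = open(self._path, "rb")
            self._local.f = f
            with self._lock:
                self._opened.append(f)
        # seek() and read() only touch this thread's own file position.
        f.seek(offset)
        return f.read(length)

    def close(self):
        with self._lock:
            for f in self._opened:
                f.close()
            self._opened.clear()


data = bytes(range(256)) * 16
results = {}


def worker(reader, index):
    offset = index * 100
    expected = data[offset:offset + 50]
    results[index] = all(
        reader.read_at(offset, 50) == expected for _ in range(200)
    )


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.bin")
    with open(path, "wb") as out:
        out.write(data)
    reader = PerThreadFile(path)
    threads = [
        threading.Thread(target=worker, args=(reader, i)) for i in range(4)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    reader.close()
```

If a writer is involved, data still sitting in its buffer is not visible
through the other descriptors until it is flushed, which is one reason this
approach gets less attractive for read/write use.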
Regards
Antoine.
--
Software development and contracting: http://pro.pitrou.net