[Python-ideas] Atomic file.get(offset, length)

Guido van Rossum guido at python.org
Sat Jul 21 22:35:21 CEST 2012


On Sat, Jul 21, 2012 at 12:35 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 7/21/2012 2:59 PM, Matt Chaput wrote:
>>
>> I wish Python binary file objects had an atomic seek-read method, so
>> I wouldn't have to perform my own locking everywhere to prevent other
>> threads from moving the file pointer between seek and read.
>
>
> If you are reading a file from multiple threads, I suggest you write your
> own seek_and_read_with_locks function that does exactly what you need in one
> place. Or add a .readx method to a subclass.
>
>
>> Is this something that can be bubbled up from the underlying
>> platform? I think the Linux C equivalent is pread.
>
> If there is a standard posix function that is not yet wrapped in os, you can
> propose its addition. But do some research to see how widespread and actually
> standardized it is.

"man pread" on OS X suggests it exists there too. I presume the use
case is to have a large data file open for reading by multiple
threads. This is a reasonable use case and it makes some sense to
extend our binary readable streams (buffered and unbuffered) with an
API for this purpose.
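
For illustration, here is a minimal sketch of a positional read at the
file-descriptor level. It assumes a Unix platform and Python 3.3 or
later, where pread is wrapped as os.pread(fd, length, offset). The
helper name read_at is made up for this example and is not an existing
stream method.

```python
import os


def read_at(fd, offset, length):
    """Read up to `length` bytes starting at `offset`.

    os.pread does not move the descriptor's file position, so several
    threads can share one descriptor without a lock around seek + read.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.pread(fd, remaining, offset)
        if not chunk:  # reached end of file
            break
        chunks.append(chunk)
        offset += len(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller would open the file once with os.open(path, os.O_RDONLY), pass
the resulting descriptor to every thread, and close it with os.close
when all readers are done.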

However, it's probably just as efficient to have a separate open
stream per thread -- I doubt that open file descriptors are scarcer
resources than threads, and I presume the kernel will happily share
any buffering it does on behalf of multiple open files referencing the
same file. If you're worried about the buffer space, the default
buffer size is 8K, which is hardly worth mentioning compared to the
default thread stack allocation. Depending on your use case you may
get away with an unbuffered stream just fine.
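
A rough sketch of the one-stream-per-thread idea, using threading.local
to open the file lazily the first time each thread reads. The class
name PerThreadReader and its get(offset, length) method are
hypothetical, chosen only to mirror the subject line; passing
buffering=0 selects an unbuffered stream.

```python
import threading


class PerThreadReader:
    """Give each thread its own open stream on the same file."""

    def __init__(self, path, buffering=-1):
        self._path = path
        self._buffering = buffering  # -1: default buffer, 0: unbuffered
        self._local = threading.local()

    def _stream(self):
        # Each thread sees its own `stream` attribute on the local object.
        stream = getattr(self._local, "stream", None)
        if stream is None:
            stream = open(self._path, "rb", buffering=self._buffering)
            self._local.stream = stream
        return stream

    def get(self, offset, length):
        # No lock needed: the file position belongs to this thread only.
        stream = self._stream()
        stream.seek(offset)
        return stream.read(length)

    def close_current_thread(self):
        # Close the calling thread's stream, if it ever opened one.
        stream = getattr(self._local, "stream", None)
        if stream is not None:
            stream.close()
            del self._local.stream
```

Each worker thread calls reader.get(offset, length) as often as it
likes and reader.close_current_thread() before it exits. Since the
seek and the read happen on a stream no other thread can see, there is
nothing to contend on.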

This approach seems better than implementing something using locks
(since the locks create contention that is not inherent in the
problem) and is available right now, without waiting for Python 3.4 to
be released...

-- 
--Guido van Rossum (python.org/~guido)