[Python-3000] patch: bytes object PyBUF_LOCKDATA read-only and immutable support

Guido van Rossum guido at python.org
Tue Sep 11 23:49:17 CEST 2007


On 9/11/07, Travis E. Oliphant <oliphant at enthought.com> wrote:
> I'm not sure I understand the difference between a classic read lock and
> the exclusive write lock concept.   Does the classic read-lock just
> prevent writing to the memory area?  In my mind that is a read-only
> memory buffer and the buffer interface would complain if a writeable
> buffer was requested.

There are different notions of reading and writing.  Sometimes an
object is naturally read-only (e.g. a PyString). In that case
requesting SIMPLE access should pass but requesting WRITABLE or
LOCKDATA access should fail. (I think the other flags are orthogonal
to these, right?). Any number of concurrent SIMPLE accesses can
coexist since the clients promise they will only read.

OTOH suppose we have an object that is naturally writable (e.g. a
PyBytes). I understood that in this case any number of SIMPLE or
WRITABLE requests would be allowed to be outstanding simultaneously,
and any of these would simply prevent the buffer from moving (fixing
the object's size). But this doesn't sound like it is how you meant it
-- you seem to say that once any SIMPLE (readonly) requests are
outstanding, WRITABLE requests should fail. And I suppose that only
one WRITABLE request ought to be allowed at a time. But then I don't
know what the difference between WRITABLE and LOCKDATA would be.
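
For illustration only, the first reading above (any number of
outstanding requests, each of which merely pins the buffer) can be
sketched with types from later Python releases. The names bytearray
and memoryview are assumptions used here as stand-ins for the mutable
PyBytes and for a buffer request; they are not the SIMPLE / WRITABLE /
LOCKDATA flags under discussion.

```python
# Stand-ins (assumptions): bytearray for a naturally writable object,
# bytes for a naturally read-only one, memoryview for a buffer request.
mutable = bytearray(b"hello")
view_a = memoryview(mutable)   # first outstanding request
view_b = memoryview(mutable)   # a second one coexists with it

try:
    mutable.extend(b"!")       # would have to resize (and maybe move) the buffer
    resize_blocked = False
except BufferError:
    resize_blocked = True      # outstanding views pin the buffer's size

view_a[0] = ord("j")           # writing in place is still allowed
seen_by_b = bytes(view_b)      # the other view observes the change

view_a.release()
view_b.release()
mutable.extend(b"!")           # no outstanding views: resizing works again

# A naturally read-only object only hands out read-only access.
readonly_view = memoryview(b"hello")
try:
    readonly_view[0] = ord("j")
    write_rejected = False
except TypeError:
    write_rejected = True
```

Nothing in this sketch stops view_b from seeing data change underneath
it, which is exactly the gap a separate locking level would fill.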

I guess I would be inclined to propose separate flags for indicating
the operation that the caller will attempt (read or write) and the
level of locking (lock the buffer's address or also prevent anyone
else from writing). Then a "classic read lock" would request read
access while locking out writers (bsddb would use this); a "classic
write lock" would request write access while locking out writers (your
scratch area example would use this); others who don't really care if
the data changes underneath them as long as it doesn't move (e.g.
traditional I/O) could request read access without locking. I'm not
sure if there's a use case to be made for write access without
locking, but I wouldn't rule it out -- possibly when two threads share
a memory area they might have their own protocol for locking it and
might just both want to be able to write to (parts of) it.
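
A minimal sketch of the two separate flags, assuming a hypothetical
Request record and can_grant() helper (neither exists in any proposed
API; they only model the idea). One field says what the caller will
do, the other whether it also wants other writers kept out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical buffer request: operation plus locking level."""
    write: bool         # operation: False = read only, True = read and write
    lock_writers: bool  # locking level: True = also keep other writers out


# The four combinations named in the text.
CLASSIC_READ_LOCK = Request(write=False, lock_writers=True)    # e.g. bsddb
CLASSIC_WRITE_LOCK = Request(write=True, lock_writers=True)    # scratch area
UNLOCKED_READ = Request(write=False, lock_writers=False)       # traditional I/O
UNLOCKED_WRITE = Request(write=True, lock_writers=False)       # threads with their own protocol


def conflicts(new, held):
    # A writer cannot join while a held request locks out writers, and a
    # request that locks out writers cannot join an existing writer.
    return (new.write and held.lock_writers) or (new.lock_writers and held.write)


def can_grant(new, outstanding):
    """Return True if `new` may coexist with every outstanding request."""
    return not any(conflicts(new, held) for held in outstanding)
```

Under this model an unlocked read is never refused, since it neither
writes nor asks for writers to be excluded; the only cost to the object
is that its buffer must stay put.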

What do you think? Another way to look at this would be to consider
these 4 cases:

basic read access (I can read, others can read or write)
locked read access (I can read, others can only read)
basic write access (I can read and write, others can read or write)
exclusive write access (I can read and write, no others can read or write)

Except that accessing the object from Python (e.g. iteration or
indexing) never gets locked out. (Or perhaps it should be? That can
also be done.)

Also, it remains to be seen whether basic read access should be
granted when someone has exclusive write access (see below).
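
The four cases can be written down as a compatibility check. This is a
sketch only: the Access enum and compatible() function are
hypothetical names, Python-level access is not modelled, and the open
question about basic reads during an exclusive write is left as a
`strict` parameter rather than decided.

```python
from enum import Enum


class Access(Enum):
    """Hypothetical names for the four access modes listed above."""
    BASIC_READ = "basic read"
    LOCKED_READ = "locked read"
    BASIC_WRITE = "basic write"
    EXCLUSIVE_WRITE = "exclusive write"


WRITERS = {Access.BASIC_WRITE, Access.EXCLUSIVE_WRITE}


def compatible(a, b, strict=True):
    """Return True if modes `a` and `b` may be outstanding together."""
    if Access.EXCLUSIVE_WRITE in (a, b):
        other = b if a is Access.EXCLUSIVE_WRITE else a
        # strict: nobody shares with an exclusive writer.
        # not strict: a basic reader is still let in.
        return (not strict) and other is Access.BASIC_READ
    if Access.LOCKED_READ in (a, b):
        other = b if a is Access.LOCKED_READ else a
        # "others can only read": any writer is refused.
        return other not in WRITERS
    # Only basic read / basic write remain; they mix freely.
    return True
```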

> Actually, writeable is an accepted variant of 'writable' (but it doesn't
> show up in many spell-check dictionaries).  No, it is not too late to
> change it.  Or just define WRITEABLE as WRITABLE.   NumPy uses
> "WRITEABLE" simply because I like that spelling better.

Google found 1.4M occurrences of writeable vs. 3.9M occurrences of
writable. I guess you represent a strong minority. :-) I'd still like
to see it changed. We can leave WRITEABLE as an alias for WRITABLE for
those who are used to seeing it that way in NumPy.

> I'm anxious for feedback and help with the locking mechanism, because I
> do not have all use cases in mind.  I have never thought about a lock
> that prevents reading.  In my mind, this would be handled by the object
> itself.  It could refuse buffer requests if its data had been locked or
> it could not.

Well, the scratch area scenario you describe makes it iffy to read
anything out of the original object since you wouldn't know whether
you were reading before, during or after the write back from the
scratch area to the object's buffer. The question is, do we really
care. If we adopted my 4 access modes above, we could say that basic
read access will still be granted when someone has exclusive write
access if we don't care, OR we could say that basic reads are locked
out by exclusive write access. (And then there's the separate issue of
whether python-level access counts as basic read access or doesn't
count at all -- though the more I think about it, I think it should be
treated the same as basic read access.)

> On the other hand, there could be two concepts of locking that a
> consumer could request from an object
>
> 1) Lock so that no other reads or writes are possible until the lock is
> released.
> 2) Lock so that only reads are possible.
>
> I had only thought of #2 for the current buffer interface.

#1 maps to locked read OR exclusive write access in the strict variant.
#2 maps to locked read in my scheme.

(Gotta go -- ttyl.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

