[Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

Sat Sep 3 20:59:35 EDT 2016

Brett, I have not followed everything here but I have no problem with
tweaks at this level as long as you are happy with it.

--Guido (mobile)

On Sep 3, 2016 5:39 PM, "Brett Cannon" <brett at python.org> wrote:

>
>
> On Sat, 3 Sep 2016 at 17:27 Yury Selivanov <yselivanov.ml at gmail.com>
> wrote:
>
>>
>> On 2016-09-03 5:19 PM, Brett Cannon wrote:
>> >
>> >
>> > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov <yselivanov.ml at gmail.com>
>> > wrote:
>> >
>> >
>> >
>> >     On 2016-09-03 4:15 PM, Christian Heimes wrote:
>> >     > On 2016-09-04 00:03, Yury Selivanov wrote:
>> >     >>
>> >     >> On 2016-09-03 12:27 PM, Brett Cannon wrote:
>> >     >>> Below is the `co_extra` section of PEP 523 with the update
>> >     >>> saying that users are expected to put a tuple in the field
>> >     >>> for easier simultaneous use of the field.
>> >     >>>
>> >     >>> Since the `co_extra` discussions do not affect CPython itself,
>> >     >>> I'm planning on landing the changes stemming from the PEP,
>> >     >>> probably on Monday.
>> >     >> Tuples are immutable.  If you have multiple co_extra users then
>> >     >> they will have to either mutate the tuple (which isn't always
>> >     >> possible; for instance, you can't increase its size) or replace
>> >     >> it with another tuple.
>> >     >>
>> >     >> Creating lists is a bit more expensive, but item access speed
>> >     >> should be in the same ballpark.
>> >     >>
>> >     >> Another question -- sorry if this was discussed before -- why
>> >     >> do we want a PyObject* there at all?  I.e. why don't we create
>> >     >> a dedicated struct CoExtraContainer to manage the stuff in
>> >     >> co_extra? My understanding is that the users of co_extra are
>> >     >> C-level Python optimizers and profilers, which don't need the
>> >     >> overhead of the CPython API.
>> >
>> >
>> > As Chris pointed out in another email, the overhead is only in the
>> > allocation, not the iteration/access: if you use the PyTuple macros to
>> > get the size and index into the tuple, the overhead is negligible.
>>
>> Yes, my point was that it's as cheap to use a list as a tuple for
>> co_extra, if we decide to store a PyObject in co_extra.
>>
>> >     >>
>> >     >> This way my work to add an extra caching layer (which I'm very
>> >     >> much willing to continue to work on) wouldn't require another
>> >     >> set of extra fields for code objects.
>> >     > Quick idea before I go to bed:
>> >     >
>> >     > You could adopt an API similar to OpenSSL's
>> >     > CRYPTO_get_ex_new_index() API:
>> >     >
>> >     > https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html
>> >     >
>> >     >
>> >     > static int code_index = 0;
>> >     >
>> >     > int PyCodeObject_NewIndex() {
>> >     >      return code_index++;
>> >     > }
>> >     >
>> >     > A library like Pyjion has to acquire an index first. In further
>> >     > calls it uses the index as an offset into the new co_extra field.
>> >     > Libraries don't have to hard-code their offset and two libraries
>> >     > will never conflict. PyCode_New() can pre-populate co_extra with
>> >     > a PyTuple of size code_index. This avoids most resizes if you
>> >     > load Pyjion early. For code_index == 0, leave the field NULL.
>> >
>> >     Sounds like a very good idea!
>> >
>> >
>> > The problem with this is the pre-population. If you don't get your
>> > index assigned before the very first code object is allocated then you
>> > still have to manage the size of the tuple in co_extra. So what this
>> > would do is avoid the iteration but not the allocation overhead.
>> >
>> > If we open up the can of worms in terms of custom functions for this
>> > (which I was trying to avoid), then you end up with
>> > Py_ssize_t _PyCode_ExtraIndex(),
>> > PyObject *_PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and
>> > int _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data),
>> > which do all the right things for creating or resizing the tuple as
>> > necessary and which I think mostly match what Nick had proposed
>> > earlier. But the pseudo-code for _PyCode_GetExtra() would be::
>> >
>> >   if co_extra is None:
>> >     co_extra = (None,) * _next_extra_index
>> >     return None
>> >   elif len(co_extra) <= index:
>> >     ... pad out tuple
>> >     return None
>> >   else:
>> >     return co_extra[index]
>> >
>> > Is that going to save us enough to want to have a custom API for this?
>>
>> But without that new API (basically what Christian proposed) you'd need
>> to iterate over the list in order to find the object that belongs to
>> Pyjion.
>
>
> Yes.
>
>
>>   If we manage to implement my opcode caching idea, we'll have at
>> least two known users of co_extra.  Without a way to claim a particular
>> index in co_extra you will have some overhead to locate your objects.
>>
>
> Two things. One, I would want any new API to start with an underscore so
> people know we can and will change its semantics as necessary. Two, Guido
> would have to re-accept the PEP as this is a shift in the use of the field
> if this is how people want to go.
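
The index-claiming scheme discussed in the quoted thread can be modelled in a
few lines of Python. This is only a sketch under stated assumptions, not
CPython's implementation: FakeCode, extra_index(), get_extra() and set_extra()
are hypothetical names standing in for PyCodeObject and the proposed
_PyCode_ExtraIndex(), _PyCode_GetExtra() and _PyCode_SetExtra(). Unlike the
quoted pseudo-code, get_extra() here does not pad the tuple on a read; that is
a simplification chosen for the sketch.

```python
from typing import Any, Optional, Tuple

# Hypothetical stand-in for the global counter behind _PyCode_ExtraIndex().
_next_extra_index = 0


class FakeCode:
    """Toy stand-in for PyCodeObject; it only models the co_extra field."""

    def __init__(self) -> None:
        self.co_extra: Optional[Tuple[Any, ...]] = None


def extra_index() -> int:
    """Claim a process-wide slot; every caller receives a distinct index."""
    global _next_extra_index
    index = _next_extra_index
    _next_extra_index += 1
    return index


def get_extra(code: FakeCode, index: int) -> Any:
    """Return the value stored in slot `index`, or None if it was never set."""
    extra = code.co_extra
    if extra is None or index >= len(extra):
        return None
    return extra[index]


def set_extra(code: FakeCode, index: int, data: Any) -> None:
    """Store `data` in slot `index` by building a new, padded tuple."""
    if not 0 <= index < _next_extra_index:
        raise IndexError("index was not obtained from extra_index()")
    old = code.co_extra or ()
    # Tuples are immutable, so growing means allocating a replacement.
    size = max(len(old), _next_extra_index)
    padded = old + (None,) * (size - len(old))
    code.co_extra = padded[:index] + (data,) + padded[index + 1:]


# Two independent users claim slots without knowing about each other.
jit_slot = extra_index()
cache_slot = extra_index()
code = FakeCode()
set_extra(code, cache_slot, {"hits": 0})
assert get_extra(code, jit_slot) is None
assert get_extra(code, cache_slot) == {"hits": 0}
```

Each write replaces the whole tuple, which is the allocation cost raised in
the thread; lookups by a claimed index avoid scanning the tuple for the entry
that belongs to a given library.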