[Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

Sat Sep 3 21:21:21 EDT 2016

Great, thanks!

On Sat, Sep 3, 2016, 17:59 Guido van Rossum <gvanrossum at gmail.com> wrote:

> Brett, I have not followed everything here but I have no problem with
> tweaks at this level as long as you are happy with it.
>
> --Guido (mobile)
>
> On Sep 3, 2016 5:39 PM, "Brett Cannon" <brett at python.org> wrote:
>
>>
>>
>> On Sat, 3 Sep 2016 at 17:27 Yury Selivanov <yselivanov.ml at gmail.com>
>> wrote:
>>
>>>
>>> On 2016-09-03 5:19 PM, Brett Cannon wrote:
>>> >
>>> >
>>> > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov <yselivanov.ml at gmail.com>
>>> > wrote:
>>> >
>>> >
>>> >
>>> >     On 2016-09-03 4:15 PM, Christian Heimes wrote:
>>> >     > On 2016-09-04 00:03, Yury Selivanov wrote:
>>> >     >>
>>> >     >> On 2016-09-03 12:27 PM, Brett Cannon wrote:
>>> >     >>> Below is the `co_extra` section of PEP 523 with the update saying
>>> >     >>> that users are expected to put a tuple in the field for easier
>>> >     >>> simultaneous use of the field.
>>> >     >>>
>>> >     >>> Since the `co_extra` discussions do not affect CPython itself, I'm
>>> >     >>> planning on landing the changes stemming from the PEP, probably on
>>> >     >>> Monday.
>>> >     >> Tuples are immutable.  If you have multiple co_extra users then they
>>> >     >> will have to either mutate the tuple (which isn't always possible;
>>> >     >> for instance, you can't increase its size) or replace it with
>>> >     >> another tuple.
>>> >     >>
>>> >     >> Creating lists is a bit more expensive, but item access speed should
>>> >     >> be in the same ballpark.
>>> >     >>
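A small Python illustration of the tuple-versus-list point above. The helper
names are hypothetical, and a real co_extra user would do this through the C
API rather than from Python:

```python
# Hypothetical helpers: each adds one more user's data to a co_extra value.

def add_user_tuple(co_extra, data):
    """Tuples cannot grow in place, so a new tuple must replace the old one."""
    return co_extra + (data,)


def add_user_list(co_extra, data):
    """Lists grow in place, so every holder of the list sees the new entry."""
    co_extra.append(data)
    return co_extra


old_tuple = ("jit",)
new_tuple = add_user_tuple(old_tuple, "cache")

old_list = ["jit"]
new_list = add_user_list(old_list, "cache")
```

With a tuple, whoever adds an entry has to store the replacement object back
into the field; with a list, the object stored in the field stays the same.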
>>> >     >> Another question -- sorry if this was discussed before -- why do we
>>> >     >> want a PyObject* there at all?  I.e. why don't we create a dedicated
>>> >     >> struct CoExtraContainer to manage the stuff in co_extra? My
>>> >     >> understanding is that the users of co_extra are C-level Python
>>> >     >> optimizers and profilers, which don't need the overhead of the
>>> >     >> CPython API.
>>> >
>>> >
>>> > As Chris pointed out in another email, the overhead is only in the
>>> > allocation, not the iteration/access. If you use the PyTuple macros to
>>> > get the size and index into the tuple, the overhead is negligible.
>>>
>>> Yes, my point was that it's as cheap to use a list as a tuple for
>>> co_extra, if we decide to store a PyObject in co_extra.
>>>
>>> >     >>
>>> >     >> This way my work to add an extra caching layer (which I'm very much
>>> >     >> willing to continue to work on) wouldn't require another set of
>>> >     >> extra fields for code objects.
>>> >     > Quick idea before I go to bed:
>>> >     >
>>> >     > You could adopt an API similar to OpenSSL's
>>> >     > CRYPTO_get_ex_new_index() API:
>>> >     >
>>> >     > https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html
>>> >     >
>>> >     >
>>> >     > static int code_index = 0;
>>> >     >
>>> >     > int PyCodeObject_NewIndex() {
>>> >     >      return code_index++;
>>> >     > }
>>> >     >
>>> >     > A library like Pyjion has to acquire an index first. In further calls
>>> >     > it uses the index as an offset into the new co_extra field. Libraries
>>> >     > don't have to hard-code their offset and two libraries will never
>>> >     > conflict. PyCode_New() can pre-populate co_extra with a PyTuple of
>>> >     > size code_index. This avoids most resizes if you load Pyjion early.
>>> >     > For code_index == 0, leave the field NULL.
>>> >
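The index-registry idea above can be modelled in a few lines of Python. This
is only a toy sketch: new_extra_index() and new_co_extra() are hypothetical
stand-ins for PyCodeObject_NewIndex() and for the pre-population step in
PyCode_New(), and None plays the role of NULL.

```python
# Toy model of the proposal; every name here is hypothetical.
_next_extra_index = 0  # plays the role of the static code_index counter


def new_extra_index():
    """Hand out the next unused slot number to a library."""
    global _next_extra_index
    index = _next_extra_index
    _next_extra_index += 1
    return index


def new_co_extra():
    """Value a freshly created code object would get for co_extra.

    None (standing in for NULL) when no library has claimed a slot yet,
    otherwise a tuple with one empty slot per claimed index.
    """
    if _next_extra_index == 0:
        return None
    return (None,) * _next_extra_index


before_any_user = new_co_extra()   # no slots claimed yet
jit_slot = new_extra_index()       # e.g. claimed by a JIT such as Pyjion
cache_slot = new_extra_index()     # e.g. claimed by an opcode cache
after_two_users = new_co_extra()   # one empty slot per claimed index
```

Code objects created before a library claims its index would still have a
co_extra that is too short for that index.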
>>> >     Sounds like a very good idea!
>>> >
>>> >
>>> > The problem with this is the pre-population. If you don't get your
>>> > index assigned before the very first code object is allocated then you
>>> > still have to manage the size of the tuple in co_extra. So what this
>>> > would do is avoid the iteration but not the allocation overhead.
>>> >
>>> > If we open up the can of worms in terms of custom functions for this
>>> > (which I was trying to avoid), then you end up with
>>> > Py_ssize_t _PyCode_ExtraIndex(),
>>> > PyObject *_PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and
>>> > int _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data),
>>> > which do all the right things for creating or resizing the tuple as
>>> > necessary and which I think mostly match what Nick had proposed
>>> > earlier. But the pseudo-code for _PyCode_GetExtra() would be::
>>> >
>>> >   if co_extra is None:
>>> >     co_extra = (None,) * _next_extra_index
>>> >     return None
>>> >   elif len(co_extra) <= index:
>>> >     ... pad out tuple
>>> >     return None
>>> >   else:
>>> >     return co_extra[index]
>>> >
>>> > Is that going to save us enough to want to have a custom API for this?
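For reference, a runnable Python sketch of what the three proposed functions
might do, following the pseudo-code above. FakeCode, extra_index(),
get_extra() and set_extra() are hypothetical stand-ins for a code object and
for the proposed _PyCode_ExtraIndex(), _PyCode_GetExtra() and
_PyCode_SetExtra(); the padding strategy is an assumption, not something the
PEP specifies.

```python
class FakeCode:
    """Stand-in for a code object; only the co_extra field is modelled."""

    def __init__(self):
        self.co_extra = None


_next_extra_index = 0  # number of slots handed out so far


def extra_index():
    """Claim a new slot and return its index."""
    global _next_extra_index
    index = _next_extra_index
    _next_extra_index += 1
    return index


def _padded(code):
    """Return co_extra as a tuple with a slot for every claimed index."""
    current = code.co_extra or ()
    missing = _next_extra_index - len(current)
    if missing > 0:
        # Tuples cannot grow, so padding means storing a new, longer tuple.
        current = current + (None,) * missing
        code.co_extra = current
    return current


def get_extra(code, index):
    """Return the data stored at index, or None if nothing is stored."""
    if not 0 <= index < _next_extra_index:
        raise IndexError("index was never handed out by extra_index()")
    return _padded(code)[index]


def set_extra(code, index, data):
    """Store data at index, replacing the tuple held in co_extra."""
    if not 0 <= index < _next_extra_index:
        raise IndexError("index was never handed out by extra_index()")
    slots = list(_padded(code))
    slots[index] = data
    code.co_extra = tuple(slots)


code = FakeCode()
jit_slot = extra_index()
first_read = get_extra(code, jit_slot)   # empty slot, created on demand
cache_slot = extra_index()               # claimed after the code object exists
set_extra(code, cache_slot, "cache data")
```

The second index is claimed after the code object already exists, which is
the case where the tuple still has to be padded at access time.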
>>>
>>> But without that new API (basically what Christian proposed) you'd need
>>> to iterate over the list in order to find the object that belongs to
>>> Pyjion.
>>
>>
>> Yes.
>>
>>
>>>   If we manage to implement my opcode caching idea, we'll have at
>>> least two known users of co_extra.  Without a way to claim a particular
>>> index in co_extra you will have some overhead to locate your objects.
>>>
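To make the lookup-overhead point concrete, here is a hedged Python sketch
contrasting the two ways a co_extra user could find its own entry. JitData
and OpcodeCache are hypothetical payload types, and identifying an entry by
its type is only one possible scanning scheme.

```python
class JitData:
    """Hypothetical payload a JIT would keep in co_extra."""


class OpcodeCache:
    """Hypothetical payload an opcode cache would keep in co_extra."""


def find_by_scan(co_extra, wanted_type):
    """Without a claimed index: walk the container until a match turns up."""
    for entry in co_extra:
        if isinstance(entry, wanted_type):
            return entry
    return None


def find_by_index(co_extra, index):
    """With a claimed index: a single direct access."""
    return co_extra[index]


co_extra = (OpcodeCache(), JitData())
jit_index = 1  # assumed to have been handed out by an index-claiming API

scanned = find_by_scan(co_extra, JitData)
indexed = find_by_index(co_extra, jit_index)
```

Both calls return the same object; the scan costs time proportional to the
number of users, while the indexed access does not.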
>>
>> Two things. One, I would want any new API to start with an underscore so
>> people know we can and will change its semantics as necessary. Two, Guido
>> would have to re-accept the PEP as this is a shift in the use of the field
>> if this is how people want to go.
>>