[Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra
Brett Cannon
brett at python.org
Sat Sep 3 20:36:39 EDT 2016
On Sat, 3 Sep 2016 at 17:27 Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>
> On 2016-09-03 5:19 PM, Brett Cannon wrote:
> >
> >
> > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov <yselivanov.ml at gmail.com>
> > wrote:
> >
> >
> >
> > On 2016-09-03 4:15 PM, Christian Heimes wrote:
> > > On 2016-09-04 00:03, Yury Selivanov wrote:
> > >>
> > >> On 2016-09-03 12:27 PM, Brett Cannon wrote:
> > >>> Below is the `co_extra` section of PEP 523 with the update
> > saying that
> > >>> users are expected to put a tuple in the field for easier
> > simultaneous
> > >>> use of the field.
> > >>>
> > >>> Since the `co_extra` discussions do not affect CPython itself I'm
> > >>> planning on landing the changes stemming from the PEP probably
> > on Monday.
> > >> Tuples are immutable. If you have multiple co_extra users then
> > they
> > >> will have to either mutate the tuple (which isn't always possible, for
> > >> instance, you can't increase size), or to replace it with
> > another tuple.
> > >>
> > >> Creating lists is a bit more expensive, but item access speed
> > should be
> > >> in the same ballpark.
> > >>
> > >> Another question -- sorry if this was discussed before -- why
> > do we want
> > >> a PyObject* there at all? I.e. why don't we create a dedicated
> > struct
> > >> CoExtraContainer to manage the stuff in co_extra? My
> > understanding is
> > >> that the users of co_extra are C-level python optimizers and
> > profilers,
> > >> which don't need the overhead of CPython API.
> >
> >
> > As Chris pointed out in another email, the overhead is only in the
> > allocation, not the iteration/access: if you use the PyTuple macros to
> > get the size and index into the tuple, the overhead is negligible.
>
> Yes, my point was that it's as cheap to use a list as a tuple for
> co_extra, if we decide to store a PyObject in co_extra.
>
> > >>
> > >> This way my work to add an extra caching layer (which I'm very
> much
> > >> willing to continue to work on) wouldn't require another set of
> > extra
> > >> fields for code objects.
> > > Quick idea before I go to bed:
> > >
> > > You could adopt a similar API to OpenSSL's
> CRYPTO_get_ex_new_index()
> > > API,
> > >
> >
> https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html
> > >
> > >
> > > static int code_index = 0;
> > >
> > > int PyCodeObject_NewIndex() {
> > >     return code_index++;
> > > }
> > >
> > > A library like Pyjion has to acquire an index first. In further
> > calls it
> > > uses the index as offset into the new co_extra field. Libraries
> > don't
> > > have to hard-code their offset and two libraries will never
> > conflict.
> > > PyCode_New() can pre-populate co_extra with a PyTuple of size
> > > code_index. This avoids most resizes if you load Pyjion early. For
> > > code_index == 0 leave the field NULL.
> >
> > Sounds like a very good idea!
> >
> >
> > The problem with this is the pre-population. If you don't get your
> > index assigned before the very first code object is allocated then you
> > still have to manage the size of the tuple in co_extra. So what this
> > would do is avoid the iteration but not the allocation overhead.
> >
> > If we open up the can of worms in terms of custom functions for this
> > (which I was trying to avoid), then you end up with Py_ssize_t
> > _PyCode_ExtraIndex(), PyObject *
> > _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int
> > _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data)
> > which does all the right things for creating or resizing the tuple as
> > necessary and which I think matches mostly what Nick had proposed
> > earlier. But the pseudo-code for _PyCode_GetExtra() would be::
> >
> >     if co_extra is None:
> >         co_extra = (None,) * _next_extra_index
> >         return None
> >     elif len(co_extra) <= index:
> >         ... pad out tuple
> >         return None
> >     else:
> >         return co_extra[index]
> >
> > Is that going to save us enough to want to have a custom API for this?
>
> But without that new API (basically what Christian proposed) you'd need
> to iterate over the list in order to find the object that belongs to
> Pyjion.
Yes.
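
To make the cost difference concrete, here is a minimal Python sketch of the
two lookup styles. It is only a model of the idea, not CPython code: the
names PyjionData, find_by_scan and find_by_index are hypothetical, and
co_extra is represented as a plain tuple.

```python
class PyjionData:
    """Hypothetical stand-in for whatever a JIT attaches to a code object."""


def find_by_scan(co_extra, owner_type):
    # Without a claimed index, each user has to search the shared
    # container for the entry that belongs to it.
    for item in co_extra:
        if isinstance(item, owner_type):
            return item
    return None


def find_by_index(co_extra, index):
    # With a claimed index, lookup is a bounds check plus one subscript.
    return co_extra[index] if index < len(co_extra) else None
```

The scan grows with the number of co_extra users, while the indexed lookup
stays constant regardless of how many users there are.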
> If we manage to implement my opcode caching idea, we'll have at
> least two known users of co_extra. Without a way to claim a particular
> index in co_extra you will have some overhead to locate your objects.
>
Two things. One, I would want any new API to start with an underscore so
people know we can and will change its semantics as necessary. Two, Guido
would have to re-accept the PEP as this is a shift in the use of the field
if this is how people want to go.