[Python-Dev] Pre-PEP: Redesigning extension modules

Sun Sep 1 03:28:36 CEST 2013

On 1 Sep 2013 05:18, "Stefan Behnel" <stefan_ml at behnel.de> wrote:
>
> Nick Coghlan, 31.08.2013 18:49:
> > On 25 Aug 2013 21:56, "Stefan Behnel" wrote:
> >>>>> One key point to note is that it *doesn't* call
> >>>>> _PyImport_FixupExtensionObject, which is the API that handles all
> >>>>> the PEP 3121 per-module state stuff. Instead, the idea will be for
> >>>>> modules that don't need additional C level state to just implement
> >>>>> PyImportExec_NAME, while those that *do* need C level state
> >>>>> implement PyImportCreate_NAME and return a custom object (which may
> >>>>> or may not be a module subtype).
> >>>>
> >>>> Is it really a common case for an extension module not to need any C
> >>>> level state at all? I mean, this might work for very simple
> >>>> accelerator modules with only a few stand-alone functions. But
> >>>> anything non-trivial will almost certainly have some kind of global
> >>>> state, cache, external library, etc., and that state is best stored
> >>>> at the C level for safety reasons.
> >
> > In my experience, most extension authors aren't writing high
> > performance C accelerators, they're exposing an existing C API to
> > Python. It's the cffi use case rather than the Cython use case.
>
> Interesting. I can't really remember a case where I could afford the
> runtime overhead of implementing a wrapper in Python and going through
> something like ctypes or cffi. I mean, testing C libraries with Python
> tools would be one, but then, you wouldn't want to write an extension
> module for that and instead want to call it directly from the test code as
> directly as possible.
>
> I'm certainly aware that that use case exists, though, and also the case
> of just wanting to get things done as quickly and easily as possible.

Keep in mind I first came to Python as a tool for test automation of custom
C++ hardware APIs that could be written to be SWIG friendly.

I now work for an OS vendor where the 3 common languages for system
utilities are C, C++ and Python.

For those use cases, dropping a bunch of standard Python objects in a
module dict is often going to be a quick and easy solution that avoids a
lot of nasty pointer lifecycle issues at the C level.

This style of extension code would suffer similar runtime checking overhead
as Python, including for function calls, but, like CPython itself, would
still often be "fast enough".

However, as soon as you want to manually optimise for *speed* at all,
you're going to want to remove those module internal indirections through
the Python API. There are at least three ways to do this (internally,
CPython uses all of them in various places):

* type checks followed by direct function calls on the optimised path and
falling back to the abstract object APIs on the compatibility path
* type checks followed by an exception for unknown types
* hidden state that isn't exposed directly at the Python level and hence
can be trusted to only be changed through the module APIs.

The third approach can be implemented in three ways, with various
consequences:

* C static variables. For mutable state, including pointers to Python
types, this breaks subinterpreters, reloading in place and loading a fresh
copy of the module
* PEP 3121 per-interpreter shared state. Handles subinterpreters, *may*
handle reloading (but may segfault if references are held to old types and
functions from before the reload), doesn't handle loading a fresh copy at
all.
* PEP 3121 with a size of "0". As above, but avoids the module state APIs
in order to support reloading. All module state (including type
cross-references) is stored in hidden state (e.g. an instance of a custom
type not exposed to Python, with a reference stored on each custom type
object defined in the module, and any module level "functions" actually
being methods of a hidden object). Still doesn't support loading a *fresh*
copy due to the hidden PEP 3121 module cache.
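The "hidden state object" variant in the last bullet can be illustrated with a Python analogy. This is only a sketch: `_HiddenState`, `make_module` and the module name are hypothetical, and in a real extension the state would be a C struct that Python code cannot reach at all (here it is still reachable through the bound methods' `__self__`).

```python
import re
import types


class _HiddenState:
    """Per-module-instance state; never placed in the module namespace."""

    def __init__(self):
        self._word = re.compile(r"\w+")  # pre-initialised helper object
        self._calls = 0                  # mutable state

    def count_words(self, text):
        self._calls += 1
        return len(self._word.findall(text))

    def call_count(self):
        return self._calls


def make_module(name):
    """Build a module whose 'functions' are bound methods of hidden state."""
    state = _HiddenState()
    module = types.ModuleType(name)
    module.count_words = state.count_words
    module.call_count = state.call_count
    return module


# Two copies of the "same" module keep fully independent state.
first = make_module("hypothetical_words")
second = make_module("hypothetical_words")
words = first.count_words("two words")
```

Because each copy owns its own state object, nothing is shared through static variables, which is the property that subinterpreters and fresh loads need.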

The proposed new approach is to bypass the PEP 3121 cache entirely, and
instead recommend providing an instance of a custom type to be placed in
sys.modules. Extension modules will be given the ability to explicitly
disallow in-place reloading *or* to make it work reliably, rather than the
status quo, where the import system assumes it will work and may instead
fail in unusual ways.
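At the Python level, the import system already accepts arbitrary objects in sys.modules, which is what makes this approach possible. A minimal sketch, assuming a hypothetical `HypotheticalModule` class standing in for the custom type an extension would provide:

```python
import importlib
import sys


class HypotheticalModule:
    """Stand-in for a custom type; all state lives on the instance."""

    def __init__(self):
        self.cache = {}

    def lookup(self, key):
        # Assign each new key the next integer id and remember it.
        return self.cache.setdefault(key, len(self.cache))


# Place the instance in sys.modules under an illustrative name.
sys.modules["hypothetical_custom"] = HypotheticalModule()

# The import machinery returns whatever object is registered.
imported = importlib.import_module("hypothetical_custom")
ids = [imported.lookup("a"), imported.lookup("b"), imported.lookup("a")]

# Clean up so the example leaves sys.modules as it found it.
del sys.modules["hypothetical_custom"]
```

How an extension module would hand such an object to the loader is exactly the part of the C level API still under discussion here.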

> > Mutable module global state is always a recipe for obscure bugs, and not
> > something I will ever let through code review without a really good
> > rationale. Hidden process global state is never good, just sometimes a
> > necessary evil.
>
> I'm not necessarily talking about mutable state. Rather about things like
> pre-initialised data or imported functionality. For example, I often have
> a bound method of a compiled regex lying around somewhere in my Python
> modules as a utility function. And the same kind of stuff exists in C
> code, some may be local to a class, but other things can well be module
> global. And given that we are talking about module internals here I'd
> always keep them at the C level rather than exposing them through the
> module dict. The module dict involves a much higher access overhead, in
> addition to the reduced safety due to user accessibility.
>
> Exported C-APIs are also a use case. You'd import the C-API of another
> module at init time and from that point on only go through function
> pointers etc. Those are (sub-)interpreter specific, i.e. they are module
> global state that is specific to the currently loaded module instances.

Due to refcounting, all instances of Python objects qualify as mutable
state.

Hopefully my elaboration above helps make it clear why I think it's
worthwhile to clearly separate out the "no custom C level state needed"
case.

> > However, keep in mind my patch is currently just the part I can
> > implement without PEP 451 module spec objects.
>
> Understood.
>
>
> >> Note that even global functions usually hold state, be it in the form
> >> of globally imported modules, global caches, constants, ...
> >
> > If they can be shared safely across multiple instances of the module
> > (e.g. immutable constants), then these can be shared at the C level.
> > Otherwise, a custom Python type will be needed to make them instance
> > specific.
>
> I assume you meant a custom module (extension) type here.

Not sure yet. For PEP 451, we still need to support arbitrary objects in
sys.modules, so it's still possible that freedom will be made available to
extension modules.

>
> Just to be clear, the "module state at the C-level" is meant to be stored
> in the object struct fields of the extension type that implements the
> module, at least for modules that want to support reloading and
> sub-interpreters. Obviously, nothing should be stored in static (global)
> variables etc.

Right.

> >>> We also need the create/exec split to properly support reloading.
> >>> Reload *must* reinitialize the object already in sys.modules instead
> >>> of inserting a different object or it completely misses the point of
> >>> reloading modules over deleting and reimporting them (i.e. implicitly
> >>> affecting the references from other modules that imported the
> >>> original object).
> >>
> >> Interesting. I never thought of it that way.
> >>
> >> I'm not sure this can be done in general. What if the module has
> >> threads running that access the global state? In that case,
> >> reinitialising the module object itself would almost certainly lead to
> >> a crash.

That's why I want a way for loaders in general (and extension modules in
particular) to clearly say "in-place reloading not supported", rather than
Python blundering ahead with it and risking a crash.
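As a rough illustration of a loader declining in-place reloads, the sketch below uses the create_module/exec_module loader methods found in current importlib, which are not the hooks proposed in this thread. The finder, loader, module name and `_initialised` marker are all hypothetical.

```python
import importlib
import importlib.abc
import importlib.util
import sys

MODULE_NAME = "hypothetical_no_reload"  # illustrative name only


class NoReloadLoader(importlib.abc.Loader):
    """Loader whose exec step refuses to run twice on the same module."""

    def create_module(self, spec):
        return None  # fall back to the default module type

    def exec_module(self, module):
        if getattr(module, "_initialised", False):
            raise ImportError(
                f"{module.__name__} does not support in-place reloading"
            )
        module._initialised = True
        module.answer = 42


class NoReloadFinder(importlib.abc.MetaPathFinder):
    """Finder that knows about exactly one module name."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname == MODULE_NAME:
            return importlib.util.spec_from_loader(fullname, NoReloadLoader())
        return None


finder = NoReloadFinder()
sys.meta_path.insert(0, finder)
try:
    mod = importlib.import_module(MODULE_NAME)
    try:
        importlib.reload(mod)  # re-runs exec_module on the same object
        reload_refused = False
    except ImportError:
        reload_refused = True
finally:
    sys.meta_path.remove(finder)
    sys.modules.pop(MODULE_NAME, None)
```

The refusal surfaces as an ordinary ImportError rather than a crash, and the original module object is left untouched.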

> > My current proposal on import-sig is to make the first hook
> > "prepare_module", and pass in the existing object in the reload case.
> > For the extension loader, this would be reflected in the signature of
> > the C level hook as well, so the module could decide for itself if it
> > supported reloading.
>
> I really don't like the idea of reloading by replacing module state. It
> would be much simpler if the module itself would be replaced, then the
> original module could stay alive and could still be used by those who hold
> a reference to it or parts of its contents. Especially the from-import
> case would benefit from this. Obviously, you could still run into obscure
> bugs where a function you call rejects the input because it expects an
> older version of a type, for example. But I can't see that being worse (or
> even just different) from the reload-by-refilling-dict case.

Sure, this is what we do in the test suite in
"test.support.import_fresh_module". It was actually Eli trying to use that
in the etree tests that triggered our recent investigation of the limits of
PEP 3121 (it breaks for stateful extension modules due to the
per-interpreter caching).

It's a different operation from imp.reload, though. Assuming we can get
this stable and reliable in the new API, I expect we'll be able to add
"imp.reload_fresh" as a supported API in 3.5.
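The difference between the two operations is easy to see with a pure Python module. This sketch uses importlib.reload (the modern spelling of imp.reload) and a throwaway module written to a temporary directory; the module name is illustrative only.

```python
import importlib
import pathlib
import sys
import tempfile

NAME = "hypothetical_reload_demo"  # illustrative module name

with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, NAME + ".py").write_text("VALUE = 1\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        original = importlib.import_module(NAME)

        # In-place reload: same object, namespace re-executed.
        original.VALUE = 99
        reloaded = importlib.reload(original)
        reload_keeps_identity = reloaded is original
        value_after_reload = original.VALUE

        # Fresh import: drop the sys.modules entry and import again.
        original.VALUE = 99
        del sys.modules[NAME]
        fresh = importlib.import_module(NAME)
        fresh_is_new_object = fresh is not original
        fresh_value = fresh.VALUE
        stale_value = original.VALUE  # old references keep the old state
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(NAME, None)
```

After a reload, every existing reference sees the reinitialised module. After a fresh import, old references keep pointing at the old object, which is the behaviour stateful extension modules cannot currently provide.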

> You seemed to be ok with my idea of making the loader return a wrapped
> extension module instead of the module itself. We should actually try
> that.

Sure, that's just a variant of the "hidden state object" idea I described
above. It should actually work today with the PEP 3121 custom storage size
set to zero.

> > This is actually my primary motivation for trying to improve the
> > "can this be reloaded or not?" aspects of the loader API in PEP 451.
>
> I assume you mean that the extension module would be able to clearly
> signal that it can't be reloaded, right? I agree that that's helpful. If
> you're wrapping a C library, then the way that library is implemented
> might simply force you to prevent any attempts at reloading the wrapper
> module. But if reloading is possible at all, it would be even more helpful
> if we could make it really easy to properly support it.

Yep, that's my goal (and why it's really good to be having this discussion
while PEP 451 is still in development).

>
>
> > (keep in mind existing extension modules using the existing API will
> > still never be reloaded)
>
> Sure, that's the cool thing. We can really design this totally from
> scratch without looking back.

Well, not *quite*. We need to ensure that both APIs can coexist in the same
module for source compatibility without nasty ifdef hacks, and that there is
a reasonable migration path for existing handwritten extension modules.

> >>> Take a look at the current example - everything gets stored in the
> >>> module dict for the simple case with no C level global state.
> >>
> >> Well, you're storing types there. And those types are your module API.
> >> I understand that it's just an example, but I don't think it matches a
> >> common case. As far as I can see, the types are not even interacting
> >> with each other, let alone doing any C-level access of each other. We
> >> should try to focus on the normal case that needs C-level state and
> >> C-level field access of extension types. Once that's solved, we can
> >> still think about how to make the really simple cases simpler, if it
> >> turns out that they are not simple enough.
> >
> > Our experience is very different - my perspective is that the normal
> > case either eschews C level global state in the extension module,
> > because it causes so many problems, or else just completely ignores
> > subinterpreter support and proper module cleanup.
>
> As soon as you have more than one extension type in your module, and they
> interact with each other, they will almost certainly have to do type
> checks against each other to make sure users haven't passed them rubbish
> before they access any C struct fields of the object. Doing a type check
> means that at least one type has a pointer to the other, meaning that it
> holds global module state.

Sure, but you can use the CPython API rather than writing normal C code. We
do this fairly often in CPython when we're dealing with things stored in
modules that can be manipulated from Python.

It incurs CPython's dynamic dispatch overhead, but sometimes that's worth
it to avoid needing to deal with C level lifecycle issues.

> I really think that having some kind of global module state is the
> exceedingly common case for an extension module.

I wouldn't be willing to make the call about which of stateless vs stateful
is more common without a lot more research :)

They're both common enough that I think they should both be well supported,
with the "no custom C level state" case made as simple as possible.

> >> I didn't know about PyType_FromSpec(), BTW. It looks like a nice
> >> addition for manually written code (although useless for Cython).
> >
> > This is the only way to create custom types when using the stable ABI.
> > Can I take your observation to mean that Cython doesn't currently offer
> > the option of limiting itself to the stable ABI?
>
> Correct. I've taken a bird's view at it back then, and keep stumbling over
> "wow - I couldn't even use that?" kind of declarations in the header
> files. I don't think it makes sense for Cython. Existing CPython versions
> are easy to support because they don't change anymore, and new major
> releases most likely need adaptations anyway, if only to adapt to new
> features and performance changes. Cython actually knows quite a lot about
> the inner workings of CPython and its various releases. Going only through
> the stable ABI parts of the C-API would make the code horribly slow in
> comparison, so there are huge drawbacks for the benefit it might give.
>
> The Cython way of doing it is more like: you want your code to run on a
> new CPython version, then use a recent Cython release to compile it. It
> may still work with older ones, but what you actually want is the newest
> anyway, and you also want to compile the C code for the specific CPython
> version at hand to get the most out of it. It's the C code that adapts,
> not the runtime code (or Cython itself).
>
> We run continuous integration tests with all of CPython's development
> branches since 2.4, so we usually support new CPython releases long before
> they are out. And new releases of CPython rarely affect Cython user code.

The main advantage of the stable ABI is being able to publish cross-version
binary extension modules. I guess if Cython already supports generating
binaries for each new version of CPython before we release it, that
capability is indeed less useful than it is for those that are maintaining
extension modules by hand.

Cheers,
Nick.

>
> Stefan
>
>