[Python-Dev] Choosing a best practice solution for Python/extension modules

Jean-Paul Calderone exarkun at divmod.com
Sat Feb 21 20:32:50 CET 2009


On Sat, 21 Feb 2009 11:07:07 -0800, Brett Cannon <brett at python.org> wrote:
>On Sat, Feb 21, 2009 at 09:17, Jean-Paul Calderone <exarkun at divmod.com> wrote:
>
>> On Fri, 20 Feb 2009 13:45:26 -0800, Brett Cannon <brett at python.org> wrote:
>>
>>> On Fri, Feb 20, 2009 at 12:53, Aahz <aahz at pythoncraft.com> wrote:
>>>
>>>> On Fri, Feb 20, 2009, Brett Cannon wrote:
>>>> > On Fri, Feb 20, 2009 at 12:37, Brett Cannon <brett at python.org> wrote:
>>>> >> On Fri, Feb 20, 2009 at 12:31, Daniel Stutzbach <daniel at stutzbachenterprises.com> wrote:
>>>> >>>
>>>> >>> A slight change would make it work for modules where only key
>>>> >>> functions have been rewritten.  For example, pickle.py could read:
>>>> >>>
>>>> >>> from _pypickle import *
>>>> >>> try: from _pickle import *
>>>> >>> except ImportError: pass
>>>> >>
>>>> >> True, although that still suffers from the problem of overwriting
>>>> >> things like __name__, __file__, etc.
>>>> >
>>>> > Actually, I take that back; the IMPORT_STAR opcode doesn't pull in
>>>> > anything starting with an underscore. So while this alleviates the
>>>> > worry above, it does mean that anything that gets rewritten needs to
>>>> > have a name that does not lead with an underscore for this to work.
>>>> > Is that really an acceptable compromise for a simple solution like
>>>> > this?
>>>>
>>>> Doesn't __all__ control this?
>>>>
>>>
>>>
>>> If you define it, yes.
>>>
>>> But there is another issue with this: the pure Python code will never call
>>> the extension code because the globals will be bound to _pypickle and not
>>> _pickle. So if you have something like::
>>>
>>>  # _pypickle
>>>  def A(): return _B()
>>>  def _B(): return -13
>>>
>>>  # _pickle
>>>  def _B(): return 42
>>>
>>>  # pickle
>>>  from _pypickle import *
>>>  try: from _pickle import *
>>>  except ImportError: pass
>>>
>>> If you import pickle and call pickle.A() you will get -13, which is not
>>> what you are after.
>>>
>>
>> If pickle and _pypickle are both Python modules, and _pypickle.A is
>> intended to be used all the time, regardless of whether _pickle is
>> available, then there's not really any reason to implement A in
>> _pypickle.  Just implement it in pickle.  Then import whatever
>> optionally fast thing it depends on from _pickle, if possible, and fall
>> back to the less fast thing in _pypickle otherwise.
>>
>> This is really the same as any other high-level/low-level library split.
>> It doesn't matter that in this case, one low-level implementation is
>> provided as an extension module.  Importing the low-level APIs from
>> another module and then using them to implement high-level APIs is a
>> pretty common, simple, well-understood technique which is quite
>> applicable here.
>
>
>But that doesn't provide a clear way, short of screwing with sys.modules, to
>get at just the pure Python implementation for testing when the extensions
>are also present. The key point in trying to figure this out is to
>facilitate testing since the standard library already uses the import *
>trick in a couple of places.

"screwing with sys.modules" isn't a goal.  It's a means of achieving a goal,
and not a particularly good one.

I guess I overedited my message, sorry about that.  Originally I included
an example of how to parameterize the high-level API to make it easier to
test (or use) with any implementation one wants.  It went something like
this:

    try:
        import _pickle as _lowlevel
    except ImportError:
        import _pypickle as _lowlevel

    class Pickler:
        def __init__(self, implementation=None):
            if implementation is None:
                implementation = _lowlevel
            self.dump = implementation.dump
            self.load = implementation.load
            ...

Perhaps this isn't /exactly/ how pickle wants to work - I haven't looked at
how the C extension and the Python code fit together - but the general idea
should apply regardless of those details.
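
To make the testing angle concrete, here is a minimal, self-contained sketch
of the same idea.  The names fast_impl and pure_impl are hypothetical
stand-ins for _pickle and _pypickle (built as simple namespaces so the
snippet runs on its own), and their dump/load bodies are placeholders, not
the real pickle API.  The point is only that a test can pass in the pure
Python implementation explicitly, even when the extension is importable,
without touching sys.modules:

```python
import types

# Hypothetical stand-ins for the two low-level implementations.  In the
# layout discussed above these would be the modules _pickle (C extension)
# and _pypickle (pure Python); the dump/load bodies are placeholders.
fast_impl = types.SimpleNamespace(
    dump=lambda obj: ("fast", obj),
    load=lambda data: data[1],
)
pure_impl = types.SimpleNamespace(
    dump=lambda obj: ("pure", obj),
    load=lambda data: data[1],
)

# Stands in for whatever the try/except import would have selected.
_lowlevel = fast_impl


class Pickler:
    def __init__(self, implementation=None):
        # Fall back to the module-level default when none is given.
        if implementation is None:
            implementation = _lowlevel
        self.dump = implementation.dump
        self.load = implementation.load


# Default construction uses whichever implementation was selected above.
default = Pickler()

# A test can request the pure Python implementation explicitly.
pure = Pickler(implementation=pure_impl)
```

A test suite could then run the same checks once per implementation by
constructing Pickler with each one in turn.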

Jean-Paul

