
Hey folks,

What do you think about making it easier to use packages by automatically importing submodules on attribute access? Consider this example:

    >>> import matplotlib
    >>> figure = matplotlib.figure.Figure()
    AttributeError: 'module' object has no attribute 'figure'

For the newcomer (like me some months ago) it's not obvious that the solution is to import matplotlib.figure. Worse even: it may sometimes/later on work, if the submodule has been imported from another place.

Here is how I'd like it to behave instead (in pseudo code, since `package` is not a Python class right now):

    class package:
        def __getattr__(self, name):
            try:
                return self.__dict__[name]
            except KeyError:
                # either try to import `name` or raise a nicer error message

The automatic import feature could also play nicely when porting a package with submodules to or from a simple module with namespaces (as suggested in [1]), making this transition seamless to any user.

I'm not sure about potential problems from auto-importing. I currently see the following issues:

- harmless-looking attribute access can lead to significant code execution, including side effects. On the other hand, that could always be the case.
- you can't use attribute access anymore to test whether a submodule is imported (must use sys.modules instead, I guess)

In principle one can already make this feature happen today, by replacing the object in sys.modules - which is kind of ugly and probably has more flaws. This would also be made easier if there were a module.__getattr__ ([2]) or a "metaclass"-like feature for modules (which would then just be a class, I guess).

Sorry if this has come up before and I missed it. Anyhow, just interested whether anyone else considers this a nice feature.

Best regards, Thomas

[1] https://mail.python.org/pipermail/python-ideas/2014-September/029341.html
[2] https://mail.python.org/pipermail/python-ideas/2012-April/014957.html
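[For illustration: a minimal sketch of the sys.modules-replacement hack alluded to above. The `_AutoImportModule` and `autoimport` names are made up for this example, and the stdlib `json` package stands in for a real third-party package; this is a sketch of the idea, not a production implementation.]

```python
import importlib
import sys
import types

class _AutoImportModule(types.ModuleType):
    """Module stand-in that imports submodules on attribute access."""

    def __getattr__(self, name):
        # __getattr__ is only invoked when normal lookup fails, so
        # already-imported submodules and ordinary attributes win.
        try:
            submodule = importlib.import_module(f"{self.__name__}.{name}")
        except ImportError:
            raise AttributeError(
                f"module {self.__name__!r} has no attribute or "
                f"submodule {name!r}") from None
        setattr(self, name, submodule)  # cache for future lookups
        return submodule

def autoimport(package_name):
    """Replace an imported package in sys.modules with an auto-importing one."""
    old = sys.modules[package_name]
    new = _AutoImportModule(package_name)
    new.__dict__.update(old.__dict__)
    sys.modules[package_name] = new
    return new

# Demonstration: json/__init__.py does not import json.tool, so
# accessing .tool goes through the auto-import hook.
import json
j = autoimport("json")
tool = j.tool  # imported on first access, no explicit import needed
```

Accessing a name that is neither an attribute nor a submodule still raises an AttributeError, just with a friendlier message.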

On 24.09.2014 19:57, Thomas Gläßle wrote:
Agreed, it's a nice feature :-) I've been using this in our mx packages since 1999, using a module called LazyModule.py. See e.g. http://educommons.com/dev/browser/3.2/installers/windows/src/eduCommons/pyth...

Regarding making modules more class-like: we've played with this a bit at PyCon UK, and it's really easy to turn a module into a regular class (with all its features) by tweaking sys.modules - we even got .__getattr__() to work. With some more effort, we could have a main() function automatically called upon direct import from the command line. The whole thing is a huge hack, though, so I'll leave out the details :-)

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 24 2014)
2014-09-27: PyDDF Sprint 2014 ... 3 days to go 2014-09-30: Python Meeting Duesseldorf ... 6 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On Sep 24, 2014, at 11:10, "M.-A. Lemburg" <mal@egenix.com> wrote:
Could LazyModule be easily added to the stdlib, or split out into a separate PyPI package? It seems to me that would be a pretty good solution. Today, a package has to eagerly preload modules, make the users do it manually, or write a few dozen lines of code to lazily load modules on demand, so it's not surprising that many of them don't use the third option even when it would be best for their users. If that could be one or two lines instead, I'm guessing a lot more packages would do so.

On Wed, Sep 24, 2014 at 2:22 PM, Andrew Barnert < abarnert@yahoo.com.dmarc.invalid> wrote:
Could LazyModule be easily added to the stdlib, or split out into a separate PyPI package?
How is it different from apipkg? https://pypi.python.org/pypi/apipkg/1.2

On Sep 29, 2014, at 7:15, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
No idea. Could apipkg be easily added to the stdlib? Is it actively maintained? ("virtually all Python versions, including CPython2.3 to Python3.1" sounds a bit worrisome...). Does it provide all the same functionality as Marc-Andre's package? If the answers are all "yes", then you can take my message as support for adding either one.

On Mon, Sep 29, 2014 at 10:25 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
Is [apipkg] actively maintained?
It is distributed as a part of the popular "py" library, so I would assume it is fairly well maintained. See <http://pylib.readthedocs.org/en/latest/>.

On 24.09.2014 20:22, Andrew Barnert wrote:
If there's enough interest, then yes, separating it out into a PyPI package or adding it to the stdlib would be an option. The code is pretty simple.

On Wed, Sep 24, 2014 at 7:10 PM, M.-A. Lemburg <mal@egenix.com> wrote:
Indeed. I can think of multiple places where there are compelling reasons to want to hook module attribute lookup:

Lazy loading: as per above. E.g., ten years ago, for whatever reason, someone decided that 'import numpy' ought to automatically execute 'import numpy.testing' as well. So now backcompat means we're stuck with it. 'import numpy.testing' is rather slow, to the point that it can be a substantial part of the total overhead for launching numpy-using scripts. We get bug reports about this, from people who are irritated that their production code is spending all this time loading unit-test harnesses and whatnot that it doesn't even use.

Module attribute deprecation: For reasons that are even more lost in the mists of time, numpy re-exports some objects from the __builtins__ namespace (e.g., numpy.float exists but is __builtins__.float; if you want the default numpy floating-point type you have to write numpy.float_). As you can probably imagine, this is massively confusing to everyone, but if we just removed these re-exports then it would break existing working code (e.g., 'numpy.array([1, 2, 3], dtype=numpy.float)' does work and do the right thing right now), so according to our deprecation policy we have to spend a few releases issuing warnings every time someone writes 'numpy.float'. Which requires executing arbitrary code at attribute lookup time.

I think both of these use cases arise very commonly in long-lived projects, but right now the only ways to accomplish either of these things involve massive, disgusting hacks. They are really really hard to do cleanly, and you risk all kinds of breakage in edge cases (e.g. try reload()'ing a module that's been replaced by an object). So, we haven't dared release anything like this in production, and the above problems just hang around indefinitely.

What I'd really like is for module attribute lookup to start supporting the descriptor protocol. This would be super-easy to work with and fast (you only pay the extra overhead for the attributes which have been hooked).

-n

-- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
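[For illustration: a sketch of the module-attribute-deprecation use case, done with today's sys.modules-replacement hack. The `_DeprecatedAttrsModule` class and the "fakepkg" name are made up for this example; this is not numpy's actual code.]

```python
import types
import warnings

class _DeprecatedAttrsModule(types.ModuleType):
    """Module stand-in that warns when deprecated attributes are used."""

    # Hypothetical mapping: deprecated name -> (suggested replacement, value)
    _deprecated = {"float": ("float_", float)}

    def __getattr__(self, name):
        if name in self._deprecated:
            replacement, value = self._deprecated[name]
            warnings.warn(
                f"{self.__name__}.{name} is deprecated; use "
                f"{self.__name__}.{replacement} instead",
                DeprecationWarning, stacklevel=2)
            return value
        raise AttributeError(
            f"module {self.__name__!r} has no attribute {name!r}")

# A real package would swap this into sys.modules from its __init__.py;
# here we just instantiate it directly for demonstration.
fakepkg = _DeprecatedAttrsModule("fakepkg")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = fakepkg.float  # still works, but records a DeprecationWarning
```

Old code keeps working for a few releases while users are nudged toward the new spelling, which is exactly the migration path described above.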

Nathaniel Smith wrote on 09/25/2014 07:07 AM:
I'm not sure I picture this the same way you intended, but I believe supporting the descriptor protocol is too confusing and breaks too much code in many cases. You wouldn't normally expect x.__get__, etc. to be executed on module attribute access if you are just trying to export some object x that happens to be a descriptor.
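[For illustration: a small, entirely hypothetical example of the objection. Any exported object that happens to implement `__get__` would change meaning under the proposal:]

```python
class Registry:
    """An ordinary object a module might export. It implements __get__
    for its own purposes (e.g. to be usable as a class attribute)."""

    def __get__(self, obj, objtype=None):
        return "bound view"

# Suppose a (hypothetical) mymodule.py exports:
#     registry = Registry()
#
# Today, `mymodule.registry` returns the Registry instance. If module
# attribute access honored the descriptor protocol, it would instead
# call Registry.__get__ and return "bound view" -- silently changing
# the module's public API.
registry = Registry()
```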

I love it. +1 :). On 25 September 2014 20:16, Thomas Gläßle <t_glaessle@gmx.de> wrote:
-- -------------------------------------------------- Tennessee Leeuwenburg http://myownhat.blogspot.com/ "Don't believe everything you think"

Nathaniel Smith wrote on 09/25/2014 07:07 AM:
The reason I brought up implicit imports in isolation from (well, maybe not isolated enough) supporting a module.__getattr__ protocol altogether is that it's much less involved. The former can be added without also adding the latter and already covers a lot of its use cases. If module.__getattr__ can be added, I'm all for it. But it also suggests enabling other class-like features in modules, which might not be so easy anymore, conceptually.

In contrast, IMO, it is natural to expect package.module to *just work*, regardless of whether the submodule has already been imported. At least, if packages were only collections of modules. Maybe this is the more fundamental problem with packages: they are more like module/package hybrids with a mixed-up namespace. This also causes other irritating issues. E.g.:

    # package/__init__.py:
    foo = "foo"
    from . import foo
    from . import bar
    bar = "bar"
    baz = "baz"

    # with the following submodules:
    # package/foo.py
    # package/bar.py
    # package/baz.py

user:

    >>> package.foo
    <module ...>
    >>> package.bar
    'bar'
    >>> import package.bar as bar
    >>> bar        # not the module you might expect..
    'bar'
    >>> package.baz
    'baz'
    >>> from package import baz
    >>> baz
    'baz'
    >>> import package.baz as baz
    >>> baz
    <module ...>

The "baz" case can be especially confusing. I know, you shouldn't write code like this. But sometimes it happens, because it's just so easy.

Nathaniel Smith wrote:
One small thing that might help is to allow the __class__ of a module to be reassigned to a subclass of the module type. That would allow a module to be given custom behaviours, while remaining a real module object so that reload() etc. continue to work. -- Greg
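[For illustration: this is in fact the approach that later CPython releases adopted, so on a modern interpreter (CPython 3.5+) the assignment Greg describes works exactly as suggested. A sketch, with a made-up `VerboseModule` subclass:]

```python
import types

class VerboseModule(types.ModuleType):
    """A module subclass with custom behaviours."""

    def __repr__(self):
        return f"<verbose module {self.__name__!r}>"

    @property
    def answer(self):
        # A computed attribute -- impossible on a plain module object.
        return 42

mod = types.ModuleType("demo")
mod.__class__ = VerboseModule  # allowed on CPython 3.5+
```

Because the object is transformed in place, every existing reference to the module (including the one in sys.modules, if it were installed there) sees the new behaviour, and it remains a real module, so reload() etc. keep working.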

On Thu, Sep 25, 2014 at 11:31 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Heh, I was actually just pondering whether it would be opening too big a can of worms to suggest this myself. This is the best design I managed to come up with last time I looked at it, though in existing versions of Python it requires ctypes hackitude to accomplish the __class__ reassignment.

(The advantages of this approach are that (1) you get to use the full class machinery to define your "metamodule", (2) any existing references to the module get transformed in-place, so you don't have to worry about ending up with a mixture of old and new instances existing in the same program, (3) by subclassing and avoiding copying you automatically support all the subtleties and internal fields of actual module objects in a forward- and backward-compatible way.)

This would work today, and would solve all these problems, except for the following code in Objects/typeobject.c:object_set_class:

    if (!(newto->tp_flags & Py_TPFLAGS_HEAPTYPE) ||
        !(oldto->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
        PyErr_Format(PyExc_TypeError,
                     "__class__ assignment: only for heap types");
        return -1;
    }
    if (compatible_for_assignment(oldto, newto, "__class__")) {
        Py_INCREF(newto);
        Py_TYPE(self) = newto;
        Py_DECREF(oldto);
        return 0;
    }

The builtin "module" type is not a HEAPTYPE, so if we try to do mymodule.__class__ = mysubclass, then the !(oldto->tp_flags & Py_TPFLAGS_HEAPTYPE) check gets triggered and the assignment fails. This code has been around forever, but I don't know why. AFAIK we could replace the above with

    if (compatible_for_assignment(oldto, newto, "__class__")) {
        if (newto->tp_flags & Py_TPFLAGS_HEAPTYPE) {
            Py_INCREF(newto);
        }
        Py_TYPE(self) = newto;
        if (oldto->tp_flags & Py_TPFLAGS_HEAPTYPE) {
            Py_DECREF(oldto);
        }
        return 0;
    }

and everything would just work, but I could well be missing something. Is there some dragon lurking inside Python's memory management, or is this just an ancient overabundance of caution?

-n

-- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org

On Thu, Sep 25, 2014, at 21:02, Nathaniel Smith wrote:
Currently, this is the message you get if you attempt to reassign the class of a list, or an int. Is there something else that would prevent it? Maybe the "object layout differs" check? What if the class you are assigning is a legitimate subclass of the basic type?

On Thu, Sep 25, 2014 at 8:32 PM, <random832@fastmail.us> wrote:
IIRC the caution is for the case where a built-in type has its own allocation policy, such as the custom free list used by float and a few other types. The custom deallocation code is careful not to use the free list for subclass instances. But (depending on how the free list is implemented) if you could switch the type out for an object that's in a custom free list, the free list could become corrupt.

There is no custom allocation for modules, and even for float I don't see how switching types back and forth between float and a subclass could corrupt the free list (assuming the struct size and layout constraints are met), but it is certainly possible to have a custom allocation policy that would be broken. So indeed the smell of dragons is still there (they may exist in 3rd-party modules). Perhaps we can rename HEAPTYPE to NO_CUSTOM_ALLOCATOR and set it for most built-in types (or at least for the module type) and all will be well.

-- --Guido van Rossum (python.org/~guido)

On Fri, Sep 26, 2014, at 00:24, Guido van Rossum wrote:
For float I'd be worried more about the fact that it's supposed to be immutable. It would be entirely reasonable for an implementation to make all floats with the same value the same object (as cpython does do for ints in a certain range), and what happens if you change its type then? And even if it doesn't do so, it does for literals with the same value in the same function. So, realistically, an immutable type (especially an immutable type which has literals or another interning mechanism) needs to forbid __class__ from being assigned.
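[For illustration: the int caching mentioned here is directly observable. This is CPython-specific behaviour -- the interpreter interns the integers from -5 through 256, so equal values in that range are the very same object, while larger values are fresh objects:]

```python
# Build the values at runtime so compile-time constant folding
# cannot interfere with the demonstration.
a = int("256")
b = int("256")
c = int("1000000")
d = int("1000000")

print(a is b)  # True on CPython: both come from the small-int cache
print(c is d)  # False on CPython: large ints are created anew each time
```

This is exactly why reassigning __class__ on such interned immutables would be disastrous: changing the type of the cached 256 would change it for every user of the value 256 in the whole process.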

On Fri, Sep 26, 2014 at 10:43 AM, <random832@fastmail.us> wrote:
That's also a good one, but probably not exactly what the code we're discussing is protecting against -- the same issue could happen with immutable values implemented in pure Python. It's likely though that the HEAPTYPE flag is a proxy for a variety of invariants maintained for the built-in base types, and that is what makes it smell like dragons.

-- --Guido van Rossum (python.org/~guido)

On Fri, 26 Sep 2014 02:02:01 +0100 Nathaniel Smith <njs@pobox.com> wrote:
The tp_dealloc for a heap type is not the same as the non-heap base type's tp_dealloc. See subtype_dealloc() in typeobject.c. Switching the __class__ would deallocate the instance with an incompatible tp_dealloc. (In particular, a heap type is always incref'ed when an instance is created and decref'ed when an instance is destroyed, but the base type wouldn't be.)

Also, look at compatible_for_assignment(): it calls same_slots_added(), which assumes both args are heap types. Note that this can be a gotcha when using the stable ABI: http://bugs.python.org/issue16690

Regards Antoine.

Antoine Pitrou wrote:
It looks like the easiest way to address this particular use case would be to make the module type a heap type.

In the long term, how about turning *all* types into heap types? We're already having to call PyType_Ready on all the static type objects, so allocating them from the heap shouldn't incur much extra overhead. Seems to me that this would simplify a lot of the CPython code and make it easier to maintain. As it is, thinking about all the tricky differences between heap and non-heap types makes my head hurt.

-- Greg

On Sep 26, 2014, at 14:43, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
What about extension modules? Deprecate static types? Automatically copy them to heap types? Use some horrible macro tricks in Python.h or a custom preprocessor in distutils?

On Sat, Sep 27, 2014 at 12:43 AM, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
I think the name "heap types" is misleading. The actual distinction being made isn't really about where the type object is allocated. Static type objects are still subject to the refcounting machinery in most cases (try sys.getrefcount(int)), but this is fine because the refcount never reaches zero.

AFAICT from skimming the source a bit, what happened back in the 2.2 days is that the devs went around fixing all the random places where the assumption that all type objects were immortal had snuck in, and they hid all these fixes behind a generic switch called "heap types". It's all stuff like "we'll carefully only do standard refcounting if HEAPTYPE is set" (even though refcounting could be applied to all types without causing any problems), or "we will disable the GC machinery when walking non-heap types" (even though, again, who cares), or "heap types all use the same tp_dealloc function".

I'm sure some of this stuff we're stuck with due to backcompat with C extension modules that make funny assumptions, but presumably a lot of it could be cleaned up -- I think that's what Greg means.

-n

-- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org

On Sat, 27 Sep 2014 01:03:04 +0100 Nathaniel Smith <njs@pobox.com> wrote:
Static type objects are still subject to the refcounting machinery in most cases (try sys.getrefcount(int)),
So what about it? :-)
Regards Antoine.

On 27 Sep 2014 01:08, "Antoine Pitrou" <solipsis@pitrou.net> wrote:
Yes, that's why I said "most cases", not all cases :-). My point was that being statically allocated doesn't make list a special snowflake that *needs* some sort of protection from refcounting. If heap and non-heap types were treated the same in this regard then nothing horrible would happen. -n

On Sep 26, 2014, at 17:03, Nathaniel Smith <njs@pobox.com> wrote:
Yes, I wasn't sure whether Greg was suggesting to get rid of actual non-heap-allocated types, or just making static types fit HEAPTYPE. The former would be a lot more work, but it would also allow simplifying a lot of additional things, so they both seem like reasonable things to suggest (whether or not they're both reasonable things to actually do).

Well, there's obviously a non-zero performance cost to doing all this stuff with all types. Of course there's also a non-zero cost to checking the heap-type-ness of all types. And both costs may be so minimal they're hard to even measure.

Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
With branch prediction on a modern CPU, an "if unlikely()" check can probably be done with impunity. Both the Linux kernel and Cython do this liberally. Sturla

Sturla Molden <sturla.molden@gmail.com> wrote:
With branch prediction on a modern CPU, an "if unlikely()" check can probably be done with impunity. Both the Linux kernel and Cython do this liberally.
Just for reference, the definitions of these macros in Cython and Linux are:

    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

Typical use cases are

    fd = open(...);
    if (unlikely(fd < 0)) {
        /* handle unlikely error */
    }

or

    ptr = malloc(...);
    if (unlikely(!ptr)) {
        /* handle unlikely error */
    }

If the conditionals fail, these checks have exactly zero impact on the run-time with a processor that supports branch prediction. Microsoft compilers don't know about __builtin_expect, but GCC, Clang and Intel compilers know what to do with it.

Sturla

On Sun, 28 Sep 2014 17:13:06 +0000 (UTC) Sturla Molden <sturla.molden@gmail.com> wrote:
If the conditionals fail, these checks have exactly zero impact on the run-time with a processor that supports branch prediction.
Branch prediction is typically implemented using branch predictors, which is a form of cache updated with the results of previous branches. "Impunity" can therefore only be achieved with an infinite number of branch predictors :-) Regards Antoine.

On Sep 28, 2014, at 9:44, Sturla Molden <sturla.molden@gmail.com> wrote:
On what modern CPU does unlikely() have any effect at all? x86 has an opcode to provide static branch-prediction hints, but it's been a no-op since Core 2; ARM doesn't have one; I don't know about other instruction sets, but I'd be surprised if they did.

And that's a good thing. If that macro still controlled branch prediction, using it would mean blowing away the entire pipeline on every use of a non-heap type. A modern CPU will use recent history to decide which branch is more likely, so whether your loop is using a heap type or a non-heap type, it won't mispredict anything after the first run through the loop.

Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
AFAIK, the branch prediction is somewhat controlled by the order of instructions. And this compiler hint allows the compiler to restructure the code to better exploit this behavior. It does not result in specific opcodes being inserted. Sturla

Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
http://madalanarayana.wordpress.com/2013/08/29/__builtin_expect-a-must-for-s... http://benyossef.com/helping-the-compiler-help-you/

On Sep 28, 2014, at 13:17, Sturla Molden <sturla.molden@gmail.com> wrote:
The example in this post shows the exact opposite of what it purports to: the generated code puts the unlikely i++ operation immediately after the conditional branch; because Haswell processors assume, in the absence of any information, that forward branches are unlikely, this will cause the wrong branch to be speculatively executed. In other words, gcc has completely ignored the builtin_expect here--as it often does. Also note the comment in the quoted source:
In general, you should prefer to use actual profile feedback for this (`-fprofile-arcs'), as programmers are notoriously bad at predicting how their programs actually perform
This one vaguely waves its hands at the idea without providing any examples, before concluding:
It should be noted that GCC also provide a run time parameter -fprofile-arcs, which can profile the code for the actual statistics for each branch and the use of it should be prefered above guessing.
Meanwhile, this whole thing started with you saying that branch prediction means we can add conditional checks "with impunity". The exact opposite is true. On older processors, we _could_ issue checks with impunity; branch prediction means they're now an order of magnitude more expensive than they used to be unless we're very careful. The ability to hint the CPU by rearranging code (whether manually, with builtin_expect, or using PGO) partly mitigates this effect, but it doesn't reverse it.

And at any rate, consider the case we're talking about. We have some heap types and some non-heap types. Neither branch is very unlikely, which means that no matter which version you mark as unlikely, it's going to be wrong quite often. Which means, exactly as I said at the start, that the check for non-heap is not free. Unnecessary refcounts are also not free. Which one is more costly? Is either one costly enough to matter? Hell if I know; that's the kind of thing you pretty much have to test. Trying to reason it out from first principles is hard enough even if you get all the principles right, but even harder if you're thinking in terms of P4 chips.

On Sun, Sep 28, 2014, at 22:58, Andrew Barnert wrote:
And at any rate, consider the case we're talking about. We have some heap types and some non-heap types. Neither branch is very unlikely,
What? It is very unlikely, especially in existing code where it won't work at all, for someone to attempt to reassign the __class__ of a non heap type object. We are not talking about something that gets run on every object.

<random832@fastmail.us> wrote:
And because of that, it is better to have the pipeline flushed whenever it happens, rather than, say, 50% of the times it might happen. But I agree with Andrew that it is something we should try to measure.

Similarly, tagging functions 'hot' or 'cold' might also be a good idea. We know there are functions that will execute a lot, and there are error handlers that will only rarely be run. Anyone who has used Fortran will also know that tagging a function 'pure' is of great help to the compiler, particularly if arrays or pointers are involved. This informs the compiler that the function has no side effects. For example, if we assert that a function like sin(x) is pure, it does not have to assume that calling this function will change something elsewhere. In Fortran it is a keyword, but we can use it in C as a GNU extension.

Sturla

On Sep 29, 2014, at 6:27, random832@fastmail.us wrote:
Look at the subject of this thread. Go back to the first message in the thread. Greg's suggestion is that, instead of just working around the __class__ assignment test, "I'm thinking it should be possible to reduce the differences to the point where [heap allocation itself is] the *only* distinction, so the vast majority of code doesn't have to care, and the same tp_* functions can be used for both." That's what we're talking about here.

Is there a potential performance impact for making all of those changes? There could be a benefit from removing the tests; there could be a cost from adding work we didn't used to do (e.g., extra refcounting or other tests that we can currently skip). So the fact that the one check on __class__ can be statically predicted pretty well doesn't have much to do with the potential cost or benefit of removing all of the differences between heap and non-heap types instead of just the check on __class__.

Nathaniel Smith wrote:
Yes, it's probably not necessary to actually allocate them on the heap (that would cause big problems for existing extension modules that assume they can statically declare them). But I'm thinking it should be possible to reduce the differences to the point where that's the *only* distinction, so the vast majority of code doesn't have to care, and the same tp_* functions can be used for both. -- Greg

On Sep 25, 2014, at 18:02, Nathaniel Smith <njs@pobox.com> wrote:
When I tried this a year or two ago, I did it with an import hook that allows you to specify metaclass=absolute.qualified.spam in any comment that comes before any non-comment lines, so you actually construct the module object as a subclass instance rather than re-classing it. In theory that seems a lot cleaner. In practice it's a weird way to specify your type; it only works if the import-hooking module and the module that defines your type have already been imported, and otherwise silently does the wrong thing; and my implementation was pretty hideous.

Is there a cleaner version of that we could do if we were modifying the normal import machinery instead of hooking it, and if it didn't have to work pre-3.4, and if it were part of the language instead of a hack? IIRC (too hard to check from my phone on the train), a module is built by calling exec with a new global dict and then calling the module constructor with that dict, so it's just a matter of something like:

    cls = g.get('__metamodule__', module)
    if not issubclass(cls, module):
        raise TypeError('metamodule {} is not a module type'.format(cls))
    mod = cls(name, doc, g)
    # etc.

Then you could import the module subclass and assign it to __metamodule__ from inside, rather than needing to pre-import stuff, and you'd get perfectly understandable errors, and so on. It seems less hacky and more flexible than re-classing the module after construction, for the same reason metaclasses and, for that matter, normal class constructors are better than reclassing after the fact.

Of course I could be misremembering how modules are constructed, in which case... never mind.

On 26 Sep 2014 15:59, "Andrew Barnert" <abarnert@yahoo.com> wrote:
Alas, in this regard module objects are different from classes; they're constructed and placed in sys.modules before the body is exec'ed. And unfortunately it has to work that way, because if foo/__init__.py does 'import foo.bar', then the module 'foo' has to be immediately resolvable, before __init__.py finishes executing. A similar issue arises for circular imports.

So this would argue for your 'magic comment' or special syntax approach, sort of like how __future__ imports work. This part we could do non-hackily if we modified the import mechanism itself. But we'd still have the other problem you mention, that the metamodule would have to be defined before the module is imported. I think this is a showstopper, given that the main use cases for metamodule support involve using it for top-level package namespaces. If the numpy project wants to define a metamodule for the 'numpy' namespace, then where do they put it?

So I think we necessarily will always start out with a regular module object, and our goal is to end up with a metamodule instance instead. If this is right, then it means that even in principle we really only have two options, so we should focus our attention on these.

Option 1: allocate a new object, shallowly copy over all the old object's properties into the new one, and then find all references to the old object and replace them with the new object. This is possible right now, but error-prone: cloning a module object requires intimate knowledge of which fields exist, and swapping all the references requires that we be careful to perform the swap very early, when the only reference is the one in sys.modules.

Option 2: the __class__ switcheroo. This avoids the two issues above. In exchange it's fairly brain-hurty.

Oh wait, I just thought of a third option. It only works for packages, but that's okay, you can always convert a module into a package by a simple mechanical transformation. The proposal is that before exec'ing __init__.py, we check for the existence of a __preinit__.py, and if found we do something like

    sys.modules[package] = sentinel  # to block circular imports
    namespace = {}
    exec __preinit__.py in namespace
    cls = namespace.get("__metamodule__", ModuleType)
    mod = cls(name, doc, namespace)
    sys.modules[package] = mod
    exec __init__.py in namespace

So preinit runs in the same namespace as init, but with a special restriction: if it tries to (directly or indirectly) import the current package, then this will trigger an ImportError. This is somewhat restrictive, but it does allow arbitrary code to be run before the module object is created.

-n
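[For illustration: a toy simulation of this __preinit__ proposal, run outside the real import machinery. The source strings stand in for the files' contents; the sequencing is the point, not the details:]

```python
import types

# Stand-in for __preinit__.py: defines the metamodule class.
preinit_src = """
import types

class MetaModule(types.ModuleType):
    def __getattr__(self, name):
        raise AttributeError(f"metamodule has no attribute {name!r}")

__metamodule__ = MetaModule
"""
# Stand-in for __init__.py: ordinary package initialization.
init_src = "answer = 42"

# 1. run __preinit__.py in a fresh namespace
namespace = {}
exec(preinit_src, namespace)

# 2. pick the metamodule class it defined (default: plain module)
cls = namespace.get("__metamodule__", types.ModuleType)

# 3. build the module from that class, then run __init__.py inside it
mod = cls("demo")
mod.__dict__.update(namespace)
exec(init_src, mod.__dict__)
```

The module object thus starts life as a metamodule instance; no after-the-fact __class__ switcheroo and no reference-swapping are needed.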

On Sep 26, 2014, at 12:12, Nathaniel Smith <njs@pobox.com> wrote:
I had an email written just to say "this sounds brilliant, but why isn't it called __new__?", with three paragraphs explaining why it was a good analogy... Now I guess I can delete that draft. :) Anyway, I definitely like this better than re-classing modules in mid-initialization, and better than my magic-comment hack (and looking at the code again, of course you're right that my magic-comment hack was necessary with anything like my approach; I guess I just forgot in the intervening time).

Thomas Gläßle wrote on 09/26/2014 11:03 PM:
On second thought: scratch all of that. This is easy enough to do in a few lines of code and customize to the specific use case. Sorry for the noise -- I think it's too late for my brain to work well ;) Using an __autoimport__ list could still be an option if not resorting to the metamodule.

On Fri, Sep 26, 2014 at 12:43:12PM -0700, Ethan Furman wrote:
I don't know that this is strictly necessary. You can put anything you like into sys.modules, and reload() just raises a TypeError:

    py> sys.modules['spam'] = 23
    py> import spam
    py> spam
    23
    py> reload(spam)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: reload() argument must be module

Since reload() is mostly intended as a convenience at the REPL, I'd be willing to forgo that convenience for special "modules". Or perhaps these special "modules" could subclass ModuleType and somehow get reloading to work correctly. In 2.7 at least you can manually copy a module to a module subclass, install it into sys.modules, and reload will accept it. Not only that, but after reloading it still uses the same subclass. Unfortunately, when I tried it in 3.3, imp.reload complained about my custom module subclass not being a module, so it seems that 3.3 at least is more restrictive than 2.7. (Perhaps 3.3's reload does a "type(obj) is ModuleType" check instead of an isinstance test?) Nevertheless, I got this proof of concept more-or-less working in 2.7 and 3.3:

    import sys
    from types import ModuleType

    class MagicModule(ModuleType):
        def __getattr__(self, name):
            if name == "spam":
                return "Spam spam spam!"
            raise AttributeError

    eggs = 23

    _tmp = MagicModule(__name__)
    _tmp.__dict__.update(sys.modules[__name__].__dict__)
    sys.modules[__name__] = _tmp
    del _tmp

-- Steven
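Steven's transcript is Python 2; under Python 3 the same experiment goes through importlib.reload, which rejects the non-module in the same way:

```python
# Re-running Steven's experiment under Python 3, where reload lives in importlib.
import importlib
import sys

sys.modules["spam"] = 23   # anything at all can be placed in sys.modules
import spam                # the import statement just fetches the entry
print(spam)                # 23

try:
    importlib.reload(spam)
except TypeError as exc:
    print("reload rejected it:", exc)
```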

On Sep 26, 2014, at 17:33, Steven D'Aprano <steve@pearwood.info> wrote:
I don't know about 3.3 (and who cares?), but in trunk it's an isinstance test: https://hg.python.org/cpython/file/default/Lib/importlib/__init__.py#l115

On Sat, Sep 27, 2014 at 1:33 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Yeah, it looks like 3.3 does an explicit 'type(obj) is ModuleType' check, but is the only version that works like this -- earlier and later versions both use isinstance.
This approach won't work well for packages -- imagine that instead of 'eggs = 23', the body of the file imports a bunch of submodules. If those submodules then import the top-level package in turn, then they'll end up with the original module object and namespace, not the modified one. One could move the sys.modules assignment up to the top of the file, but you can't move the __dict__.update call up to the top of the file, because you can't copy the old namespace until after it's finished being initialized. OTOH leaving the __dict__.update at the bottom of the file is pretty risky too, because then any submodule that imports the top-level package will see a weird, inconsistent view of it until after the import has finished. The solution would be, instead of having two dicts and updating one to match the other, to simply point the new module directly at the existing namespace dict, so they always stay in sync:

    _tmp = MagicModule(__name__)
    _tmp.__dict__ = sys.modules[__name__].__dict__

...except this gives an error, because module objects disallow assignment to __dict__. Sooooo you're kinda doomed no matter what you do. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
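The failure mode Nathaniel mentions is easy to verify (a minimal check, not part of his proposal):

```python
# A quick check of the failure Nathaniel describes: a module's __dict__
# cannot be rebound, only mutated in place.
import types

m = types.ModuleType("demo")
try:
    m.__dict__ = {}          # rebinding the namespace dict is forbidden
except AttributeError as exc:
    print("rejected:", exc)

m.__dict__["x"] = 1          # mutating it in place is fine
print(m.x)                   # 1
```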

About apipkg... When using apipkg, you define your module's API in one package, and implement it in another:

    # mypkg/__init__.py
    import apipkg
    apipkg.initpkg(__name__, {
        'path': {
            'Class1': "_mypkg.somemodule:Class1",
            'clsattr': "_mypkg.othermodule:Class2.attr",
        }
    })

apipkg replaces sys.modules[mypkg] with a subclass of ModuleType. Anything in the mapping is exposed under an alias and lazily imported on first use, including submodules. I've really enjoyed using it. It lets me think about the API as a separate entity from the implementation, and it lets me delay a slow import until the first method call, for much more pleasing interactive tinkering.

On Sep 24, 2014, at 10:57, Thomas Gläßle <t_glaessle@gmx.de> wrote:
Doesn't IPython already have this feature as an option? I know that not everyone who uses scipy and matplotlib uses IPython, and they aren't the only two packages used by novices that have submodules they don't automatically import for you, but... I'm guessing the percentages are high. Of course this support could also be added to scipy and matplotlib itself. And maybe importlib could have a function that makes automatic lazy loading of submodules on demand a one-liner for packages that want to support it.
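For reference, importlib later grew a building block for exactly this: importlib.util.LazyLoader (Python 3.5+). A sketch of the helper Andrew imagines, following the recipe in the importlib documentation:

```python
# A sketch of the "one-liner" helper, built on importlib.util.LazyLoader
# (Python 3.5+); the body of the named module is not executed until the
# first attribute access.
import importlib.util
import sys

def lazy_import(name):
    """Register `name` in sys.modules, deferring execution to first use."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)   # does NOT run the module body yet
    return module

fractions = lazy_import("fractions")   # nothing executed so far
print(fractions.Fraction(1, 2))        # first access triggers the real import
```

Note Nick Coghlan's caution later in the thread still applies: failures inside the lazy module surface at attribute-access time rather than at the import statement.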

On Wed, Sep 24, 2014 at 2:14 PM, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
I don't *think* IPython has exactly this feature. Rather, when typing an import statement it will check to see if there are any submodules and add them to the autocomplete list. So I don't think typing
import foo
will automatically mean a submodule foo.bar is imported, even though it shows up on the autocomplete list when typing the import statement. I could be wrong about this though. In any case unless it were a feature built into Python I think this has potential to be highly confusing to newcomers. They might type
import scipy
and start using scipy.stats in their code. But then when they dump this code to a script they won't have `import scipy.stats` in the script, just `import scipy`. Then, suddenly, when they run their script they'll get `AttributeError: stats` and then come complaining to some mailing list or StackOverflow that something broke their scipy installation ;)
This definitely has some appeal though, and shouldn't be outside the realm of possibility. I especially like the suggestion of making it optional. Erik

On Wed, Sep 24, 2014 at 07:57:36PM +0200, Thomas Gläßle wrote:
I think it is a bad idea. And yet I also think that optionally supporting module.__getattr__ and friends is a good idea. What's the difference between the two? (1) "Automatically importing submodules on attribute access" implies that every package and module does this, whether it is appropriate for it or not. (2) Building some sort of support for module.__getattr__ implies that it is opt-in. Whatever the mechanism ends up being, the module author has to actively provide some sort of __getattr__ hook. The Zen already has something to say about this: Explicit is better than implicit. Automatic importing is implicit importing. Now, of course, the Zen is not an absolute, and modules/packages can preload sub-modules if they so choose, e.g. os automatically imports os.path. But as a general rule, if you want to import a module, you have to import the module, not its parent.
I sympathise. This issue comes up occasionally on the tutor@ and python-list@python.org mailing lists. Beginners sometimes don't understand when they need to do an import and when they don't, so we get things like `import sys.version`. In hindsight, it is a little unfortunate that package dotted names and attribute access use the same notation.
True, but today it is quite rare that the second line of

    import spam
    spam.thing

will execute arbitrary code. (The initial import will, of course.) By making importing automatic, every failed attribute access has to determine whether or not there is a sub-module to import, which could be quite expensive. -- Steven
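For comparison, the opt-in hook Steven favours can be sketched as a module-level __getattr__ in a package's __init__.py -- a hook that was later standardized in Python 3.7 as PEP 562. The package optin_pkg and its contents are invented for this example:

```python
# Opt-in automatic submodule import: only packages that define the hook
# get the behaviour. Requires Python 3.7+ (PEP 562); optin_pkg is made up.
import os
import sys
import tempfile
import textwrap

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "optin_pkg"))
init_src = textwrap.dedent("""\
    import importlib

    def __getattr__(name):
        # the package author explicitly opts in to auto-importing
        try:
            return importlib.import_module("." + name, __name__)
        except ImportError:
            raise AttributeError(
                "module %r has no attribute %r" % (__name__, name))
""")
with open(os.path.join(root, "optin_pkg", "__init__.py"), "w") as f:
    f.write(init_src)
with open(os.path.join(root, "optin_pkg", "sub.py"), "w") as f:
    f.write("answer = 42\n")

sys.path.insert(0, root)
import optin_pkg
print(optin_pkg.sub.answer)  # 42 -- the submodule was imported on attribute access
```

Steven's cost concern is visible here too: every failed attribute access on the package now attempts an import before raising AttributeError.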

On 26 September 2014 18:44, Chris Angelico <rosuav@gmail.com> wrote:
It's also worth noting the caution in https://docs.python.org/dev/library/importlib.html#importlib.util.LazyLoader Yes, the AttributeError when you try to access a submodule that hasn't been imported yet can be a little confusing, but it's positively crystal clear compared to the confusion you encounter when an attribute access attempt fails with ImportError (or, worse, if the AttributeError is hiding an import error). Explicit, eager imports make it clear when module level code execution might be triggered, with all the associated potential for failure (whether in module lookup, in compilation, in bytecode caching or in code execution). Implicit and lazy imports take that complexity, and run it automatically as part of a __getattr__ operation. There are valid reasons for doing that (such as to improve startup time in large applications), but postponing the point where new users need to learn the difference between "package attribute set in __init__" and "imported submodule" likely isn't one of them. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Steven D'Aprano wrote:
Another thing to consider is that code executed during an import runs with the import lock held. This can lead to surprises in multi-threaded code. I got caught out by it once, and it took me a while to figure out what was going on. As long as the import lock exists, it's probably better for importing to remain an explicit action, at least by default. -- Greg


On Sep 24, 2014, at 11:10, "M.-A. Lemburg" <mal@egenix.com> wrote:
Could LazyModule be easily added to the stdlib, or split out into a separate PyPI package? It seems to me that would be a pretty good solution. Today, a package has to eagerly preload modules, make the users do it manually, or write a few dozen lines of code to lazily load modules on demand, so it's not surprising that many of them don't use the third option even when it would be best for their users. If that could be one or two lines instead, I'm guessing a lot more packages would do so.

On Wed, Sep 24, 2014 at 2:22 PM, Andrew Barnert < abarnert@yahoo.com.dmarc.invalid> wrote:
Could LazyModule be easily added to the stdlib, or split out into a separate PyPI package?
How is it different from apipkg? https://pypi.python.org/pypi/apipkg/1.2

On Sep 29, 2014, at 7:15, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
No idea. Could apipkg be easily added to the stdlib? Is it actively maintained? ("virtually all Python versions, including CPython2.3 to Python3.1" sounds a bit worrisome...). Does it provide all the same functionality as Marc-Andre's package? If the answers are all "yes" then you can take my message as support for adding either one.

On Mon, Sep 29, 2014 at 10:25 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
Is [apipkg] actively maintained?
It is distributed as a part of the popular "py" library, so I would assume it is fairly well maintained. See <http://pylib.readthedocs.org/en/latest/>.

On 24.09.2014 20:22, Andrew Barnert wrote:
If there's enough interest, then yes, separating it out into a PyPI package or adding it to the stdlib would be an option. The code is pretty simple. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 30 2014)

On Wed, Sep 24, 2014 at 7:10 PM, M.-A. Lemburg <mal@egenix.com> wrote:
Indeed. I can think of multiple places where there are compelling reasons to want to hook module attribute lookup:

Lazy loading: as per above. E.g., ten years ago, for whatever reason, someone decided that 'import numpy' ought to automatically execute 'import numpy.testing' as well. So now backcompat means we're stuck with it. 'import numpy.testing' is rather slow, to the point that it can be a substantial part of the total overhead for launching numpy-using scripts. We get bug reports about this, from people who are irritated that their production code is spending all this time loading unit-test harnesses and whatnot that it doesn't even use.

Module attribute deprecation: for reasons that are even more lost in the mists of time, numpy re-exports some objects from the __builtins__ namespace (e.g., numpy.float exists but is __builtins__.float; if you want the default numpy floating-point type you have to write numpy.float_). As you can probably imagine this is massively confusing to everyone, but if we just removed these re-exports then it would break existing working code (e.g., 'numpy.array([1, 2, 3], dtype=numpy.float)' does work and does the right thing right now), so according to our deprecation policy we have to spend a few releases issuing warnings every time someone writes 'numpy.float'. Which requires executing arbitrary code at attribute lookup time.

I think both of these use cases arise very commonly in long-lived projects, but right now the only ways to accomplish either of them involve massive, disgusting hacks. They are really, really hard to do cleanly, and you risk all kinds of breakage in edge cases (e.g. try reload()'ing a module that's been replaced by an object). So we haven't dared release anything like this in production, and the above problems just hang around indefinitely. What I'd really like is for module attribute lookup to start supporting the descriptor protocol.
This would be super-easy to work with and fast (you only pay the extra overhead for the attributes which have been hooked). -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
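Nathaniel's deprecation use case can be sketched against the module-level __getattr__ hook that CPython eventually gained in 3.7 (PEP 562). The module name and warning text below are invented; this is not numpy's actual code:

```python
# Deprecating a module attribute via module __getattr__ (Python 3.7+).
# "fakenumpy" is a hand-built stand-in; a real package would define
# __getattr__ in its __init__.py instead.
import types
import warnings

mod = types.ModuleType("fakenumpy")

def _module_getattr(name):
    if name == "float":
        warnings.warn(
            "fakenumpy.float is deprecated; it is just builtins.float",
            DeprecationWarning, stacklevel=2)
        return float
    raise AttributeError("module 'fakenumpy' has no attribute %r" % name)

# Assigning the hook puts it in the module's __dict__, which is where
# the 3.7+ attribute-lookup machinery finds it.
mod.__getattr__ = _module_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(mod.float is float)  # True, plus one DeprecationWarning
```

Only hooked attributes pay the extra cost; ordinary attributes found in the module's __dict__ never reach the hook, which is the fast-path property Nathaniel asks for.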

Nathaniel Smith wrote on 09/25/2014 07:07 AM:
I'm not sure I picture this the same way you intended, but I believe supporting the full descriptor protocol would be too confusing and break too much code in many cases. You wouldn't normally expect x.__get__ to be executed on module attribute access if you are just trying to export some object x that happens to be a descriptor.

I love it. +1 :). On 25 September 2014 20:16, Thomas Gläßle <t_glaessle@gmx.de> wrote:
-- -------------------------------------------------- Tennessee Leeuwenburg http://myownhat.blogspot.com/ "Don't believe everything you think"

Nathaniel Smith wrote on 09/25/2014 07:07 AM:
The reason I brought implicit imports up in isolation from (well, maybe not isolated enough) supporting a module.__getattr__ protocol altogether is that it's much less involved. The former can be added without also adding the latter and already covers a lot of its use cases. If module.__getattr__ can be added, I'm all for it. But it also suggests enabling other class-like features in modules, which might not be so easy anymore, conceptually. In contrast, IMO, it is natural to expect package.module to *just work*, regardless of whether the submodule has already been imported. At least, it would be if packages were only collections of modules. Maybe this is the more fundamental problem with packages: they are more like module/package hybrids with a mixed-up namespace. This also causes other irritating issues. E.g.:

    # package/__init__.py:
    foo = "foo"
    from . import foo
    from . import bar
    bar = "bar"
    baz = "baz"

    # with the following submodules:
    # package/foo.py: ...
    # package/bar.py: ...
    # package/baz.py: ...

user:

    >>> package.foo
    <module ...>
    >>> package.bar
    'bar'
    >>> import package.bar as bar
    >>> bar   # not the module you might expect..
    'bar'
    >>> package.baz
    'baz'
    >>> from package import baz
    >>> baz
    'baz'
    >>> import package.baz as baz
    >>> baz
    <module ...>

The "baz" case can be especially confusing. I know, you shouldn't write code like this. But sometimes it happens, because it's just so easy.

Nathaniel Smith wrote:
One small thing that might help is to allow the __class__ of a module to be reassigned to a subclass of the module type. That would allow a module to be given custom behaviours, while remaining a real module object so that reload() etc. continue to work. -- Greg
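Greg's suggestion can be sketched concretely. At the time of this thread the assignment required ctypes tricks, but CPython 3.5 later made it legal to reassign a module's __class__ to a ModuleType subclass:

```python
# A minimal sketch of the re-classing Greg proposes (works on CPython 3.5+,
# where __class__ assignment to a ModuleType subclass became legal).
import types

class CustomModule(types.ModuleType):
    def __getattr__(self, name):
        # custom behaviour for missing attributes, e.g. lazy imports
        return "<placeholder for %s.%s>" % (self.__name__, name)

mod = types.ModuleType("demo")
mod.__class__ = CustomModule   # in place: every existing reference is affected

print(isinstance(mod, types.ModuleType))  # True -- still a real module
print(mod.missing)                        # "<placeholder for demo.missing>"
```

Because the object identity never changes, reload() and any references held elsewhere keep working, which is exactly the advantage Greg points to.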

On Thu, Sep 25, 2014 at 11:31 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Heh, I was actually just pondering whether it would be opening too big a can of worms to suggest this myself. This is the best design I managed to come up with last time I looked at it, though in existing versions of Python it requires ctypes hackitude to accomplish the __class__ reassignment. (The advantages of this approach are that (1) you get to use the full class machinery to define your "metamodule", (2) any existing references to the module get transformed in-place, so you don't have to worry about ending up with a mixture of old and new instances existing in the same program, (3) by subclassing and avoiding copying you automatically support all the subtleties and internal fields of actual module objects in a forward- and backward-compatible way.) This would work today, and would solve all these problems, except for the following code in Objects/typeobject.c:object_set_class:

    if (!(newto->tp_flags & Py_TPFLAGS_HEAPTYPE) ||
        !(oldto->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
        PyErr_Format(PyExc_TypeError,
                     "__class__ assignment: only for heap types");
        return -1;
    }
    if (compatible_for_assignment(oldto, newto, "__class__")) {
        Py_INCREF(newto);
        Py_TYPE(self) = newto;
        Py_DECREF(oldto);
        return 0;
    }

The builtin "module" type is not a HEAPTYPE, so if we try to do mymodule.__class__ = mysubclass, then the !(oldto->tp_flags & Py_TPFLAGS_HEAPTYPE) check gets triggered and the assignment fails. This code has been around forever, but I don't know why. AFAIK we could replace the above with

    if (compatible_for_assignment(oldto, newto, "__class__")) {
        if (newto->tp_flags & Py_TPFLAGS_HEAPTYPE) {
            Py_INCREF(newto);
        }
        Py_TYPE(self) = newto;
        if (oldto->tp_flags & Py_TPFLAGS_HEAPTYPE) {
            Py_DECREF(oldto);
        }
        return 0;
    }

and everything would just work, but I could well be missing something? Is there some dragon lurking inside Python's memory management, or is this just an ancient overabundance of caution? -n -- Nathaniel J.
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org

On Thu, Sep 25, 2014, at 21:02, Nathaniel Smith wrote:
Currently, this is the message you get if you attempt to reassign the class of a list, or an int. Is there something else that would prevent it? Maybe the "object layout differs" check? What if the class you are assigning is a legitimate subclass of the basic type?

On Thu, Sep 25, 2014 at 8:32 PM, <random832@fastmail.us> wrote:
IIRC the caution is for the case where a built-in type has its own allocation policy, such as the custom free list used by float and a few other types. The custom deallocation code is careful not to use the free list for subclass instances. But (depending on how the free list is implemented) if you could switch the type out for an object that's in a custom free list, the free list could become corrupt. There is no custom allocation for modules, and even for float I don't see how switching types back and forth between float and a subclass could corrupt the free list (assuming the struct size and layout constraints are met), but it is certainly possible to have a custom allocation policy that would be broken. So indeed the smell of dragons is still there (they may exist in 3rd party modules). Perhaps we can rename HEAPTYPE to NO_CUSTOM_ALLOCATOR and set it for most built-in types (or at least for the module type) and all will be well. -- --Guido van Rossum (python.org/~guido)

On Fri, Sep 26, 2014, at 00:24, Guido van Rossum wrote:
For float I'd be worried more about the fact that it's supposed to be immutable. It would be entirely reasonable for an implementation to make all floats with the same value the same object (as cpython does do for ints in a certain range), and what happens if you change its type then? And even if it doesn't do so, it does for literals with the same value in the same function. So, realistically, an immutable type (especially an immutable type which has literals or another interning mechanism) needs to forbid __class__ from being assigned.

On Fri, Sep 26, 2014 at 10:43 AM, <random832@fastmail.us> wrote:
That's also a good one, but probably not exactly what the code we're discussing is protecting against -- the same issue could happen with immutable values implemented in pure Python. It's likely though that the HEAPTYPE flag is a proxy for a variety of invariants maintained for the built-in base types, and that is what makes it smell like dragon. -- --Guido van Rossum (python.org/~guido)

On Fri, 26 Sep 2014 02:02:01 +0100 Nathaniel Smith <njs@pobox.com> wrote:
The tp_dealloc for a heap type is not the same as the non-heap base type's tp_dealloc. See subtype_dealloc() in typeobject.c. Switching the __class__ would deallocate the instance with an incompatible tp_dealloc. (in particular, a heap type is always incref'ed when an instance is created and decref'ed when an instance is destroyed, but the base type wouldn't) Also, look at compatible_for_assignment(): it calls same_slots_added() which assumes both args are heap types. Note that this can be a gotcha when using the stable ABI: http://bugs.python.org/issue16690 Regards Antoine.

Antoine Pitrou wrote:
It looks like the easiest way to address this particular use case would be to make the module type a heap type. In the long term, how about turning *all* types into heap types? We're already having to call PyType_Ready on all the static type objects, so allocating them from the heap shouldn't incur much extra overhead. Seems to me that this would simplify a lot of the cpython code and make it easier to maintain. As it is, thinking about all the tricky differences between heap and non-heap types makes my head hurt. -- Greg

On Sep 26, 2014, at 14:43, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
What about extension modules? Deprecate static types? Automatically copy them to heap types? Use some horrible macro tricks in Python.h or a custom preprocessor in distutils?

On Sat, Sep 27, 2014 at 12:43 AM, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
I think the name "heap types" is misleading. The actual distinction being made isn't really about where the type object is allocated. Static type objects are still subject to the refcounting machinery in most cases (try sys.getrefcount(int)), but this is fine because the refcount never reaches zero. AFAICT from skimming the source a bit, what happened back in the 2.2 days is that the devs went around fixing all the random places where the assumption that all type objects were immortal had snuck in, and they hid all these fixes behind a generic switch called "heap types". It's all stuff like "we'll carefully only do standard refcounting if HEAPTYPE is set" (even though refcounting could be applied to all types without causing any problems), or "we will disable the GC machinery when walking non-heap types" (even though again, who cares), or "heap types all use the same tp_dealloc function". I'm sure some of this stuff we're stuck with due to backcompat with C extension modules that make funny assumptions, but presumably a lot of it could be cleaned up -- I think that's what Greg means. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org

On Sat, 27 Sep 2014 01:03:04 +0100 Nathaniel Smith <njs@pobox.com> wrote:
Static type objects are still subject to the refcounting machinery in most cases (try sys.getrefcount(int)),
So what about it? :-)
Regards Antoine.

On 27 Sep 2014 01:08, "Antoine Pitrou" <solipsis@pitrou.net> wrote:
Yes, that's why I said "most cases", not all cases :-). My point was that being statically allocated doesn't make list a special snowflake that *needs* some sort of protection from refcounting. If heap and non-heap types were treated the same in this regard then nothing horrible would happen. -n

On Sep 26, 2014, at 17:03, Nathaniel Smith <njs@pobox.com> wrote:
Yes, I wasn't sure whether Greg was suggesting to get rid of actual non-heap-allocated types, or just making static types fit HEAPTYPE. The former would be a lot more work, but it would also allow simplifying a lot of additional things, so they both seem like reasonable things to suggest (whether or not they're both reasonable things to actually do).
Well, there's obviously a non-zero performance cost to doing all this stuff with all types. Of course there's also a non-zero cost to checking the heap-type-ness of all types. And both costs may be so minimal they're hard to even measure.

Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
With branch prediction on a modern CPU, an "if unlikely()" can probably push the cost down far enough to be used with impunity. Both the Linux kernel and Cython do this liberally. Sturla

Sturla Molden <sturla.molden@gmail.com> wrote:
With branch prediction on a modern CPU, an "if unlikely()" can probably push the cost down far enough to be used with impunity. Both the Linux kernel and Cython do this liberally.
Just for reference, the definition of these macros in Cython and Linux is:

    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

Typical use cases are

    fd = open(...);
    if (unlikely(fd < 0)) {
        /* handle unlikely error */
    }

or

    ptr = malloc(...);
    if (unlikely(!ptr)) {
        /* handle unlikely error */
    }

If the conditionals fail, these checks have exactly zero impact on the run-time with a processor that supports branch prediction. Microsoft compilers don't know about __builtin_expect, but GCC, Clang and Intel compilers know what to do with it. Sturla

On Sun, 28 Sep 2014 17:13:06 +0000 (UTC) Sturla Molden <sturla.molden@gmail.com> wrote:
If the conditionals fail, these checks have exactly zero impact on the run-time with a processor that supports branch prediction.
Branch prediction is typically implemented using branch predictors, which is a form of cache updated with the results of previous branches. "Impunity" can therefore only be achieved with an infinite number of branch predictors :-) Regards Antoine.

On Sep 28, 2014, at 9:44, Sturla Molden <sturla.molden@gmail.com> wrote:
On what modern CPU does unlikely have any effect at all? x86 has an opcode to provide static branch prediction hints, but it's been a no-op since Core 2; ARM doesn't have one; I don't know about other instruction sets but I'd be surprised if they did. And that's a good thing. If that macro still controlled branch prediction, using it would mean blowing away the entire pipeline on every use of a non-heap type. A modern CPU will use recent history to decide which branch is more likely, so whether your loop is using a heap type or a non-heap type, it won't mispredict anything after the first run through the loop.

Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
AFAIK, the branch prediction is somewhat controlled by the order of instructions. And this compiler hint allows the compiler to restructure the code to better exploit this behavior. It does not result in specific opcodes being inserted. Sturla

Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
http://madalanarayana.wordpress.com/2013/08/29/__builtin_expect-a-must-for-s... http://benyossef.com/helping-the-compiler-help-you/

On Sep 28, 2014, at 13:17, Sturla Molden <sturla.molden@gmail.com> wrote:
The example in this post shows the exact opposite of what it purports to: the generated code puts the unlikely i++ operation immediately after the conditional branch; because Haswell processors assume, in the absence of any information, that forward branches are unlikely, this will cause the wrong branch to be speculatively executed. In other words, gcc has completely ignored the builtin_expect here--as it often does. Also note the comment in the quoted source:
In general, you should prefer to use actual profile feedback for this (`-fprofile-arcs'), as programmers are notoriously bad at predicting how their programs actually perform
This one vaguely waves its hands at the idea without providing any examples, before concluding:
It should be noted that GCC also provide a run time parameter -fprofile-arcs, which can profile the code for the actual statistics for each branch and the use of it should be prefered above guessing.
Meanwhile, this whole thing started with you saying that branch prediction means we can add conditional checks "with impunity". The exact opposite is true. On older processors, we _could_ issue checks with impunity; branch prediction means they're now an order of magnitude more expensive than they used to be unless we're very careful. The ability to hint the CPU by rearranging code (whether manually, with builtin_expect, or using PGO) partly mitigated this effect, but it doesn't reverse it. And at any rate, consider the case we're talking about. We have some heap types and some non-heap types. Neither branch is very unlikely, which means that no matter which version you mark as unlikely, it's going to be wrong quite often. Which means, exactly as I said at the start, that the check for non-heap is not free. Unnecessary refcounts are also not free. Which one is more costly? Is either one costly enough to matter? Hell if I know; that's the kind of thing you pretty much have to test. Trying to reason it from first principles is hard enough even if you get all the principles right, but even harder if you're thinking in terms of P4 chips.

On Sun, Sep 28, 2014, at 22:58, Andrew Barnert wrote:
And at any rate, consider the case we're talking about. We have some heap types and some non-heap types. Neither branch is very unlikely,
What? It is very unlikely, especially in existing code where it won't work at all, for someone to attempt to reassign the __class__ of a non heap type object. We are not talking about something that gets run on every object.

<random832@fastmail.us> wrote:
And because of that it is better to have the pipeline flushed whenever it happens, rather than, say, 50% of the times it might happen. But I agree with Andrew that it is something we should try to measure. Similarly, tagging functions 'hot' or 'cold' might also be a good idea. We know there are functions that will execute a lot, and there are error handlers that will only rarely be run. Anyone who has used Fortran will also know that tagging a function 'pure' is of great help to the compiler, particularly if arrays or pointers are involved. This informs the compiler that the function has no side effects. For example, if we assert that a function like sin(x) is pure, the compiler does not have to assume that calling it will change something elsewhere. In Fortran it is a keyword, but we can use it in C as a GNU extension. Sturla

On Sep 29, 2014, at 6:27, random832@fastmail.us wrote:
Look at the subject of this thread. Go back to the first message in the thread. Greg's suggestion is that, instead of just working around the __class__ assignment test, "I'm thinking it should be possible to reduce the differences to the point where [heap allocation itself is] the *only* distinction, so the vast majority of code doesn't have to care, and the same tp_* functions can be used for both." That's what we're talking about here. Is there a potential performance impact for making all of those changes? There could be a benefit from removing the tests; there could be a cost from adding work we didn't used to do (e.g., extra refcounting or other tests that we can currently skip). So, the fact that the one check on __class__ can be statically predicted pretty well doesn't have much to do with the potential cost or benefit of removing all of the differences between heap and non-heap types instead of just the check on __class__.

Nathaniel Smith wrote:
Yes, it's probably not necessary to actually allocate them on the heap (that would cause big problems for existing extension modules that assume they can statically declare them). But I'm thinking it should be possible to reduce the differences to the point where that's the *only* distinction, so the vast majority of code doesn't have to care, and the same tp_* functions can be used for both. -- Greg

On Sep 25, 2014, at 18:02, Nathaniel Smith <njs@pobox.com> wrote:
When I tried this a year or two ago, I did it with an import hook that allows you to specify metaclass=absolute.qualified.spam in any comment that comes before any non-comment lines, so you actually construct the module object as a subclass instance rather than re-classing it.

In theory that seems a lot cleaner. In practice it's a weird way to specify your type; it only works if the import-hooking module and the module that defines your type have already been imported, and otherwise silently does the wrong thing; and my implementation was pretty hideous.

Is there a cleaner version of that we could do if we were modifying the normal import machinery instead of hooking it, and if it didn't have to work pre-3.4, and if it were part of the language instead of a hack?

IIRC (too hard to check from my phone on the train), a module is built by calling exec with a new global dict and then calling the module constructor with that dict, so it's just a matter of something like:

    cls = g.get('__metamodule__', module)
    if not issubclass(cls, module):
        raise TypeError('metamodule {} is not a module type'.format(cls))
    mod = cls(name, doc, g)
    # etc.

Then you could import the module subclass and assign it to __metamodule__ from inside, rather than needing to pre-import stuff, and you'd get perfectly understandable errors, and so on. It seems less hacky and more flexible than re-classing the module after construction, for the same reason metaclasses and, for that matter, normal class constructors are better than reclassing after the fact.

Of course I could be misremembering how modules are constructed, in which case... Never mind.
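[Editor's sketch] Andrew's fragment above can be fleshed out into a runnable toy. The helper name `build_module` is hypothetical, and `types.ModuleType` stands in for the C-level `module` type the import machinery would actually use:

```python
import types

def build_module(name, g, doc=None):
    # Hypothetical sketch of the proposed __metamodule__ hook: consult the
    # module's globals for a module subclass and instantiate that instead
    # of the plain module type.
    cls = g.get('__metamodule__', types.ModuleType)
    if not issubclass(cls, types.ModuleType):
        raise TypeError('metamodule {} is not a module type'.format(cls))
    mod = cls(name, doc)
    mod.__dict__.update(g)  # hand the exec'ed globals to the new module
    return mod

class ChattyModule(types.ModuleType):
    def __repr__(self):
        return '<chatty module {!r}>'.format(self.__name__)

mod = build_module('demo', {'__metamodule__': ChattyModule, 'x': 1})
print(repr(mod), mod.x)  # -> <chatty module 'demo'> 1
```

Note that this only models the final construction step; as Nathaniel points out below in the thread, the real machinery creates the module *before* exec'ing its body, which is exactly the sticking point.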

On 26 Sep 2014 15:59, "Andrew Barnert" <abarnert@yahoo.com> wrote:
Alas, in this regard module objects are different from classes: they're constructed and placed in sys.modules before the body is exec'ed. And unfortunately it has to work that way, because if foo/__init__.py does 'import foo.bar', then the module 'foo' has to be immediately resolvable, before __init__.py finishes executing. A similar issue arises for circular imports.

So this would argue for your 'magic comment' or special syntax approach, sort of like how __future__ imports work. This part we could do non-hackily if we modified the import mechanism itself. But we'd still have the other problem you mention, that the metamodule would have to be defined before the module is imported. I think this is a showstopper, given that the main use cases for metamodule support involve using it for top-level package namespaces. If the numpy project wants to define a metamodule for the 'numpy' namespace, then where do they put it?

So I think we necessarily will always start out with a regular module object, and our goal is to end up with a metamodule instance instead. If this is right then it means that even in principle we really only have two options, so we should focus our attention on these.

Option 1: allocate a new object, shallowly copy over all the old object properties into the new one, and then find all references to the old object and replace them with the new object. This is possible right now, but error prone: cloning a module object requires intimate knowledge of which fields exist, and swapping all the references requires that we be careful to perform the swap very early, when the only reference is the one in sys.modules.

Option 2: the __class__ switcheroo. This avoids the two issues above. In exchange it's fairly brain-hurty.

Oh wait, I just thought of a third option. It only works for packages, but that's okay, you can always convert a module into a package by a simple mechanical transformation.
The proposal is that before exec'ing __init__.py, we check for the existence of a __preinit__.py, and if found we do something like:

    sys.modules[package] = sentinel  # to block circular imports
    namespace = {}
    exec __preinit__.py in namespace
    cls = namespace.get("__metamodule__", ModuleType)
    mod = cls(name, doc, namespace)
    sys.modules[package] = mod
    exec __init__.py in namespace

So preinit runs in the same namespace as init, but with a special restriction that if it tries to (directly or indirectly) import the current package, then this will trigger an ImportError. This is somewhat restrictive, but it does allow arbitrary code to be run before the module object is created.

-n

On Sep 26, 2014, at 12:12, Nathaniel Smith <njs@pobox.com> wrote:
I had an email written just to say "this sounds brilliant, but why isn't it called __new__", with three paragraphs explaining why it was a good analogy... Now I guess I can delete the draft. :) Anyway, I definitely like this better than re-classing modules in mid-initialization, and better than my magic comment hack (and looking at the code again, of course you're right that my magic comment hack was necessary with anything like my approach, I guess I just forgot in the intervening time).

Thomas Gläßle wrote on 09/26/2014 11:03 PM:
On second thought: scratch all of that. This is easy enough to do in a few lines of code and customize to the specific use case. Sorry for the noise, I think it's too late for my brain to work well ;) Using an __autoimport__ list could still be an option if not resorting to the metamodule.
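[Editor's sketch] For completeness, here is what Thomas's original auto-import-on-attribute-access idea looks like when combined with the switcheroo on a real package. Everything below is illustrative: the `demopkg` package is fabricated on disk just for the demo, and the reclassing line needs CPython 3.5 or later:

```python
import importlib
import os
import sys
import tempfile
import types

class AutoImportModule(types.ModuleType):
    # On failed attribute lookup, try importing a submodule of that name,
    # falling back to a friendlier AttributeError.
    def __getattr__(self, name):
        try:
            return importlib.import_module('{}.{}'.format(self.__name__, name))
        except ImportError:
            raise AttributeError(
                'module {!r} has no attribute or submodule {!r}'.format(
                    self.__name__, name))

# Fabricate a throwaway package: demopkg/__init__.py and demopkg/sub.py
root = tempfile.mkdtemp()
pkg = os.path.join(root, 'demopkg')
os.mkdir(pkg)
open(os.path.join(pkg, '__init__.py'), 'w').close()
with open(os.path.join(pkg, 'sub.py'), 'w') as f:
    f.write('VALUE = 42\n')
sys.path.insert(0, root)

import demopkg
demopkg.__class__ = AutoImportModule  # CPython >= 3.5
print(demopkg.sub.VALUE)              # auto-imports demopkg.sub, prints 42
```

This shows both caveats Thomas raised: the first attribute access triggers real code execution (importing sub.py), and afterwards sys.modules, not attribute access, is the only reliable way to test whether a submodule was already imported.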
participants (19)
- Alexander Belopolsky
- Andrew Barnert
- Antoine Pitrou
- Brett Cannon
- Chris Angelico
- Daniel Holth
- Erik Bray
- Ethan Furman
- Greg Ewing
- Guido van Rossum
- M.-A. Lemburg
- Nathaniel Smith
- Nick Coghlan
- random832@fastmail.us
- Steven D'Aprano
- Sturla Molden
- Tennessee Leeuwenburg
- Terry Reedy
- Thomas Gläßle