<div class="gmail_quote">On Thu, Mar 15, 2012 at 1:50 PM, Guido van Rossum <span dir="ltr">&lt;<a href="mailto:guido@python.org">guido@python.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div></div><div class="h5">How would you implement that anyway? </div></div></blockquote><div><br></div><div>From the PEP:</div><div><br></div><div><span style="font-family:Arial,Verdana,Geneva,'Bitstream Vera Sans',Helvetica,sans-serif;font-size:15px;line-height:21px;background-color:rgb(255,255,255)">"""If the parent package does not exist, or exists but lacks a </span><tt class="docutils literal" style="background-color:rgb(255,255,255)">__path__</tt><span style="font-family:Arial,Verdana,Geneva,'Bitstream Vera Sans',Helvetica,sans-serif;font-size:15px;line-height:21px;background-color:rgb(255,255,255)"> attribute, an attempt is first made to create a "virtual path" for the parent package (following the algorithm described in the section on </span><a class="reference internal" href="http://www.python.org/dev/peps/pep-0402/#virtual-paths" style="border-bottom-width:1px;border-bottom-style:dashed;border-bottom-color:rgb(204,204,204);color:rgb(85,26,139);text-decoration:none;font-family:Arial,Verdana,Geneva,'Bitstream Vera Sans',Helvetica,sans-serif;font-size:15px;line-height:21px;background-color:rgb(255,255,255)">virtual paths</a><span style="font-family:Arial,Verdana,Geneva,'Bitstream Vera Sans',Helvetica,sans-serif;font-size:15px;line-height:21px;background-color:rgb(255,255,255)">, below).</span>"""<br>
</div><div><br></div><div>This is actually a pretty straightforward change to the import process; I drafted a patch for importlib at one point, and somebody else created another.</div><div><br></div><div>(The main difference from the new proposal is that you do have to go back over the path list a second time in the event the parent package isn't found; but there's no reason why the protocols in the PEP wouldn't allow you to build and cache a virtual path while doing the first search, if you're worried about the performance.)</div>
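<div><br></div><div>A minimal sketch of building and caching a virtual path, assuming a filesystem-only search and a top-level package name. The helper names virtual_path and cached_virtual_path are hypothetical, and the PEP's importer protocol is not modeled; this only shows the shape of the idea.</div>
<pre>
```python
import os
import sys


def virtual_path(package_name, search_path=None):
    """Collect every directory named after the package along the search path.

    Simplifying assumption: only plain filesystem directories are considered;
    the importer protocol described in the PEP is not modeled here.
    """
    if search_path is None:
        search_path = sys.path
    found = []
    for entry in search_path:
        # An empty sys.path entry conventionally means the current directory.
        candidate = os.path.join(entry or os.curdir, package_name)
        if os.path.isdir(candidate) and candidate not in found:
            found.append(candidate)
    return found


_cache = {}


def cached_virtual_path(package_name, search_path=None):
    """Compute the virtual path once per (name, search path) and reuse it."""
    key = (package_name, tuple(sys.path if search_path is None else search_path))
    if key not in _cache:
        _cache[key] = virtual_path(package_name, search_path)
    return _cache[key]
```
</pre>
<div>With a cache keyed on the search path, the second pass over the path list costs a dictionary lookup rather than another round of directory checks.</div>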
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">The import logic always tries to<br></div></div>
import the parent module before importing the child module. So the<br>
import attempt for "foo" has no idea whether it is imported as *part*<br>
of "import foo.bar", or as plain "import foo", or perhaps as part of<br>
"from foo import bar".<br></blockquote><div><br></div><div>Actually, this isn't entirely true. __import__ is called with 'foo.bar' when you import foo.bar. In importlib, it recursively invokes __import__ with parent portions, and in import.c, it loops left to right for the parents. Either way, it knows the difference throughout the process, and it's fairly straightforward to backtrack and create the parent modules when the submodule import succeeds.</div>
<div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It would also be odd to find that<br>
<br>
import foo<br>
import foo.bar<br>
<br>
would fail, whereas<br>
<br>
import foo.bar<br>
import foo<br>
<br>
would succeed, because as a side effect of "import foo.bar", a module<br>
object for foo is created and inserted as sys.modules['foo'].<br></blockquote><div><br></div><div>Assuming we know that the foo subdirectories actually exist, the ImportError would simply say, "Can't import namespace package 'foo' before one of its modules or subpackages is imported".</div>
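<div><br></div><div>The ordering rule can be modeled with a toy class. This is not real import machinery: ToyImportSystem and its attributes are hypothetical names, and the set called loaded merely stands in for sys.modules.</div>
<pre>
```python
class ToyImportSystem:
    """Toy model of the ordering rule; not real import machinery."""

    def __init__(self, real_modules, namespace_dirs):
        self.real_modules = set(real_modules)      # names with actual code behind them
        self.namespace_dirs = set(namespace_dirs)  # bare directories, no code of their own
        self.loaded = set()                        # stands in for sys.modules

    def import_(self, name):
        if name in self.loaded:
            return name
        if name in self.real_modules:
            # The submodule import succeeded, so backfill its virtual parents.
            parts = name.split(".")
            for i in range(1, len(parts)):
                ancestor = ".".join(parts[:i])
                if ancestor in self.namespace_dirs:
                    self.loaded.add(ancestor)
            self.loaded.add(name)
            return name
        if name in self.namespace_dirs:
            raise ImportError(
                "Can't import namespace package %r before one of its "
                "modules or subpackages is imported" % name
            )
        raise ImportError("No module named %r" % name)
```
</pre>
<div>With ToyImportSystem(["foo.bar"], ["foo"]), importing "foo" first raises the ImportError, while importing "foo.bar" and then "foo" succeeds.</div>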
<div><br></div><div>Granted, that does seem a bit crufty. I erred in this direction to avoid pitchforks coming from the backward-compatibility side: without this condition, it is easy for something to get messed up at a distance, in a way that may be hard to identify, when a piece of code uses package presence to control optional features.</div>
<div><br></div><div>IOW, neither proposal produces a perfectly clean result for everybody. It's a choice of which group to upset. One group is developers fiddling with their import order (who get an error message that says how to fix it). The other group is people whose code suddenly crashes or behaves differently because somebody created a directory somewhere they shouldn't have (one they might not be able to delete or remove from sys.path for one reason or another), a directory that was there and did no harm until they installed a new version of the application built on a new version of Python.</div>
<div><br></div><div>That is, the backward compatibility problem can break an app in the field that worked perfectly in the developer's testing, and only fails when it gets to the end user who has no way of even knowing it could be a problem.</div>
<div><br></div><div>It's up to you to decide which of those groups' pitchforks to face; I just want to be clear about why the tradeoff was proposed the way it was. It's not that the backward compatibility problem harms a lot of people, so much as that when it harms them, it can harm them a lot (e.g. crashing), and at *runtime*, compared to tweaking your import sequence during *development* and getting a clear and immediate "don't do that."</div>
<div><br></div><div>Why crashing? Because "try: import json" will succeed, and then the app does json.foobar() and boom, an unexpected AttributeError. Far-fetched? Perhaps, but the worst runtime import-ordering problem I can think of is a bad import that only works because of a global import order determined at runtime by plugin loading. And if you have that problem, you correct the bad import in the plugin and it can never happen again.</div>
<div><br></div><div>Granted, directory naming conflicts can *also* be fixed by changing your imports; you can (and should) "try: from json import foobar" instead. But there isn't any way for us to give the user or developer an error message that *tells* them that, or even clues them in as to why the json module on that user's machine seems to be borked whenever they run the app from a certain directory...</div>
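<div><br></div><div>The difference between the two feature-detection idioms can be checked with a short sketch. It assumes Python 3.3 or later, where a directory without an __init__.py is importable as a namespace package; that is the runtime's behavior, not the proposal under discussion here. The function name probe_empty_directory and the package name optional_feature_pkg are hypothetical.</div>
<pre>
```python
import importlib
import os
import sys
import tempfile


def probe_empty_directory(name="optional_feature_pkg"):
    """Report how two feature-detection idioms treat a stray empty directory."""
    outcome = {}
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, name))  # the stray directory, no code inside
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            try:
                module = importlib.import_module(name)
                outcome["plain import"] = "succeeded"
                outcome["has foobar"] = hasattr(module, "foobar")
            except ImportError:
                outcome["plain import"] = "ImportError"
            try:
                exec("from %s import foobar" % name, {})
                outcome["from-import"] = "succeeded"
            except ImportError:
                outcome["from-import"] = "ImportError"
        finally:
            # Undo the changes to interpreter state made for the probe.
            sys.path.remove(root)
            sys.modules.pop(name, None)
    return outcome
```
</pre>
<div>Under that assumption the plain import succeeds on the empty directory even though the attribute is missing, while the from-import form fails immediately with ImportError, which is the failure an optional-feature check actually wants.</div>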
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Finally, in your example, why on earth would unittest/mock/ exist as<br>
an empty directory???<br></blockquote><div><br></div><div>It's definitely true that the impact is limited in scope; the things most likely to be affected are generically-named top-level packages like json, email, text, xml, html, etc., that could collide with other directories lying around, and only when it's a package name you try to import in order to test for its presence.</div>
<div><br></div><div>As I said though, it's just that when it happens, it can happen to an *end user*, whereas import order crankiness can essentially only happen during actual coding. Also, nobody's come up with examples of breakage caused by trying to import the namespace, because there aren't many use cases for importing an empty namespace, vs use cases for having a 'json' directory or some such. ;-)</div>
<div><br></div><div>All this being said, if you're happy with the tradeoff, I'm happy with the tradeoff. I'm not the one they're gonna come after with the pitchforks. ;-)</div><div><br></div></div>