unexpected import behaviour

Hi,
I'm not sure if this is a bug or not; I certainly didn't expect it. If you create a file called test.py with the following contents,
    class Test:
        pass
    def test_1():
        import test
        print Test == test.Test
    if __name__ == '__main__':
        test_1()
and then run it ($ python test.py), it'll print False. Now try:
    $ python
    >>> import test
    >>> test.test_1()
and it'll print True. Is this behaviour expected? What was the rationale for it if it is?
Thanks,
Daniel
-- active-thought.com

Hello.
We are sorry but we cannot help you. This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. Probably python-list/comp.lang.python mailing list/news group is the best place; there are Python developers who participate in it; you may get a faster, and probably more complete, answer there. See http://www.python.org/community/ for other lists/news groups/fora. Thank you for understanding.
On Thu, Jul 29, 2010 at 07:32:28AM +0100, Daniel Waterworth wrote:
>     class Test:
>         pass
>     def test_1():
>         import test
>         print Test == test.Test
>     if __name__ == '__main__':
>         test_1()
> and then run it ($ python test.py), it'll print False.
The problem is that when you run the code as a script it is imported as module __main__; when you import it as 'test' you get the second copy of the module.
Oleg.
-- Oleg Broytman http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
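
A minimal Python 3 sketch of the two-copies effect described above. It writes the test.py from the original post into a scratch directory (with print() calls and one extra line, added here, that compares the two entries in sys.modules) and runs it as a script. The helper name run_as_script is hypothetical and only serves the illustration.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Python 3 spelling of the test.py from the original post, plus one extra
# line (an addition for this sketch) that compares the two module objects.
SCRIPT = textwrap.dedent('''\
    import sys

    class Test:
        pass

    def test_1():
        import test  # executes this same file again, as module "test"
        print(Test == test.Test)
        print(sys.modules["__main__"] is sys.modules["test"])

    if __name__ == "__main__":
        test_1()
''')


def run_as_script():
    """Run test.py from a scratch directory and return the printed words."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "test.py"
        script.write_text(SCRIPT)
        done = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, check=True,
        )
    return done.stdout.split()


if __name__ == "__main__":
    print(run_as_script())
```

Both comparisons come out False when the file is run directly: the class object created in __main__ and the one created in the freshly imported "test" module are separate objects, and so are the two modules.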

On Thu, Jul 29, 2010 at 4:32 PM, Daniel Waterworth <da.waterworth@gmail.com> wrote:
> Hi,
> I'm not sure if this is a bug or not, I certainly didn't expect it. If you create a file called test.py with the following contents,
>     class Test:
>         pass
>     def test_1():
>         import test
>         print Test == test.Test
>     if __name__ == '__main__':
>         test_1()
> and then run it ($ python test.py), it'll print False. Now try:
>     $ python
>     >>> import test
>     >>> test.test_1()
> and it'll print True. Is this behaviour expected? What was the rationale for it if it is?
The behaviour is expected, but there's no particularly deep rationale for it - the interpreter just doesn't go out of its way to try and figure out what __main__ *would* have been called if it had been imported rather than executed (your script will still print False even if you run it via "python -m test").
We certainly *could* put __main__ into sys.modules under both names (i.e. "__main__" and "test" in your example), but the backwards compatibility implications of doing so aren't particularly clear.
Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 29/07/2010 07:32, Daniel Waterworth wrote:
> Hi,
> I'm not sure if this is a bug or not, I certainly didn't expect it. If you create a file called test.py with the following contents,
The issue is that when your code is executed as a script it is run as the __main__ module and not as the test module. When you import test you then import the same module with a different name - and your Test class is recreated (so __main__.Test is then different from test.Test). When you import your code as test and it then reimports itself it is only created once.
This *is* expected behaviour (not a bug), but it frequently confuses even relatively experienced programmers (it can happen by accident and cause hard-to-track-down bugs) and I personally think that Python would be improved by issuing a warning if a __main__ script reimports itself.
All the best,
Michael Foord
>     class Test:
>         pass
>     def test_1():
>         import test
>         print Test == test.Test
>     if __name__ == '__main__':
>         test_1()
> and then run it ($ python test.py), it'll print False. Now try:
>     $ python
>     >>> import test
>     >>> test.test_1()
> and it'll print True. Is this behaviour expected? What was the rationale for it if it is?
> Thanks,
> Daniel
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

On 29 July 2010 07:32, Daniel Waterworth <da.waterworth@gmail.com> wrote:
> Is this behaviour expected? What was the rationale for it if it is?
@Oleg: I know this list is plagued by people who should be on comp.lang.python, but I assure you I'm not looking to learn to program in python, in fact I've been programming competently in python for many years. This is purely CPython bug-fixing/the discussion of implementation choices.
@Nick: In terms of backward compatibility, it would only break someone's code if they were relying on having the same module imported twice as different instances. Could this behaviour be added to python3.2? I'm not sure how far you are through the release cycle. Or even just a warning as Michael suggested?
@Michael: Yes, I guessed as much. In fact adding,
    import sys, os
    if globals().get("__file__") and __name__=='__main__':
        base = os.path.basename(__file__)
        ext = base.rfind('.')
        if ext > 0:
            main_name = base[:ext]
        else:
            main_name = base
        sys.modules[main_name] = __import__('__main__')
to the beginning of a file solves the problem, but seems more than a little hacky and I think I've missed edge cases with packages.
Thanks for your answers,
Daniel
-- active-thought.com
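
A hedged Python 3 sketch of the aliasing workaround Daniel describes: the script registers the running __main__ module under its file-derived name before anything imports it. This variant uses os.path.splitext instead of rfind, which is a substitution made here, and the helper name run_with_alias is hypothetical. As Daniel notes, packages are not handled.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# test.py with an aliasing prologue, in Python 3 syntax.
SCRIPT = textwrap.dedent('''\
    import os
    import sys

    if __name__ == "__main__" and globals().get("__file__"):
        # Publish the running script under its file-derived name too.
        main_name = os.path.splitext(os.path.basename(__file__))[0]
        sys.modules[main_name] = sys.modules["__main__"]

    class Test:
        pass

    def test_1():
        import test  # now resolves to the module that is already running
        print(Test == test.Test)

    if __name__ == "__main__":
        test_1()
''')


def run_with_alias():
    """Run the aliased test.py as a script and return what it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "test.py"
        script.write_text(SCRIPT)
        done = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, check=True,
        )
    return done.stdout.strip()


if __name__ == "__main__":
    print(run_with_alias())
```

Because "test" is already a key in sys.modules when the import statement runs, no second copy of the file is executed and the comparison prints True.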

On Fri, Jul 30, 2010 at 07:26:26AM +0100, Daniel Waterworth wrote:
> @Oleg: ... This is purely CPython bug-fixing/the discussion of implementation choices.
I am not sure it's a bug. By manipulating sys.path (or symlinks in the FS) one can import the same file as different modules as many times as [s]he wants. Should this be fixed for __main__? I doubt it. Instead of making __main__ a special case follow the rule: don't import the same module under different paths/names. Make your script simply
    from test import main
    main()
Oleg.
-- Oleg Broytman http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
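
One way to picture Oleg's rule is a two-file layout: all real code lives in test.py, and a separate launcher only imports and calls main(). The sketch below builds that layout in a scratch directory; the launcher name run_test.py and the helper run_launcher are assumptions made for the illustration, not names from the thread.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# The importable module: it is only ever loaded under the name "test".
LIBRARY = textwrap.dedent('''\
    class Test:
        pass

    def main():
        import test  # already loaded under this name by the launcher
        print(Test is test.Test)
''')

# The thin launcher script, as suggested in the message above.
LAUNCHER = "from test import main\nmain()\n"


def run_launcher():
    """Create both files, run the launcher, and return what it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "test.py").write_text(LIBRARY)
        launcher = Path(tmp) / "run_test.py"
        launcher.write_text(LAUNCHER)
        done = subprocess.run(
            [sys.executable, str(launcher)],
            capture_output=True, text=True, check=True,
        )
    return done.stdout.strip()


if __name__ == "__main__":
    print(run_launcher())
```

Here __main__ is the launcher, which defines nothing worth importing, so there is only one Test class and the check prints True.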

On 30/07/2010 17:59, Oleg Broytman wrote:
> On Fri, Jul 30, 2010 at 07:26:26AM +0100, Daniel Waterworth wrote:
>> @Oleg: ... This is purely CPython bug-fixing/the discussion of implementation choices.
> I am not sure it's a bug.
It isn't a bug but it's a very common *cause* of bugs, even for relatively experienced Python programmers (this exchange being another case in point).
Michael
> By manipulating sys.path (or symlinks in the FS) one can import the same file as different modules as many times as [s]he wants. Should this be fixed for __main__? I doubt it. Instead of making __main__ a special case follow the rule: don't import the same module under different paths/names. Make your script simply
>     from test import main
>     main()
> Oleg.
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog

On 30 July 2010 18:32, Michael Foord <fuzzyman@voidspace.org.uk> wrote:
> It isn't a bug but it's a very common *cause* of bugs, even for relatively experienced Python programmers (this exchange being another case in point).
Having thought it through thoroughly, my preference is for a warning. I don't think it's good practice to import the __main__ module by filename, as renaming the file will break the code. I got stung after, having dropped into a python interpreter shell and imported the module, I executed a function that uses isinstance.
If a warning showed up after importing the module, explaining the problem and suggesting that I use __import__('__main__') instead, I would have saved myself a fair amount of time debugging code. This is another case of "Explicit is better than implicit.". It also means that code that relies on the current behaviour will not be broken.
@Oleg: yes, but in the __main__ case, it's more difficult to tell that you are importing something under a different name. I suppose the proof is in the pudding: can anyone think of a case where someone has been annoyed that, having imported the same module twice via symlinks, they have had problems relating to modules being independent instances?
Thanks,
Daniel
-- active-thought.com
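
A small Python 3 sketch of the isinstance surprise, set in the script-runs-itself variant from the start of the thread rather than Daniel's interactive session. It contrasts importing the file by name with importing __main__ explicitly, which is the spelling Daniel suggests a warning could recommend. The function names check and run_isinstance_check are hypothetical.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

SCRIPT = textwrap.dedent('''\
    class Test:
        pass

    def check(obj):
        import test       # second copy of this file, loaded as "test"
        import __main__   # the module that is actually running
        print(isinstance(obj, test.Test))
        print(isinstance(obj, __main__.Test))

    if __name__ == "__main__":
        check(Test())
''')


def run_isinstance_check():
    """Run test.py as a script and return the two printed results."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "test.py"
        script.write_text(SCRIPT)
        done = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, check=True,
        )
    return done.stdout.split()


if __name__ == "__main__":
    print(run_isinstance_check())
```

The object was built from __main__.Test, so the check against test.Test fails while the check against __main__.Test succeeds.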

On Fri, Jul 30, 2010 at 07:46:44PM +0100, Daniel Waterworth wrote:
> can anyone think of a case where someone has been annoyed that, having imported that same module twice via symlinks, they have had problems relating to modules being independent instances?
I've had problems with two instances of the same module imported after sys.path manipulations. Never had a problem with reimported scripts.
Oleg.
-- Oleg Broytman http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.

At 11:50 PM 7/30/2010 +0400, Oleg Broytman wrote:
> On Fri, Jul 30, 2010 at 07:46:44PM +0100, Daniel Waterworth wrote:
>> can anyone think of a case where someone has been annoyed that, having imported that same module twice via symlinks, they have had problems relating to modules being independent instances?
> I've had problems with two instances of the same module imported after sys.path manipulations. Never had a problem with reimported scripts.
I have. The "unittest" module used to have this problem, when used as a script.

On Sat, Jul 31, 2010 at 4:46 AM, Daniel Waterworth <da.waterworth@gmail.com> wrote:
> Having thought it through thoroughly, my preference is for a warning.
That's actually harder than it sounds. Inserting "__main__" into sys.modules under its normal name as well as "__main__" is actually pretty easy (for both direct execution and -m). Making it generate a warning when accessing __main__ under its normal name is trickier (akin to the various "lazy module loading" hacks that are available in assorted packages on pypi).
Extending this to work for arbitrary objects is very hard to do efficiently (you'd almost certainly need an additional index from __file__ values to sys.modules keys that would complain whenever the list of associated key entries for a given __file__ value contained more than 1 entry).
That's a lot of infrastructure just to try and detect a fairly rare kind of bug.
Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
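
The index Nick mentions, from __file__ values to sys.modules keys, could be sketched as an after-the-fact scan like the one below. This is only an illustration of the data structure, not a proposal for how the interpreter would maintain it; the function names modules_by_file and duplicated_files are made up here.

```python
import os
import sys
from collections import defaultdict


def modules_by_file(modules=None):
    """Map each source path to the sorted module names loaded from it."""
    if modules is None:
        modules = sys.modules
    index = defaultdict(list)
    # Copy the items first: sys.modules can change while we iterate.
    for name, module in list(modules.items()):
        path = getattr(module, "__file__", None)
        if path:  # built-ins and namespace packages have no usable __file__
            index[os.path.realpath(path)].append(name)
    return {path: sorted(names) for path, names in index.items()}


def duplicated_files(modules=None):
    """Return only the paths that back more than one module name."""
    return {
        path: names
        for path, names in modules_by_file(modules).items()
        if len(names) > 1
    }


if __name__ == "__main__":
    for path, names in duplicated_files().items():
        print(path, names)
```

Resolving paths with os.path.realpath means symlinked copies are grouped as well, though that choice is an assumption of this sketch rather than something specified in the thread.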

On Fri, Jul 30, 2010 at 2:46 PM, Daniel Waterworth <da.waterworth@gmail.com> wrote: ..
> Having thought it through thoroughly, my preference is for a warning.
> I don't think it's a good practise to import the __main__ module by filename, as renaming the file will break the code. I got stung after, having dropped into a python interpreter shell and imported the module, I executed a function that uses isinstance.
> If a warning showed up after importing the module, explaining the problem and suggested that I use __import__('__main__') instead, I would have saved myself a fair amount of time debugging code. This is another case of "Explicit is better than implicit.".
You can easily disallow importing the __main__ module by filename by simply giving your script a name that does not end with .py, or by using, say, a '-' character in the filename. No change to python itself is needed.

On 31 July 2010 02:21, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
> You can easily disallow importing __main__ module by filename by simply giving your script a name that does not end with .py or by using say '-' character in the filename. No change to python itself is needed.
My problem is not that I'm likely to fall for the same trap twice. It's that I want to prevent other people from doing so.
@Nick: I suppose the simplest way to detect re-importation in the general case is to store a set of hashes of files that have been imported. When a user tries to import a file whose hash is already in the set, a warning is generated. It's simpler than trying to figure out all the different ways that a file can be imported, and will also detect copied files. This is less infrastructure than you were suggesting, but it's not a perfect solution.
Thanks,
Daniel
-- active-thought.com
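
Daniel's hashing idea might look roughly like this as a standalone check over a list of source files. It is a sketch under assumptions: SHA-256 is an arbitrary choice, a dict is used instead of a bare set so the first path can be reported, and the function name find_reimported_files is hypothetical. Nothing here hooks into the import system.

```python
import hashlib
from pathlib import Path


def find_reimported_files(paths):
    """Return (first_path, repeat_path) pairs for files with equal content."""
    seen = {}      # content digest -> first path seen with that content
    repeats = []
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if digest in seen:
            # Same bytes as an earlier file: a re-import or a plain copy.
            repeats.append((seen[digest], str(path)))
        else:
            seen[digest] = str(path)
    return repeats
```

As Daniel says, this also flags copied files, which may or may not be what is wanted, and it reads every file in full.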

On Sat, Jul 31, 2010 at 3:57 PM, Daniel Waterworth <da.waterworth@gmail.com> wrote:
> @Nick: I suppose the simplest way to detect re-importation in the general case is to store a set of hashes of files that have been imported. When a user tries to import a file whose hash is already in the set, a warning is generated. It's simpler than trying to figure out all the different ways that a file can be imported, and will also detect copied files. This is less infrastructure than you were suggesting, but it's not a perfect solution.
Hashing every file on import would definitely be more overhead than just checking __file__ values (since we already calculate the latter, and regardless of how a file is imported, it needs to end up in sys.modules eventually). Besides, importing the same code under different names happens in several places in our own test suite (we use it to check that code behaviour doesn't change just because we import it differently), so we can hardly disable that behaviour.
That said, I really don't think catching such a rare error is worth *any* runtime overhead. Just making "__main__" and the real module name refer to the same object in sys.modules is a different matter, but I'm not confident enough that I fully grasp the implications to do it without gathering feedback from a wider audience.
Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 31/07/2010 16:07, Nick Coghlan wrote:
> Hashing every file on import would definitely be more overhead than just checking __file__ values (since we already calculate the latter, and regardless of how a file is imported, it needs to end up in sys.modules eventually). Besides, importing the same code under different names happens in several places in our own test suite (we use it to check that code behaviour doesn't change just because we import it differently), so we can hardly disable that behaviour.
> That said, I really don't think catching such a rare error is worth *any* runtime overhead. Just making "__main__" and the real module name refer to the same object in sys.modules is a different matter, but I'm not confident enough that I fully grasp the implications to do it without gathering feedback from a wider audience.
Some people work around the potential for bugs caused by __main__ reimporting itself by doing it *deliberately*. Glyf even recommends it as good practice. ;-)
http://glyf.livejournal.com/60326.html
So - the fix you suggest would *break* this code. Raising a warning wouldn't... (and would eventually make this workaround unnecessary.)
Michael
> Cheers, Nick.
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog

On Sun, Aug 1, 2010 at 1:14 AM, Michael Foord <fuzzyman@voidspace.org.uk> wrote:
> Some people work around the potential for bugs caused by __main__ reimporting itself by doing it *deliberately*. Glyf even recommends it as good practice. ;-)
> http://glyf.livejournal.com/60326.html
> So - the fix you suggest would *break* this code. Raising a warning wouldn't... (and would eventually make this workaround unnecessary.)
With my change, that code would work just fine. "from myproject.gizmo import main" and "from __main__ import main" would just return the same object, whereas currently they return something different.
Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 31/07/2010 16:30, Nick Coghlan wrote:
> With my change, that code would work just fine. "from myproject.gizmo import main" and "from __main__ import main" would just return the same object, whereas currently they return something different.
Have you looked at the code in that example? I don't think it would work...
Michael
> Cheers, Nick.
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog

On Sun, Aug 1, 2010 at 1:36 AM, Michael Foord <fuzzyman@voidspace.org.uk> wrote:
> On 31/07/2010 16:30, Nick Coghlan wrote:
>> With my change, that code would work just fine. "from myproject.gizmo import main" and "from __main__ import main" would just return the same object, whereas currently they return something different.
> Have you looked at the code in that example? I don't think it would work...
Ah, I see what you mean - yes, there would need to be some additional work done to detect the case of direct execution from within a package directory in order to set __main__.__package__ accordingly (as if the command line had been "python -m myproject.gizmo" rather than "python myproject/gizmo.py"). Even then, the naming problem would remain.
Still, this kind of thing is the reason I'm reluctant to arbitrarily change the existing semantics - as irritating as they can be at times (with pickling/unpickling problems being the worst of it, as pickling in particular depends on the value in __name__ being correct), people have all sorts of workarounds kicking around that need to be accounted for if we're going to make any changes.
I kind of regret PEP 366 being accepted in the __package__ form now. At one point I considered proposing something like __module_name__ instead, but I didn't actually need that extra information to solve the relative import issue, and nobody else mentioned the pickling problem at the time.
Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
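
The pickling problem Nick refers to can be sketched as follows, assuming Python 3. A class defined in a script records "__main__" as its module, so an instance pickled by that script cannot be loaded by a different program, even one that imports the same file by name. The file names load.py and obj.pkl and the helper pickle_round_trip are hypothetical.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Run as a script: pickles an instance of a class defined in __main__.
DUMPER = textwrap.dedent('''\
    import pickle

    class Test:
        pass

    if __name__ == "__main__":
        with open("obj.pkl", "wb") as fh:
            pickle.dump(Test(), fh)
        print(Test.__module__)
''')

# A different program: test.Test exists, but the pickle names __main__.Test.
LOADER = textwrap.dedent('''\
    import pickle
    import test

    try:
        with open("obj.pkl", "rb") as fh:
            pickle.load(fh)
        print("loaded")
    except AttributeError:
        print("load failed")
''')


def pickle_round_trip():
    """Run the dumper, then the loader, and return one output line each."""
    lines = []
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in (("test.py", DUMPER), ("load.py", LOADER)):
            path = Path(tmp) / name
            path.write_text(source)
            done = subprocess.run(
                [sys.executable, str(path)],
                cwd=tmp, capture_output=True, text=True, check=True,
            )
            lines.append(done.stdout.strip())
    return lines


if __name__ == "__main__":
    print(pickle_round_trip())
```

The dumper reports "__main__" as the class's module, and the loader fails because its own __main__ (load.py) has no Test attribute.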

On Sat, Jul 31, 2010 at 11:07 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: ..
> That said, I really don't think catching such a rare error is worth *any* runtime overhead. Just making "__main__" and the real module name refer to the same object in sys.modules is a different matter, but I'm not confident enough that I fully grasp the implications to do it without gathering feedback from a wider audience.
If you make sys.modules['__main__'] and sys.modules['modname'] the same (let's call it mod), what will be the value of mod.__name__?

On Sun, Aug 1, 2010 at 1:23 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
> If you make sys.modules['__main__'] and sys.modules['modname'] the same (let's call it mod), what will be the value of mod.__name__?
"__main__", so pickling would remain broken. Unpickling would at least work correctly under this regime though. The only way to fix pickling is to avoid monkeying with __name__ at all (e.g. something along the lines of PEP 299, or a special "__is_main__" flag).
Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

participants (6)
- Alexander Belopolsky
- Daniel Waterworth
- Michael Foord
- Nick Coghlan
- Oleg Broytman
- P.J. Eby