Re: [Python-Dev] PEP 3147: PYC Repository Directories

On Feb 03, 2010, at 01:17 PM, Guido van Rossum wrote:
Can you clarify? In Python 3, __file__ always points to the source. Clearly that is the way of the future. For 99.99% of uses of __file__, if it suddenly never pointed to a .pyc file any more (even if one existed) that would be just fine. So what's this talk of switching to __source__?
Upon further reflection, I agree. __file__ also points to the source in Python 2.7. Do we need an attribute to point to the compiled bytecode file? -Barry

On 08:21 pm, barry@python.org wrote:
On Feb 03, 2010, at 01:17 PM, Guido van Rossum wrote:
Can you clarify? In Python 3, __file__ always points to the source. Clearly that is the way of the future. For 99.99% of uses of __file__, if it suddenly never pointed to a .pyc file any more (even if one existed) that would be just fine. So what's this talk of switching to __source__?
Upon further reflection, I agree. __file__ also points to the source in Python 2.7. Do we need an attribute to point to the compiled bytecode file?
What if, instead of trying to annotate the module object with this assortment of metadata - metadata which depends on lots of things, and can vary from interpreter to interpreter, and even from module to module (depending on how it was loaded) - we just stuck with the __loader__ annotation, and encouraged/allowed/facilitated the use of the loader object to learn all of this extra information? Jean-Paul
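A minimal sketch of what "asking the loader" could look like. It assumes the importlib machinery of later Python 3 releases (module.__spec__, spec.cached, loader.get_filename), none of which is taken from this thread; describe_module is a hypothetical helper name.

```python
import importlib


def describe_module(name):
    """Collect file metadata for a module from its import spec and loader."""
    module = importlib.import_module(name)
    spec = module.__spec__
    loader = spec.loader
    info = {
        "loader": repr(loader),
        "origin": spec.origin,  # where the module was found (source file, .so, "built-in", ...)
        "cached": spec.cached,  # bytecode cache path, or None if there is none
    }
    # Not every loader can name a file (built-in modules cannot), so probe for the method.
    get_filename = getattr(loader, "get_filename", None)
    if get_filename is not None:
        info["filename"] = get_filename(name)
    return info


if __name__ == "__main__":
    for key, value in describe_module("json").items():
        print(key, "=", value)
```

The point of the design is that the module object carries only one annotation, and everything interpreter-specific is reached through it.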

exarkun@twistedmatrix.com wrote:
On 08:21 pm, barry@python.org wrote:
On Feb 03, 2010, at 01:17 PM, Guido van Rossum wrote:
Can you clarify? In Python 3, __file__ always points to the source. Clearly that is the way of the future. For 99.99% of uses of __file__, if it suddenly never pointed to a .pyc file any more (even if one existed) that would be just fine. So what's this talk of switching to __source__?
Upon further reflection, I agree. __file__ also points to the source in Python 2.7. Do we need an attribute to point to the compiled bytecode file?
What if, instead of trying to annotate the module object with this assortment of metadata - metadata which depends on lots of things, and can vary from interpreter to interpreter, and even from module to module (depending on how it was loaded) - we just stuck with the __loader__ annotation, and encouraged/allowed/facilitated the use of the loader object to learn all of this extra information?
Trickier than it sounds. In the case of answering the question "was this module loaded from bytecode or not?", the loader will need somewhere to store the answer for each file. The easiest per-module store is the module's own global namespace - the loader's own attribute namespace isn't appropriate, since one loader may handle multiple modules. The filesystem can't be used as a reference because even when the file is loaded from source, the bytecode file will usually be created as a side effect. Cheers, Nick.
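The bookkeeping problem described here can be illustrated with a toy loader. This is a sketch only: ToyLoader, its in-memory "cache", and the __from_bytecode__ attribute are all hypothetical names, and nothing below reflects how CPython's import system is implemented.

```python
import types


class ToyLoader:
    """One loader instance that serves many modules (hypothetical)."""

    def __init__(self, sources):
        self.sources = sources  # module name -> source text
        self.cache = {}         # module name -> compiled code, stands in for .pyc files

    def load(self, name):
        module = types.ModuleType(name)
        module.__loader__ = self
        if name in self.cache:
            code = self.cache[name]
            from_bytecode = True
        else:
            code = compile(self.sources[name], name + ".py", "exec")
            self.cache[name] = code  # side effect: the cache entry now exists
            from_bytecode = False
        # The answer is per module, so it is stored in the module's own
        # namespace rather than on the loader, which is shared.
        module.__dict__["__from_bytecode__"] = from_bytecode
        exec(code, module.__dict__)
        return module


if __name__ == "__main__":
    loader = ToyLoader({"a": "x = 1", "b": "y = 2"})
    first = loader.load("a")
    second = loader.load("a")
    print(first.__from_bytecode__, second.__from_bytecode__)
```

After the first load the cache entry exists either way, so checking for it afterwards cannot tell the two loads apart; only the flag recorded at load time can.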

On Sat, Feb 6, 2010 at 12:21 PM, Barry Warsaw <barry@python.org> wrote:
On Feb 03, 2010, at 01:17 PM, Guido van Rossum wrote:
Can you clarify? In Python 3, __file__ always points to the source. Clearly that is the way of the future. For 99.99% of uses of __file__, if it suddenly never pointed to a .pyc file any more (even if one existed) that would be just fine. So what's this talk of switching to __source__?
Upon further reflection, I agree. __file__ also points to the source in Python 2.7.
Not in the 2.7 svn repo I have access to. It still points to the .pyc file if it was used. And I propose not to disturb this in 2.7, at least not by default. I'm fine though with a flag or distro-overridable config setting to change this behavior.
Do we need an attribute to point to the compiled bytecode file?
I think we do. Quite unrelated to this discussion I have a use case for knowing easily whether a module was actually loaded from bytecode or not -- but I also have a need for __file__ to point to the source. So having both __file__ and __compiled__ makes sense to me. When there is no source code but only bytecode I am fine with both pointing to the bytecode; in that case I presume that the bytecode is not in a __pyr__ subdirectory. For dynamically loaded extension modules I think both should be left unset, and some other __xxx__ variable could point to the .so or .dll file. FWIW the most common use case for __file__ is probably to find data files relative to it. Since the data won't be in the __pyr__ directory we couldn't make __file__ point to the __pyr__/....pyc file without much code breakage. (Yes, I am still in favor of the folder-per-folder model.) --Guido van Rossum (python.org/~guido)
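The data-file use case can be sketched as follows. data_path is a hypothetical helper, and both paths below are made up for illustration; in particular the file name inside __pyr__ is an assumption, since the thread does not spell out that layout.

```python
import os


def data_path(module_file, *parts):
    """Resolve a data file relative to the directory that holds module_file."""
    base = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base, *parts)


# Hypothetical locations of one module's source and its cached bytecode.
source_file = os.path.join(os.sep, "pkg", "mod.py")
cached_file = os.path.join(os.sep, "pkg", "__pyr__", "mod.pyc")

if __name__ == "__main__":
    # With __file__ pointing at the source, the lookup lands next to the module.
    print(data_path(source_file, "data.txt"))
    # If __file__ pointed into the cache directory, the same code would look there instead.
    print(data_path(cached_file, "data.txt"))
```

Existing code written in the first style would silently start looking in the wrong directory, which is the breakage being described.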

On Feb 06, 2010, at 02:20 PM, Guido van Rossum wrote:
Upon further reflection, I agree. __file__ also points to the source in Python 2.7.
Not in the 2.7 svn repo I have access to. It still points to the .pyc file if it was used.
Ah, I was fooled by a missing pyc file. Run it a second time and you're right, it points to the pyc.
And I propose not to disturb this in 2.7, at least not by default. I'm fine though with a flag or distro-overridable config setting to change this behavior.
Cool. I'm not sure this is absolutely necessary for Debian/Ubuntu, so I'll call YAGNI on it for 2.x (until and unless it isn't ;).
Do we need an attribute to point to the compiled bytecode file?
I think we do. Quite unrelated to this discussion I have a use case for knowing easily whether a module was actually loaded from bytecode or not -- but I also have a need for __file__ to point to the source. So having both __file__ and __compiled__ makes sense to me.
__compiled__ or __cached__? I like the latter but don't have strong feelings about it either way.
When there is no source code but only bytecode I am fine with both pointing to the bytecode; in that case I presume that the bytecode is not in a __pyr__ subdirectory. For dynamically loaded extension modules I think both should be left unset, and some other __xxx__ variable could point to the .so or .dll file. FWIW the most common use case for __file__ is probably to find data files relative to it. Since the data won't be in the __pyr__ directory we couldn't make __file__ point to the __pyr__/....pyc file without much code breakage.
The other main use case for having such an attribute on extension modules is diagnostics. I want to be able to find out where on the file system a .so actually lives:
    Python 2.7a3+ (trunk:78030, Feb 6 2010, 15:18:29)
    [GCC 4.4.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import _socket
    >>> _socket.__file__
    '/home/barry/projects/python/trunk/build/lib.linux-x86_64-2.7/_socket.so'
(Yes, I am still in favor of the folder-per-folder model.)
Cool. -Barry
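The diagnostic lookup above can be wrapped so that it also copes with modules that have no file at all. This is a sketch; extension_location is a hypothetical helper, and whether _socket is a separate shared library or compiled into the interpreter depends on the build.

```python
import importlib


def extension_location(name):
    """Return the file a module was loaded from, or None if it has no __file__."""
    module = importlib.import_module(name)
    # Modules compiled into the interpreter (sys, for example) carry no __file__.
    return getattr(module, "__file__", None)


if __name__ == "__main__":
    for name in ("_socket", "sys"):
        print(name, "->", extension_location(name))
```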

On 07/02/2010 17:48, Barry Warsaw wrote:
[snip...]
And I propose not to disturb this in 2.7, at least not by default. I'm fine though with a flag or distro-overridable config setting to change this behavior.
Cool. I'm not sure this is absolutely necessary for Debian/Ubuntu, so I'll call YAGNI on it for 2.x (until and unless it isn't ;).
What are the chances of getting this into 2.x at all? For it to get into the 2.7, likely to be the last major version in the 2.x series, the PEP needs to be approved and the implementation needs to be feature complete by April 3rd (first beta release according to the schedule [1]). Michael Foord [1] http://www.python.org/dev/peps/pep-0373/#release-schedule

On Feb 07, 2010, at 05:59 PM, Michael Foord wrote:
On 07/02/2010 17:48, Barry Warsaw wrote:
[snip...]
And I propose not to disturb this in 2.7, at least not by default. I'm fine though with a flag or distro-overridable config setting to change this behavior.
Cool. I'm not sure this is absolutely necessary for Debian/Ubuntu, so I'll call YAGNI on it for 2.x (until and unless it isn't ;).
Sorry, I was calling YAGNI on any change in behavior of module.__file__.
What are the chances of getting this into 2.x at all? For it to get into the 2.7, likely to be the last major version in the 2.x series, the PEP needs to be approved and the implementation needs to be feature complete by April 3rd (first beta release according to the schedule [1]).
I'd like to consult with my Debian/Ubuntu Python maintainer colleagues to see if it's worth getting into 2.7. If it is, and we can get a BDFL pronouncement on the PEP (after the next rounds of updates), then I think it will be feasible to implement in the time remaining. Heck, that's what Pycon sprints are for, no? :) -Barry
participants (5)
- Barry Warsaw
- exarkun@twistedmatrix.com
- Guido van Rossum
- Michael Foord
- Nick Coghlan