PEP 3147, __pycache__ directories and umask

I have a pretty good start on PEP 3147 implementation [1], but I've encountered a situation that I'd like to get some feedback on. Here's the test case illustrating the problem. From test_import.py:
    def test_writable_directory(self):
        # The umask is not conducive to creating a writable __pycache__
        # directory.
        with umask(0o222):
            __import__(TESTFN)
        self.assertTrue(os.path.exists('__pycache__'))
        self.assertTrue(os.path.exists(os.path.join(
            '__pycache__', '{}.{}.pyc'.format(TESTFN, self.tag))))
The __pycache__ directory does not exist before the import, and the import machinery creates the directory, but the umask leaves the directory unwritable by anybody. So of course when the import machinery goes to write the .pyc file inside __pycache__, it fails. This does not cause an ImportError though, just like if today the package directory were unwritable.
This might be different than today's situation though because once the unwritable __pycache__ directory is created, nothing is going to change that without explicit user interaction, and that might be difficult after the fact.
I'm not sure what the right answer is. Some possible choices:
* Tough luck
* Force the umask so that the directory is writable, but then the question is, by whom? ugo+w or something less?
* Copy the permissions from the parent directory and ignore umask
* Raise an exception or refuse to create __pycache__ if it's not writable (again, by whom?)
Or maybe you have a better idea? What's the equivalent situation on Windows and how would things work there?
-Barry
[1] https://edge.launchpad.net/~barry/python/pep3147
P.S. I'm down to only 8 unit test failures.
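A minimal sketch, assuming a POSIX system, of the effect the test case above exercises: os.mkdir() requests mode 0o777, the umask strips the write bits, and the new directory ends up unwritable. The umask() context manager and make_cache_dir() are hypothetical names used for illustration only; they are not the import machinery from the branch.

```python
import os
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def umask(mask):
    """Temporarily set the process umask (hypothetical helper)."""
    old = os.umask(mask)
    try:
        yield
    finally:
        os.umask(old)


def make_cache_dir(parent):
    """Create parent/__pycache__ and return its permission bits."""
    path = os.path.join(parent, "__pycache__")
    os.mkdir(path)  # default requested mode is 0o777, reduced by the umask
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        with umask(0o222):
            mode = make_cache_dir(parent)
        # No write bit survives for user, group or other.
        print(oct(mode), "writable bits:", oct(mode & 0o222))
```

Under this umask, nobody has write permission on the directory, so a later attempt to write a .pyc file inside it fails for any unprivileged process.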

Barry Warsaw <barry <at> python.org> writes:
I'm not sure what the right answer is. Some possible choices:
* Tough luck
* Force the umask so that the directory is writable, but then the question is, by whom? ugo+w or something less?
* Copy the permissions from the parent directory and ignore umask
* Raise an exception or refuse to create __pycache__ if it's not writable (again, by whom?)
Welcome to a problem PHP has known for years.
The problem with solution 3 is that a non-root user can't change the owner of a directory, therefore copying the permissions doesn't ensure the directory will be writable by the right users.
Solution 2 (chmod 777) is obviously too lax. Anyone could put fake bytecode files in a root-owned application. Less lax chmods have the same problem as solution 3, and can still be insecure.
I think solution 4 would be the best compromise. In the face of ambiguity, refuse to guess. But, rather than "refuse to create __pycache__ if it's not writable", the test should be "refuse to create __pycache__ if I can't create it with the same ownership and permissions as the parent directory" (and raise an ImportWarning).
In light of this issue, I'm -0.5 on __pycache__ becoming the default caching mechanism. The directory ownership/permissions issue is too much of a mess, especially for Web applications (think __pycache__ files created by the Apache user).
__pycache__ should only be created if explicitly activated (for example by distutils when installing stuff). Otherwise, if not present, the "legacy" mechanism (writing an untagged pyc file along the py file) should be used.
Actually, __pycache__ creation doesn't have to be part of the import mechanism. It can be part of distutils instead (or whatever third-party tool such as distribute, or distro-specific packaging scripts). This would relax complexity of core Python a bit.
Regards
Antoine.
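The "same ownership and permissions as the parent directory" test could be sketched roughly as follows, assuming a POSIX system. create_matching_cache_dir() is a hypothetical name, not anything from the PEP 3147 branch, and the create-check-remove sequence is not atomic.

```python
import os
import stat


def create_matching_cache_dir(parent):
    """Create parent/__pycache__ only if it matches the parent's owner and mode.

    Returns the new path, or None if the directory could not be created or
    does not end up with the parent's uid, gid and permission bits.
    """
    path = os.path.join(parent, "__pycache__")
    parent_st = os.stat(parent)
    try:
        os.mkdir(path)
    except OSError:
        return None  # cannot create it at all: refuse silently
    try:
        # Override whatever the umask left us with.
        os.chmod(path, stat.S_IMODE(parent_st.st_mode))
    except OSError:
        pass  # the comparison below will detect the mismatch
    st = os.stat(path)
    matches = (
        st.st_uid == parent_st.st_uid
        and st.st_gid == parent_st.st_gid
        and stat.S_IMODE(st.st_mode) == stat.S_IMODE(parent_st.st_mode)
    )
    if matches:
        return path
    try:
        os.rmdir(path)  # refuse to guess: undo the creation
    except OSError:
        pass
    return None
```

A non-root process cannot change the owner, so when the parent belongs to someone else the comparison fails and the directory is removed again rather than left behind with surprising ownership.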

refuse to create __pycache__ if I can't create it with the same ownership and permissions as the parent directory (and raise an ImportWarning).
I don't think an ImportWarning should be raised: AFAICT, we are not raising one, either, when the pyc file cannot be created (and it may well be the case that you don't have write permission at all, which would trigger lots of ImportWarnings). Regards, Martin

Oh, and by the way, there can be a race condition between __pycache__ creation and deletion (if it fails the test), where an attacker can stuff a hostile pyc file in the directory in the meantime (and the deletion then fails because the directory isn't empty). IMO, all these issues militate for putting __pycache__ creation out of the interpreter core, and in the hands of third-party package-time/install-time tools (or distutils).
On Mon, 22 Mar 2010 14:30:12 +0000, Antoine Pitrou wrote:
__pycache__ should only be created if explicitly activated (for example by distutils when installing stuff). Otherwise, if not present, the "legacy" mechanism (writing an untagged pyc file along the py file) should be used.
Actually, __pycache__ creation doesn't have to be part of the import mechanism. It can be part of distutils instead (or whatever third-party tool such as distribute, or distro-specific packaging scripts). This would relax complexity of core Python a bit.
Regards
Antoine.

On Mon, 22 Mar 2010, Antoine Pitrou wrote:
Oh, and by the way, there can be a race condition between __pycache__ creation and deletion (if it fails the test), where an attacker can stuff a hostile pyc file in the directory in the meantime (and the deletion then fails because the directory isn't empty).
Would creating it under a different name and then renaming help with this?
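One way the create-then-rename idea might look, as a sketch only and assuming a POSIX system. create_cache_dir_via_rename() and the ".pycache-" prefix are hypothetical, and the thread does not establish whether this actually closes the race.

```python
import os
import tempfile


def create_cache_dir_via_rename(parent):
    """Create parent/__pycache__ by renaming a privately created directory.

    Returns the final path, or None if the directory could not be put in place.
    """
    final = os.path.join(parent, "__pycache__")
    try:
        # mkdtemp picks an unpredictable name and creates the directory
        # accessible only by the current user.
        staging = tempfile.mkdtemp(prefix=".pycache-", dir=parent)
    except OSError:
        return None
    try:
        # Ownership and permission checks would go here, before the
        # directory becomes visible under its real name.
        os.rename(staging, final)
    except OSError:
        # The final name is occupied (or the rename failed for another
        # reason); discard the still-empty staging directory.
        os.rmdir(staging)
        return None
    return final
```

Note that on POSIX, os.rename() silently replaces an existing empty directory, and that mkdtemp() leaves the directory with mode 0o700, so any wider permissions would still have to be set deliberately before the rename.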
IMO, all these issues militate for putting __pycache__ creation out of the interpreter core, and in the hands of third-party package-time/ install-time tools (or distutils).
Speaking only for myself, but really for anybody who likes tidy source directories, I hope some version of the __pycache__ proposal becomes part of standard Python, by which I ideally mean it's enabled by default but if that is just not a good idea then at most it should be required to set a command-line option to get this feature. If I just want to write some .py code and run it, I don't see why my directories need to clutter up with .pyc files. I've previously suggested a Python version of javac's -d ("destination directory") option, but putting all the .pyc's in a __pycache__ directory per source directory is good enough to make me happy (and is Pythonically simple, in my opinion). Isaac Morland CSCF Web Guru DC 2554C, x36650 WWW Software Specialist

Isaac Morland <ijmorlan <at> uwaterloo.ca> writes:
IMO, all these issues militate for putting __pycache__ creation out of the interpreter core, and in the hands of third-party package-time/ install-time tools (or distutils).
Speaking only for myself, but really for anybody who likes tidy source directories, I hope some version of the __pycache__ proposal becomes part of standard Python, by which I ideally mean it's enabled by default but if that is just not a good idea then at most it should be required to set a command-line option to get this feature.
This doesn't contradict my proposal.
What I am proposing is that the creation of __pycache__ /directories/ be put outside of the core. It can be part of distutils, or of a separate module, or delegated to third-party tools. It could even be as simple as "python -m compileall --pycache", if someone implements it.
Creation of the __pycache__ /contents/ (files inside the directory) would still be part of core Python, but only if the directory exists and is writable by the current process.
Regards
Antoine.

Antoine Pitrou wrote:
Isaac Morland <ijmorlan <at> uwaterloo.ca> writes:
IMO, all these issues militate for putting __pycache__ creation out of the interpreter core, and in the hands of third-party package-time/ install-time tools (or distutils). Speaking only for myself, but really for anybody who likes tidy source directories, I hope some version of the __pycache__ proposal becomes part of standard Python, by which I ideally mean it's enabled by default but if that is just not a good idea then at most it should be required to set a command-line option to get this feature.
This doesn't contradict my proposal.
What I am proposing is that the creation of __pycache__ /directories/ be put outside of the core. It can be part of distutils, or of a separate module, or delegated to third-party tools. It could even be as simple as "python -m compileall --pycache", if someone implements it.
Creation of the __pycache__ /contents/ (files inside the directory) would still be part of core Python, but only if the directory exists and is writable by the current process.
+1 If I understand correctly, we would have the current mode as the default, and can trigger __pycache__ behavior simply by manually creating a __pycache__ directory and deleting any byte-code files in the module/program directory. I like this, it is easy to understand and can be used without messing with flags or environment variables. Ron

On Mar 22, 2010, at 02:02 PM, Ron Adam wrote:
If I understand correctly, we would have the current mode as the default, and can trigger __pycache__ behavior simply by manually creating a __pycache__ directory and deleting any byte-code files in the module/program directory.
I like this, it is easy to understand and can be used without messing with flags or environment variables.
Well, for a package with subpackages, it gets more complicated. Definitely not something you're likely to do manually. Antoine's suggestion of 'python -m compileall --pycache' would work, but I think it's also obscure enough that most Python users won't get the benefit. -Barry

On Mon, Mar 22, 2010 at 12:20 PM, Barry Warsaw <barry@python.org> wrote:
On Mar 22, 2010, at 02:02 PM, Ron Adam wrote:
If I understand correctly, we would have the current mode as the default, and can trigger __pycache__ behavior simply by manually creating a __pycache__ directory and deleting any byte-code files in the module/program directory.
Huh? Last time I looked weren't we going to make __pycache__ the default (and eventually only) behavior?
I like this, it is easy to understand and can be used without messing with flags or environment variables.
Well, for a package with subpackages, it gets more complicated. Definitely not something you're likely to do manually. Antoine's suggestion of 'python -m compileall --pycache' would work, but I think it's also obscure enough that most Python users won't get the benefit.
I see only two reasonable solutions for __pycache__ creation -- either we change all setup/install scripts (both for core Python and for 3rd party packages) to always create a __pycache__ subdirectory for every directory (including package directories) installed; or we somehow create it the first time it's needed. But creating it as needed runs into at least similar problems with ownership as creating .pyc files when first needed (if the parent directory is root-owned a mere mortal can't create it at all). So even apart from the security issue (which I haven't thought about deeply) I think precreation should at least be an easily accessible option both for the core (where it can be done by compileall) and for 3rd party packages (where I guess it's up to distutils or whatever install mechanism is used). -- --Guido van Rossum (python.org/~guido)

Guido van Rossum wrote:
On Mon, Mar 22, 2010 at 12:20 PM, Barry Warsaw <barry@python.org> wrote:
On Mar 22, 2010, at 02:02 PM, Ron Adam wrote:
If I understand correctly, we would have the current mode as the default, and can trigger __pycache__ behavior simply by manually creating a __pycache__ directory and deleting any byte-code files in the module/program directory.
Huh? Last time I looked weren't we going to make __pycache__ the default (and eventually only) behavior?
I expect that the __pycache__ directories would quickly become the recommended "de facto default" for writing and preinitializing modules and packages, with the current behavior of having bytecode in the same directory as the .py files only as the fall-back (what I meant by default) behavior when the __pycache__ directories do not exist.
I see only two reasonable solutions for __pycache__ creation -- either we change all setup/install scripts (both for core Python and for 3rd party packages) to always create a __pycache__ subdirectory for every directory (including package directories) installed; or we somehow create it the first time it's needed.
But creating it as needed runs into at least similar problems with ownership as creating .pyc files when first needed (if the parent directory is root-owned a mere mortal can't create it at all).
So even apart from the security issue (which I haven't thought about deeply) I think precreation should at least be an easily accessible option both for the core (where it can be done by compileall) and for 3rd party packages (where I guess it's up to distutils or whatever install mechanism is used).
Yes, I think that is what Antoine was also getting at. Is there a need for python to use __pycache__ directories 100% of the time? For 2.x it seems like being flexible would be best, and if 3.x is going to be strict about it, it should be strict sooner than later rather than have a lot of 3rd party packages break at some point down the road. Ron

Is there a need for python to use __pycache__ directories 100% of the time? For 2.x it seems like being flexible would be best, and if 3.x is going to be strict about it, it should be strict sooner than later rather than have a lot of 3rd party packages break at some point down the road.
For 2.x, nothing will happen, anyway (except for Linux distributions perhaps integrating a patch on their own): 2.7b1 is about to be released, after which point 2.x will not see any new features. Regards, Martin

On Tue, 23 Mar 2010 07:38:42 am Guido van Rossum wrote:
But creating it as needed runs into at least similar problems with ownership as creating .pyc files when first needed (if the parent directory is root-owned a mere mortal can't create it at all).
Isn't that a feature though? I don't see why this is a problem, perhaps somebody can explain what I'm missing.
Current system: If the user doesn't have write permission in the appropriate directory, no .pyc file is created and the import uses the .py file only.
New system (proposal): If the user doesn't have write permission in the appropriate directory, no __pycache__ folder is created and the import uses the .py file only.
If the user has write permission and the __pycache__ folder is created, but the umask is screwy and no .pyc files can be created, no .pyc file is created and the import uses the .py file only despite the existence of an empty __pycache__ folder.
Why bother to delete the unwritable __pycache__ directory? If you can't write to it, neither can any process with the same or fewer permissions. If some other process has more permissions, it could write something nasty to the directory, but surely it's not Python's responsibility to be secure in the face of a compromised root or admin account?
As I see it, the only question is, should we warn the user that we can't write to the newly-created __pycache__ directory? I'm +0 on that.
To be clear:
* Can't create __pycache__? That's a feature. Fail silently and fall back to using the .py file alone.
* Existing __pycache__ directory, but can't write to it? That's a feature too, treat it just like the above.
* No __pycache__ directory, and creating it succeeds, but then import can't write to the freshly created directory? That's just weird, so I'm +0 on raising a warning and -1 on raising an exception.
--
Steven D'Aprano
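The three cases above can be sketched as a single fall-back routine. This is an illustration under assumptions, not the import machinery itself: cache_bytecode() is a hypothetical name, and the ImportWarning reflects only the optional "+0" suggestion.

```python
import os
import warnings


def cache_bytecode(directory, filename, data):
    """Try to store data in directory/__pycache__/filename; never raise.

    Returns True if the file was written, False if the caller should simply
    run from the .py source.
    """
    cache_dir = os.path.join(directory, "__pycache__")
    created = False
    if not os.path.isdir(cache_dir):
        try:
            os.mkdir(cache_dir)
            created = True
        except OSError:
            return False  # case 1: cannot create __pycache__, fail silently
    try:
        with open(os.path.join(cache_dir, filename), "wb") as f:
            f.write(data)
    except OSError:
        if created:
            # case 3: freshly created yet unwritable (e.g. an odd umask)
            warnings.warn("cannot write to newly created " + cache_dir,
                          ImportWarning)
        return False  # case 2 lands here too, without a warning
    return True
```

A plain file named __pycache__ takes the first branch: os.mkdir() fails, and the import proceeds from source.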

Steven D'Aprano wrote:
If the user has write permission and the __pycache__ folder is created, but the umask is screwy and no .pyc files can be created, no .pyc file is created and the import uses the .py file only despite the existence of an empty __pycache__ folder.
Sounds okay to me. -- Greg

On Mar 22, 2010, at 12:38 PM, Guido van Rossum wrote:
Huh? Last time I looked weren't we going to make __pycache__ the default (and eventually only) behavior?
We definitely agreed it would be the default in Python 3.2. My recollection is that we agreed it would be the only on-demand way of writing pyc files, but that Python would read a lone .pyc file where the source would be if the source is missing, and that py_compile/compileall would support optional creation of those lone .pyc files.
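For readers on later Python 3 releases, the resulting layout can be inspected with importlib.util.cache_from_source(), which was added after this thread (Python 3.4). The sketch below assumes CPython with no sys.pycache_prefix set; describe_cache_path() is a hypothetical helper.

```python
import importlib.util
import os
import sys


def describe_cache_path(source_path):
    """Return (cache directory, cache file name) for a .py source path."""
    # optimization='' asks for the plain, non-optimized bytecode file name.
    cached = importlib.util.cache_from_source(source_path, optimization="")
    return os.path.dirname(cached), os.path.basename(cached)


if __name__ == "__main__":
    directory, name = describe_cache_path(os.path.join("pkg", "mod.py"))
    print(directory)  # the __pycache__ directory next to the source
    print(name)       # module name, interpreter tag, then .pyc
    print(sys.implementation.cache_tag)
```

The tag in the file name is what lets several interpreter versions share one __pycache__ directory without overwriting each other's files.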
I see only two reasonable solutions for __pycache__ creation -- either we change all setup/install scripts (both for core Python and for 3rd party packages) to always create a __pycache__ subdirectory for every directory (including package directories) installed; or we somehow create it the first time it's needed.
But creating it as needed runs into at least similar problems with ownership as creating .pyc files when first needed (if the parent directory is root-owned a mere mortal can't create it at all). So even apart from the security issue (which I haven't thought about deeply) I think precreation should at least be an easily accessible option both for the core (where it can be done by compileall) and for 3rd party packages (where I guess it's up to distutils or whatever install mechanism is used).
So you're +1 on Tough Luck? I think that's the best answer. You will be able to arrange pre-creation through compileall with the right layout and presumably the right umask if you want. -Barry

Barry Warsaw wrote:
On Mar 22, 2010, at 12:38 PM, Guido van Rossum wrote:
Huh? Last time I looked weren't we going to make __pycache__ the default (and eventually only) behavior?
We definitely agreed it would be the default in Python 3.2.
My recollection is that we agreed it would be the only on-demand way of writing pyc files, but that Python would read a lone .pyc file where the source would be if the source is missing, and that py_compile/compileall would support optional creation of those lone .pyc files.
Yep, that's my recollection as well. I don't recall seeing an update to state that clearly in the PEP go by on the checkins list though :)
Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Barry Warsaw wrote:
On Mar 24, 2010, at 10:04 PM, Nick Coghlan wrote:
Yep, that's my recollection as well. I don't recall seeing an update to state that clearly in the PEP go by on the checkins list though :)
Check again <wink>.
Ah yes, the recollection of seeing such a message is now quite fresh in my mind :)
Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Barry Warsaw writes:
On Mar 24, 2010, at 11:06 PM, Nick Coghlan wrote:
Ah yes, the recollection of seeing such a message is now quite fresh in my mind :)
Just don't tell Guido I borrowed his time machine keys!
Wouldn't that be preferable to revealing you've learned to hotwire it?

Barry Warsaw wrote:
On Mar 22, 2010, at 02:02 PM, Ron Adam wrote:
If I understand correctly, we would have the current mode as the default, and can trigger __pycache__ behavior simply by manually creating a __pycache__ directory and deleting any byte-code files in the module/program directory.
I like this, it is easy to understand and can be used without messing with flags or environment variables.
Well, for a package with subpackages, it gets more complicated. Definitely not something you're likely to do manually. Antoine's suggestion of 'python -m compileall --pycache' would work, but I think it's also obscure enough that most Python users won't get the benefit.
May be a bit more complicated, but it should be easy to write tools to handle the repetitive stuff.
When I'm writing python projects I usually only work on one or two packages at a time at most. Creating a couple of directories when I first get started on a project is nothing compared with the other 10 or more files of 1,000 lines of code each. This has a very low mental hurdle too.
All the other packages I'm importing would probably already be preinstalled complete with __pycache__ directories and bytecode files. As Antoine was suggesting, it would be up to the installer scripts to create the __pycache__ directories at install time, either directly or possibly by issuing a 'python -m compileall --pycache' command?
Ron

On Mar 22, 2010, at 06:15 PM, Antoine Pitrou wrote:
What I am proposing is that the creation of __pycache__ /directories/ be put outside of the core. It can be part of distutils, or of a separate module, or delegated to third-party tools. It could even be as simple as "python -m compileall --pycache", if someone implements it.
Except that most people won't do this, they'll just run Python. I wouldn't count this as "enabled by default". -Barry

Barry Warsaw wrote:
On Mar 22, 2010, at 06:15 PM, Antoine Pitrou wrote:
What I am proposing is that the creation of __pycache__ /directories/ be put outside of the core. It can be part of distutils, or of a separate module, or delegated to third-party tools. It could even be as simple as "python -m compileall --pycache", if someone implements it.
Except that most people won't do this, they'll just run Python. I wouldn't count this as "enabled by default".
Therefore, I'm in favor of having it on by default. If certain use cases make it problematic (e.g. Apache creating directories which you then cannot delete), there should be a way to turn it *off*. Perhaps the existing machinery to turn off byte code generation at all might be sufficient. As for the original question (funny umasks), I think my proposal is "tough luck". Don't try to be super-smart; as Antoine explains, it gets worse, not better. If the user has arranged that Python will create unusable directories, the user had better change his setup. Regards, Martin
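The "existing machinery" presumably refers to the -B command-line option, the PYTHONDONTWRITEBYTECODE environment variable, and the sys.dont_write_bytecode flag they set. A minimal sketch of toggling that flag from code follows; set_bytecode_writing() is a hypothetical helper name.

```python
import sys


def set_bytecode_writing(enabled):
    """Enable or disable bytecode caching for later imports.

    Returns the previous setting so that the caller can restore it.
    """
    previously_enabled = not sys.dont_write_bytecode
    sys.dont_write_bytecode = not enabled
    return previously_enabled


if __name__ == "__main__":
    old = set_bytecode_writing(False)
    # Imports performed from here on do not write .pyc files.
    set_bytecode_writing(old)
```

The flag affects only imports that happen after it is set, so a deployment wanting no cache directories at all would pass -B or set the environment variable before the interpreter starts.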

Martin v. Löwis <martin <at> v.loewis.de> writes:
If certain use cases make it problematic (e.g. Apache creating directories which you then cannot delete), there should be a way to turn it *off*. Perhaps the existing machinery to turn off byte code generation at all might be sufficient.
Except that you notice the problem when it happens, so turning it off is always too late.

Antoine Pitrou wrote:
Martin v. Löwis <martin <at> v.loewis.de> writes:
If certain use cases make it problematic (e.g. Apache creating directories which you then cannot delete), there should be a way to turn it *off*. Perhaps the existing machinery to turn off byte code generation at all might be sufficient.
Except that you notice the problem when it happens, so turning it off is always too late.
Why is it too late? Fix it, and get on. Regards, Martin

Martin v. Löwis <martin <at> v.loewis.de> writes:
Antoine Pitrou wrote:
Martin v. Löwis <martin <at> v.loewis.de> writes:
If certain use cases make it problematic (e.g. Apache creating directories which you then cannot delete), there should be a way to turn it *off*. Perhaps the existing machinery to turn off byte code generation at all might be sufficient.
Except that you notice the problem when it happens, so turning it off is always too late.
Why is it too late? Fix it, and get on.
Sure, but it is annoying, and since it's the kind of thing that no one (including sysadmins) ever thinks about in advance, it's bound to repeat itself quite often. It's especially annoying, of course, if you have to ask someone else to remove the directories for you (or if you have to write custom code and get it executed by the Apache or WSGI handler...). Really, it's unfriendly to users and it's certainly not outweighed by the "benefit" of having "cleaner" source directories.

Why is it too late? Fix it, and get on.
Sure, but it is annoying, and since it's the kind of thing that no one (including sysadmins) ever thinks about in advance, it's bound to repeat itself quite often.
It's especially annoying, of course, if you have to ask someone else to remove the directories for you (or if you have to write custom code and get it executed by the Apache or WSGI handler...).
Really, it's unfriendly to users and it's certainly not outweighed by the "benefit" of having "cleaner" source directories.
Whether it is outweighed would also depend on how likely and frequent the presumed problem is, no? If Apache creates a folder for me that I cannot remove, most likely, there was a configuration error in the first place: common practice says that you should execute user code under user permissions, not as www-data. If your code does get run as Apache, this also opens a way of not asking for help: just put "os.system('chmod +w /.../__pycache__')" into your code, and have Apache run it again. So I don't think this is any more unfriendly than creating .pyc files in the first place, and the advantages of uniformity of this new approach certainly outweigh the disadvantages. Regards, Martin

On Monday, 22 March 2010 at 23:18 +0100, "Martin v. Löwis" wrote:
If Apache creates a folder for me that I cannot remove, most likely, there was a configuration error in the first place: common practice says that you should execute user code under user permissions, not as www-data.
I'm sure there can be reasons not to do so. One of them is mass hosting of Python apps: you don't want to create a separate process per Python user, and therefore you can't run the code under the "right" user.
If your code does get run as Apache, this also opens a way of not asking for help: just put "os.system('chmod +w /.../__pycache__')" into your code, and have Apache run it again.
This is what I meant by "write custom code and get it executed by the Apache or WSGI handler". Having the Web server execute ad hoc system administration code is far from elegant and user-friendly.
So I don't think this is any more unfriendly than creating .pyc files in the first place, and the advantages of uniformity of this new approach certainly outweigh the disadvantages.
I don't see the "advantages of uniformity". What we had initially was a way to solve problems which are specific to Ubuntu and Debian packagers, and which was only to be activated by those people on those systems. We now end up with an alleged complete solution to a problem which doesn't seem to exist, or is at least vastly overblown (the idea that having pyc files along their source counterparts is a nuisance doesn't seem to be a common grievance against Python). I would really recommend reexamining it, rather than falling for the shiny new thing. Regards Antoine.

We now end up with an alleged complete solution to a problem which doesn't seem to exist, or is at least vastly overblown (the idea that having pyc files along their source counterparts is a nuisance doesn't seem to be a common grievance against Python).
I would really recommend reexamining it, rather than falling for the shiny new thing.
I think the appropriate action at this point is to record this specific objection in the PEP. Regards, Martin

Antoine Pitrou wrote:
Having the Web server execute ad hoc system administration code is far from elegant and user-friendly.
With the right piece of code, you could create yourself a setuid-apache shell and solve this problem once and for all. :-) -- Greg

Greg Ewing <greg.ewing <at> canterbury.ac.nz> writes:
Antoine Pitrou wrote:
Having the Web server execute ad hoc system administration code is far from elegant and user-friendly.
With the right piece of code, you could create yourself a setuid-apache shell and solve this problem once and for all.
Well, if I can create a setuid apache shell, I can probably su as root or apache as well. ("su -c rm -r whatever") Or are you talking about a Web-based shell?

Antoine Pitrou wrote:
Well, if I can create a setuid apache shell, I can probably su as root or apache as well. ("su -c rm -r whatever")
Or are you talking about a Web-based shell?
I'm just saying that if there is any way of running code of your choice as the apache user, you can get it to make a copy of /bin/sh and suid it. Of course, if you have permission to su apache, then this is not necessary. But then you wouldn't have to go through web server contortions to fix apache-generated botchups either. -- Greg

On Mar 22, 2010, at 09:57 PM, Antoine Pitrou wrote:
It's especially annoying, of course, if you have to ask someone else to remove the directories for you (or if you have to write custom code and get it executed by the Apache or WSGI handler...).
compileall should probably grow a --clean option which would be essentially equivalent to
    find . -name '__pycache__' | xargs rmdir
-Barry
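A rough Python counterpart to such a clean-up, as a sketch only: clean_pycache() is a hypothetical function, not an existing compileall feature. Unlike rmdir, which removes only empty directories, it deletes the cached files as well.

```python
import os
import shutil


def clean_pycache(root):
    """Remove every __pycache__ directory below root; return the removed paths."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "__pycache__" in dirnames:
            target = os.path.join(dirpath, "__pycache__")
            shutil.rmtree(target, ignore_errors=True)
            dirnames.remove("__pycache__")  # do not descend into it
            if not os.path.exists(target):
                removed.append(target)
    return removed


if __name__ == "__main__":
    for path in clean_pycache("."):
        print("removed", path)
```

Directories the current user is not allowed to delete (the Apache case discussed above) are skipped rather than reported as removed.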

On Mar 22, 2010, at 09:47 PM, Martin v. Löwis wrote:
Therefore, I'm in favor of having it on by default. If certain use cases make it problematic (e.g. Apache creating directories which you then cannot delete), there should be a way to turn it *off*. Perhaps the existing machinery to turn off byte code generation at all might be sufficient.
That's what I'm thinking. Of course it will always be possible to run compileall and get the __pycache__ directories pre-created, presumably with the right umask.
As for the original question (funny umasks), I think my proposal is "tough luck". Don't try to be super-smart; as Antoine explains, it gets worse, not better. If the user has arranged that Python will create unusable directories, the user had better change his setup.
I completely agree; +1 -Barry

On Mon, 22 Mar 2010 18:15:01 -0000, Antoine Pitrou <solipsis@pitrou.net> wrote:
Isaac Morland <ijmorlan <at> uwaterloo.ca> writes:
IMO, all these issues militate for putting __pycache__ creation out of the interpreter core, and in the hands of third-party package-time/ install-time tools (or distutils).
Speaking only for myself, but really for anybody who likes tidy source directories, I hope some version of the __pycache__ proposal becomes part of standard Python, by which I ideally mean it's enabled by default but if that is just not a good idea then at most it should be required to set a command-line option to get this feature.
This doesn't contradict my proposal.
What I am proposing is that the creation of __pycache__ /directories/ be put outside of the core. It can be part of distutils, or of a separate module, or delegated to third-party tools. It could even be as simple as "python -m compileall --pycache", if someone implements it.
Or even as simple as 'mkdir __pycache__', if you are working in your own library and don't want .pyc clutter. -- R. David Murray www.bitdance.com

On 3/22/2010 2:15 PM, Antoine Pitrou wrote:
What I am proposing is that the creation of __pycache__ /directories/ be put outside of the core. It can be part of distutils, or of a separate module, or delegated to third-party tools. It could even be as simple as "python -m compileall --pycache", if someone implements it.
Creation of the __pycache__ /contents/ (files inside the directory) would still be part of core Python, but only if the directory exists and is writable by the current process.
-1 If, as I have done several times recently, I create a directory and insert an empty __init__.py and several real module.py files, I want the .pycs to go into __pycache__ *automatically*, by default, without me also having to remember to create an empty __pycache__ *directory*, *each time*. Ugh. Terry Jan Reedy

Terry Reedy wrote:
On 3/22/2010 2:15 PM, Antoine Pitrou wrote:
What I am proposing is that the creation of __pycache__ /directories/ be put outside of the core. It can be part of distutils, or of a separate module, or delegated to third-party tools. It could even be as simple as "python -m compileall --pycache", if someone implements it.
Creation of the __pycache__ /contents/ (files inside the directory) would still be part of core Python, but only if the directory exists and is writable by the current process.
-1
If, as I have done several times recently, I create a directory and insert an empty __init__.py and several real module.py files, I want the .pycs to go into __pycache__ *automatically*, by default, without me also having to remember to create an empty __pycache__ *directory*, *each time*. Ugh.
I think I misunderstood this at first. It looks like, while developing a python 3.2+ program, if you don't create an empty __pycache__ directory, everything will still work, you just won't get the .pyc files. That can be a good thing during development because you also will not have any problems with old .pyc files hanging around if you move or rename files. The startup time may just be a tad longer, but probably not enough to be much of a problem. If it is a problem you can just create the __pycache__ directory, but nothing bad will happen if you don't. Ron

Ron Adam wrote:
I think I misunderstood this at first.
It looks like, while developing a python 3.2+ program, if you don't create an empty __pycache__ directory, everything will still work, you just won't get the .pyc files. That can be a good thing during development because you also will not have any problems with old .pyc files hanging around if you move or rename files.
The behaviour you described (not creating __pycache__ automatically) was just a suggestion in this thread.
The behaviour in the actual PEP (and what will be implemented for 3.2+) is to create __pycache__ if it is missing.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Nick Coghlan wrote:
Ron Adam wrote:
I think I misunderstood this at first.
It looks like, while developing a python 3.2+ program, if you don't create an empty __pycache__ directory, everything will still work, you just won't get the .pyc files. That can be a good thing during development because you also will not have any problems with old .pyc files hanging around if you move or rename files.
The behaviour you described (not creating __pycache__ automatically) was just a suggestion in this thread.
The behaviour in the actual PEP (and what will be implemented for 3.2+) is to create __pycache__ if it is missing.
Cheers, Nick.
OK :-) hmmmm... unless there is a __pycache__ *file* located there first. ;-) Not that I can think of any good reason to do that at this moment. Ron

Ron Adam wrote:
hmmmm... unless there is a __pycache__ *file* located there first. ;-)
Just a specific reason why attempting to create __pycache__ can fail (which has defined behaviour in the PEP - running directly from the source without caching the bytecode file).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mar 23, 2010, at 08:42 PM, Ron Adam wrote:
It looks like, while developing a python 3.2+ program, if you don't create an empty __pycache__ directory, everything will still work, you just won't get the .pyc files. That can be a good thing during development because you also will not have any problems with old .pyc files hanging around if you move or rename files.
Not quite. In PEP-3147-land, you do not need to create the empty __pycache__ directory, Python will create it for you on-demand. If you subsequently move the .py source file, the __pycache__/...pyc file will be ignored. So this is actually better than today because you can't accidentally load stale pyc files -- if they live inside __pycache__. For backward compatibility we'll still support loading lone pyc files in the source file directory (i.e. outside of __pycache__). -Barry
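As a rough illustration of the source-to-cache mapping described above: current Pythons expose it through `importlib.util.cache_from_source` and `importlib.util.source_from_cache`. That API postdates this thread, so treat it as an assumption about your interpreter version. The path `pkg/spam.py` is hypothetical, and the sketch only computes names without touching any files.

```python
import importlib.util
import os

# Hypothetical source path, used purely for illustration.
source = os.path.join("pkg", "spam.py")

# Where the import machinery would look for (and write) the cached bytecode.
cached = importlib.util.cache_from_source(source)

# The reverse mapping: which source file a cached file belongs to.
round_trip = importlib.util.source_from_cache(cached)

print(cached)
print(round_trip)
```

Because the cached name is derived from the source location, a .pyc left behind in __pycache__ after its .py file has moved no longer corresponds to any importable source, which is why it gets ignored.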

Antoine Pitrou wrote:
Oh, and by the way, there can be a race condition between __pycache__ creation and deletion (if it fails the test)
You can check whether the directory would be created with the right user beforehand, and if not, don't create one at all. To exploit a race condition there, the attacker would have to be capable of either changing the owner of the parent directory or removing it and replacing it with a different one, and if he can do that, he can do whatever he wants anyway. -- Greg
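A minimal sketch of the "check beforehand" idea, assuming a POSIX system and assuming that "the right user" means the owner of the parent directory. The function names here are made up for illustration and are not part of any proposed implementation.

```python
import os
import tempfile


def parent_owned_by_current_user(directory):
    """Return True if `directory` is owned by this process's effective user."""
    return os.stat(directory).st_uid == os.geteuid()


def maybe_create_pycache(directory):
    """Create directory/__pycache__ only if its owner would match the parent's.

    Returns the cache path, or None if creation was skipped.
    """
    if not parent_owned_by_current_user(directory):
        return None  # refuse to guess: do not create a mismatched directory
    cache_dir = os.path.join(directory, "__pycache__")
    os.makedirs(cache_dir, exist_ok=True)
    return cache_dir


# Demo in a scratch directory that the current user owns.
with tempfile.TemporaryDirectory() as root:
    created = maybe_create_pycache(root)
    created_is_dir = created is not None and os.path.isdir(created)
```

The check and the creation are still two separate steps, but as noted above, exploiting that gap requires being able to change or replace the parent directory.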

On Mar 22, 2010, at 02:30 PM, Antoine Pitrou wrote:
Barry Warsaw <barry <at> python.org> writes:
* Tough luck * Force the umask so that the directory is writable, but then the question is, by whom? ugo+w or something less? * Copy the permissions from the parent directory and ignore umask * Raise an exception or refuse to create __pycache__ if it's not writable (again, by whom?)
Welcome to a problem PHP has known for years.
Yay.
The problem with solution 3 is that a non-root user can't change the owner of a directory, therefore copying the permissions doesn't ensure the directory will be writable by the right users.
Solution 2 (chmod 777) is obviously too lax. Anyone could put fake bytecode files in a root-owned application. Less lax chmods have the same problem as solution 3, and can still be insecure.
I think solution 4 would be the best compromise. In the face of ambiguity, refuse to guess. But, rather than "refuse to create __pycache__ if it's not writable", the test should be "refuse to create __pycache__ if I can't create it with the same ownership and permissions as the parent directory" (and raise an ImportWarning).
I actually think "tough luck" might be the right answer and that it's not really all that serious. Certainly not serious enough not to enable this by default.
When Python is being installed, either by a from-source 'make install' or by the distro packager, then you'd expect the umask not to be insane. In the latter case, it's a bug and in the former case you screwed up so you should be able to delete and reinstall, or at the very least execute the same `find` command that Python's own Makefile calls (in my branch) for 'make clean'.
When you're installing packages, again, I would expect that the system installer, or you via `easy_install` or whatever, would not have an insane umask. As with above, if you *did* have a bad umask, you could certainly also have a bad umask whenever you ran the third party tool as when you ran Python.
You could also have a bad umask when you're doing local development, but that should be easy enough to fix, and besides, I think you'd have other problems if that were the case. One possibility would be to instrument compileall to remove __pycache__ directories with a flag.
In light of this issue, I'm -0.5 on __pycache__ becoming the default caching mechanism. The directory ownership/permissions issue is too much of a mess, especially for Web applications (think __pycache__ files created by the Apache user).
__pycache__ should only be created if explicitly activated (for example by distutils when installing stuff). Otherwise, if not present, the "legacy" mechanism (writing an untagged pyc file along the py file) should be used.
I don't particularly like this much, because most people won't get the benefit of it in a local dev environment. While not the primary use case, it's a useful enough side benefit, and unless Guido changes his mind, we want to enable this by default.
Actually, __pycache__ creation doesn't have to be part of the import mechanism. It can be part of distutils instead (or whatever third-party tool such as distribute, or distro-specific packaging scripts). This would relax complexity of core Python a bit.
It's not really that complicated. :) -Barry
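For the cleanup idea mentioned in this message (removing stray __pycache__ directories after a bad-umask run), a small sketch in Python might look like the following. The helper name `remove_pycache_dirs` is hypothetical; it is not the Makefile's `find` command and not an existing compileall flag.

```python
import os
import shutil
import tempfile


def remove_pycache_dirs(root):
    """Try to delete every __pycache__ directory below `root`.

    Returns the list of paths for which removal was attempted.
    """
    attempted = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "__pycache__" in dirnames:
            target = os.path.join(dirpath, "__pycache__")
            shutil.rmtree(target, ignore_errors=True)
            dirnames.remove("__pycache__")  # do not descend into it
            attempted.append(target)
    return attempted


# Demo: build a tiny tree with one cache directory, then clean it.
with tempfile.TemporaryDirectory() as root:
    cache_dir = os.path.join(root, "pkg", "__pycache__")
    os.makedirs(cache_dir)
    with open(os.path.join(cache_dir, "spam.pyc"), "wb") as f:
        f.write(b"not real bytecode")
    attempted = remove_pycache_dirs(root)
    still_there = os.path.exists(cache_dir)
```

This only helps when the user running it may delete the directories, which is exactly the ownership question being debated in the thread.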

Barry Warsaw <barry <at> python.org> writes:
When Python is being installed, either by a from-source 'make install' or by the distro packager, then you'd expect the umask not to be insane. In the latter case, it's a bug and in the former case you screwed up so you should be able to delete and reinstall, or at the very least execute the same `find` command that Python's own Makefile calls (in my branch) for 'make clean'.
When you're installing packages, again, I would expect that the system installer, or you via `easy_install` or whatever, would not have an insane umask.
Well, precisely. That's why I suggest that creating the __pycache__ directories be done *at install time* (or packaging time), and not via the core import machinery (that is, not at import time). That is, when you *know* you are the right user, with the right umask. Which also means that it can and probably should be pulled out of the core (and written in pure Python, which is also much more practical to maintain, test and debug).
I don't particularly like this much, because most people won't get the benefit of it in a local dev environment.
AFAICT it has no benefit in a local dev environment (apart from supposedly "cleaner" directories, which is not something anyone seems to consider a problem with the current CPythons). And again, this could be part of the distutils standard "setup.py install" (or "setup.py develop" with distutils2/distribute) if people care :) Regards Antoine.

Antoine Pitrou <solipsis@pitrou.net> writes:
Barry Warsaw <barry <at> python.org> writes:
When Python is being installed, either by a from-source 'make install' or by the distro packager, then you'd expect the umask not to be insane. In the latter case, it's a bug and in the former case you screwed up so you should be able to delete and reinstall, or at the very least execute the same `find` command that Python's own Makefile calls (in my branch) for 'make clean'.
When you're installing packages, again, I would expect that the system installer, or you via `easy_install` or whatever, would not have an insane umask.
Well, precisely. That's why I suggest that creating the __pycache__ directories be done *at install time* (or packaging time), and not via the core import machinery (that is, not at import time). That is, when you *know* you are the right user, with the right umask.
+1.
Taking advantage of caching directories that have already been set up correctly in advance at install time is friendly. Littering the runtime directory with new subdirectories by default is not so friendly.
Perhaps also of note is that the FHS recommends systems use ‘/var/cache/foo/’ for cached data from applications:
/var/cache : Application cache data
Purpose
/var/cache is intended for cached data from applications. Such data is locally generated as a result of time-consuming I/O or calculation. The application must be able to regenerate or restore the data. Unlike /var/spool, the cached files can be deleted without data loss. The data must remain valid between invocations of the application and rebooting the system.
<URL:http://www.debian.org/doc/packaging-manuals/fhs/fhs-2.3.html#VARCACHEAPPLICA...>
This would suggest that Python could start using ‘/var/cache/python/’ for its cached bytecode tree on systems that implement the FHS.
-- Ben Finney

On 23.03.2010 02:28, Ben Finney wrote:
Antoine Pitrou<solipsis@pitrou.net> writes:
Barry Warsaw<barry<at> python.org> writes:
When Python is being installed, either by a from-source 'make install' or by the distro packager, then you'd expect the umask not to be insane. In the latter case, it's a bug and in the former case you screwed up so you should be able to delete and reinstall, or at the very least execute the same `find` command that Python's own Makefile calls (in my branch) for 'make clean'.
When you're installing packages, again, I would expect that the system installer, or you via `easy_install` or whatever, would not have an insane umask.
Well, precisely. That's why I suggest that creating the __pycache__ directories be done *at install time* (or packaging time), and not via the core import machinery (that is, not at import time). That is, when you *know* you are the right user, with the right umask.
+1.
Taking advantage of caching directories that have already been set up correctly in advance at install time is friendly. Littering the runtime directory with new subdirectories by default is not so friendly.
Perhaps also of note is that the FHS recommends systems use ‘/var/cache/foo/’ for cached data from applications:
/var/cache : Application cache data
Purpose
/var/cache is intended for cached data from applications. Such data is locally generated as a result of time-consuming I/O or calculation. The application must be able to regenerate or restore the data. Unlike /var/spool, the cached files can be deleted without data loss. The data must remain valid between invocations of the application and rebooting the system.
<URL:http://www.debian.org/doc/packaging-manuals/fhs/fhs-2.3.html#VARCACHEAPPLICA...>
This would suggest that Python could start using ‘/var/cache/python/’ for its cached bytecode tree on systems that implement the FHS.
it reads *data*, not code.

Matthias Klose <doko@ubuntu.com> writes:
On 23.03.2010 02:28, Ben Finney wrote:
Perhaps also of note is that the FHS recommends systems use ‘/var/cache/foo/’ for cached data from applications:
/var/cache : Application cache data
Purpose
/var/cache is intended for cached data from applications. Such data is locally generated as a result of time-consuming I/O or calculation. The application must be able to regenerate or restore the data. Unlike /var/spool, the cached files can be deleted without data loss. The data must remain valid between invocations of the application and rebooting the system.
<URL:http://www.debian.org/doc/packaging-manuals/fhs/fhs-2.3.html#VARCACHEAPPLICA...>
This would suggest that Python could start using ‘/var/cache/python/’ for its cached bytecode tree on systems that implement the FHS.
it reads *data*, not code.
So what? There's no implication that data-which-happens-to-be-code is unsuitable for storage in ‘/var/cache/foo/’. Easily-regenerated Python byte code for caching meets the description quite well, AFAICT.
-- Ben Finney

On Wed, 24 Mar 2010 12:35:43 am Ben Finney wrote:
Matthias Klose <doko@ubuntu.com> writes:
On 23.03.2010 02:28, Ben Finney wrote:
Perhaps also of note is that the FHS recommends systems use ‘/var/cache/foo/’ for cached data from applications:
/var/cache : Application cache data
Purpose
/var/cache is intended for cached data from applications. Such data is locally generated as a result of time-consuming I/O or calculation. The application must be able to regenerate or restore the data. Unlike /var/spool, the cached files can be deleted without data loss. The data must remain valid between invocations of the application and rebooting the system.
<URL:http://www.debian.org/doc/packaging-manuals/fhs/fhs-2.3.html#VARCACHEAPPLICATIONCACHEDATA>
This would suggest that Python could start using ‘/var/cache/python/’ for its cached bytecode tree on systems that implement the FHS.
it reads *data*, not code.
So what? There's no implication that data-which-happens-to-be-code is unsuitable for storage in ‘/var/cache/foo/’. Easily-regenerated Python byte code for caching meets the description quite well, AFAICT.
While I strongly approve of the concept of a central cache directory for many things, I don't think that .pyc files fit the bill.
Since there is no privileged python user that runs all Python code, and since any unprivileged user needs to be able to write .pyc files, the cache directory needs to be world-writable. But since the data being stored is code, that opens up a fairly nasty security hole: user fred could overwrite the cached .pyc files used by user barney and cause barney to run any arbitrary code fred likes.
The alternative would be to have individual caches for every user. Apart from being wasteful of disk space ("but who cares, bigger disks are cheap") that just complicates everything. You would need:
    /var/cache/python/<user>/
which would essentially make it impossible to ship pre-compiled .pyc files, since the packaging system couldn't predict what usernames (note plural) to store them under.
It is not true that one can necessarily delete the cached files without data loss. Python still supports .pyc-only packages, and AFAIK there are no plans to stop that, so deleting the .pyc file may delete the module.
-- Steven D'Aprano

On Mar 24, 2010, at 12:35 AM, Ben Finney wrote:
So what? There's no implication that data-which-happens-to-be-code is unsuitable for storage in ‘/var/cache/foo/’. Easily-regenerated Python byte code for caching meets the description quite well, AFAICT.
pyc files don't go there now, so why would PEP 3147 change that? -Barry

On Mar 22, 2010, at 08:33 PM, Antoine Pitrou wrote:
Well, precisely. That's why I suggest that creating the __pycache__ directories be done *at install time* (or packaging time), and not via the core import machinery (that is, not at import time). That is, when you *know* you are the right user, with the right umask.
I don't think they're mutually exclusive. We will definitely give users the tool to do compilation at install time via compileall. That needn't preclude on-demand creation, which will generally Just Work. -Barry
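To make the "compile at install time" half concrete, here is a small sketch using the standard library's `compileall.compile_dir`. The module `spam.py` and the scratch directory are stand-ins for an installed package. It assumes a Python with PEP 3147 behaviour (3.2 or later) and no bytecode cache prefix configured.

```python
import compileall
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Stand-in for a freshly installed module.
    with open(os.path.join(root, "spam.py"), "w") as f:
        f.write("x = 1\n")

    # Byte-compile everything under `root`, as an installer could do
    # while running as the right user with a sane umask.
    ok = compileall.compile_dir(root, quiet=1)

    # On PEP 3147 Pythons the result lands in __pycache__.
    cached = sorted(os.listdir(os.path.join(root, "__pycache__")))
    print(cached)
```

Precompiling this way does not stop the import machinery from also creating caches on demand elsewhere, which is the point made above: the two approaches are not mutually exclusive.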

Antoine Pitrou wrote:
In light of this issue, I'm -0.5 on __pycache__ becoming the default caching mechanism. The directory ownership/permissions issue is too much of a mess, especially for Web applications (think __pycache__ files created by the Apache user).
Doesn't the existing .pyc mechanism have the same problem? Seems to me it's just as insecure to allow the Apache user to create .pyc files, since an attacker could overwrite them with arbitrary bytecode. The only safe way is to pre-compile under a different user and make everything read-only to Apache. The same thing would apply under the __pycache__ regime.
Actually, __pycache__ creation doesn't have to be part of the import mechanism. It can be part of distutils instead (or whatever third-party tool
What about development, or if a user installs by dragging into site-packages instead of using an installer? I don't like the idea of being required to use an installation tool in order to get .pyc files. -- Greg

Greg Ewing <greg.ewing <at> canterbury.ac.nz> writes:
Doesn't the existing .pyc mechanism have the same problem? Seems to me it's just as insecure to allow the Apache user to create .pyc files, since an attacker could overwrite them with arbitrary bytecode.
The problem is that you can't delete the __pycache__ directory if it doesn't have the right ownership and if it's non-empty. This problem doesn't exist with a pyc file situated in a directory you own.
Actually, __pycache__ creation doesn't have to be part of the import mechanism. It can be part of distutils instead (or whatever third-party tool
What about development,
The main point of the __pycache__ proposal is to solve the needs of Ubuntu/Debian packagers. If you are developing (rather than deploying or building packages), you shouldn't have these needs AFAICT.
or if a user installs by dragging into site-packages instead of using an installer?
Well... do people actually do this? "python setup.py install" is simpler than finding the right place to drag your package to, and doing the dragging. It also gives you metadata for free. And there's less risk of screwing up.

Antoine Pitrou wrote:
or if a user installs by dragging into site-packages instead of using an installer?
Well... do people actually do this?
Yes. We do it all the time with unpackaged only-for-internal-use Python code. I wouldn't expect it to work with random packages downloaded from the 'net, but it works fine for our own stuff (since it expects to be used this way).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Antoine Pitrou wrote:
The main point of the __pycache__ proposal is to solve the needs of Ubuntu/Debian packagers. If you are developing (rather than deploying or building packages), you shouldn't have these needs AFAICT.
Maybe it's one point, but I'm not sure it's the *main* one. Personally I would benefit most from it during development. I hardly ever look in the directories of installed packages, so I don't care what they look like. -- Greg

Greg Ewing <greg.ewing <at> canterbury.ac.nz> writes:
The main point of the __pycache__ proposal is to solve the needs of Ubuntu/Debian packagers. If you are developing (rather than deploying or building packages), you shouldn't have these needs AFAICT.
Maybe it's one point, but I'm not sure it's the *main* one.
It's the only reason the PEP was originally designed, and proposed.
Personally I would benefit most from it during development.
Why? What benefit would it bring to you?
I hardly ever look in the directories of installed packages, so I don't care what they look like.
Neither do I, but Ubuntu/Debian packagers want to share the source code of third-party libraries across Python versions while keeping distinct bytecode files (obviously). Again, that's the original point of the PEP as proposed by Barry.

On Tue, 23 Mar 2010, Antoine Pitrou wrote:
Greg Ewing <greg.ewing <at> canterbury.ac.nz> writes:
The main point of the __pycache__ proposal is to solve the needs of Ubuntu/Debian packagers. If you are developing (rather than deploying or building packages), you shouldn't have these needs AFAICT.
Maybe it's one point, but I'm not sure it's the *main* one.
It's the only reason the PEP was originally designed, and proposed.
At least one additional use case has appeared. Actually, my use case was mentioned long ago, but I didn't really push (e.g. by writing a patch) and nobody jumped on it. But this PEP solves my case too, so it should not be ignored just because the immediate impetus for the PEP is another case.
Personally I would benefit most from it during development.
Why? What benefit would it bring to you?
I'm sure Greg will jump in if I'm wrong about what he is saying, but the benefit to me and to Greg and to others writing .py code is that our directories will contain *.py and __pycache__, rather than *.py and *.pyc. So it will be much easier to see what is actually there. Or if we're using SVN and we do "svn status", the only spurious result will be "? __pycache__" rather than "? X.pyc" for every X.py in the directory. Or whatever other good effects come from having less junk in our source directories. Directory tidiness is a positive general feature with at least a few specific benefits. Isaac Morland CSCF Web Guru DC 2554C, x36650 WWW Software Specialist

Le mardi 23 mars 2010 à 20:50 -0400, Isaac Morland a écrit :
I'm sure Greg will jump in if I'm wrong about what he is saying, but the benefit to me and to Greg and to others writing .py code is that our directories will contain *.py and __pycache__, rather than *.py and *.pyc. So it will be much easier to see what is actually there.
I don't really get it. I have never had any problem seeing "what is actually there" in a Python source directory, despite the presence of pyc files.
Or if we're using SVN and we do "svn status", the only spurious result will be "? __pycache__" rather than "? X.pyc" for every X.py in the directory.
Well, I was assuming that everyone had been using svn:ignore, or .hgignore, for years. Similarly, you will configure svn, hg, or any other system, to ignore __pycache__ directories (or .pyc files) so that they don't appear in "svn status".
Directory tidiness is a positive general feature with at least a few specific benefits.
It's still mostly cosmetic and I don't think it's as serious as any positive or negative system administration effects the change may have.

Isaac Morland wrote:
the benefit to me and to Greg and to others writing .py code is that our directories will contain *.py and __pycache__, rather than *.py and *.pyc. So it will be much easier to see what is actually there.
Yes. When using MacOSX I do most of my work using the Finder's column view. With two windows open one above the other, there's room for about a dozen files to be seen at once. There's no way to filter the view or sort by anything other than name, so with the current .pyc scheme, half of that valuable screen space is wasted. -- Greg

Antoine Pitrou <solipsis@pitrou.net> writes: Steven D'Aprano <steve@pearwood.info> writes:
On Wed, 24 Mar 2010 12:35:43 am Ben Finney wrote:
On 23.03.2010 02:28, Ben Finney wrote:
<URL:http://www.debian.org/doc/packaging-manuals/fhs/fhs-2.3.html#VARCACHEAPPLICATIONCACHEDATA>
This would suggest that Python could start using ‘/var/cache/python/’ for its cached bytecode tree on systems that implement the FHS. […] There's no implication that data-which-happens-to-be-code is unsuitable for storage in ‘/var/cache/foo/’. Easily-regenerated Python byte code for caching meets the description quite well, AFAICT.
While I strongly approve of the concept of a central cache directory for many things, I don't think that .pyc files fit the bill.
Since there is no privileged python user that runs all Python code, and since any unprivileged user needs to be able to write .pyc files,
Hold up; my understanding is that, as Antoine Pitrou says:
The main point of the __pycache__ proposal is to solve the needs of Ubuntu/Debian packagers. If you are developing (rather than deploying or building packages), you shouldn't have these needs AFAICT.
So, the packaging system will, by definition, have access to write to FHS directories and those directories don't need to be world-writable.
-- Ben Finney

In article <4BA80418.6030905@canterbury.ac.nz>, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Antoine Pitrou wrote:
In light of this issue, I'm -0.5 on __pycache__ becoming the default caching mechanism. The directory ownership/permissions issue is too much of a mess, especially for Web applications (think __pycache__ files created by the Apache user).
Doesn't the existing .pyc mechanism have the same problem? Seems to me it's just as insecure to allow the Apache user to create .pyc files, since an attacker could overwrite them with arbitrary bytecode.
The only safe way is to pre-compile under a different user and make everything read-only to Apache. The same thing would apply under the __pycache__ regime.
This does sound like a big security hole both in existing Python and the new __pycache__ proposed mechanism. It seems like this is the time to address it, while changing the caching mechanism.
If .pyc files are to be shared, it seems essential to (by default) generate them at install time and make them read-only for unprivileged users.
This in turn implies that we may have to give up some support for dragging python modules into site-packages, e.g. not generate .pyc files for such modules. At least if we go that route it will mostly affect power users, who can presumably cope.
-- Russell

Russell E. Owen wrote:
If .pyc files are to be shared, it seems essential to (by default) generate them at install time and make them read-only for unprivileged users.
This in turn implies that we may have to give up some support for dragging python modules into site-packages
No, I don't think so. Currently, when you install a package (by whatever means) in a directory that's not writable by the people who will be running it, there are two possibilities:
1) Precompiled .pyc files are generated at installation time, which are then read-only to users.
2) No .pyc files are installed, in which case none can or will be created by users either, since they don't have write permission to the directory.
None of this would change if __pycache__ directories were used. The only difference would be that there would be an additional mode for failure to create .pyc files, i.e. __pycache__ could be created but nothing could be written to it because of a umask issue.
If you install in a shared site-packages by dragging, you already have to be careful about setting the permissions. You'd just have to be sure to extend that diligence to any contained __pycache__ directories.
-- Greg

On Mar 23, 2010, at 12:49 PM, Russell E. Owen wrote:
If .pyc files are to be shared, it seems essential to (by default) generate them at install time and make them read-only for unprivileged users.
I think in practice this is what's almost always going to happen for system Python source, either via your distribution's installer or distutils. I think most on-demand creation of pyc files will be when you're developing your own code. I don't want to have to run a separate step to get the benefit of __pycache__, but I admit it's outside the scope of the PEP's original intention. I leave it to the BDFL. -Barry

Russell E. Owen wrote:
This in turn implies that we may have to give up some support for dragging python modules into site-packages, e.g. not generate .pyc files for such modules. At least if we go that route it will mostly affect power users, who can presumably cope.
And when someone drags a Python module into the per-user site-packages instead? [1]
Yes, a shared Python needs to be managed carefully. Systems with a shared Python should also generally have a vaguely competent sysadmin running them. An unshared Python and associated packages under PEP 3147 should work just as well as they do under the existing pyc scheme (only without the source directory clutter).
Cheers, Nick.
[1] http://www.python.org/dev/peps/pep-0370/
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Or maybe you have a better idea? What's the equivalent situation on Windows and how would things work there?
On Windows, this problem is easy: create the directory with no specification of an ACL, and it will (usually) inherit the ACL of the parent directory. IOW, the same users will have write access to __pycache__ as to the parent directory. It is possible to defeat this machinery, by setting ACL entries on the parent that explicitly don't get inherited, or that only apply to subdirectories. In this case, I would say "tough luck": this is what the users requested, so it may have a reason. Regards, Martin

On Mon, Mar 22, 2010 at 06:56, Barry Warsaw <barry@python.org> wrote:
I have a pretty good start on PEP 3147 implementation [1], but I've encountered a situation that I'd like to get some feedback on. Here's the test case illustrating the problem. From test_import.py:
    def test_writable_directory(self):
        # The umask is not conducive to creating a writable __pycache__
        # directory.
        with umask(0o222):
            __import__(TESTFN)
        self.assertTrue(os.path.exists('__pycache__'))
        self.assertTrue(os.path.exists(os.path.join(
            '__pycache__', '{}.{}.pyc'.format(TESTFN, self.tag))))
The __pycache__ directory does not exist before the import, and the import machinery creates the directory, but the umask leaves the directory unwritable by anybody. So of course when the import machinery goes to write the .pyc file inside __pycache__, it fails. This does not cause an ImportError though, just like if today the package directory were unwritable.
This might be different than today's situation though because once the unwritable __pycache__ directory is created, nothing is going to change that without explicit user interaction, and that might be difficult after the fact.
I'm not sure what the right answer is. Some possible choices:
* Tough luck * Force the umask so that the directory is writable, but then the question is, by whom? ugo+w or something less? * Copy the permissions from the parent directory and ignore umask * Raise an exception or refuse to create __pycache__ if it's not writable (again, by whom?)
I say tough luck. At the moment if you can't write a .pyc it is a silent failure; don't see the difference as significant enough to warrant changing the practice. -Brett
Or maybe you have a better idea? What's the equivalent situation on Windows and how would things work there?
-Barry
[1] https://edge.launchpad.net/~barry/python/pep3147
P.S. I'm down to only 8 unit test failures.
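The situation in the quoted test case can be reproduced outside the import system. The sketch below assumes a POSIX system; the `umask` context manager is a hand-written stand-in for the helper used in the test, not the actual one from test_import.py. It creates a directory under umask 0o222 and inspects the resulting permission bits.

```python
import contextlib
import os
import stat
import tempfile


@contextlib.contextmanager
def umask(mask):
    """Temporarily set the process umask, restoring the old value on exit."""
    old = os.umask(mask)
    try:
        yield
    finally:
        os.umask(old)


with tempfile.TemporaryDirectory() as root:
    cache_dir = os.path.join(root, "__pycache__")
    with umask(0o222):
        os.mkdir(cache_dir)  # requested mode 0o777, filtered by the umask
    mode = stat.S_IMODE(os.stat(cache_dir).st_mode)
    write_bits = mode & 0o222  # any of the user/group/other write bits

print(oct(mode))
```

With every write bit masked off, the directory exists but an ordinary user cannot create files in it, which is the silent .pyc write failure under discussion.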

On 22Mar2010 09:56, Barry Warsaw <barry@python.org> wrote:
| I have a pretty good start on PEP 3147 implementation [1], but I've
| encountered a situation that I'd like to get some feedback on. Here's the
| test case illustrating the problem. From test_import.py:
|
|     def test_writable_directory(self):
|         # The umask is not conducive to creating a writable __pycache__
|         # directory.
|         with umask(0o222):
[...]
| The __pycache__ directory does not exist before the import, and the import
| machinery creates the directory, but the umask leaves the directory unwritable
| by anybody. So of course when the import machinery goes to write the .pyc
| file inside __pycache__, it fails. This does not cause an ImportError though,
| just like if today the package directory were unwritable.
|
| This might be different than today's situation though because once the
| unwritable __pycache__ directory is created, nothing is going to change that
| without explicit user interaction, and that might be difficult after the
| fact.
Like any bad/suboptimal permission.
| I'm not sure what the right answer is. Some possible choices:
|
| * Tough luck
+1
I'd go with this one myself.
| * Force the umask so that the directory is writable, but then the question is,
|   by whom? ugo+w or something less?
-2
Racy and dangerous. The umask is a UNIX process global, and other threads may get bad results if they're active during this window.
| * Copy the permissions from the parent directory and ignore umask
-1
Maybe. But consider that you may not be the owner of the parent: then the new child will have different ownership than the parent but the same permission mask. Potentially a bad mix. This approach is very hard to get right.
| * Raise an exception or refuse to create __pycache__ if it's not writable
|   (again, by whom?)
-3
Bleah. My python program won't run because an obscure (to the user) directory had unusual permissions?
Tough love, it's the only way:-)
-- Cameron Simpson <cs@zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/
When asked what would I most want to try before doing it, I said Death. - Michael Burton, michaelb@compnews.co.uk
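For reference, the "copy the permissions from the parent directory" option can be sketched without touching the process-wide umask: create the directory, then set its mode explicitly with `os.chmod`, which the umask does not filter. This assumes a POSIX system. The helper name `mkdir_like_parent` is hypothetical, and the demo changes the umask only to simulate the hostile setting from the test case.

```python
import os
import stat
import tempfile


def mkdir_like_parent(path):
    """Create `path`, then give it the same permission bits as its parent.

    Ownership is NOT copied: the new directory belongs to the current user,
    which is the mismatch warned about above.
    """
    parent = os.path.dirname(os.path.abspath(path))
    parent_mode = stat.S_IMODE(os.stat(parent).st_mode)
    os.mkdir(path)
    os.chmod(path, parent_mode)  # explicit chmod is not filtered by umask
    return parent_mode


with tempfile.TemporaryDirectory() as root:
    os.chmod(root, 0o755)
    old = os.umask(0o222)  # simulate the unhelpful umask from the test case
    try:
        cache_dir = os.path.join(root, "__pycache__")
        mkdir_like_parent(cache_dir)
    finally:
        os.umask(old)
    child_mode = stat.S_IMODE(os.stat(cache_dir).st_mode)
```

There is still a short window between `os.mkdir` and `os.chmod` in which the directory carries the umask-filtered mode, and identical mode bits under a different owner can grant access to a different set of users, so this illustrates the option rather than endorsing it.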

On 23Mar2010 11:40, I wrote:
| | * Raise an exception or refuse to create __pycache__ if it's not writable
| |   (again, by whom?)
|
| -3
| Bleah. My python program won't run because an obscure (to the user)
| directory had unusual permissions?
Clarification: I'm -3 on the exception. Silent failure to make the __pycache__ would do, and may be better than silently making a useless (unwritable) one.
How about:
    made_it = False
    ok = False
    try:
        if not isdir(__pycache__dir):
            mkdir(__pycache__dir)
            made_it = True
        write pyc content ...
        ok = True
    except OSError, IOerror etc:
        if not ok:
            os.remove pyc content file
        if made_it:
            rmdir(__pycache__dir)
            but be quiet if this fails, eg if it is not empty
            because another process or thread added stuff
So silent tidyup attempt on failure, but no escaping exception.
Cheers,
-- Cameron Simpson <cs@zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/
The govt MUST regulate the Net NOW! We can't have average people saying what's on their minds! - ezwriter@netcom.com
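The pseudocode in this message can be turned into a runnable Python 3 sketch along the following lines. The function name `write_cached_bytecode` and its arguments are invented for illustration, and this is one reading of the pseudocode rather than what the import machinery actually does.

```python
import os
import tempfile


def write_cached_bytecode(cache_dir, filename, data):
    """Try to write `data` to cache_dir/filename; never raise OSError.

    On failure, tidy up quietly: remove a partial file and, if this call
    created the directory, try to remove the directory again.
    Returns True on success, False otherwise.
    """
    made_dir = False
    target = os.path.join(cache_dir, filename)
    try:
        if not os.path.isdir(cache_dir):
            os.mkdir(cache_dir)
            made_dir = True
        with open(target, "wb") as f:
            f.write(data)
        return True
    except OSError:
        try:
            os.remove(target)
        except OSError:
            pass  # nothing to remove, or not allowed to
        if made_dir:
            try:
                os.rmdir(cache_dir)
            except OSError:
                pass  # e.g. another process added files in the meantime
        return False


with tempfile.TemporaryDirectory() as root:
    # Normal case: the directory gets created and the file written.
    good_dir = os.path.join(root, "__pycache__")
    wrote = write_cached_bytecode(good_dir, "spam.pyc", b"fake")
    wrote_exists = os.path.isfile(os.path.join(good_dir, "spam.pyc"))

    # Failure case: a regular *file* named __pycache__ is in the way.
    blocked = os.path.join(root, "blocked")
    os.mkdir(blocked)
    bad_dir = os.path.join(blocked, "__pycache__")
    open(bad_dir, "w").close()
    failed = write_cached_bytecode(bad_dir, "spam.pyc", b"fake")
```

The failure case here is the "__pycache__ is a file" scenario raised earlier in the thread: the call returns False, leaves the blocking file alone, and no exception escapes.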
participants (17)
- "Martin v. Löwis"
- Antoine Pitrou
- Barry Warsaw
- Ben Finney
- Brett Cannon
- Cameron Simpson
- Greg Ewing
- Guido van Rossum
- Isaac Morland
- Matthias Klose
- Nick Coghlan
- R. David Murray
- Ron Adam
- Russell E. Owen
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy