python-dev Summary for 2004-09-16 through 2004-09-30 [draft]
So we were all rather quiet in the last half of September. The whole summary fits on two sheets of 8.5x11 (normally it is over 10, and I have hit over 20 when I was summarizing *everything*).

Going to send this out no earlier than Friday night, so send in corrections by then.

----------------------------------

=====================
Summary Announcements
=====================

Wow. This must have been the easiest summary I have ever done. Why can't they all be like this? I didn't even skip that much!

=========
Summaries
=========

------------------------------------------
Assume nothing when mutability is possible
------------------------------------------

Tim Peters discovered a new way to create an infinite list thanks to generator expressions. But the real outcome of the discussion came when someone else produced an example that exposed a bug in list.extend().

The first lesson was that "you can't assume anything about a mutable object after potentially calling back into Python." Basically, you can't assume the state of any mutable object is unchanged if you execute Python code from C. While it might seem handy to cache state while in a loop, for instance, you can't count on things not having changed by the time you get control back, so you just have to do it the hard way and fetch the state all over again.

The second was that you need to be careful when dealing with iterators. If you mutate an iterator while iterating, you have no guarantee it won't explode in your face. Unless you explicitly support it, document it, and take care to protect against it, just don't assume you can mutate an iterator while using it.

Contributing threads:

- `A cute new way to get an infinite loop <>`__
- `More data points <>`__

-----------------------------
The fewer licenses the better
-----------------------------

The idea of copying some code from OpenSSH_ for better pty handling was proposed. This was frowned upon since it becomes one more legal issue to keep track of. Minimizing the number of licenses that Python must keep track of and comply with, no matter how friendly they are, is a good thing.

.. _OpenSSH: http://www.openssh.com/

Contributing threads:

- `using openssh's pty code <>`__

------------------------------------------------------------------------
Trying to deal with the exception hierarchy in a backwards-friendly way
------------------------------------------------------------------------

Nick Coghlan came up with the idea of having a tuple that contained all of the exceptions you normally would not want to catch in a blanket 'except' statement (KeyboardInterrupt, MemoryError, SystemExit, etc.). This tuple was proposed to live in sys.special_exceptions with the intended usage of::

    try:
        pass # stuff...
    except sys.special_exceptions:
        raise # exceptions that you would not want to catch should keep propagating up the call chain
    except:
        pass # if you reach here the exception should not be a *huge* deal

Obviously the best solution is to just clean up the exception hierarchy, but that breaks backwards-compatibility. In any case, this idea seemed to lose steam.

Contributing threads:

- `Proposing a sys.special_exceptions tuple <>`__

===============
Skipped Threads
===============

- Decimal, copyright and license
- Planning to drop gzip compression for future releases.
- built on beer?
- Noam's open regex requests
- Socket/Asyncore bug needs attention
- open('/dev/null').read() -> MemoryError
- Finding the module from PyTypeObject?
- Odd compile errors for bad genexps
- Running a module as a script
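The intended usage of the proposed tuple, as described in the summary, can be sketched in runnable form. This is only an illustration under stated assumptions: ``SPECIAL_EXCEPTIONS`` is a hypothetical stand-in for the proposed sys.special_exceptions, ``call_guarded`` and the two sample functions are made-up names, and the code is written so it runs on Python 3 rather than the Python 2 of the time.

```python
# Hypothetical stand-in for the proposed sys.special_exceptions tuple.
SPECIAL_EXCEPTIONS = (KeyboardInterrupt, MemoryError, SystemExit)


def call_guarded(func):
    """Call func(), swallowing ordinary errors but not the special ones."""
    try:
        return func()
    except SPECIAL_EXCEPTIONS:
        # Exceptions you would not want to catch keep propagating.
        raise
    except BaseException as exc:
        # Blanket handler: anything reaching here is treated as minor.
        return "ignored %s" % type(exc).__name__


def divide_by_zero():
    """Sample function that fails with an ordinary error."""
    return 1 / 0


def request_exit():
    """Sample function that raises one of the special exceptions."""
    raise SystemExit(1)
```

With these definitions, ``call_guarded(divide_by_zero)`` returns a short string instead of raising, while ``call_guarded(request_exit)`` lets the SystemExit escape to the caller. The ordering of the two except clauses is what matters: the re-raising clause has to come before the blanket one.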
Brett C wrote:
- Running a module as a script
That reminds me - the version of this that got checked in is restricted to top-level modules in order to keep things simple. I put a recipe up on the Python cookbook for those that wanted to be able to easily run scripts that live inside packages. The recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/307772 Cheers, Nick.
On Thu, 2004-10-14 at 08:01, Nick Coghlan wrote:
That reminds me - the version of this that got checked in is restricted to top-level modules in order to keep things simple. I put a recipe up on the Python cookbook for those that wanted to be able to easily run scripts that live inside packages.
The recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/307772
Why not add this to the stdlib? I'd call it pyrun.py. -Barry
Barry Warsaw <barry@python.org> writes:
On Thu, 2004-10-14 at 08:01, Nick Coghlan wrote:
That reminds me - the version of this that got checked in is restricted to top-level modules in order to keep things simple. I put a recipe up on the Python cookbook for those that wanted to be able to easily run scripts that live inside packages.
The recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/307772
Why not add this to the stdlib? I'd call it pyrun.py.
... and call it automatically when the '-m' implementation detects a dotted name? Thomas
On Thu, 2004-10-14 at 09:06, Thomas Heller wrote:
... and call it automatically when the '-m' implementation detects a dotted name?
+1. Also, if a non-dotted module name can't be found, pyrun should be called to see if the module is actually a package (with an __main__ that lives in the package's __init__.py). -Barry
On Oct 14, 2004, at 9:58 AM, Barry Warsaw wrote:
On Thu, 2004-10-14 at 09:06, Thomas Heller wrote:
... and call it automatically when the '-m' implementation detects a dotted name?
+1. Also, if a non-dotted module name can't be found, pyrun should be called to see if the module is actually a package (with an __main__ that lives in the package's __init__.py).
Wouldn't it make more sense to look for main rather than __main__? It seems that __main__ (as a function) occurs EXACTLY ZERO times in the standard library ;) -bob
On Thu, 2004-10-14 at 10:04, Bob Ippolito wrote:
+1. Also, if a non-dotted module name can't be found, pyrun should be called to see if the module is actually a package (with an __main__ that lives in the package's __init__.py).
Wouldn't it make more sense to look for main rather than __main__? It seems that __main__ (as a function) occurs EXACTLY ZERO times in the standard library ;)
I think that's a fine convention, so +1. -Barry
Barry Warsaw wrote:
On Thu, 2004-10-14 at 10:04, Bob Ippolito wrote:
+1. Also, if a non-dotted module name can't be found, pyrun should be called to see if the module is actually a package (with an __main__ that lives in the package's __init__.py).
Actually, if we were going to do something for packages, I'd be more inclined to look for a script called __main__.py in the package directory, rather than looking for a __main__ function inside __init__.py. Or else simply run __init__.py itself as __main__ (i.e. allow the use of the existing 'Python main' idiom inside a package's __init__.py) (Interestingly, that's at least the second time it has been suggested to turn this idea into 'C-like main functions for Python'. '-m' is about another way to invoke the current 'if __name__ == "__main__":' idiom. It is most definitely *not* about creating a new idiom for main functions - an activity which would seem to be PEP-worthy)
Wouldn't it make more sense to look for main rather than __main__? It seems that __main__ (as a function) occurs EXACTLY ZERO times in the standard library ;)
I think that's a fine convention, so +1.
Except we'd be violating Python's tradition that magic methods start and end with double underscores, as well as potentially breaking existing scripts. At the moment, the following will print 'Hello world', but with 'def main' being special, it would do nothing:

    def main(*args, **kwds):
        pass

    if __name__ == "__main__":
        print "Hello world"
        main()

I know that I often do any sys.argv manipulation in the 'if __name__' block rather than inside my main() function (which is often actually called 'main', and almost always takes a list of arguments rather than looking at sys.argv for itself).

So while I'd be quite happy for code that included a "def __main__" to break (since the Python docs advise against using names in that format), allowing "def main" to suddenly acquire a magic meaning may not break the stdlib, but I bet it would break a lot of single-purpose scripts that are out there.

Cheers,
Nick.
At 07:06 AM 10/15/04 +1000, Nick Coghlan wrote:
Barry Warsaw wrote:
On Thu, 2004-10-14 at 10:04, Bob Ippolito wrote:
+1. Also, if a non-dotted module name can't be found, pyrun should be called to see if the module is actually a package (with an __main__ that lives in the package's __init__.py).
Actually, if we were going to do something for packages, I'd be more inclined to look for a script called __main__.py in the package directory, rather than looking for a __main__ function inside __init__.py. Or else simply run __init__.py itself as __main__ (i.e. allow the use of the existing 'Python main' idiom inside a package's __init__.py)
(Interestingly, that's at least the second time it has been suggested to turn this idea into 'C-like main functions for Python'. '-m' is about another way to invoke the current 'if __name__ == "__main__":' idiom. It is most definitely *not* about creating a new idiom for main functions - an activity which would seem to be PEP-worthy)
Perhaps this means that -m is premature? I personally would rather wait for 2.5 if it means we get a nice, future-proof "main" convention out of the deal. While -m would not then be "backward compatible" with existing scripts, people could start changing scripts to match the convention as soon as there was an accepted PEP.
Phillip J. Eby wrote:
Perhaps this means that -m is premature? I personally would rather wait for 2.5 if it means we get a nice, future-proof "main" convention out of the deal. While -m would not then be "backward compatible" with existing scripts, people could start changing scripts to match the convention as soon as there was an accepted PEP.
I see the two issues as entirely orthogonal. A new main function idiom should not have to rely on the use of a particular command line switch (particularly since there is no way to 'switch off' the current idiom without significant changes to the interpreter).

If we were actually developing a new idiom, I would be thinking more along the following lines:

"After initial execution of the __main__ module in non-interactive mode (e.g. running a script supplied by filename or using -m), the interpreter looks for a function also called __main__ in the __main__ module's namespace. If such a function is found, it is executed by the interpreter as if the call __main__() had been appended to the end of the supplied script."

Existing scripts would continue to work as is, or else could be converted to use the new idiom by replacing the line "if __name__ == '__main__':" with the line "def __main__():"

Scripts that need to be backwards compatible can use the old idiom and include the following to mark themselves as definitely being usable as scripts:

    def __main__(): pass

Or else, they can be made compatible with older versions by adding the following snippet to the end:

    if __name__ == '__main__':
        import sys
        if sys.version_info[0:2] < (2, 5):
            __main__()

About the only real practical (as opposed to aesthetic) advantage I see to such an idiom is that the question "Is this module useful as a script?" can easily be answered as "Yes" by looking for a __main__ function. (We can't really answer 'No' definitively, since a script may be using the existing idiom without using the 'def __main__(): pass' trick).

Although "def __main__():" could be easier to explain to beginners than the current approach. That might also be a slight benefit.

*stops to think for a bit*

Actually, while proofreading this, one other potential advantage occurred to me - this might allow C extensions and the like to define __main__ functions, and so be usable as scripts via -m despite their non-script packaging. Although that seems like an awful lot of work for not much gain - it would presumably be easier to just distribute a package that included both the extension module and a 'launcher' Python script.

Anyway, regardless of whether a new main idiom is ever selected or not, I just don't see the link between that and being able to search for scripts using Python's module namespace instead of the OS filesystem's namespace.

Cheers,
Nick.
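The rule quoted in this message (run the script, then call a function named __main__ if the script defined one) can be emulated in a few lines. This is a rough sketch of the described behaviour, not anything the interpreter does: ``run_script`` is a hypothetical helper, the two sample scripts are made up, and it executes source strings with exec rather than changing how the interpreter starts up.

```python
def run_script(source, filename="<script>"):
    """Execute source as the __main__ module, then call __main__() if defined."""
    namespace = {"__name__": "__main__"}
    exec(compile(source, filename, "exec"), namespace)
    hook = namespace.get("__main__")
    if callable(hook):
        # Behaves as if "__main__()" had been appended to the script.
        return hook()
    return None


# A script converted to the proposed idiom.
NEW_STYLE = "def __main__():\n    return 'ran __main__()'\n"

# A script that only uses the existing idiom; no hook is found or called.
OLD_STYLE = (
    "result = []\n"
    "if __name__ == '__main__':\n"
    "    result.append('old idiom ran')\n"
)
```

Here ``run_script(NEW_STYLE)`` returns the value produced by the script's ``__main__`` function, and ``run_script(OLD_STYLE)`` returns None because the old-style script keeps working unchanged and defines no hook.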
On Thu, 14 Oct 2004, Phillip J. Eby wrote:
Perhaps this means that -m is premature? I personally would rather wait for 2.5 if it means we get a nice, future-proof "main" convention out of the deal. While -m would not then be "backward compatible" with existing scripts, people could start changing scripts to match the convention as soon as there was an accepted PEP.
But to me the -m option and the __main__() convention seem like orthogonal features... Even if current __name__=="__main__" blocks get replaced by a magic __main__() function, you would still benefit from the -m command line option.

Or is there some hidden dependency?

Ilya
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/ilya%40bluefir.net
Here's another good reason to change the way "main()" works today:

A consequence of having the invoked python script's module always be named "__main__" is that when another module imports it by its real name the module will be loaded twice, with 2 separate class and function definitions, etc. This can lead to unobvious bugs; for example, an except clause will think you're trying to catch another type of exception.

To illustrate:

anothermodule.py:

    import mymodule

    def raiseFoo():
        raise mymodule.foo()

mymodule.py:

    import anothermodule

    class foo(Exception):
        pass

    if __name__ == '__main__':
        try:
            anothermodule.raiseFoo()
        except foo:
            # nope, this isn't going to work 'cuz
            # it's trying to catch __main__.foo not mymodule.foo
            print 'caught foo!'

One way to prevent bugs like this (which can be tricky to track down) is to use an idiom like this for modules that can be invoked from the command line:

    if __name__ == '__main__':
        import mymodule
        mymodule.main()
    else:
        # define the module contents in this conditional block, including a main()
        def main():
            pass

BTW, this problem also exists when doing a relative import of a module inside a package, e.g.:

package1/example.py:

    import package1.module1
    import module1  # module1 is loaded twice, treated as different modules

-- adam

On Thu, 14 Oct 2004 17:16:39 -0400, Phillip J. Eby <pje@telecommunity.com> wrote:
At 07:06 AM 10/15/04 +1000, Nick Coghlan wrote:
Barry Warsaw wrote:
On Thu, 2004-10-14 at 10:04, Bob Ippolito wrote:
+1. Also, if a non-dotted module name can't be found, pyrun should be called to see if the module is actually a package (with an __main__ that lives in the package's __init__.py).
Actually, if we were going to do something for packages, I'd be more inclined to look for a script called __main__.py in the package directory, rather than looking for a __main__ function inside __init__.py. Or else simply run __init__.py itself as __main__ (i.e. allow the use of the existing 'Python main' idiom inside a package's __init__.py)
(Interestingly, that's at least the second time it has been suggested to turn this idea into 'C-like main functions for Python'. '-m' is about another way to invoke the current 'if __name__ == "__main__":' idiom. It is most definitely *not* about creating a new idiom for main functions - an activity which would seem to be PEP-worthy)
Perhaps this means that -m is premature? I personally would rather wait for 2.5 if it means we get a nice, future-proof "main" convention out of the deal. While -m would not then be "backward compatible" with existing scripts, people could start changing scripts to match the convention as soon as there was an accepted PEP.
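The double-loading problem described at the top of this message can be reproduced without any files on disk. The following is a minimal sketch, assuming that executing the same source into two separate module objects is a fair model of "loaded once as __main__, once as mymodule"; ``MODULE_SOURCE``, ``load_copy`` and ``caught_by_script_copy`` are hypothetical names, and the syntax is Python 3.

```python
import types

# Same source text, standing in for mymodule.py.
MODULE_SOURCE = "class Foo(Exception):\n    pass\n"


def load_copy(name):
    """Execute the shared source in a fresh module object with the given name."""
    module = types.ModuleType(name)
    exec(MODULE_SOURCE, module.__dict__)
    return module


# One copy stands in for the script run as __main__, the other for the
# copy created when another module does "import mymodule".
script_copy = load_copy("__main__")
imported_copy = load_copy("mymodule")


def caught_by_script_copy():
    """Raise the imported copy's Foo and try to catch the script copy's Foo."""
    try:
        raise imported_copy.Foo()
    except script_copy.Foo:
        return True
    except Exception:
        return False
```

The two ``Foo`` classes are distinct objects even though they come from identical source, so ``caught_by_script_copy()`` reports that the first except clause did not match. That mismatch is the failure mode the message warns about.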
Nick Coghlan wrote:
Brett C wrote:
- Running a module as a script
That reminds me - the version of this that got checked in is restricted to top-level modules in order to keep things simple. I put a recipe up on the Python cookbook for those that wanted to be able to easily run scripts that live inside packages.
The recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/307772
(Hm, I was going to reply to the discussion there, but I think it's more appropriate here.)

I understand the difficulty of implementing it, but as a user I find it a really really stupid restriction. I routinely run (doc)tests of individual modules, which usually are submodules of a package. Using -m to do this would help me tremendously. As it stands, -m doesn't help me at all. I'd even go so far as to -1 the entire feature if it doesn't support submodules. I guess it's too late for that :)

Just
Just van Rossum wrote:
I understand the difficulty of implementing it, but as a user I find it a really really stupid restriction. I routinely run (doc)tests of individual modules, which usually are submodules of a package. Using -m to do this would help me tremendously. As it stands, -m doesn't help me at all. I'd even go so far as to -1 the entire feature if it doesn't support submodules. I guess it's too late for that :)
This may be more a case of my misjudging py-dev's likely collective reaction than anything else. Support for '-m' was lukewarm enough when it last came up that I didn't expect to get a good reaction if I suggested adding a stdlib module in order to enhance it to support packages.

While I wrote a patch to enable it (#1043356 - it uses the simple C-level strategy of 'try to locate at the top level, if that doesn't work, hand it over to the Python version'), we seemed to be too close to the beta to push for inclusion this time around. Add in the fact that I was about to be moving back to Brisbane after being in Oregon for three months. . . (I'm back in Brisbane now, though)

At the moment, '-m's behaviour is pretty easy to explain:

    "Look for a top-level module with the specified name. If it is found, run it as if its name had been fully specified on the command line. If it is not found, report an error"

The behaviour currently implemented in the enhancement patch is:

    "Look for a top-level module with the specified name. If it is found, run it as if its name had been fully specified on the command line. If it is not found, attempt to import any containing package, then look for the module within that package. Run the located module as for a top-level module. If it is still not found, report an error.

    Note: For modules within packages, this differs slightly from running them directly from the command line by filename. Using this switch, the script's containing package is fully imported prior to execution of the script. This does not happen when the script's filename is used directly."

As an implementation detail, the top-level scan is done in C, the scan that understands packages is done in Python. The main reasons for that are that the top-level scan gets used to *find* the Python version if it's needed, and even a simple scan looking for dots is a pain in C (although that *would* likely be slightly quicker than the current 'failed lookup' approach for scripts inside modules, it would also be slightly slower for top-level modules, as well as adding more code to main.c).

Selling the additional complexity was the main reason I didn't expect to get a good reaction to this idea with the 2.4 beta almost out the door.

I'm happy to make whatever changes to that patch are needed for inclusion (e.g. changing the module name, adding it to the docs under whatever name is chosen) - I guess it's mainly Anthony's call whether he's prepared to accept such a change after the 2.4b1 release.

Cheers,
Nick.

P.S. I'd also like some feedback on a quirk of the current version of that patch - as noted on SF, at the moment it potentially tramples on argv[0] at the C-level, which seems questionable given the existence of Py_GetArgcArgv(). The simplest way around that is to *not* set sys.argv[0] correctly when running pyrun/execmodule implicitly (i.e. sys.argv[0] may contain the name of the interpreter, or a command line switch, rather than the name of pyrun/execmodule - document this possibility with a comment in the __main__ block of pyrun/execmodule).
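The lookup behaviour described for the enhancement patch (import the containing package, then find the module inside it) can be approximated with today's standard library. This is a sketch under stated assumptions and not the 2004 patch: it leans on ``importlib.util.find_spec``, which did not exist at the time, and ``locate_script`` is a hypothetical helper name. Rejecting modules without a file location is only an approximation of a "source or compiled modules only" restriction.

```python
import importlib.util


def locate_script(module_name):
    """Return the file a module would be run from, importing any parent package."""
    try:
        # For a dotted name, find_spec imports the containing package first
        # so that the package's __path__ can be searched.
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # The containing package itself could not be imported.
        spec = None
    if spec is None or not spec.has_location:
        # Not found, or not backed by a file (for example a built-in module).
        raise ImportError("no runnable module named %r" % module_name)
    return spec.origin
```

For example, ``locate_script("json.decoder")`` imports the ``json`` package as a side effect and returns the path of its ``decoder`` module, which mirrors the note above that the containing package is fully imported before the script runs. A name such as ``sys`` is rejected because it has no file to execute.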
On Thu, 2004-10-14 at 16:38, Nick Coghlan wrote:
While I wrote a patch to enable it (#1043356 - it uses the simple C-level strategy of 'try to locate at the top level, if that doesn't work, hand it over to the Python version'), we seemed to be too close to the beta to push for inclusion this time around. Add in the fact that I was about to be moving back to Brisbane after being in Oregon for three months. . . (I'm back in Brisbane now, though)
Well, you're practically next door to Anthony (at least, compared to some of us), so I suggest you hop on down there on Friday, buy Anthony a beer or ten as a friendly convincer for slipping this into beta 1. To seal the deal, you can even offer to help (or threaten to harass) him while he makes the release. A win all around! -Barry
Barry Warsaw wrote:
Well, you're practically next door to Anthony (at least, compared to some of us), so I suggest you hop on down there on Friday, buy Anthony a beer or ten as a friendly convincer for slipping this into beta 1. To seal the deal, you can even offer to help (or threaten to harass) him while he makes the release. A win all around!
I'm extremely unconvinced that the semantics of "-m package" or "-m package.module" are suitably well thought out to see it in b1. If a single compelling way of making it work can be seen in the next week, _maybe_ we could sneak it into b2, but I'm really not hopeful. Anthony -- Anthony Baxter <anthony@interlink.com.au> It's never too late to have a happy childhood.
On Friday 15 October 2004 12:36 am, Anthony Baxter wrote:
I'm extremely unconvinced that the semantics of "-m package" or "-m package.module" are suitably well thought out to see it in b1.
I don't see this as a useful feature at all, so I guess I'm biased, but if it's hard to decide just what the right semantics are, this certainly isn't the time for it. We won't be able to change it substantially later due to backward compatibility constraints.
If a single compelling way of making it work can be seen in the next week, _maybe_ we could sneak it into b2, but I'm really not hopeful.
Let's keep it out of 2.4 and see what proposals show up for 2.5. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org>
Anthony Baxter wrote:
I'm extremely unconvinced that the semantics of "-m package" or "-m package.module" are suitably well thought out to see it in b1.
This was one of the reasons I wrote execmodule - to figure out semantics which made sense, after my initial attempt at supporting "-m package.module" failed to handle __path__ correctly. Executing anything other than PY_SOURCE or PY_COMPILED modules simply doesn't make sense to me (both execmodule and the current '-m' implementation report errors if you try to do so)
If a single compelling way of making it work can be seen in the next week, _maybe_ we could sneak it into b2, but I'm really not hopeful.
The approach of 'import the containing package' seems to work fairly well. I can't see any other way to get at the package's __path__ variable in order to locate the module inside it. I haven't seen any good reasons to lift the PY_SOURCE/PY_COMPILED restriction, though. FWIW, the example which convinced me that running modules inside packages was valuable was "python -m pychecker.checker <script>". However, given that execmodule can provide this functionality fairly easily, I voluntarily bumped the relevant patch to Python 2.5 and posted the Cookbook recipe instead. If we do postpone this, I'd suggest putting something in the relevant Python 2.4 "module not found" error message in main.c that notes that packages aren't supported, though (otherwise I expect we'll get at least a few bug reports about not finding modules inside packages). Cheers, Nick.
On Fri, 2004-10-15 at 00:36, Anthony Baxter wrote:
I'm extremely unconvinced that the semantics of "-m package" or "-m package.module" are suitably well thought out to see it in b1.
If a single compelling way of making it work can be seen in the next week, _maybe_ we could sneak it into b2, but I'm really not hopeful.
Okay, based on the discussions so far, I'm going to have to agree. While I do think it's a very useful feature, it's more important to do it right than to do it right now. Which means it should probably go through the full PEP treatment. -Barry
On Fri, Oct 15, 2004 at 07:58:05AM -0400, Barry Warsaw wrote:
While I do think it's a very useful feature, it's more important to do it right than to do it right now. Which means it should probably go through the full PEP treatment.
Yes, it should. Neil
Barry Warsaw wrote:
On Fri, 2004-10-15 at 00:36, Anthony Baxter wrote:
I'm extremely unconvinced that the semantics of "-m package" or "-m package.module" are suitably well thought out to see it in b1.
If a single compelling way of making it work can be seen in the next week, _maybe_ we could sneak it into b2, but I'm really not hopeful.
Okay, based on the discussions so far, I'm going to have to agree. While I do think it's a very useful feature, it's more important to do it right than to do it right now. Which means it should probably go through the full PEP treatment.
+1 from me. -Brett
Brett C. wrote:
Barry Warsaw wrote:
Okay, based on the discussions so far, I'm going to have to agree. While I do think it's a very useful feature, it's more important to do it right than to do it right now. Which means it should probably go through the full PEP treatment.
+1 from me.
Works for me, too. I don't think the PEP has to be particularly *complicated* - it just needs to nail down specific semantics so everyone is talking about the same thing. And it should point out that this has nothing to do with changing the idiom for making modules that also work as scripts. If people want to do that, they can write their own PEP ;) I'll see if I can come up with a first draft this weekend. Cheers, Nick.
participants (13)

- Adam Souzis
- Anthony Baxter
- Barry Warsaw
- Bob Ippolito
- Brett C
- Brett C.
- Fred L. Drake, Jr.
- Ilya Sandler
- Just van Rossum
- Neil Schemenauer
- Nick Coghlan
- Phillip J. Eby
- Thomas Heller