Python Language FAQ - Section 4
This FAQ newsgroup posting has been automatically converted from an
HTML snapshot of the original Python FAQ; please refer to the original
"Python FAQ Wizard" at <http://grail.cnri.reston.va.us/cgi-bin/faqw.py>
if source code snippets given in this document do not work -
incidentally some formatting information may have been lost during the
conversion.

----------------------------------------------------------------------------
The whole Python FAQ - Section 4

Last changed on Mon Jun 28 19:36:09 1999 EDT

(Entries marked with ** were changed within the last 24 hours;
entries marked with * were changed within the last 7 days.)

----------------------------------------------------------------------------
4. Programming in Python

  4.1. Is there a source code level debugger with breakpoints, step,
       etc.?
  4.2. Can I create an object class with some methods implemented in C
       and others in Python (e.g. through inheritance)? (Also phrased
       as: Can I use a built-in type as base class?)
  4.3. Is there a curses/termcap package for Python?
  4.4. Is there an equivalent to C's onexit() in Python?
  4.5. When I define a function nested inside another function, the
       nested function seemingly can't access the local variables of
       the outer function. What is going on? How do I pass local data
       to a nested function?
  4.6. How do I iterate over a sequence in reverse order?
  4.7. My program is too slow. How do I speed it up?
  4.8. When I have imported a module, then edit it, and import it again
       (into the same Python process), the changes don't seem to take
       place. What is going on?
  4.9. How do I find the current module name?
  4.10. I have a module in which I want to execute some extra code when
        it is run as a script. How do I find out whether I am running
        as a script?
  4.11. I try to run a program from the Demo directory but it fails
        with ImportError: No module named ...; what gives?
  4.12. I have successfully built Python with STDWIN but it can't find
        some modules (e.g. stdwinevents).
  4.13. What GUI toolkits exist for Python?
  4.14. Are there any interfaces to database packages in Python?
  4.15. Is it possible to write obfuscated one-liners in Python?
  4.16. Is there an equivalent of C's "?:" ternary operator?
  4.17. My class defines __del__ but it is not called when I delete the
        object.
  4.18. How do I change the shell environment for programs called using
        os.popen() or os.system()? Changing os.environ doesn't work.
  4.19. What is a class?
  4.20. What is a method?
  4.21. What is self?
  4.22. What is an unbound method?
  4.23. How do I call a method defined in a base class from a derived
        class that overrides it?
  4.24. How do I call a method from a base class without using the name
        of the base class?
  4.25. How can I organize my code to make it easier to change the base
        class?
  4.26. How can I find the methods or attributes of an object?
  4.27. I can't seem to use os.read() on a pipe created with
        os.popen().
  4.28. How can I create a stand-alone binary from a Python script?
  4.29. What WWW tools are there for Python?
  4.30. How do I run a subprocess with pipes connected to both input
        and output?
  4.31. How do I call a function if I have the arguments in a tuple?
  4.32. How do I enable font-lock-mode for Python in Emacs?
  4.33. Is there a scanf() or sscanf() equivalent?
  4.34. Can I have Tk events handled while waiting for I/O?
  4.35. How do I write a function with output parameters (call by
        reference)?
  4.36. Please explain the rules for local and global variables in
        Python.
  4.37. How can I have modules that mutually import each other?
  4.38. How do I copy an object in Python?
  4.39. How to implement persistent objects in Python? (Persistent ==
        automatically saved to and restored from disk.)
  4.40. I try to use __spam and I get an error about
        _SomeClassName__spam.
  4.41. How do I delete a file? And other file questions.
  4.42. How to modify urllib or httplib to support HTTP/1.1?
  4.43. Unexplicable syntax errors in compile() or exec.
  4.44. How do I convert a string to a number?
  4.45. How do I convert a number to a string?
  4.46. How do I copy a file?
  4.47. How do I check if an object is an instance of a given class or
        of a subclass of it?
  4.48. What is delegation?
  4.49. How do I test a Python program or component.
  4.50. My multidimensional list (array) is broken! What gives?
  4.51. I want to do a complicated sort: can you do a Schwartzian
        Transform in Python?
  4.52. How to convert between tuples and lists?
  4.53. Files retrieved with urllib contain leading garbage that looks
        like email headers.
  4.54. How do I get a list of all instances of a given class?
  4.55. A regular expression fails with regex.error: match failure.
  4.56. I can't get signal handlers to work.
  4.57. I can't use a global variable in a function? Help!
  4.58. What's a negative index? Why doesn't list.insert() use them?
  4.59. How can I sort one list by values from another list?
  4.60. Why doesn't dir() work on builtin types like files and lists?
  4.61. How can I mimic CGI form submission (METHOD=POST)?
  4.62. If my program crashes with a bsddb (or anydbm) database open,
        it gets corrupted. How come?
  4.63. How do I make a Python script executable on Unix?
  4.64. How do you remove duplicates from a list?
  4.65. Are there any known year 2000 problems in Python?
  4.66. I want a version of map that applies a method to a sequence of
        objects! Help!
  4.67. How do I generate random numbers in Python?
  4.68. How do I access the serial (RS232) port?
  4.69. Images on Tk-Buttons don't work in Py15?
  4.70. Where is the math.py (socket.py, regex.py, etc.) source file?
  4.71. How do I send mail from a Python script?
  4.72. How do I avoid blocking in connect() of a socket?
  4.73. How do I specify hexadecimal and octal integers?
  4.74. How to get a single keypress at a time?
  4.75. How can I overload constructors (or methods) in Python?
  4.76. How do I pass keyword arguments from one method to another?
  4.77. What module should I use to help with generating HTML?
  4.78. How do I create documentation from doc strings?
  4.79. How do I read (or write) binary data?
  4.80. I can't get key bindings to work in Tkinter
  4.81. "import crypt" fails
  4.82. Are there coding standards or a style guide for Python
        programs?
  4.83. How do I freeze Tkinter applications?
  4.84. How do I create static class data and static class methods?
  4.85. __import__('x.y.z') returns <module 'x'>; how do I get z?
  4.86. Basic thread wisdom
  4.87. Why doesn't closing sys.stdout (stdin, stderr) really close it?
  4.88. What kinds of global value mutation are thread-safe?
  4.89. How do I modify a string in place?
  4.90. How to pass on keyword/optional parameters/arguments

----------------------------------------------------------------------------
4. Programming in Python
----------------------------------------------------------------------------
4.1. Is there a source code level debugger with breakpoints, step, etc.?

Yes.

Check out module pdb. It is documented in the Library Reference Manual;
pdb.help() also prints the documentation. You can write your own
debugger by using the code for pdb as an example.

Pythonwin also has a GUI debugger available, based on bdb, which colors
breakpoints and has quite a few cool features (including debugging
non-Pythonwin programs). The interface needs some work, but is
interesting none the less. A reference can be found in
http://www.python.org/ftp/python/pythonwin/pwindex.html

Richard Wolff has created a modified version of pdb, called Pydb, for
use with the popular Data Display Debugger (DDD). Pydb can be found at
http://daikon.tuc.noao.edu/python/, and DDD can be found at
http://www.cs.tu-bs.de/softech/ddd/

----------------------------------------------------------------------------
4.2. Can I create an object class with some methods implemented in C
and others in Python (e.g. through inheritance)? (Also phrased as: Can
I use a built-in type as base class?)

No, but you can easily create a Python class which serves as a wrapper
around a built-in object, e.g. (for dictionaries):

        # A user-defined class behaving almost identically
        # to a built-in dictionary.
        class UserDict:
            def __init__(self): self.data = {}
            def __repr__(self): return repr(self.data)
            def __cmp__(self, dict):
                if type(dict) == type(self.data):
                    return cmp(self.data, dict)
                else:
                    return cmp(self.data, dict.data)
            def __len__(self): return len(self.data)
            def __getitem__(self, key): return self.data[key]
            def __setitem__(self, key, item): self.data[key] = item
            def __delitem__(self, key): del self.data[key]
            def keys(self): return self.data.keys()
            def items(self): return self.data.items()
            def values(self): return self.data.values()
            def has_key(self, key): return self.data.has_key(key)

A2. See Jim Fulton's ExtensionClass for an example of a mechanism which
allows you to have superclasses which you can inherit from in Python --
that way you can have some methods from a C superclass (call it a
mixin) and some methods from either a Python superclass or your
subclass. See http://www.digicool.com/papers/ExtensionClass.html.

----------------------------------------------------------------------------
4.3. Is there a curses/termcap package for Python?

[Andrew Kuchling] The standard Python distribution comes with a curses
module in the Modules/ subdirectory, though it's not compiled by
default. However, that module only supports plain curses; you can't use
ncurses features like colors with it (though it will link with
ncurses).

Oliver Andrich has an enhanced module that does support such features;
there's a version available at
http://andrich.net/python/selfmade.html#ncursesmodule .

----------------------------------------------------------------------------
4.4. Is there an equivalent to C's onexit() in Python?

Yes, if you import sys and assign a function to sys.exitfunc, it will
be called when your program exits, is killed by an unhandled exception,
or (on UNIX) receives a SIGHUP or SIGTERM signal.

----------------------------------------------------------------------------
4.5. When I define a function nested inside another function, the
nested function seemingly can't access the local variables of the outer
function. What is going on? How do I pass local data to a nested
function?

Python does not have arbitrarily nested scopes. When you need to create
a function that needs to access some data which you have available
locally, create a new class to hold the data and return a method of an
instance of that class, e.g.:

        class MultiplierClass:
            def __init__(self, factor):
                self.factor = factor
            def multiplier(self, argument):
                return argument * self.factor

        def generate_multiplier(factor):
            return MultiplierClass(factor).multiplier

        twice = generate_multiplier(2)
        print twice(10)
        # Output: 20

An alternative solution uses default arguments, e.g.:

        def generate_multiplier(factor):
            def multiplier(arg, fact = factor):
                return arg*fact
            return multiplier

        twice = generate_multiplier(2)
        print twice(10)
        # Output: 20

----------------------------------------------------------------------------
4.6. How do I iterate over a sequence in reverse order?

If it is a list, the fastest solution is

        list.reverse()
        try:
            for x in list:
                "do something with x"
        finally:
            list.reverse()

This has the disadvantage that while you are in the loop, the list is
temporarily reversed. If you don't like this, you can make a copy. This
appears expensive but is actually faster than other solutions:

        rev = list[:]
        rev.reverse()
        for x in rev:
            <do something with x>

If it's not a list, a more general but slower solution is:

        for i in range(len(sequence)-1, -1, -1):
            x = sequence[i]
            <do something with x>

A more elegant solution is to define a class which acts as a sequence
and yields the elements in reverse order (solution due to Steve
Majewski):

        class Rev:
            def __init__(self, seq):
                self.forw = seq
            def __len__(self):
                return len(self.forw)
            def __getitem__(self, i):
                return self.forw[-(i + 1)]

You can now simply write:

        for x in Rev(list):
            <do something with x>

Unfortunately, this solution is the slowest of all, due to the method
call overhead...

----------------------------------------------------------------------------
4.7. My program is too slow. How do I speed it up?

That's a tough one, in general. There are many tricks to speed up
Python code; I would consider rewriting parts in C only as a last
resort. One thing to notice is that function and (especially) method
calls are rather expensive; if you have designed a purely OO interface
with lots of tiny functions that don't do much more than get or set an
instance variable or call another method, you may consider using a more
direct way, e.g. directly accessing instance variables. Also see the
standard module "profile" (described in the Library Reference manual)
which makes it possible to find out where your program is spending most
of its time (if you have some patience -- the profiling itself can slow
your program down by an order of magnitude).

Remember that many standard optimization heuristics you may know from
other programming experience may well apply to Python. For example it
may be faster to send output to output devices using larger writes
rather than smaller ones in order to avoid the overhead of kernel
system calls. Thus CGI scripts that write all output in "one shot" may
be notably faster than those that write lots of small pieces of output.

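As a rough illustration of the "one shot" idea, the sketch below
collects the pieces of a response in a list and hands them to the
output in a single write. It assumes a current Python interpreter
rather than the 1.5-era one this FAQ describes; render_page and
CountingWriter are hypothetical names made up for the example, and how
much time is actually saved will depend on the platform and on how the
output is buffered.

```python
import sys

class CountingWriter:
    """Hypothetical stand-in for an output device; counts write() calls."""
    def __init__(self):
        self.calls = 0
        self.chunks = []

    def write(self, text):
        self.calls = self.calls + 1
        self.chunks.append(text)

def render_page(rows, out=sys.stdout):
    """Build the whole response in memory, then emit it with one write."""
    pieces = ["Content-Type: text/plain\n\n"]
    for name, value in rows:
        pieces.append("%s = %s\n" % (name, value))
    out.write("".join(pieces))    # one large write instead of many small ones
```

Passing a CountingWriter instance as the second argument makes it easy
to confirm that only one write call reaches the output object.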
Also, be sure to use "aggregate" operations where appropriate. For
example the "slicing" feature allows programs to chop up lists and
other sequence objects in a single tick of the interpreter mainloop
using highly optimized C implementations. Thus to get the same effect
as

        L2 = []
        for i in range(3):
            L2.append(L1[i])

it is much shorter and far faster to use

        L2 = list(L1[:3]) # "list" is redundant if L1 is a list.

Note that the map() function, particularly used with builtin methods or
builtin functions, can be a convenient accelerator. For example to pair
the elements of two lists together:

        >>> map(None, [1,2,3], [4,5,6])
        [(1, 4), (2, 5), (3, 6)]

or to compute a number of sines:

        >>> map( math.sin, (1,2,3,4))
        [0.841470984808, 0.909297426826, 0.14112000806, -0.756802495308]

The map operation completes very quickly in such cases.

Other examples of aggregate operations include the join, joinfields,
split, and splitfields functions of the standard string module. For
example if s1..s7 are large (10K+) strings then
string.joinfields([s1,s2,s3,s4,s5,s6,s7], "") may be far faster than
the more obvious s1+s2+s3+s4+s5+s6+s7, since the "summation" will
compute many subexpressions, whereas joinfields does all copying in one
pass. For manipulating strings also consider the regular expression
libraries and the "substitution" operations String % tuple and String %
dictionary. Also be sure to use the list.sort builtin method to do
sorting, and see questions 4.51 and 4.59 for examples of moderately
advanced usage -- list.sort beats other techniques for sorting in all
but the most extreme circumstances.

There are many other aggregate operations available in the standard
libraries and in contributed libraries and extensions.

Another common trick is to "push loops into functions or methods." For
example suppose you have a program that runs slowly and you use the
profiler (profile.run) to determine that a Python function ff is being
called lots of times. If you notice that ff

        def ff(x):
            ...do something with x computing result...
            return result

tends to be called in loops like (A)

        list = map(ff, oldlist)

or (B)

        for x in sequence:
            value = ff(x)
            ...do something with value...

then you can often eliminate function call overhead by rewriting ff to

        def ffseq(seq):
            resultseq = []
            for x in seq:
                ...do something with x computing result...
                resultseq.append(result)
            return resultseq

and rewrite (A) to

        list = ffseq(oldlist)

and (B) to

        for value in ffseq(sequence):
            ...do something with value...

Other single calls ff(x) translate to ffseq([x])[0] with little
penalty. Of course this technique is not always appropriate and there
are other variants, which you can figure out.

You can gain some performance by explicitly storing the results of a
function or method lookup into a local variable. A loop like

        for key in token:
            dict[key] = dict.get(key, 0) + 1

resolves dict.get every iteration. If the method isn't going to change,
a faster implementation is

        dict_get = dict.get  # look up the method once
        for key in token:
            dict[key] = dict_get(key, 0) + 1

Default arguments can be used to determine values once, when the
function is defined, instead of every time it is called. This can only
be done for functions or objects which will not be changed during
program execution, such as replacing

        def degree_sin(deg):
            return math.sin(deg * math.pi / 180.0)

with

        def degree_sin(deg, factor = math.pi/180.0, sin = math.sin):
            return sin(deg * factor)

Because this trick uses default arguments for terms which should not be
changed, it should only be used when you are not concerned with
presenting a possibly confusing API to your users.

For an anecdote related to optimization, see
http://www.python.org/doc/essays/list2str.html

----------------------------------------------------------------------------
4.8. When I have imported a module, then edit it, and import it again
(into the same Python process), the changes don't seem to take place.
What is going on?

For reasons of efficiency as well as consistency, Python only reads the
module file on the first time a module is imported. (Otherwise a
program consisting of many modules, each of which imports the same
basic module, would read the basic module over and over again.) To
force rereading of a changed module, do this:

        import modname
        reload(modname)

Warning: this technique is not 100% fool-proof. In particular, modules
containing statements like

        from modname import some_objects

will continue to work with the old version of the imported objects.

----------------------------------------------------------------------------
4.9. How do I find the current module name?

A module can find out its own module name by looking at the
(predefined) global variable __name__. If this has the value '__main__'
you are running as a script.

----------------------------------------------------------------------------
4.10. I have a module in which I want to execute some extra code when
it is run as a script. How do I find out whether I am running as a
script?

See the previous question. E.g. if you put the following on the last
line of your module, main() is called only when your module is running
as a script:

        if __name__ == '__main__': main()

----------------------------------------------------------------------------
4.11. I try to run a program from the Demo directory but it fails with
ImportError: No module named ...; what gives?

This is probably an optional module (written in C!) which hasn't been
configured on your system. This especially happens with modules like
"Tkinter", "stdwin", "gl", "Xt" or "Xm". For Tkinter, STDWIN and many
other modules, see Modules/Setup.in for info on how to add these
modules to your Python, if it is possible at all. Sometimes you will
have to ftp and build another package first (e.g. Tcl and Tk for
Tkinter). Sometimes the module only works on specific platforms (e.g.
gl only works on SGI machines).

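A script that can live without an optional module may guard the import
rather than fail outright. The following is only a minimal sketch of
that idea, not something the Demo programs themselves do; try_import
and describe_failure are hypothetical helper names, and the code
assumes a current Python interpreter and a plain top-level module name.

```python
import sys

def try_import(name):
    """Return the top-level module called name, or None if it is missing."""
    try:
        return __import__(name)
    except ImportError:
        return None

def describe_failure(name):
    """Build a message listing the directories on the module search path."""
    lines = ["No module named %s; module search path:" % name]
    for directory in sys.path:
        lines.append("    " + repr(directory))
    return "\n".join(lines)
```

A caller might test the result of try_import("gl") against None and
print describe_failure("gl") before falling back to something else.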
NOTE: if the complaint is about "Tkinter" (upper case T) and you have
already configured module "tkinter" (lower case t), the solution is not
to rename tkinter to Tkinter or vice versa. There is probably something
wrong with your module search path. Check out the value of sys.path.

For X-related modules (Xt and Xm) you will have to do more work: they
are currently not part of the standard Python distribution. You will
have to ftp the Extensions tar file, i.e.
ftp://ftp.python.org/pub/python/src/X-extension.tar.gz and follow the
instructions there.

See also the next question.

----------------------------------------------------------------------------
4.12. I have successfully built Python with STDWIN but it can't find
some modules (e.g. stdwinevents).

There's a subdirectory of the library directory named 'stdwin' which
should be in the default module search path. There's a line in
Modules/Setup(.in) that you have to enable for this purpose --
unfortunately in the latest release it's not near the other
STDWIN-related lines so it's easy to miss it.

----------------------------------------------------------------------------
4.13. What GUI toolkits exist for Python?

Depending on what platform(s) you are aiming at, there are several.

Currently supported solutions:

There's a neat object-oriented interface to the Tcl/Tk widget set,
called Tkinter. It is part of the standard Python distribution and
well-supported -- all you need to do is build and install Tcl/Tk and
enable the _tkinter module and the TKPATH definition in Modules/Setup
when building Python. This is probably the easiest to install and use,
and the most complete widget set. It is also very likely that in the
future the standard Python GUI API will be based on or at least look
very much like the Tkinter interface. For more info about Tk, including
pointers to the source, see the Tcl/Tk home page at
http://www.scriptics.com.
Tcl/Tk is now fully portable to the Mac and Windows platforms (NT and
95 only); you need Python 1.4beta3 or later and Tk 4.1patch1 or later.

There's an interface to X11, including the Athena and Motif widget sets
(and a few individual widgets, like Mosaic's HTML widget and SGI's GL
widget) available from
ftp://ftp.python.org/pub/python/src/X-extension.tar.gz. Support by
Sjoerd Mullender sjoerd@cwi.nl.

On top of the X11 interface there's the (recently revived) vpApp
toolkit by Per Spilling, now also maintained by Sjoerd Mullender
sjoerd@cwi.nl. See ftp://ftp.cwi.nl/pub/sjoerd/vpApp.tar.gz.

The Mac port has a rich and ever-growing set of modules that support
the native Mac toolbox calls. See the documentation that comes with the
Mac port. See ftp://ftp.python.org/pub/python/mac. Support by Jack
Jansen jack@cwi.nl.

The NT port supported by Mark Hammond MHammond@skippinet.com.au (see
question 7.2) includes an interface to the Microsoft Foundation Classes
and a Python programming environment using it that's written mostly in
Python. See ftp://ftp.python.org/pub/python/pythonwin/.

There's an object-oriented GUI based on the Microsoft Foundation
Classes model called WPY, supported by Jim Ahlstrom jim@interet.com.
Programs written in WPY run unchanged and with native look and feel on
Windows NT/95, Windows 3.1 (using win32s), and on Unix (using Tk).
Source and binaries for Windows and Linux are available in
ftp://ftp.python.org/pub/python/wpy/.

Obsolete or minority solutions:

There's an interface to wxWindows. wxWindows is a portable GUI class
library written in C++. It supports XView, Motif, MS-Windows as
targets. There is some support for Macs and CURSES as well. wxWindows
preserves the look and feel of the underlying graphics toolkit. See the
wxPython WWW page at
http://www.aiai.ed.ac.uk/~jacs/wx/wxpython/wxpython.html. Support for
wxPython (by Harri Pasanen pa@tekla.fi) appears to have a low priority.

For SGI IRIX only, there are unsupported interfaces to the complete GL
(Graphics Library -- low level but very good 3D capabilities) as well
as to FORMS (a buttons-and-sliders-etc package built on top of GL by
Mark Overmars -- ftp'able from ftp://ftp.cs.ruu.nl/pub/SGI/FORMS/).
This is probably also becoming obsolete, as OpenGL takes over.

There's an interface to STDWIN, a platform-independent low-level
windowing interface for Mac and X11. This is totally unsupported and
rapidly becoming obsolete. The STDWIN sources are at
ftp://ftp.cwi.nl/pub/stdwin/. (For info about STDWIN 2.0, please refer
to Steven Pemberton steven@cwi.nl -- I believe it is also dead.)

There is an interface to WAFE, a Tcl interface to the X11 Motif and
Athena widget sets. WAFE is at
http://www.wu-wien.ac.at/wafe/wafe.html.

(The Fresco port that was mentioned in earlier versions of this FAQ no
longer seems to exist. Inquire with Mark Linton.)

----------------------------------------------------------------------------
4.14. Are there any interfaces to database packages in Python?

There's a whole collection of them in the contrib area of the ftp
server, see http://www.python.org/ftp/python/contrib/Database/.

----------------------------------------------------------------------------
4.15. Is it possible to write obfuscated one-liners in Python?

Yes.
See the following three examples, due to Ulf Bartelt:

        # Primes < 1000
        print filter(None,map(lambda y:y*reduce(lambda x,y:x*y!=0,
        map(lambda x,y=y:y%x,range(2,int(pow(y,0.5)+1))),1),range(2,1000)))

        # First 10 Fibonacci numbers
        print map(lambda x,f=lambda x,f:(x<=1) or (f(x-1,f)+f(x-2,f)): f(x,f),
        range(10))

        # Mandelbrot set
        print (lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y,
        Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM,
        Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro,
        i=i,Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0)or (x*x+y*y
        >=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(
        64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy
        ))))(-2.1, 0.7, -1.2, 1.2, 30, 80, 24)
        #    \___ ___/  \___ ___/  |   |   |__ lines on screen
        #        V          V      |   |______ columns on screen
        #        |          |      |__________ maximum of "iterations"
        #        |          |_________________ range on y axis
        #        |____________________________ range on x axis

Don't try this at home, kids!

----------------------------------------------------------------------------
4.16. Is there an equivalent of C's "?:" ternary operator?

Not directly. In many cases you can mimic a?b:c with "a and b or c",
but there's a flaw: if b is zero (or empty, or None -- anything that
tests false) then c will be selected instead. In many cases you can
prove by looking at the code that this can't happen (e.g. because b is
a constant or has a type that can never be false), but in general this
can be a problem.

Tim Peters (who wishes it was Steve Majewski) suggested the following
solution: (a and [b] or [c])[0]. Because [b] is a singleton list it is
never false, so the wrong path is never taken; then applying [0] to the
whole thing gets the b or c that you really wanted. Ugly, but it gets
you there in the rare cases where it is really inconvenient to rewrite
your code using 'if'.

----------------------------------------------------------------------------
4.17. My class defines __del__ but it is not called when I delete the
object.

There are several possible reasons for this.

The del statement does not necessarily call __del__ -- it simply
decrements the object's reference count, and if this reaches zero
__del__ is called.

If your data structures contain circular links (e.g. a tree where each
child has a parent pointer and each parent has a list of children) the
reference counts will never go back to zero. You'll have to define an
explicit close() method which removes those pointers. Please don't ever
call __del__ directly -- __del__ should call close() and close() should
make sure that it can be called more than once for the same object.

If the object has ever been a local variable (or argument, which is
really the same thing) to a function that caught an exception in an
except clause, chances are that a reference to the object still exists
in that function's stack frame as contained in the stack trace.
Normally, deleting (better: assigning None to) sys.exc_traceback will
take care of this. If a stack was printed for an unhandled exception in
an interactive interpreter, delete sys.last_traceback instead.

There is code that deletes all objects when the interpreter exits, but
it is not called if your Python has been configured to support threads
(because other threads may still be active). You can define your own
cleanup function using sys.exitfunc (see question 4.4).

Finally, if your __del__ method raises an exception, this will be
ignored. Starting with Python 1.4beta3, a warning message is printed to
sys.stderr when this happens.

See also question 6.14 for a discussion of the possibility of adding
true garbage collection to Python.

----------------------------------------------------------------------------
4.18. How do I change the shell environment for programs called using
os.popen() or os.system()? Changing os.environ doesn't work.

You must be using either a version of Python before 1.4, or a (rare)
system that doesn't have the putenv() library function.

Before Python 1.4, modifying the environment passed to subshells was
left out of the interpreter because there seemed to be no
well-established portable way to do it (in particular, some systems
have putenv(), others have setenv(), and some have none at all).

As of Python 1.4, almost all Unix systems do have putenv(), and so does
the Win32 API, and thus the os module was modified so that changes to
os.environ are trapped and the corresponding putenv() call is made.

----------------------------------------------------------------------------
4.19. What is a class?

A class is the particular object type that is created by executing a
class statement. Class objects are used as templates, to create class
instance objects, which embody both the data structure and program
routines specific to a datatype.

----------------------------------------------------------------------------
4.20. What is a method?

A method is a function that you normally call as x.name(arguments...)
for some object x. The term is used for methods of classes and class
instances as well as for methods of built-in objects. (The latter have
a completely different implementation and only share the way their
calls look in Python code.) Methods of classes (and class instances)
are defined as functions inside the class definition.

----------------------------------------------------------------------------
4.21. What is self?

Self is merely a conventional name for the first argument of a method
-- i.e. a function defined inside a class definition. A method defined
as meth(self, a, b, c) should be called as x.meth(a, b, c) for some
instance x of the class in which the definition occurs; the called
method will think it is called as meth(x, a, b, c).

----------------------------------------------------------------------------
4.22. What is an unbound method?

An unbound method is a method defined in a class that is not yet bound
to an instance. You get an unbound method if you ask for a class
attribute that happens to be a function. You get a bound method if you
ask for an instance attribute. A bound method knows which instance it
belongs to and calling it supplies the instance automatically; an
unbound method only knows which class it wants for its first argument
(a derived class is also OK). Calling an unbound method doesn't
"magically" derive the first argument from the context -- you have to
provide it explicitly.

Trivia note regarding bound methods: each reference to a bound method
of a particular object creates a bound method object. If you have two
such references (a = inst.meth; b = inst.meth), they will compare equal
(a == b) but are not the same (a is not b).

----------------------------------------------------------------------------
4.23. How do I call a method defined in a base class from a derived
class that overrides it?

If your class definition starts with "class Derived(Base): ..." then
you can call method meth defined in Base (or one of Base's base
classes) as Base.meth(self, arguments...). Here, Base.meth is an
unbound method (see previous question).

----------------------------------------------------------------------------
4.24. How do I call a method from a base class without using the name
of the base class?

DON'T DO THIS. REALLY. I MEAN IT. It appears that you could call
self.__class__.__bases__[0].meth(self, arguments...) but this fails
when a doubly-derived class is derived from your class: for its
instances, self.__class__.__bases__[0] is your class, not its base
class -- so (assuming you are doing this from within Derived.meth) you
would start a recursive call.

Often when you want to do this you are forgetting that classes are
first class in Python. You can "point to" the class you want to
delegate an operation to either at the instance or at the subclass
level.
For example, if you want to use a "glorp" operation of a superclass you can point to the right superclass to use.

    class subclass(superclass1, superclass2, superclass3):
        delegate_glorp = superclass2
        ...
        def glorp(self, arg1, arg2):
            ... subclass specific stuff ...
            self.delegate_glorp.glorp(self, arg1, arg2)
        ...

    class subsubclass(subclass):
        delegate_glorp = superclass3
        ...

Note, however, that setting delegate_glorp to subclass in subsubclass would cause an infinite recursion on subclass.delegate_glorp. Careful! Maybe you are getting too fancy for your own good. Consider simplifying the design (?).

----------------------------------------------------------------------------
4.25. How can I organize my code to make it easier to change the base class?

You could define an alias for the base class, assign the real base class to it before your class definition, and use the alias throughout your class. Then all you have to change is the value assigned to the alias. Incidentally, this trick is also handy if you want to decide dynamically (e.g. depending on availability of resources) which base class to use. Example:

    BaseAlias = <real base class>
    class Derived(BaseAlias):
        def meth(self):
            BaseAlias.meth(self)
            ...

----------------------------------------------------------------------------
4.26. How can I find the methods or attributes of an object?

This depends on the object type.

For an instance x of a user-defined class, instance attributes are found in the dictionary x.__dict__, methods and attributes defined by its class are found in x.__class__.__dict__, and those inherited from its base classes are found in x.__class__.__bases__[i].__dict__ (for i in range(len(x.__class__.__bases__))). You'll have to walk the tree of base classes to find all class methods and attributes.

Many, but not all, built-in types define a list of their method names in x.__methods__, and if they have data attributes, their names may be found in x.__members__. However, this is only a convention.
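Walking the tree of base classes for a user-defined instance might look roughly like the sketch below. The function name attribute_names and the example classes Base and Derived are made up for illustration. Note that under later Python versions every class also inherits from object, so the result there includes object's attributes as well.

```python
def attribute_names(x):
    """Collect attribute names from an instance, its class, and all base classes."""
    names = {}
    # Attributes stored on the instance itself.
    for key in x.__dict__.keys():
        names[key] = 1
    # Visit the class and then every base class reachable from it.
    pending = [x.__class__]
    while pending:
        cls = pending.pop()
        for key in cls.__dict__.keys():
            names[key] = 1
        for base in cls.__bases__:
            pending.append(base)
    result = list(names.keys())
    result.sort()
    return result


# Hypothetical classes, used only to exercise the function.
class Base:
    def greet(self):
        return "hello"

class Derived(Base):
    def shout(self):
        return "HELLO"

d = Derived()
d.colour = "red"
found = attribute_names(d)
```

Here found contains 'colour' (from the instance), 'shout' (from Derived) and 'greet' (from Base), along with whatever else the classes define.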
For more information, read the source of the standard (but undocumented) module newdir.

----------------------------------------------------------------------------
4.27. I can't seem to use os.read() on a pipe created with os.popen().

os.read() is a low-level function which takes a file descriptor (a small integer). os.popen() creates a high-level file object -- the same type used for sys.std{in,out,err} and returned by the builtin open() function. Thus, to read n bytes from a pipe p created with os.popen(), you need to use p.read(n).

----------------------------------------------------------------------------
4.28. How can I create a stand-alone binary from a Python script?

The "freeze" tool in "Tools/freeze/" does what you want. See the README.

This works by scanning your source recursively for import statements (both forms) and looking for the modules on the standard Python path as well as in the source directory (for built-in modules). It then "compiles" the modules written in Python to C code (array initializers that can be turned into code objects using the marshal module) and creates a custom-made config file that only contains those built-in modules which are actually used in the program. It then compiles the generated C code and links it with the rest of the Python interpreter to form a self-contained binary which acts exactly like your script.

Hint: the freeze program only works if your script's filename ends in ".py".

----------------------------------------------------------------------------
4.29. What WWW tools are there for Python?

See the chapter titled "Internet and WWW" in the Library Reference Manual. There's also a web browser written in Python, called Grail -- see http://grail.cnri.reston.va.us/grail/.

----------------------------------------------------------------------------
4.30. How do I run a subprocess with pipes connected to both input and output?

Use the standard popen2 module.
For example:

    import popen2
    fromchild, tochild = popen2.popen2("command")
    tochild.write("input\n")
    tochild.flush()
    output = fromchild.readline()

Warning: in general, it is unwise to do this, because you can easily cause a deadlock where your process is blocked waiting for output from the child, while the child is blocked waiting for input from you. This can happen because the parent expects the child to output more text than it does, or because data is stuck in stdio buffers due to lack of flushing. The Python parent can of course explicitly flush the data it sends to the child before it reads any output, but if the child is a naive C program it can easily have been written to never explicitly flush its output, even if it is interactive, since flushing is normally automatic.

Note on a bug in popen2: unless your program calls wait() or waitpid(), finished child processes are never removed, and eventually calls to popen2 will fail because of a limit on the number of child processes. Calling os.waitpid with the os.WNOHANG option can prevent this; a good place to insert such a call would be before calling popen2 again.

In many cases, all you really need is to run some data through a command and get the result back. Unless the data is infinite in size, the easiest (and often the most efficient!) way to do this is to write it to a temporary file and run the command with that temporary file as input. The standard module tempfile exports a function mktemp() which generates unique temporary file names.

Note that many interactive programs (e.g. vi) don't work well with pipes substituted for standard input and output. You will have to use pseudo ttys ("ptys") instead of pipes. There is some undocumented code to use these in the library module pty.py -- I'm afraid you're on your own here.

A different answer is a Python interface to Don Libes' "expect" library.
A Python extension that interfaces to expect is called "expy" and available from ftp://ftp.python.org/pub/python/contrib/System/.

A pure Python solution that works like expect is PIPE by John Croix. A prerelease of PIPE is available from ftp://ftp.python.org/pub/python/contrib/System/.

----------------------------------------------------------------------------
4.31. How do I call a function if I have the arguments in a tuple?

Use the built-in function apply(). For instance,

    func(1, 2, 3)

is equivalent to

    args = (1, 2, 3)
    apply(func, args)

Note that func(args) is not the same -- it calls func() with exactly one argument, the tuple args, instead of three arguments, the integers 1, 2 and 3.

----------------------------------------------------------------------------
4.32. How do I enable font-lock-mode for Python in Emacs?

If you are using XEmacs 19.14 or later, any XEmacs 20, FSF Emacs 19.34 or any Emacs 20, font-lock should work automatically for you if you are using the latest python-mode.el.

If you are using an older version of XEmacs or Emacs you will need to put this in your .emacs file:

    (defun my-python-mode-hook ()
      (setq font-lock-keywords python-font-lock-keywords)
      (font-lock-mode 1))
    (add-hook 'python-mode-hook 'my-python-mode-hook)

----------------------------------------------------------------------------
4.33. Is there a scanf() or sscanf() equivalent?

Not as such. For simple input parsing, the easiest approach is usually to split the line into whitespace-delimited words using string.split(), and to convert decimal strings to numeric values using string.atoi(), string.atol() or string.atof(). (Python's atoi() is 32-bit and its atol() is arbitrary precision.) If you want to use another delimiter than whitespace, use string.splitfield() (possibly combining it with string.strip() which removes surrounding whitespace from a string).
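As a rough illustration of the split-and-convert approach, here is a minimal sketch. It assumes a later Python version in which strings have split() and strip() methods and int()/float() accept strings; these stand in for the string module functions named above. The function name parse_record and the sample input are made up for the example.

```python
def parse_record(line, delimiter=None):
    """Split a line into fields and convert each to int or float where possible."""
    fields = []
    # With delimiter=None, split() breaks the line on runs of whitespace.
    for word in line.split(delimiter):
        word = word.strip()
        try:
            value = int(word)
        except ValueError:
            try:
                value = float(word)
            except ValueError:
                value = word  # leave non-numeric fields as strings
        fields.append(value)
    return fields

record = parse_record("widget 12 3.5")
csv_record = parse_record("widget, 12, 3.5", ",")
```

Both calls produce a list holding the string 'widget', the integer 12 and the float 3.5.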
For more complicated input parsing, regular expressions (see module regex) are better suited and more powerful than C's sscanf().

There's a contributed module that emulates sscanf(), by Steve Clift; see contrib/Misc/sscanfmodule.c of the ftp site:

    http://www.python.org/ftp/python/contrib/Misc/sscanfmodule.c

----------------------------------------------------------------------------
4.34. Can I have Tk events handled while waiting for I/O?

Yes, and you don't even need threads! But you'll have to restructure your I/O code a bit.

Tk has the equivalent of Xt's XtAddInput() call, which allows you to register a callback function which will be called from the Tk mainloop when I/O is possible on a file descriptor. Here's what you need:

    from Tkinter import tkinter
    tkinter.createfilehandler(file, mask, callback)

The file may be a Python file or socket object (actually, anything with a fileno() method), or an integer file descriptor. The mask is one of the constants tkinter.READABLE or tkinter.WRITABLE. The callback is called as follows:

    callback(file, mask)

You must unregister the callback when you're done, using

    tkinter.deletefilehandler(file)

Note: since you don't know *how many bytes* are available for reading, you can't use the Python file object's read or readline methods, since these will insist on reading a predefined number of bytes. For sockets, the recv() or recvfrom() methods will work fine; for other files, use os.read(file.fileno(), maxbytecount).

----------------------------------------------------------------------------
4.35. How do I write a function with output parameters (call by reference)?

[Mark Lutz] The thing to remember is that arguments are passed by assignment in Python. Since assignment just creates references to objects, there's no alias between an argument name in the caller and callee, and so no call-by-reference per se.
But you can simulate it in a number of ways:

1) By using global variables; but you probably shouldn't :-)

2) By passing a mutable (changeable in-place) object:

    def func1(a):
        a[0] = 'new-value'     # 'a' references a mutable list
        a[1] = a[1] + 1        # changes a shared object

    args = ['old-value', 99]
    func1(args)
    print args[0], args[1]     # output: new-value 100

3) By returning a tuple, holding the final values of arguments:

    def func2(a, b):
        a = 'new-value'        # a and b are local names
        b = b + 1              # assigned to new objects
        return a, b            # return new values

    x, y = 'old-value', 99
    x, y = func2(x, y)
    print x, y                 # output: new-value 100

4) And other ideas that fall out from Python's object model. For instance, it might be clearer to pass in a mutable dictionary:

    def func3(args):
        args['a'] = 'new-value'     # args is a mutable dictionary
        args['b'] = args['b'] + 1   # change it in-place

    args = {'a': 'old-value', 'b': 99}
    func3(args)
    print args['a'], args['b']

5) Or bundle up values in a class instance:

    class callByRef:
        def __init__(self, **args):
            for (key, value) in args.items():
                setattr(self, key, value)

    def func4(args):
        args.a = 'new-value'        # args is a mutable callByRef
        args.b = args.b + 1         # change object in-place

    args = callByRef(a='old-value', b=99)
    func4(args)
    print args.a, args.b

But there's probably no good reason to get this complicated :-).

[Python's author favors solution 3 in most cases.]

----------------------------------------------------------------------------
4.36. Please explain the rules for local and global variables in Python.

[Ken Manheimer] In Python, variables that are only referenced inside a function are implicitly global. If a variable is assigned anywhere within the function's body, it is assumed to be local unless you explicitly declare it 'global'.

Though a bit surprising at first, a moment's consideration explains this. On one hand, requirement of 'global' for assigned vars provides a bar against unintended side-effects.
On the other hand, if global were required for all global references, you'd be using global all the time. E.g., you'd have to declare as global every reference to a builtin function, or to a component of an imported module. This clutter would defeat the usefulness of the 'global' declaration for identifying side-effects.

----------------------------------------------------------------------------
4.37. How can I have modules that mutually import each other?

Jim Roskind recommends the following order in each module:

First: all exports (like globals, functions, and classes that don't need imported base classes).

Then: all import statements.

Finally: all active code (including globals that are initialized from imported values).

Python's author doesn't like this approach much because the imports appear in a strange place, but has to admit that it works. His recommended strategy is to avoid all uses of "from <module> import *" (so everything from an imported module is referenced as <module>.<name>) and to place all code inside functions. Initializations of global variables and class variables should use constants or built-in functions only.

----------------------------------------------------------------------------
4.38. How do I copy an object in Python?

There is no generic copying operation built into Python; however, most object types have some way to create a clone. Here's how for the most common objects:

For immutable objects (numbers, strings, tuples), cloning is unnecessary since their value can't change.

For lists (and generally for mutable sequence types), a clone is created by the expression l[:].

For dictionaries, the following function returns a clone:

    def dictclone(o):
        n = {}
        for k in o.keys():
            n[k] = o[k]
        return n

Finally, for generic objects, the "copy" module defines two functions for copying objects. copy.copy(x) returns a copy as shown by the above rules. copy.deepcopy(x) also copies the elements of composite objects.
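The difference between the two copy functions shows up once a nested element is modified. The following is a small sketch; the variable names and the sample data are made up for the example.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)      # new outer list, same inner lists
deep = copy.deepcopy(original)     # new outer list and new inner lists

# Change an inner list through the original.
original[0].append(99)
```

After the append, shallow[0] also shows the extra element because it is the same inner list object, while deep[0] is unaffected.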
See the section on this module in the Library Reference Manual.

----------------------------------------------------------------------------
4.39. How to implement persistent objects in Python? (Persistent == automatically saved to and restored from disk.)

The library module "pickle" now solves this in a very general way (though you still can't store things like open files, sockets or windows), and the library module "shelve" uses pickle and (g)dbm to create persistent mappings containing arbitrary Python objects. For possibly better performance also look for the latest version of the relatively recent cPickle module.

A more awkward way of doing things is to use pickle's little sister, marshal. The marshal module provides very fast ways to store noncircular basic Python types to files and strings, and back again. Although marshal does not do fancy things like store instances or handle shared references properly, it does run extremely fast. For example, loading a half megabyte of data may take less than a third of a second (on some machines). This often beats doing something more complex and general such as using gdbm with pickle/shelve.

----------------------------------------------------------------------------
4.40. I try to use __spam and I get an error about _SomeClassName__spam.

Variables with double leading underscore are "mangled" to provide a simple but effective way to define class private variables. See the chapter "New in Release 1.4" in the Python Tutorial.

----------------------------------------------------------------------------
4.41. How do I delete a file? And other file questions.

Use os.remove(filename) or os.unlink(filename); for documentation, see the posix section of the library manual. They are the same; unlink() is simply the Unix name for this function. In earlier versions of Python, only os.unlink() was available.

To remove a directory, use os.rmdir(); use os.mkdir() to create one.

To rename a file, use os.rename().
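The directory and file calls above can be tried out in a scratch directory, roughly as sketched here. The file and directory names are made up, and tempfile.mkdtemp() is an assumption: it comes from later Python versions and is used only to obtain a safe place to experiment.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()               # scratch directory for the demonstration
subdir = os.path.join(workdir, "reports")
os.mkdir(subdir)                           # create a directory

old_name = os.path.join(subdir, "draft.txt")
new_name = os.path.join(subdir, "final.txt")
f = open(old_name, "w")
f.write("some text\n")
f.close()

os.rename(old_name, new_name)              # rename the file
renamed = os.path.exists(new_name) and not os.path.exists(old_name)

os.remove(new_name)                        # delete the file
os.rmdir(subdir)                           # remove the now-empty directory
os.rmdir(workdir)
cleaned_up = not os.path.exists(workdir)
```

Note that os.rmdir() only removes empty directories, which is why the file is deleted first.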
To truncate a file, open it using f = open(filename, "r+"), and use f.truncate(offset); offset defaults to the current seek position. (The "r+" mode opens the file for reading and writing.) There's also os.ftruncate(fd, offset) for files opened with os.open() -- for advanced Unix hacks only.

----------------------------------------------------------------------------
4.42. How to modify urllib or httplib to support HTTP/1.1?

Apply the following patch to the vanilla Python 1.4 httplib.py:

    41c41
    < replypat = regsub.gsub('\\.', '\\\\.', HTTP_VERSION) + \
    ---
    > replypat = regsub.gsub('\\.', '\\\\.', 'HTTP/1.[0-9]+') + \

----------------------------------------------------------------------------
4.43. Inexplicable syntax errors in compile() or exec.

When a statement suite (as opposed to an expression) is compiled by compile(), exec or execfile(), it must end in a newline. In some cases, when the source ends in an indented block it appears that at least two newlines are required.

----------------------------------------------------------------------------
4.44. How do I convert a string to a number?

For integers, use the built-in int() function, e.g. int('144') == 144. Similarly, long() converts from string to long integer, e.g. long('144') == 144L; and float() to floating-point, e.g. float('144') == 144.0.

Note that these are restricted to decimal interpretation, so that int('0144') == 144 and int('0x144') raises ValueError.

For greater flexibility, or before Python 1.5, import the module string and use the string.atoi() function for integers, string.atol() for long integers, or string.atof() for floating-point. E.g., string.atoi('100', 16) == string.atoi('0x100', 0) == 256. See the library reference manual section for the string module for more details.

While you could use the built-in function eval() instead of any of those, this is not recommended, because someone could pass you a Python expression that might have unwanted side effects (like reformatting your disk).
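One way to accept numeric input without resorting to eval() is to try the conversions in turn and reject anything else. This is only a sketch; the function name to_number and the sample strings are made up for the example.

```python
def to_number(text):
    """Convert a decimal string to an int or float; return None if it is neither."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None

whole = to_number("144")
fraction = to_number("144.5")
rejected = to_number("1 + 1")    # an expression, not a number: never evaluated
```

Because int() and float() only parse numbers, a string holding an arbitrary expression is simply rejected rather than executed.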
----------------------------------------------------------------------------
4.45. How do I convert a number to a string?

To convert, e.g., the number 144 to the string '144', use the built-in function repr() or the backquote notation (these are equivalent). If you want a hexadecimal or octal representation, use the built-in functions hex() or oct(), respectively. For fancy formatting, use the % operator on strings, just like C printf formats, e.g. "%04d" % 144 yields '0144' and "%.3f" % (1/3.0) yields '0.333'. See the library reference manual for details.

----------------------------------------------------------------------------
4.46. How do I copy a file?

Most of the time this will do:

    infile = open("file.in", "rb")
    outfile = open("file.out", "wb")
    outfile.write(infile.read())

However for huge files you may want to do the reads/writes in pieces (or you may have to), and if you dig deeper you may find other technical problems.

Unfortunately, there's no totally platform independent answer. On Unix, you can use os.system() to invoke the "cp" command (see your Unix manual for how it's invoked). On DOS or Windows, use os.system() to invoke the "COPY" command. On the Mac, use macostools.copy(srcpath, dstpath). It will also copy the resource fork and Finder info.

There's also the shutil module which contains a copyfile() function that implements the copy loop; but in Python 1.4 and earlier it opens files in text mode, and even in Python 1.5 it still isn't good enough for the Macintosh: it doesn't copy the resource fork and Finder info.

----------------------------------------------------------------------------
4.47. How do I check if an object is an instance of a given class or of a subclass of it?

If you are developing the classes from scratch it might be better to program in a more proper object-oriented style -- instead of doing a different thing based on class membership, why not use a method and define the method differently in different classes?
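As a small illustration of that style, the sketch below gives each class its own version of the same method, so the calling code never has to ask which class an object belongs to. The classes Circle and Square and the function total_area are made up for the example.

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return 3.14159 * self.radius * self.radius

class Square:
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side * self.side

def total_area(shapes):
    total = 0
    for shape in shapes:
        # Each object supplies its own area(); no class test is needed.
        total = total + shape.area()
    return total

result = total_area([Square(2), Square(3)])
```

Adding a new kind of shape then only requires a new class with an area() method; total_area() stays unchanged.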
However, there are some legitimate situations where you need to test for class membership.

In Python 1.5, you can use the built-in function isinstance(obj, cls).

The following approaches can be used with earlier Python versions:

An unobvious method is to raise the object as an exception and to try to catch the exception with the class you're testing for:

    def is_instance_of(the_instance, the_class):
        try:
            raise the_instance
        except the_class:
            return 1
        except:
            return 0

This technique can be used to distinguish "subclassness" from a collection of classes as well:

    try:
        raise the_instance
    except Audible:
        the_instance.play(largo)
    except Visual:
        the_instance.display(gaudy)
    except Olfactory:
        sniff(the_instance)
    except:
        raise ValueError, "dunno what to do with this!"

This uses the fact that exception catching tests for class or subclass membership.

A different approach is to test for the presence of a class attribute that is presumably unique for the given class. For instance:

    class MyClass:
        ThisIsMyClass = 1
        ...

    def is_a_MyClass(the_instance):
        return hasattr(the_instance, 'ThisIsMyClass')

This version is easier to inline, and probably faster (inlined it is definitely faster). The disadvantage is that someone else could cheat:

    class IntruderClass:
        ThisIsMyClass = 1    # Masquerade as MyClass
        ...

but this may be seen as a feature (anyway, there are plenty of other ways to cheat in Python). Another disadvantage is that the class must be prepared for the membership test. If you do not "control the source code" for the class it may not be advisable to modify the class to support testability.

----------------------------------------------------------------------------
4.48. What is delegation?

Delegation refers to an object oriented technique Python programmers may implement with particular ease.
Consider the following:

    from string import upper

    class UpperOut:
        def __init__(self, outfile):
            self.__outfile = outfile
        def write(self, str):
            self.__outfile.write( upper(str) )
        def __getattr__(self, name):
            return getattr(self.__outfile, name)

Here the UpperOut class redefines the write method to convert the argument string to upper case before calling the underlying self.__outfile.write method, but all other methods are delegated to the underlying self.__outfile object. The delegation is accomplished via the "magic" __getattr__ method. Please see the language reference for more information on the use of this method.

Note that for more general cases delegation can get trickier. In particular, when attributes must be set as well as gotten, the class must define a __setattr__ method too, and it must do so carefully. The basic implementation of __setattr__ is roughly equivalent to the following:

    class X:
        ...
        def __setattr__(self, name, value):
            self.__dict__[name] = value
        ...

Most __setattr__ implementations must modify self.__dict__ to store local state for self without causing an infinite recursion.

----------------------------------------------------------------------------
4.49. How do I test a Python program or component.

First, it helps to write the program so that it may be easily tested by using good modular design. In particular your program should have almost all functionality encapsulated in either functions or class methods -- and this sometimes has the surprising and delightful effect of making the program run faster (because local variable accesses are faster than global accesses). Furthermore the program should avoid depending on mutating global variables, since this makes testing much more difficult to do.

The "global main logic" of your program may be as simple as

    if __name__=="__main__": main_logic()

at the bottom of the main module of your program.
Once your program is organized as a tractable collection of functions and class behaviours you should write test functions that exercise the behaviours. A test suite can be associated with each module which automates a sequence of tests. This sounds like a lot of work, but since Python is so terse and flexible it's surprisingly easy. You can make coding much more pleasant and fun by writing your test functions in parallel with the "production code", since this makes it easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may include a "test script interpretation" which invokes a self test of the module.

    if __name__ == "__main__":
        self_test()

Even programs that interact with complex external interfaces may be tested when the external interfaces are unavailable by using "fake" interfaces implemented in Python. For an example of a "fake" interface, the following class defines (part of) a "fake" file interface:

    import string
    testdata = "just a random sequence of characters"

    class FakeInputFile:
        data = testdata
        position = 0
        closed = 0

        def read(self, n=None):
            self.testclosed()
            p = self.position
            if n is None:
                result= self.data[p:]
            else:
                result= self.data[p: p+n]
            self.position = p + len(result)
            return result

        def seek(self, n, m=0):
            self.testclosed()
            last = len(self.data)
            p = self.position
            if m==0:
                final=n
            elif m==1:
                final=n+p
            elif m==2:
                final=len(self.data)+n
            else:
                raise ValueError, "bad m"
            if final<0:
                raise IOError, "negative seek"
            self.position = final

        def isatty(self):
            return 0

        def tell(self):
            return self.position

        def close(self):
            self.closed = 1

        def testclosed(self):
            if self.closed:
                raise IOError, "file closed"

Try f=FakeInputFile() and test out its operations.

----------------------------------------------------------------------------
4.50. My multidimensional list (array) is broken! What gives?

You probably tried to make a multidimensional array like this.
    A = [[None] * 2] * 3

This makes a list containing 3 references to the same list of length two. Changes to one row will show in all rows, which is probably not what you want. The following works much better:

    A = [None]*3
    for i in range(3):
        A[i] = [None] * 2

This generates a list containing 3 different lists of length two.

If you feel weird, you can also do it in the following way:

    w, h = 2, 3
    A = map(lambda i,w=w: [None] * w, range(h))

----------------------------------------------------------------------------
4.51. I want to do a complicated sort: can you do a Schwartzian Transform in Python?

Yes, and in Python you only have to write it once:

    def st(List, Metric):
        def pairing(element, M = Metric):
            return (M(element), element)
        paired = map(pairing, List)
        paired.sort()
        return map(stripit, paired)

    def stripit(pair):
        return pair[1]

This technique, attributed to Randal Schwartz, sorts the elements of a list by a metric which maps each element to its "sort value". For example, if L is a list of strings then

    import string
    Usorted = st(L, string.upper)

    def intfield(s):
        return string.atoi( string.strip(s[10:15] ) )

    Isorted = st(L, intfield)

Usorted gives the elements of L sorted as if they were upper case, and Isorted gives the elements of L sorted by the integer values that appear in the string slices starting at position 10 and ending just before position 15.

Note that Isorted may also be computed by

    def Icmp(s1, s2):
        return cmp( intfield(s1), intfield(s2) )

    Isorted = L[:]
    Isorted.sort(Icmp)

but since this method computes intfield many times for each element of L, it is slower than the Schwartzian Transform.

----------------------------------------------------------------------------
4.52. How to convert between tuples and lists?

The function tuple(seq) converts any sequence into a tuple with the same items in the same order. For example, tuple([1, 2, 3]) yields (1, 2, 3) and tuple('abc') yields ('a', 'b', 'c').
If the argument is a tuple, it does not make a copy but returns the same object, so it is cheap to call tuple() when you aren't sure that an object is already a tuple.

The function list(seq) converts any sequence into a list with the same items in the same order. For example, list((1, 2, 3)) yields [1, 2, 3] and list('abc') yields ['a', 'b', 'c']. If the argument is a list, it makes a copy just like seq[:] would.

----------------------------------------------------------------------------
4.53. Files retrieved with urllib contain leading garbage that looks like email headers.

The server is using HTTP/1.1; the vanilla httplib in Python 1.4 only recognizes HTTP/1.0. See question 4.42 for a patch.

----------------------------------------------------------------------------
4.54. How do I get a list of all instances of a given class?

Python does not keep track of all instances of a class (or of a built-in type).

You can program the class's constructor to keep track of all instances, but unless you're very clever, this has the disadvantage that the instances never get deleted, because your list of all instances keeps a reference to them.

(The trick is to regularly inspect the reference counts of the instances you've retained, and if the reference count is below a certain level, remove it from the list. Determining that level is tricky -- it's definitely larger than 1.)

----------------------------------------------------------------------------
4.55. A regular expression fails with regex.error: match failure.

This is usually caused by too much backtracking; the regular expression engine has a fixed size stack which holds at most 4000 backtrack points. Every character matched by e.g. ".*" accounts for a backtrack point, so even a simple search like

    regex.match('.*x',"x"*5000)

will fail.

This is fixed in the re module introduced with Python 1.5; consult the Library Reference section on re for more information.
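For comparison, the same search written against the re module might look like the sketch below. It assumes a Python version where re is available; the variable names are made up for the example. Note that re.match() returns a match object (or None) rather than a length.

```python
import re

text = "x" * 5000
match = re.match(".*x", text)    # the search that overflowed the old regex engine
matched_length = match.end()     # how far the match extends into the text
```

Here the pattern matches the whole string, so match is a match object rather than None.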
----------------------------------------------------------------------------
4.56. I can't get signal handlers to work.

The most common problem is that the signal handler is declared with the wrong argument list. It is called as

    handler(signum, frame)

so it should be declared with two arguments:

    def handler(signum, frame):
        ...

----------------------------------------------------------------------------
4.57. I can't use a global variable in a function? Help!

Did you do something like this?

    x = 1 # make a global

    def f():
        print x # try to print the global
        ...
        for j in range(100):
            if q>3:
                x=4

If you did, all references to x in f are local, not global, by virtue of the "x=4" assignment. Any variable assigned in a function is local to that function unless it is declared global. Consequently the "print x" attempts to print an uninitialized local variable and will trigger a NameError.

----------------------------------------------------------------------------
4.58. What's a negative index? Why doesn't list.insert() use them?

Python sequences are indexed with positive numbers and negative numbers. For positive numbers 0 is the first index, 1 is the second index and so forth. For negative indices -1 is the last index and -2 is the penultimate (next to last) index and so forth. Think of seq[-n] as the same as seq[len(seq)-n].

Using negative indices can be very convenient. For example, if the string Line ends in a newline then Line[:-1] is all of Line except the newline.

Sadly the list builtin method L.insert does not observe negative indices. This feature could be considered a mistake but since existing programs depend on this feature it may stay around forever. L.insert for negative indices inserts at the start of the list. To get "proper" negative index behaviour use L[n:n] = [x] in place of the insert method.

----------------------------------------------------------------------------
4.59. How can I sort one list by values from another list?

You can sort lists of tuples.
    >>> list1 = ["what", "I'm", "sorting", "by"]
    >>> list2 = ["something", "else", "to", "sort"]
    >>> pairs = map(None, list1, list2)
    >>> pairs
    [('what', 'something'), ("I'm", 'else'), ('sorting', 'to'), ('by', 'sort')]
    >>> pairs.sort()
    >>> pairs
    [("I'm", 'else'), ('by', 'sort'), ('sorting', 'to'), ('what', 'something')]
    >>> result = pairs[:]
    >>> for i in xrange(len(result)): result[i] = result[i][1]
    ...
    >>> result
    ['else', 'sort', 'to', 'something']

And if you didn't understand the question, please see the example above ;c). Note that "I'm" sorts before "by" because uppercase "I" comes before lowercase "b" in the ASCII order. Also see 4.51.

----------------------------------------------------------------------------
4.60. Why doesn't dir() work on builtin types like files and lists?

It should have -- and it does starting with Python 1.5 (currently in development -- see Questions 1.13 and 2.10).

Using 1.4, you can find out which methods a given object supports by looking at its __methods__ attribute:

    >>> List = []
    >>> List.__methods__
    ['append', 'count', 'index', 'insert', 'remove', 'reverse', 'sort']

----------------------------------------------------------------------------
4.61. How can I mimic CGI form submission (METHOD=POST)?

I would like to retrieve web pages that are the result of POSTing a form. Is there existing code that would let me do this easily?

Yes. Here's a simple example that uses httplib.

    #!/usr/local/bin/python

    import httplib, sys, time

    ### build the query string
    qs = "First=Josephine&MI=Q&Last=Public"

    ### connect and send the server a path
    httpobj = httplib.HTTP('www.some-server.out-there', 80)
    httpobj.putrequest('POST', '/cgi-bin/some-cgi-script')
    ### now generate the rest of the HTTP headers...
    httpobj.putheader('Accept', '*/*')
    httpobj.putheader('Connection', 'Keep-Alive')
    httpobj.putheader('Content-type', 'application/x-www-form-urlencoded')
    httpobj.putheader('Content-length', '%d' % len(qs))
    httpobj.endheaders()
    httpobj.send(qs)
    ### find out what the server said in response...
    reply, msg, hdrs = httpobj.getreply()
    if reply != 200:
        sys.stdout.write(httpobj.getfile().read())

Note that in general for "url encoded posts" (the default) query strings
must be "quoted" to, for example, change equals signs and spaces to an
encoded form when they occur in name or value. Use urllib.quote to perform
this quoting. For example, to send name="Guy Steele, Jr.":

    >>> from urllib import quote
    >>> x = quote("Guy Steele, Jr.")
    >>> x
    'Guy%20Steele,%20Jr.'
    >>> query_string = "name="+x
    >>> query_string
    'name=Guy%20Steele,%20Jr.'

----------------------------------------------------------------------------

4.62. If my program crashes with a bsddb (or anydbm) database open, it gets
corrupted. How come?

Databases opened for write access with the bsddb module (and often by the
anydbm module, since it will preferentially use bsddb) must explicitly be
closed using the close method of the database. The underlying libdb package
caches database contents which need to be converted to on-disk form and
written, unlike regular open files which already have the on-disk bits in
the kernel's write buffer, where they can just be dumped by the kernel when
the program exits.

If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file
and encounter an exception the next time the file is opened.

----------------------------------------------------------------------------

4.63. How do I make a Python script executable on Unix?

You need to do two things: the script file's mode must be executable
(include the 'x' bit), and the first line must begin with #! followed by
the pathname for the Python interpreter.
The first is done by executing 'chmod +x scriptfile' or perhaps 'chmod 755
scriptfile'.

The second can be done in a number of ways. The most straightforward way is
to write

    #!/usr/local/bin/python

as the very first line of your file - or whatever the pathname is where the
python interpreter is installed on your platform.

If you would like the script to be independent of where the python
interpreter lives, you can use the "env" program. On almost all platforms,
the following will work, assuming the python interpreter is in a directory
on the user's $PATH:

    #! /usr/bin/env python

Note -- *don't* do this for CGI scripts. The $PATH variable for CGI scripts
is often very minimal, so you need to use the actual absolute pathname of
the interpreter.

Occasionally, a user's environment is so full that the /usr/bin/env program
fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky):

    #! /bin/sh
    """:"
    exec python $0 ${1+"$@"}
    """

The disadvantage is that this defines the script's __doc__ string. However,
you can fix that by adding

    __doc__ = """...Whatever..."""

----------------------------------------------------------------------------

4.64. How do you remove duplicates from a list?

Generally, if you don't mind reordering the List:

    if List:
        List.sort()
        last = List[-1]
        for i in range(len(List)-2, -1, -1):
            if last==List[i]: del List[i]
            else: last=List[i]

If all elements of the list may be used as dictionary keys (i.e., they are
all hashable) this is often faster:

    d = {}
    for x in List: d[x]=x
    List = d.values()

Also, for extremely large lists you might consider more optimal
alternatives to the first one. The second one is pretty good whenever it
can be used.

----------------------------------------------------------------------------

4.65. Are there any known year 2000 problems in Python?

I am not aware of year 2000 deficiencies in Python 1.5.
Python does very few date calculations and for what it does, it relies on
the C library functions. Python generally represents times either as
seconds since 1970 or as a tuple (year, month, day, ...) where the year is
expressed with four digits, which makes Y2K bugs unlikely. So as long as
your C library is okay, Python should be okay. Of course, I cannot vouch
for your Python code!

Given the nature of freely available software, I have to add that this
statement is not legally binding. The Python copyright notice contains the
following disclaimer:

    STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH
    REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF
    MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH
    CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
    DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
    PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
    ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
    THIS SOFTWARE.

The good news is that if you encounter a problem, you have full source
available to track it down and fix it!

----------------------------------------------------------------------------

4.66. I want a version of map that applies a method to a sequence of
objects! Help!

Get fancy!

    def method_map(objects, method, arguments):
        """method_map([a,b], "flog", (1,2)) gives [a.flog(1,2), b.flog(1,2)]"""
        nobjects = len(objects)
        methods = map(getattr, objects, [method]*nobjects)
        return map(apply, methods, [arguments]*nobjects)

It's generally a good idea to get to know the mysteries of map and apply
and getattr and the other dynamic features of Python.

----------------------------------------------------------------------------

4.67. How do I generate random numbers in Python?

The standard library module "whrandom" implements a random number
generator.
Usage is simple:

    import whrandom

    whrandom.random()

This returns a random floating point number in the range [0, 1).

There are also other specialized generators in this module:

    randint(a, b) chooses an integer in the range [a, b] (both ends included)
    choice(S) chooses from a given sequence
    uniform(a, b) chooses a floating point number in the range [a, b)

To force the random number generator's initial setting, use

    seed(x, y, z) set the seed from three integers in [1, 256)

There's also a class, whrandom, which you can instantiate to create
multiple independent random number generators.

The module "random" contains functions that approximate various standard
distributions.

All this is documented in the library reference manual. Note that the
module "rand" is obsolete.

----------------------------------------------------------------------------

4.68. How do I access the serial (RS232) port?

There's a Windows serial communication module (for communication over RS
232 serial ports) at

    http://www.python.org/ftp/python/contrib/System/siomodule.README
    http://www.python.org/ftp/python/contrib/System/siomodule.zip

For DOS, try Hans Nowak's Python-DX, which supports this, at:

    http://www.cuci.nl/~hnowak/

For Unix, search Deja News (using http://www.python.org/search/) for
"serial port" with author Mitch Chapman (his post is a little too long to
include here).

----------------------------------------------------------------------------

4.69. Images on Tk-Buttons don't work in Py15?

They do work, but you must keep your own reference to the image object now.
More verbosely, you must make sure that, say, a global variable or a class
attribute refers to the object.

Quoting Fredrik Lundh from the mailing list:

    Well, the Tk button widget keeps a reference to the internal photoimage
    object, but Tkinter does not. So when the last Python reference goes
    away, Tkinter tells Tk to release the photoimage. But since the image
    is in use by a widget, Tk doesn't destroy it. Not completely.
    It just blanks the image, making it completely transparent...

    And yes, there was a bug in the keyword argument handling in 1.4 that
    kept an extra reference around in some cases. And when Guido fixed that
    bug in 1.5, he broke quite a few Tkinter programs...

----------------------------------------------------------------------------

4.70. Where is the math.py (socket.py, regex.py, etc.) source file?

If you can't find a source file for a module, it may be a builtin or
dynamically loaded module implemented in C, C++ or another compiled
language. In this case you may not have the source file, or it may be
something like mathmodule.c, somewhere in a C source directory (not on the
Python Path).

Fredrik Lundh (fredrik@pythonware.com) explains (on the python-list):

There are (at least) three kinds of modules in Python:

    1) modules written in Python (.py);
    2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl,
       etc);
    3) modules written in C and linked with the interpreter; to get a list
       of these, type:

        import sys
        print sys.builtin_module_names

----------------------------------------------------------------------------

4.71. How do I send mail from a Python script?

On Unix, it's very simple, using sendmail. The location of the sendmail
program varies between systems; sometimes it is /usr/lib/sendmail,
sometimes /usr/sbin/sendmail. The sendmail manual page will help you out.
Here's some sample code:

    SENDMAIL = "/usr/sbin/sendmail" # sendmail location
    import os
    p = os.popen("%s -t" % SENDMAIL, "w")
    p.write("To: cary@ratatosk.org\n")
    p.write("Subject: test\n")
    p.write("\n") # blank line separating headers from body
    p.write("Some text\n")
    p.write("some more text\n")
    sts = p.close()
    if sts != 0:
        print "Sendmail exit status", sts

On non-Unix systems (and on Unix systems too, of course!), you can use SMTP
to send mail to a nearby mail server. A library for SMTP (smtplib.py) is
included in Python 1.5.1; in 1.5.2 it will be documented and extended.
Here's a very simple interactive mail sender that uses it:

    import sys, smtplib, string

    fromaddr = raw_input("From: ")
    toaddrs = string.splitfields(raw_input("To: "), ',')
    print "Enter message, end with ^D:"
    msg = ''
    while 1:
        line = sys.stdin.readline()
        if not line:
            break
        msg = msg + line

    # The actual mail send
    server = smtplib.SMTP('localhost')
    server.sendmail(fromaddr, toaddrs, msg)
    server.quit()

This method will work on any host that supports an SMTP listener;
otherwise, you will have to ask the user for a host.

----------------------------------------------------------------------------

4.72. How do I avoid blocking in connect() of a socket?

The select module is widely known to help with asynchronous I/O on sockets
once they are connected. However, it is less than common knowledge how to
avoid blocking on the initial connect() call. Jeremy Hylton has the
following advice (slightly edited):

To prevent the TCP connect from blocking, you can set the socket to
non-blocking mode. Then when you do the connect(), you will either connect
immediately (unlikely) or get an exception that contains the errno.
errno.EINPROGRESS indicates that the connection is in progress, but hasn't
finished yet. Different OSes will return different errnos, so you're going
to have to check. I can tell you that different versions of Solaris return
different errno values.

In Python 1.5 and later, you can use connect_ex() to avoid creating an
exception. It will just return the errno value.

To poll, you can call connect_ex() again later -- 0 or errno.EISCONN
indicate that you're connected -- or you can pass this socket to select
(checking to see if it is writeable).

----------------------------------------------------------------------------

4.73. How do I specify hexadecimal and octal integers?

To specify an octal integer, precede the octal value with a zero.
For example, to set the variable "a" to the octal value "10" (8 in
decimal), type:

    >>> a = 010

To verify that this works, you can type "a" and hit enter while in the
interpreter, which will cause Python to spit out the current value of "a"
in decimal:

    >>> a
    8

Hexadecimal is just as easy. Simply precede the hexadecimal number with a
zero, and then a lower or uppercase "x". Hexadecimal digits can be
specified in lower or uppercase. For example, in the Python interpreter:

    >>> a = 0xa5
    >>> a
    165
    >>> b = 0XB2
    >>> b
    178

----------------------------------------------------------------------------

4.74. How to get a single keypress at a time?

For Windows, see question 8.2. Here is an answer for Unix.

There are several solutions; some involve using curses, which is a pretty
big thing to learn. Here's a solution without curses, due to Andrew
Kuchling (adapted from code to do a PGP-style randomness pool):

    import termios, TERMIOS, sys, os
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    new = termios.tcgetattr(fd)
    new[3] = new[3] & ~TERMIOS.ICANON & ~TERMIOS.ECHO
    new[6][TERMIOS.VMIN] = 1
    new[6][TERMIOS.VTIME] = 0
    termios.tcsetattr(fd, TERMIOS.TCSANOW, new)
    s = ''    # We'll save the characters typed and add them to the pool.
    try:
        while 1:
            c = os.read(fd, 1)
            print "Got character", `c`
            s = s+c
    finally:
        termios.tcsetattr(fd, TERMIOS.TCSAFLUSH, old)

You need the termios module for any of this to work, and I've only tried it
on Linux, though it should work elsewhere. It turns off stdin's echoing and
disables canonical mode, and then reads a character at a time from stdin,
noting the time after each keystroke.

----------------------------------------------------------------------------

4.75. How can I overload constructors (or methods) in Python?

(This actually applies to all methods, but somehow the question usually
comes up first in the context of constructors.)
Where in C++ you'd write

    class C {
        C() { cout << "No arguments\n"; }
        C(int i) { cout << "Argument is " << i << "\n"; }
    };

in Python you have to write a single constructor that catches all cases
using default arguments. For example:

    class C:
        def __init__(self, i=None):
            if i is None:
                print "No arguments"
            else:
                print "Argument is", i

This is not entirely equivalent, but close enough in practice.

You could also try a variable-length argument list, e.g.

    def __init__(self, *args):
        ....

The same approach works for all method definitions.

----------------------------------------------------------------------------

4.76. How do I pass keyword arguments from one method to another?

Use apply. For example:

    class Account:
        def __init__(self, **kw):
            self.accountType = kw.get('accountType')
            self.balance = kw.get('balance')

    class CheckingAccount(Account):
        def __init__(self, **kw):
            kw['accountType'] = 'checking'
            apply(Account.__init__, (self,), kw)

    myAccount = CheckingAccount(balance=100.00)

----------------------------------------------------------------------------

4.77. What module should I use to help with generating HTML?

Check out HTMLgen written by Robin Friedrich. It's a class library of
objects corresponding to all the HTML 3.2 markup tags. It's used when you
are writing in Python and wish to synthesize HTML pages for generating a
web or for CGI forms, etc.

It can be found in the FTP contrib area on python.org or on the Starship.
Use the search engines there to locate the latest version.

It might also be useful to consider DocumentTemplate, which offers clear
separation between Python code and HTML code. DocumentTemplate is part of
the Bobo objects publishing system (http://www.digicool.com/releases) but
can be used independently of course!

----------------------------------------------------------------------------

4.78. How do I create documentation from doc strings?

Use gendoc, by Daniel Larson.
See

    http://starship.skyport.net/crew/danilo/

It can create HTML from the doc strings in your Python source code.

----------------------------------------------------------------------------

4.79. How do I read (or write) binary data?

For complex data formats, it's best to use the struct module. It's
documented in the library reference. It allows you to take a string read
from a file containing binary data (usually numbers) and convert it to
Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte
integer in big-endian format from a file:

    import struct

    f = open(filename, "rb") # Open in binary mode for portability
    s = f.read(8)
    x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads
one "short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes)
from the string.

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the array module, also documented in the library
reference.

----------------------------------------------------------------------------

4.80. I can't get key bindings to work in Tkinter

An oft-heard complaint is that event handlers bound to events with the
bind() method don't get handled even when the appropriate key is pressed.

The most common cause is that the widget to which the binding applies
doesn't have "keyboard focus". Check out the Tk documentation for the focus
command. Usually a widget is given the keyboard focus by clicking in it
(but not for labels; see the takefocus option).

----------------------------------------------------------------------------

4.81. "import crypt" fails

[Unix]

Starting with Python 1.5, the crypt module is disabled by default. In order
to enable it, you must go into the Python source tree and edit the file
Modules/Setup to enable it (remove a '#' sign in front of the line starting
with '#crypt'). Then rebuild.
You may also have to add the string '-lcrypt' to that same line.

----------------------------------------------------------------------------

4.82. Are there coding standards or a style guide for Python programs?

Yes, Guido has written the "Python Style Guide". See

    http://www.python.org/doc/essays/styleguide.html

----------------------------------------------------------------------------

4.83. How do I freeze Tkinter applications?

Freeze is a tool to create stand-alone applications (see 4.28).

When freezing Tkinter applications, the applications will not be truly
stand-alone, as the application will still need the tcl and tk libraries.

One solution is to ship the application with the tcl and tk libraries, and
point to them at run-time using the TCL_LIBRARY and TK_LIBRARY environment
variables.

To get truly stand-alone applications, the Tcl scripts that form the
library have to be integrated into the application as well. One tool
supporting that is SAM (stand-alone modules), which is part of the Tix
distribution (http://tix.mne.com). Build Tix with SAM enabled, perform the
appropriate call to Tclsam_init etc inside Python's Modules/tkappinit.c,
and link with libtclsam and libtksam (you might include the Tix libraries
as well).

----------------------------------------------------------------------------

4.84. How do I create static class data and static class methods?

[Tim Peters, tim_one@email.msn.com]

Static data (in the sense of C++ or Java) is easy; static methods (again in
the sense of C++ or Java) are not supported directly.

STATIC DATA

For example,

    class C:
        count = 0   # number of times C.__init__ called

        def __init__(self):
            C.count = C.count + 1

        def getcount(self):
            return C.count  # or return self.count

c.count also refers to C.count for any c such that isinstance(c, C) holds,
unless overridden by c itself or by some class on the base-class search
path from c.__class__ back to C.
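A small sketch of that lookup rule, written for current Python 3 (an
assumption on my part, since this FAQ targets Python 1.5; the class name
Counter is hypothetical). Instances see the class attribute until something
assigns the same name on the instance itself:

```python
class Counter:
    count = 0  # class-level ("static") data shared by all instances

    def __init__(self):
        # Rebind through the class so every instance sees the update.
        Counter.count = Counter.count + 1

a = Counter()
b = Counter()
shared_before = (a.count, b.count, Counter.count)

# Assigning through an instance creates a separate instance attribute
# that hides the class attribute for that one object only.
a.count = 42
after_shadow = (a.count, b.count, Counter.count)

print(shared_before)
print(after_shadow)
```

After the assignment, only a reports 42; b and the class itself still
report 2.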
Caution: within a method of C,

    self.count = 42

creates a new and unrelated instance variable named "count" in self's own
dict. So rebinding of a class-static data name needs the

    C.count = 314

form whether inside a method or not.

STATIC METHODS

Static methods (as opposed to static data) are unnatural in Python, because

    C.getcount

returns an unbound method object, which can't be invoked without supplying
an instance of C as the first argument.

The intended way to get the effect of a static method is via a module-level
function:

    def getcount():
        return C.count

If your code is structured so as to define one class (or tightly related
class hierarchy) per module, this supplies the desired encapsulation.

Several tortured schemes for faking static methods can be found by
searching DejaNews. Most people feel such cures are worse than the disease.
Perhaps the least obnoxious is due to Pekka Pessi (mailto:ppessi@hut.fi):

    # helper class to disguise function objects
    class _static:
        def __init__(self, f):
            self.__call__ = f

    class C:
        count = 0

        def __init__(self):
            C.count = C.count + 1

        def getcount():
            return C.count
        getcount = _static(getcount)

        def sum(x, y):
            return x + y
        sum = _static(sum)

    C(); C()
    c = C()
    print C.getcount()  # prints 3
    print c.getcount()  # prints 3
    print C.sum(27, 15) # prints 42

----------------------------------------------------------------------------

4.85. __import__('x.y.z') returns <module 'x'>; how do I get z?

Try

    __import__('x.y.z').y.z

For more realistic situations, you may have to do something like

    m = __import__(s)
    for i in string.split(s, ".")[1:]:
        m = getattr(m, i)

----------------------------------------------------------------------------

4.86. Basic thread wisdom

If you write a simple test program like this:

    import thread

    def run(name, n):
        for i in range(n): print name, i

    for i in range(10):
        thread.start_new(run, (i, 100))

none of the threads seem to run! The reason is that as soon as the main
thread exits, all threads are killed.
A simple fix is to add a sleep to the end of the program, sufficiently long
for all threads to finish:

    import thread, time

    def run(name, n):
        for i in range(n): print name, i

    for i in range(10):
        thread.start_new(run, (i, 100))

    time.sleep(10) # <----------------------------!

But now (on many platforms) the threads don't run in parallel, but appear
to run sequentially, one at a time! The reason is that the OS thread
scheduler doesn't start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function:

    import thread, time

    def run(name, n):
        time.sleep(0.001) # <---------------------!
        for i in range(n): print name, i

    for i in range(10):
        thread.start_new(run, (i, 100))

    time.sleep(10)

Some more hints:

Instead of using a time.sleep() call at the end, it's better to use some
kind of semaphore mechanism. One idea is to use the Queue module to create
a queue object, let each thread append a token to the queue when it
finishes, and let the main thread read as many tokens from the queue as
there are threads.

Use the threading module instead of the thread module. It's part of Python
since version 1.5.1. It takes care of all these details, and has many other
nice features too!

----------------------------------------------------------------------------

4.87. Why doesn't closing sys.stdout (stdin, stderr) really close it?

Python file objects are a high-level layer of abstraction on top of C
streams, which in turn are a medium-level layer of abstraction on top of
(among other things) low-level C file descriptors.

For most file objects f you create in Python via the builtin "open"
function, f.close() marks the Python file object as being closed from
Python's point of view, and also arranges to close the underlying C stream.
This happens automatically too, in f's destructor, when f becomes garbage.
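A tiny sketch of the ordinary case, for illustration only. It assumes
current Python 3, whose file objects are no longer built on C stdio
streams, so it shows just the Python-level effect of close(); the scratch
file comes from tempfile purely to keep the example self-contained:

```python
import os
import tempfile

# Create a scratch file so the example does not depend on existing paths.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, "w")
f.write("hello\n")
f.close()               # marks the file object closed and releases the OS file
closed_flag = f.closed  # True once close() has run

try:
    f.write("more\n")   # further I/O on a closed file object is refused
    error = None
except ValueError as exc:
    error = exc

os.remove(path)
print(closed_flag, type(error).__name__)
```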
But stdin, stdout and stderr are treated specially by Python, because of
the special status also given to them by C: doing

    sys.stdout.close() # ditto for stdin and stderr

marks the Python-level file object as being closed, but does not close the
associated C stream (provided sys.stdout is still bound to its default
value, which is the stream C also calls "stdout").

To close the underlying C stream for one of these three, you should first
be sure that's what you really want to do (e.g., you may confuse the heck
out of extension modules trying to do I/O). If it is, use os.close:

    os.close(0)   # close C's stdin stream
    os.close(1)   # close C's stdout stream
    os.close(2)   # close C's stderr stream

----------------------------------------------------------------------------

4.88. What kinds of global value mutation are thread-safe?

[adapted from c.l.py responses by Gordon McMillan & GvR]

A global interpreter lock is used internally to ensure that only one thread
runs in the Python VM at a time. In general, Python offers to switch among
threads only between bytecode instructions (how frequently it offers to
switch can be set via sys.setcheckinterval). Each bytecode instruction --
and all the C implementation code reached from it -- is therefore atomic.

In theory, this means an exact accounting requires an exact understanding
of the PVM bytecode implementation. In practice, it means that operations
on shared variables of builtin data types (ints, lists, dicts, etc) that
"look atomic" really are.

For example, these are atomic (L, L1, L2 are lists, D, D1, D2 are dicts, x,
y are objects, i, j are ints):

    L.append(x)
    L1.extend(L2)
    x = L[i]
    x = L.pop()
    L1[i:j] = L2
    L.sort()
    x = y
    x.field = y
    D[x] = y
    D1.update(D2)
    D.keys()

These aren't:

    i = i+1
    L.append(L[-1])
    L[i] = L[j]
    D[x] = D[x] + 1

Note: operations that replace other objects may invoke those other objects'
__del__ method when their reference count reaches zero, and that can affect
things.
This is especially true for the mass updates to dictionaries and lists.
When in doubt, use a mutex!

----------------------------------------------------------------------------

4.89. How do I modify a string in place?

Strings are immutable (see question 6.2) so you cannot modify a string
directly. If you need an object with this ability, try converting the
string to a list or take a look at the array module.

    >>> s = "Hello, world"
    >>> a = list(s)
    >>> print a
    ['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
    >>> a[7:] = list("there!")
    >>> import string
    >>> print string.join(a, '')
    Hello, there!

    >>> import array
    >>> a = array.array('c', s)
    >>> print a
    array('c', 'Hello, world')
    >>> a[0] = 'y' ; print a
    array('c', 'yello, world')
    >>> a.tostring()
    'yello, world'

----------------------------------------------------------------------------

4.90. How to pass on keyword/optional parameters/arguments

Q: How can I pass on optional or keyword parameters from one function to
another?

A: Use 'apply', like:

    def f1(a, *b, **c):
        ...

    def f2(x, *y, **z):
        ...
        z['width']='14.3c'
        ...
        apply(f1, (x,)+y, z)

----------------------------------------------------------------------------

--
----------- comp.lang.python.announce (moderated) ----------
Article Submission Address:  python-announce@python.org
Python Language Home Page:   http://www.python.org/
Python Quick Help Index:     http://www.python.org/Help.html
------------------------------------------------------------