
[Guido]
It's not Mark's fault, it's Microsoft's fault. If you don't do things the way MS wants you to, experienced Windows users will gripe, misunderstand what you do, etc.

[Tim]
Something just occurred to me: MS's guidelines aren't arbitrary, they actually have very good reasons. In the case of putting all an app's crucial info in the Registry, it's the only way to allow a site administrator to set policy and site options remotely (an admin can fiddle other machines' registries remotely). This works very well indeed when there's only "one copy" of an app on a machine (or at most one copy "per user").

[Gordon]
And actually, the business about separate subtrees for the machine's configuration and the user's configuration is pretty clever. MS doesn't explain it well, and it gets misused, but when done right, it's a lot simpler than the maze of .xxxrc files you sometimes find in other OSes.

I agree. And I am guilty of not even trying to find MS's explanation -- I just looked in the registry at what other apps did and tried to mimic that (plus what Mark had already done), without really knowing what I was doing. I now know a little better -- see the end of this message.
In my Linux version, I went to the heart of the matter - getpath.c. It occurs to me that getpath.c might do better to follow a normal bootstrap process - i.e., create the absolute minimal sys.path required to go to the next step. Then the rest of what goes on in getpath.c could be written in Python. Maybe that Python code needs to get frozen in (to prevent bozos from destroying an installation by stepping on getpath.py), but it would make it a lot easier to create independent installations, and also reduce the variations between platforms at the C level. (Then again, I've never heard of anyone stepping on exceptions.py.)
Yes, this is exactly what was proposed in the thread on the Big Import Rewrite.
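
To make the two-stage bootstrap idea concrete, here is a rough sketch. It is only an illustration, not what getpath.c does or what the rewrite will do: the function names, the "lib" directory next to the executable, and the ordering policy are all assumptions made up for the example.

```python
import os


def minimal_sys_path(executable):
    """Stage 1: the smallest path needed to import the stage-2 code.

    Assumed layout (hypothetical): the library lives in a "lib"
    directory next to the executable.
    """
    home = os.path.dirname(os.path.abspath(executable))
    return [os.path.join(home, "lib")]


def full_sys_path(executable, environ):
    """Stage 2: the policy decisions, written in Python instead of C."""
    path = []
    # Entries from a PYTHONPATH-style variable come first (assumed policy).
    extra = environ.get("PYTHONPATH", "")
    path.extend(p for p in extra.split(os.pathsep) if p)
    # Then the directory found by the minimal bootstrap.
    lib = minimal_sys_path(executable)[0]
    path.append(lib)
    # Then a site-specific directory under the library (assumed name).
    path.append(os.path.join(lib, "site-packages"))
    return path
```

Only the first function would have to exist at the C level; everything in the second could be changed (or frozen in) without touching platform-specific C code.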
If some registry manipulation primitives were exposed (say, through ntpath) that would mean that Windows developers could (if they wanted) play by the MS rules with at least the option of not stepping on each other.
That's a good idea. These functions are already available through Mark's win32api extension -- much of which will eventually (I hope before 1.6 is out!) become part of the core distribution.

In the meantime, I've been thinking a bit more about how Python should be using the Windows registry. (It's clear to me that Python should use the registry -- those who disagree can go build their own Python distribution.)

The basic ideas of Python's current registry usage are sound: there's a resource built into the DLL which is part of the key into the registry used for all information. The problem lies in which key is used. All versions of Python 1.5.x (1.5, 1.5.1, 1.5.2) use the same key! This is a main cause of trouble, because it means that different versions cannot peacefully live together even if the user installs them into different directories -- they will all use the registry keys of the last version installed.

This, in turn, means that someone who writes a Python application that has a dependency on a particular Python version (and which application worth distributing doesn't :-) cannot trust that if a Python installation is present, it is the right one. But they also cannot simply bundle the standard installer for the correct Python version with their program, because its installation would overwrite an existing Python installation, thus breaking some *other* Python apps that the user might already have installed.

(There's a solution for app builders who are willing to do a lot of work -- you can change the registry key resource in the DLL. For example, Alice comes with its own version of Python 1.5.1 and it uses "1.5.1-alice" as its registry key. The Alice installer installs Python in a subdirectory of the Alice installation directory and points the 1.5.1-alice registry entries there. The problem is that this is a lot of work for the average app builder.)

I thought a bit about how VB solves this.
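
The shared-key problem described above can be shown with a toy model. This is a sketch only: a plain dict stands in for the registry (no real registry API is called), and the key layout and install directories are made up for the example.

```python
# Hypothetical key prefix; the real layout is not the point here.
PREFIX = "Software\\Python\\PythonCore\\"


def install(registry, key_version, install_dir):
    """Record where an installation lives, as an installer would."""
    registry[PREFIX + key_version] = install_dir


def lookup(registry, key_version):
    """Return the install directory stored under a key, or None."""
    return registry.get(PREFIX + key_version)


# Shared key: every 1.5.x release writes to the same place.
shared = {}
install(shared, "1.5", r"C:\Python151")
install(shared, "1.5", r"C:\Python152")   # silently replaces the 1.5.1 entry

# One key per release (or per app, as with "1.5.1-alice").
separate = {}
install(separate, "1.5.1", r"C:\Python151")
install(separate, "1.5.2", r"C:\Python152")
install(separate, "1.5.1-alice", r"C:\Alice\Python")
```

In the first case an app that needs 1.5.1 finds only whatever was installed last; in the second case each installation can still be found under its own key.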
I think that when you wrap up a VB app in an installer, all the support code (mostly a big DLL) is wrapped with it. When the user runs the installer, the DLL is installed (probably in the WINDOWS directory). If a user installs several VB apps built with the same VB version, they all attempt to install the exact same DLL; of course the installers notice this and optimize it away, keeping a reference count. (Ignoring for now the fact that those reference counts don't always work!) If an app built with a different VB version is installed, it has a DLL with a different name, and that is installed separately. Other support files, I presume, are dealt with in much the same way.

Voila, there's the theory.

How can we do something similar for Python? An app written in Python should need to install only three or four files:

- a driver EXE to start the app
- a copy of the Python DLL
- the Python library in an archive
- the app code in an archive

The latter two could be combined into a single archive, but I propose that we use two archives so that the DLL and the Python library archive can be shared between installations of independent Python apps as long as they use the exact same Python version and don't need additional 3rd party packages. (I believe that Jim A's proposal combines the archives with the EXE and the DLL, reducing the number of files to two. That's fine too.)

Is there a use for the registry here at all? Maybe not. (I notice that VB seems to have a single registry entry, pointing to a DLL; all other VB files also seem to live there.)

Complications:

- Some apps may need a custom extension module, which has to be installed as a PYD file. So it seems that there needs to be a directory per app, and perhaps per version of the app (if the app distributor cares).

- Some apps need other, non-pyc files (e.g. data tables or help files); it would be handy if these could be stored in the archives as well.

- Some standard extension modules are in their own PYD files; these also need to be installed. They aren't typically marked with a version, so perhaps a path directory per version of Python (if not per installed app) is wise.

- How to distribute an app that needs 3rd party stuff, e.g. Tcl/Tk, or PIL, or NumPy? Their Python code can easily be wrapped up in another archive with a standard name incorporating a version number; but the required PYD and DLL files are a separate story. (E.g. for Tkinter, you need _tkinter.pyd which links against tcl80.dll.) Basically the same solution as for standard PYD files can work; the needed DLL files can be installed either systemwide (if they have a reliable version number in their name, like tcl80.dll) or in the per-app or per-package directory (like NumPy).

- Presumably, the archives will contain PYC files only. This means that tracebacks will not show source code, only line numbers. For Jim A, this is probably exactly what he wants (if the user gets a traceback, his "robust app" has miserably failed, and he takes pride in the fact that this doesn't happen). But for some others, access to the sources could be essential. For example, I might want to distribute IDLE using this mechanism; users of IDLE who are curious about the standard library (or about IDLE itself) should be able to open the source for an arbitrary module (and maybe even edit it, although that's not a priority and perhaps should even be discouraged). Library source access is an important feature of the IDLE debugger as well. A way out for IDLE is to install a classic distribution of the Python library sources into the filesystem at an IDLE-specific location. Other apps, with only the need for source code in tracebacks, might choose to have the PY files in the archives sitting next to the PYC files, and somehow the traceback mechanism should be accessing the archive to get hold of the source.

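
As a rough illustration of the two-archive layout (one shared library archive, one app archive), here is a sketch that uses the zip archive import support found in much later Python versions. That mechanism is not part of this proposal and is swapped in purely so the example can run. The archive and module names are invented, and the archives hold PY files rather than PYC files to keep the example short.

```python
import os
import sys
import tempfile
import zipfile


def build_archive(path, modules):
    """Write {module_name: source_text} into a zip archive of .py files."""
    with zipfile.ZipFile(path, "w") as zf:
        for name, source in modules.items():
            zf.writestr(name + ".py", source)


workdir = tempfile.mkdtemp()
lib_archive = os.path.join(workdir, "pylib-demo.zip")   # hypothetical name
app_archive = os.path.join(workdir, "myapp-demo.zip")   # hypothetical name

# The "library" archive could be shared by several apps.
build_archive(lib_archive, {
    "sharedlib_demo": "def greet():\n    return 'hello from the library'\n",
})
# The "app" archive holds only the app's own code.
build_archive(app_archive, {
    "myapp_demo": "import sharedlib_demo\n\n"
                  "def main():\n    return sharedlib_demo.greet()\n",
})

# App archive first, shared library archive second.
sys.path[:0] = [app_archive, lib_archive]

import myapp_demo

result = myapp_demo.main()
# Because the PY file sits in the archive, the loader can hand back the
# source text, which is what a traceback or a debugger would need.
source = myapp_demo.__loader__.get_source("myapp_demo")
```

With PYC-only archives the import would work the same way, but the last call would have no source to return, which is the traceback limitation mentioned in the final bullet.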
And yes, I realize that Jim A's latest offering solves most of these problems to a large extent -- well done. (Jim, would you care to comment on the issues that you don't address? Will you address them in a future version?)

Final notes:

There are two different problems here. One is how to distribute Python apps robustly to end users who don't particularly care about Python. This is Jim A's problem (and he has a solution that works for him). In general the solutions here try to isolate the installed app from other Python installations. I'm proposing that at least the DLL and the Python library archive can probably be shared between apps without reducing robustness if we keep track more carefully of version numbers.

The other problem is how to distribute packages of Python and extension modules for use by Python users. These typically need to drop into some existing Python installation. This is Paul Dubois' problem with NumPy (amongst others) and is the current focus of the distutils SIG.

However, I believe that there could be a lot of common infrastructure that would help us create better solutions for both problems. For package distribution, common infrastructure (a.k.a. standards) is essential. For app distribution, common infrastructure isn't so important (since the solutions strive for total isolation, there's no problem if different apps use different solutions). However, this changes when app creators want to distribute robust self-sufficient apps that use 3rd party packages -- then the 3rd party packages must allow being packaged up using the app distribution creator of choice.

Solving this compound problem (creating package distributions that can be redistributed easily as part of robust Python app distributions) should be an important goal for the infrastructure we're building here. The Big Import Rewrite ought to add this to its list of objectives if it isn't already on it.

My guess is that the solution for this compound problem will increase the dependency of app distribution tools on the package distribution infrastructure, which to me seems like a Good Thing because it would lead to more code sharing.

--Guido van Rossum (home page: http://www.python.org/~guido/)