Hi all,
I've taken Greg Stein's "small" distribution and (with his advice(*)
and help) created a mechanism for building compressed, single-file,
importable Python libraries. This is cross-platform, depending only
on zlib (and Greg's imputil.py). There are a number of ways you can
create these archives - grabbing directories, packages or computing
the dependencies of a particular script.
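To give a rough idea of the trick, here's a toy sketch of the general idea
(marshal + zlib) - this is not the actual archive format:

    import marshal, zlib

    # compile a module and compress the marshalled code object
    src = open("mymodule.py").read()
    entry = zlib.compress(marshal.dumps(compile(src, "mymodule.py", "exec")))

    # a toy archive: one marshalled dictionary of module name -> entry
    f = open("toy.pyz", "wb")
    marshal.dump({"mymodule": entry}, f)
    f.close()

    # at import time, get a module's code back out
    archive = marshal.load(open("toy.pyz", "rb"))
    code = marshal.loads(zlib.decompress(archive["mymodule"]))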
Then, for Win32 users, I've taken it a few steps further. I've mixed
this with Christian Tismer's sqfreeze (based on /F's squeeze) and
freeze and created a compiler-less Python installer. You can also use
it much like freeze, except you don't need a compiler. The major
difference is that, once installed, it won't be a completely
standalone executable. The installer will unpack the dependencies
into the exe's directory. There will be no dependencies outside
that directory(**). You can pack in extension modules, dlls and
anything else you like. The python lib support will be contained in a
single archive file. So the equivalent of a frozen pure-python script
would be one self-installing exe that expands to 5 files - the
"squeezed" main script, python15.dll, zlib.dll, zlib.pyd and your
private support lib (something.pyz).
Check it out - http://www.mcmillan-inc.com/install.html
Oh yes, while the code has my copyright, it's released under a
do-what-you-want license. Mostly, I glued together what others had
done (freeze, sqfreeze, Greg's stuff...).
- Gordon
(*) Not always followed. Don't blame Greg.
(**) Note that if you want to install something that makes use of
Mark's COM extensions (particularly, Python COM servers), you can't
get away with this.
Hi,
after being invited by Greg Ward to this list, I have tried to read all the
documents available concerning the topic of the list, and I have also tried
to get up to date with the mailing list by reading the archives. Now I'd
like to give my $0.02 as someone who tries to keep a fairly big distribution
up to date, one which also includes a lot of third-party modules. This mail
is rather long, because I'd like to comment on everything in one mail; I
think quite a lot of these points relate to each other.
Please keep in mind that I am speaking as an end user in one way or another:
I will have to use this as a developer, and I will have to use it as a
packager of third-party software. So I won't comment on any implementation
issue unless it is absolutely critical.
Please also keep in mind that I may be talking about things you have already
discussed; I am quite new to this list and have only read the archives of
this month and the last. ;-)
- Tasks and division of labour
I think the packager and installer roles are a little mixed up. In my eyes,
the packager's job is to build and test the package on a specific platform
and also to build the binary distribution; this is also what you wrote on
the web page.
But the installer's role should only be to install the prebuilt package,
because his normal job is to provide up-to-date software to the users of the
systems he manages, and he has enough to do with that.
- The proposed interface
Basically, I can say that I like the idea that I can write in my RPM spec
file
setup.py build
setup.py test
setup.py install
and afterwards I have the package installed somewhere on my machine and can
be absolutely sure that it works as intended. I think this is the way it
works for most Perl modules.
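For illustration, the relevant part of a spec file would then look roughly
like this (a sketch only):

    %build
    setup.py build
    setup.py test

    %install
    setup.py install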
But I have problems with the bdist option of setup.py, because I think it is
hard to implement. If I got this right, I as an RPM and Debian package
maintainer should be able to say
setup.py bdist rpm
setup.py bdist debian
and afterwards have a Debian and an RPM package of the Python package. Nice
in theory, but this would require that setup.py or the distutils packages
know how to create these packages; that means we would have to implement a
meta-packaging system on top of existing packaging systems, which are
powerful in themselves. So what would it look like when I call the commands
above?
Would the distutils stuff create a spec file (the input file for creating an
RPM) and then call rpm -ba <specfile>? And inside the rpm build process,
would setup.py be called again to compile and install the package's
contents? Finally, rpm would create the two normal output files: the actual
binary package, and the source RPM from which you can recompile the binary
package on your machine.
The same holds for Debian Linux, Slackware Linux, other rpm-based Linux
distributions, Solaris packages, and BeOS software packages. The last is
only a vague guess, as I have only looked into the Be system very briefly.
- What I suggest setup.py should do
The options that I have no problem with are
build_py - copy/compile .py files (pure Python modules)
build_ext - compile .c files, link to .so in blib
build_doc - process documentation (targets: html, info, man, ...?)
build - build_py, build_ext, build_doc
dist - create source distribution
test - run test suite
install - install on local machine
What should make_blib do?
But what I do require is the ability to tell build_ext which compiler
switches to use, because on my system I may need different switches than the
original developer used.
I would also like to give the install option an argument telling it where
the files should be installed. With rpm, for example, I can compile the
extension package as if it were going to be installed in /usr/lib/python1.5,
but then have the install stage actually put it in
/tmp/py-root/usr/lib/python1.5. That way I can build and install the package
without overwriting an existing installation of an older version, and I also
have a clean way to determine which files actually got installed.
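For example (the option name here is just an invented placeholder for
whatever convention we agree on):

    setup.py build
    setup.py install -root /tmp/py-root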
install should also be split into install and install_doc, and install_doc
should likewise take an argument telling it where to install the files.
I would remove the bdist option, because it would introduce a lot of work:
you not only have to tackle various systems but also various packaging
systems. Instead, I would add a files option which returns the list of files
this package consists of (see the example below). Consequently, a doc_files
option is also required, because I'd like to stick to the way rpm manages
doc files: I simply tell it which files are doc files and it installs them
the right way.
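In a spec file, that could be used roughly like this (the output format of
files is, of course, still to be defined):

    setup.py files > package.files

    %files -f package.files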
Another thing that would be nice is being able to extract the package
information with setup.py; something like setup.py description returning the
full description, and so on.
I would also add a system option to the command line, because I'd like to
give the setup.py script an option from which it can determine which system
it is running on. Why this is required follows below.
- ARCH-dependent sections should be added
What is not clear to me (maybe I have missed something) is how you deal with
different architectures. What I would suggest here is that we use a
dictionary instead of plain definitions of cc, ccshared, cflags and ldflags.
Such a dictionary might look like this:
    compilation_flags = {
        "Linux":    {"cc": "gcc",  "cflags": "-O3", ...},
        "Linux2.2": {"cc": "egcs", ...},
        "Solaris":  {"cc": "cc",   ...},
    }
I would then call setup.py like this:

    setup.py -system Linux build

or whatever convention you want to use for command-line arguments.
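A minimal sketch of how setup.py could pick the right entry (the -system
handling here is invented; the real option parsing is up to distutils):

    import sys

    # find the value following "-system" on the command line, if any
    args = sys.argv[1:]
    if "-system" in args:
        system = args[args.index("-system") + 1]
    else:
        system = "Linux"                 # some sensible default
    flags = compilation_flags[system]    # the dictionary shown above
    cc = flags["cc"]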
- Subpackages are also required
Well, this is something that I like very much and have really gotten
accustomed to. Say you build PIL and also a Tkinter version that supports
PIL; you would then like to create both packages and also state that
PIL-Tkinter requires PIL.
- Conclusion (or whatever you want to call it)
As a packager, I don't require the distutils stuff to be some kind of
meta-packaging system that generates, from some kind of meta-information,
the actual package-creation file from which it is then called again. And I
don't believe we have to develop a completely new packaging system, because
such systems already exist for quite a lot of platforms. I also think that
if we introduced such a system, acceptance wouldn't be very high: people
want to maintain their software base with their native tools. A Red Hat
Linux user would like to use rpm, a Solaris user would like to use pkg, and
a Windows user would like to use InstallShield (or whatever the standard
is).
The goal of distutils should be to develop a package which can be configured
to compile and install an extension package. The developed software should
let the packager extract all the information required to create his native
package, and the installer should ideally use the prebuilt packages, or else
be able to install the package by calling setup.py install.
I hope I have described as well as possible what I require as a packager;
binary package creation, I think, is not the business of distutils but of
the native packaging system. Any comments are welcome and I am willing to
discuss all this, as I am fully aware that we need a standard way of
installing Python extensions.
Best regards,
Oliver
--
Oliver Andrich, RZ-Online, Schlossstrasse Str. 42, D-56068 Koblenz
Telefon: 0261-3921027 / Fax: 0261-3921033 / Web: http://rhein-zeitung.de
Private Homepage: http://andrich.net/
Just van Rossum wrote:
>
> At 8:00 AM -0800 1/31/99, Greg Stein wrote:
> >...
> >Okay... enough background and rambling. If you're interested, go look at the
> >four messages in the thread titled "Freeze and new import architecture" in
> >the distutils-sig archive at:
> >http://www.python.org/pipermail/distutils-sig/1998-December/thread.html
>
> I'm not on that sig, so I missed your post originally. I agree with most of
> what you say here:
> http://www.python.org/pipermail/distutils-sig/1998-December/000077.html
> Especially that entries in sys.path should be loader instances (or
> directory paths).
>
> Some questions:
> - what is the interface of a loader
By "loader", I will presume that you mean an instance of an Importer
subclass that is defining get_code(). Here is the method as defined by
imputil.Importer:
def get_code(self, parent, modname, fqname):
    """Find and retrieve the code for the given module.

    parent specifies a parent module to define a context for importing.
    It may be None, indicating no particular context for the search.

    modname specifies a single module (not dotted) within the parent.

    fqname specifies the fully-qualified module name. This is a
    (potentially) dotted name from the "root" of the module namespace
    down to the modname. If there is no parent, then modname==fqname.

    This method should return None, a 2-tuple, or a 3-tuple.

    * If the module was not found, then None should be returned.

    * The first item of the 2- or 3-tuple should be the integer 0 or 1,
      specifying whether the module that was found is a package or not.

    * The second item is the code object for the module (it will be
      executed within the new module's namespace).

    * If present, the third item is a dictionary of name/value pairs
      that will be inserted into the new module before the code object
      is executed. This is provided in case the module's code expects
      certain values (such as where the module was found).
    """
    raise RuntimeError, "get_code not implemented"
That method is the sole interface used by Importer subclasses. To define
a custom import mechanism, you would just derive from imputil.Importer
and override that one method.
I'm not sure if that answers your question, however. Please let me know
if something is unclear so that I can correct the docstring.
Oh, geez. And I totally spaced on one feature of imputil.py. There is a
way to define an import mechanism for very simple uses. I created a
subclass named FuncImporter that delegates the get_code() method to a
user-supplied function. This allows a user to do something like this:
import imputil

def get_code(parent, modname, fqname):
    ...do something...

imputil.install_with(get_code)
The install_with() utility simply creates a FuncImporter with the
specified function and then installs the importer. No need to mess with
subclasses.
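For instance, here is a toy but complete mechanism that serves a couple of
modules straight from in-memory source strings (the names are invented for
the example); anything it doesn't know about returns None and falls through
to the chain of importers:

    import imputil

    # toy "module database": fully-qualified name -> source text
    sources = {
        "hello": "def greet():\n    print 'hello from memory'\n",
    }

    def get_code(parent, modname, fqname):
        src = sources.get(fqname)
        if src is None:
            return None              # not ours; let the next importer try
        code = compile(src, "<memory:%s>" % fqname, "exec")
        return 0, code               # 0 == plain module, not a package

    imputil.install_with(get_code)

    import hello
    hello.greet()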
> - how are packages identified
If get_code() determines that the requested module is a package, then it
should return the integer 1 along with the code object for that
package's module. In the standard package system, the code object is
loaded from __init__.py.
An example: let's say that get_code() is called with (None, "mypkg",
"mypkg") for its arguments. get_code() finds "mypkg" in whatever path it
is configured for, and then determines that it represents a directory.
It looks in the directory for __init__.py or .pyc. If it finds it, then
mypkg is a real package. It loads code from __init__.py and returns (1,
code). The Importer superclass will create a new module for "mypkg" and
execute the code within it, and then label it as a package (for future
reference).
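In rough code, that part of a get_code() implementation might look like this
(a sketch only: the search path is hard-coded and error handling is elided):

    import os

    def get_code(parent, modname, fqname):
        path = "/some/path"                      # sketch: one fixed path
        subdir = os.path.join(path, modname)
        init = os.path.join(subdir, "__init__.py")
        if os.path.isdir(subdir) and os.path.exists(init):
            # a directory with __init__.py: this is a real package
            code = compile(open(init).read(), init, "exec")
            return 1, code                       # 1 == package
        modfile = os.path.join(path, modname + ".py")
        if os.path.exists(modfile):
            code = compile(open(modfile).read(), modfile, "exec")
            return 0, code                       # 0 == plain module
        return None                              # not found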
Internally, packages are labelled with a module-level name: __ispkg__.
That is set to 0 or 1 accordingly. The Importer that actually imports a
module places itself into the module with "__importer__ = self". This
latter variable is to allow Importers to only work on modules that they
imported themselves, and helps with identifying the context of an
importer (whether an import is being performed by a module within a
package, or not).
> - can it be in Python 1.6 or sooner?
I imagine that it could be in Python 1.6, but I doubt that it would go
into Python 1.5.2, as it has not had enough exposure yet. Guido's call
on all counts, though :-)
> PS: I was just wondering, why doesn't reload() use the standard import hook?
The standard import hook returns a *new* module object. reload() must
repopulate the module object.
I'm not sure that I answered your questions precisely, as they were
rather non-specific. If I haven't answered well enough, then please ask
again and I'll try again :-).
Cheers,
-g
--
Greg Stein, http://www.lyra.org/
Gordon McMillan wrote:
>
> [Greg puts code where his mouth is]
>
> > A while back, Mark Hammond brought up a "new import architecture"
> > proposal after some discussion with Jack Jansen and Guido. I
> > responded in my usual diplomatic style and said "feh. the two-step
> > import style is bunk."
>
> [snip...]
>
> This is outrageously cool!
>
> It appears that you're aiming for a rewrite of import.c (and
> friends) and giving new meaning to "sys.path". (I love it - Java says
> "it can be a directory or a zip file" and Greg says "why stop there?
> It can be anything at all" - hee hee).
Yup :-)
Last month, I even argued that the import mechanism could simply be
shifted entirely to Python, too, since the overhead of interpreted
Python code is minimal next to the parsing, execution, and/or I/O
involved with importing. hehe...
> What is enabling this magic in this prototype? We pick up your
> site.py automatically, but is your python.exe doing something in
> between Py_Initialize and Py_Main? Or is this all based on making a
> chain out of __builtin__.__import__?
If python.exe doesn't find its registry settings, then it assumes a
default sys.path (which includes the current directory). Using that, it
loads exceptions.py and then site.py. Once site.py loads, then I install
the custom import hook so that all future imports will be yanked from
py15.pyl. The builtin modules aren't in there, of course, so those fall
down the chain to the builtin importer.
Cheers,
-g
--
Greg Stein, http://www.lyra.org/