Disposition of C extensions and packages

[Crossposted to xml-sig, distutils-sig]

I'm working on getting the XML-SIG's CVS tree to install using the current version of the Distutils. Right now there are two C extensions, sgmlop.so and pyexpat.so, and they're installed under xml/parsers/. It's hard to handle this case using the distutils code as it stands, because it expects to put extensions into a build/platlib/ directory, from which they'll be installed into site-packages.

I can coerce setup.py into installing them into xml/parsers/ by subclassing the BuildExt command and setting build_dir myself:

    from distutils.command.build_ext import BuildExt

    class XMLBuildExt(BuildExt):
        def set_default_options (self):
            BuildExt.set_default_options( self )
            self.build_dir = 'lib/build/xml/parser'

    setup (name = "PyXML",
           cmdclass = {'build_ext':XMLBuildExt},
           ...)

You also have to subclass the Install command and set build_dir there; I've trimmed that code. It's really clunky.

Note that this scheme will break once there are C modules that need to be installed anywhere other than xml/parsers/, because build_dir is being hardwired without knowledge of what module is being compiled.

Questions:

1) A general Python question about packaging style: Is mixing C extensions and Python modules in one package tree a bad idea? It makes the whole tree platform-dependent, which is probably annoying for sites maintaining Python installations for different architectures.

2) Another general question: how should this be handled? Should C extensions always be effectively top-level, and therefore go into site-packages? Should there be an xml package holding .py files, and an X package holding all the C extensions? (X = 'plat_xml', 'xml_binary', or something like that)

3) XML-SIG question: should I go ahead and change it (since I first changed it to use xml.parsers.sgmlop)?

4) Distutils question: is this a problem with the Distutils code that needs fixing? I suspect not; if the tools make it difficult to do stupid things like mix .py and .so files, that's a good thing.

--
A.M. Kuchling http://starship.python.net/crew/amk/
The Kappamaki, a whaling research ship, was currently researching the question: How many whales can you catch in one week?
    -- Terry Pratchett & Neil Gaiman, _Good Omens_

"A.M. Kuchling" wrote:
I have been using that setup for two years now with all of my mx extensions and so far it has been working great. If you maintain packages with C extensions for several platforms, you can simply install the packages under the platform subdirs in /usr/local/lib/python1.5 -- one copy for every platform. Disk space is no argument anymore nowadays.
Just leave them in the package. I use a separate subpackage for the C extension, which the package's modules then import. This makes mixed Python + C extensions and prototyping of C APIs in Python very simple and straightforward.
I wouldn't like this, for a very simple reason: if someone wants to provide a Python rewrite of a C module which works as a drop-in replacement, the only way to handle this is by having a .so file and a .py file with the same name in the same directory. mxDateTime uses such a setup, for example. Note that .so files are found before .py files, so if someone does have the .so file, Python will use the C module and not the Python one.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 13 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
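For readers who want the drop-in idea spelled out in code, here is a minimal sketch of a related but different technique: an explicit fallback that tries a C module first and uses a pure-Python module if the import fails. This is not the same-name .so/.py arrangement described above, which relies on the import search order instead. The helper `load_implementation` and the module name `_hypothetical_speedups` are invented for illustration; `json` merely stands in for a pure-Python implementation.

```python
import importlib


def load_implementation(c_name, py_name):
    """Return the module named c_name if it imports, else the one named py_name.

    Both names are supplied by the caller; nothing here is specific to
    any real package.
    """
    try:
        return importlib.import_module(c_name)
    except ImportError:
        # The C extension is missing (or failed to load) on this platform,
        # so fall back to the pure-Python implementation.
        return importlib.import_module(py_name)


# Hypothetical usage: the C module does not exist here, so the fallback is used.
impl = load_implementation("_hypothetical_speedups", "json")
print(impl.__name__)
```

The explicit form keeps the choice visible in the source, at the cost of giving the two implementations different module names.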

M.-A. Lemburg writes:
Hmmm... disk space is no longer a problem, but maintaining those parallel trees might be a problem; if a bugfix is released as a patch to Python code, you'd have to apply it to several trees. On the other hand, this isn't a very common way of fixing bugs in the Python community; a .1 release is more common. So your point is a good one; this would argue that the Distutils should make it easier to build things in one tree, and then install things under the right plat- directory.

--
A.M. Kuchling http://starship.python.net/crew/amk/
He felt that his whole life was some kind of dream and he sometimes wondered whose it was and whether they were enjoying it.
    -- Douglas Adams, _The Hitch-Hiker's Guide to the Galaxy_

On 18 December 1999, A.M. Kuchling said:
Well, I missed this thread first time around because I was on holiday. Andrew went to such amazing lengths to do something that's actually quite easy that I thought it would be worth following up, even if I'm a month late.

In short, putting Python extension modules into a package is really, really, *really* easy. Here's what *was* in Andrew's setup.py for the XML package (edited for readability):

    ext_modules = [('sgmlop', { 'sources' : ['extensions/sgmlop.c'] }),
                   ('pyexpat', { 'define': [('XML_NS', None)],
                                 'include_dirs': [ 'extensions/expat/xmltok',
                                                   'extensions/expat/xmlparse' ],
                                 'sources' : [...] } ) ]

Note that the names of the two extensions are 'sgmlop' and 'pyexpat'. In regards to where the .so files wind up, *the name is the most important thing* (and usually all you need). All we had to do to get rid of Andrew's extended command classes (and other kludgery) was set the names of the two extension modules appropriately, so the XML setup script now has this:

    ext_modules = [('xml.parsers.sgmlop', { 'sources' : ['extensions/sgmlop.c'] }),
                   ('xml.parsers.pyexpat', { 'define': [('XML_NS', None)],
                                             'include_dirs': [ 'extensions/expat/xmltok',
                                                               'extensions/expat/xmlparse' ],
                                             'sources' : [...] } ) ]

i.e. the names of the extensions had to change... and nothing else. (Well, we commented out the unnecessarily extended command classes.)

I assumed that large module distributions with many extensions will probably put them in the same package. Thus, we could have done the change this way as well:

    ext_package = 'xml.parsers'
    ext_modules = [('sgmlop', { 'sources' : ['extensions/sgmlop.c'] }),
                   ('pyexpat', { 'define': [('XML_NS', None)],
                                 'include_dirs': [ 'extensions/expat/xmltok',
                                                   'extensions/expat/xmlparse' ],
                                 'sources' : [...] } ) ]

And if you only have a common base package for many extensions, you should still be able to specify that base package with 'ext_package':

    ext_package = 'xml.parsers'
    ext_modules = [('pkg1.sgmlop', { 'sources' : ['extensions/sgmlop.c'] }),
                   ('pkg2.pyexpat', { 'define': [('XML_NS', None)],
                                      'include_dirs': [ 'extensions/expat/xmltok',
                                                        'extensions/expat/xmlparse' ],
                                      'sources' : [...] } ) ]

If everything goes as planned, that should result in two extensions called 'xml.parsers.pkg1.sgmlop' and 'xml.parsers.pkg2.pyexpat'.

Note that my assertions about these last two examples are based solely on reading the code and a dim recollection of what I had in mind when I wrote it -- i.e. I haven't tested them. YMMV, but please let me know if it does.

Greg
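The naming rule described above can be sketched as a small standalone helper. This is only an illustration of the idea, not the Distutils code: the functions `ext_full_name` and `ext_relative_path` are hypothetical names, and the '.so' suffix is an assumption that differs by platform.

```python
import os


def ext_full_name(name, ext_package=None):
    """Prefix the extension name with ext_package, if one is given."""
    if ext_package:
        return ext_package + "." + name
    return name


def ext_relative_path(name, ext_package=None, suffix=".so"):
    """Map a dotted extension name to a file path relative to the build root.

    The '.so' default is an assumption; other platforms use other suffixes.
    """
    parts = ext_full_name(name, ext_package).split(".")
    return os.path.join(*parts) + suffix


# The four naming variants discussed in the message above.
examples = [
    ("sgmlop", None),
    ("xml.parsers.sgmlop", None),
    ("sgmlop", "xml.parsers"),
    ("pkg1.sgmlop", "xml.parsers"),
]
for name, package in examples:
    print(ext_full_name(name, package), "->", ext_relative_path(name, package))
```

Because the target directory is derived from each extension's own name, nothing has to be hardwired per build, which is the weakness of setting build_dir by hand.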