[lxml-dev] lxml Mac installation idea

So... I hear that lxml installs better on a Mac if it's built along with libxml2/libxslt. That's not what everyone would do, so I was unclear how to enable something like that, and whether setup.py would be the right place. A number of attempts to get stuff set up have been tried before, external to lxml (like buildouts and now staticlxml).
While it's kind of lame, I wonder if enabling this static installation via an environmental variable would be reasonable? It would be easier to apply in a number of circumstances. I imagine it would mean something like, on installation, if a variable like LXML_INSTALL_STATIC (or INSTALL_LIBXML2 or something) was set, it'd download the libxml2 and libxslt libraries, run configure/make/make install with a prefix inside the lxml source directory itself, then build lxml using that library.
Another option would be simply a different tarball that contains the libxml2/libxslt source, and its setup.py would always build those. It could be versioned like 2.1static or something, which should keep it from being implicitly used by easy_install, etc. (since 2.1static is considered an earlier version than 2.1). This might be more reasonable?
staticlxml is kind of weird, because installing staticlxml installs lxml, which can confuse tools. Maybe the two versions could be arranged with some svn:externals, or just as a build script of some sort (e.g., drop a marker file in the source to make it "static", and have setup.py look for that file and change the sdist/install commands appropriately).
One nuisance with any attempt to fix this is that I don't think either Stefan or I have ready access to a Mac to test this stuff... are there any Macs we can ssh into for testing? (Obviously patches are also welcome, but this Mac thing has caused so many support problems for me that I really want to get it resolved.)
-- Ian Bicking : ianb@colorstudy.com : http://blog.ianbicking.org
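A minimal sketch of the environment-variable idea, assuming the LXML_INSTALL_STATIC name floated in the message above. The function name, the build/libs prefix and the shape of the returned plan are hypothetical, and downloading the tarballs is left out entirely. It only lists the configure/make/make install steps; it does not run them.

```python
import os


def static_build_plan(source_dir, environ=None):
    """Return the commands a static build might run, or [] if not requested.

    Hypothetical helper: nothing like this exists in lxml's setup.py.
    """
    environ = os.environ if environ is None else environ
    if not environ.get("LXML_INSTALL_STATIC"):
        return []  # default: build against whatever libraries are installed

    # Assumed location: a private prefix inside the lxml source directory.
    prefix = os.path.join(source_dir, "build", "libs")

    plan = []
    for lib in ("libxml2", "libxslt"):
        configure = ["./configure", "--prefix=" + prefix, "--without-python"]
        if lib == "libxslt":
            # libxslt has to be pointed at the freshly built libxml2
            configure.append("--with-libxml-prefix=" + prefix)
        plan.append((lib, [configure, ["make"], ["make", "install"]]))
    return plan
```

Each entry in the plan could then be run with subprocess from inside the unpacked source directory of the corresponding library before the lxml extension itself is compiled.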

Hi Ian,
Ian Bicking wrote:
So... I hear that lxml installs better on a Mac if it's built along with libxml2/libxslt. That's not what everyone would do, so I was unclear how to enable something like that, and whether setup.py would be the right place. A number of attempts to get stuff set up have been tried before, external to lxml (like buildouts and now staticlxml).
I hadn't heard of static lxml yet, so I'll have to check what it's doing. Anyway, is it still that hard to install lxml on a Mac? Later 2.0 versions and 2.1 should behave much better here, given that they use "-flat_namespace" now.
While it's kind of lame, I wonder if enabling this static installation via an environmental variable would be reasonable? It would be easier to apply in a number of circumstances. I imagine it would mean something like, on installation, if a variable like LXML_INSTALL_STATIC (or INSTALL_LIBXML2 or something) was set, it'd download the libxml2 and libxslt libraries, run configure/make/make install with a prefix inside the lxml source directory itself, then build lxml using that library.
But isn't that what a buildout does best?
Another option would be simply a different tarball that contains the libxml2/libxslt source, and its setup.py would always build those. It could be versioned like 2.1static or something, which should keep it from being implicitly used by easy_install, etc. (since 2.1static is considered an earlier version than 2.1). This might be more reasonable?
The problem with this (and with the static Windows builds) is that libxml2/libxslt both have their release cycles, which are independent of lxml's releases. If you want to upgrade your libxml2 in a static build, you'll have to copy it to the right place anyway.
staticlxml is kind of weird, because installing staticlxml installs lxml, which can confuse tools.
Yep, I agree that that's not the way to go.
Maybe the two versions could be arranged with some svn:externals,
You mean lxml and libxml2? That would mean you either build libxml2 from a tag (implying the same problems as with a shipped, ready-to-get-outdated version), or from the trunk, which is definitely not suitable for most users.
or just as a build script of some sort (e.g., drop a marker file in the source to make it "static", and have setup.py look for that file and change the sdist/install commands appropriately).
That's just another way of triggering it, in which case I prefer the env var way.
One nuisance with any attempt to fix this is that I don't think either Stefan or I have ready access to a Mac to test this stuff...
Yes, that's part of the problem. Another problem is the way this problem pops up. From time to time, Mac users complain on the list that it doesn't work for them to build lxml. Some can provide helpful hints, debugging time or patches, others cannot. It's impossible for me to find out if things are really settled, or if there are still Mac users out there who just do not feel like investing any work to at least complain.
From what I've seen, there haven't been any complaints for a while, so I considered this problem settled since the late days of 2.0. If you bring it back up now (and if people feel urged to do things like staticlxml), it sounds to me like it's not.
Stefan

Stefan Behnel wrote:
Hi Ian,
Ian Bicking wrote:
So... I hear that lxml installs better on a Mac if it's built along with libxml2/libxslt. That's not what everyone would do, so I was unclear how to enable something like that, and whether setup.py would be the right place. A number of attempts to get stuff set up have been tried before, external to lxml (like buildouts and now staticlxml).
I hadn't heard of static lxml yet, so I'll have to check what it's doing.
Anyway, is it still that hard to install lxml on a Mac? Later 2.0 versions and 2.1 should behave much better here, given that they use "-flat_namespace" now.
Yeah, I'm getting reports of problems. Some class of people have figured out the way to get it installed, but it's still a big problem, and I hear from lots of people who won't use lxml because of it.
While it's kind of lame, I wonder if enabling this static installation via an environmental variable would be reasonable? It would be easier to apply in a number of circumstances. I imagine it would mean something like, on installation, if a variable like LXML_INSTALL_STATIC (or INSTALL_LIBXML2 or something) was set, it'd download the libxml2 and libxslt libraries, run configure/make/make install with a prefix inside the lxml source directory itself, then build lxml using that library.
But isn't that what a buildout does best?
Yes, but I'd like it to work without buildout, and there's also several buildout recipes and configurations out there and not one clear canonical way to build lxml.
Another option would be simply a different tarball that contains the libxml2/libxslt source, and its setup.py would always build those. It could be versioned like 2.1static or something, which should keep it from being implicitly used by easy_install, etc. (since 2.1static is considered an earlier version than 2.1). This might be more reasonable?
The problem with this (and with the static Windows builds) is that libxml2/libxslt both have their release cycles, which are independent of lxml's releases. If you want to upgrade your libxml2 in a static build, you'll have to copy it to the right place anyway.
Another option is yet another environmental variable to set the libxml2/libxslt versions, which are set to defaults. If you chose a version that didn't exist in the tarball it'd download that version.
staticlxml is kind of weird, because installing staticlxml installs lxml, which can confuse tools.
Yep, I agree that that's not the way to go.
Maybe the two versions could be arranged with some svn:externals,
You mean lxml and libxml2? That would mean you either build libxml2 from a tag (implying the same problems as with a shipped, ready-to-get-outdated version), or from the trunk, which is definitely not suitable for most users.
No, I was just thinking about ways of structuring a full build (that includes the libxml2 library) and the current build.
or just as a build script of some sort (e.g., drop a marker file in the source to make it "static", and have setup.py look for that file and change the sdist/install commands appropriately).
That's just another way of triggering it, in which case I prefer the env var way.
One nuisance with any attempt to fix this is that I don't think either Stefan or I have ready access to a Mac to test this stuff...
Yes, that's part of the problem. Another problem is the way this problem pops up. From time to time, Mac users complain on the list that it doesn't work for them to build lxml. Some can provide helpful hints, debugging time or patches, others cannot. It's impossible for me to find out if things are really settled, or if there are still Mac users out there who just do not feel like investing any work to at least complain.
From what I've seen, there haven't been any complaints for a while, so I considered this problem settled since the late days of 2.0. If you bring it back up now (and if people feel urged to do things like staticlxml), it sounds to me like it's not.
Yeah... there seem to be a few problems:
* Macs come with bad versions of libxml2 and libxslt (depending on the version of the OS, you get either very bad, or not as bad, but bad enough that you'll eventually get a segfault but not immediately, which is actually much worse)
* There's several kinds of Python that people use on a Mac: at least the system python, macports python, and fink python. I think there might be another. They are all somewhat different.
* People keep getting the wrong runtime linking, and DYLD_LIBRARY_PATH seems to be necessary.
* It's not that obvious how to build libxml2, unless you are using macports (which has a port for it).
* Once you do build it, you have to be sure to get the right xml2-config, which doesn't happen by default.
These are the problems I've heard about at least.
I do now have ssh access to a Mac. Unfortunately it's behind our VPN, so I'm not sure if I can get you access to it too, but I'll ask about that.
-- Ian Bicking : ianb@colorstudy.com : http://blog.ianbicking.org
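To illustrate the xml2-config point, here is a small sketch of how a build script might decide which xml2-config it is going to call. The function name is made up for illustration. It assumes that an XML2_CONFIG environment variable (the one mentioned further down in this thread) should win over whatever happens to be first on the search path.

```python
import os
import shutil


def resolve_xml2_config(environ=None, search_path=None):
    """Return the xml2-config path a build would use, or None if none is found.

    Hypothetical helper, not part of lxml.
    """
    environ = os.environ if environ is None else environ

    # An explicit setting wins, so a freshly built libxml2 can be selected
    # even when the system copy comes first on PATH.
    explicit = environ.get("XML2_CONFIG")
    if explicit:
        return explicit

    # Otherwise take the first match on the search path. With
    # search_path=None, shutil.which falls back to the process PATH.
    return shutil.which("xml2-config", path=search_path)
```

Running the returned script with --version (or --cflags and --libs) shows which libxml2 the compile step would actually pick up.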

Hi Ian,
Ian Bicking wrote:
Stefan Behnel wrote:
but I'd like it to work without buildout, and there's also several buildout recipes and configurations out there and not one clear canonical way to build lxml.
I wouldn't mind to ship lxml with a buildout recipe. I think the current problem is that if we wait to find one that works well for as many people as possible, we'll wait forever. So I'm fine with adding any script that's somewhat tested and in use. Even a set of scripts that you can try is better than none if people can't manage to build lxml themselves.
Another option would be simply a different tarball that contains the libxml2/libxslt source, and its setup.py would always build those. It could be versioned like 2.1static or something, which should keep it from being implicitly used by easy_install, etc. (since 2.1static is considered an earlier version than 2.1). This might be more reasonable?
The problem with this (and with the static Windows builds) is that libxml2/libxslt both have their release cycles, which are independent of lxml's releases. If you want to upgrade your libxml2 in a static build, you'll have to copy it to the right place anyway.
Another option is yet another environmental variable to set the libxml2/libxslt versions, which are set to defaults. If you chose a version that didn't exist in the tarball it'd download that version.
What about providing a script that checks for the most recent version first, and then builds lxml with that? The "LATEST_IS_X.Y.Z" file in the libxml2 FTP directory will help here. Then it's fine to override the latest version with an env variable if people need to.
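The version lookup could be sketched roughly like this, assuming the directory listing has already been fetched as plain text. The helper name and the override argument are made up for illustration; the override stands in for whatever env variable ends up being used.

```python
import re

# Matches marker names such as LATEST_IS_2.7.2 in a directory listing.
_LATEST_MARKER = re.compile(r"LATEST_IS_(\d+(?:\.\d+)+)")


def latest_version(listing, override=None):
    """Pick the library version to build.

    `listing` is the text of a directory listing; `override` is an optional
    user-supplied version that takes precedence over the marker file.
    Returns None when there is no override and no marker in the listing.
    """
    if override:
        return override
    versions = _LATEST_MARKER.findall(listing)
    if not versions:
        return None
    # Compare numerically so that 2.10.0 sorts after 2.9.0.
    return max(versions, key=lambda v: tuple(int(part) for part in v.split(".")))
```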
Maybe the two versions could be arranged with some svn:externals,
You mean lxml and libxml2? That would mean you either build libxml2 from a tag (implying the same problems as with a shipped, ready-to-get-outdated version), or from the trunk, which is definitely not suitable for most users.
No, I was just thinking about ways of structuring a full build (that includes the libxml2 library) and the current build.
Shipping with the latest libxml2/libxslt is not a big problem, setup.py could be changed to support a "--all-sources" option for sdist. There could be a separate tar.gz (or zip) file on PyPI that easy_install won't find at all. And when the build detects that the libxml2 and libxslt sources lie next to the src directory, it can switch to a static build automatically. It's not an unsolvable problem, these things just need to get implemented.
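The automatic switch might look something like the sketch below. It assumes the bundled sources are unpacked as directories named libxml2-X.Y.Z and libxslt-X.Y.Z next to src; that naming and both function names are assumptions, not anything setup.py does today.

```python
import os

LIBS = ("libxml2", "libxslt")


def pick_bundled_sources(dir_names):
    """Map each library to a source directory name, or return None if one is missing."""
    found = {}
    for lib in LIBS:
        matches = sorted(name for name in dir_names if name.startswith(lib + "-"))
        if not matches:
            return None  # incomplete bundle: keep the normal dynamic build
        found[lib] = matches[-1]  # naive choice: last name in string order
    return found


def bundled_sources_in(base_dir):
    """Look for unpacked library sources directly inside base_dir."""
    dirs = [
        name
        for name in os.listdir(base_dir)
        if os.path.isdir(os.path.join(base_dir, name))
    ]
    return pick_bundled_sources(dirs)
```

A build could then treat a non-None result from bundled_sources_in() as the signal to go static, and fall back to the current behaviour otherwise.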
* Macs come with bad versions of libxml2 and libxslt (depending on the version of the OS, you get either very bad, or not as bad, but bad enough that you'll eventually get a segfault but not immediately, which is actually much worse)
* People keep getting the wrong runtime linking, and DYLD_LIBRARY_PATH seems to be necessary.
That makes a case for a static build, IMHO.
* There's several kinds of Python that people use on a Mac: at least the system python, macports python, and fink python. I think there might be another. They are all somewhat different.
I'm not sure this makes a difference, distutils should handle this. The only problem is that this keeps us from preparing binaries for MacOS.
* It's not that obvious how to build libxml2, unless you are using macports (which has a port for it).
Not sure what you mean here. Isn't cmmi enough?
* Once you do build it, you have to be sure to get the right xml2-config, which doesn't happen by default.
Another reason for defaulting to a static build here.
I do now have ssh access to a Mac. Unfortunately it's behind our VPN, so I'm not sure if I can get you access to it too, but I'll ask about that.
I'm still waiting for the usual "we can pay you the hardware, so that you can do the testing yourself" offers. ;o)
Stefan

On Sun, 02 Nov 2008 19:55:09 -0000, Stefan Behnel <stefan_ml@behnel.de> wrote:
Hi Ian,
Ian Bicking wrote:
Stefan Behnel wrote:
but I'd like it to work without buildout, and there's also several buildout recipes and configurations out there and not one clear canonical way to build lxml.
I wouldn't mind to ship lxml with a buildout recipe. I think the current problem is that if we wait to find one that works well for as many people as possible, we'll wait forever. So I'm fine with adding any script that's somewhat tested and in use. Even a set of scripts that you can try is better than none if people can't manage to build lxml themselves.
* Macs come with bad versions of libxml2 and libxslt (depending on the version of the OS, you get either very bad, or not as bad, but bad enough that you'll eventually get a segfault but not immediately, which is actually much worse)
* People keep getting the wrong runtime linking, and DYLD_LIBRARY_PATH seems to be necessary.
That makes a case for a static build, IMHO.
Another way might be as done by Enthought and have lxml's own copy of the dynamic libraries ending up in site-packages/lxml-2.1.1.0001-py2.5-macosx-10.3-fat.egg/lxml/libxslt.dylib
* There's several kinds of Python that people use on a Mac: at least the system python, macports python, and fink python. I think there might be another. They are all somewhat different.
I'm not sure this makes a difference, distutils should handle this. The only problem is that this keeps us from preparing binaries for MacOS.
I would suggest that for binaries you only need to consider the system python, although that differs with each version of the OS. The reason I say this is that macports and Fink tend to have everything done through their own port environment. Macports does have lxml already.
There are also users that have built their own python to get the latest version, e.g. 2.5.2 or 2.6, but they can use a compiler, so a setup or buildout should be sufficient.
Even if I am wrong on this, it is a start, and I think it would help those who know the least about the non-Python parts of the Mac environment.
* It's not that obvious how to build libxml2, unless you are using macports (which has a port for it).
Not sure what you mean here. Isn't cmmi enough?
* Once you do build it, you have to be sure to get the right xml2-config, which doesn't happen by default.
Another reason for defaulting to a static build here.
-- Mark

On Nov 2, 2008, at 2:55 PM, Stefan Behnel wrote:
Hi Ian,
Ian Bicking wrote:
Stefan Behnel wrote:
but I'd like it to work without buildout, and there's also several buildout recipes and configurations out there and not one clear canonical way to build lxml.
I wouldn't mind to ship lxml with a buildout recipe. I think the current problem is that if we wait to find one that works well for as many people as possible, we'll wait forever. So I'm fine with adding any script that's somewhat tested and in use. Even a set of scripts that you can try is better than none if people can't manage to build lxml themselves.
FWIW, we have a very lean buildout that only focuses on the problem of building lxml. I could contribute it if you'd like.
Another option would be simply a different tarball that contains the libxml2/libxslt source, and its setup.py would always build those. It could be versioned like 2.1static or something, which should keep it from being implicitly used by easy_install, etc. (since 2.1static is considered an earlier version than 2.1). This might be more reasonable?
The problem with this (and with the static Windows builds) is that libxml2/libxslt both have their release cycles, which are independent of lxml's releases. If you want to upgrade your libxml2 in a static build, you'll have to copy it to the right place anyway.
Another option is yet another environmental variable to set the libxml2/libxslt versions, which are set to defaults. If you chose a version that didn't exist in the tarball it'd download that version.
What about providing a script that checks for the most recent version first, and then builds lxml with that? The "LATEST_IS_X.Y.Z" file in the libxml2 FTP directory will help here. Then it's fine to override the latest version with an env variable if people need to.
Maybe the two versions could be arranged with some svn:externals,
You mean lxml and libxml2? That would mean you either build libxml2 from a tag (implying the same problems as with a shipped, ready-to-get-outdated version), or from the trunk, which is definitely not suitable for most users.
No, I was just thinking about ways of structuring a full build (that includes the libxml2 library) and the current build.
Shipping with the latest libxml2/libxslt is not a big problem, setup.py could be changed to support a "--all-sources" option for sdist. There could be a separate tar.gz (or zip) file on PyPI that easy_install won't find at all. And when the build detects that the libxml2 and libxslt sources lie next to the src directory, it can switch to a static build automatically.
That sounds pretty good.
I do now have ssh access to a Mac. Unfortunately it's behind our VPN, so I'm not sure if I can get you access to it too, but I'll ask about that.
I'm still waiting for the usual "we can pay you the hardware, so that you can do the testing yourself" offers. ;o)
I'll chip in. I wonder if there are OS X hosting companies where we could rent a login.
--Paul

On Mon, Nov 3, 2008 at 11:38 AM, Paul Everitt <paul@agendaless.com> wrote:
I do now have ssh access to a Mac. Unfortunately it's behind our VPN, so I'm not sure if I can get you access to it too, but I'll ask about that.
I'm still waiting for the usual "we can pay you the hardware, so that you can do the testing yourself" offers. ;o)
I'll chip in. I wonder if there are OS X hosting companies where we could rent a login.
Sourceforge.net Build Farm used to have OSX hosts. They discontinued the service sometime in 2007 though. :(
-- Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
Skype zopedc

Paul Everitt wrote:
Ian Bicking wrote:
Stefan Behnel wrote:
but I'd like it to work without buildout, and there's also several buildout recipes and configurations out there and not one clear canonical way to build lxml.
I wouldn't mind to ship lxml with a buildout recipe. I think the current problem is that if we wait to find one that works well for as many people as possible, we'll wait forever. So I'm fine with adding any script that's somewhat tested and in use. Even a set of scripts that you can try is better than none if people can't manage to build lxml themselves.
FWIW, we have a very lean buildout that only focuses on the problem of building lxml. I could contribute it if you'd like.
How does this compare to plone.recipe.lxml (http://pypi.python.org/pypi/plone.recipe.lxml)? I notice that package just had a release.
Is this the config you've been using? https://svn.plone.org/svn/plone/PloneOrg/sandbox/xdv/new.plone.org/osx.cfg (if so, in your run-deliverance script, it'd be best to use exec to avoid the intermediate shell process and make it easier to kill the script)
One problem with buildout recipes is that they aren't, AFAIK, very compatible with other systems. That is, they can build the egg, but the egg isn't on the path of any interpreter, so unless you use buildout entirely you'll have to do some further manipulation to make lxml available.
I think I've also seen some reports that even using the macports lxml build, DYLD_LIBRARY_PATH can matter. I don't remember where I saw that now... maybe actually as part of another recipe?
-- Ian Bicking : ianb@colorstudy.com : http://blog.ianbicking.org

On Nov 3, 2008, at 2:35 PM, Ian Bicking wrote:
Paul Everitt wrote:
Ian Bicking wrote:
Stefan Behnel wrote:
but I'd like it to work without buildout, and there's also several buildout recipes and configurations out there and not one clear canonical way to build lxml.
I wouldn't mind to ship lxml with a buildout recipe. I think the current problem is that if we wait to find one that works well for as many people as possible, we'll wait forever. So I'm fine with adding any script that's somewhat tested and in use. Even a set of scripts that you can try is better than none if people can't manage to build lxml themselves.
FWIW, we have a very lean buildout that only focuses on the problem of building lxml. I could contribute it if you'd like.
How does this compare to plone.recipe.lxml (http://pypi.python.org/pypi/plone.recipe.lxml)? I notice that package just had a release. Is this the config you've been using? https://svn.plone.org/svn/plone/PloneOrg/sandbox/xdv/new.plone.org/osx.cfg (if so, in your run-deliverance script, it'd be best to use exec to avoid the intermediate shell process and make it easier to kill the script)
Here's the buildout.cfg that we've used:
[libxml2]
recipe = zc.recipe.cmmi
url = http://somecustomer.agendaless.com/indexes/karl/cmmi/libxml2-2.6.32.tar.gz
extra_options = --without-python
[libxslt]
recipe = zc.recipe.cmmi
url = http://somecustomer.agendaless.com/indexes/karl/cmmi/libxslt-1.1.24.tar.gz
extra_options = --with-libxml-prefix=${libxml2:location} --without-python
[lxml-environment]
XSLT_CONFIG=${buildout:directory}/parts/libxslt/bin/xslt-config
XML2_CONFIG=${buildout:directory}/parts/libxml2/bin/xml2-config
[lxml]
recipe = zc.recipe.egg:custom
egg = lxml
include-dirs = ${libxml2:location}/include/libxml2
               ${libxslt:location}/include
library-dirs = ${libxml2:location}/lib
               ${libxslt:location}/lib
rpath = ${libxml2:location}/lib
        ${libxslt:location}/lib
environment = lxml-environment
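For anyone not using buildout, the [lxml-environment] part of that config can be approximated by hand. The sketch below only assembles the environment for a build subprocess; the function name and the idea of passing two install prefixes are assumptions for illustration, while the XML2_CONFIG and XSLT_CONFIG names come from the config above.

```python
import os


def lxml_build_env(libxml2_prefix, libxslt_prefix, base_env=None):
    """Return a copy of the environment that points at privately built libraries.

    Hypothetical helper; the prefixes are wherever libxml2 and libxslt
    were installed with configure --prefix=...
    """
    env = dict(os.environ if base_env is None else base_env)
    env["XML2_CONFIG"] = os.path.join(libxml2_prefix, "bin", "xml2-config")
    env["XSLT_CONFIG"] = os.path.join(libxslt_prefix, "bin", "xslt-config")
    return env
```

The result could be passed as env= to subprocess.call() when running the lxml build, so the settings do not leak into the rest of the shell session.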
One problem with buildout recipes is that they aren't, AFAIK, very compatible with other systems. That is, they can build the egg, but the egg isn't on the path of any interpreter, so unless you use buildout entirely you'll have to do some further manipulation to make lxml available.
Right. On one hand, I'm reluctant to put all lxml's eggs in the buildout basket [wink] for that reason. lxml is much bigger than the world of Zope. Being confronted with something you didn't know (buildout) just to get to the thing you wanted (lxml) is a pain. OTOH, I wonder if the solution is to simply provide a number of options. We're really only talking about Mac users, so it isn't as big of a decision. And for them, perhaps the answer is to ensure there are a number of choices (MacPorts and fink kept up-to-date, easier to build by hand with Stefan's XML2_CONFIG support, etc.)
I think I've also seen some reports that even using the macports lxml build, DYLD_LIBRARY_PATH can matter. I don't remember where I saw that now... maybe actually as part of another recipe?
I think I saw something in another recipe, perhaps Martin's Deliverance one. My guess is that some of these are false blips from the time before XML2_CONFIG. --Paul
participants (5): Ian Bicking, Mark Bestley, Paul Everitt, Sidnei da Silva, Stefan Behnel