Mailman 3 July 2007 - lxml - The Python XML Toolkit

Re: [lxml-dev] OpenVMS port of lxml
by Stefan Behnel July 15, 2007

Jean-François Piéronne wrote:
> Stefan Behnel wrote:
>> Jean-François Piéronne wrote:
>>> lxml 1.3.2 has been successfully ported to OpenVMS (Alpha and Itanium
>>> platform).
> I have attached the generated warnings for one of the source files.
> Other source files produce the same kind of warning.

Thanks. That's actually my 'fault' rather than Pyrex', but those are only
irrelevant warnings about generated generic code, nothing that could break
anything. It looks like the C compiler on VMS is a bit more picky about some
things. The warnings should go away with an explicit cast in the right place
of the generated header file (see attached pyrex-cast.patch)

> ======================================================================
> FAIL: test_module_HTML_unicode
> (lxml.tests.test_htmlparser.HtmlParserTestCaseBase)
> ----------------------------------------------------------------------

I know that one. Does the VMS port of libxml2 use iconv? Do you know the native
unicode encoding that VMS Python uses? UTF16 or UCS-4? big or little endian?

> The NAN/NaNQ is probably a VMS

Looks like it. Here's a simple fix for the test case:

=======================
Index: src/lxml/tests/test_xpathevaluator.py
===================================================================
--- src/lxml/tests/test_xpathevaluator.py (Revision 44997)
+++ src/lxml/tests/test_xpathevaluator.py (Arbeitskopie)
@@ -23,7 +23,7 @@
             tree.xpath('number(/a)'))
         tree = self.parse('<a>A</a>')
         actual = str(tree.xpath('number(/a)'))
-        expected = ['nan', '1.#qnan']
+        expected = ['nan', '1.#qnan', 'nanq']
         if not actual.lower() in expected:
             self.fail('Expected a NAN value, got %s' % actual)
=======================

> So on AXP 432 tests are executed and 794 on Itanium.

You probably have ElementTree installed on Itanium (maybe Python 2.5?), but not
on AXP. The test suite also contains comparative compatibility tests against
ElementTree that only run when it is installed.

>>> * test_xslt.py
>>> Python interpreter crash with an error during the rundown of the program:
>>> assert error: expression = autoInterpreterState, in file
>>> PYTHON_ROOT:[Python]pystate.c;1 at line 563
>>
>> Hmmm, that one looks bad, though. Would you have any more hints on what
>> happens here?
>
> Full traceback:
> #> python test_xslt.py
> ...............................................
> ----------------------------------------------------------------------
> Ran 47 tests in 0.223s
>
> OK
> assert error: expression = autoInterpreterState, in file
> PYTHON_ROOT:[Python]pystate.c;1 at line 563
> %SYSTEM-F-OPCCUS, opcode reserved to customer fault at
> PC=FFFFFFFF80AA0DF4, PS=0000001B
> %TRACE-F-TRACEBACK, symbolic stack dump follows
> image module routine line rel PC abs PC
> 0 FFFFFFFF80AA0DF4 FFFFFFFF80AA0DF4
> 0 FFFFFFFF80B34A74 FFFFFFFF80B34A74
> PYTHONSHR pystate PyGILState_Ensure 20250 0000000000001374 00000000002C7E04
> libxml2xsltlxmlmod etree __pyx_f_5etree__receiveError
> 67362 000000000002F1E4 0000000000B1E704
> libxml2xsltlxmlmod error __xmlRaiseError
> 16027 0000000000001248 [...]

Hmmm, but your above tests ran through, didn't they? Not sure what happens
here. Could this be a problem with differences in threading on VMS?

Stefan
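
The test-case fix in this message works by widening a list of accepted NaN spellings. A minimal sketch of the same check as a stand-alone helper, assuming the three spellings from the patch are the only ones that matter; the name `looks_like_nan` is made up for illustration and is not part of lxml or its test suite:

```python
# Spellings of "not a number" taken from the patch in this message.
# Other platforms may print something else again, so this set is an assumption.
NAN_SPELLINGS = frozenset(['nan', '1.#qnan', 'nanq'])


def looks_like_nan(text):
    """Return True if text is one of the known NaN spellings, ignoring case."""
    return text.strip().lower() in NAN_SPELLINGS


if __name__ == '__main__':
    for sample in ('nan', 'NaNQ', '1.#QNAN', '1.0'):
        print(sample, looks_like_nan(sample))
```

Keeping the spellings in one set means a new platform only needs one more entry, at the price of having to know each spelling in advance.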

Re: [lxml-dev] DOM2 range() support?
by Stefan Behnel July 15, 2007

Hi,

just continuing this discussion on the right mailing list (from XML-SIG).

Gloria W wrote:
> I need to be able to do DOM2 range() functionality, to meet the
> requirements of a back end Dojo interface I have written in Python.
> I have already written my own DOM2 compliant node schema, out of
> necessity, but without the range functionality, since it is so tedious.
> I have forced a requirement on my Dojo developer colleague to not make
> use of range() for the time being.
> I wish it were properly supported in Python, but for the time being, I
> seem to be the only person needing it.

This is actually the first time I come across the concept of DOM2 ranges.

http://www.w3.org/TR/DOM-Level-2-Traversal-Range/ranges.html

Please correct me, but from a quick look at this:

http://www.w3.org/TR/DOM-Level-2-Traversal-Range/ranges.html#Level-2-Range-…

isn't that simply a tuple of two positions, where each position contains an
Element and optionally one of the following:

- an attribute "{ns}name" and a string position in the attribute value
- a string position in the text
- a string position in the tail

Admittedly, the DOM2 interface on top of that is a little more complex and
(should I say it?) DOM-ishly obfuscated, but there shouldn't be more to it than
that, right? I mean, the respective W3C spec is only some 13 sections long,
there *can't* be more than that. :)

Hmm, now that we have cool HTML support and CSS selection, I wouldn't mind
having a Range class hanging around in some (lxml.range?) module. Doesn't even
sound like you'd have to implement it in Pyrex, Python code should be enough
here.

Stefan
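
The "tuple of two positions" idea from this message can be sketched in plain Python. This is only an illustration under stated assumptions: the names `Position`, `Range` and `position_text` are hypothetical, no such module exists in lxml, and the standard library's xml.etree.ElementTree stands in for lxml.etree so the sketch runs anywhere. It does not implement the DOM2 Range interface itself.

```python
from collections import namedtuple
import xml.etree.ElementTree as ET

# One boundary point: an element plus, optionally, where inside it we are.
# kind is None (the element itself), 'text', 'tail' or 'attribute';
# name is only used for attributes ("{ns}name"); offset is a string index.
Position = namedtuple('Position', 'element kind name offset')

# A range is just a (start, end) pair of positions.
Range = namedtuple('Range', 'start end')


def position_text(pos):
    """Return the string a position points into, or None for a bare element."""
    if pos.kind == 'text':
        return pos.element.text or ''
    if pos.kind == 'tail':
        return pos.element.tail or ''
    if pos.kind == 'attribute':
        return pos.element.get(pos.name, '')
    return None


root = ET.fromstring('<p>Hello <b>big</b> world</p>')
bold = root[0]
r = Range(start=Position(root, 'text', None, 2),
          end=Position(bold, 'tail', None, 3))
print(position_text(r.start)[r.start.offset:])  # text after the start point
print(position_text(r.end)[:r.end.offset])      # tail before the end point
```

The DOM2 methods (extracting, deleting or surrounding the contents of a range) would have to be built on top of these two positions; that part is left out here.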

[lxml-dev] OpenVMS port of lxml
by Jean-François Piéronne July 15, 2007

Hi,

lxml 1.3.2 has been successfully ported to OpenVMS (Alpha and Itanium
platform).

The problems found are:

- A lot of compilation warnings, I can send them if there is some interest.

TESTED VERSION:
    Python:           (2, 5, 1, 'final', 0)
    lxml.etree:       (1, 3, 2, 0)
    libxml used:      (2, 6, 29)
    libxml compiled:  (2, 6, 29)
    libxslt used:     (1, 1, 21)
    libxslt compiled: (1, 1, 21)

- Some of the tests failed

* test_etree.py

................................................................................
..................................EE
======================================================================
ERROR: test_xinclude (__main__.XIncludeTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_etree.py", line 1604, in test_xinclude
    self.include( tree )
AttributeError: 'XIncludeTestCase' object has no attribute 'include'

======================================================================
ERROR: test_xinclude_text (__main__.XIncludeTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_etree.py", line 1597, in test_xinclude_text
    self.include( etree.ElementTree(root) )
AttributeError: 'XIncludeTestCase' object has no attribute 'include'

----------------------------------------------------------------------
Ran 116 tests in 0.258s

FAILED (errors=2)

I don't think it's a VMS-specific problem, but I don't have any other
platforms to test on.

* test_xslt.py

Python interpreter crash with an error during the rundown of the program:
assert error: expression = autoInterpreterState, in file
PYTHON_ROOT:[Python]pystate.c;1 at line 563

I can provide a complete traceback.

* test_elementtree.py and some other tests raised errors like

======================================================================
ERROR: test_ElementTree (__main__.ETreeTestCaseBase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_elementtree.py", line 252, in test_ElementTree
    Element = self.etree.Element
AttributeError: 'NoneType' object has no attribute 'Element'

Thanks for any advice.

Jean-François
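
The last error above ('NoneType' object has no attribute 'Element') is what a test sees when an optional module was replaced by None at import time. A sketch of that pattern with a guard that skips the test instead of failing, assuming this is how the module ended up as None; the class and function names are hypothetical, and `unittest.skipIf` is a feature of current Python versions, not of the Python 2.5 used in this report:

```python
import unittest

try:
    # Optional dependency; on a system without it the name becomes None.
    from xml.etree import ElementTree
except ImportError:
    ElementTree = None


class ETreeCompatTest(unittest.TestCase):
    etree = ElementTree

    @unittest.skipIf(ElementTree is None, 'ElementTree is not installed')
    def test_element_factory(self):
        # Without the guard above this line would raise AttributeError on None.
        el = self.etree.Element('a')
        self.assertEqual(el.tag, 'a')


def run_suite():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ETreeCompatTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == '__main__':
    run_suite()
```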

Re: [lxml-dev] Element children not right?
by Stefan Behnel July 15, 2007

Hi,

Mike Meyer wrote:
> On Sat, 14 Jul 2007 13:17:24 +0200 Stefan Behnel <stefan_ml(a)behnel.de> wrote:
>> Mike Meyer wrote:
>>> I think I have a very, very basic bug here:
>>> Note that the original ElementTree implementation only returns
>>> children that are *elements*, whereas the lxml version returns all
>>> the children. This makes life much more interesting, especially as
>>> there isn't an obvious method for checking whether or not the node
>>> value is actually an element.
>> Sure there is, just check for the tag property being a string. lxml.etree is
>> compatible to ElementTree in that it returns the factory functions for
>> everything that does not have a tag (comments, PIs, entities).
>>
>>   >>> [el for el in n if isinstance(el.tag, basestring)]
>
> Obvious is relative. I did find that test, but this doesn't say "I'm
> processing only element children". You have to know that those are the
> only children an element can have for which the tag attribute is a
> string. Maybe that's obvious to you, it certainly wasn't to me, and
> probably isn't to the casual reader either. Not when compared to
> something like:
>
> ***** WARNING: CODE FOR EXPOSITION ONLY. THESE DO NOT WORK ******
>
> [el for el in n if isinstance(el, etree.Element)]
> [el for el in n if etree.is_element(el)]
> [el for el in n if el.type == etree.ELEMENT_TYPE]
>
> (or any of a dozen ways that says "n is an element" as opposed to
> saying "some attribute of n is some other type that the two examples
> we're looking at only use for that attribute on the type of interest.")

Well, I admit that it's not the most obvious thing ever and that the lxml docs
do not provide obvious advice here. This is definitely something we can
improve.

Still, that's how ElementTree works. So, if you start by comparing
lxml.etree's behaviour to ElementTree and claim that we "have a very, very
basic bug here", you may have to accept that someone tells you that it's not a
bug in lxml.etree.

Note that ElementTree only shows this behaviour because its parser silently
drops comments and processing instructions completely. If you constructed the
same tree through the API, the behaviour of lxml.etree and ElementTree would be
identical for what you tested and you would run into the same problem with
both.

  >>> import xml.etree.ElementTree as ET
  >>> root = ET.XML("<root><?pi test?><b/></root>")
  >>> root.getchildren()
  [<Element b at b79e294c>]
  >>> root.append(ET.PI("test"))
  >>> root.getchildren()
  [<Element b at b79e294c>,
   <Element <function ProcessingInstruction at 0xb79e3cdc> at b79dd04c>]

The difference is that lxml.etree also behaves this way for a tree that was
parsed. I find this rather consistent.

  >>> import lxml.etree as et
  >>> et.XML("<root><?pi test?><b/></root>")
  <Element root at b79d4504>
  >>> root = et.XML("<root><?pi test?><b/></root>")
  >>> root.getchildren()
  [<?pi test?>, <Element b at b79d457c>]

>> should do what you want, in both lxml.etree and ElementTree.
>
> Right. But I don't need the extra filter in ElementTree - it returns
> just the element children and that's what it's documented as doing. I
> quite literally found this when I took working ElementTree code and
> tried to move it to lxml's implementation of ElementTree (and then
> moved back to ElementTree rather than use test on the tag attributes
> type).

Hmm, I see your point. Maybe we should provide an ElementTree-compatible parser
that strips comments and PIs from the document. That would at least help in
porting existing code.

> And this doesn't help at all if I want to distinguish between PIs and
> comments.

Well, read the ElementTree docs (or re-read my last mail). For elements, the
tag property returns the tag name, for everything else, it returns the
respective *factory function*, i.e. etree.Comment or
etree.ProcessingInstruction. So, testing what you have if it's not an Element
is actually not more than a straight "is".

>> I don't see why this should be a bug, it's just an extended tree model.
>> Comments are nothing to frown upon, they are as much part of the XML world as
>> element nodes, so why would you want to ignore them?
>
> You want to ignore them because you're working with an API that
> ignores them. If ElementTree returned them, then I wouldn't consider
> it a bug.

ElementTree *does* return them if they are in the tree. It's just the parser
that does not put them in the tree. So it actually encourages you to write code
that depends on the way the tree was constructed. If, one day, you feed one of
your functions with a tree containing Comments that were added through the API,
it will just stop working and you will have a hard time figuring out why your
perfectly working and not-touched-in-a-long-time function fails for a certain
subset of the input.

> However, the single most common thing to want to do when
> processing the children of a node is to recurse on the element
> children. ElementTree makes that easy by only exposing the children
> that are elements via the list API, as otherwise you wind up having to
> write more code for the most common use case.
>
> Even if you don't want to ignore them, I've never seen a case where
> you wanted to do the same things to all the children of a node. So
> there should be an easy way to figure out what kind of child you're
> dealing with, if nothing else.
>
> Basically, I think this is a bug in the design of the lxml extensions
> to the ElementTree API. Rather than extending the API, lxml changes
> the API by adding comments and PIs to the list interface. This isn't
> mentioned in any of the lxml.etree documentation, other than one
> paragraph on the compatibility page that implies that this might
> happen. As an aside, lxml doesn't add the other children of an element
> to that interface, though it would make as much sense to do so. This
> is presumably because the ElementTree API already has good interfaces
> for dealing with text and attribute children.

... and Elements and PIs and Comments, which all behave mostly the same at the
API level of lxml.etree and ElementTree.

> That same paragraph notes that you can enable ignoring comments by
> tweaking the parser (but doesn't deal with ignoring PIs).

True, as I said, an ET compatibility parser would help here.

> If the goal
> is to be compatible with ElementTree, this is wrong - the default mode
> should be the one that's compatible, and getting the incompatible mode
> should take extra work. If the goal is to be easy-to-use, then it's
> still wrong - the default mode should be the more common use case
> (though what's more common clearly depends on your environment).

Ok, but your proposal is based on the wrong assumption that it's the API that
shows this behaviour in ET. But since it's the parser, being compatible would
mean dropping PIs and comments by default, thus changing the document behind
the scenes in an I/O cycle. I think this is the wrong behaviour for a default.

> Even if you don't agree that this is the most common use case, being
> easy-to-use means there should be a way to check for each of the three
> types in the list API that's obvious when you read it and easy to find
> by looking at the class docstrings. Of course, doing this breaks
> compatibility with other ElementTree implementations, but we punted on
> that when we added extra children to the list API.

I assume you read carefully up to this point, so I won't need to comment on
this.

> My gut reaction is that it would be better to actually extend the API
> rather than changing it. For example, have get_children, getiterator,
> iterchildren, find, and so on accept extra keyword arguments to
> indicate it will *not* (won't because we're extending the ElementTree
> API, which only provides "will") filter PIs or comments (or even
> elements, though that one has the opposite default). However, that's
> just a first thought on the issue.

At least for all the iterator functions, you can pass a "tag" argument. If you
pass nothing, you will get all the nodes. If you pass "*", you will get only
Elements (i.e. nodes that have any tag name). If you pass one of the factory
functions, you will get only those. Currently, you cannot pass the Element
function (which is ET compatible behaviour), but maybe that would be a nice and
consistent alternative to passing "*".

Stefan
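
The two checks discussed in this thread (a string tag means "element", and an identity test against the factory function tells comments from PIs) can be put together in a few lines. A minimal sketch, using the standard library's xml.etree.ElementTree on a tree built through the API and Python 3's `str` in place of `basestring`; the helper name `kind_of` is made up for illustration:

```python
import xml.etree.ElementTree as ET

# Build the tree through the API so the comment and PI are really in it.
root = ET.Element('root')
root.append(ET.Comment('a comment'))
root.append(ET.ProcessingInstruction('pi', 'test'))
ET.SubElement(root, 'b')


def kind_of(node):
    """Classify a child node by looking at its tag."""
    if node.tag is ET.Comment:
        return 'comment'
    if node.tag is ET.ProcessingInstruction:
        return 'pi'
    return 'element'


# Only real elements have a string as their tag.
elements_only = [el for el in root if isinstance(el.tag, str)]
kinds = [kind_of(el) for el in root]
print([el.tag for el in elements_only], kinds)
```

Wrapping the test in a small named helper is one way to address the readability complaint raised above without changing the API.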

[lxml-dev] Custom resolvers vs. RelaxNG?
by Mike Meyer July 15, 2007

Hi,

Should custom resolvers work with Relax NG documents? They don't seem to be
invoked at all in my tests (and I already figured out that I have to have
libxml2 2.6.29 for them to work at all, so that's what I'm using).

Here's the resolver class:

    class MyResolver(Resolver):
        """A resolver that returns local strings."""

        __entities = {'vivid.rng': _vivid, 'dev.rng': _dev}

        def resolve(self, uri, id, context):
            """Returns the right string for the given name."""
            print "Resolving", uri, id, context
            if self.__entities.has_key(uri):
                return self.resolve_string(self.__entities[uri], context)
            else:
                return None

And it's invoked like so:

    parser = XMLParser()
    parser.resolvers.add(MyResolver())
    vivid = RelaxNG(fromstring(_vivid, parser))
    dev = RelaxNG(fromstring(_dev, parser))

The critical schema is dev:

    _dev = """<?xml version="1.0" encoding="UTF-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0">
      <include href="vivid.rng"/>
      <define name="host_contents" combine="choice">
        <element name="module">
          <ref name="anything"/>
        </element>
      </define>
      <define name="anything">
        <zeroOrMore>
          <choice>
            <element>
              <anyName/>
              <ref name="anything"/>
            </element>
            <attribute>
              <anyName/>
            </attribute>
            <text/>
          </choice>
        </zeroOrMore>
      </define>
    </grammar>
    """

I'd like the include of vivid.rng to pick up the _vivid string. However, when I
run this, it just complains about the RelaxNG format, even though it works fine
if I have a copy of _vivid in vivid.rng in the current directory:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "schema/__init__.py", line 545, in <module>
        dev = RelaxNG(fromstring(_dev, parser), parser)
      File "relaxng.pxi", line 70, in etree.RelaxNG.__init__
    etree.RelaxNGParseError: Document is not valid Relax NG

Clearly, there's a bug here. The question is, is it in my understanding of
things, or in lxml?

Thanks,
<mike
--
Mike Meyer <mwm(a)mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

[lxml-dev] Element children not right?
by Mike Meyer July 14, 2007

I think I have a very, very basic bug here:

>>> from sys import version
>>> version
'2.5.1 (r251:54863, May 15 2007, 15:31:37) \n[GCC 3.4.6 [FreeBSD] 20060305]'
>>> from lxml import etree
>>> etree.LXML_VERSION
(1, 3, 2, 0)
>>> etree.LIBXML_VERSION
(2, 6, 29)
>>> etree.LIBXSLT_VERSION
(1, 1, 21)
>>> d = etree.parse('/home/mwm/.plpwmrc.xml')
>>> n = d.find('namemenu')
>>> [el for el in n]
[<Element getapp at 7f1e10>, <Element getapp at 8aa418>, ,
 <Element getapp at 8aa470>, <Element getapp at 8aa4c8>,
 <Element getapp at 926ec0>, <Element getapp at 926f18>,
 <Element getapp at 926f70>, <Element getapp at 926fc8>,
 <Element run at 977b50>, <Element getapp at 97d4c8>]
>>> from xml.etree.ElementTree import parse
>>> d2 = parse('/home/mwm/.plpwmrc.xml')
>>> n2 = d2.find('namemenu')
>>> [el for el in n2]
[<Element getapp at 9845a8>, <Element getapp at 9845f0>,
 <Element getapp at 984638>, <Element getapp at 984680>,
 <Element getapp at 9846c8>, <Element getapp at 984710>,
 <Element getapp at 984758>, <Element getapp at 9847a0>,
 <Element run at 9847e8>, <Element getapp at 984830>]

Note that the original ElementTree implementation only returns
children that are *elements*, whereas the lxml version returns all
the children. This makes life much more interesting, especially as
there isn't an obvious method for checking whether or not the node
value is actually an element.

<mike
--
Mike Meyer <mwm(a)mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

[lxml-dev] Can't build lxml sources - failure to link - Windows
by Robert Dailey July 12, 2007

Hi, I'm attempting to build LXML for windows. Below are details on the linker errors I'm getting (the compile works fine). Anyone that can help would be greatly appreciated. Thank you! Here is my modified paths in the setup.py file: STATIC_INCLUDE_DIRS = [ "..\\libxml2\\include", "..\\libxslt\\include", "..\\zlib\\include", "..\\iconv\\include" ] STATIC_LIBRARY_DIRS = [ "..\\libxml2\\lib", "..\\libxslt\\lib", "..\\zlib\\lib", "..\\iconv\\lib", "C:\\Program Files\\Microsoft Visual Studio 8\\VC\\lib" ] STATIC_CFLAGS = [] I get the following output in the command line (note the first line is the line I typed in): C:\IT\SDK\lxml>python setup.py build -c mingw32 --static Building lxml version 1.3.2 C:\Python25\lib\distutils\dist.py:263: UserWarning: Unknown distribution option: 'zip_safe' warnings.warn(msg) running build running build_py running build_ext building 'lxml.etree' extension writing build\temp.win32-2.5\Release\src\lxml\etree.def C:\mingw\bin\gcc.exe -mno-cygwin -shared -s build\temp.win32- 2.5\Release\src\lxml\etree.o build\temp .win32-2.5\Release\src\lxml\etree.def -L..\libxml2\lib -L..\libxslt\lib -L..\zlib\lib -L..\iconv\lib "-LC:\Program Files\Microsoft Visual Studio 8\VC\lib" -LC:\Python25\libs -LC:\Python25\PCBuild -lli bxslt_a -llibexslt_a -llibxml2_a -liconv_a -lzlib -lWS2_32 -lpython25 -lmsvcr71 -o build\lib.win32-2 .5\lxml\etree.pyd Warning: .drectve `/DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"ws2_32.lib" /DEFAULTLI B:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: 
.drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"advapi32.lib" /DEFAULT LIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" 
' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"O LDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"O LDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve 
`/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"O LDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"uuid.lib" /DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"O LDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized Warning: .drectve `/DEFAULTLIB:"MSVCRT" /DEFAULTLIB:"OLDNAMES" ' unrecognized ..\libxslt\lib\libxslt_a.lib(int.xslta.msvc /xsltutils.obj):..\libxslt\xsltuti:(.text[_xsltTimestamp] +0xa5): undefined reference to `_ftol2' ..\libxslt\lib\libxslt_a.lib(int.xslta.msvc /numbers.obj):..\libxslt\numbers:(.text[_xsltNumberFormat Decimal]+0x9c): undefined reference to 
`_ftol2' ..\libxslt\lib\libxslt_a.lib(int.xslta.msvc /numbers.obj):..\libxslt\numbers:(.text[_xsltNumberFormat Alpha]+0x4b): undefined reference to `_ftol2' ..\libxslt\lib\libxslt_a.lib(int.xslta.msvc /numbers.obj):..\libxslt\numbers:(.text[_xsltNumberFormat ]+0x6): undefined reference to `_chkstk' ..\libxslt\lib\libexslt_a.lib(int.exslta.msvc /date.obj):..\libexslt\date.c:(.text[_exsltDateParseDur ation]+0x226): undefined reference to `_ftol2' ..\libxslt\lib\libexslt_a.lib(int.exslta.msvc /date.obj):..\libexslt\date.c:(.text[_exsltDateParseDur ation]+0x230): undefined reference to `_ftol2' ..\libxslt\lib\libexslt_a.lib(int.exslta.msvc /date.obj):..\libexslt\date.c:(.text[_exsltDateFormatDu ration]+0x119): undefined reference to `_ftol2' ..\libxslt\lib\libexslt_a.lib(int.exslta.msvc /date.obj):..\libexslt\date.c:(.text[_exsltDateFormatDu ration]+0x175): undefined reference to `_ftol2' ..\libxslt\lib\libexslt_a.lib(int.exslta.msvc /date.obj):..\libexslt\date.c:(.text[_exsltDateFormatDu ration]+0x213): undefined reference to `_ftol2' ..\libxslt\lib\libexslt_a.lib(int.exslta.msvc /date.obj):..\libexslt\date.c:(.text[_exsltDateFormatDu ration]+0x28a): more undefined references to `_ftol2' follow ..\libxml2\lib\libxml2_a.lib(int.a.msvc/encoding.obj):..\encoding.c:(.text[_xmlByteConsumed]+0x6): u ndefined reference to `_chkstk' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /valid.obj):..\valid.c:(.text[_xmlValidBuildContentModel]+0x6 ): undefined reference to `_chkstk' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /valid.obj):..\valid.c:(.text[_xmlValidateElementContent]+0x6 ): undefined reference to `_chkstk' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /xpointer.obj):..\xpointer.c:(.text[_xmlXPtrStringRangeFuncti on]+0x65): undefined reference to `_ftol2' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /xpointer.obj):..\xpointer.c:(.text[_xmlXPtrStringRangeFuncti on]+0x9d): undefined reference to `_ftol2' ..\libxml2\lib\libxml2_a.lib(int.a.msvc 
/debugXML.obj):..\debugXML.c:(.text[_xmlCtxtDumpElemDecl]+0x6 ): undefined reference to `_chkstk' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /xmlschemastypes.obj):..\xmlschemastypes:(.text[_xmlSchemaVal idateDuration]+0x21c): undefined reference to `_ftol2' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /xmlschemastypes.obj):..\xmlschemastypes:(.text[_xmlSchemaVal idateDuration]+0x226): undefined reference to `_ftol2' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /xmlschemastypes.obj):..\xmlschemastypes:(.text[_xmlSchemaCom pareDurations]+0x2f): undefined reference to `_ftol2' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /xmlschemastypes.obj):..\xmlschemastypes:(.text[__xmlSchemaDa teAdd]+0xfe): undefined reference to `_ftol2' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /xmlschemastypes.obj):..\xmlschemastypes:(.text[__xmlSchemaDa teAdd]+0x120): undefined reference to `_ftol2' ..\libxml2\lib\libxml2_a.lib(int.a.msvc /xmlschemastypes.obj):..\xmlschemastypes:(.text[__xmlSchemaDa teAdd]+0x171): more undefined references to `_ftol2' follow ..\libxml2\lib\libxml2_a.lib(int.a.msvc /nanohttp.obj):..\nanohttp.c:(.text[_xmlNanoHTTPReadLine]+0x6 ): undefined reference to `_chkstk' ..\libxml2\lib\libxml2_a.lib(int.a.msvc/nanoftp.obj):..\nanoftp.c:(.text[_xmlNanoFTPList]+0x6): unde fined reference to `_chkstk' ..\libxml2\lib\libxml2_a.lib(int.a.msvc/nanoftp.obj):..\nanoftp.c:(.text[_xmlNanoFTPGet]+0x6): undef ined reference to `_chkstk' ..\iconv\lib\iconv_a.lib(iconv.obj):./iconv.c:(.text[_libiconvlist]+0x9): undefined reference to `_c hkstk' ..\zlib\lib\zlib.lib(gzio.obj):gzio.c:(.text[_gzprintf]+0x6): undefined reference to `_chkstk' collect2: ld returned 1 exit status error: command 'gcc' failed with exit status 1

[lxml-dev] Version 1.2.1 not working?
by Robert Dailey July 11, 2007

Hi, I have the following Python code: from lxml import etree from StringIO import StringIO def loadXMLFile( filename ): f = open( filename, 'r' ) xmldata = f.read() root = etree.parse( StringIO( xmldata ) ) f.close() return root Python either crashes or hangs at the etree.parse() call. Below is the contents of the XML file I'm opening: <Page> <Frame type="Root">  <OnIdleTime time="10000"> <StartAttractMode dispatch="GAME"/> </OnIdleTime>  <Frame type="Image" value="Interface/Titlescreen.spr"> <OnInput key="KEY_ACCEPT" require_focus="false">  <GoToPage dispatch="GUI" page="MainMenu"/> <PlaySound dispatch="GAME" file="InterfaceSchwing"/> </OnInput> </Frame> </Frame> </Page> Anyone know why it isn't working? Thanks!
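
For comparison, a loader can hand the filename straight to the parser instead of reading the file and wrapping the data in StringIO. The sketch below does not diagnose the crash or hang reported above; it only shows the shorter form. It uses the standard library's xml.etree.ElementTree as a stand-in so that it runs without lxml, on the assumption that lxml.etree.parse accepts a filename in the same way. The names `load_xml_file` and `demo` are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET  # stand-in for lxml.etree in this sketch


def load_xml_file(filename):
    """Parse filename and return the root element."""
    # Let the parser open and decode the file itself.
    tree = ET.parse(filename)
    return tree.getroot()


def demo():
    """Write a small document to a temporary file and load it again."""
    fd, path = tempfile.mkstemp(suffix='.xml')
    try:
        with os.fdopen(fd, 'w') as f:
            f.write('<Page><Frame type="Root"/></Page>')
        return load_xml_file(path)
    finally:
        os.remove(path)


if __name__ == '__main__':
    root = demo()
    print(root.tag, root[0].get('type'))
```

Note that this returns the root element, whereas the function in the message returns the tree object that parse() produces.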

Re: [lxml-dev] naming the lxml.html parse functions
by Stefan Behnel July 10, 2007

Ian Bicking wrote:
> I'm still not sure what to call all the parsing functions for HTML.

Hmm, there isn't really something comparable in lxml's API so far, so we can't
just copy names here.

"parse_string()" would match their intention, so that would make it
"parse_string_element()" and "parse_string_elements()". Maybe that's too long
for an every-day-use function, but at least the names are clear. I don't even
think length matters here as parse functions may be used in every program, but
likely only once or a couple of times in a few selected places, so clarity
outweighs typing here IMHO.

"strparse()" would be shorter but might suggest that they only parse plain
strings, not unicode (although unicode parsing is somewhat 'advanced use'
anyway).

On the other hand, I'm wondering why they parse strings in the first place.
Wouldn't parsing from a file make more sense? There's always StringIO if you
need it (which is efficiently special cased in lxml). Note that libxml2 can
even parse from http and ftp URLs directly, so you would even lose something
(if only performance) if you required people to load a document into memory
first and then pass it to the parser as a string. You'd also lose base URL
information, BTW.

So, my preferred solution would be to keep the names and make them functions
that parse from a filename or file-like object, just like etree.parse() works.
Admittedly, that's a bit tricky as you can't check what the file starts with to
decide how to parse it without opening it first...

> Also
> I'd like some method on at least HTML elements for doing CSS selections,
> but I'm not sure what to call it. Any ideas?

Well, the xpath() method is named after the language, so why not just call the
method "cssselect()" ? That makes it clear where the implementation comes from
and matches the existing API.

Stefan
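
The preferred design in this message (parse from a filename or file-like object, with StringIO for callers that only have a string) could look roughly like the sketch below. The function names are placeholders taken from or modelled on the naming discussion, not an existing lxml.html API, and the standard library's XML parser stands in for an HTML parser so the sketch is runnable as-is.

```python
import io
import xml.etree.ElementTree as ET  # stand-in; the thread is about lxml.html


def parse_element(source):
    """Parse a filename or file-like object and return the root element."""
    return ET.parse(source).getroot()


def parse_string_element(text):
    """Convenience wrapper for callers that only have a string."""
    return parse_element(io.StringIO(text))


if __name__ == '__main__':
    root = parse_string_element('<div><p>one</p><p>two</p></div>')
    print(root.tag, [p.text for p in root])
```

With this split, the string variant is a thin layer over the file variant, so anything the file-based function gains later is inherited by string callers.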

[lxml-dev] xpath on newly created elements
by Doug Winter July 8, 2007

I can't make xpath work on elements that have been created using
etree.Element when they have a namespace that doesn't use Clark notation.

I have a test case:

-- begins --
from lxml import etree

print "lxml.etree: ", etree.LXML_VERSION
print "libxml used: ", etree.LIBXML_VERSION
print "libxml compiled: ", etree.LIBXML_COMPILED_VERSION
print "libxslt used: ", etree.LIBXSLT_VERSION
print "libxslt compiled: ", etree.LIBXSLT_COMPILED_VERSION

nsmap=dict(test="http://test.com")

e = []
e.append(etree.fromstring('<test:foo xmlns:test="http://test.com" />'))
e.append(etree.Element("test:foo", nsmap=nsmap))
e.append(etree.Element("test:foo", {'xmlns:test': nsmap['test']}))
e.append(etree.Element("{%(test)s}foo" % nsmap))
e.append(etree.Element("{%(test)s}foo" % nsmap, nsmap=nsmap))

for i, elem in enumerate(e):
    print i, elem.xpath("/test:foo", nsmap)
-- ends --

I get this output if I run the above:

lxml.etree:  (1, 3, 2, 0)
libxml used:  (2, 6, 27)
libxml compiled:  (2, 6, 27)
libxslt used:  (1, 1, 20)
libxslt compiled:  (1, 1, 20)
0 [<Element {http://test.com}foo at b7a18374>]
1 []
2 []
3 [<Element {http://test.com}foo at b7a1848c>]
4 [<Element {http://test.com}foo at b7a184dc>]

I would expect all 5 cases to match the root element, but cases 1 and 2 do
not. It appears to be only for elements created using namespace prefixes -
and yet these work perfectly well in all other respects.

Is this a bug, or should elements not be created this way?

Cheers,

Doug.

--
Isotoma, Open Source Software Consulting - http://www.isotoma.com
Tel: 01904 567349, Mobile: 07879 423002, Fax: 020 79006980
Postal Address: Tower House, Fishergate, York, YO10 4UA, UK
Registered in England. Company No 5171172. VAT GB843570325.
Registered Office: 19a Goodge Street, London, W1T 2PH
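
The difference between Clark notation and a "prefix:name" string can be seen with the standard library alone. The sketch below uses xml.etree.ElementTree and its `findall()` with a prefix map, not lxml's `xpath()`, so it illustrates the general ElementTree tag model only; it is not a statement about what lxml 1.3.2 does internally in cases 1 and 2 above.

```python
import xml.etree.ElementTree as ET

NS = {'test': 'http://test.com'}

root = ET.Element('root')
# Clark notation puts the element into the namespace.
clark = ET.SubElement(root, '{http://test.com}foo')
# xml.etree.ElementTree stores a "prefix:name" string as a literal tag name
# with a colon in it; no namespace is attached to the element.
prefixed = ET.SubElement(root, 'test:foo')

# The prefix in the path is expanded to {http://test.com}foo before matching.
matches = root.findall('test:foo', NS)
print(clark.tag, prefixed.tag, len(matches))
```

In this model only the element created with Clark notation is found, because matching is done on the expanded `{uri}name` form rather than on the prefix.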
