
On Sat, Apr 22, 2017 at 8:18 AM, Bob Kline <bkline@rksystems.com> wrote:
On Sat, Apr 22, 2017 at 2:31 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
I left some comments, looks good otherwise. Could you also add a test for the new feature?
I'm about to head out of town for the weekend. When I get back I'll add the test and prepare and submit the pull request.
Well, that's what I thought would happen. I did add the test, and the test passes perfectly. However, when I run the entire suite of tests, a segmentation fault dumps core. More specifically, if I run any of test_etree.py, test_xslt.py, or test_xpathevaluator.py, I get a core dump. All the others do fine. Here's my modified xmlerror.pxi: https://github.com/bkline/lxml/blob/error-location/src/lxml/xmlerror.pxi And here's the diff for the three changed files: https://github.com/lxml/lxml/compare/master...bkline:error-location I have done as much sleuthing as I could think to do (short of running Python under gdb), but without success. I tried a bunch of different things, described below. I wondered if libxml2 did something bizarre and undocumented instead of just returning NULL when it couldn't come up with a path for the node, so I tried adding a test to avoid calling xmlGetNodePath() unless _isElement(self._c_path) was true. Didn't eliminate the core dumps. I wondered if some code somewhere was instantiating _LogEntry objects without ever calling _setError() or _setGeneric(), thus avoiding the code which initializes self._c_path to NULL, so I tried adding a __cinit__() method, which the Cython docs [1] say is guaranteed to called exactly once, to do that initialization for all instances of the class. Didn't help. Besides, I saw that that same paragraph in the Cython docs says "By the time your __cinit__() method is called, memory has been allocated for the object and any C attributes it has have been initialised to 0 or null." That would mean that what I was doing in the __cinit__() method had already been done. I would also interpret that sentence in the Cython documentation to imply that the C attributes for the object are automatically initialized to 0 or null even if a __cinit__() method is not explicitly provided (though the path for extension types [2] oddly makes no mention of this initialization, unless it's there and I just missed it). If my interpretation is correct, the assignment of self._c_path (and self._c_message and self._c_filename, for that matter) to NULL in self._setError() and self._setGeneric() are superfluous, no? I wondered if xmlFree() (for which I couldn't find much documentation) did not behave the way free() in the standard library does (doing nothing if the passed pointer is NULL). So I added a test in the __dealloc__() method to avoid calling xmlFree() unless self._c_path is not NULL. But that didn't eliminate the core dumps, either. Besides, if this had been the problem, self._c_filename and self._c_message would have been causing segmentation faults. And in fact, even commenting out the call to xmlFree() altogether didn't eliminate the problem (though it would add another problem, leaking memory). I tried some other, even sillier things, but those were all in the "grasping at straws" category, and not worth mentioning. The only way to make the core dumps go away was to comment out the call to xmlGetNodePath() altogether, hard-coding the return value for the property getter. But that's obviously not much of a solution. :-) I did confirm that the baseline code on top of which I added my patch runs the test suite without any errors or core dumps. This is so tantalizing, because Stephan's proposed approach, using xmlGetNodePath(), is so clean and does exactly what I need it to do. At this point, the only thing I can think of is to fall back on the idea I mentioned earlier of making the capture of the node's location conditional, based on some thread-based setting turning it on. But I would much rather understand why xmlGetNodePath() is blowing up, and I would guess that would be everyone else's preference as well. So, what am I missing, Cython/lxml gurus? Bewildered, Bob [1] http://cython.readthedocs.io/en/latest/src/userguide/special_methods.html [2] http://cython.readthedocs.io/en/latest/src/userguide/extension_types.html