
Stefan Behnel wrote: [snip]
We could also consider adding an external utility module to provide helpers like this that are not really worth polluting the API with. Something like:
lxml.tools.lineof(element)
That would be icing on the cake; either way is fine. If you do consider such a tool, I would probably call it "parseInfo" or so, where maybe the filename, endline, and column info are available too.
The filename would be available from documents. I don't know what you mean by "endline" (the last line number?), and the parser column is not available from libxml2 (at least not once the parser has passed the element...)
So, what about an 'lxml.docinfo' module then that provides this kind of helper function? I was never really happy with the DocInfo class, so it might be a good idea to just move this kind of information to a separate module that people can use if they need it.
I'm pretty confident that there is even more that we could provide at that level. And it would help us in keeping the already bigger-than-big-enough API of lxml at least a little smaller.
I really think this is overkill. I think an attribute 'line' is fine. lxml has an explicit mission to take ElementTree and expand its API with more functionality. We do this with namespaces, we do this with xpath, and why wouldn't we do this with line numbers? I don't understand how line numbers are different.
By the way, even though 0 is used both for line 0 and for elements whose line number is unknown, it actually seems possible to distinguish between the two! If 'line 0' is found, we would go backwards in document order until we find a text node that contains a newline. If there is one, the answer is None. If there is none (and this can be checked quickly), the answer is 0. Oh, possibly even more efficient would be to look for *another* node. If that node has a non-zero line number, you know you can return None. That would make the 'line' API pretty reliable.
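A rough sketch of the first approach (walking backwards to look for a newline), just to make the idea concrete. This is not lxml code: the Node class, its fields, and resolve_line are made up for illustration, and the document is modelled as a flat list of nodes in document order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parsed node; not an lxml class."""
    kind: str          # "element" or "text"
    raw_line: int = 0  # 0 means either "line 0" or "unknown"
    text: str = ""     # only meaningful for text nodes


def resolve_line(nodes: List[Node], index: int) -> Optional[int]:
    """Return the line of nodes[index], or None if it is unknown."""
    node = nodes[index]
    if node.raw_line != 0:
        # A non-zero value is unambiguous.
        return node.raw_line
    # Walk backwards in document order looking for a newline.
    for earlier in reversed(nodes[:index]):
        if earlier.kind == "text" and "\n" in earlier.text:
            # Something before us ended a line, so we cannot really be
            # on line 0: the stored 0 must mean "unknown".
            return None
    # No newline before this node, so it really is on line 0.
    return 0


doc = [
    Node("element", raw_line=0),   # root, really on line 0
    Node("text", text="\n  "),
    Node("element", raw_line=0),   # after a newline: unknown
    Node("element", raw_line=2),
]
lines = [resolve_line(doc, i) for i in (0, 2, 3)]
```

The backwards walk stops at the first newline it finds, so it only costs anything for elements that actually report 0.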
Regards,
Martijn