Re: [lxml] Lxml aborts with an odd error message
Martin Mueller, 26.05.2014 16:18:
Here's an implementation:

https://github.com/lxml/lxml/commit/35316b052af48921657813bb68563fe4a301d1b8

I attached a little test program that I used for benchmarking. It stress tests the XML ID handling by parsing lots of elements with different IDs and discarding them right after parsing. The new implementation performs 5x better with the normal parser and about 50x better with the new collect_ids=False option.

Given how rare the usage of the XML ID hash table should be in real code, this makes me wonder if the option should not be switched off by default, however backwards incompatible that is.

Can you test it from the latest GitHub version?

BTW, the lxml homepage has a PayPal link to allow for sponsorship of lxml's development, just in case this wasn't generally known. :)

Stefan
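The attached test program is not part of this archive. For readers who want to try something similar, here is a rough sketch of what such a stress test might look like. It is not the original program: the helper names `build_document` and `stress_parse`, the element names, and the document sizes are all made up for illustration, and it assumes an lxml version that already supports the `collect_ids` parser option.

```python
import time
from lxml import etree


def build_document(first_id, count):
    """Return XML bytes with `count` elements that carry distinct xml:id values."""
    items = "".join(
        '<item xml:id="id%d"/>' % i for i in range(first_id, first_id + count)
    )
    return ("<root>" + items + "</root>").encode("utf-8")


def stress_parse(parser, rounds=50, count=1000):
    """Parse `rounds` documents and drop each tree right after parsing.

    Returns (elapsed_seconds, number_of_item_elements_seen).
    """
    # Build the inputs up front so only parsing and cleanup are timed.
    # Every document uses a fresh range of IDs.
    documents = [build_document(r * count, count) for r in range(rounds)]
    seen = 0
    start = time.perf_counter()
    for data in documents:
        root = etree.fromstring(data, parser)
        seen += len(root)
        del root  # discard the tree immediately
    return time.perf_counter() - start, seen


if __name__ == "__main__":
    default_parser = etree.XMLParser()
    no_ids_parser = etree.XMLParser(collect_ids=False)  # skip the ID hash table
    for label, parser in [("default", default_parser),
                          ("collect_ids=False", no_ids_parser)]:
        elapsed, seen = stress_parse(parser)
        print("%-18s %6d elements in %.3fs" % (label, seen, elapsed))
```

The absolute timings depend on the machine and on the lxml and libxml2 versions, so the ratios quoted above should not be expected to reproduce exactly with this sketch.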
participants (1)
-
Stefan Behnel