Martin Mueller wrote on 31.12.21 at 18:06:
I have used lxml extensively in a PyCharm environment that calls on a conda environment. Lately I encountered an odd error. The correct output of a marylamb.py script looks like this:
<l xmlns="http://www.tei-c.org/ns/1.0"> Mary had a little lamb,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Its fleece was white as snow, yeah.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Everywhere the child went,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> The little lamb was sure to go, yeah.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> He followed her to school one day,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> And broke the teacher's rule.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> What a time did they have,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> That day at school.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Tisket, tasket,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> A green and yellow basket.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Sent a letter to my baby,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> On my way I passed it.</l>
In the buggy output the script runs amok and prints the current line plus the rest of the text. I print it out at the end of this memo. The PyCharm folks were able to identify the conda environment as the likely culprit. If I run the script outside that environment, it doesn't happen. The problem seems to be limited to lxml running in a conda environment, because scripts that don't use lxml are not plagued by that bug.
It's most likely an issue with the libxml2 version. You probably have 2.9.12 installed in your conda env. If you go back to 2.9.10, then it would probably work.

    conda install libxml2=2.9.10

You can find the version that lxml uses with

"""
from lxml import etree
print("%-20s: %s" % ('lxml.etree', etree.LXML_VERSION))
print("%-20s: %s" % ('libxml used', etree.LIBXML_VERSION))
print("%-20s: %s" % ('libxml compiled', etree.LIBXML_COMPILED_VERSION))
"""

The "LIBXML_VERSION" is what is currently used.

Stefan
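
For anyone who wants to run the version check above in several environments, here is a minimal sketch that turns it into a small report. The helper names `format_version` and `check_libxml2` are made up for this illustration and are not part of lxml. The "suspect" version 2.9.12 is only the guess from this thread, not a confirmed diagnosis. The one thing taken from lxml itself is that `etree.LIBXML_VERSION` and `etree.LIBXML_COMPILED_VERSION` are tuples of integers.

```python
def format_version(version):
    """Turn a version tuple such as (2, 9, 12) into the string '2.9.12'."""
    return ".".join(str(part) for part in version)


def check_libxml2(runtime, compiled, suspect=(2, 9, 12)):
    """Return a list of notes about the libxml2 versions (hypothetical helper).

    runtime  -- version lxml is using right now (etree.LIBXML_VERSION)
    compiled -- version lxml was built against (etree.LIBXML_COMPILED_VERSION)
    suspect  -- assumption from this thread: the version guessed to misbehave
    """
    notes = []
    if tuple(runtime[:3]) == tuple(suspect):
        notes.append("runtime libxml2 %s is the version suspected in this thread"
                     % format_version(runtime))
    if tuple(runtime[:3]) != tuple(compiled[:3]):
        notes.append("runtime libxml2 %s differs from the build-time version %s"
                     % (format_version(runtime), format_version(compiled)))
    return notes


if __name__ == "__main__":
    try:
        from lxml import etree
    except ImportError:
        print("lxml is not installed in this environment")
    else:
        notes = check_libxml2(etree.LIBXML_VERSION, etree.LIBXML_COMPILED_VERSION)
        for note in notes or ["nothing suspicious found"]:
            print(note)
```

Running it once inside the conda environment and once outside should show whether the two setups really load different libxml2 versions, which is the assumption behind the suggested downgrade.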