question about a bug when lxml runs in a conda environment
I have used lxml extensively in a PyCharm project that runs on a conda environment. Lately I encountered an odd error. The correct output of a marylamb.py script goes like this:

<l xmlns="http://www.tei-c.org/ns/1.0"> Mary had a little lamb,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Its fleece was white as snow, yeah.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Everywhere the child went,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> The little lamb was sure to go, yeah.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> He followed her to school one day,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> And broke the teacher's rule.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> What a time did they have,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> That day at school.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Tisket, tasket,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> A green and yellow basket.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Sent a letter to my baby,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> On my way I passed it.</l>

In the buggy output the script runs amok: for each line it prints the current line plus the rest of the text. I print it out at the end of this memo. The PyCharm folks were able to identify the conda environment as the likely culprit. If I run the script outside it, the problem doesn't happen. It seems to be limited to lxml running in a conda environment, because scripts that don't use lxml are not plagued by that bug.

Has anybody encountered a similar problem, or does anyone have a suggestion for how to fix it? I guess not using conda is an option, but it's convenient in many other ways.
Here is the buggy output, with an empty line added after each line to make it a little more readable:

<l xmlns="http://www.tei-c.org/ns/1.0"> Mary had a little lamb,</l><l> Its fleece was white as snow, yeah.</l><l> Everywhere the child went,</l><l> The little lamb was sure to go, yeah.</l></lg><lg><l> He followed her to school one day,</l><l> And broke the teacher's rule.</l><l> What a time did they have,</l><l> That day at school.</l></lg><lg><l> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> Its fleece was white as snow, yeah.</l><l> Everywhere the child went,</l><l> The little lamb was sure to go, yeah.</l></lg><lg><l> He followed her to school one day,</l><l> And broke the teacher's rule.</l><l> What a time did they have,</l><l> That day at school.</l></lg><lg><l> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> Everywhere the child went,</l><l> The little lamb was sure to go, yeah.</l></lg><lg><l> He followed her to school one day,</l><l> And broke the teacher's rule.</l><l> What a time did they have,</l><l> That day at school.</l></lg><lg><l> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> The little lamb was sure to go, yeah.</l></lg><lg><l> He followed her to school one day,</l><l> And broke the teacher's rule.</l><l> What a time did they have,</l><l> That day at school.</l></lg><lg><l> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> He followed her to school one day,</l><l> And broke the teacher's rule.</l><l> What a time did they have,</l><l> That day at school.</l></lg><lg><l> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> And broke the teacher's rule.</l><l> What a time did they have,</l><l> That day at school.</l></lg><lg><l> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> What a time did they have,</l><l> That day at school.</l></lg><lg><l> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> That day at school.</l></lg><lg><l> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> Tisket, tasket,</l><l> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> A green and yellow basket.</l><l> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> Sent a letter to my baby,</l><l> On my way I passed it.</l></lg></div></body></text></TEI>

<l xmlns="http://www.tei-c.org/ns/1.0"> On my way I passed it.</l></lg></div></body></text></TEI>
Dear Martin, Please send marylamb.py and the XML files of interest. Cordially, Thomas
Martin Mueller wrote on 31.12.21 at 18:06:
I have used lxml extensively in a PyCharm project that runs on a conda environment. Lately I encountered an odd error. The correct output of a marylamb.py script goes like this:
<l xmlns="http://www.tei-c.org/ns/1.0"> Mary had a little lamb,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Its fleece was white as snow, yeah.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Everywhere the child went,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> The little lamb was sure to go, yeah.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> He followed her to school one day,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> And broke the teacher's rule.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> What a time did they have,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> That day at school.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Tisket, tasket,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> A green and yellow basket.</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> Sent a letter to my baby,</l>
<l xmlns="http://www.tei-c.org/ns/1.0"> On my way I passed it.</l>
In the buggy output the script runs amok: for each line it prints the current line plus the rest of the text. I print it out at the end of this memo. The PyCharm folks were able to identify the conda environment as the likely culprit. If I run the script outside it, the problem doesn't happen. It seems to be limited to lxml running in a conda environment, because scripts that don't use lxml are not plagued by that bug.
It's most likely an issue with the libxml2 version. You probably have 2.9.12 installed in your conda env. If you go back to 2.9.10, then it would probably work.

conda install libxml2=2.9.10

You can find the version that lxml uses with

"""
from lxml import etree
print("%-20s: %s" % ('lxml.etree', etree.LXML_VERSION))
print("%-20s: %s" % ('libxml used', etree.LIBXML_VERSION))
print("%-20s: %s" % ('libxml compiled', etree.LIBXML_COMPILED_VERSION))
"""

The "LIBXML_VERSION" is what is currently used.

Stefan
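To confirm that a given environment shows the symptom described in this thread, independently of marylamb.py, a small self-check along the following lines might help. It is only a sketch: SAMPLE is a made-up stand-in for the real TEI file, and is_runaway and check_installed_lxml are hypothetical helper names, not part of lxml. The check assumes non-nested <l> elements and simply looks for anything other than whitespace after the first closing </l> tag.

```python
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical miniature TEI document standing in for the real input file.
SAMPLE = (
    '<TEI xmlns="%s"><text><body><div><lg>'
    "<l> Mary had a little lamb,</l>"
    "<l> Its fleece was white as snow, yeah.</l>"
    "</lg></div></body></text></TEI>" % TEI_NS
)


def is_runaway(serialized, tag="l"):
    """Return True if anything but whitespace follows the first closing tag.

    Assumes the element is not nested inside another element of the same name.
    """
    _, _, rest = serialized.partition("</%s>" % tag)
    return rest.strip() != ""


def check_installed_lxml():
    """Serialize every <l> in SAMPLE with the installed lxml.

    Returns True if any serialization runs past its own element,
    False if all look fine, and None if lxml cannot be imported.
    """
    try:
        from lxml import etree
    except ImportError:
        return None
    print("libxml used:", etree.LIBXML_VERSION)
    root = etree.fromstring(SAMPLE.encode("utf-8"))
    return any(
        is_runaway(etree.tostring(el, encoding="unicode", with_tail=False))
        for el in root.iter("{%s}l" % TEI_NS)
    )


if __name__ == "__main__":
    print("runaway serialization:", check_installed_lxml())
```

Running this once inside the conda environment and once outside it should show whether the two report different libxml2 versions and whether only one of them produces the runaway output.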
participants (3)
- _@thomaslevine.com
- Martin Mueller
- Stefan Behnel