
Hi,

I'd like to try a tiny change to the CDATA class. To try it, I have to be able to build lxml; unfortunately, on Windows.

I've downloaded Visual Studio 2019 CE. I created a (Python 3.12) virtual environment, where I installed Cython (latest version). I cloned the lxml sources from GitHub. I then opened a "Developer command prompt for VS 2019", activated the virtual environment, and typed:

(.venv) C:\Temp\lxml\lxml>python setup.py build_ext -i --with-cython --static-deps

This downloads the dependencies like libxml2 etc.; this goes without problems. Then compilation starts, and gives errors:

[...]
Creating library build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.lib and object build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.exp
etree.obj : error LNK2001: unresolved external symbol _xmlStrchr
etree.obj : error LNK2001: unresolved external symbol _xmlIOParseDTD
etree.obj : error LNK2001: unresolved external symbol _xmlMemShow
[...]

There are in total 503 unresolved externals. I checked the first one, and found that it is present in the downloaded libxml2_a.lib, but without the underscore. The directories of the downloaded libraries are correctly added to the compiler command line.

I am out of my depth here. I hope someone is willing to help me proceed.

Kind regards,
Gertjan.

Hi,

Gertjan Klein wrote on 21.04.24 at 16:12:
> I'd like to try a tiny change to the CDATA class. In order to try, I have to be able to build lxml. Unfortunately, on Windows.

Yeah, supporting Windows is anything but trivial due to the general lack of platform-provided build support. Thus, all libraries have to do their own thing, and bringing that together is not easy. I'm happy myself that there is a working build setup at all.

If it's a somewhat straightforward change that doesn't need tons of back-and-forth testing and debugging, and you have a GitHub account, you could also use their CI service (GitHub Actions), either on your own account or in lxml's account via a pull request.

It might help to see the command line.

Stefan

On 25-04-2024 at 16:58, Stefan Behnel wrote:
> If it's a somewhat straightforward change that doesn't need tons of
That would be an interesting approach (I do have a github account). I have looked at the actions to see if they would help me figure out how to build lxml, but what it does is beyond me, I'm afraid. If I could fork lxml, experiment, and have an action build it for me, I'd be very happy. 🙂 I'm not sure how to go about that, though.
> It might help to see the command line.
For completeness, I've included everything I did below (it isn't actually that much). It wraps, which doesn't help readability, but I don't know how to stop Thunderbird doing that. (Is attaching a text file allowed on the list? Perhaps that would work better.)

As you can see below, compiling seems to work fine, but linking fails. At least the first two symbols, _xmlStrchr and _xmlIOParseDTD, can be found within libxml2_a.lib, but without the underscore. Does this mean the libraries are perhaps compiled with different flags?

Thanks for your help, kind regards,
Gertjan.

==========
C:\Temp>git clone https://github.com/lxml/lxml.git
[...]
C:\Temp>cd lxml
C:\Temp\lxml>py -m venv .venv
C:\Temp\lxml>.venv\Scripts\activate
(.venv) C:\Temp\lxml>call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\Tools\VsDevCmd.bat"
[...]
(.venv) C:\Temp\lxml>pip install cython
Using cached Cython-3.0.10-cp312-cp312-win_amd64.whl (2.8 MB)
(.venv) C:\Temp\lxml>pip install setuptools
Using cached setuptools-69.5.1-py3-none-any.whl (894 kB)
(.venv) C:\Temp\lxml>python setup.py build_ext -i --with-cython --static-deps
Building lxml version 5.2.1.
C:\Temp\lxml\setup.py:67: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
Latest version of libxml2 is 2.11.7
Latest version of libxslt is 1.1.39
Latest version of zlib is 1.2.12
Latest version of iconv is 1.15
Retrieving "https://github.com/lxml/libxml2-win-binaries/releases/download/2024.03.26/li..." to "libs\libxml2-2.11.7.win64.zip"
Unpacking libxml2-2.11.7.win64.zip into libs
Retrieving "https://github.com/lxml/libxml2-win-binaries/releases/download/2024.03.26/li..." to "libs\libxslt-1.1.39.win64.zip"
Unpacking libxslt-1.1.39.win64.zip into libs
Retrieving "https://github.com/lxml/libxml2-win-binaries/releases/download/2024.03.26/zl..." to "libs\zlib-1.2.12.win64.zip"
Unpacking zlib-1.2.12.win64.zip into libs
Retrieving "https://github.com/lxml/libxml2-win-binaries/releases/download/2024.03.26/ic..." to "libs\iconv-1.15.win64.zip"
Unpacking iconv-1.15.win64.zip into libs
Building with Cython 3.0.10.
Building against pre-built libxml2 andl libxslt libraries
Building against libxml2/libxslt in one of the following directories:
  libs\libxml2-2.11.7.win64\lib
  libs\libxslt-1.1.39.win64\lib
  libs\zlib-1.2.12.win64\lib
  libs\iconv-1.15.win64\lib
Compiling src\lxml\etree.pyx because it changed.
Compiling src\lxml\objectify.pyx because it changed.
Compiling src\lxml\builder.py because it changed.
Compiling src\lxml\_elementpath.py because it changed.
Compiling src\lxml\html\diff.py because it changed.
Compiling src\lxml\sax.py because it changed.
[1/6] Cythonizing src\lxml\_elementpath.py
[2/6] Cythonizing src\lxml\builder.py
[3/6] Cythonizing src\lxml\etree.pyx
warning: src\lxml\xmlerror.pxi:660:22: local variable 'args' referenced before assignment
warning: src\lxml\xmlerror.pxi:661:69: local variable 'args' referenced before assignment
warning: src\lxml\xmlerror.pxi:662:20: local variable 'args' referenced before assignment
warning: src\lxml\xmlerror.pxi:667:22: local variable 'args' referenced before assignment
warning: src\lxml\xmlerror.pxi:668:73: local variable 'args' referenced before assignment
warning: src\lxml\xmlerror.pxi:669:20: local variable 'args' referenced before assignment
warning: src\lxml\xmlerror.pxi:674:22: local variable 'args' referenced before assignment
warning: src\lxml\xmlerror.pxi:675:73: local variable 'args' referenced before assignment
warning: src\lxml\xmlerror.pxi:676:20: local variable 'args' referenced before assignment
[4/6] Cythonizing src\lxml\html\diff.py
[5/6] Cythonizing src\lxml\objectify.pyx
[6/6] Cythonizing src\lxml\sax.py
running build_ext
building 'lxml.etree' extension
creating build
creating build\temp.win32-cpython-312
creating build\temp.win32-cpython-312\Release
creating build\temp.win32-cpython-312\Release\src
creating build\temp.win32-cpython-312\Release\src\lxml
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x86\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DLIBXML_STATIC -DLIBXSLT_STATIC -DLIBEXSLT_STATIC -DCYTHON_CLINE_IN_TRACEBACK=0 -Isrc\lxml -Isrc\lxml\includes -Ilibs\libxml2-2.11.7.win64\include -Ilibs\libxslt-1.1.39.win64\include -Ilibs\zlib-1.2.12.win64\include -Ilibs\iconv-1.15.win64\include -Isrc -IC:\Temp\lxml\.venv\include -IC:\dev\Python\Python312\include -IC:\dev\Python\Python312\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" /Tcsrc\lxml\etree.c /Fobuild\temp.win32-cpython-312\Release\src\lxml\etree.obj -w
cl : Command line warning D9025 : overriding '/W3' with '/w'
etree.c
creating C:\Temp\lxml\build\lib.win32-cpython-312
creating C:\Temp\lxml\build\lib.win32-cpython-312\lxml
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x86\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:libs\libxml2-2.11.7.win64\lib /LIBPATH:libs\libxslt-1.1.39.win64\lib /LIBPATH:libs\zlib-1.2.12.win64\lib /LIBPATH:libs\iconv-1.15.win64\lib /LIBPATH:C:\Temp\lxml\.venv\libs /LIBPATH:C:\dev\Python\Python312\libs /LIBPATH:C:\dev\Python\Python312 /LIBPATH:C:\Temp\lxml\.venv\PCbuild\win32 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\lib\x86" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\lib\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\\lib\10.0.19041.0\\um\x86" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\lib\x86" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\lib\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x86" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\um\x86" libxslt_a.lib libexslt_a.lib libxml2_a.lib iconv_a.lib zlib.lib WS2_32.lib /EXPORT:PyInit_etree build\temp.win32-cpython-312\Release\src\lxml\etree.obj /OUT:build\lib.win32-cpython-312\lxml\etree.cp312-win_amd64.pyd /IMPLIB:build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.lib
Creating library build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.lib and object build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.exp
etree.obj : error LNK2001: unresolved external symbol _xmlStrchr
etree.obj : error LNK2001: unresolved external symbol _xmlIOParseDTD
etree.obj : error LNK2001: unresolved external symbol _xmlMemShow
etree.obj : error LNK2001: unresolved external symbol __imp__PyUnicode_AsEncodedString
[...]
etree.obj : error LNK2001: unresolved external symbol __imp__PyExc_ModuleNotFoundError
etree.obj : error LNK2001: unresolved external symbol __imp___PyObject_GetDictPtr
build\lib.win32-cpython-312\lxml\etree.cp312-win_amd64.pyd : fatal error LNK1120: 503 unresolved externals
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\bin\\HostX86\\x86\\link.exe' failed with exit code 1120

On 25-04-2024 at 16:58, Stefan Behnel wrote:

I now created a fork of lxml on GitHub[1]. I made my change to the CDATA class, and the CI action started running. 🙂 It had build failures that suggest the build matrix may not be entirely correct anymore. However, the Windows builds had no issues. I was happy to find that wheels had been created as build artifacts. I could test my change.

As to the actual change: I'm trying to write a conversion program that outputs XML[2]. It must match the output of an existing program. Semantically it already does, but I'd like it to match the way CDATA is handled. To this end, I'd like to allow "wrapped" CDATA. The CDATA class currently disallows this: it checks for the presence of ']]>', and raises if found. I added a parameter to turn off this check. I expected to need to do the escaping myself, but it seems lxml handles this just fine out of the box. For example, this tester code:

    from lxml import etree
    from lxml.etree import CDATA

    def main():
        root = etree.Element("dummy")
        txt = '<root><![CDATA[Something]]></root>'
        root.text = CDATA(txt, False)
        out = etree.tostring(root).decode()
        print(out)

    if __name__ == '__main__':
        main()

...prints this:

    <dummy><![CDATA[<root><![CDATA[Something]]]]><![CDATA[></root>]]></dummy>

This looks good to me (given my constraints, that is 🙂). So I wonder how to proceed. Would you be willing to change anything here? If so, would you prefer a flag to turn off the check, or just to remove the check (or something else)? If so, I'll try my hand at a pull request.

(I may later try and study how the CI does the build, and why it succeeds where my manual attempts failed, out of curiosity.)

Kind regards,
Gertjan.

[1] https://github.com/gertjanklein/lxml
[2] https://github.com/gertjanklein/iris-udl-to-xml
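The splitting seen in that output can be illustrated in a few lines of plain Python. This is only a sketch of the general technique (closing the section in the middle of each ']]>' and reopening a new one); the function name `split_cdata` is made up for illustration, and this is not a claim about how lxml or libxml2 implement it internally.

```python
def split_cdata(text):
    """Wrap text in CDATA, splitting at every ']]>' so that no single
    section contains the end marker. Illustrative helper only."""
    # End the current section after ']]' and start a new one before '>'.
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"


if __name__ == "__main__":
    print(split_cdata("<root><![CDATA[Something]]></root>"))
```

For the input used in the tester code, this produces the same two adjacent CDATA sections as the lxml output shown above.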

Hi,

Gertjan Klein wrote on 02.05.24 at 17:52:
The exception probably comes from a time where libxml2 didn't handle this itself.
Such a flag would need to be a keyword-only argument to make this readable. It's entirely unclear what the "False" refers to, unless you know the call signature by heart.
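As a minimal sketch of the readability point, here is what a keyword-only flag could look like. Both the function `make_cdata` and the parameter name `check_end_marker` are hypothetical stand-ins chosen for illustration; they are not lxml's API.

```python
def make_cdata(text, *, check_end_marker=True):
    """Hypothetical stand-in for a CDATA-like constructor.

    The bare '*' makes check_end_marker keyword-only, so callers
    must spell out what the flag means at the call site.
    """
    if check_end_marker and "]]>" in text:
        raise ValueError("']]>' not allowed inside CDATA")
    return text


# Readable: the intent of the flag is visible in the call.
make_cdata("a]]>b", check_end_marker=False)

# make_cdata("a]]>b", False) would raise TypeError, because the
# flag cannot be passed positionally.
```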
Looks good to me. According to the XML spec (both 1.0 and 1.1), "CDATA sections cannot nest":

https://www.w3.org/TR/REC-xml/#sec-cdata-sect

But splitting the CDATA section makes perfect sense. This does not even need an option, we can just remove the check and add a test for it. Do you want to propose a PR?

The Python "xml.etree.ElementTree" package can also parse this correctly, but escapes this on output since it doesn't support CDATA sections directly. Thus, it seems best to add the test in "test_etree.py" rather than "test_elementtree.py" since the behaviour of the two differs here.

Stefan
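The ElementTree behaviour described here can be checked with the standard library alone. A small sketch, using the serialized string from earlier in the thread as input:

```python
import xml.etree.ElementTree as ET

# The split-CDATA document produced by the patched lxml.
xml = "<dummy><![CDATA[<root><![CDATA[Something]]]]><![CDATA[></root>]]></dummy>"

root = ET.fromstring(xml)
# Adjacent CDATA sections are parsed into one text value.
print(root.text)

# ElementTree has no CDATA support on output, so the markup
# characters come back escaped as entities instead.
print(ET.tostring(root).decode())
```

The parsed text equals the original string passed to CDATA, while the re-serialized form uses `&lt;`/`&gt;` escapes rather than CDATA sections, which is why the expected output differs between the two libraries.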

On 03-05-2024 at 06:22, Stefan Behnel wrote:
Yes, I will. It will take me a little while to figure out how to run the tests, seeing as the lxml tree doesn't compile, but if need be I'll just test the test method elsewhere before copying over. CI will then pick up any mistakes I may have made.
Thanks, that saves me some searching. 🙂 Kind regards, Gertjan.

On 03-05-2024 at 20:59, Gertjan Klein wrote:

I ran the test out-of-tree, using the lxml build artifact, seeing as I haven't managed to get building working. I now created a pull request, even though CI failed, as the failures seem unrelated.

As to the CI failures: 4 "checks" were not successful. These are for macos-latest and Python 3.6 and 3.7, and are caused by these Python versions not being available. However, if I click on the summary, I see more errors under "Annotations". For example, in "ubuntu-latest, pypy-3.10, false, true", I see test failures:

https://github.com/gertjanklein/lxml/actions/runs/8949293511/job/24583430392...

FAILED (failures=29, errors=9)

Oddly, this CI run does get a green check mark. Anyway, I see you got the same errors on the last lxml CI run, so they were not introduced by my change. 🙂

The AppVeyor build failed on Python versions that are too old (2.7, 3.5).

Kind regards,
Gertjan.

On 25-04-2024 at 16:58, Stefan Behnel wrote:

I have now figured out what I did wrong. I have both VS 2019 and VS 2022 installed. If I don't do anything, setuptools selects the latter, and that doesn't match the compiler the external libs are compiled with (and, presumably, Python itself). So, I searched for a batch file to make setuptools use the correct compiler, and chose the wrong one, apparently.

Adding some debug statements to msvc.py taught me that I need to call this batch file before starting the build:

"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat"

(with an x86_amd64 parameter). After that, everything works as expected.

Kind regards,
Gertjan.
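One way to spot this kind of mismatch from a build log is to compare the architecture in the MSVC tool path with the platform tag of the extension being built. The sketch below does that for the paths quoted earlier in the thread. The helper names are made up for illustration, and reading the underscore-prefixed symbols as a sign of a 32-bit x86 toolchain is an interpretation of the log, not something stated in the thread.

```python
import re


def toolchain_target_arch(tool_path):
    """Return 'x86' or 'x64' from a path like ...\\bin\\HostX86\\x86\\link.exe,
    or None if the path does not follow that layout. Illustrative helper."""
    match = re.search(r"[\\/]Host(?:x86|x64)[\\/](x86|x64)[\\/]",
                      tool_path, re.IGNORECASE)
    return match.group(1).lower() if match else None


def extension_arch(filename):
    """Map the platform tag in an extension file name to 'x86' or 'x64'."""
    if "win_amd64" in filename:
        return "x64"
    if "win32" in filename:
        return "x86"
    return None


link_exe = (r"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community"
            r"\VC\Tools\MSVC\14.29.30133\bin\HostX86\x86\link.exe")
pyd_name = "etree.cp312-win_amd64.pyd"

if toolchain_target_arch(link_exe) != extension_arch(pyd_name):
    print("architecture mismatch:",
          toolchain_target_arch(link_exe), "toolchain vs",
          extension_arch(pyd_name), "extension")
```

With the failing log's paths, the linker targets x86 while the extension is tagged win_amd64, which is consistent with the libraries exporting the symbols without a leading underscore.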

Participants (2): Gertjan Klein, Stefan Behnel