jholg@gmx.de wrote:
2.6.20 through 2.6.29. But what about the iconv version? Is there any difference between the systems that were tested so far? "iconv --version" says 2.5 for me. I assume it's about the same for Tres (who's also on Ubuntu). What about the others?
libxml2 built without iconv here (Sparc Solaris).
I first thought your comment wasn't relevant, as Sparc uses a different encoding already. But then I looked back into the libxml2 code and found that iconv is not used for detecting the encoding, only for decoding later on if libxml2 itself doesn't support the encoding. So iconv isn't the real problem here; rather, libxml2 fails to detect the encoding on some platforms.

What we use here is the function xmlDetectCharEncoding() in encoding.c, which (AFAICT) checks for a BOM. Maybe these platforms do not have a BOM in their unicode strings...

Here is a patch that will print out the internal representation of a unicode string when importing etree. Could someone with a Windows or MacOS machine please try this and send me the results?

Stefan
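
To illustrate the BOM hypothesis, here is a rough Python sketch of BOM-based detection. It is not libxml2's actual code, and it is not the patch mentioned above; detect_bom() and the _BOMS table are hypothetical names made up for this example. It only shows that a UTF-16 byte string with an explicit byte order carries no BOM, so a check that relies on the BOM alone would find nothing to go on.

```python
import codecs

# Hypothetical lookup table, for illustration only.
# Longer BOMs come first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def detect_bom(data):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


text = u"<a/>"
with_bom = text.encode("utf-16")        # generic codec: prepends a BOM in native byte order
without_bom = text.encode("utf-16-le")  # explicit byte order: no BOM is written

print(repr(with_bom), detect_bom(with_bom))
print(repr(without_bom), detect_bom(without_bom))
```

The first line printed depends on the byte order of the machine running it; the second should report None on every platform, since no BOM is present in that byte string.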