On 2007-11-09 14:10, Walter Dörwald wrote:
Martin v. Löwis wrote:
Yes, an XML parser should be able to use UTF-8, UTF-16, UTF-32, etc codecs to do the encoding. There's no need to create a magical mystery codec to pick out which though.

So the code is good, if it is inside an XML parser, and it's bad if it is inside a codec?

Exactly so. This functionality just *isn't* a codec - there is no encoding. Instead, it is an algorithm for *detecting* an encoding.
And what do you do once you've detected the encoding? You decode the input, so why not combine both into an XML decoder?
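To make the "detect, then decode" combination concrete, here is a minimal Python sketch. It is not the code from the patch under discussion. The names `detect_xml_encoding` and `decode_xml` are hypothetical, and the sketch assumes the input either starts with a byte order mark or carries an ASCII-compatible XML declaration. Detection of UTF-16/UTF-32 input without a BOM is left out.

```python
import codecs
import re

# BOMs are checked with the UTF-32 ones first, because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Matches encoding="..." inside a leading XML declaration (ASCII-compatible input only).
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def detect_xml_encoding(data):
    """Return (encoding name, number of leading BOM bytes to skip)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii"), 0
    # Assumption: with no BOM and no declared encoding, treat the input as UTF-8.
    return "utf-8", 0


def decode_xml(data):
    """Hypothetical combined step: detect the encoding, then decode the bytes."""
    encoding, skip = detect_xml_encoding(data)
    return data[skip:].decode(encoding)
```

With helpers like these, a caller would pass raw bytes to `decode_xml` and hand the resulting Unicode string on for parsing, without naming the encoding itself.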
FWIW: I'm +1 on adding such a codec. It makes working with XML data a lot easier: you simply don't have to bother with the encoding of the XML data anymore and can just let the codec figure out the details. The XML parser can then work directly on the Unicode data.

Whether it needs to be in C or not is another question (I would have done this in Python, since performance is not really an issue), but since the code is already written, why not use it?

--
Marc-Andre Lemburg
eGenix.com