[XML-SIG] PyExpat encoding (was: XML support in Python 1.6)
Greg Stein
gstein@lyra.org
Thu, 1 Jun 2000 12:56:28 -0700 (PDT)
On Thu, 1 Jun 2000, Fred L. Drake, Jr. wrote:
>...
> We also need to determine how Unicode should be supported; should
> the parser always produce Unicode strings, or UTF-8, and provide a
> wrapper that converts everything? Since it appears likely that
> auto-conversion between Unicode and narrow strings will likely only
> work for 7-bit narrow strings, it may be reasonable to create Unicode
> output directly from the parser (probably at the pyexpat level for
> efficiency).
Expat is typically compiled to spit out a particular encoding. By default,
this is UTF-8.
Presuming that the compilation flags are exposed and/or runtime-queryable,
then pyexpat can compensate accordingly. This implies that it would
sometimes return UTF-8 strings, or Unicode objects.
IMO, we should have a fixed output format, which is the Expat default:
UTF-8.
Cheers,
-g
--
Greg Stein, http://www.lyra.org/