[XML-SIG] Re: checking a string for well-formedness
"Martin v. Löwis"
martin@v.loewis.de
Fri, 09 May 2003 14:57:40 +0200
Paul Tremblay wrote:
> I must be dense when it comes to unicode. So Python converts unicode
> to a 7-bit (ASCII) string?
In some cases, yes. If you pass a Unicode string to an API function that
requires a byte string, such as file.write, Python converts it to a byte
string using the system default encoding, which is ASCII.
The resulting strings are still 8-bit strings (i.e. byte strings), since
your computer cannot represent 7-bit quantities. However, for each byte,
the MSB will be 0.
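
As a rough illustration of the point about the MSB, here is a small sketch. It assumes a current Python 3 interpreter, where the conversion has to be requested explicitly with encode() rather than happening implicitly as described above; the variable names are only for illustration.

```python
# Sketch only: assumes Python 3, where str -> bytes conversion is explicit.
ascii_bytes = "cafe".encode("ascii")

# Each element of a bytes object is a full 8-bit value, but for ASCII
# data the most significant bit (0x80) is never set.
all_msb_clear = all(b & 0x80 == 0 for b in ascii_bytes)

# A character outside the ASCII range cannot be converted this way.
try:
    "caf\u00e9".encode("ascii")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True
```

With this input, the first conversion succeeds and the second raises UnicodeEncodeError, which is the usual symptom of the ASCII default being applied to non-ASCII text.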
> The first time the string is tested, it comes out as valid. But every
> single instance afterwards comes out all ill-formed XML.
The parser maintains internal state, to remember where inside the
document it is. When parse completes, the state says "at the end of the
document". It is an error to provide more markup at this point.
You either need to throw away the parser object and create a new one, or
reset the parser object that you already have.
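
A minimal sketch of the first option (a new parser per check) might look like the following. It assumes the expat bindings in xml.parsers.expat, since the thread does not show which parser is in use, and is_well_formed is a hypothetical helper name, not something from the original code.

```python
import xml.parsers.expat


def is_well_formed(data):
    """Return True if data parses as one complete XML document."""
    # A fresh parser per call, so no state is left over from earlier input.
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of the document.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# For contrast: feeding a second document to a parser that has already
# finished one is reported as an error.
reused = xml.parsers.expat.ParserCreate()
reused.Parse("<a/>", True)
try:
    reused.Parse("<a/>", True)
    reuse_failed = False
except xml.parsers.expat.ExpatError:
    reuse_failed = True
```

Creating a parser per string costs a little more than reusing one, but it avoids having to reason about leftover state at all.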
Regards,
Martin