Py 2.5: Bug in sgmllib

Fredrik Lundh fredrik at
Sun Oct 22 13:47:22 CEST 2006

Michael Butscher wrote:

> if I execute the following two lines in Python 2.5 (to feed in a 
> *unicode* string):
> import sgmllib
> sgmllib.SGMLParser().feed(u'<a title="te&#223;t"></a>')

source documents are encoded byte streams, not decoded Unicode 
sequences.  I suggest reading up on how Python's Unicode string
type works, and what a Unicode string represents.  it's not the same
thing as a byte string.
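
as a rough illustration of the distinction (a sketch only: the helper
name to_byte_stream and the utf-8 default are assumptions made for this
example, not part of sgmllib), a Unicode string can be encoded into a
byte string before it is handed to the parser:

```python
# Sketch only: to_byte_stream is a hypothetical helper, not an sgmllib API.

def to_byte_stream(markup, encoding='utf-8'):
    """Return markup as an encoded byte string."""
    if isinstance(markup, bytes):
        return markup              # already an encoded byte stream
    return markup.encode(encoding)  # Unicode sequence -> bytes

source = u'<a title="te&#223;t"></a>'   # decoded Unicode sequence
data = to_byte_stream(source)            # encoded byte stream

# Under Python 2.x the bytes would then be fed to the parser:
#     import sgmllib
#     sgmllib.SGMLParser().feed(data)
```

the encoding to use depends on the document; utf-8 is only a
placeholder here.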

