Convert from unicode chars to HTML entities

Steven D'Aprano steve at
Mon Jan 29 06:13:05 CET 2007

On Sun, 28 Jan 2007 23:41:19 -0500, Leif K-Brooks wrote:

>  >>> s = u"© and many more..."
>  >>> s.encode('ascii', 'xmlcharrefreplace')
> '&#169; and many more...'

Wow. That's short and to the point. I like it.

A few issues:

(1) It doesn't seem to be reversible:

>>> '&#169; and many more...'.decode('latin-1')
u'&#169; and many more...'

What should I do instead?
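
One possible way to go back, sketched here rather than offered as the definitive answer: decoding with 'latin-1' only maps bytes to characters, so a reference such as &#169; has to be expanded in a separate step. The helper name unescape_numeric_refs is made up for illustration, and the sketch assumes Python 3 (str/bytes) rather than the Python 2 session quoted above.

```python
import re

# Matches decimal (&#169;) and hexadecimal (&#xA9;) character references.
_NUMERIC_REF = re.compile(r"&#([0-9]+|[xX][0-9a-fA-F]+);")

def unescape_numeric_refs(text):
    """Expand numeric character references into the characters they name.

    Hypothetical helper, not part of any library. Named entities such as
    &copy; are deliberately left untouched.
    """
    def _replace(match):
        body = match.group(1)
        if body[0] in "xX":
            return chr(int(body[1:], 16))  # hexadecimal form
        return chr(int(body))              # decimal form
    return _NUMERIC_REF.sub(_replace, text)

# Round trip: encode with xmlcharrefreplace, then undo it.
encoded = "\u00a9 and many more...".encode("ascii", "xmlcharrefreplace")
decoded = unescape_numeric_refs(encoded.decode("ascii"))
```

On Python 3.4 and later, html.unescape() from the standard library does the same expansion and also handles named entities, so it may be the simpler choice where it is available.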

(2) Are XML entities guaranteed to be the same as HTML entities?
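
For what it is worth, 'xmlcharrefreplace' emits numeric character references (&#169;), and those name a code point in both XML and HTML. The two differ in their named entities: XML predefines only five (amp, lt, gt, quot, apos), while HTML defines many more, such as &copy;. A rough sketch of producing HTML named entities follows, assuming Python 3, where the table lives in html.entities (it was htmlentitydefs in Python 2). The function name to_named_entities is hypothetical.

```python
from html.entities import codepoint2name

def to_named_entities(text):
    """Replace non-ASCII characters with HTML named entities where one exists.

    Hypothetical helper for illustration. Characters without a named
    entity fall back to a decimal character reference. ASCII characters,
    including &, < and >, are passed through unescaped.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&%s;" % codepoint2name[cp])
        else:
            out.append("&#%d;" % cp)
    return "".join(out)

named = to_named_entities("\u00a9 and many more...")
```

The result is only meaningful to an HTML parser; an XML parser without the HTML DTD would reject &copy;, which is one reason to prefer the numeric form when the output has to work in both.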

(3) Is there a way to find out at runtime what encoders/decoders/error
handlers are available, and what they do? 
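
A partial sketch of what can be checked at runtime, assuming Python 3. codecs.lookup() and codecs.lookup_error() both raise LookupError for unknown names, so they can be used as probes. As far as I know there is no public call that lists every registered error handler, so the list of handler names below is simply the documented standard ones, and encodings.aliases only covers codecs shipped with the standard library. The helper names are made up for illustration.

```python
import codecs
import encodings.aliases

def codec_name(encoding):
    """Return the canonical codec name, or None if no such codec exists."""
    try:
        return codecs.lookup(encoding).name
    except LookupError:
        return None

def has_error_handler(name):
    """Report whether an error handler is registered under this name."""
    try:
        codecs.lookup_error(name)
    except LookupError:
        return False
    return True

# Canonical module names of the codecs the standard library has aliases for.
known_codecs = sorted(set(encodings.aliases.aliases.values()))

# Probe the documented standard handler names (an assumed, incomplete list).
candidates = ["strict", "ignore", "replace",
              "xmlcharrefreplace", "backslashreplace"]
available_handlers = [n for n in candidates if has_error_handler(n)]
```

What each handler does is not introspectable beyond its docstring; the codecs module documentation remains the place to look for that.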


Steven D'Aprano 
