Convert from unicode chars to HTML entities
steve at REMOVEME.cybersource.com.au
Mon Jan 29 06:13:05 CET 2007
On Sun, 28 Jan 2007 23:41:19 -0500, Leif K-Brooks wrote:
> >>> s = u"© and many more..."
> >>> s.encode('ascii', 'xmlcharrefreplace')
> '&#169; and many more...'
Wow. That's short and to the point. I like it.
A few issues:
(1) It doesn't seem to be reversible:
>>> '&#169; and many more...'.decode('latin-1')
u'&#169; and many more...'
What should I do instead?
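One possible way to go back, sketched here as an illustration rather than a definitive answer: I am not aware of a codec that undoes 'xmlcharrefreplace', so this scans for numeric character references with a regular expression. The name decode_charrefs is hypothetical, and the code assumes Python 3 (on Python 2 you would use unichr in place of chr).

```python
import re

# Matches decimal (&#169;) and hexadecimal (&#xA9;) numeric character references.
_CHARREF = re.compile(r'&#([xX][0-9a-fA-F]+|[0-9]+);')

def decode_charrefs(text):
    """Replace numeric character references with the characters they name."""
    def _replace(match):
        ref = match.group(1)
        if ref[0] in 'xX':
            return chr(int(ref[1:], 16))  # hexadecimal form
        return chr(int(ref))              # decimal form
    return _CHARREF.sub(_replace, text)

original = "\u00a9 and many more..."
# In Python 3, encode() returns bytes, so decode back to str first.
encoded = original.encode('ascii', 'xmlcharrefreplace').decode('ascii')
restored = decode_charrefs(encoded)
```

Note that this is not a strict inverse: if the original text already contained the literal characters "&#169;", they would be turned into a copyright sign as well.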
(2) Are XML entities guaranteed to be the same as HTML entities?
(3) Is there a way to find out at runtime what encoders/decoders/error
handlers are available, and what they do?
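For what it's worth, here is a rough sketch of what can be checked at runtime. As far as I know there is no single call that lists every registered codec or error handler, so this looks names up one at a time with codecs.lookup and codecs.lookup_error, and lists the modules in the standard encodings package as an approximation. The helper names are hypothetical, and not every listed module is guaranteed to be usable on every platform.

```python
import codecs
import encodings
import pkgutil

def encoding_module_names():
    """List module names in the standard encodings package (an approximation
    of the available codecs; 'aliases' is a lookup table, not a codec)."""
    return sorted(name for _, name, _ in pkgutil.iter_modules(encodings.__path__)
                  if name != 'aliases')

def has_codec(name):
    """Return True if codecs.lookup() can find a codec under this name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False

def has_error_handler(name):
    """Return True if an error handler is registered under this name."""
    try:
        codecs.lookup_error(name)
        return True
    except LookupError:
        return False

# Probe the standard handler names; these names are documented, not discovered.
handlers = [n for n in ('strict', 'ignore', 'replace',
                        'xmlcharrefreplace', 'backslashreplace')
            if has_error_handler(n)]
```

This only confirms that a name is registered; what each one does still has to come from the documentation.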