Looking for a lib fun

Michael P. Reilly arcege at shore.net
Tue Jun 1 18:54:41 CEST 1999

Boris Borcic <zorro at zipzap.ch> wrote:
: There is a function I can't seem to find in the
: standard library. One that will turn XML/HTML
: offending characters to corresponding entity- or
: character references.

: Is it indeed in the lib ? Where ? If not, could a kind
: soul hand me one ? Writing it really feels like
: reinventing the wheel for the 10^nth time.


Python 1.5 has the htmlentitydefs module, which you can use:

import htmlentitydefs
charmap = {}
for key, val in htmlentitydefs.entitydefs.items():
  charmap[val] = key

At this point, charmap maps each character to its entity name, e.g.:
  charmap['>'] == 'gt'
  charmap[chr(160)] == 'nbsp'
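
To turn the offending characters in a string into entity references, the
map can be applied one character at a time. What follows is only a sketch
under a few assumptions: escape_entities is a made-up name, not a library
function, and the import uses the Python 3 spelling html.entities (under
Python 1.5 the module is htmlentitydefs, as above).

```python
# Assumption: Python 3, where htmlentitydefs is available as html.entities.
from html.entities import entitydefs

# Invert the name -> character table into character -> name.
charmap = {}
for key, val in entitydefs.items():
    charmap[val] = key


def escape_entities(text):
    """Hypothetical helper: replace each character that has a named
    entity with the reference &name; and leave all others alone."""
    out = []
    for ch in text:
        name = charmap.get(ch)
        if name is None:
            out.append(ch)
        else:
            out.append("&" + name + ";")
    return "".join(out)


print(escape_entities("a < b & c > d"))  # a &lt; b &amp; c &gt; d
```

Characters without a named entity pass through unchanged, so anything
outside the table would still need a numeric character reference.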

I don't know if anyone has a more "complete" set, though.

