Converting foreign characters to HTML character entities

Paul Boddie paul at
Tue May 22 06:18:23 EDT 2001

Lutz.Schroeer at (Lutz Schroeer) wrote in message news:<Xns90A8B4310B8E7Latzikatz at>...
> My program reads strings out of a database and converts them to an HTML
> page. The string contains German umlauts which I would like to convert to
> HTML character entities.
> Is there any simple method to do this without explicitly writing a function 
> by myself? Are there any standard modules which I didn't find or any third 
> party objects?

In the Python Library Reference [1] there seem to be a few modules
which might help, such as 'htmlentitydefs' and possibly 'cgi'. Webware
[2] might include some functions in its 'WebUtils' package. Then,
there's always the PyXML package [3] which is included in Python 2.1
as far as I can tell.
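
As a rough sketch of the 'htmlentitydefs' route: in present-day Python 3
that module lives on as 'html.entities', and its codepoint2name table maps
character codes to entity names. The function name to_named_entities is
made up for illustration, and the sketch assumes the database strings have
already been decoded to Unicode text.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace non-ASCII characters with named entities where one is defined.

    Hypothetical helper, not part of any library. Characters without a
    known entity name are passed through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code > 127 and code in codepoint2name:
            # e.g. 252 (u-umlaut) maps to the name 'uuml'
            out.append("&%s;" % codepoint2name[code])
        else:
            out.append(ch)
    return "".join(out)


print(to_named_entities("Grüße"))  # Gr&uuml;&szlig;e
```

This leaves '&', '<' and '>' alone; if the text may contain markup
characters, html.escape() (the successor of cgi.escape()) can be applied
first.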

Of course, the function wouldn't be hard to write. I think it is
permitted to use numeric entities based on the ISO 8859-1 character
values in HTML, although you would need to check with the applicable
specifications [4]. Thus, your function would produce entities of the
form &#ddd; where ddd is the character's value in decimal digits.
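
A minimal sketch of such a function, written for present-day Python 3 and
assuming the input is already Unicode text (for ISO 8859-1 characters the
Unicode code point equals the ISO 8859-1 value). The name
to_numeric_entities is hypothetical.

```python
def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal &#ddd; reference.

    Hypothetical helper for illustration; ASCII characters, including
    '&' and '<', are left untouched.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            out.append("&#%d;" % code)
        else:
            out.append(ch)
    return "".join(out)


print(to_numeric_entities("Müller"))  # M&#252;ller
```

Python's built-in 'xmlcharrefreplace' error handler gives the same result
without a hand-written loop:
"Müller".encode("ascii", "xmlcharrefreplace").decode("ascii").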



