[xml] convert funny chars to char entities?

Wade Leftwich wade at lightlink.com
Mon Sep 16 12:17:58 EDT 2002


":B nerdy" <thoa0025 at mail.usyd.edu.au> wrote in message news:<vNah9.10101$Ee4.23987 at news-server.bigpond.net.au>...
> I've got a string "Il Est Né" and I want to put it into an XML document
> 
> Is there a quick and easy way to convert the funny characters to char
> entities so I can store them?
> Or are there existing libraries?
> 
> cheers

I've got a content syndication application that sends out XML, and
some of the recipients have a surprisingly hard time coping with
UTF-8. So I supply them with ASCII, after replacing every character
that needs more than 7 bits (code point above 127) with a numeric
character reference. This also displays properly in Internet Explorer.

>>> s = 'Il est n\xe9!'   # '\xe9' is e-acute in ISO-8859-1
>>> u = unicode(s, 'iso-8859-1')
>>> L = []
>>> for char in u:
...     val = ord(char)
...     if val > 127:
...         # non-ASCII: emit a decimal character reference
...         L.append('&#%d;' % val)
...     else:
...         L.append(char)
...
>>> u2 = ''.join(L)
>>> u2
u'Il est n&#233;!'
>>> s2 = u2.encode('ascii')
>>> s2
'Il est n&#233;!'
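
For what it's worth, later Python versions (2.3 onward) ship an
'xmlcharrefreplace' error handler for encode() that does the same
replacement as the loop above in one call. Here is a minimal sketch,
assuming Python 3, where str is already Unicode. The name to_ascii_xml
is a hypothetical helper, and the escape() call for &, < and > goes
beyond what the loop above does.

```python
from xml.sax.saxutils import escape


def to_ascii_xml(text):
    """Return text as ASCII-only XML character data (hypothetical helper)."""
    # Escape &, < and > first, so the '&' of the character references
    # produced in the next step is not escaped a second time.
    escaped = escape(text)
    # Replace every non-ASCII character with a decimal character reference.
    return escaped.encode('ascii', 'xmlcharrefreplace').decode('ascii')


print(to_ascii_xml('Il est n\xe9!'))
print(to_ascii_xml('Caf\xe9 & cr\xe8me'))
```

The result is plain ASCII, so it can go into a document declared with
any ASCII-compatible encoding.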


-- Wade Leftwich
Ithaca, NY


