accent letters in xml
Alessio Pace
puccio_13 at yahoo.it
Fri May 2 05:42:28 EDT 2003
Hi, I wrote a config file for an application in XML format.
Some xml.dom.minidom.Text nodes must contain accented letters (they are real
words), so I write them literally: à, è and so on.
When I parse the XML file, Python (2.3a2) skips the accented character (which,
when it occurs in a word, is always at the end).
For example:
<word>perchè</word>
Python gives me the string: u'perch'
The XML file is declared to be in UTF-8, and this is how I collect the Text
nodes' data (suggestions for modifications are welcome):
set = sets.Set([])  # set of collected words
wordsList = xmldoc.getElementsByTagName('word')
for element in wordsList:  # for each word element
    children = element.childNodes  # the children
    for child in children:
        if isinstance(child, minidom.Text):
            set.add(child.data)
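For anyone who wants to reproduce this, the loop above can be turned into a self-contained sketch. It is only an illustration under some assumptions: it targets a current Python 3 (so the built-in set stands in for sets.Set), the sample document and the collect_words helper are hypothetical names made up for this example, and the input is passed as bytes that really are UTF-8 encoded, matching the XML declaration.

```python
from xml.dom import minidom

# Hypothetical sample document (not the real config file). The bytes are
# explicitly encoded as UTF-8, matching the encoding in the XML declaration.
XML_BYTES = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<config><word>perchè</word><word>città</word></config>'
).encode("utf-8")


def collect_words(xml_bytes):
    """Return the set of text values found in all <word> elements."""
    xmldoc = minidom.parseString(xml_bytes)
    words = set()
    for element in xmldoc.getElementsByTagName("word"):
        for child in element.childNodes:
            # Same check as in the original loop: keep only Text nodes.
            if isinstance(child, minidom.Text):
                words.add(child.data)
    return words


words = collect_words(XML_BYTES)
```

With input that really is UTF-8, the accented letters should come through intact, so this can serve as a baseline to compare against the behaviour seen with the real file.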
Thanks.
--
bye
Alessio Pace