convert unicode characters to visibly similar ascii characters
tjreedy at udel.edu
Tue Jul 1 21:46:45 CEST 2008
Peter Bulychev wrote:
> I want to convert Unicode characters into ASCII ones.
> The method ".encode('ASCII')" can convert only those Unicode
> characters that fit into the 0..127 range.
> But there are still lots of characters beyond this range that can be
> manually converted to visually similar ASCII characters. For
> instance, there are several quotation marks in Unicode that can be
> converted into the ASCII quotation mark.
> Can this conversion be performed automatically? After googling,
> I've only found that there is a Unicode database, which stores
> human-readable information on the names of all Unicode characters
> (ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt), and that there
> is a Python interface to this database
> (http://docs.python.org/lib/module-unicodedata.html). Using this
> database I can do something like `if
> notation.find('QUOTATION')!=-1:\n\treturn "'"`. I believe there is a more
> elegant way. Am I right?
I believe you will have to make up your own translation dictionary for
the translations *you* want. You should then be able to use that with
the .translate() method.
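
As a rough illustration of that suggestion, here is a minimal sketch of a
hand-made translation dictionary passed to .translate(). It assumes Python 3,
where str.maketrans() builds the table; the particular characters and
replacements chosen, and the names ASCII_LOOKALIKES, TABLE and
to_ascii_lookalike, are made up for this example. The '?' fallback for
unmapped characters is also just one possible choice.

```python
# Hypothetical mapping: which characters to include, and what each one
# should become, is entirely up to you.
ASCII_LOOKALIKES = {
    "\u2018": "'",    # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",    # RIGHT SINGLE QUOTATION MARK
    "\u201c": '"',    # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',    # RIGHT DOUBLE QUOTATION MARK
    "\u2013": "-",    # EN DASH
    "\u2014": "-",    # EM DASH
    "\u2026": "...",  # HORIZONTAL ELLIPSIS
}

# str.maketrans() converts the one-character keys into the code-point
# keys that str.translate() expects.
TABLE = str.maketrans(ASCII_LOOKALIKES)


def to_ascii_lookalike(text):
    """Apply the mapping, then replace anything still non-ASCII with '?'."""
    translated = text.translate(TABLE)
    return translated.encode("ascii", "replace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_lookalike("\u201cIt\u2019s done\u201d \u2013 ok\u2026"))
```

Characters that are not in the dictionary pass through .translate()
unchanged, so the final encode step decides what happens to them; use
"ignore" instead of "replace" if you would rather drop them.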