convert unicode characters to visibly similar ascii characters
M.-A. Lemburg
mal at egenix.com
Wed Jul 2 04:39:13 EDT 2008
On 2008-07-01 20:31, Peter Bulychev wrote:
> Hello.
>
> I want to convert unicode characters into ascii ones.
> The method ".encode('ASCII')" can convert only those unicode characters
> which fit into the 0..127 range.
>
> But there are still lots of characters beyond this range, which can be
> manually converted to some visibly similar ascii characters. For instance,
> there are several quotation marks in unicode, which can be converted into
> ascii quotation mark.
>
> Can this conversion be performed in automatic manner? After googling I've
> only found that there exists Unicode database, which stores human-readable
> information on notation of all unicode characters (
> ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt). And there also exists
> the Python adapter for this database (
> http://docs.python.org/lib/module-unicodedata.html). Using this database I
> can do something like `if notation.find('QUOTATION')!=-1:\n\treturn "'"`. I
> believe there is more elegant way. Am I right?

You could write a codec which translates Unicode into ASCII
lookalike characters, but AFAIK there is no standard for doing
this.

I guess the best choice is to use the Unicode code point names
as a basis. These can be accessed via unicodedata.name(). You can
then create a mapping which can be processed by the character
map codec.
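
A rough sketch of the name-based approach, written for Python 3. The
rules in NAME_RULES and the helper names build_lookalike_table() and
to_ascii_lookalike() are made up for illustration and are not part of
any standard. For brevity it applies the mapping with str.translate()
rather than the character map codec; the mapping itself is built the
same way either way. Characters without a matching rule fall back to
"?".

```python
import unicodedata

# Hypothetical rules: fragment of the Unicode character name -> ASCII stand-in.
# Order matters: the first matching fragment wins.
NAME_RULES = [
    ("DOUBLE QUOTATION MARK", '"'),
    ("QUOTATION MARK", "'"),
    ("DASH", "-"),
    ("HYPHEN", "-"),
]


def build_lookalike_table(chars):
    """Map each non-ASCII character in `chars` to an ASCII lookalike, by name."""
    table = {}
    for ch in set(chars):
        if ord(ch) < 128:
            continue  # already ASCII, nothing to map
        name = unicodedata.name(ch, "")  # "" if the character has no name
        for fragment, replacement in NAME_RULES:
            if fragment in name:
                table[ord(ch)] = replacement
                break
    return table


def to_ascii_lookalike(text):
    """Replace known lookalikes, then turn anything still non-ASCII into '?'."""
    table = build_lookalike_table(text)
    translated = text.translate(table)
    return translated.encode("ascii", "replace").decode("ascii")


if __name__ == "__main__":
    # Curly quotes, an en dash and a curly apostrophe, written as escapes.
    print(to_ascii_lookalike("\u201cHi\u201d \u2013 it\u2019s"))
```

The rule list is the part that needs real thought: matching on name
fragments is crude, and you will probably want to review the resulting
table by hand before relying on it.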
--
Marc-Andre Lemburg
eGenix.com