[Python-ideas] Extend unicodedata with a name search

Fri Oct 3 23:15:29 CEST 2014

On 03.10.2014 23:10, Philipp A. wrote:
> I noticed that the excellent Perl utility unum
> <http://www.fourmilab.ch/webtools/unum/> uses an obsolete Unicode database.
> 
> Since I’m a Pythonista, I recalled the stdlib unicodedata module, which I
> wanted to use either to rewrite unum or to extend its database.
> 
> Unfortunately, unicodedata is very limited. Partly rightfully so, since you
> can convert between code points and characters with chr() and ord(), and
> str.upper() and friends are Unicode-aware.
> 
> But the name database can only be queried by full name! I want to call
> unicodedata.search('clock') and get a list of dozens of glyphs with names
> like CLOCKWISE RIGHTWARDS AND LEFTWARDS OPEN CIRCLE ARROWS
> and CLOCK FACE THREE-THIRTY.
> 
> Maybe this should return a list of (name, char) tuples, or a {name: char}
> dict?
> 
> What do you think?

You should be able to code this as a PyPI package. I don't think
it's a use case that warrants making the unicodedata module more
complex.
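
As a rough illustration of how little such a package would need, here is
a minimal sketch built only on the existing unicodedata.name() lookup.
The search() function below is hypothetical (it is not part of
unicodedata), and the brute-force scan over every code point is just an
assumption about how one might start; a real package would probably
build an index once instead.

```python
import sys
import unicodedata


def search(term):
    """Return a list of (name, char) tuples whose name contains `term`.

    Hypothetical helper, not part of the stdlib unicodedata module.
    Matching is a case-insensitive substring test on the character name.
    """
    needle = term.upper()
    matches = []
    # Brute force: visit every code point the interpreter supports.
    for codepoint in range(sys.maxunicode + 1):
        char = chr(codepoint)
        # name() returns the given default ("") for unnamed code points
        # instead of raising ValueError.
        name = unicodedata.name(char, "")
        if name and needle in name:
            matches.append((name, char))
    return matches


if __name__ == "__main__":
    for name, char in search("clock"):
        print(f"U+{ord(char):04X} {name}")
```

Returning (name, char) tuples keeps the results in code point order;
dict(search('clock')) gives the {name: char} form if that is preferred.
Which characters are found depends on the Unicode database version
shipped with the running Python.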

-- 
Marc-Andre Lemburg
eGenix.com
