[Python-ideas] Extend unicodedata with a name search

Fri Oct 3 23:10:59 CEST 2014

I noticed that the excellent Perl utility unum
<http://www.fourmilab.ch/webtools/unum/> uses an obsolete Unicode database.

Since I’m a Pythonista, I recalled hearing about the stdlib unicodedata
module, which I wanted to use either to rewrite unum or to extend its database.

Unfortunately, unicodedata is very limited. Partly rightfully so, since you
can convert codepoints and chars with chr() and ord(), and str.upper() and
friends are Unicode-aware.
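
For reference, a short sketch of what already works today: the built-in conversions mentioned above, plus the two name-related functions unicodedata offers, name() and lookup(). The character chosen here is only an illustration.

```python
import unicodedata

# Code point <-> character conversion needs no extra module.
code_point = ord("A")            # 0x41
clock_char = chr(0x1F550)        # the character at U+1F550

# Case mapping on str is Unicode-aware.
shouted = "straße".upper()       # "STRASSE"

# unicodedata maps between a character and its exact, full name.
clock_name = unicodedata.name(clock_char)
same_char = unicodedata.lookup("CLOCK FACE ONE OCLOCK")

# A partial name is not accepted: lookup() raises KeyError.
try:
    unicodedata.lookup("CLOCK")
    partial_ok = True
except KeyError:
    partial_ok = False
```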

But the name database is only queryable using full names! I want to do
unicodedata.search('clock') and get a list of dozens of glyphs with names
like CLOCKWISE RIGHTWARDS AND LEFTWARDS OPEN CIRCLE ARROWS
and CLOCK FACE THREE-THIRTY.

Maybe this should spit out a list of (name, char) tuples? Or a {name: char}
dict?
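
To make the idea concrete, here is a minimal sketch of how such a search could be emulated with the existing API. The function name search() is hypothetical (it does not exist in unicodedata), and the brute-force scan over every code point is just the simplest approach I could think of, not a proposal for how it should be implemented.

```python
import sys
import unicodedata

def search(term):
    """Hypothetical helper: (name, char) tuples whose name contains term."""
    term = term.upper()  # character names are stored in upper case
    results = []
    for cp in range(sys.maxunicode + 1):
        char = chr(cp)
        # name() returns the default (None here) for unnamed code points.
        name = unicodedata.name(char, None)
        if name is not None and term in name:
            results.append((name, char))
    return results

# Variant 1: a list of (name, char) tuples, in code point order.
matches = search("clock")

# Variant 2: a {name: char} dict, built from the same tuples.
by_name = dict(matches)
```

Since the list converts to a dict with a single dict() call, returning tuples seems the more flexible choice: it keeps code point order and leaves the dict to callers who want one.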

What do you think?