Missing unicode data?

Fredrik Lundh fredrik at pythonware.com
Sat Jun 3 11:16:39 CEST 2006

Klaus Alexander Seistrup wrote:

> When checking unicodedata.name() against each uchar in the file 
> /usr/share/unidata/UnicodeData-4.0.1d1b.txt that came with the 
> console-data package on my Ubuntu Linux installation a total of 
> 1226 unicode characters seems to be missing from the unicodedata 
> module (2477 missing characters when checking against the latest 
> database from unicode.org¹).  Is this a deliberate omission?

I'm pretty sure unicodedata.name() doesn't look in the UnicodeData file 
on your machine, nor in the latest file from unicode.org.  in other 
words, you get whatever version was used to create the Unicode data 
set in your Python distribution.

this is usually the version that was current when that Python version 
was originally released (i.e. in your case, when 2.4 was released).
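
to check which version your interpreter was built with, the module exposes 
it as unicodedata.unidata_version.  a minimal sketch, written for a current 
Python 3 (on 2.4 you would use the print statement and unichr instead); 
has_name is just an illustrative helper, not part of the module:

```python
import unicodedata

# Version of the Unicode database compiled into this interpreter.
print(unicodedata.unidata_version)

def has_name(ch):
    """Return True if the bundled database has a name for ch.

    Illustrative helper; passing a default to name() avoids the
    ValueError raised for characters without a name.
    """
    return unicodedata.name(ch, None) is not None

print(has_name("A"))
```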

iirc, 2.4 uses Unicode 3.2, and 2.5 uses Unicode 4.1.  to update, use 
the tools under Tools/unicode in the Python source distribution and 
rebuild.
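
to see what a newer data file would add, you can compare it against the 
bundled database yourself.  a rough sketch for a current Python 3, assuming 
the usual semicolon-separated UnicodeData layout (code point in the first 
field, name in the second); missing_names and the sample lines are made up 
for illustration, and in practice you would pass in the open file:

```python
import unicodedata

def missing_names(lines):
    """Return (code point, name) pairs listed in the data lines
    but unnamed in the interpreter's bundled Unicode database."""
    missing = []
    for line in lines:
        fields = line.strip().split(";")
        # Skip blank lines and bracketed entries such as <control>
        # or range markers, which name() never reports.
        if len(fields) < 2 or fields[1].startswith("<"):
            continue
        cp = int(fields[0], 16)
        if unicodedata.name(chr(cp), None) is None:
            missing.append((cp, fields[1]))
    return missing

# Made-up sample: the second entry is a noncharacter with an invented
# name, so it is always reported as missing.
sample = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "FFFF;MADE-UP NAME FOR ILLUSTRATION;Cn;0;BN;;;;;N;;;;;",
]
for cp, name in missing_names(sample):
    print("U+%04X %s" % (cp, name))
```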

