Missing unicode data?
fredrik at pythonware.com
Sat Jun 3 11:16:39 CEST 2006
Klaus Alexander Seistrup wrote:
> When checking unicodedata.name() against each uchar in the file
> /usr/share/unidata/UnicodeData-4.0.1d1b.txt that came with the
> console-data package on my Ubuntu Linux installation a total of
> 1226 unicode characters seems to be missing from the unicodedata
> module (2477 missing characters when checking against the latest
> database from unicode.org¹). Is this a deliberate omission?
I'm pretty sure unicodedata.name() doesn't look in the UnicodeData file
on your machine, nor in the latest file from unicode.org. in other
words, you get whatever version was used to create the Unicode data
set in your Python distribution.
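to see which Unicode version your interpreter was built with, you can
ask the module itself. a minimal sketch, assuming a Python where
unicodedata.unidata_version is available and written in Python 3 syntax
(the helper name bundled_unicode_version is made up for illustration):

```python
import unicodedata


def bundled_unicode_version():
    """Return the Unicode database version compiled into this interpreter."""
    # unidata_version is a plain string such as "x.y.z"; it describes the
    # data shipped with Python, not any UnicodeData file on disk.
    return unicodedata.unidata_version


print(bundled_unicode_version())
```

if the version printed is older than the UnicodeData file you compare
against, characters added in later versions will look "missing".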
this is usually the version that was current when that Python version
was originally released (i.e. in your case, when 2.4 was released).
iirc, 2.4 uses Unicode 3.2, and 2.5 uses Unicode 4.1. to update, use
the tools under Tools/unicode in the Python source tree to regenerate
the data set.
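for reference, the kind of comparison described in the question can be
sketched roughly as below. this is only an illustration, not the
original poster's script: it assumes the usual UnicodeData layout
(semicolon-separated fields, code point first, name second), Python 3
syntax, and a made-up helper name missing_names. entries whose name is
in angle brackets (<control>, range markers such as "<..., First>") are
skipped, since they are not real character names.

```python
import unicodedata


def missing_names(lines):
    """Return (code, name) pairs from UnicodeData-style lines that the
    bundled unicodedata module has no name for."""
    missing = []
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 2:
            continue  # blank or malformed line
        code, name = fields[0], fields[1]
        if name.startswith("<"):
            continue  # <control> and range markers carry no real name
        char = chr(int(code, 16))
        # name() returns the default (None) when the character is unknown
        # to the Unicode version compiled into this Python.
        if unicodedata.name(char, None) is None:
            missing.append((code, name))
    return missing


# two sample lines in UnicodeData format
sample = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0000;<control>;Cc;0;BN;;;;;N;NULL;;;;",
]
print(missing_names(sample))  # prints []
```

to run it on a real file, pass the open file object instead of the
sample list. a non-empty result then just means the file is newer than
the data set your Python was built with.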