individually updating unicodedata db?

Vlastimil Brom vlastimil.brom at gmail.com
Mon Mar 22 20:19:04 EDT 2010


Hi all,
I just tried to find some information about the unicodedata database
and the possibilities of updating it to the latest version of the
Unicode standard (currently 5.2, while Python supports 5.1 in the
latest versions).
An option to update this database individually might be useful, as the
Unicode standard updates seem to be more frequent than the official
Python releases (and not every release is updated to the latest
available Unicode db version either).
Am I right that this is not possible without recompiling Python from source?
I eventually found the promising file
...Python-src--2.6.5\Python-2.6.5\Tools\unicode\makeunicodedata.py
which required the following files from the Unicode database to be in
the same folder:
EastAsianWidth-3.2.0.txt
UnicodeData-3.2.0.txt
CompositionExclusions-3.2.0.txt
UnicodeData.txt
EastAsianWidth.txt
CompositionExclusions.txt

and also
Modules/unicodedata_db.h
Modules/unicodename_db.h
Objects/unicodetype_db.h
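
A minimal sketch for checking that the data files listed above are present before running the script. The helper name missing_inputs and the constant REQUIRED_DATA_FILES are made up for illustration and are not part of makeunicodedata.py; the file names are simply the ones listed in this message.

```python
import os

# Data files named in this message; other versions of the script
# may expect a different set (assumption).
REQUIRED_DATA_FILES = [
    "EastAsianWidth-3.2.0.txt",
    "UnicodeData-3.2.0.txt",
    "CompositionExclusions-3.2.0.txt",
    "UnicodeData.txt",
    "EastAsianWidth.txt",
    "CompositionExclusions.txt",
]


def missing_inputs(folder):
    """Return the listed data files that are not present in `folder`."""
    return [
        name
        for name in REQUIRED_DATA_FILES
        if not os.path.isfile(os.path.join(folder, name))
    ]


# Report what is still missing in the current working directory.
print(missing_inputs("."))
```

An empty list means all six data files were found in the folder.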

After a minor correction - adding the missing "import re" - the script
was able to run and recreate the above .h files.
I guess I am stuck here, as I use the precompiled version supplied in
the Windows installer and can't compile Python from source to obtain
the needed unicodedata.pyd.
Or are there any possibilities I missed to individually upgrade the
unicodedata database? (Using Python 2.6.5, Win XPh SP3)
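
For reference, the Unicode version a given build ships with can be read from unicodedata.unidata_version. Below is a minimal sketch of that check; the helper unicode_db_version is a name made up for illustration, and the comparison against 5.2 only reflects the version mentioned in this message.

```python
import unicodedata


def unicode_db_version():
    """Return the Unicode database version of this build as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


# Version string of the database compiled into the unicodedata module.
print(unicodedata.unidata_version)

# True if this build already carries Unicode 5.2 or newer.
print(unicode_db_version() >= (5, 2))

# The module also exposes the frozen 3.2.0 database, which matches
# the *-3.2.0.txt input files listed above.
print(unicodedata.ucd_3_2_0.unidata_version)
```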

Thanks in advance for any hints,
   vbr


