[Python-Dev] unicodedata module is out of date
A.J.Miller at bcs.org.uk
Fri Sep 6 17:55:22 CEST 2013
I've just checked on Python 2.7.5 and Python 3.3.2 (Win32 versions).
In Python 3.3.2 unicodedata.unidata_version is set to '6.1.0'.
In Python 2.7.5 it is set to '5.2.0' so it looks as though this version is
no longer being updated.
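
For anyone who wants to repeat the check, here is a minimal sketch. It uses only `unicodedata.unidata_version` and `unicodedata.lookup`; the helper names `safe_lookup` and `ucd_version_tuple` are made up for illustration and are not part of the module.

```python
import unicodedata


def ucd_version_tuple():
    """Return the bundled UCD version as a tuple of ints, e.g. (6, 1, 0)."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def safe_lookup(name):
    """Return the character with this Unicode name, or None if the
    bundled database does not know the name (lookup raises KeyError)."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None


# Report which UCD version this interpreter was built with.
print(unicodedata.unidata_version)

# TURKISH LIRA SIGN was added in Unicode 6.2, so this prints None on
# builds whose database is older than that.
print(safe_lookup("TURKISH LIRA SIGN"))
```

The output depends on the interpreter, which is the point: the same script reports a different database version on 2.7 and on 3.3.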
Since my initial post I've downloaded the Python 2.7.5 source and have
found the makeunicodedata.py script which creates this module.
Are there plans to add the extra data from the other UCD files to this
module? At the moment I am using a module from
https://gist.github.com/anonymous/2204527 to obtain the script of a
character, but it would be nice if this were available from the standard
library.
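
To show the kind of lookup I mean, here is a rough sketch that parses data in the format of the UCD file Scripts.txt and returns the script of a character. It is not the code from the gist above and not anything `unicodedata` offers; `parse_scripts`, `script_of`, and the short `SAMPLE_SCRIPTS_TXT` excerpt are assumptions for illustration, and a real version would read the complete Scripts.txt file.

```python
import bisect

# A few lines in the Scripts.txt format: "START..END ; Script # comment"
# or "CODEPOINT ; Script # comment". This is only an excerpt.
SAMPLE_SCRIPTS_TXT = """\
# Illustrative excerpt, not the complete file
0030..0039    ; Common # Nd  [10] DIGIT ZERO..DIGIT NINE
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0061..007A    ; Latin # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR
0400..0481    ; Cyrillic # L& [130] CYRILLIC CAPITAL LETTER IE WITH GRAVE..CYRILLIC SMALL LETTER KOPPA
"""


def parse_scripts(text):
    """Parse Scripts.txt-style text into a sorted list of (start, end, script)."""
    ranges = []
    for line in text.splitlines():
        # Drop trailing comments and skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        codepoints, script = [field.strip() for field in line.split(";")]
        first, _, last = codepoints.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # single code point line
        ranges.append((start, end, script))
    ranges.sort()
    return ranges


def script_of(char, ranges, default="Unknown"):
    """Return the script of a single character, or `default` if the
    character falls outside every parsed range."""
    cp = ord(char)
    starts = [r[0] for r in ranges]
    # Find the last range that starts at or before cp.
    i = bisect.bisect_right(starts, cp) - 1
    if i >= 0 and ranges[i][0] <= cp <= ranges[i][1]:
        return ranges[i][2]
    return default


ranges = parse_scripts(SAMPLE_SCRIPTS_TXT)
print(script_of("A", ranges))
print(script_of("5", ranges))
```

With only the excerpt loaded, any character outside the listed ranges comes back as "Unknown", so the full file is needed for real use.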
On 6 September 2013 16:38, MRAB <python at mrabarnett.plus.com> wrote:
> On 06/09/2013 10:54, Andrew Miller wrote:
>> The unicodedata module only contains data up to Unicode 5.2 (October
>> 2009), so attempting to reference any character from a later version, e.g.:
>> unicodedata.lookup("TURKISH LIRA SIGN")
>> results in a KeyError.
>> Also, it seems to be limited to properties in the UnicodeData.txt file
>> and does not contain any data from the other files from the Unicode
>> Character Database (the Perl library Unicode::UCD is far more complete).
>> Are there any plans to update this module to the latest Unicode version
>> (6.2, with 6.3 being released shortly), or is there another module that
>> provides more up to date information?
>
> Which version of Python are you talking about? Python 3.3 uses Unicode
> version 6.1.