[Python-Dev] getting rid of ucnhash

Tim Peters tim.one@home.com
Tue, 23 Jan 2001 23:50:49 -0500


[/F]
> It's probably just me, but the names of the two unicode
> modules tend to irritate me:

I don't care much about the names, but having two Unicode subprojects in the
MS build seems overkill <wink>.

> ls u*.pyd
> ucnhash.pyd      unicodedata.pyd
>
> (the former contains names, the latter data)

Maybe that's the reason:  the names don't get loaded at all unless you *use*
one of the name APIs?  Hard to say whether that's worth the bother; now that
everything has been nicely compressed, it's sure not as compelling as it may
have been earlier.

> I've been meaning to rename the former, but I just realized
> that it might be better to get rid of it completely, and move
> its functionality into the unicodedata module.
>
> The result is a single 200k unicodedata module, which
> contains the name database as well as two new functions:
>
>     name(character [, default]) => map unicode
>     character to name.  if the name doesn't exist,
>     return the default object, or raise ValueError.
>
>     lookup(name) => unicode character
>     (or raise KeyError if it doesn't exist)
>
> Should I check it in now, change the names/semantics and check
> it in, or post it to sourceforge?

I have no opinion on what's best:  you're working with it, you're the best
judge of that.  I only vote for checking in whatever you decide sooner
rather than later; I'll fiddle the MS project files and readmes accordingly
ASAP after that.
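
For readers following along, here is a minimal sketch of how the two functions quoted above behave. It uses the `unicodedata` module as shipped in current Python 3, which exposes `name` and `lookup`. Whether that matches the patch under discussion in every detail is an assumption. The helper `safe_lookup` and the `"<unnamed>"` placeholder are illustrative only and do not come from the thread.

```python
import unicodedata

# name(character[, default]): map a character to its Unicode name.
letter_name = unicodedata.name("A")

# Control characters such as NUL have no name in the database,
# so the default object is returned instead of raising.
fallback = unicodedata.name("\x00", "<unnamed>")

# Without a default, a missing name raises ValueError.
try:
    unicodedata.name("\x00")
    name_raised = False
except ValueError:
    name_raised = True

# lookup(name): map a Unicode name back to the character.
letter = unicodedata.lookup("LATIN SMALL LETTER A")


def safe_lookup(name, default=None):
    """Hypothetical helper: lookup() that returns a default on KeyError."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return default


missing = safe_lookup("NO SUCH CHARACTER NAME")
```

Note the asymmetry in the proposed semantics: `name` takes an optional default and otherwise raises ValueError, while `lookup` has no default and raises KeyError, hence the small wrapper.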