[Pythonmac-SIG] TECManager
Jack Jansen
Jack.Jansen at cwi.nl
Fri Sep 26 07:19:36 EDT 2003
On Friday, September 26, 2003, at 11:42 AM, Bob Ippolito wrote:
> Currently TECManager only does decoding (not encoding) of MacSanskrit.
> The reason I didn't write the encoding routines is because the API
> for it is far more complex and I didn't need the stuff :)
We'll wait for a native Sanskrit speaker who's also a Python programmer
to add those:-)
> The issue with adding it to Python's unicode support is that you don't
> get to pass a lot of context when you say str.decode('macroman'),
> where you may want to say TECManager.ConvertToUnicode(str,
> script=smRoman, language=langDutch, region=verNetherlands) .. there's
> also a richer set of unicode fallbacks than the Python version. Of
> course, that said, just putting it in place of what is already there
> would be better than what's there now, but we'll have to make
> decisions as to what to call the scripts (do we use 'smRoman' or
> 'macroman' or both.. rinse and repeat for the other 36 script codes).
I looked specifically at this at the time I looked at TEC, and I got the
impression that there's a mapping between the Apple
script/language/region tuple and the unicode name. I haven't tried
these, but I would expect them to return something like "MacRoman" or
"roman" or so, which we could convert to "macroman" or "mac_roman" to
register as the codec. As long as we have *any* bidirectional mapping
between script/language/region tuples and strings that are acceptable to
unicode.encode() we should be fine.
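
To make the idea concrete, here is a minimal sketch of such a
bidirectional mapping. It is an illustration only: the table is
hand-written rather than obtained from TEC, the tuple members are plain
strings standing in for the Apple constants, and the helper names
codec_for() and tuple_for() are hypothetical. The codec names are the
ones Python's standard library already ships.

```python
import codecs

# Hypothetical table: (script, language, region) -> Python codec name.
# The strings are symbolic stand-ins for the Apple constants, not values
# read from TEC; None means "any language" or "any region".
_TUPLE_TO_CODEC = {
    ("smRoman", None, None): "mac_roman",
    ("smCyrillic", None, None): "mac_cyrillic",
    ("smGreek", None, None): "mac_greek",
    ("smRoman", "langIcelandic", None): "mac_iceland",
    ("smRoman", "langTurkish", None): "mac_turkish",
}

# Reverse direction: codec name -> the tuple that registered it.
_CODEC_TO_TUPLE = {name: key for key, name in _TUPLE_TO_CODEC.items()}


def codec_for(script, language=None, region=None):
    """Return a codec name, trying the most specific tuple first."""
    candidates = (
        (script, language, region),
        (script, language, None),
        (script, None, None),
    )
    for key in candidates:
        name = _TUPLE_TO_CODEC.get(key)
        if name is not None:
            return name
    raise LookupError("no codec for %r" % ((script, language, region),))


def tuple_for(codec_name):
    """Return the (script, language, region) tuple for a codec name or alias."""
    # codecs.lookup() resolves aliases such as 'macroman'; normalise the
    # reported name to the underscore spelling used in the table above.
    canonical = codecs.lookup(codec_name).name.replace("-", "_").lower()
    return _CODEC_TO_TUPLE[canonical]
```

With this, codec_for("smRoman", "langDutch", "verNetherlands") falls
back to "mac_roman", and tuple_for("macroman") gives back the generic
Roman tuple. Note the round trip is only exact for tuples that appear in
the table; more specific tuples collapse to their fallback.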
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman