[Pythonmac-SIG] TECManager
Bob Ippolito
bob at redivi.com
Fri Sep 26 05:42:28 EDT 2003
On Friday, Sep 26, 2003, at 05:22 America/New_York, Jack Jansen wrote:
>
> On Friday, September 26, 2003, at 02:19 AM, Sarwat Khan wrote:
>
>> On Thursday, September 25, 2003, at 06:32 PM, Jack Jansen wrote:
>>> 1. Why do you use TECManager instead of Python's builtin unicode
>>> support?
>>
>> I haven't used Bob's TECManager but I'll second its usefulness.
>> Apple's text encoding support is far more functional than Python's,
>> including support for just about any script system in use on Windows
>> and on the Mac. If you go to Safari's View > Text Encoding menu,
>> you'll see a list of popular text encodings used on the web. Several
>> of them aren't available in Python, such as Traditional Chinese.
>
> Ok, point taken. Then I think what we (this is the "we" as in
> "someone, probably not me":-) should do is add TECManager support
> to Python unicode support. This would mean that on the Mac the
> unicode converters will use TECManager (unless explicitly instructed
> otherwise) and on other platforms it will use the standard method,
> which may be less complete, but at least works.
>
> This has two advantages over using TECManager explicitly:
> 1. Code developed on other platforms and brought over to the Mac will
> automatically understand the MacSanskrit and other esoteric
> character sets
> 2. Code written on the Mac is more portable to other platforms, and
> much more readable to Python programmers without Mac-knowledge.
>
> I've looked at plugging Apple's Text Encoding stuff into the Python
> unicode codec architecture in the past and it looked doable.
Currently TECManager only does decoding (not encoding) of MacSanskrit.
The reason I didn't write the encoding routines is that the API for
it is far more complex and I didn't need the stuff :)
The issue with adding it to Python's unicode support is that you don't
get to pass a lot of context when you say str.decode('macroman'), where
you may want to say TECManager.ConvertToUnicode(str, script=smRoman,
language=langDutch, region=verNetherlands) .. there's also a richer set
of unicode fallbacks than the Python version. Of course, that said,
just putting it in place of what is already there would be better than
what's there now, but we'll have to make decisions as to what to call
the scripts (do we use 'smRoman' or 'macroman' or both.. rinse and
repeat for the other 36 script codes).
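
The naming question could be handled by a codec search function that
accepts both spellings. What follows is only a minimal sketch, assuming
a current Python 3 and its standard codecs module. The alias table and
the name tec_search are hypothetical, and the sketch reuses the built-in
mac_roman codec instead of calling the Text Encoding Converter:

```python
import codecs

# Hypothetical alias table: Script Manager constant names mapped to codec
# names that Python already ships. Only smRoman is listed; the remaining
# script codes are left out rather than guessed.
SCRIPT_ALIASES = {
    "smroman": "mac_roman",
}


def tec_search(name):
    """Codec search function: resolve a script-code alias to a built-in codec."""
    target = SCRIPT_ALIASES.get(name.lower())
    if target is None:
        return None  # not ours; let the other registered search functions try
    return codecs.lookup(target)


codecs.register(tec_search)

# Both spellings now resolve to the same codec.
via_alias = b"caf\x8e".decode("smRoman")
via_builtin = b"caf\x8e".decode("macroman")
```

This covers only the naming. It does nothing about the extra context
(script, language, region), because str.decode has nowhere to carry it.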
-bob