Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py, 1.4, 1.5

On Sat, 2004-01-17 at 09:29, perky@users.sourceforge.net wrote:
Update of /cvsroot/python/python/dist/src/Lib/email/test In directory sc8-pr-cvs1:/tmp/cvs-serv14239/Lib/email/test
Modified Files: test_email_codecs.py Log Message: Add CJK codecs support as discussed on python-dev. (SF #873597)
Should we backport this to 2.3? I still intend to backport 852347 when the branch is unfrozen <wink>. -Barry

On Sat, Jan 17, 2004, Barry Warsaw wrote:
On Sat, 2004-01-17 at 09:29, perky@users.sourceforge.net wrote:
Modified Files: test_email_codecs.py Log Message: Add CJK codecs support as discussed on python-dev. (SF #873597)
Should we backport this to 2.3? I still intend to backport 852347 when the branch is unfrozen <wink>.
-1 to both. Unless, that is, you can say with complete certainty that the codecs will behave the same as in 2.3.3. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay?

Barry Warsaw wrote:
Should we backport this to 2.3?
I think the maintainer of 2.3 has voiced a clear "no to new features" policy for 2.3. This would certainly be a new feature. Regards, Martin

"Martin v. Löwis" wrote:
I think the maintainer of 2.3 has voiced a clear "no to new features" policy for 2.3. This would certainly be a new feature.
If it can be shown that this is a pure win (that is, it improves code for 2.3.4, but is both backwards and forwards compatible), then I don't mind. I _do_ mind if we end up with the horror of 2.2.2's not-really-booleans, which has resulted in FAR FAR too much code of the form:
    try:
        True, False
    except:
        True, False = 1, 0
I do not wish to see something like this again. Anthony -- Anthony Baxter <anthony@interlink.com.au> It's never too late to have a happy childhood.

[Martin v. Löwis]
I think the maintainer of 2.3 has voiced a clear "no to new features" policy for 2.3. This would certainly be a new feature.
[Anthony Baxter]
If it can be shown that this is a pure win (that is, it improves code for 2.3.4, but is both backwards and forwards compatible), then I don't mind. I _do_ mind if we end up with the horror of 2.2.2's not-really-booleans, which has resulted in FAR FAR too much code of the form:
    try:
        True, False
    except:
        True, False = 1, 0
I do not wish to see something like this again.
+1 on backporting it. The CJK codecs are essentially invisible to most users. However, their availability will increase Python's acceptance in places where it is not currently being used. Increasing the user base should not have to wait for Py2.4. Raymond Hettinger

On Sun, Jan 18, 2004, Raymond Hettinger wrote:
The CJK codecs are essentially invisible to most users. However, their availability will increase Python's acceptance in places where it is not currently being used. Increasing the user base should not have to wait for Py2.4.
That's not the question. IIUC, there are already some kind of CJK codecs available; this is a different set of codecs. Unless these new codecs produce the same results, we can't afford to switch. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay?

On Sun, 2004-01-18 at 10:15, Aahz wrote:
That's not the question. IIUC, there are already some kind of CJK codecs available; this is a different set of codecs. Unless these new codecs produce the same results, we can't afford to switch.
We're not switching codecs since 2.3 doesn't ship with either JapaneseCodecs or CJKCodecs. No one's proposing that 2.3 start doing that now. What we're doing is removing a hardcoded dependency on exactly the JapaneseCodecs, which have to be installed separately anyway. With this change, an end-user could install either JapaneseCodecs or CJKCodecs and the rest of the code would/should work without modification. -Barry
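
The idea of depending on a codec name rather than on one specific package can be sketched roughly as below. This is only an illustration, not the actual email package code: the helper names `codec_available` and `pick_codec` are hypothetical, and the only real API used is `codecs.lookup`, which raises `LookupError` when no registered codec matches.

```python
import codecs


def codec_available(name):
    """Return True if any installed package provides a codec called `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def pick_codec(candidates):
    """Return the first usable codec name from `candidates`, or None."""
    for name in candidates:
        if codec_available(name):
            return name
    return None


# The caller never imports `japanese` or any other codec package directly;
# whichever package registered the name is the one that gets used.
print(pick_codec(["euc-jp", "shift_jis", "iso-2022-jp"]))
```

Code written this way does not care which package supplied the codec, which is the property Barry describes. It says nothing about whether two packages produce identical bytes for the same name.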

On Sun, Jan 18, 2004, Barry Warsaw wrote:
On Sun, 2004-01-18 at 10:15, Aahz wrote:
That's not the question. IIUC, there are already some kind of CJK codecs available; this is a different set of codecs. Unless these new codecs produce the same results, we can't afford to switch.
We're not switching codecs since 2.3 doesn't ship with either JapaneseCodecs or CJKCodecs. No one's proposing that 2.3 start doing that now.
What we're doing is removing a hardcoded dependency on exactly the JapaneseCodecs, which have to be installed separately anyway. With this change, an end-user could install either JapaneseCodecs or CJKCodecs and the rest of the code would/should work without modification.
Just to make completely sure I understand this: if someone has an existing installation with JapaneseCodecs and they upgrade with these code changes, will there be any changes in the way they access codecs or in the results they get? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ A: No. Q: Is top-posting okay?

Anthony Baxter wrote:
If it can be shown that this is a pure win (that is, it improves code for 2.3.4, but is both backwards and forwards compatible), then I don't mind. I _do_ mind if we end up with the horror of 2.2.2's not-really-booleans, which has resulted in FAR FAR too much code of the form:
    try:
        True, False
    except:
        True, False = 1, 0
It depends on the usage. If applications just do
    foo.encode("some-cjk-encoding")
then all will work fine. However, it may be that applications do
    try:
        codecs.lookup("some-cjk-encoding")
    except LookupError:
        try:
            import japanese
            japanese.register_aliases()
        except ImportError:
            pass
Regards, Martin

On Sun, Jan 18, 2004 at 03:56:51PM +0100, "Martin v. Löwis" wrote:
Anthony Baxter wrote:
If it can be shown that this is a pure win (that is, it improves code for 2.3.4, but is both backwards and forwards compatible), then I don't mind. I _do_ mind if we end up with the horror of 2.2.2's not-really-booleans, which has resulted in FAR FAR too much code of the form:
    try:
        True, False
    except:
        True, False = 1, 0
It depends on the usage. If applications just do
foo.encode("some-cjk-encoding")
then all will work fine. However, it may be that applications do
    try:
        codecs.lookup("some-cjk-encoding")
    except LookupError:
        try:
            import japanese
            japanese.register_aliases()
        except ImportError:
            pass
This usage is useful only in -S mode. Recent Asian codec packages register their aliases through their respective .pth files. So, backporting the CJK codecs won't make Japanese developers write code this way. Anyone who needs the real JapaneseCodecs rather than the CJK codecs can still get it by installing the JapaneseCodecs package, as they have done until now. Hye-Shik

Hye-Shik Chang wrote:
    try:
        codecs.lookup("some-cjk-encoding")
    except LookupError:
        try:
            import japanese
            japanese.register_aliases()
        except ImportError:
            pass
This usage is useful only in -S mode. Recent Asian codec packages register their aliases through their respective .pth files. So, backporting the CJK codecs won't make Japanese developers write code this way.
Of course, users may have installed JapaneseCodecs with --without-aliases, in which case the aliases won't be there (neither automatically, nor after explicitly importing japanese). I believe Mailman uses it that way. So people who currently do
    encodings.aliases.aliases.update({"shift_jis": "japanese.shift_jis"})
would have to adjust their code. I also find that installing JapaneseCodecs on top of a CJK-enabled Python 2.4 causes shift_jis to use the CJK codec, not the japanese codec. Regards, Martin

On Sun, Jan 18, 2004 at 10:24:12PM +0100, "Martin v. Löwis" wrote:
I also find that installing JapaneseCodecs on top of a CJK-enabled Python 2.4 causes shift_jis to use the CJK codec, not the japanese codec.
Aah. We need to change the codec lookup order in encodings.search_function so that a third-party codec can override the default one. With the attached patch, I could get the JapaneseCodecs one. Now, I admit that the CJK codecs are obviously un-backportable to 2.3 due to this change. ;-) Hye-Shik
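
The effect of the lookup order can be illustrated with a toy model. This is not the real encodings.search_function and not the attached patch (which is not reproduced in this thread): `resolve`, `importable`, and the `aliases_first` flag are hypothetical names, and "importable" merely stands in for "a codec module of that name can be imported".

```python
def resolve(name, aliases, importable, aliases_first):
    """Return the module a toy search function would load for `name`.

    Simplified model: try the alias target and the module named after the
    encoding, in an order chosen by `aliases_first`.
    """
    direct = name
    aliased = aliases.get(name)
    order = [aliased, direct] if aliases_first else [direct, aliased]
    for module in order:
        if module is not None and module in importable:
            return module
    return None


# Hypothetical setup: a bundled shift_jis codec, plus JapaneseCodecs
# installed and registered through an alias entry.
importable = {"shift_jis", "japanese.shift_jis"}
aliases = {"shift_jis": "japanese.shift_jis"}

# Module first: the bundled codec wins and the alias is never consulted.
print(resolve("shift_jis", aliases, importable, aliases_first=False))
# Aliases first: the third-party codec overrides the bundled one.
print(resolve("shift_jis", aliases, importable, aliases_first=True))
```

Under these assumptions the first call yields the bundled module and the second yields the JapaneseCodecs one, which matches the behaviour reported before and after the patch.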

Hye-Shik Chang wrote:
On Sun, Jan 18, 2004 at 10:24:12PM +0100, "Martin v. Löwis" wrote:
I also find that installing JapaneseCodecs on top of a CJK-enabled Python 2.4 causes shift_jis to use the CJK codec, not the japanese codec.
Aah. We need to change the codec lookup order in encodings.search_function so that a third-party codec can override the default one. With the attached patch, I could get the JapaneseCodecs one.
Now, I admit that CJK codecs are obviously un-backportable to 2.3 due to this change. ;-)
I think it's a good idea, to have the aliases dictionary checked before trying to do the import. I'll fix that. Backporting the fix should also be possible, since it is not related to any particular codec package. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 18 2004)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

M.-A. Lemburg wrote:
I think it's a good idea, to have the aliases dictionary checked before trying to do the import.
I'll fix that. Backporting the fix should also be possible, since it is not related to any particular codec package.
No. Instead, it applies uniformly to all codecs - so in some cases, users may see changes in the behaviour. Regards, Martin

Martin v. Löwis wrote:
M.-A. Lemburg wrote:
I think it's a good idea, to have the aliases dictionary checked before trying to do the import.
I'll fix that. Backporting the fix should also be possible, since it is not related to any particular codec package.
No. Instead, it applies uniformly to all codecs - so in some cases, users may see changes in the behaviour.
Could you give an example ? The only change that I can think of is that setting up aliases to override builtin codecs would start working and that's more like a bug fix than a new feature. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 19 2004)

M.-A. Lemburg wrote:
No. Instead, it applies uniformly to all codecs - so in some cases, users may see changes in the behaviour.
Could you give an example ?
The only change that I can think of is that setting up aliases to override builtin codecs would start working and that's more like a bug fix than a new feature.
Currently, if somebody defines a codec windows_1252.py, foo.encode("windows_1252") uses this codec. With the proposed change, it uses cp1252.py. To see the behaviour change, no change to the aliases list is necessary. Regards, Martin
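
The alias entry behind this example can be inspected directly. The snippet below only reports what the interpreter it runs on does; it assumes a current CPython, where the standard alias table maps windows_1252 to cp1252, and it makes no claim about how Python 2.3 resolved the name.

```python
import codecs
from encodings.aliases import aliases

# The standard alias table maps the normalized name to the cp1252 module.
print(aliases.get("windows_1252"))

# codecs.lookup reports which codec actually answered for the name.
info = codecs.lookup("windows_1252")
print(info.name)
```

If a user-supplied windows_1252 module were meant to answer for that name, `info.name` is the place where a change in lookup order would become visible, without anyone having edited the alias table.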

Martin v. Löwis wrote:
M.-A. Lemburg wrote:
No. Instead, it applies uniformly to all codecs - so in some cases, users may see changes in the behaviour.
Could you give an example ?
The only change that I can think of is that setting up aliases to override builtin codecs would start working and that's more like a bug fix than a new feature.
Currently, if somebody defines a codec windows_1252.py, foo.encode("windows_1252") uses this codec. With the proposed change, it uses cp1252.py. To see the behaviour change, no change to the aliases list is necessary.
Thanks for the example. I agree that we should not backport the change then. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 20 2004)

On Sun, 2004-01-18 at 17:01, Hye-Shik Chang wrote:
Now, I admit that CJK codecs are obviously un-backportable to 2.3 due to this change. ;-)
Hmm, sounds like trying to make those changes to the email package for 2.3 is doomed, or at least not worth it if CJK won't work well with 2.3. -Barry

On Sun, 2004-01-18 at 16:24, "Martin v. Löwis" wrote:
Of course, users may have installed JapaneseCodecs with --without-aliases, in which case the aliases won't be there (neither automatically, nor after explicitly importing japanese). I believe Mailman uses it that way. So people who currently do
Actually Mailman 2.1 does this:
    import japanese
    import korean
    import korean.aliases
It also does not install the codec packages --without-aliases. -Barry

On Sun, 2004-01-18 at 07:26, Anthony Baxter wrote:
If it can be shown that this is a pure win (that is, it improves code for 2.3.4, but is both backwards and forwards compatible), then I don't mind.
Understood. I'll do the best I can to test this, but it would help if someone less monolingual than I can help. Here's the plan: I'm going to run the tests w/o the patch, but with JapaneseCodecs installed. Then I'll apply the patch and re-run the test. Then I'll install/enable CJKCodecs and re-run the test. I'd expect everything to pass, but I'm not sure that's a complete enough test. In addition, I plan on running Mailman 2.1.x with Python 2.3.x under similar environments as above. I'd expect to see the same pretty pictures <wink> in the web interface under both scenarios. I'll try to cobble together some Japanese charset emails and send them through, but since I can't read Japanese, I'm really shooting blind. -Barry
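
One way to make part of this plan repeatable without reading Japanese is a round-trip check run in each environment. This is a sketch under assumptions, not the email package's test suite: `round_trip_report` is a hypothetical helper, and the charset names and sample string are illustrative choices. A round trip only shows that a codec is self-consistent; it does not prove that two codec packages produce the same bytes.

```python
import codecs


def round_trip_report(sample, charsets):
    """Map each charset to True/False (round trip ok?) or None (not installed)."""
    report = {}
    for charset in charsets:
        try:
            codecs.lookup(charset)
        except LookupError:
            report[charset] = None  # no package provides this codec here
            continue
        try:
            report[charset] = sample.encode(charset).decode(charset) == sample
        except UnicodeError:
            report[charset] = False  # sample not representable or decode failed
    return report


# "Nihongo" (the word for the Japanese language), written with escapes
sample = "\u65e5\u672c\u8a9e"
for name, ok in round_trip_report(sample, ["euc-jp", "shift_jis", "iso-2022-jp"]).items():
    print(name, ok)
```

Saving the encoded bytes from each environment and comparing them across runs would be the stronger check for the "same results" question raised earlier in the thread.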
participants (7)
- "Martin v. Löwis"
- Aahz
- Anthony Baxter
- Barry Warsaw
- Hye-Shik Chang
- M.-A. Lemburg
- Raymond Hettinger