Module renaming and pickle mechanisms

I'd like to bring a potential problem to attention that is caused by the recent module renaming approach:
Object serialization protocols like e.g. pickle usually store the complete module path to the object class together with the object. They access this module path by looking at the __module__ attribute of the object classes.
With the renaming, all objects which use classes from the renamed modules will now refer to the renamed modules in their serialized form, e.g. queue.Queue instead of Queue.Queue (just to name one example).
While this is nice for forward compatibility, it causes rather serious problems for making object serialization backwards compatible, since the older Python versions can no longer unserialize objects due to missing modules.
This can happen in client-server setups where e.g. the server uses Python 2.6 and the clients some other Python version (e.g. Python 2.5).
It can also happen in storage setups where Python objects are stored using e.g. pickle, ZODB being a prominent example. As soon as a Python 2.6 application starts writing to such storages, Python 2.5 and lower versions will no longer be able to read back all the data.
Now, I think there's a way to solve this puzzle:
Instead of renaming the modules (e.g. Queue -> queue), we leave the code in the existing modules and packages and instead add the new module names and package structure with pointers and redirects to the existing 2.5 modules.
Code can (and probably should) still be changed to try to import the new module name. In cases where backwards compatibility is needed, this can also be done using
    try:
        import newname
    except ImportError:
        import oldname
Later on, when porting applications to 3.0, the 2to3 script can then apply the final renaming in the source code.
Example:
    queue.py:
    ---------
    import sys, Queue
    sys.modules[__name__] = Queue
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 17 2008)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611
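The aliasing trick from the queue.py example above can be tried in isolation. What follows is only a minimal sketch for a current Python 3 interpreter, not the proposed stdlib change: the module names legacy_queue and renamed_queue and the class Box are made up for illustration, and the "old" module is built in memory instead of being loaded from a file.

```python
import importlib
import pickle
import sys
import types

# Hypothetical "old" module, built in memory so the sketch is self-contained.
legacy = types.ModuleType("legacy_queue")
legacy.Box = type("Box", (), {"__module__": "legacy_queue"})
sys.modules["legacy_queue"] = legacy

# The alias: the new name points at the very same module object.
sys.modules["renamed_queue"] = legacy

renamed = importlib.import_module("renamed_queue")
same_object = renamed is legacy

# Classes are pickled by reference, so the stream records Box.__module__.
stream = pickle.dumps(renamed.Box)
names_old_module = b"legacy_queue" in stream
names_new_module = b"renamed_queue" in stream
```

Because the alias does not touch __module__, the stream keeps naming the old module even when the class is reached through the new name, which is the property the proposal relies on.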

On Sat, May 17, 2008 at 5:05 AM, M.-A. Lemburg <mal@egenix.com> wrote:
I'd like to bring a potential problem to attention that is caused by the recent module renaming approach:
Object serialization protocols like e.g. pickle usually store the complete module path to the object class together with the object.
Thanks for bringing this up. I was aware of the problem myself, but I hadn't yet worked out a good solution to it.
It can also happen in storage setups where Python objects are stored using e.g. pickle, ZODB being a prominent example. As soon as a Python 2.6 application starts writing to such storages, Python 2.5 and lower versions will no longer be able to read back all the data.
The opposite problem exists for Python 3.0, too. Pickle streams written by Python 2.x applications will not be readable by Python 3.0. And, one solution to this is to use Python 2.6 to regenerate pickle stream. Another solution would be to write a 2to3 pickle converter using the pickletools module. It is surely not the most elegant or robust solution, but I could work.
Now, I think there's a way to solve this puzzle:
Instead of renaming the modules (e.g. Queue -> queue), we leave the code in the existing modules and packages and instead add the new module names and package structure with pointers and redirects to the existing 2.5 modules.
This would certainly work for simple modules, but what about packages? For packages, you can't use the ``sys.modules[__name__] = Queue`` to preserve module identity. Therefore, pickle will use the new package name when writing its streams. So, we are back to the same problem again. A possible solution could be writing a compatibility layer for the Pickler class, which would map new module names to their old at runtime. Again, this is neither an elegant, nor robust, solution, but it should work in most cases. -- Alexandre

Errata: On Sat, May 17, 2008 at 10:59 AM, Alexandre Vassalotti <alexandre@peadrop.com> wrote:
And, one solution to this is to use Python 2.6 to regenerate pickle stream.
... to regenerate *the* pickle *streams*.
It is surely not the most elegant or robust solution, but I could work.
... but *it* could work.
This would certainly work for simple modules, but what about packages? For packages, you can't use the ``sys.modules[__name__] = Queue`` to preserve module identity.
... you can't use the ``sys.modules[__name__] = Queue`` *trick* to preserve module identity.
A possible solution could be writing a compatibility layer for the
... could be *to write* a compatibility layer... I guess I should start proofreading my emails before sending them, not after... -- Alexandre

On Sat, May 17, 2008 at 7:59 AM, Alexandre Vassalotti <alexandre@peadrop.com> wrote:
Another solution would be to write a 2to3 pickle converter using the pickletools module. It is surely not the most elegant or robust solution, but I could work.
This could be done even for 2.x <--> 2.6 to translate module names at unpickling and pickling time. IMHO that's preferable to leaving stub modules with the old names around. Anyway, I'm not a heavy user of pickle, so people who are should decide.
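Translating module names at unpickling time could look roughly like the sketch below. It assumes a current Python 3 interpreter and uses pickle.Unpickler.find_class, the documented hook for resolving global references; the RENAMED_MODULES table and the helper names are illustrative only and not part of any proposal in this thread.

```python
import io
import pickle
import queue

# Illustrative old-name -> new-name table; a real one would be much longer.
RENAMED_MODULES = {"Queue": "queue"}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that translates old module names while loading."""

    def find_class(self, module, name):
        module = RENAMED_MODULES.get(module, module)
        return super().find_class(module, name)


def loads_with_renames(data):
    # fix_imports=False turns off Python 3's built-in name mapping, so only
    # RENAMED_MODULES is responsible for the translation in this sketch.
    return RenamingUnpickler(io.BytesIO(data), fix_imports=False).load()


# A protocol 0 stream that refers to the class by its Python 2 name.
old_stream = b"cQueue\nQueue\n."
loaded = loads_with_renames(old_stream)
```

The same table, inverted, would be needed on the pickling side to keep writing streams that older interpreters can read.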

Alexandre Vassalotti wrote:
On Sat, May 17, 2008 at 5:05 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Object serialization protocols like e.g. pickle usually store the complete module path to the object class together with the object.
The opposite problem exists for Python 3.0, too.
This is just one manifestation of what I consider a serious shortcoming of the pickle format for long-term storage: it ties the data to implementation details of the program. When I brought this up earlier, various people assured me that it wasn't a problem in practice. I think we're seeing one situation here where it *is* a problem. -- Greg

On 10:22 pm, greg.ewing@canterbury.ac.nz wrote:
When I brought this up earlier, various people assured me that it wasn't a problem in practice. I think we're seeing one situation here where it *is* a problem.
Just my two cents here - experience has taught me that it's definitely a problem in practice. One big problem with pickle is that it's even difficult to tell when or how much your persistence format depends on your application code. For example, if you're pickling a dict that is supposed to map strings to integers, but you have a bug which accidentally ends up using a string subclass instead, it can be very difficult to figure out that this ever happened. pickletools is really neat, and can help with this problem once you're stuck, but it's a better idea to use a more explicit persistence mechanism in the first place if you can.
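One way to see how much a pickle depends on application code is to list the globals it refers to. The sketch below uses pickletools.genops for that; the helper name referenced_globals is made up, and fractions.Fraction merely stands in for an unexpected type such as the accidental string subclass mentioned above.

```python
import pickle
import pickletools
from fractions import Fraction


def referenced_globals(data):
    """Return the sorted "module name" pairs a pickle stream refers to."""
    return sorted({
        arg
        for opcode, arg, _pos in pickletools.genops(data)
        if opcode.name == "GLOBAL"
    })


# protocol=2 emits GLOBAL opcodes; fix_imports=False keeps Python 3 names.
# Protocol 4 and later use STACK_GLOBAL instead, which this helper ignores.
plain = pickle.dumps({"a": 1, "b": 2}, protocol=2, fix_imports=False)
mixed = pickle.dumps({"a": 1, "b": Fraction(1, 2)}, protocol=2,
                     fix_imports=False)
```

A dict of builtins yields an empty list, while the second stream reports "fractions Fraction", so a check like this could flag data that silently picked up a dependency on a class.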

On 2008-05-17 16:59, Alexandre Vassalotti wrote:
On Sat, May 17, 2008 at 5:05 AM, M.-A. Lemburg <mal@egenix.com> wrote:
I'd like to bring a potential problem to attention that is caused by the recent module renaming approach:
Object serialization protocols like e.g. pickle usually store the complete module path to the object class together with the object.
Thanks for bringing this up. I was aware of the problem myself, but I hadn't yet worked out a good solution to it.
It can also happen in storage setups where Python objects are stored using e.g. pickle, ZODB being a prominent example. As soon as a Python 2.6 application starts writing to such storages, Python 2.5 and lower versions will no longer be able to read back all the data.
The opposite problem exists for Python 3.0, too. Pickle streams written by Python 2.x applications will not be readable by Python 3.0. And, one solution to this is to use Python 2.6 to regenerate pickle stream.
Another solution would be to write a 2to3 pickle converter using the pickletools module. It is surely not the most elegant or robust solution, but I could work.
I'm not really worried much about going from 2.x to 3.x. Breakage is allowed for that transition. However, the case is different for going from 2.5 to 2.6. Breakage should be avoided if at all possible.
Now, I think there's a way to solve this puzzle:
Instead of renaming the modules (e.g. Queue -> queue), we leave the code in the existing modules and packages and instead add the new module names and package structure with pointers and redirects to the existing 2.5 modules.
This would certainly work for simple modules, but what about packages? For packages, you can't use the ``sys.modules[__name__] = Queue`` to preserve module identity. Therefore, pickle will use the new package name when writing its streams. So, we are back to the same problem again.
A possible solution could be writing a compatibility layer for the Pickler class, which would map new module names to their old at runtime. Again, this is neither an elegant, nor robust, solution, but it should work in most cases.
While it's possible to fix pickle (at least the Python version), this would not help with other serialization formats that rely on the .__module__ attribute mapping to an existing module. It's better to address the problem at the module level.
Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch.
I think it's much better to have 2to3.py do the renaming and only add warnings to the renamed modules in 2.x (without actually applying any renaming).
It would also be possible to seed sys.modules with module proxy objects (see e.g. mx.Misc.LazyModule from egenix-mx-base) which only turn into real module objects once the module is referenced. This would allow adding a "from __future__ import new_module_names" which then results in loading proxies for all renamed modules (without actually loading the modules until they are used under their new names).
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 18 2008)
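The module proxy idea can be sketched as follows for a current Python 3 interpreter. This is not mx.Misc.LazyModule, whose interface is not shown in this thread; the class LazyAlias and the alias name json_alias are assumptions made purely for illustration, with the stdlib json module standing in for a renamed module.

```python
import importlib
import sys
import types


class LazyAlias(types.ModuleType):
    """Placeholder module that imports its target on first attribute use."""

    def __init__(self, alias, target):
        super().__init__(alias)
        self.__dict__["_lazy_target"] = target

    def __getattr__(self, attr):
        # Reached only when normal lookup fails, i.e. for real module contents.
        module = importlib.import_module(self.__dict__["_lazy_target"])
        sys.modules[self.__name__] = module  # swap the proxy for the real module
        return getattr(module, attr)


# Hypothetical alias name "json_alias" for the real stdlib module "json".
sys.modules["json_alias"] = LazyAlias("json_alias", "json")

proxy = importlib.import_module("json_alias")
was_proxy = isinstance(proxy, LazyAlias)
encoded = proxy.dumps([1, 2])
replaced = sys.modules["json_alias"] is importlib.import_module("json")
```

Since the proxy ends up handing out the original module object, classes reached through the alias keep their original __module__ value.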

M.-A. Lemburg wrote:
Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch.
I think this is the key point here. The possibility of breaking pickling compatibility never came up during the PEP 3108 discussions, so wasn't taken into account in deciding whether or not backporting the name changes was a good idea. I think it's pretty clear that the code needs to be moved back into the modules with the old names for 2.6. The only question is whether or not we put any effort into making the new stdlib organisation usable in 2.x, or just rely on 2to3 to fix it (note that the "increasing the common subset" argument doesn't really apply, since you can catch the import errors in order to try both names). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

On Sun, May 18, 2008 at 6:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
M.-A. Lemburg wrote:
Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch.
I think this is the key point here. The possibility of breaking pickling compatibility never came up during the PEP 3108 discussions, so wasn't taken into account in deciding whether or not backporting the name changes was a good idea.
I think it's pretty clear that the code needs to be moved back into the modules with the old names for 2.6. The only question is whether or not we put any effort into making the new stdlib organisation usable in 2.x, or just rely on 2to3 to fix it (note that the "increasing the common subset" argument doesn't really apply, since you can catch the import errors in order to try both names).
Problem with this is it makes forward-porting revisions to 3.0 a PITA. By keeping the module names consistent between the versions merging a revision is just a matter of ``svnmerge merge`` with the usual 3.0-specific changes. Reverting the modules back to the old name will make forward-porting much more difficult as I don't think svn keeps rename information around (and thus map the old name to the new name in terms of diffs).
Alexandre's idea of teaching pickle the mapping of old names to new might be the best solution. We could have a flag to pickle that deactivates the renaming. Otherwise we could bump the pickle version number so that the new number doesn't do the mapping while the old versions do the implicit module mapping.
And as Greg and Glyph have pointed out, this is a problem that might need to be addressed in the future with some changes to our serialization method (I have no clue how since I don't deal with pickle very much). -Brett

On 2008-05-18 22:24, Brett Cannon wrote:
On Sun, May 18, 2008 at 6:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
M.-A. Lemburg wrote:
Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch.
I think this is the key point here. The possibility of breaking pickling compatibility never came up during the PEP 3108 discussions, so wasn't taken into account in deciding whether or not backporting the name changes was a good idea.
I think it's pretty clear that the code needs to be moved back into the modules with the old names for 2.6. The only question is whether or not we put any effort into making the new stdlib organisation usable in 2.x, or just rely on 2to3 to fix it (note that the "increasing the common subset" argument doesn't really apply, since you can catch the import errors in order to try both names).
Problem with this is it makes forward-porting revisions to 3.0 a PITA. By keeping the module names consistent between the versions merging a revision is just a matter of ``svnmerge merge`` with the usual 3.0-specific changes. Reverting the modules back to the old name will make forward-porting much more difficult as I don't think svn keeps rename information around (and thus map the old name to the new name in terms of diffs).
svnmerge is written in Python, so wouldn't it be possible to add support for maintaining such renaming to that tool ?
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage.
AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there.
Alexandre's idea of teaching pickle the mapping of old names to new might be the best solution. We could have a flag to pickle that deactivates the renaming. Otherwise we could bump the pickle version number so that the new number doesn't do the mapping while the old versions do the implicit module mapping.
And as Greg and Glyph have pointed out, this is a problem that might need to be addressed in the future with some changes to our serialization method (I have no clue how since I don't deal with pickle very much).
It is possible to make pickle aware of the module renames, but that doesn't solve problems with other forms of serialization or use of the .__module__ attribute in general. Why can't we just provide a "from __future__ import renamed_modules" which then provides all the new name to old name mappings in some form (e.g. module proxies or whatever) and leave the existing modules in 2.x untouched ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 19 2008)

M.-A. Lemburg wrote:
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage.
AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there.
I think MAL is 100% correct here (and I expect Raymond will chime in to support him at some point as well). Taking the time to fix our mistake may mean we need to do another alpha rather than being able to go into the betas as planned, but I think that would be a much better option than breaking any 2.x code that relies on __module__ staying the same across releases. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

"Nick Coghlan" <ncoghlan@gmail.com> wrote in message news:4831704D.1060201@gmail.com...
| M.-A. Lemburg wrote:
| > I don't think that an administrative problem such as forward-porting
| > patches to 3.x warrants breakage in the 2.x branch.
| >
| > After all, the renaming was approached for Python 3.0 and not
| > 2.6 *because* it introduces major breakage.
| >
| > AFAIR, the discussion on the stdlib-sig also didn't include the
| > plan to backport such changes to 2.6. Otherwise, we would have
| > hashed them out there.
|
| I think MAL is 100% correct here (and I expect Raymond will chime in to
| support him at some point as well).
For what little it's worth, I was surprised too that the 3.0 renames were backported as the default versions. It strikes me as possibly a 'bridge too far' ;-).
tjr

Nick writes:
M.-A. Lemburg wrote:
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage.
AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there.
I think MAL is 100% correct here (and I expect Raymond will chime in to support him at some point as well).
And until then, a +1 for MAL's position from me as well. 2.x should be quite conservative about such changes... Cheers, Mark

Nick writes:
M.-A. Lemburg wrote:
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage.
AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there.
I think MAL is 100% correct here (and I expect Raymond will chime in to support him at some point as well).
And until then, a +1 for MAL's position from me as well. 2.x should be quite conservative about such changes...
I concur. Raymond

On Mon, May 19, 2008 at 8:39 AM, Raymond Hettinger <python@rcn.com> wrote:
Nick writes:
M.-A. Lemburg wrote:
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage.
AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there.
I think MAL is 100% correct here (and I expect Raymond will chime in to support him at some point as well).
And until then, a +1 for MAL's position from me as well. 2.x should be quite conservative about such changes...
I concur.
And a "me too" post about being conservative by default as well.
I'm not sure how effective a "from __future__ import renamed_modules" would be, since such future imports are meant to affect the semantics of the *current* module only, whereas which name to use when pickling a module reference is most likely a global choice. So perhaps some other way of changing the default behavior globally would be more appropriate.
Assuming it's really the pickle module that needs to know about this, how about making this a per-Pickler-instance flag? If you wanted to write 3.0 compatible pickles you'd have to do something like
    p = pickle.Pickler()
    p.use_new_module_names(True)
    pkl = p.dump(<object>)
We could supply an extra flag to the dump() and dumps() convenience functions as well.
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
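For comparison, the pickle module in current Python 3 exposes a switch in this spirit. It is not the use_new_module_names() method sketched above, which is hypothetical, but the fix_imports parameter accepted by Pickler, dump() and dumps(). The sketch below assumes a current Python 3 interpreter and protocol 2, where the parameter has an effect.

```python
import pickle
import queue

# Protocol 2 with the default fix_imports=True writes the Python 2 name.
old_names = pickle.dumps(queue.Queue, protocol=2)
# Turning the flag off writes the name as it exists in Python 3.
new_names = pickle.dumps(queue.Queue, protocol=2, fix_imports=False)

writes_old = b"cQueue\nQueue\n" in old_names
writes_new = b"cqueue\nQueue\n" in new_names

# Loading applies the reverse mapping, so the old-name stream still resolves.
roundtrip = pickle.loads(old_names)
```

The choice is made per call or per Pickler instance, which matches the concern above that a per-module future import is the wrong granularity for this decision.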

On Mon, May 19, 2008 at 9:22 AM, Guido van Rossum <guido@python.org> wrote:
On Mon, May 19, 2008 at 8:39 AM, Raymond Hettinger <python@rcn.com> wrote:
Nick writes:
M.-A. Lemburg wrote:
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage.
AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there.
I think MAL is 100% correct here (and I expect Raymond will chime in to support him at some point as well).
And until then, a +1 for MAL's position from me as well. 2.x should be quite conservative about such changes...
I concur.
And a "me too" post about being conservative by default as well.
I will update the PEP some time today. I think if we take MAL's idea of doing the __dict__.update() trick and suppress the Py3K warnings then it should be able to keep the warnings (it will require a very specific filter). Otherwise the Py3K warnings will just have to go. -Brett

On Mon, May 19, 2008 at 2:08 PM, M.-A. Lemburg <mal@egenix.com> wrote:
Why can't we just provide a "from __future__ import renamed_modules" which then provides all the new name to old name mappings in some form (e.g. module proxies or whatever) and leave the existing modules in 2.x untouched ?
If I understand this correctly, the pickles would then be compatible between 2.6 and 2.5, unless you did from __future__ import renamed_modules, which would make the pickles compatible between 2.6 and 3.0. This sounds like the best solution to me, especially if the old names are still available after the future import, as all that would then be needed is to repickle all the pickles to convert from 2.5 to 3.0 pickles, right? So, if I understood this correctly, that sounds like a perfect solution. :) -- Lennart Regebro: Zope and Plone consulting. http://www.colliberty.com/ +33 661 58 14 64

On Mon, May 19, 2008 at 5:08 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2008-05-18 22:24, Brett Cannon wrote:
On Sun, May 18, 2008 at 6:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
M.-A. Lemburg wrote:
Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch.
I think this is the key point here. The possibility of breaking pickling compatibility never came up during the PEP 3108 discussions, so wasn't taken into account in deciding whether or not backporting the name changes was a good idea.
I think it's pretty clear that the code needs to be moved back into the modules with the old names for 2.6. The only question is whether or not we put any effort into making the new stdlib organisation usable in 2.x, or just rely on 2to3 to fix it (note that the "increasing the common subset" argument doesn't really apply, since you can catch the import errors in order to try both names).
Problem with this is it makes forward-porting revisions to 3.0 a PITA. By keeping the module names consistent between the versions merging a revision is just a matter of ``svnmerge merge`` with the usual 3.0-specific changes. Reverting the modules back to the old name will make forward-porting much more difficult as I don't think svn keeps rename information around (and thus map the old name to the new name in terms of diffs).
svnmerge is written in Python, so wouldn't it be possible to add support for maintaining such renaming to that tool ?
Don't know, possibly. But I am not about to try to figure out.
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
That's why I suggested changing pickle to deal with the rename, but obviously I am in the minority in that idea.
After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage.
AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there.
Never came up.
Alexandre's idea of teaching pickle the mapping of old names to new might be the best solution. We could have a flag to pickle that deactivates the renaming. Otherwise we could bump the pickle version number so that the new number doesn't do the mapping while the old versions do the implicit module mapping.
And as Greg and Glyph have pointed out, this is a problem that might need to be addressed in the future with some changes to our serialization method (I have no clue how since I don't deal with pickle very much).
It is possible to make pickle aware of the module renames, but that doesn't solve problems with other forms of serialization or use of the .__module__ attribute in general.
Why can't we just provide a "from __future__ import renamed_modules" which then provides all the new name to old name mappings in some form (e.g. module proxies or whatever) and leave the existing modules in 2.x untouched ?
I have started a discussion on the stdlib SIG on how to handle this, so I will defer this discussion to there. But one thing that needs to be decided is if we are ever going to allow ourselves to rename modules without a major version bump, and if so how to deal with this problem. I would hope we don't have to wait another eight years before there is another chance to shift things around if it becomes apparent that some new package should be introduced since 2to3 gives us a very nice way to handle the mechanical aspect of porting code. -Brett

On 2008-05-19 21:26, Brett Cannon wrote:
It is possible to make pickle aware of the module renames, but that doesn't solve problems with other forms of serialization or use of the .__module__ attribute in general.
Why can't we just provide a "from __future__ import renamed_modules" which then provides all the new name to old name mappings in some form (e.g. module proxies or whatever) and leave the existing modules in 2.x untouched ?
I have started a discussion on the stdlib SIG on how to handle this, so I will defer this discussion to there.
Thanks.
But one thing that needs to be decided is if we are ever going to allow ourselves to rename modules without a major version bump, and if so how to deal with this problem. I would hope we don't have to wait another eight years before there is another chance to shift things around if it becomes apparent that some new package should be introduced since 2to3 gives us a very nice way to handle the mechanical aspect of porting code.
We could add some kind of module aliasing support to Python. Backporting name changes would then be a matter of loading the right aliasing map into the older Python version.
This could probably be done by adding a line
    if hasattr(sys, 'module_aliases'):
        modname = sys.module_aliases.get(modname, modname)
to the __import__ implementation.
By turning .__module__ into a property and applying the same aliasing there, we should be able to resolve most technical issues with a renaming.
Alas, too late to change 2.4 and 2.5 :-/
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 19 2008)
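The aliasing lookup described above can be approximated without touching __import__ itself. The sketch below is an assumption-laden stand-in: sys.module_aliases does not exist in any Python release, so a plain dictionary named MODULE_ALIASES and a helper named import_with_aliases (both made up here) play its role, and the table maps old names to the names available on a current Python 3 interpreter.

```python
import importlib

# Hypothetical alias table standing in for the proposed sys.module_aliases.
MODULE_ALIASES = {
    "Queue": "queue",
    "ConfigParser": "configparser",
}


def import_with_aliases(modname, aliases=MODULE_ALIASES):
    """Import modname, first translating it through the alias table."""
    return importlib.import_module(aliases.get(modname, modname))


queue_module = import_with_aliases("Queue")
json_module = import_with_aliases("json")  # no alias: imported unchanged
```

Applying the same table when reading .__module__, as suggested above, would be a separate step that this sketch does not attempt.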

On Mon, May 19, 2008 at 12:26 PM, Brett Cannon <brett@python.org> wrote:
On Mon, May 19, 2008 at 5:08 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2008-05-18 22:24, Brett Cannon wrote:
On Sun, May 18, 2008 at 6:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
M.-A. Lemburg wrote:
Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch.
I think this is the key point here. The possibility of breaking pickling compatibility never came up during the PEP 3108 discussions, so wasn't taken into account in deciding whether or not backporting the name changes was a good idea.
I think it's pretty clear that the code needs to be moved back into the modules with the old names for 2.6. The only question is whether or not we put any effort into making the new stdlib organisation usable in 2.x, or just rely on 2to3 to fix it (note that the "increasing the common subset" argument doesn't really apply, since you can catch the import errors in order to try both names).
Problem with this is it makes forward-porting revisions to 3.0 a PITA. By keeping the module names consistent between the versions merging a revision is just a matter of ``svnmerge merge`` with the usual 3.0-specific changes. Reverting the modules back to the old name will make forward-porting much more difficult as I don't think svn keeps rename information around (and thus map the old name to the new name in terms of diffs).
svnmerge is written in Python, so wouldn't it be possible to add support for maintaining such renaming to that tool ?
Don't know, possibly. But I am not about to try to figure out.
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
That's why I suggested changing pickle to deal with the rename, but obviously I am in the minority in that idea.
After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage.
AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there.
Never came up.
Alexandre's idea of teaching pickle the mapping of old names to new might be the best solution. We could have a flag to pickle that deactivates the renaming. Otherwise we could bump the pickle version number so that the new number doesn't do the mapping while the old versions do the implicit module mapping.
And as Greg and Glyph have pointed out, this is a problem that might need to be addressed in the future with some changes to our serialization method (I have no clue how since I don't deal with pickle very much).
It is possible to make pickle aware of the module renames, but that doesn't solve problems with other forms of serialization or use of the .__module__ attribute in general.
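For illustration, here is a rough sketch of what "making pickle aware of the renames" could look like on the loading side, written against the Python 3 `pickle.Unpickler.find_class` hook. The `RENAMED_MODULES` table, the `RenameAwareUnpickler` class and the `loads` helper are hypothetical names invented for this example, and the table lists only a few sample renames. It is not the mechanism proposed in this thread, and it covers only reading old pickles, not writing pickles that older versions can read:

```python
import io
import pickle
import queue

# Hypothetical, incomplete table of old 2.x module names and their new names.
RENAMED_MODULES = {
    "Queue": "queue",
    "ConfigParser": "configparser",
    "SocketServer": "socketserver",
}


class RenameAwareUnpickler(pickle.Unpickler):
    """Unpickler that translates old module names before importing them."""

    def find_class(self, module, name):
        # Look up the stored module path; unknown names pass through unchanged.
        module = RENAMED_MODULES.get(module, module)
        return super().find_class(module, name)


def loads(data):
    # fix_imports=False switches off the stdlib's own name translation,
    # so only the table above is responsible for the mapping here.
    return RenameAwareUnpickler(io.BytesIO(data), fix_imports=False).load()


# A hand-written protocol 0 pickle that refers to the class as "Queue.Queue",
# the way a pre-rename Python would have stored it.
old_style = b"cQueue\nQueue\n."
cls = loads(old_style)
print(cls is queue.Queue)
```

As noted above, a hook like this only helps pickle. Any other serializer, or any code that reads `__module__` directly, would still see whichever name the class was defined under.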
Why can't we just provide a "from __future__ import renamed_modules" which then provides all the new name to old name mappings in some form (e.g. module proxies or whatever) and leave the existing modules in 2.x untouched ?
I have started a discussion on the stdlib SIG on how to handle this, so I will defer this discussion to there.
The decision was made (by me) to just revert all of the renames in 2.6. A note will be in the docs stating the rename, but otherwise 2to3 will be relied upon for all transitions from old names to new names. I have updated the PEP to note which modules need to be reverted and the new steps to rename a module, and added/re-opened the appropriate issues (all attached to issue 2775). -Brett

On Mon, May 19, 2008 at 7:08 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2008-05-18 22:24, Brett Cannon wrote:
On Sun, May 18, 2008 at 6:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
M.-A. Lemburg wrote:
Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch.
I think this is the key point here. The possibility of breaking pickling compatibility never came up during the PEP 3108 discussions, so wasn't taken into account in deciding whether or not backporting the name changes was a good idea.
I think it's pretty clear that the code needs to be moved back into the modules with the old names for 2.6. The only question is whether or not we put any effort into making the new stdlib organisation usable in 2.x, or just rely on 2to3 to fix it (note that the "increasing the common subset" argument doesn't really apply, since you can catch the import errors in order to try both names).
Problem with this is it makes forward-porting revisions to 3.0 a PITA. By keeping the module names consistent between the versions merging a revision is just a matter of ``svnmerge merge`` with the usual 3.0-specific changes. Reverting the modules back to the old name will make forward-porting much more difficult as I don't think svn keeps rename information around (and thus map the old name to the new name in terms of diffs).
svnmerge is written in Python, so wouldn't it be possible to add support for maintaining such renaming to that tool ?
svnmerge.py is mostly a wrapper over svn merge, and svn merge can't handle it, so I don't think it is easily possible.
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
I am a bit worried for the sanity of the Merger, though. Merges into non-existent files are skipped automatically, so it doesn't make life any easier. <shameless_advertising>Bazaar can handle renames correctly.</shameless_advertising> -- Cheers, Benjamin Peterson "There's no place like 127.0.0.1."

On Mon, May 19, 2008 at 3:26 PM, Benjamin Peterson <musiccomposition@gmail.com> wrote:
On Mon, May 19, 2008 at 7:08 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 2008-05-18 22:24, Brett Cannon wrote:
On Sun, May 18, 2008 at 6:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
M.-A. Lemburg wrote:
Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch.
I think this is the key point here. The possibility of breaking pickling compatibility never came up during the PEP 3108 discussions, so wasn't taken into account in deciding whether or not backporting the name changes was a good idea.
I think it's pretty clear that the code needs to be moved back into the modules with the old names for 2.6. The only question is whether or not we put any effort into making the new stdlib organisation usable in 2.x, or just rely on 2to3 to fix it (note that the "increasing the common subset" argument doesn't really apply, since you can catch the import errors in order to try both names).
Problem with this is it makes forward-porting revisions to 3.0 a PITA. By keeping the module names consistent between the versions merging a revision is just a matter of ``svnmerge merge`` with the usual 3.0-specific changes. Reverting the modules back to the old name will make forward-porting much more difficult as I don't think svn keeps rename information around (and thus map the old name to the new name in terms of diffs).
svnmerge is written in Python, so wouldn't it be possible to add support for maintaining such renaming to that tool ?
svnmerge.py is mostly a wrapper over svn merge, and svn merge can't handle it, so I don't think it is easily possible.
I think MAL was suggesting adding some property that kept track of skipped merges or something.
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
I am a bit worried for the sanity of the Merger, though. Merges into non-existent files are skipped automatically, so it doesn't make life any easier.
It will either have to be done in 2.6 and then immediately forward-ported, or done in 3.0 and backported. I will follow the latter IF I feel like bothering with the backport.
<shameless_advertising>Bazaar can handle renames correctly.</shameless_advertising>
Yeah, yeah. One thing at a time. -Brett

Benjamin Peterson schrieb:
svnmerge is written in Python, so wouldn't it be possible to add support for maintaining such renaming to that tool ?
svnmerge.py is mostly a wrapper over svn merge, and svn merge can't handle it, so I don't think it is easily possible.
I don't think that an administrative problem such as forward-porting patches to 3.x warrants breakage in the 2.x branch.
I am a bit worried for the sanity of the Merger, though. Merges into non-existent files are skipped automatically, so it doesn't make life any easier.
<shameless_advertising>Bazaar can handle renames correctly.</shameless_advertising>
So can dozens of other VCSs. Just to keep perspective. Georg
participants (14)
- Alexandre Vassalotti
- Benjamin Peterson
- Brett Cannon
- Georg Brandl
- glyph@divmod.com
- Greg Ewing
- Gregory P. Smith
- Guido van Rossum
- Lennart Regebro
- M.-A. Lemburg
- Mark Hammond
- Nick Coghlan
- Raymond Hettinger
- Terry Reedy