(and about tests) Re: Pedantic pickling error after reload?

Robert no-spam at non-existing.invalid
Fri Feb 26 13:04:38 CET 2010

Diez B. Roggisch wrote:
> On 25.02.10 18:08, Robert wrote:
>> After (intended/controlled) reload or similar action on a module/class
>> the pickle/cPickle.dump raises errors like
>> pickle.PicklingError: Can't pickle <class 'somemodule.SomeClass'>: it's
>> not the same object as somemodule.SomeClass
>> The cause in pickle.py (and cPickle) is the line
>> "if klass is not obj:"
>> Shouldn't it be enough to have "if klass.__name__ != obj.__name__:"
>> there?
> No. This would alias classes of same name, but living in different modules.
> So at least you need to compare these, too. I'm not sure if there aren't

At that point of the comparison the module is already identical
("klass = getattr(mod, name)"), so only the name is left to compare.

> even more corner-cases. Python's import-mechanism can sometimes be
> rather foot-shoot-prone.

I still don't see a real reason against a mere module+name
comparison. The same issues exist during pickle.load anyway. Only
the class object has been renewed (intentionally).

If there are problems with nested classes etc., the programmer will
have to rethink things on a different level: those are design errors,
a subject for pychecker/pylint - not a reason to break pickle.dump ... ?
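
A minimal sketch of the situation discussed above, written for Python 3 (where the C implementation sits behind `pickle` itself). The reload is only simulated by executing the same class source twice in an in-memory module. The module name `somemodule` and class name `SomeClass` come from the error message quoted earlier; everything else is an assumption for illustration.

```python
import pickle
import sys
import types

# Build a throwaway module in memory so the example needs no files.
mod = types.ModuleType("somemodule")
sys.modules["somemodule"] = mod

SOURCE = "class SomeClass(object):\n    pass\n"

exec(SOURCE, mod.__dict__)      # first "import" of the module body
obj = mod.SomeClass()           # a living instance of the old class

exec(SOURCE, mod.__dict__)      # simulated reload: SomeClass is rebound
old_class = type(obj)
new_class = mod.SomeClass

# pickle looks up somemodule.SomeClass and compares it by identity
# with type(obj); after the "reload" the two objects differ.
try:
    pickle.dumps(obj)
    dump_failed = False
except pickle.PicklingError:
    dump_failed = True

# The relaxed check proposed in this thread: same module and same name.
same_by_name = (old_class.__module__ == new_class.__module__
                and old_class.__name__ == new_class.__name__)
```

Here the identity check rejects the instance while the module+name check would accept it, which is exactly the disagreement in this thread.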

>> => a bug report/feature request?
>> Classes can change face anyway during pickled state, why should a
>> over-pedantic reaction break things here during runtime?
>> (So far I'd need to walk the object tree in all facets, guard against
>> infinite loops like pickle itself does, and re-class things .. )
> If anything it's a feature - and I doubt it's really needed. Because
> reloading pickles with intermittent reload-module-operations is too rare
> a case to be really supported IMHO.

Well, reloading is the thing I do most in coding practice :-)
For me it's as basic as cell proliferation in biology.

In my projects, particularly those with a GUI or with Python-based
HTTP serving, I typically support good live module reloadability,
even actively, with a little extra "reload support code" (which fixes
up the .__class__ etc. of the living Windows tree, main objects,
servers ... plus an 'xisinstance' in a very few locations) - at least
I do this for the frequently changing core modules/classes.
This way the edit-run cycle feels more than 2x faster when the
project is getting bigger and bigger, or when developing things
interactively. Code is exchanged frequently while living objects
stay around for a long time ... works well in practice.
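
A rough sketch of what such "reload support code" might look like, in Python 3. This is not the actual code used in those projects: `reclass_instances`, the body of `xisinstance`, and the demo module `widgets_demo` are hypothetical, and the reload is again simulated in memory.

```python
import pickle
import sys
import types


def reclass_instances(objects, old_class, new_class):
    """Point every instance of old_class in `objects` at new_class."""
    count = 0
    for obj in objects:
        if type(obj) is old_class:
            obj.__class__ = new_class   # living object, new code
            count += 1
    return count


def xisinstance(obj, cls):
    """isinstance() that also accepts a pre-reload copy of cls,
    matched by module and class name along the MRO."""
    if isinstance(obj, cls):
        return True
    return any(k.__module__ == cls.__module__ and k.__name__ == cls.__name__
               for k in type(obj).__mro__)


# Demo with an in-memory module (hypothetical name).
mod = types.ModuleType("widgets_demo")
sys.modules["widgets_demo"] = mod

exec("class Window(object):\n    def title(self):\n        return 'old'\n",
     mod.__dict__)
win = mod.Window()
old_window = mod.Window

# Simulated reload with edited code.
exec("class Window(object):\n    def title(self):\n        return 'new'\n",
     mod.__dict__)

plain_check = isinstance(win, mod.Window)    # fails: win is still old-class
relaxed_check = xisinstance(win, mod.Window)

fixed = reclass_instances([win], old_window, mod.Window)
title_after = win.title()

# Once re-classed, the instance passes pickle's identity check again.
restored = pickle.loads(pickle.dumps(win))
```

The list of objects to fix up is passed in explicitly, on the assumption that the application knows its own roots (window tree, main objects, servers).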

Re-entering the same (complex) app state to evolve those
thousands of small things (where full parallel test coverage
doesn't work out) is a major consumer of dev time in bigger
projects - in C and Java projects and even with other dynamic languages.
Dynamic classes are a main reason why I use Python (adopted from
Lisp a long time ago; is this kind of reload possible with Ruby too?)

I typically need just 1 full app restart per 20..50 edit-run cycles,
I guess, and just a few unit test runs per release. Even for
Cython/pyximport code I added support for this reload
edit-run cycle, because I cannot imagine developing without it.

Only the standard pickle issues stood in the way, and this patch
(plus a fallback from cPickle to pickle) has done well so far.

> Do yourself a favor, write a unit-test that tests the desired behavior
> that makes you alter your code & then reload. This makes the problem go
> away, and you have a more stable development through having more tests :)

This is a comfortable, quasi-religious theory that is raised often
and easily here and there; it is impractical and very slow at that
fine-grained level of code evolution, however. An interesting issue.

I use unit tests to get stability at a much higher level,
where/when things and functionality are already quite wired up.
Generally, having compared both, I cannot confirm that "always
write tests before development" ideologies pay off in practice.
"Reload > pychecker/pylint > tests" works most effectively with
Python in my opinion.
For GUI development the difference is largest
(and smallest for math algorithms that are well away from data
structures/OO).

Another issue regarding tests, IMHO, is that one should not waste
the "validation power" of unit tests too easily on permanent
low-level evolution purposes, because it's a little like bacteria
becoming resistant to antibiotics: code becomes 'fit'
against artificial tests, but not against the real world.
For example, in late-stage NASA tests of rockets and the like, there
is a validation rule that when those tests do not come out green,
one does not just fix the (one) cause, tinkering until it
works. The whole thing is at stake.
And the whole test scheme has to be rethought too. Ideally,
whenever such a late test breaks, a completely new
higher-level test has to be invented (in addition) ... until there
is a minimal set of "fresh green lights" which were red only
during their own tests, but never red in the real test run.

A rule that unit tests are used only near a release or a milestone
is healthy in that sense, I think.
(And a quick edit-(real)run-interact cycle is good for speed.)

