
"tomer filiba" <tomerfiliba@gmail.com> wrote:
On 1/25/07, Josiah Carlson <jcarlson@uci.edu> wrote:
Overall, I like the idea; I'm a big fan of simplifying object persistence and/or serialization. A part of me also likes how the objects can choose to lie about their types.
But another part of me says: the basic objects that you specified already have a format that is unambiguous, repr(obj). They can also be reconstructed from their component parts via eval(repr(obj)), or even via the 'unrepr' function in the ConfigObj module. It doesn't handle circular references.
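A minimal sketch of the round trip described above, and of where it breaks down. It uses only built-in types; the variable names are illustrative and not from the original message, and the behaviour shown is that of current Python 3.

```python
# Round trip for plain built-in objects: repr() produces source text
# that eval() turns back into an equal object.
settings = {"name": "demo", "ports": [80, 443], "ratio": 0.25, "tags": ("a", "b")}
text = repr(settings)
restored = eval(text)  # acceptable here only because we produced `text` ourselves
assert restored == settings

# A self-referencing list does not survive: repr() abbreviates the cycle
# as "[...]", and the cycle itself is lost when that text is evaluated.
cycle = []
cycle.append(cycle)
cycle_text = repr(cycle)
rebuilt = eval(cycle_text)  # a list holding a list holding Ellipsis
assert rebuilt[0] is not rebuilt
```

The second half is the circular-reference limitation in miniature: the text still parses, but what comes back is no longer the same structure.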
well, repr is fine for most simple things, but you don't use repr to serialize objects, right? it's not powerful/introspective enough. besides repr is meant to be readable, while __getstate__ can return any object. imagine this:
I use repr to serialize objects all the time. ConfigObj is great when I want to handle python-based configuration information, and/or I don't want to worry about the security implications of 'eval(arbitrary string)', or 'import module'. With a proper __repr__ method, I can even write towards your API:

    class mylist(object):
        def __repr__(self):
            state = ...
            return 'mylist.__setstate__(%r)'%(state,)
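One way the idea could be fleshed out, as a sketch only. The `items` attribute, the `__eq__` method and the `from_state` classmethod are assumptions made for illustration; `from_state` stands in for the class-level `__setstate__` call in the message, since the proposal's exact semantics are not spelled out here.

```python
class mylist(object):
    """List-like wrapper whose repr() is an expression that rebuilds it."""

    def __init__(self, items=()):
        self.items = list(items)

    @classmethod
    def from_state(cls, state):
        # Hypothetical constructor standing in for the class-level
        # __setstate__ call used in the message.
        return cls(state)

    def __repr__(self):
        state = self.items  # the state is expressed as a plain list
        return 'mylist.from_state(%r)' % (state,)

    def __eq__(self, other):
        return isinstance(other, mylist) and self.items == other.items


original = mylist([1, 'two', 3.0])
text = repr(original)

# Evaluate with only the one name the text needs. This narrows the
# namespace but is not a sandbox for untrusted input.
copy = eval(text, {'__builtins__': {}, 'mylist': mylist})
assert copy == original and copy is not original
```

The repr text stays readable and editable by hand, which is the property being argued for.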
    class complex:
        def __repr__(self):
            return "(%f+%fj)" % (self.real, self.imag)
I would use 'return "(%r+%rj)"% (self.real, self.imag)', but it doesn't much matter.
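The difference between the two format codes does matter for round-tripping, as this small sketch suggests. The class name `Complex` and its constructor are illustrative assumptions, not code from the thread.

```python
value = 0.1 + 0.2

as_f = "%f" % value  # fixed six decimal places
as_r = "%r" % value  # enough digits to read back as the same float

assert float(as_r) == value
assert float(as_f) != value


class Complex:
    """Illustrative stand-in for the class sketched in the message."""

    def __init__(self, real, imag):
        self.real = real
        self.imag = imag

    def __repr__(self):
        # %r keeps full precision, so eval(repr(z)) restores both parts.
        return "(%r+%rj)" % (self.real, self.imag)


z = Complex(0.1 + 0.2, 1e-9)
assert eval(repr(z)) == complex(z.real, z.imag)
```

With `%f` the imaginary part above would be written as `0.000000` and lost entirely.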
repr is made for humans of course, while serialization is made for machines. they serve different purposes, so they need different APIs.
I happen to disagree. The only reason to use a different representation or API is if there are size and/or performance benefits to offering a machine readable vs. human readable format. I know that there are real performance advantages to using (c)Pickle over repr/unrepr, but I also use repr/unrepr so that I can change settings with notepad (as has been necessary on occasion).
Even better, it has 3 native representations: repr(a).encode('zlib'), repr(a), and pprint.pprint(a), each offering a different amount of user readability. I digress.
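A sketch of those three representations in current Python 3, under a few stated substitutions: `str.encode('zlib')` was a Python 2 codec, so `zlib.compress` is used instead; `pprint.pformat` returns the text that `pprint.pprint` would print; and `ast.literal_eval` stands in for an 'unrepr'-style parser. The sample data is made up.

```python
import ast
import pprint
import zlib

a = {"window": {"width": 800, "height": 600}, "recent": ["a.txt", "b.txt"]}

plain = repr(a)                                # one line, readable
pretty = pprint.pformat(a, width=40)           # wrapped, most readable
packed = zlib.compress(plain.encode("utf-8"))  # bytes, not meant for humans

# All three decode back to the same object without calling eval().
assert ast.literal_eval(plain) == a
assert ast.literal_eval(pretty) == a
assert ast.literal_eval(zlib.decompress(packed).decode("utf-8")) == a
```

Only the encoding differs between the three; the underlying state is the same each time.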
you may have digressed, but that's a good point -- that's exactly why i do NOT specify how objects are encoded as a stream of bytes.
all i'm after is the state of the object (which is expressed in terms of other, more primitive objects).
Right, but as 'primitive objects' go, you can't get significantly more primitive than producing a string that can be naively understood by someone familiar with Python *and* the built-in Python parser. Never mind that it works *today* with all of the types you specified earlier (with the exception of file objects - which you discover on parsing/reproducing the object).
you can think of repr as a textual serializer to some extent, that can use the proposed __getstate__ API. pprint is yet another form of serializer.
Well, pprint is more or less a pretty repr.
I believe the biggest problem with the proposal, as specified, is that changing the semantics of __getstate__ and __setstate__ is a bad idea. Add a new pair of methods and ask the twisted people what they think. My only criticism will then be the strawman repr/unrepr.
i'll try to come up with new names... but i don't have any ideas at the moment.
Like Colin, I also like __rebuild__.

- Josiah