Json object-level serializer

Hello, What about adding to the json package the ability for an object to provide a different object to serialize? This would be useful to translate a class into a structure that can be passed to json.dumps. So, if __json__ is provided, it's used for serialization instead of the object itself.
Cheers Tarek -- Tarek Ziadé | http://ziade.org

On Thu, Jul 29, 2010 at 01:35:41PM +0200, Tarek Ziadé wrote:
Also there must be a deserialization hook. Pickle uses __setstate__, and pickle stores the name of the class to call __setstate__ upon. Oleg. -- Oleg Broytman http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.

On Thu, Jul 29, 2010 at 1:40 PM, Oleg Broytman <phd@phd.pp.ru> wrote:
You cannot do a round trip because once the object is serialized, json doesn't know which class to instantiate to de-serialize it. Which is fine really, since json just serializes simple elements. Cheers Tarek -- Tarek Ziadé | http://ziade.org

Antoine Pitrou wrote:
How would you then write a class that works with both pickle and json ? IMO, we'd need a separate method to return a JSON version of the object, e.g. .__json__(). I'm not sure how deserialization could be handled, since JSON doesn't support arbitrary object types. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010)
::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On Thu, Jul 29, 2010 at 1:54 PM, M.-A. Lemburg <mal@egenix.com> wrote:
As I told Oleg, I think it's OK not to have a round trip like Pickle. The use case I have is to express a structure in JSON, but loading it back can be done in a custom, explicit process. It cannot be triggered from the json package itself since it cannot know that a given JSON structure was built through a specific class. Cheers Tarek

Tarek Ziadé wrote:
I just wanted to emphasize that a separate new method is needed, rather than trying to reuse a pickle-protocol method. I don't think deserialization support is needed either. The application getting the decoded JSON data can do that in an application specific way based on the lists and dictionaries it gets from the JSON decoder. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010)

On Thu, Jul 29, 2010 at 7:54 AM, M.-A. Lemburg <mal@egenix.com> wrote:
+1. I think this is a very sensible idea. Note that Tarek's request was not for a magic method like __repr__ that would return an easy to parse string. Instead, the request was for a method that would return an object that can be serialized instead of the original object and will carry enough data to restore the original object.
How would you then write a class that works with both pickle and json ?
Hopefully, for most types json would be able to use an unmodified __reduce__ method. If this is not enough, the reduce protocol already has an extension mechanism. For example, an object may implement obj.__reduce_ex__('json') that would return a json-friendly tuple instead of the pickle-oriented obj.__reduce__().
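A rough sketch of how the `__reduce_ex__('json')` idea might look. Neither pickle nor json defines a string-valued protocol; the `'json'` argument, the `Point` class and the `to_jsonable` helper are assumptions made up for illustration only.

```python
import json


class Point:
    """Hypothetical example class."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce_ex__(self, protocol):
        # Assumed extension: a string protocol selects a JSON-friendly form.
        if protocol == "json":
            return (type(self).__name__, [self.x, self.y])
        # Integer protocols keep the normal pickle behaviour.
        return super().__reduce_ex__(protocol)


def to_jsonable(obj):
    # Hypothetical hook, passed to json.dumps() as ``default=``.
    name, args = obj.__reduce_ex__("json")
    return {"type": name, "args": args}


print(json.dumps(Point(1, 2), default=to_jsonable))
```

The tuple carries the class name and constructor arguments, so a reader of the JSON could in principle rebuild the object, which is the property this message asks for.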
I am afraid this was the turning point in this thread after which the discussion went (IMO) in the wrong direction. Again, the OP's request was for a method that would return an object that json or another simple serializer (say yaml) could handle, not for a method that will return json string.

On 29 Jul, 2010, at 17:28, Alexander Belopolsky wrote:
I'm -1 on this because the __reduce__ protocol and the proposed __json__ protocol have slightly different purposes. When I use JSON I generally only publish part of the object-state into JSON, even when pickling the object would store the entire state. Another reason for not sharing the same method for pickling and json serialisation is that the json side may have external constraints (that is, the consumer of the JSON data may have requirements on how objects are serialized) and those constraints should not limit how the object can be pickled. Ronald

On Thu, Jul 29, 2010 at 1:41 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Maybe that's because I've never used it, but I find this protocol is very complex for this simple use case

Am 29.07.2010 13:35, schrieb Tarek Ziadé:
You can do this with a very short subclass of the JSONEncoder:

    class MyJSONEncoder(JSONEncoder):
        def default(self, obj):
            return obj.__json__()  # with a useful failure message

I don't think it needs to be built into the default encoder. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
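For readers who want to try the subclass approach, here is a minimal runnable sketch. The `Event` class and its fields are invented for the example, and the fallback to the base class is one possible reading of "a useful failure message".

```python
import json


class MyJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        # default() is only called for objects json cannot serialize itself.
        method = getattr(obj, "__json__", None)
        if method is None:
            # The base implementation raises TypeError naming the type.
            return super().default(obj)
        return method()


class Event:
    """Hypothetical example class, not from the thread."""

    def __init__(self, title, attendees):
        self.title = title
        self.attendees = attendees

    def __json__(self):
        return {"title": self.title, "attendees": self.attendees}


print(json.dumps(Event("PyCon", ["tarek", "georg"]), cls=MyJSONEncoder))
```

The caller has to pass `cls=MyJSONEncoder` explicitly, which is the limitation raised later in the thread: third-party code calling a plain `json.dumps` would not pick it up.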

Georg Brandl wrote:
Does that also work with the JSON C extension ?
I don't think it needs to be built into the default encoder.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010)

Am 29.07.2010 14:31, schrieb M.-A. Lemburg:
I think so. The C encoder gets the default function as an argument. Georg

Georg Brandl wrote:
Then that sounds like the right way forward. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010)

On Thu, Jul 29, 2010 at 2:22 PM, Georg Brandl <g.brandl@gmx.net> wrote: ..
Yes, but in that case you need to customize the encoding process and own it. Having built-in recognition of __json__ would allow you to pass your objects to any third-party code that uses a plain json.dumps. For instance, some web kits out there will automatically serialize your objects into JSON strings when you want to return JSON responses. I.e. it becomes a built-in adapter. Cheers Tarek

On Thu, Jul 29, 2010 at 10:39 PM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
I'll channel PJE here and point out that this kind of magic-method based protocol proliferation is exactly what a general purpose generic-function implementation is designed to avoid (i.e. instead of having json.dumps check for a __json__ magic method, you'd just flag json.dumps as a generic function and let people register their own overloads). Each individual time this question comes up people tend to react with "oh, that's too complicated and overkill, but magic methods are simple, so let's just define another magic method". The sum total of all those magic methods starts to accumulate into a lot of complexity of its own though :P Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, 29 Jul 2010 22:51:11 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
I don't agree. __json__ only matters to people who do JSON encoding/decoding. Other people can safely ignore it. And I don't see how generic functions bring less cognitive overhead. (they actually bring more of it, since most implementations are more complicated to begin with)

On Thu, Jul 29, 2010 at 11:11 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Which is exactly the attitude I was talking about: for each individual case, people go "oh, I understand magic methods, those are easy". It's the overall process of identifying the need for and gathering consensus on magic methods that is unwieldy (and ultimately fails to scale, leading to non-extensible interfaces by default, with pretty printing being the classic example, and JSON serialisation the latest).
Mostly because the fully fledged generic implementations like PEAK-rules tend to get brought into discussions when they aren't needed. Single-type generic dispatch is actually so common they gave it a name: object-oriented programming.

All single-type generic dispatch is about is having a registry for a particular operation that says "to perform this operation, with objects of this type, use this function". Instead of having a protocol that says "look up this magic method in the object's own namespace" (which requires a) agreement on the magic name to use and b) that the original author of the type in question both knew and cared about the operation the application developer is interested in) you instead have a protocol that says "here is a standard mechanism for declaring a type registry for a function, so you only have to learn how to register a function once".

Is it really harder for people to learn how to write things like:

    json.dumps.overload(mytype, mytype.to_json)
    json.dumps.overload(third_party_type, my_third_party_type_serialiser)

than it is for them to figure out that implementing a __json__ method will allow them to change how their object is serialised? (Not to mention that a __json__ method can only be used via monkey-patching if the type you want to serialise differently came from a library module rather than your own code).

The generic function registration approach is incidentally discoverable via dir(json.dumps) to see that a function provides the relevant generic function registration methods. Magic method protocols can *only* be discovered by reading documentation.

Function registration is a solved problem, with much better solutions than the ad hoc YAMM (yet-another-magic-method) approach we currently use. We just keep getting scared away from the right answer by the crazily complex overloading schemes that libraries like PEAK-rules allow. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
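The registry idea can be tried out with functools.singledispatch, which was added to the standard library later (Python 3.4) and is not the `json.dumps.overload` API imagined in this message. The sketch below is only an illustration under that substitution; `to_jsonable` is a made-up name, and the serialisers chosen for `Decimal` and `date` are arbitrary.

```python
import datetime
import decimal
import functools
import json


@functools.singledispatch
def to_jsonable(obj):
    # Unregistered types fail the same way json itself would.
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


# Register serialisers for types we do not own, without modifying them.
@to_jsonable.register(decimal.Decimal)
def _(obj):
    return str(obj)


@to_jsonable.register(datetime.date)
def _(obj):
    return obj.isoformat()


data = {"price": decimal.Decimal("9.99"), "day": datetime.date(2010, 7, 29)}
print(json.dumps(data, default=to_jsonable))
```

No monkey-patching of the third-party types is needed, and the registered types can be inspected through `to_jsonable.registry`, which is roughly the discoverability argument made above.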

Why do you want to gather consensus? There is a single json serialization module in the stdlib and it's obvious that __json__ can/should be claimed by that module. Actually, your argument could be turned around: if you use generic functions (such as @json.dumps.overload), alternative json serializers won't easily be able to make use of the information, while they could access the __json__ method like the standard json module does.
It is certainly more annoying and less natural than:

    def __json__(self): ....

Sure, generic functions as a paradigm appear more powerful, more decoupled, etc. But in practice __magic__ methods are sufficient for most uses. Practicality beats purity. That may be why in all the years that the various generic functions libraries have existed, they don't seem to have been really popular compared to the simpler convention of defining fixed method names. (besides, it wouldn't necessarily be json.dumps that you overload, but some internal function of the json module; making it even less intuitive and easily discoverable)
If help(json.dumps) includes a small blurb about __json__, it makes the information at least as easily discoverable as invoking dir(json.dumps). Besides, I don't find it shocking if documentation problems have to be solved through documentation. Regards Antoine.

On Thu, Jul 29, 2010 at 5:39 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I'd be -1 on an __json__ method. From the perspective of someone who works on Django, the big issue we have is how do you specify how a model (something from the database) should be serialized to JSON. Often people suggest something like __json__ that the Django serializer (which uses the json module) could pick up on, however this is usually rejected: objects tend to have multiple serializations based on context. Unlike pickle, which is usually used for internal consumption, json is usually intended for the wide world, and generally you want to expose different data to different clients. For example an event's json might include a list of attendees for an authenticated client, but an unauthenticated client should only see a list of titles. For this reason Django has always rejected such an approach, in favor of having a per-serialization specification. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law." -- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me

Antoine Pitrou wrote:
Sure, generic functions as a paradigm appear more powerful, more decoupled, etc.
In this case there's a sense in which using a generic function could be seen as *increasing* coupling. Suppose I write a class Foo, and as a convenience to my users, I want to give it the ability to be json-serialised. If that is done using a generic function, then I need to put a call in my module to register it. But that makes my module dependent on the json-serialising module, even for applications which don't use json at all. The alternative is just to provide the function but don't register it. But using that approach, every application that *does* use json would be responsible for registering all the functions for all the classes that need to be serialised, including those in library modules that it may not be directly aware of. This doesn't seem like a good situation either. -- Greg

On Sat, Jul 31, 2010 at 11:17 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Hence why most generic function proposals are accompanied by proposals for lazy module import hooks (i.e. delaying the registration until the relevant module is imported). However, the simpler approach is just to recommend that single-dispatch generic functions default to a particular method. "magic method" vs "generic function" isn't actually an either-or decision: it is quite possible to have the latter rely on the former in its default "unrecognised type" implementation, while still providing the type registration infrastructure that allows an application to say "no, I don't want that behaviour in this case, I want to do something different". To be honest, there are actually some more features I would want to push for in ABCs (specifically, a public API to view an ABC's type registry, as well as a callback API to be notified of registration changes) before seriously proposing an official generic function implementation in the standard library. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
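A small sketch of the "generic function that defaults to a magic method" combination described here, again using functools.singledispatch (a later stdlib addition) as a stand-in. The `to_jsonable` function and the `Attendee` class are hypothetical.

```python
import functools
import json


@functools.singledispatch
def to_jsonable(obj):
    # Unrecognised type: fall back to the object's own __json__ method.
    method = getattr(obj, "__json__", None)
    if method is None:
        raise TypeError(f"{type(obj).__name__} is not JSON serializable")
    return method()


class Attendee:
    """Hypothetical library class that ships its own __json__."""

    def __init__(self, name, email):
        self.name = name
        self.email = email

    def __json__(self):
        return {"name": self.name, "email": self.email}


nick = Attendee("Nick", "nick@example.org")
default_output = json.dumps(nick, default=to_jsonable)


# An application that wants different behaviour registers an override.
@to_jsonable.register(Attendee)
def _(obj):
    return {"name": obj.name}


overridden_output = json.dumps(nick, default=to_jsonable)
print(default_output)
print(overridden_output)
```

The class author gets a zero-dependency default, and the application keeps the last word through the registry.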

On Sat, Jul 31, 2010 at 3:31 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: ...
Funny coincidence: I was suggesting to the PEP 3119 authors that the _abc_registry attribute be exposed somehow. Do you have an idea of how this could be done without forcing ABC subclasses to have a new public method? Maybe a separate function?

Thank you for explaining generic functions so clearly. Is there a good module out there implementing them without “crazily complex overloading schemes”? Regards

On Fri, Jul 30, 2010 at 8:46 AM, Éric Araujo <merwok@netwok.org> wrote:
I'm not sure. Most of my exposure to generic functions is through PJE and he's a big fan of pushing them to their limits (hence RuleDispatch and PEAK-rules). There is an extremely bare bones implementation used internally by pkgutil's emulation of the standard import process, but Guido has said we shouldn't document or promote that in any official way without a PEP (cf. the simple proposal in http://bugs.python.org/issue5135 and PJE's previous more comprehensive proposal in PEP 3124).

As others have explained more clearly than I did, generic functions work better than magic methods when the same basic operation (e.g. pretty printing, JSON serialisation) is common to many object types, but the details may vary between applications, or even within a single application. By giving the application more control over how different types are handled (through the generic functions' separate type registries) it is much easier to have context dependent behaviour, while still fairly easily sharing code in cases where it makes sense.

E.g. to use Alex Gaynor's example of attendee list serialisation and the issue 5135 syntax:

    @functools.simplegeneric
    def json_unauthenticated(obj):
        return json.dumps(obj)  # Default to a basic dumps() call

    @functools.simplegeneric
    def json_authenticated(obj):
        return json_unauthenticated(obj)  # Default to being the same as unauthenticated info

    @json_unauthenticated.register(EventAttendees)
    def attendee_titles(attendees):
        return json.dumps([attendee.title for attendee in attendees])

    @json_authenticated.register(EventAttendees)
    def attendee_details(attendees):
        return json.dumps([attendee.full_details() for attendee in attendees])

(Keep in mind that I don't use JSON, so there are likely plenty of details wrong with the above, but it should give a basic idea of what generic functions are designed to support). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
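The example above targets the `functools.simplegeneric` name proposed in issue 5135, which is not an existing stdlib function. A runnable approximation, assuming functools.singledispatch (added later, in Python 3.4) as a stand-in, could look like the following; the `Attendee` and `EventAttendees` classes are invented to make it self-contained.

```python
import functools
import json


class Attendee:
    """Hypothetical attendee record."""

    def __init__(self, title, name):
        self.title = title
        self.name = name

    def full_details(self):
        return {"title": self.title, "name": self.name}


class EventAttendees(list):
    """Hypothetical container type standing in for the one in the example."""


@functools.singledispatch
def json_unauthenticated(obj):
    return json.dumps(obj)  # default to a basic dumps() call


@functools.singledispatch
def json_authenticated(obj):
    return json_unauthenticated(obj)  # default: same as unauthenticated


@json_unauthenticated.register(EventAttendees)
def attendee_titles(attendees):
    return json.dumps([a.title for a in attendees])


@json_authenticated.register(EventAttendees)
def attendee_details(attendees):
    return json.dumps([a.full_details() for a in attendees])


attendees = EventAttendees([Attendee("Dr", "Ada"), Attendee("Prof", "Bob")])
print(json_unauthenticated(attendees))
print(json_authenticated(attendees))
print(json_authenticated([1, 2]))  # unregistered type uses the default chain
```

Each context gets its own registry, so the same type can be serialised differently for authenticated and unauthenticated clients without the type itself knowing about either.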

There is an extremely bare bones implementation used internally by pkgutil's emulation of the standard import process
Ah, I stumbled upon that this week actually, but did not understand how it worked nor why it was useful, since there’s only one decorated function and only one registered type. Thanks for pointing it out, I may play with it to get a better understanding and see the possibilities. Regards

On 30 July 2010 11:39, Nick Coghlan <ncoghlan@gmail.com> wrote:
I really like Alex Gaynor's simple MultiMethod implementation. From: http://alexgaynor.net/2010/jun/26/multimethods-python/ It doesn't have a concept of a default call, but that would be very easy to add. Basic usage is:

    json_unauthenticated = MultiMethod()

    @json_unauthenticated.register(EventAttendees)
    def json_unauthenticated(attendees):
        return json.dumps([attendee.title for attendee in attendees])

    @json_unauthenticated.register(OtherType)
    def json_unauthenticated(othertypes):
        return json.dumps(othertypes)

And so on. Michael

You can check out my implementation of generic functions and methods in Python [1]. There are no byte code hacks and no frame introspection, and it supports function and method dispatching on one or more positional arguments. [1]: pypi.python.org/pypi/generic
-- Andrey Popp phone: +7 911 740 24 91 e-mail: 8mayday@gmail.com

Thanks Andrey, I’ll play with it when I take the time to dive into generic functions. Regards

+1 on your entire diagnosis. If peak-rules is too complicated and perhaps unmaintained then the focus should be on cooking up a better generic function library. The complaints against peak-rules come up frequently enough, and this shows that there is a need for a generic function library, because people do use peak-rules. The problem is not with the concept but only (perhaps) with this particular implementation (disclaimer: I'm perfectly happy with peak-rules). Cheers, Daniel -- Psss, psss, put it down! - http://www.cafepress.com/putitdown

On Thu, Jul 29, 2010 at 2:51 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That makes sense. OTOH, if we drop the idea of having a __magical__ method, we could have a collections ABC instead, called JSONSerializable, with one method to override. This is more about declaring the interface than adding yet another __magic__ method. That's a nice OOP pattern to have, imho. Cheers Tarek

On Thu, 29 Jul 2010 15:25:20 +0200 Tarek Ziadé <ziade.tarek@gmail.com> wrote:
Python is supposed to be duck-typed. It would be strange to add a couple of random exceptions to that general rule. Moreover, having to *both* derive an existing class and implement the single method defined on that class is one complication too many. And I don't see how `__json__` is more annoying than e.g. `to_json`. Regards Antoine.

On Thu, Jul 29, 2010 at 3:34 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Not sure I follow here, since ABCs are about having an object support a series of methods no matter what the parent classes are. I.e. this is closer to the concept of "interfaces". IOW you don't need to derive from a parent class, you just need to provide a given set of methods, and an ABC provides a way to check that an object has that signature. See: http://docs.python.org/library/collections.html#abcs-abstract-base-classes ABCs are the modern duck typing, I'd say :)
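A sketch of what such an ABC might look like, following the `__subclasshook__` pattern used by the collections ABCs. The name of the single method is not given in the thread; `to_json` is a placeholder borrowed from the surrounding discussion, and the `dumps` wrapper is likewise hypothetical.

```python
import abc
import json


class JSONSerializable(abc.ABC):
    @abc.abstractmethod
    def to_json(self):
        """Return a JSON-friendly object (placeholder method name)."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is JSONSerializable:
            # Structural check: any class defining to_json qualifies.
            if any("to_json" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented


class Event:  # note: does not inherit from JSONSerializable
    def __init__(self, title):
        self.title = title

    def to_json(self):
        return {"title": self.title}


def dumps(obj):
    # Hypothetical wrapper showing how a serializer could use the ABC.
    if isinstance(obj, JSONSerializable):
        obj = obj.to_json()
    return json.dumps(obj)


print(isinstance(Event("PyCon"), JSONSerializable))
print(dumps(Event("PyCon")))
```

Because the check is purely name-based, any pre-existing class that happens to define `to_json` for another purpose would also pass `isinstance`.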

Le jeudi 29 juillet 2010 à 15:42 +0200, Tarek Ziadé a écrit :
Ok, but then how does it avoid having a __magic__ method? You can't use a normal name such as "to_json" because then an existing class with that method could be wrongly inferred as implementing your new ABC, and break existing code. Besides, defining an ABC for a single, module-specific method sounds rather overkill. This reminds me of projects plagued by an overuse of interfaces for every possible concept. Regards Antoine.

Well, it could be argued that testing for an Iterator is useful for a significant variety of code, while testing for a JSONSerializable doesn't have a use case outside of the json module itself. Besides, it remains to be seen if anyone will use the Iterator ABC instead of directly looking up the __next__ method. I'm not convinced that all of the ABCs bundled with the stdlib are really useful, apart from showcasing the potentialities of ABCs. Regards Antoine.

Have a look at turbojson [1], the jsonification package that uses peak.rules [2] and which comes with turbogears [3]. It does exactly what you propose. Cheers, Daniel
[1a] http://pypi.python.org/pypi/TurboJson
[1b] http://svn.turbogears.org/projects/TurboJson
[2a] pypi.python.org/pypi/PEAK-Rules
[2b] http://peak.telecommunity.com/DevCenter/RulesReadme
[3] http://www.turbogears.org
-- Psss, psss, put it down! - http://www.cafepress.com/putitdown

Why?
Also, AFAIK, TurboGears have stopped using turbojson and relies on [simple]json instead.
That might be true for turbogears2 but turbogears1 (which is still in active development) still uses turbojson. Turbogears 1 and 2 diverged so much that it would be more appropriate to call them different names and consider them different projects (I personally use and prefer tg1). Cheers, Daniel -- Psss, psss, put it down! - http://www.cafepress.com/putitdown

On Thu, 29 Jul 2010 15:19:56 +0200 Daniel Fetchinson <fetchinson@googlemail.com> wrote:
That it uses PEAK-Rules is probably a good reason to avoid it.
Why?
I might be mistaken, but it seems to me that it isn't maintained anymore (or perhaps that's RuleDispatch, which is from the same author). It doesn't seem to have had a stable release in years. Regards Antoine.

On Thu, Jul 29, 2010 at 10:47 PM, Daniel Fetchinson <fetchinson@googlemail.com> wrote:
Speaking of PJE and generic functions* ;) Cheers, Nick. *For those following along at home that may not be familiar with the names of various Python developers, PJE is Phillip J. Eby, the author of peak.rules (amongst many other things). -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 7/29/2010 9:13 AM, Antoine Pitrou wrote:
On 7/29/2010 9:28 AM, Antoine Pitrou wrote:
Sounds like you are damning the man and just chucking the concept and his projects in along with him.

    URL: svn://svn.eby-sarna.com/svnroot/PEAK-Rules
    Last Changed Date: 2009-07-15 00:30:57 -0400 (Wed, 15 Jul 2009)

    r2600 | pje | 2009-07-15 00:30:57 -0400 (Wed, 15 Jul 2009) | 2 lines
    Fix for Python 2.6 DeprecationWarning

    PEAK-Rules-0.5a1.dev-r2600.tar.gz  29-Jul-2010 04:22  93K

It's unclear to me that not having been changed in a year constitutes "unmaintained" especially since PJE seems quite responsive on the PEAK mailing list. So, please restrain yourself unless you have something more to say than FUD about PJE and PEAK-Rules. -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu

On Thu, 29 Jul 2010 10:29:34 -0400 Scott Dial <scott+python-ideas@scottdial.com> wrote:
Well, sorry if I mixed up PEAK-Rules and RuleDispatch (which, again, are similar libraries from the same author). The fact that RuleDispatch has been unmaintained, though, has been a source of problems for some people and projects. Regards Antoine.

On Thu, Jul 29, 2010 at 7:35 AM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
Since there isn't really any magic going on, why use a __foo__ name? The majority of __foo__ names are for things you shouldn't reference yourself, but it doesn't seem like this is too personal a method to do that with. This allows inheritance of JSONization. The current custom serialization stuff does not. I'm not certain which is the bug and which is the feature. Since you aren't using anything useful from the json module, why involve it at all? Consistent API? One nice thing about the json module is that when using it you always produce valid JSON. Even the hooks for custom serialization keep this property. This is fairly nice to have. Regards, Mike

Mike Graham wrote:
To my mind, the main reason is to avoid name clashes. Protocol methods often may need to be added to just about any class, and using a __foo__ name greatly reduces the chance of it coinciding with some pre-existing class-specific method. Anyway, you don't call it yourself in this case either -- it's called by the proposed json-serialising framework. -- Greg

I'm uncomfortable with the __foo__ style proposed. Details and "what I would do" below.

On 30Jul2010 12:06, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
| Mike Graham wrote:
| >Since there isn't really any magic going on, why use a __foo__ name?
| >The majority of __foo__ names are for things you shouldn't reference
| >yourself
|
| To my mind, the main reason is to avoid name clashes. Protocol
| methods often may need to be added to just about any class,
| and using a __foo__ name greatly reduces the chance of it
| coinciding with some pre-existing class-specific method.

Might not the adder of a class specific method make the same argument? If they really want a class _specific_ method, ought they not to be using the __foo style, thus avoiding clashes anyway? The __json__ name makes me uncomfortable; to my mind __foo__ names belong to the language in order to implement/override stuff like [], not to a library hook.

| Anyway, you don't call it yourself in this case either -- it's
| called by the proposed json-serialising framework.

I'm curious; what's the special benefit to JSON here? I don't mean JSON is unpopular or horrible, but I can see people going to a __xml__ hook for a proposed XML serialisation framework, and __sql__ for db storage, and ...

I'm doing a little serialisation myself for another purpose. My code gets classes that want serialisation to register themselves with the serialisation module thus:

    # DB is a NodeDB instance, which can store various objects
    DB.register_type(class, tobytes, frombytes)

where class is the class desiring special serialisation and tobytes and frombytes are callables; tobytes takes an instance of the class and returns the byte serialisation and frombytes does the reverse. No special names needed and no __foo__ special name reservation. Why wouldn't one just extend the json module with a "serialise this" and "unserialise this" type registry?
Cheers, -- Cameron Simpson <cs@zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/ You can listen to what everybody says, but the fact remains that you've got to get out there and do the thing yourself. - Joan Sutherland
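As an illustration of the "serialise this" and "unserialise this" registry suggested above, here is a minimal sketch layered on the json module's existing `default` and `object_hook` parameters. The function names and the `"__type__"` tagging convention are assumptions made for the example; they are not part of the json module or of the NodeDB code mentioned in the message.

```python
import json

# Hypothetical registry: type name -> (class, to_jsonable, from_jsonable)
_registry = {}


def register_type(klass, to_jsonable, from_jsonable):
    _registry[klass.__name__] = (klass, to_jsonable, from_jsonable)


def dumps(obj):
    def default(o):
        entry = _registry.get(type(o).__name__)
        if entry is None or entry[0] is not type(o):
            raise TypeError(f"{type(o).__name__} is not registered")
        # Tag the payload so loads() knows which converter to call.
        return {"__type__": type(o).__name__, "value": entry[1](o)}

    return json.dumps(obj, default=default)


def loads(text):
    def hook(d):
        # Called for every decoded JSON object, innermost first.
        entry = _registry.get(d.get("__type__"))
        if entry is None:
            return d
        return entry[2](d["value"])

    return json.loads(text, object_hook=hook)


register_type(complex, lambda c: [c.real, c.imag], lambda v: complex(*v))
text = dumps({"z": 1 + 2j})
print(text)
print(loads(text))
```

Unlike a bare `__json__` hook, the tag makes a round trip possible, at the cost of agreeing on a tagging convention that other JSON consumers would need to understand.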

On Thu, Jul 29, 2010 at 7:35 AM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
In my experience serializing an object is usually not a concern of the object itself. I do not want to have to touch every object in my system when I need an alternate format. The pattern I currently use is to hint, as a class-level tuple, the fields that should be serialized. django-piston has a good working example of this pattern. It becomes a bit unruly when you have a big object graph, but I typically keep my object models shallow. -- David blog: http://www.traceback.org twitter: http://twitter.com/dstanek
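The class-level hint pattern described here might be sketched as follows. The attribute name `json_fields`, the `Event` class and the `to_jsonable` helper are all hypothetical; this does not reproduce django-piston's actual API.

```python
import json


class Event:
    # Hypothetical class-level hint: the fields exposed when serializing.
    json_fields = ("title", "city")

    def __init__(self, title, city, internal_notes):
        self.title = title
        self.city = city
        self.internal_notes = internal_notes


def to_jsonable(obj):
    # The serializer, not the object, decides how the hint is used.
    fields = getattr(type(obj), "json_fields", None)
    if fields is None:
        raise TypeError(f"{type(obj).__name__} is not JSON serializable")
    return {name: getattr(obj, name) for name in fields}


print(json.dumps(Event("PyCon", "Atlanta", "secret"), default=to_jsonable))
```

The object only declares data; adding another output format means writing another function that reads the same tuple, without touching every class.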

On Thu, Jul 29, 2010 at 01:35:41PM +0200, Tarek Ziad? wrote:
Also there must be a deserialization hook. Pickle uses __setstate__, and pickle stores the name of the class to call __setstate__ upon. Oleg. -- Oleg Broytman http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.

On Thu, Jul 29, 2010 at 1:40 PM, Oleg Broytman <phd@phd.pp.ru> wrote:
You cannot do a round trip because once the object is serialized, json don't know which class to instantiate to de-serialize it Which is fine really, since json just serialize simple elements. Cheers Tarek -- Tarek Ziadé | http://ziade.org

Antoine Pitrou wrote:
How would you then write a class that works with both pickle and json ? IMO, we'd need a separate method to return a JSON version of the object, e.g. .__json__(). I'm not sure how deserialization could be handled, since JSON doesn't support arbitrary object types. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010)
::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On Thu, Jul 29, 2010 at 1:54 PM, M.-A. Lemburg <mal@egenix.com> wrote:
As I told Oleg, I think its OK not to have a round trip like Pickle. The use case I have is to express a structure in Json, but loading it back can be done in a custom, explicit process. It cannot be triggered from the json package itself since it cannot know that a given Json structure was built through a specific class. Cheers Tarek
-- Tarek Ziadé | http://ziade.org

Tarek Ziadé wrote:
I just wanted to emphasize that a separate new method is needed, rather than trying to reuse a pickle-protocol method. I don't think deserialization support is needed either. The application getting the decoded JSON data can do that in an application specific way based on the lists and dictionaries it gets from the JSON decoder. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010)
::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On Thu, Jul 29, 2010 at 7:54 AM, M.-A. Lemburg <mal@egenix.com> wrote:
+1. I think this is a very sensible idea. Note that Tarek's request was not for a magic method like __repr__ that would return an easy to parse string. Instead, the request was for a method that would return an object that can be serialized instead of the original object and will carry enough data to restore the original object.
How would you then write a class that works with both pickle and json ?
Hopefully, for most types json would be able to use an unmodified __reduce__ method. If this is not enough, the reduce protocol already has an extension mechanism. For example, an object may implement obj.__reduce_ex__('json') that would return a json-friendly tuple instead of the pickle-oriented obj.__reduce__().
I am afraid this was the turning point in this thread after which the discussion went (IMO) in the wrong direction. Again, the OP's request was for a method that would return an object that json or another simple serializer (say yaml) could handle, not for a method that will return a json string.

On 29 Jul, 2010, at 17:28, Alexander Belopolsky wrote:
I'm -1 on this because the __reduce__ protocol and the proposed __json__ protocol have slightly different purposes. When I use JSON I generally only publish part of the object-state into JSON, even when pickling the object would store the entire state. Another reason for not sharing the same method for pickling and json serialisation is that the json side may have external constraints (that is, the consumer of the JSON data may have requirements on how objects are serialized) and those constraints should not limit how the object can be pickled. Ronald

On Thu, Jul 29, 2010 at 1:41 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Maybe that's because I've never used it, but I find this protocol very complex for this simple use case.
-- Tarek Ziadé | http://ziade.org

Am 29.07.2010 13:35, schrieb Tarek Ziadé:
You can do this with a very short subclass of the JSONEncoder:
    class MyJSONEncoder(JSONEncoder):
        def default(self, obj):
            return obj.__json__()  # with a useful failure message
I don't think it needs to be built into the default encoder. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
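
A minimal runnable sketch of the encoder-subclass approach described above. The names `MagicJSONEncoder` and `Point` are illustrative assumptions, and `__json__` is a hypothetical hook: the standard `json` module does not look for it on its own.

```python
import json


class MagicJSONEncoder(json.JSONEncoder):
    """Encoder that consults a hypothetical __json__ hook on unknown objects."""

    def default(self, obj):
        # default() is only called for objects json cannot serialise natively.
        hook = getattr(type(obj), "__json__", None)
        if hook is None:
            raise TypeError(
                f"{type(obj).__name__} is not JSON serializable "
                "and defines no __json__ method"
            )
        # The hook returns a plain structure (dict, list, str, ...), not a string.
        return hook(obj)


class Point:
    """Illustrative class; not part of the original thread."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __json__(self):
        return {"x": self.x, "y": self.y}


encoded = json.dumps({"origin": Point(0, 0)}, cls=MagicJSONEncoder)
```

Every caller has to pass `cls=MagicJSONEncoder` (or an equivalent `default=` function) explicitly; a plain `json.dumps(Point(0, 0))` still raises `TypeError`.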

Georg Brandl wrote:
Does that also work with the JSON C extension ?
I don't think it needs to be built into the default encoder.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010)

Am 29.07.2010 14:31, schrieb M.-A. Lemburg:
I think so. The C encoder gets the default function as an argument. Georg

Georg Brandl wrote:
Then that sounds like the right way forward. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010)

On Thu, Jul 29, 2010 at 2:22 PM, Georg Brandl <g.brandl@gmx.net> wrote: ..
Yes, but in that case you need to customize the encoding process and own it. Having built-in recognition of __json__ would allow you to pass your objects to any third-party code that uses a plain json.dumps. For instance, some web kits out there will automatically serialize your objects into JSON strings when you want to return JSON responses. I.e., it becomes a built-in adapter. Cheers Tarek

On Thu, Jul 29, 2010 at 10:39 PM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
I'll channel PJE here and point out that this kind of magic-method based protocol proliferation is exactly what a general purpose generic-function implementation is designed to avoid (i.e. instead of having json.dumps check for a __json__ magic method, you'd just flag json.dumps as a generic function and let people register their own overloads). Each individual time this question comes up people tend to react with "oh, that's too complicated and overkill, but magic methods are simple, so let's just define another magic method". The sum total of all those magic methods starts to accumulate into a lot of complexity of its own though :P Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, 29 Jul 2010 22:51:11 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
I don't agree. __json__ only matters to people who do JSON encoding/decoding. Other people can safely ignore it. And I don't see how generic functions bring less cognitive overhead. (they actually bring more of it, since most implementations are more complicated to begin with)

On Thu, Jul 29, 2010 at 11:11 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Which is exactly the attitude I was talking about: for each individual case, people go "oh, I understand magic methods, those are easy". It's the overall process of identifying the need for and gathering consensus on magic methods that is unwieldy (and ultimately fails to scale, leading to non-extensible interfaces by default, with pretty printing being the classic example, and JSON serialisation the latest).
Mostly because the fully fledged generic implementations like PEAK-rules tend to get brought into discussions when they aren't needed. Single-type generic dispatch is actually so common they gave it a name: object-oriented programming.
All single-type generic dispatch is about is having a registry for a particular operation that says "to perform this operation, with objects of this type, use this function". Instead of having a protocol that says "look up this magic method in the object's own namespace" (which requires a) agreement on the magic name to use and b) that the original author of the type in question both knew and cared about the operation the application developer is interested in) you instead have a protocol that says "here is a standard mechanism for declaring a type registry for a function, so you only have to learn how to register a function once".
Is it really harder for people to learn how to write things like:
    json.dumps.overload(mytype, mytype.to_json)
    json.dumps.overload(third_party_type, my_third_party_type_serialiser)
than it is for them to figure out that implementing a __json__ method will allow them to change how their object is serialised? (Not to mention that a __json__ method can only be used via monkey-patching if the type you want to serialise differently came from a library module rather than your own code).
The generic function registration approach is incidentally discoverable via dir(json.dumps) to see that a function provides the relevant generic function registration methods. Magic method protocols can *only* be discovered by reading documentation.
Function registration is a solved problem, with much better solutions than the ad hoc YAMM (yet-another-magic-method) approach we currently use. We just keep getting scared away from the right answer by the crazily complex overloading schemes that libraries like PEAK-rules allow.
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
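
To make the "registry for a particular operation" idea concrete, here is a small hand-rolled sketch. `TypeRegistry`, `Money` and `to_jsonable` are hypothetical names; the `overload` method only mirrors the spelling used in the message, and no such method exists on the real `json.dumps`. `fractions.Fraction` stands in for a type the application cannot edit.

```python
import json
from fractions import Fraction  # stands in for a "third-party" type


class TypeRegistry:
    """Hypothetical single-dispatch registry: type -> conversion function."""

    def __init__(self):
        self._handlers = {}

    def overload(self, cls, func):
        # "To perform this operation, with objects of this type, use this function."
        self._handlers[cls] = func

    def __call__(self, obj):
        # Walk the MRO so a handler registered for a base class also
        # applies to its subclasses.
        for cls in type(obj).__mro__:
            if cls in self._handlers:
                return self._handlers[cls](obj)
        raise TypeError(f"no JSON handler registered for {type(obj).__name__}")


class Money:
    """Illustrative application type."""

    def __init__(self, amount, currency):
        self.amount, self.currency = amount, currency

    def to_json(self):
        return {"amount": self.amount, "currency": self.currency}


to_jsonable = TypeRegistry()
to_jsonable.overload(Money, Money.to_json)
# No monkey-patching needed for a type defined elsewhere.
to_jsonable.overload(Fraction, lambda f: [f.numerator, f.denominator])

result = json.dumps([Money(5, "EUR"), Fraction(1, 3)], default=to_jsonable)
```

The registry is plugged in through the existing `default=` parameter of `json.dumps`, so this sketch needs no change to the `json` module itself.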

Why do you want to gather consensus? There is a single json serialization module in the stdlib and it's obvious that __json__ can/should be claimed by that module. Actually, your argument could be returned: if you use generic functions (such as @json.dumps.overload), alternative json serializers won't easily be able to make use of the information, while they could access the __json__ method like the standard json module does.
It is certainly more annoying and less natural than: def __json__(self): .... Sure, generic functions as a paradigm appear more powerful, more decoupled, etc. But in practice __magic__ methods are sufficient for most uses. Practicality beats purity. That may be why in all the years that the various generic functions libraries have existed, they don't seem to have been really popular compared to the simpler convention of defining fixed method names. (besides, it wouldn't necessarily be json.dumps that you overload, but some internal function of the json module; making it even less intuitive and easily discoverable)
If help(json.dumps) includes a small blurb about __json__, it makes the information at least as easily discoverable as invoking dir(json.dumps). Besides, I don't find it shocking if documentation problems have to be solved through documentation. Regards Antoine.

On Thu, Jul 29, 2010 at 5:39 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I'd be -1 on an __json__ method. From the perspective of someone who works on Django, the big issue we have is how do you specify how a model (something from the database) should be serialized to JSON. Often people suggest something like __json__ that the Django serializer (which uses the json module) could pick up on, however this is usually rejected: objects tend to have multiple serializations based on context. Unlike pickle, which is usually used for internal consumption, json is usually intended for the wide world, and generally you want to expose different data to different clients. For example an event's json might include a list of attendees for an authenticated client, but an unauthenticated client should only see a list of titles. For this reason Django has always rejected such an approach, in favor of having a per-serialization specification. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law." -- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me

Antoine Pitrou wrote:
Sure, generic functions as a paradigm appear more powerful, more decoupled, etc.
In this case there's a sense in which using a generic function could be seen as *increasing* coupling. Suppose I write a class Foo, and as a convenience to my users, I want to give it the ability to be json-serialised. If that is done using a generic function, then I need to put a call in my module to register it. But that makes my module dependent on the json-serialising module, even for applications which don't use json at all. The alternative is just to provide the function but not register it. But using that approach, every application that *does* use json would be responsible for registering all the functions for all the classes that need to be serialised, including those in library modules that it may not be directly aware of. This doesn't seem like a good situation either. -- Greg

On Sat, Jul 31, 2010 at 11:17 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Hence why most generic function proposals are accompanied by proposals for lazy module import hooks (i.e. delaying the registration until the relevant module is imported). However, the simpler approach is just to recommend that single-dispatch generic functions default to a particular method. "magic method" vs "generic function" isn't actually an either-or decision: it is quite possible to have the latter rely on the former in its default "unrecognised type" implementation, while still providing the type registration infrastructure that allows an application to say "no, I don't want that behaviour in this case, I want to do something different". To be honest, there are actually some more features I would want to push for in ABCs (specifically, a public API to view an ABC's type registry, as well as a callback API to be notified of registration changes) before seriously proposing an official generic function implementation in the standard library. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
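
The combination described above, a generic function whose default implementation falls back to a magic method, might look roughly like this. The sketch uses `functools.singledispatch`, which was added to the standard library after this thread; `to_jsonable`, `Event` and the `__json__` hook are assumptions made for illustration.

```python
import json
from functools import singledispatch


@singledispatch
def to_jsonable(obj):
    """Default for unregistered types: defer to a hypothetical __json__ hook."""
    hook = getattr(type(obj), "__json__", None)
    if hook is None:
        raise TypeError(f"cannot serialise {type(obj).__name__}")
    return hook(obj)


class Event:
    """Illustrative library class that ships its own default serialisation."""

    def __init__(self, title, attendees):
        self.title, self.attendees = title, attendees

    def __json__(self):
        return {"title": self.title, "attendees": self.attendees}


event = Event("PyCon", ["alice", "bob"])

# No registration yet: the magic method decides.
by_magic_method = json.dumps(event, default=to_jsonable)


# The application overrides that behaviour without touching Event itself.
@to_jsonable.register(Event)
def _(event):
    return {"title": event.title}


by_registration = json.dumps(event, default=to_jsonable)
```

A class author only has to write `__json__`, with no import of the serialising module, while an application can still register a different function for the same type.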

On Sat, Jul 31, 2010 at 3:31 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: ...
Funny coincidence: I was suggesting to the PEP 3119 authors that the _abc_registry attribute be exposed somehow. Do you have an idea of how this could be done without forcing ABC subclasses to have a new public method? Maybe a separate function? Like
-- Tarek Ziadé | http://ziade.org

Thank you for explaining generic functions so clearly. Is there a good module out there implementing them without “crazily complex overloading schemes”? Regards

On Fri, Jul 30, 2010 at 8:46 AM, Éric Araujo <merwok@netwok.org> wrote:
I'm not sure. Most of my exposure to generic functions is through PJE and he's a big fan of pushing them to their limits (hence RuleDispatch and PEAK-rules). There is an extremely bare bones implementation used internally by pkgutil's emulation of the standard import process, but Guido has said we shouldn't document or promote that in any official way without a PEP (cf. the simple proposal in http://bugs.python.org/issue5135 and PJE's previous more comprehensive proposal in PEP 3124).
As others have explained more clearly than I did, generic functions work better than magic methods when the same basic operation (e.g. pretty printing, JSON serialisation) is common to many object types, but the details may vary between applications, or even within a single application. By giving the application more control over how different types are handled (through the generic functions' separate type registries) it is much easier to have context dependent behaviour, while still fairly easily sharing code in cases where it makes sense.
E.g. to use Alex Gaynor's example of attendee list serialisation and the issue 5135 syntax:
    @functools.simplegeneric
    def json_unauthenticated(obj):
        return json.dumps(obj)  # Default to a basic dumps() call
    @functools.simplegeneric
    def json_authenticated(obj):
        return json_unauthenticated(obj)  # Default to being the same as unauthenticated info
    @json_unauthenticated.register(EventAttendees)
    def attendee_titles(attendees):
        return json.dumps([attendee.title for attendee in attendees])
    @json_authenticated.register(EventAttendees)
    def attendee_details(attendees):
        return json.dumps([attendee.full_details() for attendee in attendees])
(Keep in mind that I don't use JSON, so there are likely plenty of details wrong with the above, but it should give a basic idea of what generic functions are designed to support).
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
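
For readers who want to run the example, here is a hedged adaptation using `functools.singledispatch`, a later standard-library relative of the issue 5135 proposal (`functools.simplegeneric` itself is not something you can import). The `Attendee` and `EventAttendees` classes are invented here, since the thread never defines them.

```python
import json
from functools import singledispatch


class Attendee:
    """Illustrative record type; an assumption, not from the thread."""

    def __init__(self, title, email):
        self.title, self.email = title, email

    def full_details(self):
        return {"title": self.title, "email": self.email}


class EventAttendees(list):
    """Illustrative container type used as the dispatch key."""


@singledispatch
def json_unauthenticated(obj):
    return json.dumps(obj)  # default: a basic dumps() call


@singledispatch
def json_authenticated(obj):
    return json_unauthenticated(obj)  # default: same as the unauthenticated view


@json_unauthenticated.register(EventAttendees)
def attendee_titles(attendees):
    return json.dumps([attendee.title for attendee in attendees])


@json_authenticated.register(EventAttendees)
def attendee_details(attendees):
    return json.dumps([attendee.full_details() for attendee in attendees])


attendees = EventAttendees([Attendee("Dr", "a@example.org")])
public_view = json_unauthenticated(attendees)
private_view = json_authenticated(attendees)
plain_view = json_authenticated([1, 2])  # unregistered type: falls through to dumps()
```

Each context gets its own registry, so the same type can be serialised differently for authenticated and unauthenticated clients without either view being baked into the class.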

There is an extremely bare bones implementation used internally by pkgutil's emulation of the standard import process
Ah, I stumbled upon that this week actually, but did not understand how it worked nor why it was useful since there’s only one decorated function and only one registered type. Thanks for pointing it out, I may play with it to get a better understanding and see the possibilities. Regards

On 30 July 2010 11:39, Nick Coghlan <ncoghlan@gmail.com> wrote:
I really like Alex Gaynor's simple MultiMethod implementation. From: http://alexgaynor.net/2010/jun/26/multimethods-python/ It doesn't have a concept of a default call, but that would be very easy to add. Basic usage is:
    json_unauthenticated = MultiMethod()
    @json_unauthenticated.register(EventAttendees)
    def json_unauthenticated(attendees):
        return json.dumps([attendee.title for attendee in attendees])
    @json_unauthenticated.register(OtherType)
    def json_unauthenticated(othertypes):
        return json.dumps(othertypes)
And so on. Michael

You can check out my implementation of generic functions and methods in Python [1]. There are no byte code hacks, no frame introspection, support for function and method dispatching by one or more positional arguments. [1]: pypi.python.org/pypi/generic On Fri, Jul 30, 2010 at 2:46 AM, Éric Araujo <merwok@netwok.org> wrote:
-- Andrey Popp phone: +7 911 740 24 91 e-mail: 8mayday@gmail.com

Thanks Andrey, I’ll play with it when I take the time to dive into generic functions. Regards

+1 on your entire diagnosis. If peak-rules is too complicated and perhaps unmaintained then the focus should be on cooking up a better generic function library. The complaints against peak-rules come up frequently enough, and this shows that there is a need for a generic function library because people do use peak-rules. The problem is not with the concept but only (perhaps) with this particular implementation (disclaimer: I'm perfectly happy with peak-rules). Cheers, Daniel -- Psss, psss, put it down! - http://www.cafepress.com/putitdown

On Thu, Jul 29, 2010 at 2:51 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That makes sense. OTOH, if we drop the idea of having a __magical__ method, we could have a collections ABC instead, called JSONSerializable, with one method to override. This is more about declaring the interface than adding yet another __magic__ method. That's a nice OOP pattern to have, IMHO. Cheers Tarek

On Thu, 29 Jul 2010 15:25:20 +0200 Tarek Ziadé <ziade.tarek@gmail.com> wrote:
Python is supposed to be duck-typed. It would be strange to add a couple of random exceptions to that general rule. Moreover, having to *both* derive an existing class and implement the single method defined on that class is one complication too many. And I don't see how `__json__` is more annoying than e.g. `to_json`. Regards Antoine.

On Thu, Jul 29, 2010 at 3:34 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Not sure I follow here, since ABCs are about having an object support a series of methods no matter what its parent classes are, i.e. this is closer to the concept of "interfaces". IOW you don't need to derive from a parent class, you just need to provide a given set of methods, and an ABC provides a way to check that an object has that signature. See: http://docs.python.org/library/collections.html#abcs-abstract-base-classes ABCs are the modern duck typing, I'd say :)
-- Tarek Ziadé | http://ziade.org
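
Whether a class must inherit from an ABC depends on how the ABC is written. The sketch below assumes a hypothetical `JSONSerializable` ABC with a single `to_json` method (the thread does not name the method) and uses `__subclasshook__` so that any class providing that method passes the `isinstance` check without deriving from it.

```python
from abc import ABC, abstractmethod


class JSONSerializable(ABC):
    """Hypothetical ABC; the method name to_json is an assumption."""

    @abstractmethod
    def to_json(self):
        """Return a structure made of dicts, lists, strings and numbers."""

    @classmethod
    def __subclasshook__(cls, candidate):
        if cls is JSONSerializable:
            # Structural check: any class in the MRO defining to_json counts.
            if any("to_json" in vars(base) for base in candidate.__mro__):
                return True
        return NotImplemented


class Duck:  # does not inherit from JSONSerializable
    def to_json(self):
        return {"sound": "quack"}


class Rock:  # provides no to_json method
    pass


duck_ok = isinstance(Duck(), JSONSerializable)
rock_ok = isinstance(Rock(), JSONSerializable)
```

One-method ABCs such as `collections.abc.Sized` work this way; an ABC without such a hook requires either inheritance or an explicit `register()` call.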
participants (19)
- Alex Gaynor
- Alexander Belopolsky
- Alexandre Conrad
- Andrey Popp
- Antoine Pitrou
- Cameron Simpson
- Daniel Fetchinson
- David Stanek
- Georg Brandl
- Greg Ewing
- M.-A. Lemburg
- Michael Foord
- Mike Graham
- Nick Coghlan
- Oleg Broytman
- Ronald Oussoren
- Scott Dial
- Tarek Ziadé
- Éric Araujo