On Mon, Apr 6, 2020, 3:37 PM Dan Cojocaru <dan.cojocaru00@e-uvt.ro> wrote:
I don't see why there's a need for standardisation.

Here's another example because I can explain myself best by giving examples:

I have an API. I can write classes that correspond to the JSON inputs my API expects and define __json__ on them to convey how they should be serialised. For example, a Person class with first_name and last_name fields could serialise as a dict with camel-case keys instead. (Sure, normally I'd also write a serialisation function as part of my package in this case, but I'm just coming up with examples on the spot.)
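A minimal sketch of what that could look like. Note that `__json__` is a hypothetical protocol here, not an existing stdlib hook, so the sketch wires it up through a JSONEncoder subclass:

```python
import json


class Person:
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def __json__(self):
        # Serialise to the camel-case dict the API expects.
        return {"firstName": self.first_name, "lastName": self.last_name}


class ProtocolEncoder(json.JSONEncoder):
    """Encoder that honours the hypothetical __json__ protocol."""

    def default(self, o):
        method = getattr(o, "__json__", None)
        if callable(method):
            return method()
        return super().default(o)


print(json.dumps(Person("Ada", "Lovelace"), cls=ProtocolEncoder))
# {"firstName": "Ada", "lastName": "Lovelace"}
```

With a stdlib `__json__`, the encoder subclass would become unnecessary for the default case.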

DRF (Django REST framework) serializers have a .to_representation() method:
> Takes the object instance that requires serialization, and should return a primitive representation. Typically this means returning a structure of built-in Python datatypes. The exact types that can be handled will depend on the render classes you have configured for your API.
> May be overridden in order to modify the representation style
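For illustration, here is a stripped-down stand-in for that pattern in plain Python (not actual DRF code; the class and field names are invented for the example):

```python
import datetime


class PersonSerializer:
    """Minimal mimic of DRF's Serializer.to_representation() hook."""

    def to_representation(self, instance):
        # Return only built-in Python datatypes, as the DRF docs describe;
        # the datetime is flattened to a primitive (an ISO string) here.
        return {
            "firstName": instance["first_name"],
            "lastName": instance["last_name"],
            "joined": instance["joined"].isoformat(),
        }


person = {"first_name": "Ada", "last_name": "Lovelace",
          "joined": datetime.date(2020, 4, 6)}
print(PersonSerializer().to_representation(person))
# {'firstName': 'Ada', 'lastName': 'Lovelace', 'joined': '2020-04-06'}
```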

For non-standard things like dates and times, many JSON implementations have already added support using a format I can't recall at the moment. And even if a class defines __json__ in one particular format but you need another, the option of writing a custom JSONEncoder still remains, since __json__ would be attempted only after the custom default implementation. Therefore, __json__ would enable sensible defaults while not harming the possibility of using a custom format if that is desired.
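That ordering can be sketched today with a plain `default` function: the custom format is consulted first, and a hypothetical `__json__` fallback only runs if the custom code declines. (The `{"$date": ...}` wrapper is just this example's convention.)

```python
import datetime
import json


def default(o):
    # Custom format wins; it runs before any __json__-style fallback.
    if isinstance(o, datetime.datetime):
        return {"$date": o.isoformat()}
    # Hypothetical fallback: only reached if the custom code declined.
    method = getattr(o, "__json__", None)
    if callable(method):
        return method()
    raise TypeError(f"{type(o).__name__} is not JSON serializable")


print(json.dumps({"when": datetime.datetime(2020, 4, 6, 15, 37)},
                 default=default))
# {"when": {"$date": "2020-04-06T15:37:00"}}
```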

In trying to do this (again), I've realized that you *can't* just check for hasattr(obj, '__json__') in a JSONEncoder.default method, because default only gets called for types the encoder doesn't know how to serialize; so if you e.g. subclass dict, default() never gets called for the dict subclass.

> If specified, default should be a function that gets called for objects that can’t otherwise be serialized.
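A runnable demonstration of that limitation (class names invented for the example):

```python
import json


class Weird(dict):
    """A dict subclass that would like to override its own serialisation."""

    def __json__(self):
        return {"replaced": True}


class Encoder(json.JSONEncoder):
    def default(self, o):
        # Never reached for Weird: the encoder already knows how to
        # serialise anything that is a dict, so default() is skipped.
        method = getattr(o, "__json__", None)
        if callable(method):
            return method()
        return super().default(o)


print(json.dumps(Weird(original=True), cls=Encoder))
# {"original": true} -- __json__ was silently ignored
```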


specifies how to overload _make_iterencode *in pure Python*, but AFAICS that NOPs use of the C-optimized json encoder.

The python json module originally came from simplejson.

Here's simplejson's for_json implementation in pure Python:
https://github.com/simplejson/simplejson/blob/288e4e005c39a2eb855b5225c5dc8ebcb82eee72/simplejson/encoder.py#L515:

for_json = _for_json and getattr(value, 'for_json', None)
if for_json and callable(for_json):
    chunks = _iterencode(for_json(), _current_indent_level)

And simplejson's for_json implementation in C:

Is there a strong reason that the method would need to be called __json__ instead of 'for_json'?
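Either spelling can be honoured at once. A stdlib-only sketch that checks both names (and inherits the default()-is-skipped-for-known-types caveat noted above; `Point` is an invented example class):

```python
import json


class ForJSONEncoder(json.JSONEncoder):
    """Sketch honouring simplejson-style for_json() and/or __json__()."""

    def default(self, o):
        for name in ("for_json", "__json__"):
            method = getattr(o, name, None)
            if callable(method):
                return method()
        return super().default(o)


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def for_json(self):
        return {"x": self.x, "y": self.y}


print(json.dumps(Point(1, 2), cls=ForJSONEncoder))
# {"x": 1, "y": 2}
```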

Passing a spec=None kwarg through to the __json__()/for_json() method would not be compatible with simplejson's existing implementation.
Passing a spec=None kwarg would, however, be necessary to support different JSON standards from the same method; I think that's desirable given JSON/JSON5/JSON-LD/future JSON variants.

Dan Cojocaru
On 6 Apr 2020, 22:08 +0300, Chris Angelico <rosuav@gmail.com>, wrote:
On Tue, Apr 7, 2020 at 4:48 AM Andrew Barnert via Python-ideas
<python-ideas@python.org> wrote:
That’s not true for JSON; the entire point of it is data interchange. You expect to be able to dump an object, send it over the wire or store it to a file, load it (or even parse it in JS or ObjC or Go or whatever) and get back an equivalent object. It’s easy to come up with ways to build on top of JSON to interchange things like time points or raw binary strings or higher-level structured objects, but they require doing something on both the encode side and the decode side. Just being able to encode them to something human-readable is useless—if I encode a datetime object, I need to get back a datetime (or Date or NSDate or whatever) on the other end, not a str (or string or NSString or whatever) that a human could tell is probably meant to be a datetime but will raise an exception when I try to subtract it from now().

(Of course JSON isn’t perfect, as anyone who’s tried to interchange, say, int64 values discovers… but it’s good enough for many applications.)

The trouble with using JSON to interchange things that aren't in the
JSON spec is that you then have to build another layer on top of it -
a non-standard layer. That means that whatever you do, it's inherently
app-specific. For instance, I might choose to encode a datetime as a
string in ISO format, but how is the other end going to know that it
was meant to be a date? (Usually, if I'm sending something to some
front-end JavaScript, I'll just hard-code the JS to know that
thing[0].created is a timestamp, and should be parsed accordingly.)
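Both halves of that contract, sketched in Python under the ISO-string convention described above (the field name "created" is the hard-coded, app-specific knowledge):

```python
import datetime
import json

# Encode side: datetimes become ISO strings -- an app-specific convention
# that the wire format itself does not record.
record = {"created": datetime.datetime(2020, 4, 6, 15, 37), "id": 1}
wire = json.dumps(record, default=lambda o: o.isoformat())

# Decode side: nothing in the JSON says "created" was a datetime;
# the receiver must already know, just like the hard-coded JS above.
decoded = json.loads(wire)
decoded["created"] = datetime.datetime.fromisoformat(decoded["created"])
print(decoded)
```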

So if it's app-specific, then the best way to handle it is in your
app, not in the data type you're encoding. Subclassing JSONEncoder
works for this; adding a __json__ method doesn't really, unless there
is some single canonical encoding for a particular object.

The two become close to equivalent when you're only asking about your
own custom classes. You can, in fact, create your own private __json__
protocol (although, since it's private to you, it'd probably be better
to call the method "to_json" rather than "__json__"), and have a
subclass of JSONEncoder that calls it. It'd work fine because you
don't NEED to interoperate with other libraries. It's only when you
try to standardize something that's inherently nonstandard that things
get problematic :)
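The private-protocol arrangement described above might look like this (a sketch; the to_json name and the Invoice class are illustrative, not a standard):

```python
import json


class MyEncoder(json.JSONEncoder):
    """App-private protocol: any of *my* classes may define to_json()."""

    def default(self, o):
        method = getattr(o, "to_json", None)
        if callable(method):
            return method()
        return super().default(o)


class Invoice:
    def __init__(self, number, total):
        self.number, self.total = number, total

    def to_json(self):
        return {"number": self.number, "total": self.total}


print(json.dumps([Invoice(7, 99.5)], cls=MyEncoder))
# [{"number": 7, "total": 99.5}]
```

Because the protocol is private to the app, there is no pretence of interoperability, which is exactly the point being made.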

Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/4ZEQKTQ5CMZUXC6E6JVFHQS3VGYEDJ7C/
Code of Conduct: http://python.org/psf/codeofconduct/