[Python-ideas] Serialization of CSV vs. JSON

David Shawley daveshawley at gmail.com
Sun Nov 4 08:49:08 EST 2018

On Nov 3, 2018, at 5:43 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> David Shawley wrote:
> > I'm +1 on adding support for serializing datetime.date and
> > datetime.datetime *but* I'm -1 on automatically deserializing anything that
> > looks like an ISO-8601 in json.load*.  The asymmetry is the only thing that
> > kept me from bringing this up previously.
> This asymmetry bothers me too. It makes me think that datetime
> handling belongs at a different level of abstraction, something
> that knows about the structure of the data being serialised or
> deserialised.
> Java's JSON libraries have a mechanism where you can give it
> a class and a lump of JSON and it will figure out from runtime
> type information what to do. It seems like we should be able
> to do something similar using type annotations.

I was thinking about trying to do something similar to what golang has done in
their JSON support [1].  It is similar to what I would have done with JAXB when
I was still doing Java [2].  In both cases you have a type explicitly bound to
a JSON blob.  The place to make this sort of change might be in the
JSONDecoder and JSONEncoder classes.

Personally, I would place this sort of serialization logic outside of the
Standard Library -- maybe following the pattern that the rust community
adopted on this very issue.  In short, they separated serialization &
de-serialization into a free-standing library.  The best discussion that I
have found is a reddit thread [3].  The new library that they built is called
serde [4] and the previous code is in their deprecated library section [5].

The difference between the two approaches is that golang simply annotates the
types, similar to what I would expect to happen in the Python case.  Then you
are required to pass a list of types into the deserializer so it knows which
types are candidates for deserialization.  The rust and JAXB approaches rely
on type registration into the deserialization framework.

We could probably use type annotations to handle the asymmetry provided that
we change the JSONDecoder interface to accept a list of classes that are
candidates for deserialization or something similar.  I would place this
outside of the Standard Library as a generalized serialization /
de-serialization framework since it feels embryonic to me.  This could be a
new implementation for CSV and pickle as well.
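
As a rough illustration of the annotation-driven idea, here is a minimal
sketch, assuming the caller names the target class explicitly.  The Event
class and the load_as() helper are made up for this example and are not part
of the json module or of any existing proposal:

```python
import dataclasses
import datetime
import json
import typing


@dataclasses.dataclass
class Event:
    # Hypothetical example type; its annotations drive the decoding.
    name: str
    when: datetime.datetime
    day: datetime.date


def load_as(cls, text):
    """Decode a JSON object into cls, converting fields annotated as dates.

    Hypothetical helper: only top-level date/datetime fields are handled.
    """
    raw = json.loads(text)
    hints = typing.get_type_hints(cls)
    kwargs = {}
    for field, value in raw.items():
        target = hints.get(field)
        if target is datetime.datetime:
            value = datetime.datetime.fromisoformat(value)
        elif target is datetime.date:
            value = datetime.date.fromisoformat(value)
        kwargs[field] = value
    return cls(**kwargs)


payload = '{"name": "release", "when": "2018-11-04T08:49:08", "day": "2018-11-04"}'
event = load_as(Event, payload)
```

Nothing here guesses at strings that merely look like ISO-8601; a value is
converted only because the annotation on the target class says so.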

Bringing the conversation back around, I'm going to continue adding a simple
JSON formatting protocol that is asymmetric since it does solve a need that
I and others have today.  I'm not completely sure what the best way to move
this forward is.  I have most of an implementation working based on a simple
protocol of one method.  Should I:

1. Open a BPO and continue the discussion there once I have a working
   implementation?

2. Continue the discussion here?

3. Move the discussion to python-dev under a more appropriate subject?
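
For anyone wondering what a one-method formatting protocol could look like,
here is a minimal sketch.  The method name __jsonformat__ and the
ProtocolEncoder class are placeholders invented for illustration, not
necessarily what the actual implementation uses:

```python
import datetime
import json


class ProtocolEncoder(json.JSONEncoder):
    """Encoder that consults a hypothetical one-method protocol."""

    def default(self, obj):
        # Hypothetical hook: an object defining it controls its own JSON form.
        formatter = getattr(obj, "__jsonformat__", None)
        if formatter is not None:
            return formatter()
        # Fallback for the two types discussed in this thread.
        if isinstance(obj, (datetime.date, datetime.datetime)):
            return obj.isoformat()
        return super().default(obj)


class Point:
    # Hypothetical user type that opts in to the protocol.
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __jsonformat__(self):
        return {"x": self.x, "y": self.y}


text = json.dumps(
    {"at": datetime.date(2018, 11, 4), "p": Point(1, 2)},
    cls=ProtocolEncoder,
)
decoded = json.loads(text)  # asymmetric: "at" comes back as a plain str
```

Decoding is deliberately left alone, so the date round-trips as a string and
the Point as a dict.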

cheers, dave.

[1]: https://golang.org/pkg/encoding/json/#Marshal
[2]: https://docs.oracle.com/javaee/6/tutorial/doc/gkknj.html#gmfnu
[3]: https://www.reddit.com/r/rust/comments/3v4ktz/differences_between_serde_and_rustc_serialize/
[4]: https://serde.rs
[5]: https://github.com/rust-lang-deprecated/rustc-serialize

"Syntactic sugar causes cancer of the semicolon" - Alan Perlis
