
Is there any reason why date, datetime, and UUID objects automatically serialize to default strings using the csv module, but json.dumps throws an error by default? For example:
    import csv
    import json
    import io
    from datetime import date
    stream = io.StringIO()
    writer = csv.writer(stream)
    writer.writerow([date(2018, 11, 2)])
    # versus
    json.dumps(date(2018, 11, 2))
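For reference, a minimal sketch of what the two calls in the question do. It relies on csv.writer stringifying non-string values with str() and on json.dumps raising TypeError for types it does not know; the names `csv_output` and `json_error` are only illustrative.

```python
import csv
import io
import json
from datetime import date

d = date(2018, 11, 2)

# csv.writer stringifies non-string values with str() before writing them.
stream = io.StringIO()
csv.writer(stream).writerow([d])
csv_output = stream.getvalue()

# json.dumps has no default representation for date and raises TypeError.
try:
    json.dumps(d)
    json_error = None
except TypeError as exc:
    json_error = str(exc)
```

So the csv module silently picks str(d), while json refuses to guess a representation.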

First, this list is not appropriate. You should ask such a question in python-list.
Second, JSON is a specific serialization format that explicitly rejects datetime objects in *all* the languages with JSON libraries. You can only use date objects in JSON if you control or understand both serialization and deserialization ends and have an agreed representation.
On Fri, Nov 2, 2018 at 12:20 PM Philip Martin <philip.martin2007@gmail.com> wrote:
Is there any reason why date, datetime, and UUID objects automatically serialize to default strings using the csv module, but json.dumps throws an error as a default?
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

My bad. I think I need to clarify my objective. I definitely understand the issues regarding serialization/deserialization on JSON, i.e. decimals as strings, etc., and hooking in a default serializer function is easy enough. I guess my question is more about why the csv writer and DictWriter don't provide similar serialization/deserialization hooks.
There seems to be a wide gap between reaching for a tool like pandas, where maybe too much auto-magical parsing and guessing happens, and wrapping the functionality around the csv module yourself, IMO. I was curious to see if anyone else had similar opinions, and if so, to start a conversation around what extended functionality would be most fruitful.
On Fri, Nov 2, 2018 at 11:28 AM Calvin Spealman <cspealma@redhat.com> wrote:
First, this list is not appropriate. You should ask such a question in python-list.
Second, JSON is a specific serialization format that explicitly rejects datetime objects in *all* the languages with JSON libraries. You can only use date objects in JSON if you control or understand both serialization and deserialization ends and have an agreed representation.
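A rough sketch of the gap described above, under stated assumptions: json.dumps accepts a hook through its existing `default` parameter, while csv has no such parameter, so the hook has to be applied by hand. `to_serializable`, `csv_hook`, and `HookedWriter` are hypothetical names made up for this illustration, not part of either module.

```python
import csv
import io
import json
import uuid
from datetime import date, datetime


def to_serializable(value):
    """Hypothetical json hook: map extra types to strings, reject the rest."""
    if isinstance(value, (date, datetime)):
        return value.isoformat()
    if isinstance(value, uuid.UUID):
        return str(value)
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def csv_hook(value):
    """Hypothetical csv hook: convert dates, pass everything else through."""
    return value.isoformat() if isinstance(value, date) else value


class HookedWriter:
    """Hypothetical wrapper giving csv.writer a per-value serialization hook."""

    def __init__(self, stream, hook, **kwargs):
        self._writer = csv.writer(stream, **kwargs)
        self._hook = hook

    def writerow(self, row):
        return self._writer.writerow([self._hook(v) for v in row])


# json: the hook goes through the existing ``default`` parameter.
encoded = json.dumps({"day": date(2018, 11, 2)}, default=to_serializable)

# csv: the hook has to be applied by the caller (here, by the wrapper).
stream = io.StringIO()
HookedWriter(stream, csv_hook).writerow(["a", date(2018, 11, 2)])
```

The wrapper is only a few lines, which is arguably the point of the question: everyone who needs it ends up writing their own.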

On Fri, Nov 2, 2018 at 10:31 AM Philip Martin <philip.martin2007@gmail.com> wrote:
[Why don't] csv writer and DictWriter provide ... serialization/deserialization hooks?
Do you have a specific use-case in mind? My intuition is that comprehensions provide sufficient functionality such that changing the csv module interface is unnecessary. Unlike JSON, CSV files are easy to read in a streaming/iterator fashion, so the module doesn't need to provide a way to intercept values during a holistic encode/decode.
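As a small sketch of the comprehension approach, assuming a two-column file of names and ISO-formatted dates (the sample rows are made up for illustration):

```python
import csv
import io
from datetime import date

rows = [("alice", date(2018, 11, 2)), ("bob", date(2018, 11, 4))]

# Writing: convert each value explicitly inside a generator expression.
stream = io.StringIO()
csv.writer(stream).writerows((name, day.isoformat()) for name, day in rows)

# Reading: rows arrive one at a time, so they can be converted while streaming.
stream.seek(0)
parsed = [(name, date.fromisoformat(day)) for name, day in csv.reader(stream)]
```

Because each row is available on its own, the conversion sits next to the read or write call instead of inside the module.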

On Nov 2, 2018, at 12:28 PM, Calvin Spealman <cspealma@redhat.com> wrote:
Second, JSON is a specific serialization format that explicitly rejects datetime objects in *all* the languages with JSON libraries. You can only use date objects in JSON if you control or understand both serialization and deserialization ends and have an agreed representation.
I would hardly say that "rejects datetime objects in *all* languages..."
Most JavaScript implementations do handle dates correctly, which is a bit telling for me. For example, the Mozilla reference calls out Date as explicitly supported [1]. I also ran it through the JavaScript console and repl.it to make sure that it wasn't a doc glitch [2]. Go also supports serialization of date/times, as shown in this repl.it session [3]. As does Rust, though Rust doesn't use ISO-8601 [4].
That being said, I'm +1 on adding support for serializing datetime.date and datetime.datetime *but* I'm -1 on automatically deserializing anything that looks like an ISO-8601 string in json.load*. The asymmetry is the only thing that kept me from bringing this up previously.
What about implementing this as a protocol? The JavaScript implementation of JSON.stringify looks for a method named toJSON() when it encounters a non-primitive type and uses the result for serialization. This would be a pretty easy lift in json.JSONEncoder.default:
    class JSONEncoder(object):
        def default(self, o):
            if hasattr(o, 'to_json'):
                return o.to_json(self)
            raise TypeError(f'Object of type {o.__class__.__name__} '
                            f'is not JSON serializable')
I would recommend passing the JSONEncoder instance in to ``to_json()`` as I did in the snippet. This makes serialization much easier for classes since they do not have to assume a particular set of JSON serialization options.
Is this something that is PEP-worthy or is a PR with a simple flag to enable the functionality in JSON encoder enough?
- cheers, dave.
[1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Obj...
[2]: https://repl.it/@dave_shawley/OffensiveParallelResource
[3]: https://repl.it/@dave_shawley/EvenSunnyForce
[4]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=73de1454da4ac56900cde37edb0d6c8f
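The to_json() idea can be tried without touching the stdlib by putting the same check in a JSONEncoder subclass. This is only a sketch: `ProtocolEncoder` and `Event` are illustrative names, and `to_json` is the hypothetical protocol method from the message above, not an existing hook.

```python
import json
from datetime import date


class ProtocolEncoder(json.JSONEncoder):
    """Sketch of the proposed hook as a subclass; not part of the stdlib."""

    def default(self, o):
        # Hypothetical protocol: the object supplies its own representation.
        if hasattr(o, "to_json"):
            return o.to_json(self)
        return super().default(o)  # raises TypeError for unknown types


class Event:
    """Illustrative class that opts in to the hypothetical protocol."""

    def __init__(self, name, day):
        self.name = name
        self.day = day

    def to_json(self, encoder):
        # Return only JSON-native types; the encoder serializes the result.
        return {"name": self.name, "day": self.day.isoformat()}


encoded = json.dumps(Event("release", date(2018, 11, 2)), cls=ProtocolEncoder)
```

Types that do not implement the method, including datetime.date itself, still raise TypeError here, so the behaviour stays opt-in.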

On Sun, Nov 4, 2018 at 12:02 AM David Shawley <daveshawley@gmail.com> wrote:
On Nov 2, 2018, at 12:28 PM, Calvin Spealman <cspealma@redhat.com> wrote:
Second, JSON is a specific serialization format that explicitly rejects datetime objects in *all* the languages with JSON libraries. You can only use date objects in JSON if you control or understand both serialization and deserialization ends and have an agreed representation.
I would hardly say that "rejects datetime objects in *all* languages..."
Most Javascript implementations do handle dates correctly which is a bit telling for me. For example, the Mozilla reference calls out Date as explicitly supported [1]. I also ran it through the Javascript console and repl.it to make sure that it wasn't a doc glitch [2].
I think we need to clarify an important distinction here. JSON, as a format, does *not* support date/time objects in any way. But JavaScript's JSON.stringify() function is happy to accept them, and will represent them as strings.
If the suggestion here is to have json.dumps(datetime.date(2018,11,4)) return an encoded string, either by natively supporting it, or by having a protocol which the date object implements, that's fine and reasonable; but json.loads(s) won't return that date object. So, yes, it would be asymmetric.
I personally don't have a problem with this (though I also don't have any strong use-cases). Custom encoders and decoders could do this, with or without symmetry. What would it be like to add a couple to the json module that can handle these extra types?
ChrisA

On Nov 3, 2018, at 9:29 AM, Chris Angelico <rosuav@gmail.com> wrote:
I think we need to clarify an important distinction here. JSON, as a format, does *not* support date/time objects in any way. But JavaScript's JSON.stringify() function is happy to accept them, and will represent them as strings.
Very good point. The JSON document type only supports object literals, numbers, strings, and Boolean literals. My suggestion was specifically to provide an extensible mechanism for encoding arbitrary objects into the supported primitives.
If the suggestion here is to have json.dumps(datetime.date(2018,11,4)) to return an encoded string, either by natively supporting it, or by having a protocol which the date object implements, that's fine and reasonable; but json.loads(s) won't return that date object. So, yes, it would be asymmetric. I personally don't have a problem with this (though I also don't have any strong use-cases). Custom encoders and decoders could do this, with or without symmetry. What would it be like to add a couple to the json module that can handle these extra types?
Completely agreed here. I've seen many attempts to support "round trip" encode/decode in JSON libraries and it really doesn't work well unless you go down the path of type hinting. I believe that MongoDB uses something akin to hinting when it handles dates. Something like the following representation if I recall correctly.
    {
      "now": {
        "$type": "JSONDate",
        "value": "2018-11-03T09:52:20-0400"
      }
    }
During deserialization they recognize the hint and instantiate the object instead of parsing it. This is interesting but pretty awful for interoperability since there isn't a standard that I'm aware of. I'm certainly not proposing that but I did want to mention it for completeness.
I'll try to put together a PR/branch that adds protocol support in JSON encoder and to datetime, date, and uuid as well. It will give us something to point at and discuss.
- cheers, dave.
-- Mathematics provides a framework for dealing precisely with notions of "what is". Computation provides a framework for dealing precisely with notions of "how to". SICP Preface
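To make the hinting idea concrete, here is a sketch of a decoder for the representation recalled above, using the existing `object_hook` parameter of json.loads. The "$type"/"JSONDate" keys come from that recalled example and are not a standard; `decode_hinted` and `HINT_FORMAT` are hypothetical names.

```python
import json
from datetime import datetime, timedelta, timezone

HINT_FORMAT = "%Y-%m-%dT%H:%M:%S%z"  # matches the example value above


def decode_hinted(obj):
    """object_hook: turn {"$type": "JSONDate", "value": ...} into a datetime."""
    if obj.get("$type") == "JSONDate" and "value" in obj:
        return datetime.strptime(obj["value"], HINT_FORMAT)
    return obj  # every other JSON object is left as a plain dict


document = '{"now": {"$type": "JSONDate", "value": "2018-11-03T09:52:20-0400"}}'
decoded = json.loads(document, object_hook=decode_hinted)
```

The hook only fires on objects carrying the hint, so ordinary strings that happen to look like dates stay strings, which is what makes the round trip unambiguous and also what ties both ends to the same private convention.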

jsondate, for example, supports both .load[s]() and .dump[s](), but only for UTC datetimes: https://github.com/rconradharris/jsondate/blob/master/jsondate/__init__.py
UTC is only sometimes a fair assumption; otherwise it's dangerous to assume that timezone-naive [ISO8601] strings represent UTC datetimes. In that respect, readability aside, arbitrary-precision POSIX timestamps are less error-prone.
On Saturday, November 3, 2018, David Shawley <daveshawley@gmail.com> wrote:
I've seen many attempts to support "round trip" encode/decode in JSON libraries and it really doesn't work well unless you go down the path of type hinting.
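A short sketch of the timezone-naive hazard mentioned in this message, reusing the timestamp from the earlier example and assuming a local offset of -04:00. The variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2018, 11, 3, 9, 52, 20)
aware = naive.replace(tzinfo=timezone(timedelta(hours=-4)))

naive_text = naive.isoformat()  # no offset: the reader has to guess the zone
aware_text = aware.isoformat()  # offset included: unambiguous
posix = aware.timestamp()       # seconds since the epoch, zone-independent

# Guessing UTC for the naive string shifts the instant by four hours.
guessed = datetime.fromisoformat(naive_text).replace(tzinfo=timezone.utc)
error = aware - guessed
```

The POSIX value identifies the instant without any zone information, at the cost of not being readable at a glance.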

On Sun, Nov 4, 2018 at 1:00 AM David Shawley <daveshawley@gmail.com> wrote:
Very good point. The JSON document type only supports object literals, numbers, strings, and Boolean literals. My suggestion was specifically to provide an extensible mechanism for encoding arbitrary objects into the supported primitives.
Okay, so to clarify: We currently have a mechanism for custom encoders and decoders, which you have to specify as you're thinking about encoding. But you're proposing having the core json.dumps() allow objects to customize their own representation. Sounds like a plan, and not even all that complex a plan. ChrisA

David Shawley wrote:
I'm +1 on adding support for serializing datetime.date and datetime.datetime *but* I'm -1 on automatically deserializing anything that looks like a ISO-8601 in json.load*. The asymmetry is the only thing that kept me from bringing this up previously.
This asymmetry bothers me too. It makes me think that datetime handling belongs at a different level of abstraction, something that knows about the structure of the data being serialised or deserialised. Java's JSON libraries have a mechanism where you can give it a class and a lump of JSON and it will figure out from runtime type information what to do. It seems like we should be able to do something similar using type annotations. -- Greg
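One way this could look in Python, as a rough sketch only: a dataclass's field annotations act as the schema, and a small converter table says how to rebuild each annotated type. `Release`, `CONVERTERS`, and `load_as` are hypothetical names; nothing like this exists in the json module.

```python
import json
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class Release:
    name: str
    day: date


# Hypothetical converter table: how to rebuild each annotated type from JSON.
CONVERTERS = {date: date.fromisoformat}


def load_as(cls, text):
    """Decode ``text`` into ``cls`` using its field annotations as the schema."""
    raw = json.loads(text)
    kwargs = {}
    for field in fields(cls):
        # Fields without a registered converter are passed through unchanged.
        convert = CONVERTERS.get(field.type, lambda value: value)
        kwargs[field.name] = convert(raw[field.name])
    return cls(**kwargs)


release = load_as(Release, '{"name": "example", "day": "2018-11-03"}')
```

Only the field annotated as a date is parsed as one, so a string that merely looks like a date in another field is left alone. The sketch assumes annotations are real classes rather than strings.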

On Nov 3, 2018, at 5:43 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
David Shawley wrote:
I'm +1 on adding support for serializing datetime.date and datetime.datetime *but* I'm -1 on automatically deserializing anything that looks like a ISO-8601 in json.load*. The asymmetry is the only thing that kept me from bringing this up previously.
This asymmetry bothers me too. It makes me think that datetime handling belongs at a different level of abstraction, something that knows about the structure of the data being serialised or deserialised.
Java's JSON libraries have a mechanism where you can give it a class and a lump of JSON and it will figure out from runtime type information what to do. It seems like we should be able to do something similar using type annotations.
I was thinking about trying to do something similar to what golang has done in their JSON support [1]. It is similar to what I would have done with JAXB when I was still doing Java [2]. In both cases you have a type explicitly bound to a JSON blob. The place to make this sort of change might be in the JSONDecoder and JSONEncoder classes.
Personally, I would place this sort of serialization logic outside of the Standard Library -- maybe following the pattern that the Rust community adopted on this very issue. In short, they separated serialization & de-serialization into a free-standing library. The best discussion that I have found is a reddit thread [3]. The new library that they built is called serde [4] and the previous code is in their deprecated library section [5].
The difference between the two approaches is that golang simply annotates the types, similar to what I would expect to happen in the Python case. Then you are required to pass a list of types into the deserializer so it knows which types are candidates for deserialization. The Rust and JAXB approaches rely on type registration into the deserialization framework.
We could probably use type annotations to handle the asymmetry provided that we change the JSONDecoder interface to accept a list of classes that are candidates for deserialization or something similar. I would place this outside of the Standard Library as a generalized serialization / de-serialization framework since it feels embryonic to me. This could be a new implementation for CSV and pickle as well.
Bringing the conversation back around, I'm going to continue adding a simple JSON formatting protocol that is asymmetric since it does solve a need that I and others have today. I'm not completely sure what the best way to move this forward is. I have most of an implementation working based on a simple protocol of one method. Should I:
1. Open a BPO and continue the discussion there once I have a working prototype?
2. Continue the discussion here?
3. Move the discussion to python-dev under a more appropriate subject?
cheers, dave.
[1]: https://golang.org/pkg/encoding/json/#Marshal
[2]: https://docs.oracle.com/javaee/6/tutorial/doc/gkknj.html#gmfnu
[3]: https://www.reddit.com/r/rust/comments/3v4ktz/differences_between_serde_and_...
[4]: https://serde.rs
[5]: https://github.com/rust-lang-deprecated/rustc-serialize
-- "Syntactic sugar causes cancer of the semicolon" - Alan Perlis

Here's a JSONEncoder subclass with a default method that checks variable types in a defined sequence that includes datetime: https://gist.github.com/majgis/4200488 Passing an ordered map of (Type, fn) may or may not be any more readable than simply subclassing JSONEncoder and defining .default(). On Sunday, November 4, 2018, David Shawley <daveshawley@gmail.com> wrote:
I'm going to continue adding a simple JSON formatting protocol that is asymmetric since it does solve a need that I and others have today.
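For comparison, a sketch of the ordered (type, function) map mentioned in this message, written from scratch rather than copied from the linked gist. `TypeMapEncoder`, its `type_map` keyword, and `TYPE_MAP` are hypothetical; the sketch relies on json.dumps forwarding extra keyword arguments to the encoder class.

```python
import json
import uuid
from datetime import date, datetime


class TypeMapEncoder(json.JSONEncoder):
    """Hypothetical encoder that tries (type, function) pairs in order."""

    def __init__(self, *, type_map=(), **kwargs):
        super().__init__(**kwargs)
        self._type_map = tuple(type_map)

    def default(self, o):
        for kind, convert in self._type_map:
            if isinstance(o, kind):
                return convert(o)
        return super().default(o)  # raises TypeError for unmapped types


# Order matters: datetime is a subclass of date, so it has to come first.
TYPE_MAP = (
    (datetime, lambda value: value.isoformat(timespec="seconds")),
    (date, lambda value: value.isoformat()),
    (uuid.UUID, str),
)

encoded = json.dumps(
    {"day": date(2018, 11, 4), "at": datetime(2018, 11, 4, 1, 0)},
    cls=TypeMapEncoder,
    type_map=TYPE_MAP,
)
```

Whether this reads better than a hand-written .default() is a matter of taste; the map mostly helps when the set of types is assembled at runtime.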

Serialization of those data types is not defined in the JSON standard: https://www.json.org/ so you have to extend the parser/serializers to support them. On 02.11.2018 17:19, Philip Martin wrote:
Is there any reason why date, datetime, and UUID objects automatically serialize to default strings using the csv module, but json.dumps throws an error as a default?
-- Marc-Andre Lemburg, eGenix.com

On Fri, Nov 2, 2018 at 9:31 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Serialization of those data types is not defined in the JSON standard:
That being said, ISO 8601 is a standard for datetime stamps, and a de facto one for JSON.
So building encoding of datetime into Python's json encoder would be pretty useful.
(I would not have any automatic decoding though -- as an ISO8601 string would still be just a string in JSON)
Could we have a "pedantic" mode for "fully standard conforming" JSON, and then add some extensions to the standard?
As another example, I would find it very handy if the json decoder would respect comments in JSON (I know that they are explicitly not part of the standard), but they are used in other applications, particularly when JSON is used as a configuration language.
-CHB
--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov

JSON-LD supports datetimes (as e.g. ISO8601 xsd:datetimes): https://www.w3.org/TR/json-ld11/#typed-values
Jsonpickle (Python, JS) supports datetimes, numpy arrays, pandas dataframes: https://github.com/jsonpickle/jsonpickle
JSON5 supports comments in JSON: https://github.com/json5/json5/issues/3
... Some form of schema is necessary to avoid having to try parsing every string value as a date time (and to specify precision: "2018" is not the same as "2018 00:00:01")
On Friday, November 2, 2018, Chris Barker via Python-ideas <python-ideas@python.org> wrote:
That being said, ISO 8601 is a standard for datetime stamps, and a de facto one for JSON
As another example, I would find it very handy if the json decoder would respect comments in JSON (I know that they are explicitly not part of the standard), but they are used in other applications, particularly when JSON is used as a configuration language.

On Fri, Nov 2, 2018 at 12:17 PM, Wes Turner <wes.turner@gmail.com> wrote:
JSON5 supports comments in JSON. https://github.com/json5/json5/issues/3
and other nifty things -- any plans to support JSON5 in the stdlib json library? I think that would be great. -CHB

02.11.18 19:26, Chris Barker via Python-ideas wrote:
On Fri, Nov 2, 2018 at 9:31 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Serialization of those data types is not defined in the JSON standard:
That being said, ISO 8601 is a standard for datetime stamps, and a defacto one for JSON
It is not the only standard. Another common representation is a POSIX timestamp and, for a date without time, the Julian day.
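To show how the three representations differ for one date, here is a small sketch. It assumes midnight UTC for the POSIX value and derives the Julian day number from the known pairing of 2000-01-01 with Julian day number 2451545; `JDN_OFFSET` and the other names are illustrative.

```python
from datetime import date, datetime, timezone

day = date(2018, 11, 2)

# ISO 8601 calendar date.
iso_text = day.isoformat()

# POSIX timestamp for midnight UTC at the start of that day.
posix = datetime(day.year, day.month, day.day, tzinfo=timezone.utc).timestamp()

# Julian day number: date.toordinal() counts 0001-01-01 as day 1, and
# 2000-01-01 corresponds to Julian day number 2451545.
JDN_OFFSET = 2451545 - date(2000, 1, 1).toordinal()
julian_day_number = day.toordinal() + JDN_OFFSET
```

All three are plain strings or numbers in JSON, so a decoder cannot tell which convention was used without being told.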
participants (10)
- Calvin Spealman
- Chris Angelico
- Chris Barker
- David Shawley
- Greg Ewing
- M.-A. Lemburg
- Michael Selik
- Philip Martin
- Serhiy Storchaka
- Wes Turner