Is there any reason why date, datetime, and UUID objects automatically serialize to default strings using the csv module, but json.dumps throws an error by default? i.e.

    import csv
    import json
    import io
    from datetime import date

    stream = io.StringIO()
    writer = csv.writer(stream)
    writer.writerow([date(2018, 11, 2)])
    # versus
    json.dumps(date(2018, 11, 2))
First, this list is not appropriate. You should ask such a question on
python-list.
Second, JSON is a specific serialization format that explicitly rejects
datetime objects in *all* the languages with JSON libraries. You can only
use date objects in JSON if you control or understand both serialization
and deserialization ends and have an agreed representation.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Serialization of those data types is not defined in the JSON standard:

https://www.json.org/

so you have to extend the parser/serializers to support them.
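For illustration, here is a minimal sketch of extending the serializer through the `default=` argument of `json.dumps`. The function name `encode_extra` and the choice of ISO 8601 text for dates are assumptions made for this example, not something the JSON standard or the json module prescribes.

```python
import json
import uuid
from datetime import date, datetime


def encode_extra(obj):
    """Fallback for json.dumps: called only for objects json cannot encode."""
    if isinstance(obj, (date, datetime)):
        # Assumed representation: ISO 8601 text.
        return obj.isoformat()
    if isinstance(obj, uuid.UUID):
        return str(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


payload = {"day": date(2018, 11, 2), "id": uuid.UUID(int=1)}
encoded = json.dumps(payload, default=encode_extra)
print(encoded)
```

Whoever reads this output still receives plain strings, so both ends have to agree on what those strings mean.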
--
Marc-Andre Lemburg
eGenix.com
On Fri, Nov 2, 2018 at 9:31 AM, M.-A. Lemburg
Serialization of those data types is not defined in the JSON standard:
That being said, ISO 8601 is a standard for datetime stamps, and a de facto one for JSON.

So building encoding of datetime into Python's json encoder would be pretty useful. (I would not have any automatic decoding though -- as an ISO 8601 string would still be just a string in JSON.)

Could we have a "pedantic" mode for "fully standard conforming" JSON, and then add some extensions to the standard?

As another example, I would find it very handy if the json decoder would respect comments in JSON (I know that they are explicitly not part of the standard), but they are used in other applications, particularly when JSON is used as a configuration language.

-CHB

--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R
(206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
My bad. I think I need to clarify my objective. I definitely understand the
issues regarding serialization/deserialization on JSON, i.e. decimals as
strings, etc., and hooking in a default serializer function is easy enough.
I guess my question is more related to why the csv writer and DictWriter
don't provide similar serialization/deserialization hooks. There seems to
be a wide gap between reaching for a tool like pandas, where maybe too much
auto-magical parsing and guessing happens, and wrapping the functionality
around the csv module, IMO. I was curious to see if anyone else had similar
opinions, and if so, whether a conversation around what extended
functionality to add would be fruitful.
On Fri, Nov 2, 2018 at 10:31 AM Philip Martin
[Why don't] csv writer and DictWriter provide ... serialization/deserialization hooks?
Do you have a specific use-case in mind? My intuition is that comprehensions provide sufficient functionality such that changing the csv module interface is unnecessary. Unlike JSON, CSV files are easy to read in a streaming/iterator fashion, so the module doesn't need to provide a way to intercept values during a holistic encode/decode.
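As a rough sketch of what "comprehensions are enough" can look like, the example below converts values explicitly on the way into `csv.writer` and again on the way out of `csv.reader`. The sample rows and the two-decimal formatting are assumptions chosen only for illustration.

```python
import csv
import io
from datetime import date
from decimal import Decimal

rows = [
    (date(2018, 11, 2), Decimal("9.50")),
    (date(2018, 11, 3), Decimal("12.00")),
]

# Writing: convert each value explicitly instead of relying on str().
stream = io.StringIO()
writer = csv.writer(stream)
writer.writerows(
    (day.isoformat(), format(amount, ".2f")) for day, amount in rows
)

# Reading: the same idea in reverse, applied one row at a time.
stream.seek(0)
parsed = [
    (date.fromisoformat(day), Decimal(amount))
    for day, amount in csv.reader(stream)
]
print(parsed == rows)
```

Because rows are produced and consumed one at a time, the conversion can sit in ordinary Python around the module rather than in a hook inside it.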
JSON-LD supports datetimes (as e.g. ISO8601 xsd:datetimes)
https://www.w3.org/TR/json-ld11/#typed-values

Jsonpickle (Python, JS, ) supports datetimes, numpy arrays, pandas dataframes
https://github.com/jsonpickle/jsonpickle

JSON5 supports comments in JSON.
https://github.com/json5/json5/issues/3

... Some form of schema is necessary to avoid having to try parsing every string value as a date time (and to specify precision: "2018" is not the same as "2018 00:00:01")
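To make the schema point concrete, here is a small sketch in which only the fields named in a schema are converted, so a string that merely looks like a date is left alone. The `SCHEMA` mapping and the `apply_schema` hook are hypothetical names for this example; only `json.loads(..., object_hook=...)` and `date.fromisoformat` come from the standard library.

```python
import json
from datetime import date

# Hypothetical schema: field name -> converter for that field's string value.
SCHEMA = {"released": date.fromisoformat}


def apply_schema(obj):
    """object_hook: convert only the fields the schema names."""
    return {
        key: SCHEMA[key](value) if key in SCHEMA else value
        for key, value in obj.items()
    }


text = '{"title": "2018-11-02", "released": "2018-11-02"}'
record = json.loads(text, object_hook=apply_schema)
print(record)
```

The "title" value stays a string even though it parses as a date, which is the ambiguity a schema is there to resolve.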
On Nov 2, 2018, at 12:28 PM, Calvin Spealman
Second, JSON is a specific serialization format that explicitly rejects datetime objects in *all* the languages with JSON libraries. You can only use date objects in JSON if you control or understand both serialization and deserialization ends and have an agreed representation.
I would hardly say that "rejects datetime objects in *all* languages..."

Most JavaScript implementations do handle dates correctly, which is a bit telling for me. For example, the Mozilla reference calls out Date as explicitly supported [1]. I also ran it through the JavaScript console and repl.it to make sure that it wasn't a doc glitch [2]. Go also supports serialization of date/times, as shown in this repl.it session [3]. As does Rust, though Rust doesn't use ISO-8601 [4].

That being said, I'm +1 on adding support for serializing datetime.date and datetime.datetime *but* I'm -1 on automatically deserializing anything that looks like an ISO-8601 string in json.load*. The asymmetry is the only thing that kept me from bringing this up previously.

What about implementing this as a protocol? The JavaScript implementation of JSON.stringify looks for a method named toJSON() when it encounters a non-primitive type and uses the result for serialization. This would be a pretty easy lift in json.JSONEncoder.default:

    class JSONEncoder(object):
        def default(self, o):
            if hasattr(o, 'to_json'):
                return o.to_json(self)
            raise TypeError(f'Object of type {o.__class__.__name__} '
                            f'is not JSON serializable')

I would recommend passing the JSONEncoder instance in to ``to_json()`` as I did in the snippet. This makes serialization much easier for classes since they do not have to assume a particular set of JSON serialization options.

Is this something that is PEP-worthy or is a PR with a simple flag to enable the functionality in JSON encoder enough?

- cheers, dave.

[1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Obj...
[2]: https://repl.it/@dave_shawley/OffensiveParallelResource
[3]: https://repl.it/@dave_shawley/EvenSunnyForce
[4]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=73de1454da4ac56900cde37edb0d6c8f
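The protocol idea can be tried today without touching the standard library. The sketch below does it in a subclass; `ProtocolEncoder` and `Event` are hypothetical names, and `to_json()` is the proposed method from this message, not an existing Python protocol.

```python
import json
from datetime import date


class ProtocolEncoder(json.JSONEncoder):
    """Hypothetical encoder that consults a to_json() method on unknown objects."""

    def default(self, o):
        to_json = getattr(o, "to_json", None)
        if callable(to_json):
            # Pass the encoder along so the object can honour its options.
            return to_json(self)
        # Fall back to the stock behaviour, which raises TypeError.
        return super().default(o)


class Event:
    def __init__(self, name, day):
        self.name = name
        self.day = day

    def to_json(self, encoder):
        # date itself has no to_json() here, so convert it explicitly.
        return {"name": self.name, "day": self.day.isoformat()}


encoded = json.dumps(
    {"event": Event("release", date(2018, 11, 4))},
    cls=ProtocolEncoder,
)
print(encoded)
```

Decoding this output gives back a plain dict of strings, so the sketch is one-directional in the same way as the proposal.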
On Sun, Nov 4, 2018 at 12:02 AM David Shawley
On Nov 2, 2018, at 12:28 PM, Calvin Spealman
wrote: Second, JSON is a specific serialization format that explicitly rejects datetime objects in *all* the languages with JSON libraries. You can only use date objects in JSON if you control or understand both serialization and deserialization ends and have an agreed representation.
I would hardly say that "rejects datetime objects in *all* languages..."
Most Javascript implementations do handle dates correctly which is a bit telling for me. For example, the Mozilla reference calls out Date as explicitly supported [1]. I also ran it through the Javascript console and repl.it to make sure that it wasn't a doc glitch [2].
I think we need to clarify an important distinction here. JSON, as a format, does *not* support date/time objects in any way. But JavaScript's JSON.stringify() function is happy to accept them, and will represent them as strings.

If the suggestion here is to have json.dumps(datetime.date(2018,11,4)) return an encoded string, either by natively supporting it, or by having a protocol which the date object implements, that's fine and reasonable; but json.loads(s) won't return that date object. So, yes, it would be asymmetric.

I personally don't have a problem with this (though I also don't have any strong use-cases). Custom encoders and decoders could do this, with or without symmetry. What would it be like to add a couple to the json module that can handle these extra types?

ChrisA
02.11.18 19:26, Chris Barker via Python-ideas wrote:
On Fri, Nov 2, 2018 at 9:31 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Serialization of those data types is not defined in the JSON standard:
That being said, ISO 8601 is a standard for datetime stamps, and a de facto one for JSON
It is not the only standard. Another common representation is a POSIX timestamp. And, for a date without time, the Julian day.
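To show how the three representations differ for the same value, here is a short sketch. `JDN_OFFSET` is a name introduced for this example; it relies on `date.toordinal()` counting 0001-01-01 as day 1 and on 2000-01-01 being Julian Day Number 2451545.

```python
from datetime import date, datetime, timezone

moment = datetime(2018, 11, 2, 17, 26, tzinfo=timezone.utc)
day = date(2018, 11, 2)

# ISO 8601 text.
iso_text = moment.isoformat()

# POSIX timestamp: seconds since 1970-01-01T00:00:00Z.
posix_seconds = moment.timestamp()

# Julian Day Number of a proleptic Gregorian date (illustrative constant).
JDN_OFFSET = 1721425
julian_day_number = day.toordinal() + JDN_OFFSET

print(iso_text, posix_seconds, julian_day_number)
```

All three are valid JSON values (one string, two numbers), which is why a receiver cannot tell which convention was used without being told.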
On Nov 3, 2018, at 9:29 AM, Chris Angelico
I think we need to clarify an important distinction here. JSON, as a format, does *not* support date/time objects in any way. But JavaScript's JSON.stringify() function is happy to accept them, and will represent them as strings.
Very good point. The JSON document type only supports object and array literals, numbers, strings, Boolean literals, and null. My suggestion was specifically to provide an extensible mechanism for encoding arbitrary objects into the supported primitives.
If the suggestion here is to have json.dumps(datetime.date(2018,11,4)) to return an encoded string, either by natively supporting it, or by having a protocol which the date object implements, that's fine and reasonable; but json.loads(s) won't return that date object. So, yes, it would be asymmetric. I personally don't have a problem with this (though I also don't have any strong use-cases). Custom encoders and decoders could do this, with or without symmetry. What would it be like to add a couple to the json module that can handle these extra types?
Completely agreed here. I've seen many attempts to support "round trip" encode/decode in JSON libraries and it really doesn't work well unless you go down the path of type hinting. I believe that MongoDB uses something akin to hinting when it handles dates. Something like the following representation if I recall correctly.

    {
      "now": {
        "$type": "JSONDate",
        "value": "2018-11-03T09:52:20-0400"
      }
    }

During deserialization they recognize the hint and instantiate the object instead of parsing it. This is interesting but pretty awful for interoperability since there isn't a standard that I'm aware of. I'm certainly not proposing that but I did want to mention it for completeness.

I'll try to put together a PR/branch that adds protocol support in JSON encoder and to datetime, date, and uuid as well. It will give us something to point at and discuss.

- cheers, dave.

--
Mathematics provides a framework for dealing precisely with notions of "what is". Computation provides a framework for dealing precisely with notions of "how to".
SICP Preface
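For completeness, a hinted representation like the one recalled above could be decoded with an `object_hook`, roughly as sketched here. The `$type`/`value` shape is taken from this message as written and is not claimed to match any real database's format; `HINTED_TYPES` and `decode_hinted` are hypothetical names.

```python
import json
from datetime import datetime

# Hypothetical registry: "$type" hint -> constructor for the "value" string.
HINTED_TYPES = {
    "JSONDate": lambda text: datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z"),
}


def decode_hinted(obj):
    """object_hook: replace {"$type": ..., "value": ...} wrappers with objects."""
    if set(obj) == {"$type", "value"} and obj["$type"] in HINTED_TYPES:
        return HINTED_TYPES[obj["$type"]](obj["value"])
    return obj


text = '{"now": {"$type": "JSONDate", "value": "2018-11-03T09:52:20-0400"}}'
decoded = json.loads(text, object_hook=decode_hinted)
print(decoded["now"].isoformat())
```

The hook only fires on objects that carry the hint, so ordinary strings are never guessed at; the cost is that both sides must share the registry.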
jsondate, for example, supports both .load[s]() and .dump[s](), but only
for UTC datetimes:
https://github.com/rconradharris/jsondate/blob/master/jsondate/__init__.py
UTC is only sometimes a fair assumption; otherwise it's dangerous to assume
that timezone-naive [ISO8601] strings represent UTC datetimes. In that
respect - aside from readability - arbitrary-precision POSIX timestamps are
less error-prone.
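A small sketch of the hazard, using only the standard library: the same wall-clock text with and without an offset. The sample strings are assumptions picked for illustration.

```python
from datetime import datetime, timezone

naive = datetime.fromisoformat("2018-11-03T09:52:20")
aware = datetime.fromisoformat("2018-11-03T09:52:20-04:00")

# A naive value carries no offset, so any reader has to guess one.
has_offset = naive.tzinfo is not None

# Treating the naive value as UTC is an explicit choice, not a fact about the data.
assumed_utc = naive.replace(tzinfo=timezone.utc)

# The aware value pins down one instant; the guess is off by the real offset.
gap_seconds = aware.timestamp() - assumed_utc.timestamp()
print(has_offset, gap_seconds)
```

Here the UTC guess lands four hours away from the instant the offset-bearing string describes.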
On Sun, Nov 4, 2018 at 1:00 AM David Shawley
Very good point. The JSON document type only supports object and array literals, numbers, strings, Boolean literals, and null. My suggestion was specifically to provide an extensible mechanism for encoding arbitrary objects into the supported primitives.
Okay, so to clarify: We currently have a mechanism for custom encoders and decoders, which you have to specify as you're thinking about encoding. But you're proposing having the core json.dumps() allow objects to customize their own representation.

Sounds like a plan, and not even all that complex a plan.

ChrisA
David Shawley wrote:
I'm +1 on adding support for serializing datetime.date and datetime.datetime *but* I'm -1 on automatically deserializing anything that looks like a ISO-8601 in json.load*. The asymmetry is the only thing that kept me from bringing this up previously.
This asymmetry bothers me too. It makes me think that datetime handling belongs at a different level of abstraction, something that knows about the structure of the data being serialised or deserialised.

Java's JSON libraries have a mechanism where you can give it a class and a lump of JSON and it will figure out from runtime type information what to do. It seems like we should be able to do something similar using type annotations.

--
Greg
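One way this could look in Python, as a sketch only: a dataclass supplies the annotations, and a helper converts each field according to its declared type. `CONVERTERS`, `Release`, and `load_as` are hypothetical names; nothing like this exists in the json module.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import get_type_hints

# Hypothetical converters, keyed by annotated type.
CONVERTERS = {date: date.fromisoformat}


@dataclass
class Release:
    name: str
    day: date


def load_as(cls, text):
    """Decode a JSON object into cls, converting fields by their annotations."""
    raw = json.loads(text)
    hints = get_type_hints(cls)
    converted = {
        field: CONVERTERS.get(hints.get(field), lambda value: value)(value)
        for field, value in raw.items()
    }
    return cls(**converted)


release = load_as(Release, '{"name": "example", "day": "2018-11-03"}')
print(release)
```

The caller names the target class, so the decoder never has to guess whether a string is meant to be a date.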
On Nov 3, 2018, at 5:43 PM, Greg Ewing
David Shawley wrote:
I'm +1 on adding support for serializing datetime.date and datetime.datetime *but* I'm -1 on automatically deserializing anything that looks like a ISO-8601 in json.load*. The asymmetry is the only thing that kept me from bringing this up previously.
This asymmetry bothers me too. It makes me think that datetime handling belongs at a different level of abstraction, something that knows about the structure of the data being serialised or deserialised.
Java's JSON libraries have a mechanism where you can give it a class and a lump of JSON and it will figure out from runtime type information what to do. It seems like we should be able to do something similar using type annotations.
I was thinking about trying to do something similar to what golang has done in their JSON support [1]. It is similar to what I would have done with JAXB when I was still doing Java [2]. In both cases you have a type explicitly bound to a JSON blob. The place to make this sort of change might be in the JSONDecoder and JSONEncoder classes.

Personally, I would place this sort of serialization logic outside of the Standard Library -- maybe following the pattern that the rust community adopted on this very issue. In short, they separated serialization & de-serialization into a free-standing library. The best discussion that I have found is a reddit thread [3]. The new library that they built is called serde [4] and the previous code is in their deprecated library section [5].

The difference between the two approaches is that golang simply annotates the types similar to what I would expect to happen in the Python case. Then you are required to pass a list of types into the deserializer so it knows which types are candidates for deserialization. The rust and JAXB approaches rely on type registration into the deserialization framework.

We could probably use type annotations to handle the asymmetry provided that we change the JSONDecoder interface to accept a list of classes that are candidates for deserialization or something similar. I would place this outside of the Standard Library as a generalized serialization / de-serialization framework since it feels embryonic to me. This could be a new implementation for CSV and pickle as well.

Bringing the conversation back around, I'm going to continue adding a simple JSON formatting protocol that is asymmetric since it does solve a need that I and others have today. I'm not completely sure what the best way to move this forward is. I have most of an implementation working based on a simple protocol of one method. Should I:

1. Open a BPO and continue the discussion there once I have a working prototype?
2. Continue the discussion here?
3. Move the discussion to python-dev under a more appropriate subject?

cheers, dave.

[1]: https://golang.org/pkg/encoding/json/#Marshal
[2]: https://docs.oracle.com/javaee/6/tutorial/doc/gkknj.html#gmfnu
[3]: https://www.reddit.com/r/rust/comments/3v4ktz/differences_between_serde_and_rustc_serialize/
[4]: https://serde.rs
[5]: https://github.com/rust-lang-deprecated/rustc-serialize

--
"Syntactic sugar causes cancer of the semicolon" - Alan Perlis
Here's a JSONEncoder subclass with a default method that checks variable
types in a defined sequence that includes datetime:
https://gist.github.com/majgis/4200488
Passing an ordered map of (Type, fn) may or may not be any more readable
than simply subclassing JSONEncoder and defining .default().
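For comparison, the ordered-map variant might look like the sketch below. It is written for this message rather than copied from the gist, and `ENCODERS` and `encode_by_type` are hypothetical names.

```python
import json
import uuid
from datetime import date, datetime

# Order matters: datetime is a subclass of date, so it must be checked first.
ENCODERS = (
    (datetime, lambda value: value.isoformat(timespec="seconds")),
    (date, lambda value: value.isoformat()),
    (uuid.UUID, str),
)


def encode_by_type(obj):
    """default= hook: use the first entry whose type matches obj."""
    for kind, convert in ENCODERS:
        if isinstance(obj, kind):
            return convert(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


encoded = json.dumps(
    [datetime(2018, 11, 4, 12, 30), date(2018, 11, 4)],
    default=encode_by_type,
)
print(encoded)
```

The table keeps the type-to-function pairs in one place, but the ordering constraint is easy to miss, which is part of why it may not be more readable than a `.default()` method.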
On Fri, Nov 2, 2018 at 12:17 PM, Wes Turner
JSON5 supports comments in JSON. https://github.com/json5/json5/issues/3
and other nifty things -- any plans to support JSON5 in the stdlib json library? I think that would be great. -CHB
participants (10)
- Calvin Spealman
- Chris Angelico
- Chris Barker
- David Shawley
- Greg Ewing
- M.-A. Lemburg
- Michael Selik
- Philip Martin
- Serhiy Storchaka
- Wes Turner