[Python-ideas] Re: A shortcut to load a JSON file into a dict : json.loadf

Sept. 15, 2020

json.load and json.dump already default to UTF-8 and already have parameters
for JSON loading and dumping.

json.loads and json.dumps exist only because there was no way to
distinguish between a string containing JSON and a file path string.
(They probably should've been .loadstr and .dumpstr, but it's too late for
that now)

TBH, I think it would be great to just have .load and .dump read the file
with standard params when a path-like ( hasattr(obj, '__fspath__') ) is
passed, but the suggested disadvantages of this are:

- https://docs.python.org/3/library/functions.html#open

   The default encoding is platform dependent (whatever
   locale.getpreferredencoding() returns), but any text encoding supported by
   Python can be used. See the codecs module for the list of supported
   encodings.

- .load and .dump don't default to UTF8?
  AFAIU, they do default to UTF-8. Or do they currently default to
  locale.getpreferredencoding() rather than the encoding from the JSON
  spec(s) (see below)?
  encoding= was removed from .loads, and it is not a documented parameter of
  json.load or json.dump in Python 3.
- .load and .dump would also need to accept an encoding= parameter for
  callers with non-spec data who don't want to keep handling the file
  themselves
  - pickle.load has an encoding= parameter
  - marshal.load does not have (and probably doesn't need?) an encoding=
    parameter
- What if you need to specify parameters for the file context manager?
  Accepting a path-like object should not break any existing code: you
  could always still open and close a file-like yourself.

  with open('file', 'rb') as _file:
      json.load(_file)

- Should we be using open(pth, 'rb') and open(pth, 'wb')? (Binary mode)

JSON Specs:
- https://tools.ietf.org/html/rfc7159#section-8.1  :

   JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32.  The default
   encoding is UTF-8, and JSON texts that are encoded in UTF-8 are
   interoperable in the sense that they will be read successfully by the
   maximum number of implementations; there are many implementations
   that cannot successfully read texts in other encodings (such as
   UTF-16 and UTF-32).

   Implementations MUST NOT add a byte order mark to the beginning of a
   JSON text.  In the interests of interoperability, implementations
   that parse JSON texts MAY ignore the presence of a byte order mark
   rather than treating it as an error.

- https://www.json.org/ >
http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
(PDF!)

   JSON syntax describes a sequence of Unicode code points. JSON also
   depends on Unicode in the hex numbers used in the \u escapement
   notation.

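To make the encoding and byte order mark points from the spec excerpts above concrete, here is a minimal sketch of how the json module behaves when it is handed bytes instead of text (Python 3.6 or later assumed). The payload and file names are invented for illustration only.

```python
import json
import tempfile
from pathlib import Path

payload = {"name": "café"}

with tempfile.TemporaryDirectory() as tmp:
    decoded = {}
    for encoding in ("utf-8", "utf-16", "utf-32"):
        path = Path(tmp) / f"sample-{encoding}.json"
        # ensure_ascii=False keeps the non-ASCII character in the file itself.
        path.write_text(json.dumps(payload, ensure_ascii=False), encoding=encoding)
        # Binary mode: json.load gets bytes and detects UTF-8/16/32 on its own.
        with open(path, "rb") as fp:
            decoded[encoding] = json.load(fp)

# A BOM inside an already-decoded str is rejected by json.loads ...
try:
    json.loads("\ufeff" + json.dumps(payload))
    bom_str_rejected = False
except json.JSONDecodeError:
    bom_str_rejected = True

# ... while a UTF-8 BOM at the start of bytes input is skipped.
bom_bytes_result = json.loads(b"\xef\xbb\xbf" + json.dumps(payload).encode("utf-8"))
```

Note that this only covers reading: json.dump writes str chunks, so it still needs a text-mode file (or the caller has to encode the output of json.dumps).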
So, could we just have .load and .dump accept a path-like and an encoding=
parameter (because they need to be able to specify UTF-8 / UTF-16 / UTF-32
anyway)?
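
For the sake of discussion, the idea could be sketched roughly like this. The names load_json and dump_json are hypothetical helpers, not part of the json module, and treating every str as a path is an assumption of this sketch (it is exactly the string-vs-path ambiguity mentioned earlier).

```python
import json
import os
import tempfile


def load_json(source, *, encoding="utf-8", **kwargs):
    """Hypothetical helper: accept a path-like or an already-open file."""
    if isinstance(source, (str, os.PathLike)):
        # Path given: open it ourselves, defaulting to UTF-8 per the JSON spec.
        with open(source, "r", encoding=encoding) as fp:
            return json.load(fp, **kwargs)
    # Anything else is treated as a file-like object, as json.load does today.
    return json.load(source, **kwargs)


def dump_json(obj, target, *, encoding="utf-8", **kwargs):
    """Hypothetical helper: write to a path-like or an already-open text file."""
    if isinstance(target, (str, os.PathLike)):
        with open(target, "w", encoding=encoding) as fp:
            json.dump(obj, fp, **kwargs)
    else:
        json.dump(obj, target, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.json")
    dump_json({"greeting": "héllo"}, path, ensure_ascii=False)
    from_path = load_json(path)
    # Existing code that opens the file itself keeps working unchanged.
    with open(path, encoding="utf-8") as fp:
        from_file = load_json(fp)
```

With a path the helper picks the encoding; with an open file the caller already did, so encoding= is silently ignored in that branch.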

On Tue, Sep 15, 2020 at 3:22 AM Stephen J. Turnbull <
turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:

Joao S. O. Bueno writes:

   If .load and .dump are super-charged, people coding with these
   methods in mind have _one_ less thing to worry about: if the
   method accepts a path or an open file becomes irrelevant.

But then you either lose the primary benefit of this three line
function (defaulting to the UTF-8 encoding to conform to the JSON
standard), or you have a situation where what encoding you get can
depend on whether you use the name of a file or that file already
opened.

I consider that worse because it's precisely the kind of thing that
people *don't* worry about and *do* have some difficulty debugging.
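
The debugging hazard described here can be illustrated with a small sketch. It assumes a UTF-8 file that is read once as UTF-8 and once through a file object opened with cp1252, which stands in for a non-UTF-8 locale default; the file name and payload are made up.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.json"
    # ensure_ascii=False so the file really contains non-ASCII UTF-8 bytes.
    path.write_text(json.dumps({"city": "Zürich"}, ensure_ascii=False),
                    encoding="utf-8")

    # Reading with the encoding the JSON spec expects.
    with open(path, encoding="utf-8") as fp:
        as_utf8 = json.load(fp)

    # Same file, but the file object was opened with a different encoding,
    # standing in for a platform whose locale default is not UTF-8.
    with open(path, encoding="cp1252") as fp:
        as_cp1252 = json.load(fp)

# No exception is raised in either case; the second result is just wrong.
same_result = as_utf8 == as_cp1252
```

Both calls succeed, which is why the mismatch tends to surface far from where the file was opened.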

Wes Turner