On Tue, Sep 15, 2020 at 9:09 AM Wes Turner <wes.turner@gmail.com> wrote:
json.load and json.dump already default to UTF8 and already have parameters for json loading and dumping.
So it turns out that loads(), which optionally takes a bytes or bytearray object, tries to determine whether the encoding is UTF-8, UTF-16, or UTF-32 (the ones allowed by the standard) (thanks Guido for the pointer). And load() calls loads(), so it should work with binary mode files as well.
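A quick way to check that behaviour (a minimal sketch; the sample data is made up for illustration) is to encode the same JSON text several ways and pass the raw bytes to loads(), then do the same through load() with an in-memory binary stream standing in for a binary-mode file:

```python
import io
import json

# Sample payload, invented for this illustration.
data = {"name": "café", "n": 1}
text = json.dumps(data, ensure_ascii=False)

# loads() accepts bytes and works out the encoding itself,
# with or without a byte order mark.
decoded = {}
for enc in ("utf-8", "utf-16", "utf-16-le", "utf-32", "utf-32-be"):
    decoded[enc] = json.loads(text.encode(enc))

# load() reads the stream and hands the result to loads(),
# so a binary stream works as well.
from_binary_stream = json.load(io.BytesIO(text.encode("utf-16")))
```

Every entry in decoded, and from_binary_stream, should compare equal to the original data.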
Currently, dump() simply uses the fp passed in, and it doesn't support binary files, so it'll use the encoding the user set (or the default, if not set, which is an issue here). dumps() returns a string, so no encoding there.
So I was not correct: dump() does not default to UTF-8 (and does not accept an encoding= parameter).
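To illustrate the point about dump() (a small sketch using in-memory streams in place of real files): writing to a binary stream fails because dump() writes str chunks, while a text stream encodes them with whatever encoding that stream was opened with.

```python
import io
import json

data = {"name": "café"}

# dump() calls fp.write() with str chunks, so a binary stream rejects them.
try:
    json.dump(data, io.BytesIO())
    binary_ok = True
except TypeError:
    binary_ok = False

# With a text stream, the bytes on disk depend on the stream's own encoding,
# not on anything dump() chooses.
buffer = io.BytesIO()
with io.TextIOWrapper(buffer, encoding="utf-8") as text_stream:
    json.dump(data, text_stream, ensure_ascii=False)
    text_stream.flush()
    written = buffer.getvalue()
```

Here the encoding is set explicitly on the wrapper; with a plain open(path, 'w') and no encoding argument, the platform default would be used instead.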
I think dumpf() should use UTF-8, and that's it. If anyone really wants something else, they can get it by providing an open text file object.
Why would we impose UTF-8 when the spec says UTF-8, UTF-16, or UTF-32?
How could this be improved? (I'm on my phone, so)
from json import dump, load

def dumpf(obj, path, *, encoding='utf8', **kwargs):
    # open() accepts str and os.PathLike paths; encoding is consumed here
    # because json.dump() has no encoding= parameter
    with open(path, 'w', encoding=encoding) as _file:
        return dump(obj, _file, **kwargs)

def loadf(path, *, encoding='utf8', **kwargs):
    with open(path, encoding=encoding) as _file:
        return load(_file, **kwargs)
loadf(), on the other hand, is a bit tricky -- it could allow only UTF-8, but it seems it would be more consistent (and easy to do) to open the file in binary mode and use the existing code to determine the encoding.
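A rough sketch of that binary-mode approach follows. The loadf() here is the hypothetical helper under discussion, not an existing json function, and the file names and sample data are made up for the test:

```python
import json
import os
import tempfile


def loadf(path, **kwargs):
    # Hypothetical helper, not part of the json module.
    # Binary mode hands bytes to json.load(), so the existing detection
    # in json.loads() picks UTF-8, UTF-16 or UTF-32 by itself.
    with open(path, "rb") as fp:
        return json.load(fp, **kwargs)


data = {"greeting": "héllo"}
results = {}
with tempfile.TemporaryDirectory() as tmp:
    for enc in ("utf-8", "utf-16", "utf-32"):
        path = os.path.join(tmp, f"sample-{enc}.json")
        # Write each file through a text-mode handle in the given encoding.
        with open(path, "w", encoding=enc) as fp:
            json.dump(data, fp, ensure_ascii=False)
        results[enc] = loadf(path)
```

With this shape loadf() needs no encoding= parameter at all, which is the consistency argument made above.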
-CHB
>> The Python JSON implementation should support the full JSON spec (including UTF-8, UTF-16, and UTF-32) and should default to UTF-8.
Turns out it does already, and no one is suggesting changing that.
Anyway -- if anyone wants to push for overloading load()/dump(), rather than making two new loadf() and dumpf() functions, then speak now -- that will take more discussion, and maybe a PEP.
I don't see why one or the other would need a PEP so long as the new functionality is backward-compatible?
I'm just putting my finger in the wind. No need for a PEP if it's simple and non-controversial, but if even the few folks on this thread don't agree on the API we want, then it's maybe too controversial -- so either more discussion, to come to consensus, or a PEP.
Or not -- we can see what the core devs say if/when someone does a bpo / PR.
-CHB
--
Christopher Barker, PhD
Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython