[issue10038] Returntype of json.loads() on strings
New submission from Nik Tautenhahn <nik@livinglogic.de>:

Hi,

before 2.7, an

    import json
    json.loads('"abc"')

yielded u"abc". In 2.7 I get "abc" (a byte string). I would have expected an entry in "news" or "What's new in 2.7" explaining why this change happened. In addition, all examples at http://docs.python.org/library/json are wrong for Python 2.7 if json.loads is involved.

Any insight on this?

best regards,
Nik

----------
assignee: docs@python
components: Documentation, Library (Lib)
messages: 118069
nosy: docs@python, llnik
priority: normal
severity: normal
status: open
title: Returntype of json.loads() on strings
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue10038>
_______________________________________
Changes by Antoine Pitrou <pitrou@free.fr>:

----------
assignee: docs@python -> bob.ippolito
nosy: +bob.ippolito
Fred L. Drake, Jr. <fdrake@acm.org> added the comment:

This is related to this issue from simplejson:

http://code.google.com/p/simplejson/issues/detail?id=28

This problem is why I still use simplejson 1.x; moving forward to simplejson 2.x or Python's json is unlikely.

----------
nosy: +fdrake
Nik Tautenhahn <nik@livinglogic.de> added the comment:

Well, then at least the documentation and the "What's changed" need to be updated. Furthermore, if such decisions are made, it would at least be nice to have some general "decode-hook" for json.JSONDecoder. The "object_hook" is only used for dict-objects. Why is there no hook for strings, or a general hook which is used on any object?

----------
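To illustrate the point about object_hook, here is a minimal sketch. The `record` helper and the sample documents are made up for this example and are not part of the report. It only shows that the hook is invoked for decoded JSON objects (dicts), so a bare string or a string inside a list never reaches it.

```python
import json

seen = []


def record(obj):
    """Hypothetical hook: remember every value the decoder hands us."""
    seen.append(obj)
    return obj


# A top-level string: the hook is never called.
json.loads('"abc"', object_hook=record)

# A list holding a string and an object: only the object reaches the hook.
json.loads('["abc", {"key": "value"}]', object_hook=record)

print(seen)
```

Anything that needs to touch strings therefore has to happen outside the hook, for instance in a pass over the decoded result.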
Fred L. Drake, Jr. <fdrake@acm.org> added the comment:

As I understand it, the decision to return str instead of unicode values for the "simplejson" module was simply inherited by the standard library. As such, it still needs to be evaluated in the context of the standard library, because of the incompatibility it introduces.

I still maintain that it's a bug, and should be treated as such.

----------
Nik Tautenhahn <nik@livinglogic.de> added the comment:

Yep, the solution should not be "maybe it's str, maybe it's unicode" - I mean, if the decoder gives you a str if there are no fancy characters and unicode if it contains some, this might lead to some confusion...

And yes, in my opinion, this is a bug, too.

----------
Barry A. Warsaw <barry@python.org> added the comment:

I completely agree with Fred; this is a regression and a bug in Python 2.7 and should be fixed. I have a doctest in Mailman 3 for example that cannot pass in both Python 2.6 and 2.7 (without IMO ugly hackery).

Not only that, but json is documented as converting JSON str to unicode, which it does fine in Python 2.6, 3.1 and 3.2. Why should Python 2.7 be different (and broken)?

----------
nosy: +barry
Fred L. Drake, Jr. <fdrake@acm.org> added the comment:

I'll note that it seems relevant that this package is not considered "externally maintained" by the terms of PEP 360:

http://www.python.org/dev/peps/pep-0360/

Given the level of attention this has received from the originator of the code, we should not hesitate to commit technically acceptable changes to the Python repository.

----------
title: json.loads() on str should return unicode, not str -> Returntype of json.loads() on strings
Changes by Barry A. Warsaw <barry@python.org>:

----------
title: Returntype of json.loads() on strings -> json.loads() on str erroneously returns str. should return unicode
Changes by Barry A. Warsaw <barry@python.org>:

----------
title: json.loads() on str erroneously returns str. should return unicode -> json.loads() on str should return unicode, not str
Changes by Fred L. Drake, Jr. <fdrake@acm.org>:

----------
title: Returntype of json.loads() on strings -> json.loads() on str should return unicode, not str
Antoine Pitrou <pitrou@free.fr> added the comment:

+1 for fixing this in-tree. We need a patch, though ;)

----------
nosy: +pitrou
Nik Tautenhahn <nik@livinglogic.de> added the comment:

There is even more inconsistency here. As already mentioned, we have this:

>>> import json
>>> json.loads(json.dumps("abc"))
'abc'

If, however, I am evil and hide _json.so (which is the C-part of the json module for speedup), the JSON code falls back to its python implementation and voila:

>>> import json
>>> json.loads(json.dumps("abc"))
u'abc'

Not so neat, if your fallback is not a fallback but shows such different behaviour.

----------
Alternately, the Python implementation may be thought of as definitive and the optimizations are broken.
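The C/Python mismatch can also be checked without hiding _json.so. This is a rough sketch, not taken from the thread: it assumes the module-level names `py_scanstring` and `c_scanstring` in CPython's `json.decoder`, which are implementation details rather than documented API, and the helper `string_types` is made up for illustration. It asks each string scanner for the type it produces.

```python
import json.decoder as decoder


def string_types(document):
    """Hypothetical helper: report the type each scanstring variant returns.

    `document` must be a JSON string literal such as '"abc"'; scanstring
    expects the index just past the opening quote, hence the 1.
    """
    results = {}
    py_value, _ = decoder.py_scanstring(document, 1)
    results["python"] = type(py_value).__name__
    # c_scanstring is None when the _json extension is unavailable.
    if decoder.c_scanstring is not None:
        c_value, _ = decoder.c_scanstring(document, 1)
        results["c"] = type(c_value).__name__
    return results


types = string_types('"abc"')
print(types)
```

On an interpreter where the two implementations agree, both entries name the same type; on an affected 2.7 build, the session above suggests they would differ.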
Walter Dörwald <walter@livinglogic.de> added the comment:

The following patch (against the release27-maint branch) seems to fix the problem.

----------
keywords: +patch
nosy: +doerwalter
Added file: http://bugs.python.org/file19468/json.diff
Barry A. Warsaw <barry@python.org> added the comment:

The fact that the C and Python versions are not fully tested (afaict) is not good. I'm not sure that's worth fixing for 2.7 and it's probably worth a separate bug report for Python 3.2 on that. In the meantime, I'll test Walter's patch and add a unit test for this case.

----------
Fred L. Drake, Jr. <fdrake@acm.org> added the comment:

The incomplete testing and C/Python implementation mismatch are covered by issue 5723 and issue 9233.

----------
Raymond Hettinger <rhettinger@users.sourceforge.net> added the comment:

To mitigate possible negative impacts from changing the return type, consider adding a parse_string hook that lets users control the return type:

    json.loads(f, parse_int=decimal.Decimal, parse_string=repr)

----------
nosy: +rhettinger
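Note that parse_string is only a proposal in this thread, not an argument json.loads accepts, whereas parse_int does exist. A similar effect can be approximated with a pass over the decoded result. The sketch below is one possible way, written for Python 3 (on 2.x the isinstance check would need to cover unicode as well); the helper name `map_strings` and the sample document are invented for illustration.

```python
import decimal
import json


def map_strings(value, convert):
    """Hypothetical helper: apply convert() to every string in decoded JSON.

    Walks lists and dicts recursively (dict keys included) and leaves
    numbers, booleans and None untouched.
    """
    if isinstance(value, str):
        return convert(value)
    if isinstance(value, list):
        return [map_strings(item, convert) for item in value]
    if isinstance(value, dict):
        return {
            map_strings(key, convert): map_strings(item, convert)
            for key, item in value.items()
        }
    return value


# parse_int is an existing json.loads argument; the string conversion is
# applied afterwards because no string hook exists.
decoded = json.loads('{"name": "abc", "count": 3}', parse_int=decimal.Decimal)
converted = map_strings(decoded, str.upper)
print(converted)
```

The trade-off is a second traversal of the data, which a decoder-level hook would avoid.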
Barry A. Warsaw <barry@python.org> added the comment:

Adding that argument to Python 2.7 seems like new feature territory.

----------
Barry A. Warsaw <barry@python.org> added the comment:

@doerwalter: patch looks good. I've added a test and will commit momentarily.

----------
Barry A. Warsaw <barry@python.org> added the comment:

r86126

----------
assignee: bob.ippolito -> barry
resolution: -> fixed
status: open -> closed
participants (7)
- Antoine Pitrou
- Barry A. Warsaw
- Fred Drake
- Fred L. Drake, Jr.
- Nik Tautenhahn
- Raymond Hettinger
- Walter Dörwald