[New-bugs-announce] [issue33255] json.dumps has different behaviour if encoding='utf-8' or encoding='utf8'

Nicolás Hatcher report at bugs.python.org
Tue Apr 10 05:21:14 EDT 2018


New submission from Nicolás Hatcher <nicoliere at gmail.com>:

Hey I'm new here, so please let me know what incorrect things I am doing!

I _think_ `json.dumps(o, ensure_ascii=False)` is doing the wrong thing when `o` has both unicode and str keys/values. For instance:

```
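# -*- coding: utf-8 -*-
import json
# Python 2.7: a unicode key mixed with byte-str keys/values, one of them non-ASCII
o = {u"greeting": "hi", "currency": "€"}
json.dumps(o, ensure_ascii=False, encoding="utf8")  # works
json.dumps(o, ensure_ascii=False)  # fails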
```

The first `dumps` will work while the second will fail. The reason is:

https://github.com/python/cpython/blob/2.7/Lib/json/encoder.py#L198

This will decode any str only if the encoding is not exactly the string 'utf-8' (the default). With the default nothing is decoded, so in the mixed case (unicode and str) this will blow up. A workaround is to pass any of the aliases for 'utf-8', like 'utf8' or 'u8'.
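
To make the alias point concrete, here is a minimal sketch. It is not the encoder's code: the two helper names are made up for illustration, and I am only assuming that comparing the raw string is what makes 'utf-8' behave differently from 'utf8' and 'u8'. The codec registry itself treats all three as the same codec:

```python
import codecs

def matches_default_by_string(encoding):
    # Hypothetical helper: a literal string comparison, which is
    # the kind of check that tells 'utf-8' and 'utf8' apart.
    return encoding == 'utf-8'

def matches_default_by_codec(encoding):
    # Hypothetical helper: resolve aliases through the codec registry
    # first, so every spelling of UTF-8 compares equal.
    return codecs.lookup(encoding).name == codecs.lookup('utf-8').name

for name in ('utf-8', 'utf8', 'u8'):
    print(name, matches_default_by_string(name), matches_default_by_codec(name))
```

Only the first spelling passes the string comparison, while all three pass the registry-based one. I am not claiming this is how a fix should look, it just shows why the aliases take a different code path.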

I would be crazy happy to provide a PR if this is really an issue.
Let me know if extra clarification is needed.
Nicolás

----------
components: Unicode
messages: 315164
nosy: ezio.melotti, nhatcher, vstinner
priority: normal
severity: normal
status: open
title: json.dumps has different behaviour if encoding='utf-8' or encoding='utf8'
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue33255>
_______________________________________

