[pypy-issue] Issue #3044: py3.6: str.encode should not work with not-text encoders (pypy/pypy)

Zsolt Cserna issues-reply at bitbucket.org
Sat Jul 13 11:04:58 EDT 2019


New issue 3044: py3.6: str.encode should not work with not-text encoders
https://bitbucket.org/pypy/pypy/issues/3044/py36-strencode-should-not-work-with-not

Zsolt Cserna:

With CPython, calling `str.encode` with a non-text encoder such as “hex” raises `LookupError`:

```
>>> "foo".encode("hex")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs
```

In PyPy the call reaches the codec itself, which raises `TypeError` (correct in itself, since this codec operates on bytes, not on unicode):

```
>>>> "foo".encode("hex")
Traceback (most recent call last):
  File "/home/zsolt/src/pypy/lib-python/3/encodings/hex_codec.py", line 15, in hex_encode
    return (binascii.b2a_hex(input), len(input))
TypeError: a bytes-like object is required, not str
```

The root cause is that `str.encode` does not call the `lookup_text_codec()` function when it looks up a codec. As a result, the CodecInfo's `_is_text_encoding` is never checked, because `str.encode` uses `codecs.encode` under the hood.

The solution would be to add a check of `_is_text_encoding` somewhere between `str.encode` and the call to the encoder.
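
To illustrate the intended behaviour, here is a minimal pure-Python sketch of such a check. It is not PyPy's actual implementation: the helper names `checked_text_lookup` and `text_encode` are hypothetical, and the sketch assumes only that `codecs.lookup()` returns a CodecInfo carrying the private `_is_text_encoding` attribute mentioned above.

```python
import codecs


def checked_text_lookup(encoding):
    """Hypothetical helper: look up a codec and reject non-text ones."""
    info = codecs.lookup(encoding)
    # _is_text_encoding is a private CodecInfo attribute; treat a missing
    # attribute as "text encoding" so third-party codecs keep working.
    if not getattr(info, "_is_text_encoding", True):
        raise LookupError(
            "%r is not a text encoding; "
            "use codecs.encode() to handle arbitrary codecs" % encoding
        )
    return info


def text_encode(s, encoding="utf-8", errors="strict"):
    """Hypothetical stand-in for str.encode with the check applied first."""
    info = checked_text_lookup(encoding)
    data, _consumed = info.encode(s, errors)
    return data
```

With this check in place, `text_encode("foo", "hex")` fails with `LookupError` before the codec is ever called, while `codecs.encode(b"foo", "hex")` remains available for bytes-to-bytes codecs.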



