[pypy-issue] Issue #1810: PyPy3: `os.strerror` uses wrong encoding with non-US locale (pypy/pypy)

Yichao Yu issues-reply at bitbucket.org
Fri Jul 4 13:04:29 CEST 2014


New issue 1810: PyPy3: `os.strerror` uses wrong encoding with non-US locale
https://bitbucket.org/pypy/pypy/issue/1810/pypy3-osstrerror-uses-wrong-encoding-with

Yichao Yu:

First of all, it seems that pypy3 calls `setlocale()` on initialization, which is not a big issue but differs from all the other versions.

Moreover, `os.strerror` returns a wrong unicode string for non-ASCII error messages.

With the following script and the `zh_CN.utf8` locale:
```python
import os
from cffi import FFI

ffi = FFI()
ffi.cdef('''
void _setlocale();
char *strerror(int);
''')
lib = ffi.verify('''
#include <locale.h>
#include <string.h>
void
_setlocale()
{
    setlocale(LC_ALL, "");
}
''', extra_compile_args=['-w'])

err = os.strerror(21)
print(err)
if not isinstance(err, bytes):
    print(err.encode('utf8'))
print(ffi.string(lib.strerror(21)).decode('utf8'))

lib._setlocale()

err = os.strerror(21)
print(err)
if not isinstance(err, bytes):
    print(err.encode('utf8'))
print(ffi.string(lib.strerror(21)).decode('utf8'))
```

The output on pypy(2) and cpython2 is:
```
Is a directory
Is a directory
是一个目录
是一个目录
```

On cpython3:
```
Is a directory
b'Is a directory'
Is a directory
是一个目录
b'\xe6\x98\xaf\xe4\xb8\x80\xe4\xb8\xaa\xe7\x9b\xae\xe5\xbd\x95'
是一个目录
```

On pypy3:
```
���������������
b'\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd'
是一个目录
���������������
b'\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd'
是一个目录
```
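
The pypy3 output above contains 15 replacement characters, which matches the 15 UTF-8 bytes of the expected message. The following minimal sketch reproduces that symptom in plain Python. It assumes the raw bytes are being decoded with an ASCII-like codec that substitutes U+FFFD for every byte it cannot handle; this is only a guess that fits the observed output, not a confirmed description of what pypy3 does internally.

```python
# UTF-8 bytes of the translated message, copied from the cpython3 output above.
raw = b'\xe6\x98\xaf\xe4\xb8\x80\xe4\xb8\xaa\xe7\x9b\xae\xe5\xbd\x95'

# Decoding with the locale's encoding (UTF-8 under zh_CN.utf8) gives the
# expected message.
expected = raw.decode('utf-8')

# Assumption: decoding as ASCII with replacement mimics the symptom.
# Every byte is >= 0x80, so each one becomes a single U+FFFD.
garbled = raw.decode('ascii', errors='replace')

print(expected)
print(len(raw), len(garbled))
print(garbled.encode('utf-8'))
```

If this guess is right, decoding the result of the C `strerror()` with the locale's encoding instead would give the same string that cpython3 prints.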

I've also checked that the libc translation files are valid:
```
$ TEXTDOMAIN=libc gettext 'Is a directory'
是一个目录
```




