Improving the reading part of REPL
It seems that there will be some refactoring of the tokenizer code. In this context, I'd like to recall my proposal on readline hooks. It would be nice if the char*-based PyOS_Readline API were replaced by a Python str-based hook customizable from Python code. I propose adding a function sys.readlinehook that accepts an optional prompt and returns a line read interactively from the user. There would also be sys.__readlinehook__ holding the original value of sys.readlinehook (similarly to sys.(__)displayhook(__), sys.(__)excepthook(__) and sys.(__)std(in/out/err)(__)).

Currently, interactive input is read from C stdin even if sys.stdin is changed (see http://bugs.python.org/issue17620). This complicates fixing http://bugs.python.org/issue1602: the standard sys.std* streams are not capable of communicating in Unicode with the Windows console, and replacing the streams with custom ones is not enough. One also has to install a custom readline hook, which is currently complicated. And even after installing a custom readline hook, one finds out that the Python tokenizer cannot handle UTF-16, so one has to wrap the custom stream objects just to give their encoding attribute a different value, because the readline hook currently returns char* rather than a Python string. For more details see the documentation of my package: https://github.com/Drekin/win-unicode-console.

The pyreadline package also sets up a custom readline hook, so it would benefit if doing so were easier. Moreover, the two consumers of the PyOS_Readline API, the input function and the tokenizer, assume different encodings for the bytes returned by the readline hook. Effectively, one assumes sys.stdout.encoding and the other sys.stdin.encoding, so if these two differ, there is no way to implement a correct readline hook.

If sys.readlinehook were added, the built-in input function would be just a thin wrapper over sys.readlinehook that removes the trailing newline character and turns no input into EOFError. I think that the best default value for sys.readlinehook on Windows would be stdio_readline: just write the prompt to sys.stdout and read a line from sys.stdin. On Linux, the default implementation would call GNU readline if it is available and sys.stdin and sys.stdout are standard TTYs (the check present in the current implementation of the input function), and it would call stdio_readline otherwise.

Regards, Adam Bartoš
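To make the proposal concrete, here is a minimal pure-Python sketch of the stdio_readline default and of input as a thin wrapper over the hook. It rests on assumptions: sys.readlinehook does not exist in CPython, so the module-level name readlinehook stands in for it; hooked_input and run_with_fake_console are hypothetical helper names; and the GNU readline branch described for Linux is not modelled.

```python
import io
import sys


def stdio_readline(prompt=""):
    """Write the prompt to sys.stdout and read one line from sys.stdin.

    The streams are looked up at call time, so replacing sys.stdin or
    sys.stdout (for example with Unicode-capable console streams) takes effect.
    """
    sys.stdout.write(prompt)
    sys.stdout.flush()
    return sys.stdin.readline()  # returns "" at end of input


# Hypothetical stand-in for the proposed sys.readlinehook, which does not
# exist in CPython; a module-level variable plays its role here.
readlinehook = stdio_readline


def hooked_input(prompt=""):
    """Sketch of input() as a thin wrapper over the hook."""
    line = readlinehook(prompt)
    if not line:
        # No input at all means end of file.
        raise EOFError("EOF when reading a line")
    # Strip only the single trailing newline.
    return line[:-1] if line.endswith("\n") else line


def run_with_fake_console(text, func, *args):
    """Call func with sys.stdin/sys.stdout temporarily replaced by StringIO."""
    saved = sys.stdin, sys.stdout
    sys.stdin, sys.stdout = io.StringIO(text), io.StringIO()
    try:
        result = func(*args)
        return result, sys.stdout.getvalue()
    finally:
        sys.stdin, sys.stdout = saved
```

With these definitions, run_with_fake_console("α\n", hooked_input, ">>> ") returns the pair ("α", ">>> "): the hook deals only in str, so no byte encoding has to be agreed on between the hook and its callers.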
Another issue with the current implementation is http://bugs.python.org/issue24829. Even if I fix my Python environment with win_unicode_console, so that >>> "α" really results in "α" rather than "?", the fix vanishes as soon as I redirect stdout.

On Thu, Nov 19, 2015 at 10:50 PM, Adam Bartoš <drekin@gmail.com> wrote:
[full quote of the original message snipped; identical to the text above]