[Python-Dev] PEP 528: Change Windows console encoding to UTF-8

Paul Moore p.f.moore at gmail.com
Mon Sep 5 16:19:46 EDT 2016


On 5 September 2016 at 20:34, eryk sun <eryksun at gmail.com> wrote:
> Paul, do you have example code that uses the 'raw' stream? Using the
> buffer should behave as it always has -- at least in this regard.
> sys.stdin.buffer requests a large block, such as 8 KB. But since the
> console defaults to a cooked mode (i.e. processed input and line input
> -- control keys, command-line editing, input history, and aliases),
> ReadConsole returns when enter is pressed or when interrupted. It
> returns at least '\r\n', unless interrupted by Ctrl+C, Ctrl+Break or a
> custom CtrlWakeup key. However, if line-input mode is disabled,
> ReadConsole returns as soon as one or more characters are available in
> the input buffer.

The code I'm looking at doesn't use the raw stream (I think). The
problem I had (and the reason I was concerned) is that the code does
some rather messy things, and without tracing back through the full
code path, I'm not 100% sure *what* level of stream it's using.
However, now that I know that the buffered layer won't ever error
because 1 byte isn't enough to return a full character, if I need to
change the code I can do so by switching to the buffered layer and
fixing the issue that way (although with Steve's new proposal even
that won't be necessary).
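
As a rough illustration of why single-byte reads from a binary stream need
not fail on a partial character: the caller can feed the bytes through an
incremental decoder, which holds back incomplete sequences until the rest
arrives. This is only a sketch, assuming UTF-8 input; iter_chars is a
hypothetical helper name, and io.BytesIO stands in for sys.stdin.buffer.

```python
import codecs
import io


def iter_chars(byte_stream, encoding="utf-8"):
    """Yield decoded characters from a binary stream, reading one byte at a time."""
    decoder = codecs.getincrementaldecoder(encoding)()
    while True:
        chunk = byte_stream.read(1)
        if not chunk:
            # End of stream: flush the decoder. This raises UnicodeDecodeError
            # if the stream ended in the middle of a multi-byte character.
            tail = decoder.decode(b"", final=True)
            if tail:
                yield tail
            return
        # Returns "" while a multi-byte character is still incomplete.
        text = decoder.decode(chunk)
        if text:
            yield text


# Stand-in for sys.stdin.buffer: 1-, 2- and 3-byte UTF-8 characters.
demo = io.BytesIO("h\u00e9\u20ac".encode("utf-8"))
print(list(iter_chars(demo)))
```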

> As to kbhit() returning true, this does not mean that read(1) from
> console input won't block (not unless line-input mode is disabled). It
> does mean that getwch() won't block (note the "w" in there; this one
> reads Unicode characters). The CRT's conio functions (e.g. kbhit,
> getwch) put the console input buffer in a raw mode (e.g. ^C is read as
> '\x03' instead of generating a CTRL_C_EVENT) and call the lower-level
> functions PeekConsoleInputW (kbhit) and ReadConsoleInputW (getwch), to
> peek at and read input event records.
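
A minimal sketch of the kbhit()/getwch() pairing described above, using the
msvcrt wrappers from the standard library. read_available is a hypothetical
helper name; the two functions are passed in as parameters only so that the
loop can be exercised away from a Windows console.

```python
import sys

if sys.platform == "win32":
    import msvcrt  # CRT conio wrappers; available on Windows only


def read_available(kbhit=None, getwch=None):
    """Return the characters that can be read right now without blocking."""
    if kbhit is None or getwch is None:
        # Default to the real console functions (Windows only).
        kbhit, getwch = msvcrt.kbhit, msvcrt.getwch
    chars = []
    # Per the discussion above, kbhit() returning true means that
    # getwch() will not block.
    while kbhit():
        chars.append(getwch())
    return "".join(chars)
```

On a Windows console, read_available() with no arguments drains whatever
keystrokes are pending. Because conio reads in a raw mode, Ctrl+C arrives
as '\x03' rather than raising KeyboardInterrupt.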

I understand. The code I'm working on was originally written for pure
POSIX, with all the termios calls to set the console into unbuffered
mode. In addition, it was until recently using the Python 2 text
model, and so there are a lot of places in the code where it's still
confused about whether it's processing bytes or characters (we've got
rid of a *lot* of "let's decode and see if that helps" calls...). At
the moment, kbhit(), while not correct, is "good enough". When I get
the time, and we get to a point where it's enough of a priority, I may
well look at refactoring this stuff to use proper Windows calls via
ctypes to do "read what's available". But that's a way off yet.
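
For reference, one possible starting point for the ctypes route is to ask
the console how many input records are waiting. This is only a sketch, not
the refactoring itself: pending_console_events is a hypothetical helper
name, and the count covers every kind of input record (key releases, mouse
and focus events), not just characters.

```python
import sys


def pending_console_events():
    """Return the number of unread records in the console input buffer."""
    if sys.platform != "win32":
        raise NotImplementedError("the console API is Windows-only")

    import ctypes
    from ctypes import wintypes

    STD_INPUT_HANDLE = -10  # documented constant for GetStdHandle
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [wintypes.DWORD]
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.GetNumberOfConsoleInputEvents.argtypes = [
        wintypes.HANDLE,
        ctypes.POINTER(wintypes.DWORD),
    ]
    kernel32.GetNumberOfConsoleInputEvents.restype = wintypes.BOOL

    handle = kernel32.GetStdHandle(STD_INPUT_HANDLE)
    count = wintypes.DWORD()
    if not kernel32.GetNumberOfConsoleInputEvents(handle, ctypes.byref(count)):
        # Fails when stdin is not a console (e.g. redirected from a pipe).
        raise ctypes.WinError(ctypes.get_last_error())
    return count.value
```

Turning that count into "read what's available" would still need
PeekConsoleInputW or ReadConsoleInputW to inspect the records themselves.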

Thanks for the information, though, I'll keep it in mind when we do
get to a point where we're looking at this.
Paul

