[Python-Dev] py3k: accept unicode for 'c' and byte for 'C' in getarg?
Victor Stinner
victor.stinner at haypocalc.com
Tue Mar 17 18:03:32 CET 2009
On Tuesday 17 March 2009 17:27:39, you wrote:
> > The "C" format (get a character) has the opposite problem: it accepts
> > both byte and unicode, whereas byte should be rejected. Example:
> > mmap.write_byte('é') should be a TypeError.
>
> Yeah, mmap should be defined exclusively in terms of bytes.
That is already the fix (use bytes only) chosen for the mmap issue:
http://bugs.python.org/issue5391
(The problem is bigger than mmap.write_byte: other methods have to be changed too.)
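As a rough illustration of the bytes-only behaviour, here is a small sketch using mmap.write() on an anonymous mapping in Python 3. The helper try_write() is a hypothetical name used only for this example; it is not part of mmap or of the patch in the issue.

```python
import mmap

def try_write(buf, data):
    """Hypothetical helper: True if buf.write(data) succeeds, False on TypeError."""
    try:
        buf.write(data)
        return True
    except TypeError:
        return False

buf = mmap.mmap(-1, 16)                  # anonymous 16-byte mapping
bytes_ok = try_write(buf, b"\xc3\xa9")   # the UTF-8 bytes of 'é' are accepted
text_ok = try_write(buf, "é")            # a str is rejected with TypeError
buf.seek(0)
stored = buf.read(2)                     # only the bytes write reached the buffer
buf.close()
```

Text has to be encoded explicitly before it is written, so the caller decides which encoding ends up in the mapped memory.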
> > Usage of "c" format:
> > msvcrt.putch(char)
> > msvcrt.ungetch(char)
>
> ISTM that putch() and ungetch() are text operations so should use 'C'.
The low level functions use the C type "char":
_putch(char)=>void
_ungetch(char)=>char
For text, we have unicode versions of these functions:
msvcrt.ungetwch(unicode string of 1 character)
msvcrt.putwch(unicode string of 1 character)
So "c" looks like the right format for putch() and ungetch().
See also http://bugs.python.org/issue5410
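To make the proposed split concrete, here is a pure-Python sketch of the intended checks: "c" takes exactly one byte, "C" takes exactly one unicode character. parse_c() and parse_C() are hypothetical stand-ins written for this example only. They are not the getargs.c implementation, and the error messages are invented.

```python
def parse_c(arg):
    """Hypothetical stand-in for the "c" format: accept a single byte only."""
    if isinstance(arg, (bytes, bytearray)) and len(arg) == 1:
        return arg[0]                    # value of the byte, 0..255
    raise TypeError("expected a byte string of length 1, got %s"
                    % type(arg).__name__)

def parse_C(arg):
    """Hypothetical stand-in for the "C" format: accept a single character only."""
    if isinstance(arg, str) and len(arg) == 1:
        return ord(arg)                  # code point of the character
    raise TypeError("expected a unicode string of length 1, got %s"
                    % type(arg).__name__)

def accepts(parser, arg):
    """Return True if parser(arg) succeeds, False if it raises TypeError."""
    try:
        parser(arg)
        return True
    except TypeError:
        return False
```

With these checks, accepts(parse_c, 'é') and accepts(parse_C, b'a') are both False, which is the strict behaviour asked for in this thread: putch() would use the first check and putwch() the second.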
--
Victor Stinner aka haypo
http://www.haypocalc.com/blog/