[Python-Dev] py3k: accept unicode for 'c' and byte for 'C' in getarg?

Victor Stinner victor.stinner at haypocalc.com
Tue Mar 17 18:03:32 CET 2009

On Tuesday 17 March 2009 17:27:39, you wrote:
> > The "C" format (get a character) has the opposite problem: it accepts
> > both byte and unicode, whereas byte should be rejected. Example:
> > mmap.write_byte('é') should be a TypeError.
> Yeah, mmap should be defined exclusively in terms of bytes.

That is already the fix (use bytes only) chosen for the mmap issue:
(the problem is bigger than mmap.write_byte; other methods have to be changed too)

> > Usage of "c" format:
> >  msvcrt.putch(char)
> >  msvcrt.ungetch(char)
> ISTM that putch() and ungetch() are text operations so should use 'C'.

The low-level functions use the C type "char":

For text, we have unicode versions of these functions:
  msvcrt.ungetwch(unicode string of 1 character)
  msvcrt.putwch(unicode string of 1 character)

So "c" looks like the right format for putch() and ungetch().
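
To make the distinction concrete, here is a rough pure-Python sketch of the strict behaviour discussed in this thread: "c" takes a bytes object of length 1, "C" takes a unicode string of length 1. The helper names parse_c and parse_C are made up for illustration; they only emulate the type checks and are not the actual getargs.c code.

```python
def parse_c(arg):
    """Emulate a strict "c" format: bytes/bytearray of length 1 -> C char value."""
    # Hypothetical helper, not part of CPython.
    if isinstance(arg, (bytes, bytearray)) and len(arg) == 1:
        return arg[0]
    raise TypeError("must be a byte string of length 1, not %s"
                    % type(arg).__name__)


def parse_C(arg):
    """Emulate a strict "C" format: str of length 1 -> code point as int."""
    # Hypothetical helper, not part of CPython.
    if isinstance(arg, str) and len(arg) == 1:
        return ord(arg)
    raise TypeError("must be a unicode character, not %s"
                    % type(arg).__name__)


# Byte-oriented functions such as putch() would take the "c" path,
# their wide-character counterparts such as putwch() the "C" path.
print(parse_c(b"a"))   # byte value of b"a"
print(parse_C("é"))    # code point of "é"

for parser, bad in ((parse_c, "é"), (parse_C, b"a")):
    try:
        parser(bad)
    except TypeError as exc:
        print("rejected:", exc)
```

With these checks, passing 'é' to a "c" function is a TypeError, and so is passing b'a' to a "C" function.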

See also http://bugs.python.org/issue5410

Victor Stinner aka haypo
