decoding keyboard input when using curses

Arnaud Delobelle arnodel at googlemail.com
Sat May 30 22:55:19 CEST 2009


Hi all,

I am looking for advice on how to use unicode with curses.  First I will
explain my understanding of how curses deals with keyboard input and how
it differs from what I would like.

The curses module has a window.getch() function to capture keyboard
input.  This function returns an integer which is more or less:

* a byte if the key which was pressed is a printable character (e.g. a,
  F, &);

* an integer > 255 if it is a special key, e.g. if you press the up
  arrow (KEY_UP) it returns 259.
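
To illustrate the two cases above, here is a minimal sketch of the range
check involved. It is written in Python 3 syntax (the script further down
targets Python 2.5), the helper name classify_getch is hypothetical, and
259 is simply the KEY_UP value quoted above. It also assumes that -1 means
"no input", which is what getch() returns in no-delay mode.

```python
def classify_getch(code):
    """Classify an integer of the kind returned by window.getch()."""
    if code == -1:
        return "none"      # no input available (no-delay mode)
    if code > 255:
        return "special"   # a keycode such as KEY_UP
    return "byte"          # one byte of character data, possibly part of
                           # a multi-byte sequence


print(classify_getch(ord("a")))   # an ordinary printable key
print(classify_getch(259))        # the KEY_UP value quoted above
```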

As far as I know, curses is totally Unicode-unaware, so if the key
pressed is printable but not ASCII, the getch() function will return one
or more bytes depending on the encoding in the terminal.

E.g. given utf-8 encoding, if I press the key 'é' on my keyboard (which
is encoded as '\xc3\xa9' in utf-8), I will need two calls to getch() to get
this: the first one will return 0xC3 and the second one 0xA9.
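
The two byte values can be checked without curses at all. A small sketch
in Python 3 syntax; the variable name encoded is only illustrative, and
the character is written as an escape so the snippet does not depend on
the source file's encoding.

```python
# U+00E9 is 'é'; encoding it as utf-8 yields two bytes.
encoded = "\u00e9".encode("utf-8")

print(len(encoded))                # number of getch() calls needed
print([hex(b) for b in encoded])   # the byte values, first to last
```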

Instead of getting a stream of bytes and special keycodes (with value >
255) from getch(), what I want is a stream of *unicode characters* and
special keycodes.

So, still assuming utf-8 encoding in the terminal, if I type:

    Té[KEY_UP]ça

repeated calls to the getch() function will give me this sequence of
integers:

    84, 195, 169, 259,   195, 167, 97
    T-  é-------  KEY_UP ç-------  a-

But what I want is to get this stream instead:

    u'T', u'é', 259, u'ç', u'a'


I can't pipe the stream of output from getch() directly through an
instance of codecs.getreader('utf-8') because getch() sometimes returns
the integer values of the 'special keys', which are not bytes.
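
A short sketch of why the mixed stream is awkward for a byte decoder,
again in Python 3 syntax. The names codes, fits_in_bytes and text_only are
illustrative, and the integers are the ones from the example above.
Values above 255 do not fit in a byte string, and simply dropping them
decodes the text correctly but loses the position of KEY_UP.

```python
KEY_UP = 259  # value quoted above
codes = [84, 195, 169, KEY_UP, 195, 167, 97]

# A byte string can only hold values in range(256).
try:
    bytes(codes)
    fits_in_bytes = True
except ValueError:
    fits_in_bytes = False
print(fits_in_bytes)

# Filtering out the keycodes decodes fine, but the result no longer
# says where KEY_UP was pressed.
text_only = bytes(c for c in codes if c <= 255).decode("utf-8")
print(text_only)
```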

Now I will present to you the solution I have come up with so far.  I am
really unsure whether it is a good way to solve this problem as both
unicode and curses still feel quite mysterious to me.  What I would
appreciate is some advice on how to do it better - or someone to point
out that I have a gross misunderstanding of what is going on!

This has been tested in Python 2.5.

-------------------- uctest.py ------------------------------
# -*- coding:utf-8 -*-

import codecs
import curses

# This gives the return codes given by curses.window.getch() when
# "Té[KEY_UP]ça" is typed in a terminal with utf-8 encoding:

codes = map(ord, "Té") + [curses.KEY_UP]  + map(ord, "ça")


# This class defines a file-like object from a curses window 'win'
# whose read() function will return the next byte (as a character)
# given by win.getch() if it's a byte or return the empty string and
# set the code attribute to the value of win.getch().

# It is not used in this test; the Stream class below is used
# instead.

class CursesStream(object):
    def __init__(self, win):
        # Bind the window's getch directly ('self.win' was never set).
        self.getch = win.getch
        self.code = None
    def read(self):
        c = self.getch()
        if c == -1:
            self.code = None
            return ''
        elif c > 255:
            self.code = c
            return ''
        else:
            return chr(c)

# This class simulates CursesStream above with a predefined list of
# keycodes to return - handy for testing.

class Stream(object):
    def __init__(self, codes):
        self.codes = iter(codes)
    def read(self):
        try:
            c = self.codes.next()
        except StopIteration:
            self.code = None
            return ''
        if c > 255:
            self.code = c
            return ''
        else:
            return chr(c)

def getkeys(stream, encoding):
    '''Given a CursesStream object and an encoding, yield the decoded
    unicode characters and special keycodes that curses sends'''
    read = codecs.getreader(encoding)(stream).read
    while True:
        c = read()
        if c:
            yield c
        elif stream.code is None:
            return
        else:
            yield stream.code


# Test getkeys with the simulated Stream and the codes defined above.

for c in getkeys(Stream(codes), 'utf-8'):
    if isinstance(c, unicode):
        print 'Char\t', c
    else:
        print 'Code\t', c

-------------------- running uctest.py ------------------------------

marigold:junk arno$ python uctest.py 
Char	T
Char	é
Code	259
Char	ç
Char	a
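
For comparison, here is a sketch of a different technique, not the one
used in uctest.py: feeding the bytes one at a time to an incremental
decoder from codecs.getincrementaldecoder and passing keycodes through
untouched. It is written in Python 3 syntax, so characters come out as
str rather than unicode. The function name decode_keys is hypothetical,
and resetting the decoder when a special key interrupts a multi-byte
sequence is an assumption about the desired behaviour.

```python
import codecs


def decode_keys(codes, encoding="utf-8"):
    """Yield str characters and int keycodes from getch()-style integers."""
    decoder = codecs.getincrementaldecoder(encoding)()
    for code in codes:
        if code < 0:
            # -1 means "no input" in no-delay mode; nothing to decode.
            continue
        if code > 255:
            # Special key: drop any incomplete byte sequence (an
            # assumption) and pass the keycode through unchanged.
            decoder.reset()
            yield code
        else:
            # Feed one byte; the decoder returns '' until a whole
            # character has been received.
            for char in decoder.decode(bytes([code])):
                yield char


for key in decode_keys([84, 195, 169, 259, 195, 167, 97]):
    kind = "Char" if isinstance(key, str) else "Code"
    print(kind, key, sep="\t")
```

With the default 'strict' error handling, an invalid byte sequence raises
UnicodeDecodeError, so real input handling would probably want to catch
that or pick another error mode.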

Thanks if you have read this far!

-- 
Arnaud


