[pypy-dev] Re: genc to genllvm

Armin Rigo arigo at tunes.org
Mon Aug 15 10:30:28 CEST 2005


Hi Richard!     (CC to pypy-dev)

On Sun, Aug 14, 2005 at 12:29:51AM +0100, Richard Emslie wrote:
> Remember we talked about this at the end of the last sprint:
> 
> def test_cast_to_int():
>     def casting(v):
>         return int(ord(chr(v)))
> ... being what caused the seg fault in the first few iterations of 
> genc'ing and pypy IIRC.  Well I went hunting for the test again and didn't 
> find it (so the above test is mine, along with misunderstandings :-).

It's in translator/c/test/test_typed.py: test_ord_returns_a_positive.

> What I don't understand is: if rpy_string chars are supposed to be 
> signed, why were we expecting this to work?

The Char type is supposed to represent characters; a character is not
necessarily the same thing as a number.  Only in C is there a confusion
between them.  In Python, for example, it doesn't really make sense to
ask whether a character is signed or not.  All that matters is the
ordering between characters and the ord() function.  The only important
thing is that these work as expected.  So we decided, a bit arbitrarily,
to use the C 'char' type for characters.

> (*) actually - wouldn't rpy_string with unsigned chars make more sense? 
> Guess it would involve more casting to and from external functions though.

Exactly.  Using 'char' directly is more natural in C, and putting some
casts in two places (comparison and ord()) is rather easy.  Note that
whether plain 'char' is signed or unsigned is implementation-defined in
C: some compilers treat it as signed and some as unsigned.
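
To make the two casts concrete, here is a rough Python model of the
idea.  It is only a sketch, not the code that genc emits: the helper
names (to_signed_char, rpy_ord, rpy_char_lt) are made up for
illustration, and it assumes a platform where 'char' is a signed 8-bit
type.

```python
import ctypes


def to_signed_char(x):
    """Model storing the int x (0..255) in a signed 8-bit C 'char'.

    Hypothetical helper; ctypes.c_byte wraps without overflow checking.
    """
    return ctypes.c_byte(x).value


def rpy_ord(c):
    """Model ord(): reinterpret the stored value as unsigned (0..255)."""
    return c & 0xFF


def rpy_char_lt(c1, c2):
    """Model character comparison: compare the unsigned values."""
    return (c1 & 0xFF) < (c2 & 0xFF)


a = to_signed_char(65)    # 'A', stored as 65
b = to_signed_char(200)   # stored as a negative number
print(b, rpy_ord(b))
print(a < b, rpy_char_lt(a, b))
```

Comparing the raw stored values gets the order wrong as soon as one
character is >= 128, which is why the cast is needed for comparison as
well as for ord().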

> Why is it wrong when we do a cast from an int to char (which overflows
> and becomes negative) then the cast to an int which doesn't overflow
> and has the same value?

A cast from int to char is meant to represent the function chr().  The
reverse is for the function ord().  These are not just arbitrary casts
(you don't have casts in Python or RPython); it's really about these
two functions.  The goal is that, after translation, code like
'assert ord(chr(x)) == x' continues to work for any x in range(256).
The direct numeric value of the character chr(x) is not relevant
because we cannot inspect it in Python/RPython; we have to use ord().
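
As a small illustration of that invariant, the sketch below models
chr() as a wrap into a signed 8-bit range and compares two candidate
ord() implementations.  All three function names are hypothetical, and
the signed 8-bit 'char' is an assumption about the target platform.

```python
def chr_as_signed_char(x):
    # Hypothetical model of chr(): wrap x into the range -128..127,
    # as a signed 8-bit 'char' would store it.
    return (x + 128) % 256 - 128


def naive_ord(c):
    # Plain widening back to int: keeps the sign of the stored value.
    return c


def correct_ord(c):
    # Reinterpret as unsigned before widening, so the result is 0..255.
    return c % 256


# Values of x for which the naive round trip breaks the invariant.
failures = [x for x in range(256) if naive_ord(chr_as_signed_char(x)) != x]

# With the unsigned reinterpretation the invariant holds everywhere.
for x in range(256):
    assert correct_ord(chr_as_signed_char(x)) == x
```

In this model the naive version fails exactly for x in range(128, 256),
the values that become negative when stored.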


A bientot,

Armin


