Undefined behaviour in C [was Re: The Cost of Dynamism]

Sat Mar 26 10:09:38 EDT 2016

On 26/03/2016 13:22, Chris Angelico wrote:
> On Sat, Mar 26, 2016 at 11:21 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> In plain English, if the programmer had an intention for the code, and it
>> was valid C syntax, it's not hard to conclude that the code has some
>> meaning. Even if that meaning isn't quite what the programmer expected.
>> Compilers are well known for only doing what you tell them to do, not what
>> you want them to do. But in the case of C and C++ they don't even do what
>> you tell them to do.
>>
>
> Does this Python code have meaning?
>
> x = 5
> while x < 10:
>      print(x)
>      ++x
>
>
> It's a fairly direct translation of perfectly valid C code, and it's
> syntactically valid. When the C spec talks about accidentally doing
> what you intended, that would be to have the last line here increment
> x. But that's never a requirement; compilers/interpreters are not
> mindreaders.

I'm surprised that both C and Python allow statements that apparently do 
nothing. In both, an example is:

   x

on a line by itself. The expression is evaluated, but then any result is
discarded. If there were a genuine use for this (for example, reporting
any error with the evaluation), then it would be simple enough to
require a keyword in front.
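
To illustrate the "reporting any error" case in Python: a bare name is
still looked up, so it can raise even though the value is thrown away.
This is only a sketch; 'undefined_name' and 'evaluate_bare_name' are
made-up names for the example.

```python
def evaluate_bare_name():
    try:
        # A bare expression statement: the name is looked up and the
        # result discarded. The name is deliberately not defined anywhere.
        undefined_name
    except NameError as exc:
        return type(exc).__name__
    return "no error"


print(evaluate_bare_name())  # NameError
```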

Disallowing these standalone expressions would let extra errors be
picked up, such as '++x' and a bare 'next' in Python.
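
A rough sketch of what such a check could look like, done as a separate
pass with the standard 'ast' module rather than in the compiler itself.
The function name and the choice of what to exempt (calls, constants
such as docstrings, yield and await) are my assumptions, not anything
Python actually does.

```python
import ast

# Expression statements that plausibly exist for their side effects,
# or that are conventional (docstrings, a lone '...').
ALLOWED = (ast.Call, ast.Constant, ast.Yield, ast.YieldFrom, ast.Await)


def find_bare_expressions(source):
    """Return sorted line numbers of expression statements not in ALLOWED."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Expr) and not isinstance(node.value, ALLOWED):
            flagged.append(node.lineno)
    return sorted(flagged)


sample = (
    "x = 5\n"
    "while x < 10:\n"
    "    print(x)\n"
    "    ++x\n"
    "    break\n"
    "next\n"
)
print(find_bare_expressions(sample))  # [4, 6]
```

The sample is only parsed, never run. Lines 4 and 6 ('++x' and 'next')
are flagged, while 'print(x)' passes because it is a call.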

(I think simply translating '++x' in Python to 'x+=1' has already been 
discussed in the past.)
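
For reference, a small demonstration of what Python does with '++x' as
things stand, next to the 'x+=1' it might be translated to. The two
function names are just for this example.

```python
def plus_plus(x):
    ++x        # parsed as +(+x): two unary pluses, result thrown away
    return x


def plus_equals(x):
    x += 1     # the increment a C programmer would have meant
    return x


print(plus_plus(5), plus_equals(5))  # 5 6
```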

> The main reason the C int has undefined behaviour is that it's
> somewhere between "fixed size two's complement signed integer" and
> "integer with plenty of room". A C compiler is generally free to use a
> larger integer than you're expecting, which will cause numeric
> overflow to not happen. That's (part of[1]) why overflow of signed
> integers is undefined - it'd be too costly to emulate a smaller
> integer. So tell me... what happens in CPython if you incref an object
> more times than the native integer will permit? Are you bothered by
> this possibility, or do you simply assume that nobody will ever do
> that?

(In a ref-counted scheme I use, with 32-bit counts (I don't think it
matters if they are signed or not), each reference implies a 16-byte
object elsewhere. For the count to wrap around back to zero would take
2**32 references, which would mean 64GB of RAM being needed. On a 32-bit
system, something else will go wrong first.

Even on a 64-bit system it's a possibility, I suppose, although you
would probably notice memory problems sooner.)
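
The arithmetic behind the 64GB figure, as a quick check. The constant
names are mine; the 32-bit count and 16 bytes per reference are the
numbers given above, and GB is taken to mean 2**30 bytes.

```python
COUNT_BITS = 32            # width of the reference count
BYTES_PER_REFERENCE = 16   # size of the object each reference implies

references_to_wrap = 2 ** COUNT_BITS
bytes_needed = references_to_wrap * BYTES_PER_REFERENCE
gib_needed = bytes_needed // 2 ** 30

# A 32-bit address space covers 2**32 bytes (4GB) in total.
fits_in_32_bit_address_space = bytes_needed <= 2 ** 32

print(gib_needed, fits_in_32_bit_address_space)  # 64 False
```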

-- 
Bartc