[pypy-dev] OT: abs(x) with 4 assembly insns

Christian Tismer tismer at tismer.com
Mon Sep 29 02:29:04 CEST 2003


Hi friends,

today, Armin presented me a simple brain-teaser:

You have X86 assembly, you have only 4 insns,
and you don't want to use a jump.
You have a register, loaded with a value, and
you should produce its abs, in another register,
while preserving the argument register.

Hmm. 4 insns.

Here is the standard C solution for doing abs without a jump,
whioch you can find on the internet:

int int_abs(int a) {
     return a - ((a+a) & (a>>31));
}

This clearly computes the bas, quite nicely.
The resulting assembly code looks like so:

; 2    :     return a - ((a+a) & (a>>31));

   00000	8b 44 24 04	 mov	 eax, DWORD PTR _a$[esp-4]
   00004	8b c8		 mov	 ecx, eax
   00006	c1 f9 1f	 sar	 ecx, 31			; 0000001fH
   00009	8d 14 00	 lea	 edx, DWORD PTR [eax+eax]
   0000c	23 ca		 and	 ecx, edx
   0000e	2b c1		 sub	 eax, ecx

; 3    : }

This is clearly some solution. But, ignoring the first insn,
which is just moving the argument, this really takes
five instructions, one too much of course.

Actually, there is a much better solution.
X86 has the cdq insn, which does a sign extension from
32 bit signed eax to 64 bit signed edx/eax.

This in fact puts a -1 into edx, whenever the eax is negative,
or a 0, if eax is >= 0.

Now, what is taking the abs() of a value? Right, it is a two's
complement, if the argument is negative, or a do-nothing if
it is not negative.
And the name 'two's complement' comes from the fact
that you need to do two operations for the complement:
First, take the NOT of the value, then add one.

So, here comes the trick:
NOT is equivalent to XORing with -1.
ADD 1 is the same as SUB -1.

Fortunately, this can all be carried out with the result of
the CDQ sign extension, since we got the -1 from it:

Do the CDQ.
Load eax into another register
XOR it with edx   (which is either 0 or -1)
SUB edx from it   (which is either adding one or zero)

So, here the solution(TM):

int other_abs(int a) {
     __asm {
         mov eax, a;
         /* from here... */
         cdq;
         mov ebx, eax;
         xor ebx, edx;
         sub ebx, edx;
         /*... to here*/
         xchg eax, ebx;
     }
}

The meat of the program is exactly four insns, the rest
is just needed to make the C calling conventions happy.

Here the full program, you can step through it with a
debugger, to see that it really works:


int int_abs(int a) {
     return a - ((a+a) & (a>>31));
}


int other_abs(int a) {
     __asm {
         mov eax, a;                /* C: load arg */
         /* from here... */
         cdq;                       /* 1: edx has either 0 or -1 */
         mov ebx, eax;              /* 2: get a copy */
         xor ebx, edx;              /* 3: either identity, or NOT */
         sub ebx, edx;              /* 4: either add 0, or 1 */
         /* ... to here */
         xchg eax, ebx;             /* C: store result */
     }
}


int main(int argc, char **argv) {
     int a;
     a = int_abs(5);
     a = int_abs(-5);
     a = other_abs(5);
     a = other_abs(-5);
     return 0;
}


Hoping-that-everybody-trusts-me-now-that-I-can-write-a-small-OS
     -- ly y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  mobile +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





More information about the Pypy-dev mailing list