[pypy-dev] OT: abs(x) with 4 assembly insns

Bob Ippolito bob at redivi.com
Mon Sep 29 04:41:56 CEST 2003


On Sunday, Sep 28, 2003, at 20:29 America/New_York, Christian Tismer 
wrote:

> Hi friends,
>
> today, Armin presented me a simple brain-teaser:
>
> You have X86 assembly, you have only 4 insns,
> and you don't want to use a jump.
> You have a register, loaded with a value, and
> you should produce its abs, in another register,
> while preserving the argument register.
>
> Hmm. 4 insns.

If you are using a PowerPC with AltiVec, you can get the absolute value 
of four 32-bit integers in three instructions from C using 
vector signed int vec_abs(vector signed int a), a special compiler 
macro that turns into:
					# v1 is argument vector, v0 is result vector
	vspltisw	v0,0		# v0 = [0] * 4
	vsubuwm	v0,v0,v1	# v0 = map(int.__sub__, v0, v1)
	vmaxsw	v0,v1,v0	# v0 = map(max, v1, v0)
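
For readers without an AltiVec machine, the three vector steps can be mimicked in plain Python. This is only a sketch: the helper names (to_signed32, vec_abs_emulated) are made up for illustration, a four-element list stands in for a vector register, and Python 3 syntax is assumed.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(u):
    """Reinterpret an unsigned 32-bit pattern as a signed value."""
    return u - 0x100000000 if u & 0x80000000 else u

def vec_abs_emulated(v1):
    """Emulate vspltisw / vsubuwm / vmaxsw on a list of four signed ints."""
    # vspltisw v0,0: splat the immediate 0 into every lane
    v0 = [0] * 4
    # vsubuwm v0,v0,v1: lane-wise subtract, modulo 2**32
    v0 = [to_signed32((a - b) & MASK32) for a, b in zip(v0, v1)]
    # vmaxsw v0,v1,v0: lane-wise signed maximum of x and -x
    return [max(a, b) for a, b in zip(v1, v0)]

print(vec_abs_emulated([5, -7, 0, -1000]))
```

Because the subtract is modular, negating -2**31 wraps back to -2**31 in this emulation, so that one input comes out unchanged rather than positive.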

With conventional PowerPC instructions (also three instructions):
					# r0 is argument register, r1 is result, r2 is scratch
	srawi	r1,r0,31	# see below, python can't do this nicely
	add		r2,r1,r0	# r2 = r1 + r0
	xor		r1,r2,r1	# r1 = r2 ^ r1
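
Why the add/xor pair works: shifting right arithmetically by 31 leaves a mask that is 0 for a non-negative input and all ones (-1) for a negative one. With a mask of 0, the add and the xor are both no-ops. With a mask of -1, the add computes x - 1 and the xor flips every bit, and in two's complement ~(x - 1) equals -x. A minimal sketch in Python 3, relying on Python's >> being an arithmetic shift on signed ints; the function name is made up for illustration:

```python
def abs_via_sign_mask(x):
    """Branch-free abs for x in the signed 32-bit range."""
    mask = x >> 31            # srawi: 0 if x >= 0, -1 if x < 0
    return (x + mask) ^ mask  # add, then xor

# Compare against the builtin on a few values.
for x in (0, 1, -1, 1000, -1000, 2**31 - 1):
    assert abs_via_sign_mask(x) == abs(x)
```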

Here's a Python example of the second, mainly to demonstrate how srawi 
works (working in unsigned longs because Python 2.3 is inconvenient 
about bit twiddling signed integers):

def binary(v):
	return ''.join([(v & (1L << i)) and '1' or '0' for i in range(32)[::-1]])+'b'

def srawi(v, num):
	if v & 0x80000000L:
		mask = 0xFFFFFFFFL ^ ((1L << (32L - num)) - 1)
	else:
		mask = 0x00000000L
	return (v >> num) | mask

def new_abs(r0):
	if r0 < 0:
		r0 = 0x100000000L + r0
	print 'input is %s' % (binary(r0),)
	r1 = srawi(r0, 31)
	print 'r1 = %s' % (binary(r1),)
	r2 = r1 + r0
	print 'r2 = %s' % (binary(r2),)
	r1 = (r2 ^ r1) & 0xFFFFFFFFL
	print 'r1 = %s' % (binary(r1),)
	return r1

 >>> new_abs(1000)
input is 00000000000000000000001111101000b
r1 = 00000000000000000000000000000000b
r2 = 00000000000000000000001111101000b
r1 = 00000000000000000000001111101000b
1000L
 >>> new_abs(-1000)
input is 11111111111111111111110000011000b
r1 = 11111111111111111111111111111111b
r2 = 11111111111111111111110000010111b
r1 = 00000000000000000000001111101000b
1000L
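
The listing above is Python 2 (long literals and the print statement). For anyone wanting to run it on a current interpreter, here is a rough Python 3 rendering without the tracing prints. It is a sketch rather than the original code: the names srawi32 and abs32, and the explicit 32-bit masking of the add, are additions made for this example.

```python
MASK32 = 0xFFFFFFFF

def srawi32(v, num):
    """Arithmetic right shift of a 32-bit pattern v by num bits (0..31)."""
    if v & 0x80000000:
        # Negative input: fill the vacated high bits with ones.
        fill = MASK32 ^ ((1 << (32 - num)) - 1)
    else:
        fill = 0
    return (v >> num) | fill

def abs32(x):
    """abs of a signed 32-bit value via srawi/add/xor, as a bit pattern."""
    r0 = x & MASK32           # two's-complement pattern of x
    r1 = srawi32(r0, 31)      # all ones if negative, else zero
    r2 = (r1 + r0) & MASK32   # the add wraps like a 32-bit register
    return r2 ^ r1

for x in (1000, -1000, 0, -1, 2**31 - 1):
    assert abs32(x) == abs(x)
```

One edge case: abs32(-2**31) returns the pattern 0x80000000, which a signed 32-bit register would still read as -2**31, since +2**31 is not representable in 32 signed bits.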

-bob


