Inplace operations for PyLong objects
by Manciu, Catalin Gabriel
While looking over the PyLong source code in Objects/longobject.c I noticed
that the PyLong object doesn't include implementations for basic inplace
operations such as addition or multiplication:
    0,    /* nb_inplace_add */
    0,    /* nb_inplace_subtract */
    0,    /* nb_inplace_multiply */
    0,    /* nb_inplace_remainder */
While I understand that the immutable nature of this type of object justifies
this approach, I wanted to experiment and see how much of a performance gain
an inplace add would bring.
My inplace add will revert to calling the default long_add function when:
- the refcount of the first operand indicates that it's being shared
- that operand is one of the preallocated 'small ints'
which should mitigate the effects of not conforming to PyLong immutability.
It also allocates a new PyLong _only_ in case of a potential overflow.
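
To illustrate why the refcount condition matters, here is a minimal Python
sketch of the behaviour any inplace add has to preserve. It is not part of the
patch; the helper name add_preserves_aliases is made up for this example. When
a second name refers to the same int, `+=` must leave that other reference
untouched, which is why a shared operand has to go through the regular
long_add path.

```python
def add_preserves_aliases(start, increment):
    """Return (alias, value) after an augmented add on a shared int.

    Hypothetical helper, used only to illustrate the invariant.
    """
    value = start
    alias = value        # second reference to the same int object
    value += increment   # must not change what 'alias' refers to
    return alias, value


if __name__ == "__main__":
    big = 10 ** 20
    step = (1 << 30) - 1
    alias, value = add_preserves_aliases(big, step)
    # The alias keeps the old value; only 'value' sees the sum.
    print(alias == big, value == big + step)
```

The same check should pass on both the baseline and the modified interpreter,
including for operands in the preallocated small-int range.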
The workload I used to evaluate this is a simple script that does a lot of
inplace additions:
    import sys
    import time


    def write_progress(prev_percentage, value, limit):
        # Redraw the progress line only when the integer percentage changes.
        percentage = (100 * value) // limit
        if percentage != prev_percentage:
            sys.stdout.write("%d%%\r" % (percentage))
        # Return the current percentage so the caller can pass it back in.
        return percentage


    progress = -1
    the_value = 0
    the_increment = ((1 << 30) - 1)
    crt_iter = 0
    total_iters = 10 ** 9
    start = time.time()
    while crt_iter < total_iters:
        the_value += the_increment
        crt_iter += 1
        progress = write_progress(progress, crt_iter, total_iters)
    end = time.time()
    print("\n%.3fs" % (end - start))
    print("the_value: %d" % (the_value))
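
For a quicker comparison between two builds, a smaller version of the same
loop can be timed with the standard timeit module. This is only a sketch and
not the script used for the numbers in this post; the function names run_adds
and time_adds and the default iteration counts are assumptions chosen for
illustration.

```python
import timeit


def run_adds(iterations, increment=(1 << 30) - 1):
    """Repeatedly apply += to an int and return the final value."""
    total = 0
    for _ in range(iterations):
        total += increment
    return total


def time_adds(iterations=10 ** 6, repeat=5):
    """Return the best wall-clock time in seconds over 'repeat' runs."""
    timings = timeit.repeat(lambda: run_adds(iterations), number=1, repeat=repeat)
    # The minimum is usually the measurement least disturbed by system noise.
    return min(timings)


if __name__ == "__main__":
    print("%.3fs" % time_adds())
```

Running it under each interpreter build and comparing the two printed times
gives a rough estimate; the loop overhead of range() is included in both
measurements.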
Running the baseline version outputs:
Running the modified version outputs:
In summary, I got a +13.47% improvement for the modified version.
The CPython revision I'm using is 7f066844a79ea201a28b9555baf4bceded90484f
from the master branch and I'm running on an i7-6700K CPU with Turbo Boost
disabled (frequency is pinned at 4GHz).
Do you think that such an optimization would be a good approach?