[Python-bugs-list] [ python-Bugs-604716 ] faster [None]*n or []*n
noreply@sourceforge.net
Fri, 06 Sep 2002 11:38:47 -0700
Bugs item #604716, was opened at 2002-09-04 15:20
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=604716&group_id=5470
Category: Python Interpreter Core
>Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Gregory Smith (gregsmith)
Assigned to: Nobody/Anonymous (nobody)
Summary: faster [None]*n or []*n
Initial Comment:
This is a suggestion to speed up multiplying
a list by an integer in the case where
the length of the list is 0 or 1. Currently there
is a lot of overhead in the loop for these cases.
The current loop (in list_repeat, in listobject.c):

    p = np->ob_item;
    for (i = 0; i < n; i++) {
        for (j = 0; j < a->ob_size; j++) {
            *p = a->ob_item[j];
            Py_INCREF(*p);
            p++;
        }
    }
    return (PyObject *) np;
Suggested (where Py_INCREF_N(o, n) bumps
the refcount of 'o' by n):

    p = np->ob_item;
    if (a->ob_size <= 1) {
        if (a->ob_size) {  /* i.e., size is 1 */
            PyObject *a0 = a->ob_item[0];
            for (i = 0; i < n; i++) {
                *p++ = a0;
            }
            Py_INCREF_N(a0, n);
        }
    } else {
        for (i = 0; i < n; i++) {
            for (j = 0; j < a->ob_size; j++) {
                *p = a->ob_item[j];
                Py_INCREF(*p);
                p++;
            }
        }
    }
    return (PyObject *) np;
You could also add special cases for len=2, etc.
(with *p++ = a0; *p++ = a1; inside), but len=1
is a common case, with things like [None]*1000
used to create lists.
The best approach in general is to check
whether the list being replicated is sufficiently
smaller than the replication count, and if so, use
a different set of loops nested in the other order,
so that the outer loop runs less often.
With the inverted loops, you make
'a->ob_size' calls to Py_INCREF_N instead of
'a->ob_size*n' calls to Py_INCREF.
The exact same comments apply to tuplerepeat().
If there is any interest, I will code this and test it.
----------------------------------------------------------------------
>Comment By: Raymond Hettinger (rhettinger)
Date: 2002-09-06 13:38
Message:
Logged In: YES
user_id=80475
There was a typo in my code fragment. It should have
read:

    memcpy(p + i*a->ob_size, a->ob_item, a->ob_size * sizeof(PyObject *));

The memcpy speeds up cases where there are many
items in the list. It is actually slower for a list of length
one, which should still be special-cased as the OP
suggested.
Also, his idea for adding Py_INCREF_N is great. It can be
used throughout Python. Since Py_INCREF has lots of
macro magic for updating debug statistics, the OP's idea
may have to be implemented as a function.
I think there is interest in the OP's ideas and that he
should go ahead and develop a tested patch along with
timing statistics.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-09-06 04:53
Message:
Logged In: YES
user_id=21627
Can you report what speed-up you get, for what example?
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2002-09-04 20:45
Message:
Logged In: YES
user_id=80475
Here's a faster approach that doesn't involve a special
case:

    for (i = 0; i < n; i++)
        memcpy(p + i*n, p, size);
    for (j = 0; j < a->ob_size; j++) {
        Py_INCREF_N(*p, n);
        p++;
    }
----------------------------------------------------------------------