[Python-3000] optimizing [x]range

Thu Aug 2 17:03:42 CEST 2007

On Sat, 28 Jul 2007 17:06:50 +0200, tomer filiba wrote:

> currently, testing for "x in xrange(y)" is an O(n) operation.
> 
> since xrange objects (which would become range in py3k) are not real
> lists, there's no reason that __contains__ be an O(n). it can easily be
> made into an O(1) operation. here's a demo code (it should be trivial to
> implement this in CPython)
> 
> 
> class xxrange(object):
>     def __init__(self, *args):
>         if len(args) == 1:
>             self.start, self.stop, self.step = (0, args[0], 1)
>         elif len(args) == 2:
>             self.start, self.stop, self.step = (args[0], args[1], 1)
>         elif len(args) == 3:
>             self.start, self.stop, self.step = args
>         else:
>             raise TypeError("invalid number of args")
> 
>     def __iter__(self):
>         i = self.start
>         while i < self.stop:
>             yield i
>             i += self.step
> 
>     def __contains__(self, num):
>         if num < self.start or num > self.stop:
>             return False
>         return (num - self.start) % self.step == 0
> 
> 
> print list(xxrange(7))            # [0, 1, 2, 3, 4, 5, 6] print
> list(xxrange(0, 7, 2))      # [0, 2, 4, 6] print list(xxrange(1, 7, 2)) 
>     # [1, 3, 5] print 98 in xxrange(100)          # True print 98 in
> xxrange(0, 100, 2)    # True print 99 in xxrange(0, 100, 2)    # False
> print 98 in xxrange(1, 100, 2)    # False print 99 in xxrange(1, 100, 2)
>    # True
> 
> 
> 
> -tomer

I gave the implementation a try. I cannot guarantee that it follows every 
guideline for the CPython core but it works, it's fast and passes all 
tests::

    >>> 98 in xrange(0, 100, 2)
    True
    >>> 99 in xrange(0, 100, 2)
    False
    >>> 98 in xrange(1, 100, 2)
    False
    >>> 99 in xrange(1, 100, 2)
    True

Note: test_xrange wasn't really helpful with validating xrange's 
functionality. No tests for negative steps, at least it didn't warn me 
while it didn't work. ;)

It is basically the algorithm you provided, with a fix for negative steps.

The patch is based on the latest trunk/ checkout, Python 2.6. I don't 
think this is a problem if nobody else made any effort towards making 
xrange more sequence-like in the Python 3000 branch. The C source might 
require some tab/space cleanup.

Speed comparison
================

$ ./python -V
Python 2.6a0
$ python -V
Python 2.5.1

./python is my local build, with the patch applied.

$ ./python -mtimeit "0 in xrange(100)"
1000000 loops, best of 3: 0.641 usec per loop
$ python -mtimeit "0 in xrange(100)"
1000000 loops, best of 3: 0.717 usec per loop

Well, with a lot of ignorance, this is still the same.

$ ./python -mtimeit "99 in xrange(100)"
1000000 loops, best of 3: 0.638 usec per loop
$ python -mtimeit "99 in xrange(100)"
100000 loops, best of 3: 6.17 usec per loop

Notice the difference in the magnitude of loops!

$ ./python -mtimeit -s "from sys import maxint" "maxint in xrange(maxint)"
1000000 loops, best of 3: 0.622 usec per loop
$ python -mtimeit -s "from sys import maxint" "maxint in xrange(maxint)"

Still waiting.. ;)


Index: Objects/rangeobject.c
===================================================================

--- Objects/rangeobject.c       (revision 56666)
+++ Objects/rangeobject.c       (working copy)
@@ -129,12 +129,31 @@
        return rtn;
 }

+static int
+range_contains(rangeobject *r, PyIntObject *key) {
+    if (PyInt_Check(key)) {
+        int keyval = key->ob_ival;
+        int start = r->start;
+        int step = r->step;
+        int end = start + (r->len * step);
+
+        if ((step < 0 && keyval <= start && keyval > end) \
+           || (step > 0 && keyval >= start && keyval < end)) {
+            return ((keyval - start) % step) == 0;
+        }
+    }
+    return 0;
+}
+
 static PySequenceMethods range_as_sequence = {
        (lenfunc)range_length,  /* sq_length */
        0,                      /* sq_concat */
        0,                      /* sq_repeat */
        (ssizeargfunc)range_item, /* sq_item */
        0,                      /* sq_slice */
+    0, /* sq_ass_item */
+    0, /* sq_ass_slice */
+    (objobjproc)range_contains, /* sq_contains */
 };

 static PyObject * range_iter(PyObject *seq);


Test suite
==========

OK
288 tests OK.
CAUTION:  stdout isn't compared in verbose mode:
a test that passes in verbose mode may fail without it.
1 test failed:
    test_nis # due to verbosity, I guess
38 tests skipped:
    test_aepack test_al test_applesingle test_bsddb test_bsddb185
    test_bsddb3 test_bz2 test_cd test_cl test_codecmaps_cn
    test_codecmaps_hk test_codecmaps_jp test_codecmaps_kr
    test_codecmaps_tw test_curses test_dbm test_gdbm test_gl
    test_imageop test_imgfile test_linuxaudiodev test_macostools
    test_normalization test_ossaudiodev test_pep277 test_plistlib
    test_scriptpackages test_socketserver test_sqlite test_startfile
    test_sunaudiodev test_tcl test_timeout test_urllib2net
    test_urllibnet test_winreg test_winsound test_zipfile64
5 skips unexpected on linux2:
    test_tcl test_dbm test_bz2 test_gdbm test_bsddb


Should I submit the patch to the SF patch manager as well?

Regards,
Stargaming