[Python-3000] C API for ints and strings

Tue Sep 11 19:03:00 CEST 2007

On 9/11/07, Paul Moore <p.f.moore at gmail.com> wrote:
> On 11/09/2007, Nicholas Bastin <nick.bastin at gmail.com> wrote:
> > On 9/11/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > > > 3.0: 10 loops, best of 3: 6.76 sec per loop
> > > > 2.6: 10 loops, best of 3: 2.61 sec per loop
> > >
> > > I can't quite reproduce these results. On a 3.2GHz Pentium 4,
> > > running Linux 2.6.21, gcc 4.1.3, I get
> > >
> > > 3.0: 10 loops, best of 3: 728 msec per loop
> > > 2.6: 10 loops, best of 3: 558 msec per loop
> > >
> > > So it's only 30% slower, not 260%.
>
> FWIW, I get
>
> >python -m timeit "import inttest; inttest.int_test2(5)"
> 10 loops, best of 3: 367 msec per loop
>
> >\Apps\Python30\python -m timeit "import inttest; inttest.int_test2(5)"
> 10 loops, best of 3: 810 msec per loop
>
> That's on Windows XP, distributed binaries of Python 2.5 and 3.0a1.
> Processor speed:           1.7 GHz
> Processor type:            Intel(R) Pentium(R) M processor
>
> That's 120% slower (but against very different versions).
>
> I guess this proves nothing much, apart from the fact that the test is
> wildly variable and as such probably not very valid :-)

The Pentium M and Pentium D are much more alike, architecturally, than
either is to the Pentium 4, although the per-clock performance of the
Pentium M is much better than that of either the 4 or the D (although
not *that* much better than a D, I didn't think).  In a test like this,
where the loop is reasonably tight (even given the trek through the
Python interpreter), processor architecture and differing compiler
optimizations will likely have a significant effect on overall
performance.  Without looking into it at a much lower level it's hard
to tell, but the difference between a 1MB and a 2MB L2 cache might
make all the difference in 3.0 performance.
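
As a sanity check on the percentages being thrown around in this thread,
here is a minimal sketch that turns the three pairs of timings quoted
above into a "percent slower" figure and a plain ratio.  The helper name
percent_slower and the labels are made up for illustration; only the
timings themselves come from the thread.  Note that the first pair comes
out at a ratio of roughly 2.6x, which is about 160% slower, so the "260%"
figure reads more naturally as a ratio than as a slowdown.

```python
def percent_slower(new_ms, old_ms):
    """Return how much slower new_ms is than old_ms, in percent.

    Hypothetical helper, not part of timeit or any library.
    """
    return (new_ms / old_ms - 1.0) * 100.0


# (newer Python, older Python) timings in milliseconds, as quoted in the thread.
timings = {
    "original report (3.0 vs 2.6)": (6760.0, 2610.0),
    "Pentium 4, Linux (3.0 vs 2.6)": (728.0, 558.0),
    "Pentium M, Windows XP (3.0a1 vs 2.5)": (810.0, 367.0),
}

for label, (new_ms, old_ms) in timings.items():
    ratio = new_ms / old_ms
    print(f"{label}: {percent_slower(new_ms, old_ms):.0f}% slower ({ratio:.2f}x)")
```

Keeping the ratio next to the percentage makes it harder to mix the two
up when comparing numbers from different machines.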

--
Nick