Best way to hash a dictionary

Alex Martelli aleax at aleax.it
Tue Mar 4 05:28:26 EST 2003


Miki Tebeka wrote:
   ...
> It might not be the main performance bottleneck but everything that
> will slow me down is usually bad.
   ...
> When you use a user-defined class instead of a built-in type you lose
> some performance (3% from my testing on hashes)

Ah, we have a "philosophical" problem here.  If ANY performance
issue is intolerable, to the point that a 3% slowdown on SOME
parts of your code is something you must consider in deciding
your design, then Python is surely the wrong tool, and I would
suggest you also eschew C++ in favour of carefully hand-optimized
machine code -- not even the best optimizing compiler can make
sure it won't lose some 3% here or there.
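
If you do want to pin down such a cost, a quick timing loop
takes but a minute to whip up.  Here's a rough sketch (Key is
just a made-up stand-in for whatever class you're hashing):

    import time

    class Key:
        def __init__(self, a, b):
            self.a, self.b = a, b
        def __hash__(self):
            # delegate to the built-in tuple hash
            return hash((self.a, self.b))

    def bench(obj, n=1000000):
        # time n calls to hash(obj) -- coarse, but a large n
        # makes even a few-percent difference visible
        start = time.clock()
        for i in xrange(n):
            hash(obj)
        return time.clock() - start

    print "tuple:", bench((1, 2))
    print "class:", bench(Key(1, 2))

Such a loop measures hashing in isolation, of course -- which
is exactly why its result must be kept in perspective.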

You say you don't profile because profiling slows down your
runs.  Yes, but profiling is also the ONLY sensible way to
find out what you must optimize.  Surely you can arrange a
"smaller" but still significant run, taking e.g, 1 day rathen
than 5, so that even if, instrumented for profiling (hint:
use hotshot, NOT the old profiler -- hotshot isn't documented,
but it's MUCH less invasive than the old profiler!), it slows
down by 2-3 times, you're still going to be done in a few
days.  Most likely you can arrange runs far shorter yet --
back when I did computational linguistics (IBM Research in
the '80s) I managed to use REXX (IBM's scripting language)
for MOST of my work, going down to PL/I, Pascal/VS (or even
well-vectorized Fortran carefully made cache-friendly when
I needed REAL performance -- going further down to BAL gave
no further boosts!) only for CAREFULLY selected hot-spots
(I didn't have a decent profiler -- I made do with logging
times at key turning-points to a file and analyzing that).
The "performance tuning" runs were on toy datasets, selected
to give me good performance prediction for the "real" runs.
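
Speaking of hotshot, here's a minimal sketch of how little
ceremony it needs -- main and the log filename are of course
just placeholders for whatever drives your own run:

    import hotshot, hotshot.stats

    def main():
        # stand-in for whatever really drives your run
        for i in xrange(1000000):
            i * i

    # run main() under the profiler, logging to a file
    prof = hotshot.Profile("myrun.prof")
    prof.runcall(main)
    prof.close()

    # load the log and print the 20 biggest time-consumers
    stats = hotshot.stats.load("myrun.prof")
    stats.sort_stats("time", "calls")
    stats.print_stats(20)

hotshot.stats.load hands back an ordinary pstats-style stats
object, so all the usual sort_stats/print_stats tricks apply.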

The most significant "optimization" was managing to use
VM/SP rather than MVS (a factor of two...) and even more
managing to grab dedicated time on a 3090 with 6 CPUs,
vector features, and memory to burn, rather than chugging
along on a 9370.  The pieces of hardware in question (and
mostly the OS's too) are now "consigned to the dustbin of
history", but the lesson remains valid: e.g. if you're
now chugging along (for example) on a Pentium-3 machine
running Windows, the best single optimization you can
make might be to switch to a good new Athlon with DDR
and FreeBSD (other free Unix-like systems are not far behind,
but for sheer performance I think FreeBSD's still on top).

But, you just CANNOT avoid profiling when you're after
performance.  Guessing at where the bottlenecks might be
just doesn't work -- even the most experienced optimizer
guy, whose guesses are well-informed and clever, gets it
wrong, over and over again, when he forgets to MEASURE.
What with multiple levels of caches with surprisingly smart
behavior, pipelines, etc., predicting the performance of
today's CPUs on any non-trivial task is a lost cause --
and when you put many other factors into the pot, CPU
performance issues become simple in comparison -- disk,
networking, virtual memory, task switching, databases...
there's just NO substitute for measurement (profiling).
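
And when even hotshot is too invasive for a given run, the
log-timestamps-at-key-points trick I mentioned above is
trivial in Python.  A bare-bones sketch (the file name and
the labels are whatever suits you):

    import time

    _log = open("timings.log", "w")

    def checkpoint(label):
        # append a timestamped marker to the log file;
        # analyze the deltas between markers afterwards
        _log.write("%s %.6f\n" % (label, time.time()))
        _log.flush()

    # sprinkle calls such as checkpoint("parse-start") at
    # the key turning-points of your run

Crude, but it's measurement all the same -- and infinitely
better than guessing.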


Alex




