Fastest standard way to sum bytes (and their squares)?

Alexander Schmolck a.schmolck at gmail.com
Sun Aug 12 05:53:09 EDT 2007


Erik Max Francis <max at alcyone.com> writes:

> For a file hashing system (finding similar files, rather than identical ones),
> I need to be able to efficiently and quickly sum the ordinals of the bytes of
> a file and their squares.  Because of the nature of the application, it's a
> requirement that I do it in Python, or only with standard library modules (if
> such facilities exist) that might assist.
>
> So far the fastest way I've found is using the `sum` builtin and generators::
>
> 	ordinalSum = sum(ord(x) for x in data)
> 	ordinalSumSquared = sum(ord(x)**2 for x in data)
>
> This is about twice as fast as an explicit loop, but since it's going to be
> processing massive amounts of data, the faster the better.  Are there any
> tricks I'm not thinking of, or perhaps helper functions in other modules that
> I'm not thinking of?

Is this any faster?

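 # real part accumulates the ordinals, imaginary part their squares
 # (both come back as floats)
 ordSum, ordSumSq = (lambda c: (c.real, c.imag))(
     sum(complex(ord(x), ord(x)**2) for x in data))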
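
Another thing that might be worth timing, as a rough sketch only: count how often each of the 256 possible byte values occurs and weight the counts, so the Python-level loop runs 256 times however long the data is. This assumes Python 3 with `data` as a `bytes` object; `byte_sums` is a made-up helper name, and I have not measured it against the versions above.

```python
def byte_sums(data):
    """Return (sum of byte values, sum of squared byte values) for bytes data."""
    # bytes.count scans in C; we call it once per possible byte value.
    counts = [data.count(bytes([value])) for value in range(256)]
    # Weight each byte value (and its square) by how often it occurs.
    total = sum(value * n for value, n in enumerate(counts))
    total_sq = sum(value * value * n for value, n in enumerate(counts))
    return total, total_sq


data = b"hello world"  # hypothetical sample input
ordinal_sum, ordinal_sum_squared = byte_sums(data)

# Cross-check against the straightforward per-byte sums.
assert ordinal_sum == sum(data)
assert ordinal_sum_squared == sum(b * b for b in data)
```

The trade-off is 256 passes over the data in C against one pass in the interpreter, so whether it wins will depend on the input size and the Python build.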

'as


