Implementing file reading in C/Python
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Fri Jan 9 16:14:52 EST 2009
On Fri, 09 Jan 2009 15:34:17 +0000, MRAB wrote:
> Marc 'BlackJack' Rintsch wrote:
>
>> def iter_max_values(blocks, block_count):
>>     for i, block in enumerate(blocks):
>>         histogram = defaultdict(int)
>>         for byte in block:
>>             histogram[byte] += 1
>>
>>         yield max((count, byte)
>>                   for byte, count in histogram.iteritems())[1]
>>
> [snip]
> Would it be faster if histogram was a list initialised to [0] * 256?
Don't know. Then we have to call `ord()` for every byte in the 2 GiB to
get a list index. Maybe the speedup from the list compensates for this,
maybe not.
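
For reference, a minimal sketch of the list-based variant. The function name `most_common_byte` is made up for illustration, and the sketch assumes `bytearray` is available (Python 2.6 or later): iterating over a `bytearray` yields integers, so the `ord()` call goes away. Nothing here is measured, so whether it is actually faster stays an open question.

```python
def most_common_byte(block):
    """Return the most frequent byte value (0-255) in block."""
    histogram = [0] * 256
    # Iterating over a bytearray yields integers, so no ord() is needed.
    for value in bytearray(block):
        histogram[value] += 1
    # max() returns the first maximum it sees, so ties go to the
    # smallest byte value (the tuple-based version picks the largest).
    return max(range(256), key=histogram.__getitem__)
```

Note that this returns an integer rather than a one-character string, and that an empty block gives 0.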
I think the main problem here is that we have to do something with
*every* byte of that really large file *at Python level*. In C those are
just primitive numbers. Python has all the object overhead.
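
One way to keep the per-byte loop out of Python, sketched under the assumption of Python 3 `bytes` blocks: let `bytes.count()` do the scanning in C, once per possible byte value. The function name is hypothetical, and this trades one Python-level pass for 256 C-level passes per block, so it is untested as to whether it wins.

```python
def most_common_byte_via_count(block):
    """Return the most frequent byte value (0-255) in a bytes block."""
    # bytes.count() scans the block in C; Python only loops 256 times
    # per block instead of once per byte.
    counts = [block.count(bytes((value,))) for value in range(256)]
    # Ties go to the smallest byte value, since max() keeps the first maximum.
    return max(range(256), key=counts.__getitem__)
```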
Ciao,
Marc 'BlackJack' Rintsch