Implementing file reading in C/Python
MRAB
google at mrabarnett.plus.com
Fri Jan 9 10:34:17 EST 2009
Marc 'BlackJack' Rintsch wrote:
> On Fri, 09 Jan 2009 04:04:41 +0100, Johannes Bauer wrote:
>
>> As this was horribly slow (20 Minutes for a 2GB file) I coded the whole
>> thing in C also:
>
> Yours took ~37 minutes for 2 GiB here. This "just" ~15 minutes:
>
> #!/usr/bin/env python
> from __future__ import division, with_statement
> import os
> import sys
> from collections import defaultdict
> from functools import partial
> from itertools import imap
>
>
> def iter_max_values(blocks, block_count):
>     for i, block in enumerate(blocks):
>         histogram = defaultdict(int)
>         for byte in block:
>             histogram[byte] += 1
>
>         yield max((count, value)
>                   for value, count in histogram.iteritems())[1]
>
[snip]
Would it be faster if histogram was a list initialised to [0] * 256?
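
Something like the following is what I have in mind. It is only a sketch, not a drop-in replacement for the snipped script: the function name most_common_byte is made up for illustration, and bytearray() is used so that iterating gives integers 0..255 on both Python 2 and 3. Whether it actually beats the defaultdict would have to be timed.

```python
def most_common_byte(block):
    """Return the byte value (0..255) that occurs most often in block."""
    # One counter slot per possible byte value, indexed directly.
    histogram = [0] * 256
    # bytearray() yields integers on Python 2 and Python 3 alike.
    for byte in bytearray(block):
        histogram[byte] += 1
    # Pair each count with its byte value; on a tie the larger byte
    # value wins, as with max() over (count, value) tuples.
    return max((count, value) for value, count in enumerate(histogram))[1]
```

Note that with a fixed-size list an empty block gives 255 (every count is zero), whereas max() over an empty defaultdict raises ValueError.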