[python-win32] print file byte contents distribution

Ghostly ghostility at gmail.com
Fri Mar 4 22:31:37 CET 2011


Thanks, guys, for your input.

As I need this for larger files (a few tens or even hundreds of MB), I
tested your suggestions, and the dict method seems to be the fastest.

I ended up with this:

----------------------------
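import sys

try:
    counter = {}

    # Read the whole file in binary mode; the with block closes the handle.
    with open(sys.argv[1], "rb") as f:
        data = f.read()

    # Count occurrences of each byte (1-character strings in Python 2).
    for byte in data:
        try:
            counter[byte] += 1
        except KeyError:
            counter[byte] = 1

    # Scale each bar against the most frequent byte (at most 66 dashes).
    peak = max(counter.values())

    for key in sorted(counter.keys()):
        print '%02x: %08d %s' % (ord(key), counter[key], '-' * (66 * counter[key] / peak))

except Exception, e:
    print e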
----------------------------
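
In case it helps anyone on Python 3, here is a rough sketch of the same
dict-style counting that reads the file in chunks instead of all at once,
so memory use stays flat for files of several hundred MB. The function
names byte_histogram and format_histogram, the 1 MiB chunk size, and the
use of collections.Counter are my own choices for illustration; I have not
timed this against the version above.

```python
from collections import Counter

CHUNK_SIZE = 1 << 20  # 1 MiB per read; an arbitrary choice, not a tuned value


def byte_histogram(path, chunk_size=CHUNK_SIZE):
    """Count how often each byte value (0-255) occurs in the file at path."""
    counts = Counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            # Iterating a bytes object yields ints in Python 3.
            counts.update(chunk)
    return counts


def format_histogram(counts, width=66):
    """Return one line per byte value present, with a bar scaled to the peak."""
    if not counts:
        return []
    peak = max(counts.values())
    return [
        "%02x: %08d %s" % (value, counts[value], "-" * (width * counts[value] // peak))
        for value in sorted(counts)
    ]
```

To print the distribution for a file given on the command line, something
like print("\n".join(format_histogram(byte_histogram(sys.argv[1])))) should
do, after an import sys.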

Cheers

