Fast pythonic way to process a huge integer list
Peter Otten
__peter__ at web.de
Thu Jan 7 05:21:03 EST 2016
high5storage at gmail.com wrote:
> I have a list of 163,840 integers. What is a fast & pythonic way to
> process this list in 1,280 chunks of 128 integers?
What kind of processing do you have in mind?
If it is about number crunching, use a numpy.array. It can also easily
change its shape:
>>> import numpy
>>> a = numpy.array(range(12))
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
>>> a.shape = (3, 4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
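If, say, the per-chunk processing were a sum (just a stand-in for
whatever your processing actually does), the reshaped array lets numpy
handle all chunks in one vectorized call:

>>> a.sum(axis=1)  # one result per 4-element chunk
array([ 6, 22, 38])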
If it's really only(!) under a million integers, slicing is also good:

items = [1, 2, ...]
CHUNKSIZE = 128

for i in range(0, len(items), CHUNKSIZE):
    process(items[i:i + CHUNKSIZE])
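A quick interactive check with a toy list and chunk size shows the
chunk boundaries, including the shorter final chunk:

>>> items = list(range(10))
>>> [items[i:i + 4] for i in range(0, len(items), 4)]
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]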
If the "list" is really huge (your system starts swapping memory) you can go
completely lazy:
from itertools import chain, islice

def chunked(items, chunksize):
    items = iter(items)
    for first in items:
        chunk = chain((first,), islice(items, chunksize - 1))
        yield chunk
        for dummy in chunk:  # consume items that may have been skipped
            pass             # by your processing

def produce_items(file):
    for line in file:
        yield int(line)

CHUNKSIZE = 128  # this could also be "huge"
                 # without affecting memory footprint

with open("somefile") as file:
    for chunk in chunked(produce_items(file), CHUNKSIZE):
        process(chunk)
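A quick sanity check that chunked() works with any iterable, not just a
file (toy data, chunk size 4):

>>> for chunk in chunked(range(10), 4):
...     print(list(chunk))
...
[0, 1, 2, 3]
[4, 5, 6, 7]
[8, 9]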