continue vs. pass in this IO reading and writing
Chris Angelico
rosuav at gmail.com
Thu Sep 3 12:11:27 EDT 2015
On Fri, Sep 4, 2015 at 1:57 AM, kbtyo <ahlusar.ahluwalia at gmail.com> wrote:
> I have used csv and collections. For some reason, when I apply this algorithm, not all of my files get added (the output is ridiculously small considering how much goes in - think KB output vs MB input):
>
> from glob import iglob
> import csv
> from collections import OrderedDict
>
> files = sorted(iglob('*.csv'))
> header = OrderedDict()
> data = []
>
> for filename in files:
>     with open(filename, 'r') as fin:
>         csvin = csv.DictReader(fin)
>         header.update(OrderedDict.fromkeys(csvin.fieldnames))
>         data.append(next(csvin))
>
> with open('output_filename_version2.csv', 'w') as fout:
>     csvout = csv.DictWriter(fout, fieldnames=list(header))
>     csvout.writeheader()
>     csvout.writerows(data)
You're collecting up just one row from each file. Since you say your
input is measured in MB (not GB or anything bigger), the simplest
approach is probably fine: instead of "data.append(next(csvin))", just
use "data.extend(csvin)", which should grab them all. That'll store
all your input data in memory, which should be fine if it's only a few
meg, and probably not a problem for anything under a few hundred meg.
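Something like this - a minimal sketch with only that one line changed, the rest being your script as posted:

from glob import iglob
import csv
from collections import OrderedDict

files = sorted(iglob('*.csv'))
header = OrderedDict()
data = []

for filename in files:
    with open(filename, 'r') as fin:
        csvin = csv.DictReader(fin)
        # Union of all column names seen so far, in first-seen order
        header.update(OrderedDict.fromkeys(csvin.fieldnames))
        # extend() consumes every remaining row, not just the first one
        data.extend(csvin)

with open('output_filename_version2.csv', 'w') as fout:
    csvout = csv.DictWriter(fout, fieldnames=list(header))
    csvout.writeheader()
    csvout.writerows(data)

As a bonus, DictWriter fills in an empty string for any column a particular
file didn't have, so writing against the union of all fieldnames still works
even if the files don't all share exactly the same columns.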
ChrisA