[Tutor] csv DictReader/Writer question
Peter Otten
__peter__ at web.de
Fri Oct 22 16:14:40 CEST 2010
Ara Kooser wrote:
> Thank you for your response. I did try reading the documentation but I
> missed something (or several somethings in this case). So what I see in
> the code you supplied is:
>
> with open(source, "rb") as instream:
>     reader = csv.DictReader(instream, skipinitialspace=True)
>
>     destfieldnames = list(reader.fieldnames)
>     destfieldnames.remove("Species2")
>     destfieldnames.remove("Protein ID2")
>
> So this reads the csv file in but I need to run it to see what
> skipinitialspace does.
Your csv header looks like
Field1, Field2, Field3, ...
When you feed that to the DictReader you get fieldnames
["Field1", " Field2", " Field3", ...]
skipinitialspace=True tells the DictReader to ignore whitespace that
immediately follows a delimiter, i.e. you get
["Field1", "Field2", "Field3", ...]
instead.
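A minimal sketch of the difference, assuming Python 3 and an in-memory file (io.StringIO) in place of your real source file; the sample header and data row are made up for illustration:

```python
import csv
import io

# Hypothetical sample with a space after every comma
sample = "Field1, Field2, Field3\n1, 2, 3\n"

plain = csv.DictReader(io.StringIO(sample))
stripped = csv.DictReader(io.StringIO(sample), skipinitialspace=True)

plain_names = plain.fieldnames        # leading spaces are kept
stripped_names = stripped.fieldnames  # leading spaces are removed
stripped_row = next(stripped)         # the data values are stripped too

print(plain_names)
print(stripped_names)
print(dict(stripped_row))
```

Note that the option applies to every row, not only to the header, so the values lose their leading spaces as well.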
> Then it reads in the header line and removes the
> Species2 and Protein ID2. Does this remove all the data associated with
> those columns? For some reason I thought I had to bring these into a
> dictionary to manipulate them.
destfieldnames is the list of field names that will be written to the output
file. I construct it by making a copy of the list of field names of the
source and then removing the two names of the columns that you don't want in
the output. Alternatively you could use a constant list like
destfieldnames = ["Field2", "Field5", "Field7"]
to handpick the columns.
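Both approaches side by side, as a small sketch. The list fieldnames stands in for reader.fieldnames; "Species1" and "Protein ID1" are assumed names, since only the two removed columns are known from your code:

```python
# Hypothetical stand-in for reader.fieldnames
fieldnames = ["Species1", "Protein ID1", "Species2", "Protein ID2"]

# Option 1: copy the source names, then remove the unwanted ones.
# list() makes a copy, so the reader's own list is left untouched.
destfieldnames = list(fieldnames)
destfieldnames.remove("Species2")
destfieldnames.remove("Protein ID2")

# Option 2: handpick the columns with a constant list.
handpicked = ["Species1", "Protein ID1"]

print(destfieldnames)
print(fieldnames)
```

Removing names from this list does not touch the data itself. It only decides which columns the writer will emit later.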
> with open(dest, "wb") as outstream:
>     writer = csv.DictWriter(outstream,
>         destfieldnames, extrasaction="ignore")
The following line uses the csv.writer instance wrapped by the
csv.DictWriter to write the header.
>     writer.writer.writerow(destfieldnames)
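If your Python is 2.7 or 3.2 or newer, DictWriter also offers a writeheader() method, which avoids reaching into the wrapped writer attribute. A sketch, assuming Python 3, an in-memory output file, and made-up column names:

```python
import csv
import io

destfieldnames = ["Field1", "Field3"]  # hypothetical column names

outstream = io.StringIO()
writer = csv.DictWriter(outstream, destfieldnames, extrasaction="ignore")

# Writes one row containing the field names passed to the constructor
writer.writeheader()

header_line = outstream.getvalue()
print(repr(header_line))
```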
The line below iterates over the rows in the source file and writes them
into the output file.
>     writer.writerows(reader)
A more verbose way to achieve the same thing would be
for row in reader:
    writer.writerow(row)
Remember that each row is a dictionary that still contains the items for the
two columns you don't want in the output. By default the DictWriter raises a
ValueError if it encounters such extra items, but we told it to silently
ignore them with extrasaction="ignore".
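To see both behaviours, here is a small sketch assuming Python 3 and in-memory files; the row and the column names are invented for the example:

```python
import csv
import io

# Hypothetical row as a DictReader would deliver it
row = {"Field1": "a", "Field2": "b", "Field3": "c"}
destfieldnames = ["Field1", "Field3"]

# Default (extrasaction="raise"): the extra key "Field2" is an error
strict = csv.DictWriter(io.StringIO(), destfieldnames)
try:
    strict.writerow(row)
    raised = False
except ValueError:
    raised = True

# extrasaction="ignore": the extra key is dropped silently
out = io.StringIO()
lenient = csv.DictWriter(out, destfieldnames, extrasaction="ignore")
lenient.writerow(row)
written = out.getvalue()

print(raised)
print(repr(written))
```

Only the columns named in destfieldnames reach the output, in the order given there.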
> I think the first line after the open writes the field names to the file
> and the following lines write the data. Is that correct? I am going to run
> the code as soon as I get home.
Come back if you have more questions.
Peter