[Tutor] mapping header row to data rows in file

Peter Otten __peter__ at web.de
Fri Jun 28 19:09:22 CEST 2013


Sivaram Neelakantan wrote:

> I apologise for mailing you directly but this one seems to work but I
> don't seem to understand it.  Could you please explain this?

[I don't see anything private about your questions, so I'm taking the 
liberty to bring this back on list]

> a) for row in reader(f)...
>   reader(f) is called 6 times or not?

No, the reader() function is called once before the first iteration of the 
loop. You can think of

for x in expr():
   ...

as syntactic sugar for

tmp = iter(expr())
while True:
    try:
        x = next(tmp)
    except StopIteration:
        break
    ...
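
To check this yourself, here is a small sketch that counts the calls. The 
names expr, calls, seen and seen_manual are made up for the illustration and 
are not part of your script; the syntax is Python 3:

```python
calls = []

def expr():
    # Record every call so we can count them afterwards.
    calls.append("called")
    return [10, 20, 30]

# The ordinary for-loop.
seen = []
for x in expr():
    seen.append(x)

# The hand-written equivalent of the loop above.
seen_manual = []
tmp = iter(expr())
while True:
    try:
        x = next(tmp)
    except StopIteration:
        break
    seen_manual.append(x)

print(len(calls))           # one call per loop, not one per item
print(seen == seen_manual)  # both loops see the same items
```

Each loop calls expr() exactly once, no matter how many items it produces.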

> b) why isn't the print in reader() not printing each row each time
> reader() is called

reader() is called just once, so its print statements run just once. The 
function returns a "generator" built from the "generator expression"

(Row(*values) for values in rows)

which corresponds to the "tmp" variable in the long version of the for-loop 
above. A generator is lazy: it produces one value each time you call next() 
on it:

>>> g = (i*i for i in [1, 2, 3])
>>> next(g) # same as g.next() in Python 2 or g.__next__() in Python 3
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

There is an alternative way to create a generator that is perhaps easier to 
grasp:

>>> def f():
...     for i in [1, 2, 3]:
...         yield i*i
... 
>>> g = f()
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

On each next() call the code in f() resumes where it left off and runs until 
it encounters the next "yield". When the function body finishes, next() 
raises StopIteration.
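
If you want to watch this happen, the following sketch records what the 
generator function does between the next() calls. The names squares and log 
are invented for the example; the syntax is Python 3:

```python
log = []

def squares():
    log.append("start")
    for i in [1, 2, 3]:
        yield i * i
        # This line only runs when the generator is resumed.
        log.append("resumed after %d" % (i * i))
    log.append("end")

g = squares()                  # creates the generator; the body has not run yet
log_after_create = list(log)

first = next(g)                # runs the body up to the first yield
log_after_first = list(log)

second = next(g)               # resumes after the first yield, stops at the second
log_after_second = list(log)

print(log_after_create)
print(first, log_after_first)
print(second, log_after_second)
```

Calling squares() runs none of the body, which is the same reason the print 
statements in your reader() are not repeated for every row.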

> c) what does Row(*values) do?

It unpacks the values sequence into separate positional arguments. For 
example, if values is a list of length 3 like values = ["a", "b", "c"] then 

Row(*values) 

is equivalent to 

Row(values[0], values[1], values[2])

or 

Row("a", "b", "c")
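
A short sketch with a namedtuple, using three of the column names from your 
output as the field names (the variables row, same and error are just for 
the illustration; Python 3 syntax):

```python
from collections import namedtuple

Row = namedtuple("Row", ["Symbol", "Series", "Date"])
values = ["STER", "EQ", "22-Nov-2012"]

row = Row(*values)                           # unpacked: three arguments
same = Row(values[0], values[1], values[2])  # spelled out by hand

print(row == same)
print(row.Symbol, row.Date)

# Without the star the whole list is passed as a single argument,
# which does not match the three fields of Row.
try:
    Row(values)
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)
```

The star is what lets the same line work for any number of columns, as long 
as the length of values matches the number of field names.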

> 
> --8<---------------cut here---------------start------------->8---
> def reader(instream):
>     rows = csv.reader(instream)
>     #    rows = (line.split(",") for line in instream)
>     rows = ([field.strip() for field in row] for row in rows) 
>     print type(rows)
>     names = next(rows)
>     print names
>     Row = namedtuple("Row", names)
>     return (Row(*values) for values in rows)
> 
> with open("AA.csv", "r") as f:
>     for row in reader(f):
>         print row
> 
> $ python csvproc.py
> <type 'generator'>
> ['Symbol', 'Series', 'Date', 'Prev_Close']
> Row(Symbol='STER', Series='EQ', Date='22-Nov-2012', Prev_Close='9')
> Row(Symbol='STER', Series='EQ', Date='29-Nov-2012', Prev_Close='10')
> Row(Symbol='STER', Series='EQ', Date='06-Dec-2012', Prev_Close='11')
> Row(Symbol='STER', Series='EQ', Date='06-Jun-2013', Prev_Close='9')
> Row(Symbol='STER', Series='EQ', Date='07-Jun-2013', Prev_Close='9')
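
For anyone following along on Python 3, here is a sketch of the same 
reader() with the debugging prints removed. To keep it self-contained, an 
io.StringIO object stands in for AA.csv; its contents are an assumption, 
reconstructed from the first two rows of the output quoted above:

```python
import csv
import io
from collections import namedtuple

def reader(instream):
    rows = csv.reader(instream)
    # Strip surrounding whitespace from every field, lazily.
    rows = ([field.strip() for field in row] for row in rows)
    # The first row holds the column names.
    names = next(rows)
    Row = namedtuple("Row", names)
    # Nothing below is evaluated until the caller iterates.
    return (Row(*values) for values in rows)

# Stand-in for the real file (assumed contents).
sample = io.StringIO(
    "Symbol, Series, Date, Prev_Close\n"
    "STER, EQ, 22-Nov-2012, 9\n"
    "STER, EQ, 29-Nov-2012, 10\n"
)

result = list(reader(sample))
for row in result:
    print(row)
```

With a real file you would write "with open("AA.csv", newline="") as f:" and 
loop over reader(f) as before.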



