It's ...

Beni Cherniavsky beni.cherniavsky at gmail.com
Tue Jun 30 16:24:15 EDT 2009


On Jun 24, 11:40 pm, "J. Cliff Dyer" <j... at sdf.lonestar.org> wrote:
> Also note that you can iterate over a file several times:
>
> f = open('foo.txt')
> for line in f:
>     print line[0]  # prints the first character of every line
> for line in f:
>     print line[1]  #prints the second character of every line
>
No, you can't.  The second loop prints nothing!
A file by default advances forward.  Once you reach the end, you stay
there.

You could explicitly call f.seek(0, 0) to rewind it.  Note that not
all file objects are seekable (e.g. pipes and sockets aren't).
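
Here is a minimal sketch of the rewind approach, in case it helps to see
the three passes side by side. The temporary file and its two-line
contents are made up for illustration and only stand in for foo.txt; the
snippet is written to run on Python 2.6+ as well as 3.

```python
import os
import tempfile

# Throwaway file standing in for foo.txt (hypothetical contents).
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as out:
    out.write('ab\ncd\n')

f = open(path)
first_chars = [line[0] for line in f]    # reads the file through to the end
exhausted = [line[0] for line in f]      # empty: the file is still at the end
f.seek(0, 0)                             # rewind to the beginning
second_chars = [line[1] for line in f]   # iterating yields lines again
f.close()
os.remove(path)
```

The middle pass collects nothing, which is exactly what happened to the
second loop in the quoted example.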

The cleaner way to read a regular file twice is to *open* it each time:

for line in open('foo.txt'):
    print line[0]  # prints the first character of every line
for line in open('foo.txt'):
    print line[1]  # prints the second character of every line

Quick recap for Angus:
for loops work on "iterables" - objects that can be asked for an
"iterator".
Python iterators are unseekable - once exhausted they stay empty.
Most iterables (e.g. lists) return a new iterator every time you ask,
so you can iterate over the same data many times.
But if you already have an iterator, you can use it in a for loop -
when asked for an iterator, it will offer itself (in other words an
iterator is a degenerate kind of iterable).
This is what happened with the file object - it's an iterator and
can't be reused.
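
The recap can be checked interactively with the built-in iter(), which
is what a for loop calls behind the scenes. This is only a sketch; the
list and the variable names are arbitrary.

```python
data = ['a', 'b', 'c']             # a list is an iterable, not an iterator

it1 = iter(data)                   # ask the list for an iterator
it2 = iter(data)                   # ask again
fresh_each_time = it1 is not it2   # each request returned a new iterator

same_object = iter(it1) is it1     # an iterator offers itself when asked

first_pass = list(it1)             # consumes it1 completely
second_pass = list(it1)            # exhausted, so nothing is left
again = list(data)                 # the list itself can be iterated again
```

An open file behaves like it1 here, not like data.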

Reusing the same iterator between for loops is sometimes useful if you
exit the first loop mid-way:

import sys

f = open('foo.mail')
# skip headers until you see an empty line
for line in f:
    if not line.strip():
        break
# print remainder of file (continues from where the first loop stopped)
for line in f:
    sys.stdout.write(line)


P.S. Warning: after you use ``for line in f``, it's dangerous to use
``f.read()`` and ``f.readline()`` on the same file.  The iterator reads
ahead into its own buffer, so those calls can skip data - just don't
mix them.


> Glad you're enjoying Beazley.  I would look for something more
> up-to-date.  Python's come a long way since 2.1.  I'd hate for you to
> miss out on all the iterators, booleans, codecs, subprocess, yield,
> unified int/longs, decorators, decimals, sets, context managers and
> new-style classes that have come since then.
>
Seconded - 2.1 is ancient.
If you continue with the book, here is a quick list of the most
fundamental improvements to keep in mind:

1. Iterators, generators, generator expressions.
2. Working nested scopes.
3. Decorators.
4. with statement.
5. set & bool types.
6. Descriptors (if confusing, just understand properties).
7. from __future__ import division, // operator.

and the most refreshing modules added:

- subprocess
- ctypes
- itertools
- ElementTree
- optparse

and not new, but I just love introducing it to people:

- doctest
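
For anyone who hasn't met doctest, a small sketch of the idea: the
examples in a docstring are written like an interactive session, and
doctest re-runs them and complains if the output differs. The function
first_chars is a made-up helper for illustration, not something from
the thread's code.

```python
import doctest

def first_chars(lines):
    """Return the first character of every non-empty line.

    >>> first_chars(['foo', 'bar'])
    ['f', 'b']
    >>> first_chars([])
    []
    """
    return [line[0] for line in lines if line]

# Run the examples in the docstring above; this prints nothing when
# they all pass and a report for each one that fails.
doctest.run_docstring_examples(first_chars, {'first_chars': first_chars})
```

In a module of your own, the usual idiom is to call doctest.testmod()
under ``if __name__ == '__main__':`` so every docstring in the file gets
checked.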


