[Tutor] Uniform 'for' behavior for strings and files

Tue Jun 10 02:14:27 CEST 2008

"Shrutarshi Basu" <technorapture at gmail.com> wrote

> for item in block:
>
> where block is a file, then item becomes each line of the file in turn
> But if block is a large string (let's say the actual text that was in
> the file), item becomes each character at a time.

Correct because for iterates over a sequence. It will return whatever
the unit of sequence is. So the challenge is to make the sequence
represent the data in the units you need.

So to make a long string return lines rather than characters you
need to break the string into lines. The easiest way to do that is
using the splitlines() method (a plain split() with no arguments
would break on any whitespace and give you words, not lines):

for line in block.splitlines():

But of course for files you can just iterate over the lines directly,
or read them all into a list first using readlines().
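
To make the difference in sequence units concrete, here is a small
sketch. It assumes Python 3, and it uses io.StringIO as a stand-in for
a real file so that it runs without touching the disk; the sample text
is made up for illustration.

```python
import io

# Made-up sample text, standing in for the contents of a file.
text = "first line\nsecond line\n"

# Iterating over the string itself yields one character at a time.
chars = [item for item in text]

# split() with no arguments breaks on any whitespace, so it yields words.
words = text.split()

# splitlines() breaks at line boundaries and drops the newline characters.
lines_from_string = [line for line in text.splitlines()]

# A file-like object yields lines directly; each keeps its trailing newline.
fake_file = io.StringIO(text)
lines_from_file = [line for line in fake_file]

print(chars[:5])
print(words)
print(lines_from_string)
print(lines_from_file)
```

Note that the lines from the file still carry their "\n" while the ones
from splitlines() do not, so strip them if you need the two to match.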

> have a uniform iteration irrespective of whether block is a file or
> string (ideally one \n-delimited line at a time)?

The for loop is uniform, it is just that the sequence unit is different.

> default for behavior is being problematic. I could turn the whole file
> into a string first, but that might cause problems with large files.

You could write a generator function that yields one line at a time,
whether it is handed a file or a string. That would fit your problem
quite well and be memory friendly as well, since a file is never read
in all at once. That way the for would work exactly as you require.
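
A minimal sketch of such a generator might look like the following.
It assumes Python 3, and the name iter_lines is my own invention for
illustration, not anything from the standard library. Anything that is
not a string is assumed to be a file-like object that yields lines
when iterated.

```python
import io

def iter_lines(block):
    """Yield lines without trailing newlines from a string or a file-like object."""
    if isinstance(block, str):
        # The string is already in memory, so splitting it costs no extra reads.
        source = block.splitlines()
    else:
        # Assumed to be a file-like object; iterating reads one line at a time.
        source = block
    for line in source:
        # Lines from a file keep their newline; strip it so both cases match.
        yield line.rstrip("\n")

# Usage: the same loop works for both kinds of input.
text = "alpha\nbeta\n"
for item in iter_lines(text):
    print(item)
for item in iter_lines(io.StringIO(text)):
    print(item)
```

With a real file you would pass the object returned by open() instead
of the io.StringIO stand-in used here.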

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld