[Tutor] just what does read() return?
steve at pearwood.info
Fri Oct 1 02:01:48 CEST 2010
On Fri, 1 Oct 2010 08:32:40 am Alex Hall wrote:
> I fully expected to see txt be an array of strings since I figured
> self.original would have been split on one or more new lines. It
> turns out, though, that I get this instead:
> ['l\nvx vy z\nvx vy z']
There's no need to call str() on something that already is a string.
Admittedly it doesn't do much harm, but it is confusing for the person
reading, who may be fooled into thinking that perhaps the argument
wasn't a string in the first place.
The string split method doesn't interpret its argument as a regular
expression. r'\n+' has no special meaning here. It's just three literal
characters: backslash, the letter n, and the plus sign. split() tries to
split on that substring, and since your data doesn't include that
combination anywhere, returns a list containing a single item, the
whole unsplit string.
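
A minimal sketch of the difference, using the sample data from your
output. The variable names are mine, and re.split() is just one way to
split on runs of newlines, not necessarily what you need:

```python
import re

text = 'l\nvx vy z\nvx vy z'

# str.split() looks for the literal substring backslash, n, plus.
# That substring is not in the data, so nothing is split.
literal = text.split(r'\n+')

# re.split() does treat its first argument as a regular expression,
# so this splits on one or more newline characters.
by_regex = re.split(r'\n+', text)

print(literal)   # ['l\nvx vy z\nvx vy z']
print(by_regex)  # ['l', 'vx vy z', 'vx vy z']
```
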
> How is it that txt is not an array of the lines in the file, but
> instead still holds \n characters? I thought the manual said read()
> returns a string:
It does return a string. It is a string including the newline
characters.
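
To illustrate, here is a small sketch. I'm using io.StringIO as a
stand-in for your open file (an assumption on my part, so the example
runs without a real file on disk):

```python
import io

# io.StringIO behaves like a text file opened for reading.
f = io.StringIO('l\nvx vy z\nvx vy z')

data = f.read()            # a single string, newlines included
lines = data.splitlines()  # a list of lines, newlines removed

print(type(data))  # <class 'str'>
print(repr(data))  # 'l\nvx vy z\nvx vy z'
print(lines)       # ['l', 'vx vy z', 'vx vy z']
```

If a list of lines is what you want, splitlines() on the string does
the job without any regular expression.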
> I know I can use f.readline(), and I was doing that before and it all
> worked fine. However, I saw that I was reading the file twice and, in
> the interest of good practice if I ever have this sort of project
> with a huge file, I thought I would try to be more efficient and read
> it once.
You think that keeping a huge file in memory *all the time* is more
efficient? It's the other way around -- when dealing with *small* files
you can afford to keep them in memory. When dealing with huge files, you
need to re-write your program to deal with the file a piece at a time.
(This is often a good strategy for small files as well, but it is
essential for huge ones.)
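
As a rough sketch of the piece-at-a-time approach: the function name
and the "count non-blank lines" task below are hypothetical, chosen
only to show the pattern, and the temporary file exists just so the
example is runnable:

```python
import os
import tempfile

def count_nonblank_lines(path):
    """Count non-blank lines, holding only one line in memory at a time."""
    count = 0
    with open(path) as f:
        for line in f:  # a file object yields one line per iteration
            if line.strip():
                count += 1
    return count

# Small demonstration file, created only for this example.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('l\nvx vy z\n\nvx vy z\n')
    demo_path = tmp.name

result = count_nonblank_lines(demo_path)
os.remove(demo_path)
print(result)  # 3
```

The loop never builds a list of all the lines, so memory use stays
roughly constant no matter how large the file is.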
Of course, "small" and "huge" are relative to the technology of the day.
I remember when 1MB was huge. These days, huge would mean gigabytes.
Small would be anything under a few tens of megabytes.