[Tutor] What is the best way to count the number of lines in a huge file?

dman dsh8290@rit.edu
Thu, 6 Sep 2001 17:11:23 -0400


On Thu, Sep 06, 2001 at 09:07:08AM -0400, Ignacio Vazquez-Abrams wrote:
| On Thu, 6 Sep 2001, dman wrote:
| > On Thu, Sep 06, 2001 at 02:50:04AM -0400, Ignacio Vazquez-Abrams wrote:
| > | On Thu, 6 Sep 2001, HY wrote:
<major snippage>
| > | a=None
| > | n=0
| > | while not a=='':
| > |   a=file.read(262144) # season to taste
| > |   n+=a.count('\n')
| >
| > Just beware of Macs.  You won't find a single \n in a Mac text file
| > because they use \r instead.  FYI in case you have to deal with a text
| > file that came from a Mac.
| 
| Fair enough:
| 
| ---
| a=None
| n=0
| while not a=='':
|   a=file.read(262144)
|   n+=a.count(os.linesep)
| ---

The only problem with this is that it (truly properly) counts only
the lines of files that were created on the same OS as the one doing
the counting.

You'd probably want to use a regex to search for "\r\n|\r|\n", but it
all depends on the source(s) of the files you want to count.
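
For what it's worth, here is a rough sketch of that regex idea, not a
definitive implementation.  The names count_line_endings and pending_cr
are made up for illustration.  It assumes Python 3 and a file opened in
binary mode ('rb'), so that Python does not translate the line endings
itself.  The pending_cr flag covers the case where a "\r\n" pair gets
split across two reads, which a plain per-chunk count would count twice.

```python
import re

# One line terminator: Windows "\r\n", old Mac "\r", or Unix "\n".
# "\r\n" comes first so the pair counts as one terminator, not two.
_TERMINATOR = re.compile(b"\r\n|\r|\n")


def count_line_endings(f, chunk_size=262144):  # season to taste
    """Count line terminators in a binary file object, chunk by chunk."""
    n = 0
    pending_cr = False  # True if the previous chunk ended with "\r"
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        # A "\r" ending the previous chunk plus a "\n" starting this one
        # is a single "\r\n"; the "\r" was already counted, so drop "\n".
        if pending_cr and chunk.startswith(b"\n"):
            chunk = chunk[1:]
        n += len(_TERMINATOR.findall(chunk))
        pending_cr = chunk.endswith(b"\r")
    return n
```

Call it as count_line_endings(open(path, 'rb')), where path is whatever
file you want to count.  Note that it counts terminators, so a last
line with no terminator at the end is not included.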

Make your script "good enough", not "perfect according to some misty
definition".  :-).

-D