[Tutor] text module
Scot W. Stevenson
scot@possum.in-berlin.de
Sat, 31 Aug 2002 01:41:30 +0200
Hello Kyle,
> started to write a module (called it the text module, any better
> ideas?)
I would agree with J"o that "wc.py" might be a better name, even though you
might end up explaining to non-Unix-people that it doesn't involve toilet
paper =8).
> body = subj.readlines()
This is okay if you know that the file is going to be small and your machine
is big, but for large files on small machines, it could be a problem:
readlines loads the whole file into memory in one big gulp. xreadlines was
created in Python 2.1 (I think) to avoid this problem, but if you have
Python 2.2 or later, the really cool thing to do is to use iterators and
simply create a loop such as:
    for line in subj:
        (etc)
which reads one line at a time as a string. You can get the number of
characters in that line (spaces and the trailing newline included) as
    len(line)
and the number of spaces as
    line.count(" ")
Subtracting the second from the first should be a simpler way of
calculating the number of characters without spaces.
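A small sketch of those two counts, looping over the lines one at a time.
It assumes a current Python 3 interpreter rather than 2.2; the helper name
char_counts is made up for illustration, and io.StringIO only stands in
for a real file object:

```python
import io

def char_counts(lines):
    """Return (all_chars, non_space_chars) for an iterable of lines."""
    all_chars = 0
    non_space = 0
    for line in lines:              # one line at a time, never the whole file
        all_chars += len(line)      # spaces and trailing newline included
        non_space += len(line) - line.count(" ")
    return all_chars, non_space

# io.StringIO behaves like an open text file for this purpose
sample = io.StringIO("hello world\nfoo bar baz\n")
print(char_counts(sample))
```

Because the loop only ever holds one line, memory use stays flat no matter
how large the input is.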
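    wordlist = line.split()
gives you a list of words split on whitespace, and the length of that list
is therefore the number of words in the line. (With split(" ") you would
get an extra empty string for every doubled space, and a blank line would
count as one word.)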
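To see how much the separator argument matters for the word count, here is
a short comparison. It is only a sketch, assuming Python 3; the two helper
names are hypothetical:

```python
def words_by_space(line):
    """Count pieces produced by splitting on single spaces only."""
    return len(line.split(" "))

def words_by_whitespace(line):
    """Count pieces produced by splitting on any run of whitespace."""
    return len(line.split())

for text in ["one two three\n", "two  spaces here\n", "\n"]:
    # repr() makes the newline and doubled spaces visible
    print(repr(text), words_by_space(text), words_by_whitespace(text))
```

The two agree on tidy input, but they drift apart as soon as a line
contains doubled spaces or nothing but a newline.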
So to figure out everything in one loop at once, you could try (in Python
2.2 or later):
========================================
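def CountAll(location):
    # Running totals: lines, words, all characters, non-space characters
    nbr_lines = nbr_words = nbr_allchars = nbr_blackchars = 0
    subj = file(location, 'r')
    for line in subj:    # file iterator: one line at a time
        nbr_lines = nbr_lines + 1
        # split() with no argument splits on any run of whitespace
        nbr_words = nbr_words + len(line.split())
        temp = len(line)    # includes the trailing newline
        nbr_allchars = nbr_allchars + temp
        nbr_blackchars = nbr_blackchars + temp - line.count(" ")
    subj.close()
    print nbr_lines, nbr_words, nbr_allchars, nbr_blackchars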
=========================================
which, of course, is not quite what you wanted to do...but you should be
able to adapt it to your setup quite easily. Again, this will only work
with Python 2.2 or later, so if you have an older version, you are going
to have to use xreadlines and such.
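For anyone reading this on a much newer interpreter, one possible
adaptation is sketched below. It assumes Python 3 (where file() and the
print statement no longer exist), returns the totals instead of printing
them, and uses the made-up name count_all; treat it as a starting point,
not as the one right way to do it:

```python
import os
import tempfile

def count_all(location):
    """Return (lines, words, all_chars, non_space_chars) for a text file."""
    nbr_lines = nbr_words = nbr_allchars = nbr_blackchars = 0
    with open(location, "r") as subj:    # closed automatically on exit
        for line in subj:                # still one line at a time
            nbr_lines += 1
            nbr_words += len(line.split())
            nbr_allchars += len(line)    # trailing newline included
            nbr_blackchars += len(line) - line.count(" ")
    return nbr_lines, nbr_words, nbr_allchars, nbr_blackchars

if __name__ == "__main__":
    # Write a tiny throwaway file so the example is self-contained
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write("one two\n\nthree  four five\n")
    try:
        print(count_all(path))
    finally:
        os.remove(path)
```

Returning a tuple keeps the counting separate from the output, which makes
it easier to reuse from a module such as the one you are writing.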
Hope this helped,
Y, Scot
--
Scot W. Stevenson wrote me on Saturday, 31. Aug 2002 in Zepernick, Germany
on his happy little Linux system that has been up for 1774 hours
and has a CPU that is falling asleep at a system load of 0.30.