Counting

John Machin sjmachin at lexicon.net
Sun Apr 29 22:58:57 EDT 2007


On 30/04/2007 7:17 AM, James Stroud wrote:
> rockmode at gmail.com wrote:
>> That's a short, abridged version of my code :) But, what I want is to
>> count total# of keywords per line and print 'em. Rather than
>> printing :
>>
>> The word 'and' belongs in line num: 1
>> The word 'del' belongs in line num: 1
>> The word 'from' belongs in line num: 1
>>
>> I want to print " Line #1 has 3 keywords"
>>
>> ;)
>>
> 
> 
> I think it would be obvious how to write this:
> 
> 
> for i,line in enumerate(linelist):
>   line = line.split()
>   for k in line:
>     if keyword.iskeyword(k):
>       c = line.count(k)
>       total += line.count(k)
>       print "Line #%d has %d keywords." % (i+1, c)
>       break
> 
> print "Total keywords are: %d" % total

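I would have thought so too. But the above is ... let's just say it's 
not quite right. Because of the break, it counts only the occurrences of 
the first keyword found on each line: for the OP's example line (3 
different keywords) it reports 1 keyword, not 3. Lines with no keywords 
print nothing at all, and total is never initialised in the snippet as 
shown.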

Here's a straightforward, natural way to do it:

import keyword

total = 0
for i, line in enumerate(linelist):
    c = 0
    line = line.split()
    for k in line:
        if keyword.iskeyword(k):
            c += 1
    # Alternatively, replace the above 5 lines by
    # c = sum(keyword.iskeyword(k) for k in line.split())
    # or the equivalent using map(), depending on taste etc :-)
    total += c
    print("Line #%d has %d keywords." % (i+1, c))
print("Total number of keywords is %d" % total)
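
For what it's worth, the sum() variant mentioned in the comment can be 
wrapped up as a small function. This is only a sketch in Python 3 syntax; 
the name count_keywords_per_line and the sample lines are made up for 
illustration, and it still relies on str.split() for tokens.

```python
import keyword


def count_keywords_per_line(linelist):
    """Return a list holding the number of keyword tokens on each line.

    Tokens are whatever str.split() yields, so this inherits all the
    limitations of whitespace splitting.
    """
    # keyword.iskeyword() returns a bool; sum() adds True as 1, False as 0.
    return [sum(keyword.iskeyword(k) for k in line.split())
            for line in linelist]


if __name__ == "__main__":
    sample = ["x = a and b", "del x", "y = 1"]  # hypothetical input
    counts = count_keywords_per_line(sample)
    for i, c in enumerate(counts):
        print("Line #%d has %d keywords." % (i + 1, c))
    print("Total number of keywords is %d" % sum(counts))
```

Returning the per-line counts as a list keeps the counting separate from 
the printing, so the grand total is just sum(counts).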

======

Perhaps someone should point out to the OP that str.split makes a 
rather deficient tokeniser:
1. Comments and string literals are counted as if they were code:
"# if not use mung(), will break while frobotzing later in code"
2. "else:" is not recognised, because the colon stays attached to the word.
3. "if not(0 <= n < maxn):" yields the token "not(0", so the "not" is missed.
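
One possible way around all three problems is to let the standard 
library's tokenize module do the splitting. The following is a rough 
sketch in Python 3 syntax, not a drop-in fix for the OP's program: the 
function name keyword_counts is invented here, and it assumes the input 
is Python source that tokenize can process (it raises an exception on 
text it cannot tokenise, e.g. an unclosed bracket).

```python
import io
import keyword
import tokenize
from collections import Counter


def keyword_counts(source):
    """Map 1-based line number -> number of keyword tokens on that line."""
    counts = Counter()
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        # Keywords arrive as NAME tokens; comments and string literals
        # have their own token types, so they are never counted here.
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            counts[tok.start[0]] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "# if not use mung(), will break while frobotzing later in code\n"
        "if not(0 <= n < maxn):\n"
        "    pass\n"
        "else:\n"
        "    s = 'and or not'\n"
    )
    counts = keyword_counts(sample)
    for lineno in sorted(counts):
        print("Line #%d has %d keywords." % (lineno, counts[lineno]))
    print("Total number of keywords is %d" % sum(counts.values()))
```

Lines without keywords simply do not appear in the result; looking them 
up in the Counter gives 0.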

HTH,
John


