[Tutor] Ideas on making this cleaner?

Mats Wichmann mats at wichmann.us
Sun May 5 16:10:36 EDT 2024


On 5/4/24 09:15, Leam Hall wrote:
> This gets a rough USA grade level report for chapters in a book. There's 
> a lot of "count()" duplication, can it be done in a cleaner fashion?  
> Coding style is Python 3.12 and standard library only; no exceptions. I 
> also run "black -l 79" on everything.   :)
> 
> Full code is at:  https://github.com/LeamHall/bookbot
> 
...
>      def count_sentences(self):
>          """Counts the number of sentence ending marks."""
>          self.sentence_count = 0
>          for line in self.lines:
>              self.sentence_count += line.count(".")
>              self.sentence_count += line.count("?")
>              self.sentence_count += line.count("!")

So, to swing back to the original question, which seemed to be more 
about "I'm counting things over and over... is that bad?": the hard part 
of this work is splitting up the text in a way that identifies the 
different things you want to tot up, and the easier part is doing the 
computations once you've done so.  Making sense of the English language 
(or in fact, any human language - in contrast to computer languages 
which have well-defined parsers) is a long-standing problem that lots of 
people work on.  These days most people choose, for production use, to 
stand on the shoulders of those who have gone before, rather than 
rolling their own.  Of course, if you're coding your own project as a 
learning exercise, the balance of factors is different.  All 
this to say, someday, as a different exercise, you may want to 
experiment with using something like NLTK, which is a mature (thousands 
of commits across over 20 years of development history) toolkit for 
working with natural language.   (https://github.com/nltk/nltk)
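
On the narrower question of the repeated count() calls, here is a minimal 
sketch using only the standard library.  It is not taken from bookbot: 
the names SENTENCE_MARKS and count_sentence_marks are made up for 
illustration, and it takes a plain list of strings rather than reading 
self.lines.  It keeps the same counting rule as the quoted method and 
just folds the three calls into one expression:

```python
# Hypothetical names for illustration; not part of bookbot.
SENTENCE_MARKS = ".?!"


def count_sentence_marks(lines):
    """Return the total number of '.', '?' and '!' characters in lines."""
    # One str.count() call per (line, mark) pair, summed in one expression.
    return sum(line.count(mark) for line in lines for mark in SENTENCE_MARKS)


sample = ["Is this cleaner? Maybe.", "It counts each mark once!"]
print(count_sentence_marks(sample))  # 3
```

Like the original, this counts marks rather than sentences, so an 
ellipsis ("...") adds three and "?!" adds two.  That is the "hard part" 
mentioned above, and tidying the loop does not solve it.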


