[Tutor] Ideas on making this cleaner?
ThreeBlindQuarks
threesomequarks at proton.me
Sat May 4 18:25:49 EDT 2024
Leam,
The initial message was indented fine for me albeit some of the replies ended up flush to the left.
My question was more about the design parameters.
Generally, when a class DOES something, there is a way to access the result, or why bother?
So, in your code, you start with __init__() and do not declare any variables in advance. The initializer creates variables like "filename" and "lines", and several of the other methods create others like "sentence_count" and "word_count".
When the initializer has completed, all the variables presumably are set, and you could ask a new object, say myreport, to share info such as myreport.filename or myreport.word_count.
On my first quick read, I did not notice that there was indeed one method not invoked by the initializer, and that the intent was to call myreport.report_data(), which builds a temporary dictionary in the local namespace of the function and returns it without storing it in the namespace of the object.
So, I see your design now. Again, not objecting, just wrapping my mind around how you chose to do it.
If a report is only asked for once, or maybe never, this is fine. Other designs might build the report automatically at startup, or delay all the processing until and unless a report is requested, or even keep the report in the object the first time it is asked for so that a subsequent request just returns the previously computed values. In such a variant, you might even remove the instance variables already copied into the dictionary to save space.
There are MANY choices and none need be right or wrong and it may depend on how you see you or others using this class.
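To make the "compute only when asked, then remember it" variant concrete, here is a rough sketch, not your actual code. The class name LazyReport and the reduced set of counts are my own inventions for illustration, and it assumes report_data becomes an attribute-style access through functools.cached_property rather than a method call.

```python
from functools import cached_property


class LazyReport:
    """Hypothetical variant: nothing is counted until the report is requested."""

    def __init__(self, data, filename):
        self.filename = filename
        self.lines = [line.lower() for line in data]

    @cached_property
    def report_data(self):
        # Runs on first access only; the returned dict is stored on the
        # instance, so later accesses reuse it without recounting.
        sentence_count = sum(
            line.count(mark) for line in self.lines for mark in ".?!"
        )
        word_count = sum(len(line.split()) for line in self.lines)
        return {
            "filename": self.filename,
            "sentence_count": sentence_count,
            "word_count": word_count,
        }


myreport = LazyReport(["Hello there. How are you?"], "chapter1.txt")
first = myreport.report_data
second = myreport.report_data
print(first is second)  # True: the second access returned the cached dict
```

The trade-off is that callers write myreport.report_data without parentheses, and the cached dictionary will not notice if self.lines changes afterwards.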
My other comment may need explaining. I was not so much worried about using an initial underscore to make some methods look private, albeit some purists might.
What I was wondering about is whether the design would be faster or simpler if you had only two methods in the first place. In Python, you can define a function within the body of an existing function and use it and other internal functions fairly easily.
Are there advantages and disadvantages? Do these functions need to be passed "self" and use it, and will they run any slower or faster than the way you chose? I am simply saying that as a design, I might have tried to use internal functions rather than methods never meant to be invoked from outside.
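As a sketch of the internal-function idea, assuming a cut-down class I am calling NestedReport (a made-up name, with only two of your counts), the helpers can live inside the initializer and see its local variables directly, so they do not need "self" passed in:

```python
class NestedReport:
    """Hypothetical two-method variant with helpers nested in __init__."""

    def __init__(self, data, filename):
        self.filename = filename
        lines = [line.lower() for line in data]

        # These helpers exist only while __init__ runs. They read `lines`
        # from the enclosing scope, so no self parameter is needed.
        def count_sentences():
            return sum(line.count(mark) for line in lines for mark in ".?!")

        def count_words():
            return sum(len(line.split()) for line in lines)

        self.sentence_count = count_sentences()
        self.word_count = count_words()
        self.sentence_average = self.word_count / self.sentence_count

    def report_data(self):
        """Collates and returns report data."""
        return {
            "filename": self.filename,
            "sentence_average": self.sentence_average,
        }


report = NestedReport(["One. Two words!"], "demo.txt")
print(report.report_data())
# {'filename': 'demo.txt', 'sentence_average': 1.5}
```

On the down side, the nested functions are rebuilt every time an object is created and cannot be tested on their own. I have not timed either version, so I make no claim about which is faster.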
Your main question was whether you could do some of this in other ways and it is in that spirit I mentioned it.
We probably could discuss things you did not ask about. For example, your many functions each traverse the same lines, but they all scribble on the same namespace rather than return a result to the caller in the initializer. Thus, could you have traversed the list of lines once and done each step in the same place, either in one or a few methods or in the main initializer method?
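A single traversal that returns its results might look roughly like this. The function name count_all is hypothetical, and the plain vowel tally is a deliberate simplification standing in for your fuller syllable logic:

```python
def count_all(lines):
    """Hypothetical single-pass counter that returns results to its caller."""
    sentence_count = 0
    word_count = 0
    vowel_count = 0
    for line in lines:
        line = line.lower()
        # Each line is visited once and all three tallies are updated here.
        sentence_count += sum(line.count(mark) for mark in ".?!")
        word_count += len(line.split())
        vowel_count += sum(line.count(vowel) for vowel in "aeiou")
    return sentence_count, word_count, vowel_count


sentences, words, vowels = count_all(["The cat sat.", "Did it? Yes!"])
print(sentences, words, vowels)  # 3 6 6
```

The initializer could then unpack the returned tuple into instance variables, which keeps the counting code free of side effects.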
I mentioned using dictionaries in some places and another one to consider is using sets and testing set membership which can look superficially simpler than listing choices. And, in a sense, it could generalize better should you ever decide to adapt your code to understand reports in another language where you might have alternate sets of vowels and so on.
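Here is a small sketch of the set idea. All the names are made up for illustration, and the German set is only there to show how another language could be swapped in:

```python
ENGLISH_VOWELS = frozenset("aeiou")
# Illustrative only: German adds the umlauted vowels.
GERMAN_VOWELS = ENGLISH_VOWELS | frozenset("äöü")
SENTENCE_ENDINGS = frozenset(".?!")


def count_members(text, members):
    """Counts the characters of text that belong to the given set."""
    return sum(1 for char in text if char in members)


line = "is this a test? yes."
print(count_members(line, ENGLISH_VOWELS))  # 5
print(count_members(line, SENTENCE_ENDINGS))  # 2
print(count_members("über", GERMAN_VOWELS))  # 2
```

Note that set membership works one character at a time, so two-letter cases such as "ee" or "ou" would still need line.count() or some other treatment.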
Sent with Proton Mail secure email.
On Saturday, May 4th, 2024 at 3:44 PM, Leam Hall <leamhall at gmail.com> wrote:
> Before I answer that, I just wanted to ask: is my email keeping the indentation properly? It looks like it on the Tutor archive, but I'm never quite sure how others' mail handles this.
>
> TBQ, the Report class is a builder inside the Chapter class. You're right in that most (all?) of the Report methods could be prefixed with an underscore to mark them as intended to be private; their only use is to create the data bits for the report_data dictionary. The report data gets called by the Chapter class, and is later written to a report file.
>
> Does that answer the question, or did I misunderstand?
>
> Leam
>
>
>
> On 5/4/24 13:01, ThreeBlindQuarks wrote:
>
> > I looked more carefully at the code and noticed something I am not sure is like what I have seen before in how one uses an object.
> >
> > None of the methods seems to be designed to be called from outside the class initialization. I see a dunder init that simply invokes all the other methods and some of those methods invoke each other. After initialization, you have an object with instance variables just sitting there.
> >
> > This is not in any way illegal, but probably could have been done many other ways such as adding a calculate_now method and not doing it all on initialization, or having the methods used be nested and defined only in the initialization method.
> >
> > I am curious how this class/object is being used.
> >
> > On Saturday, May 4th, 2024 at 1:32 PM, ThreeBlindQuarks via Tutor tutor at python.org wrote:
> >
> > > Leam,
> > >
> > > In some cases, a dictionary is a useful way to consolidate multiple cases.
> > >
> > > For example, your vowels:
> > >
> > > > self.syllable_count += line.count("a")
> > > > self.syllable_count += line.count("e")
> > > > self.syllable_count += line.count("i")
> > > > self.syllable_count += line.count("o")
> > > > self.syllable_count += line.count("u")
> > > > self.syllable_count -= line.count("ee")
> > > > self.syllable_count -= line.count("oi")
> > > > self.syllable_count -= line.count("oo")
> > > > self.syllable_count -= line.count("ou")
> > >
> > > You could initialize a dictionary like Vowels with keys like "a" through "ou" initialized to zero and increment it when a vowel is encountered.
> > >
> > > This is not always trivial or even helpful as you need to deal with making sure you only include the vowels you want and still have to iterate over things.
> > >
> > > On Saturday, May 4th, 2024 at 11:15 AM, Leam Hall leamhall at gmail.com wrote:
> > >
> > > > This gets a rough USA grade level report for chapters in a book. There's a lot of "count()" duplication, can it be done in a cleaner fashion? Coding style is Python 3.12 and standard library only; no exceptions. I also run "black -l 79" on everything. :)
> > > >
> > > > Full code is at: https://github.com/LeamHall/bookbot
> > > >
> > > > > class Report:
> > > > >     def __init__(self, data, filename):
> > > > >         self.filename = filename
> > > > >         self.lines = list()
> > > > >         for line in data:
> > > > >             self.lines.append(line.lower())
> > > > >         self.count_sentences()
> > > > >         self.count_words()
> > > > >         self.count_syallables()
> > > > >         self.grade_report()
> > > > >
> > > > >     def count_sentences(self):
> > > > >         """Counts the number of sentence ending marks."""
> > > > >         self.sentence_count = 0
> > > > >         for line in self.lines:
> > > > >             self.sentence_count += line.count(".")
> > > > >             self.sentence_count += line.count("?")
> > > > >             self.sentence_count += line.count("!")
> > > > >
> > > > >     def count_words(self):
> > > > >         """Counts the number of words, ignoring punctuation."""
> > > > >         self.word_count = 0
> > > > >         for line in self.lines:
> > > > >             self.word_count += len(line.split())
> > > > >
> > > > >     def count_syallables(self):
> > > > >         """Simplistic syllable counter. Does not handle unicode."""
> > > > >         self.syllable_count = 0
> > > > >         for line in self.lines:
> > > > >             self.syllable_count += line.count("a")
> > > > >             self.syllable_count += line.count("e")
> > > > >             self.syllable_count += line.count("i")
> > > > >             self.syllable_count += line.count("o")
> > > > >             self.syllable_count += line.count("u")
> > > > >             self.syllable_count -= line.count("ee")
> > > > >             self.syllable_count -= line.count("oi")
> > > > >             self.syllable_count -= line.count("oo")
> > > > >             self.syllable_count -= line.count("ou")
> > > > >             scrubbed_line = line.replace(".", " ")
> > > > >             scrubbed_line = scrubbed_line.replace("!", " ")
> > > > >             scrubbed_line = scrubbed_line.replace("?", " ")
> > > > >             scrubbed_line = scrubbed_line.replace('"', " ")
> > > > >             words = scrubbed_line.split()
> > > > >             for word in words:
> > > > >                 for phrase in ["e", "ey"]:
> > > > >                     if word.endswith(phrase):
> > > > >                         self.syllable_count -= 1
> > > > >                 for phrase in [
> > > > >                     "y",
> > > > >                 ]:
> > > > >                     if word.endswith(phrase):
> > > > >                         self.syllable_count += 1
> > > > >         if self.syllable_count < 1:
> > > > >             self.syllable_count = 1
> > > > >
> > > > >     def grade_report(self):
> > > > >         """Calculates grade level per:
> > > > >         https://en.wikipedia.org/wiki/Flesch–Kincaid_readability_tests
> > > > >         """
> > > > >         self.sentence_average = self.word_count / self.sentence_count
> > > > >         self.syllables_per_word_average = self.syllable_count / self.word_count
> > > > >         self.grade_level = (
> > > > >             (0.39 * self.sentence_average)
> > > > >             + (11.8 * self.syllables_per_word_average)
> > > > >             - 15.59
> > > > >         )
> > > > >         self.grade_level = float("{:.2f}".format(self.grade_level))
> > > > >
> > > > >     def report_data(self):
> > > > >         """Collates and returns report data."""
> > > > >         data = dict()
> > > > >         data["filename"] = self.filename
> > > > >         data["sentence_average"] = self.sentence_average
> > > > >         data["grade_level"] = self.grade_level
> > > > >         data["syllables_per_word_average"] = self.syllables_per_word_average
> > > > >         return data
> > > >
> > > > --
> > > > Software Engineer (reuel.net/resume)
> > > > Scribe: The Domici War (domiciwar.net)
> > > > General Ne'er-do-well (github.com/LeamHall)
> > > > _______________________________________________
> > > > Tutor maillist - Tutor at python.org
> > > > To unsubscribe or change subscription options:
> > > > https://mail.python.org/mailman/listinfo/tutor
> > >