[Tutor] Need to discuss about a class that can be used to print number of vowels, consonants, uppercase and lowercase letters and spaces in a string

rmlibre at riseup.net rmlibre at riseup.net
Thu Aug 19 11:55:32 EDT 2021


I'd create a set for each category, because sets have handy methods for
inclusion / exclusion and are designed for very efficient membership
checks. This should be faster than calling `char.isupper()` etc. on
every character, once per category. That approach is O(c*l): the running
time scales with the product of the number of categories and the length
of the string.
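
To make the set operations concrete, here is a small sketch. The
constant names and the sample text are only for illustration, and the
sets are ASCII-only by assumption:

```python
import string

LETTERS = frozenset(string.ascii_letters)
VOWELS = frozenset("AEIOUaeiou")
CONSONANTS = LETTERS - VOWELS  # exclusion via set difference

text = "Hello World"
distinct = set(text)  # one pass over the string

print("e" in VOWELS)                  # membership test: True
print(sorted(distinct & VOWELS))      # ['e', 'o']
print(sorted(distinct & CONSONANTS))  # ['H', 'W', 'd', 'l', 'r']
```

Note that this only tells you which characters occur, not how often;
the counting itself still needs something that tallies occurrences.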

I agree with the other responder that properties would be helpful here,
since you can do the calculations only when they are needed. A plain
property, however, redoes the calculation every time the value is
queried. For some problems, you want to do calculations only when they
are needed, but also only once if they are needed multiple times. In
that case, the standard library's functools module has an lru_cache
decorator which handles caching calculated method values. I'd also use
the Counter class from the standard library's collections module. It
takes quick care of giving the raw counts of each character (even the
ones you haven't prepared for, e.g. punctuation, control chars). The
string module also seems helpful here. If the standard library has
already solved part of the problem you need to solve, it's a good idea
to go looking for it -- it's there to make your life easier.
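
As a quick illustration of Counter working together with the string
module (the sample text is made up for the demo):

```python
import string
from collections import Counter

counts = Counter("Hello, World!")

print(counts["l"])  # 3
print(counts["z"])  # 0: a missing key counts as zero, no KeyError

# Characters you never planned for are tallied too, e.g. punctuation.
punctuation_total = sum(counts[char] for char in string.punctuation)
print(punctuation_total)  # 2
```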

StringStats seems like a reasonable name. Even though abbreviations, or
otherwise shortened words, are generally discouraged by PEP8 and other
style guides in names, "stats" is a common enough shortening IMHO.
However, you may also consider StringAnalysis. The method names could
also be more clear. If the name of the class is changed to
StringAnalysis, then it becomes more evident that each method should
indicate which kind of value is going to be returned, if any: in your
case, a character count. Names are important for many reasons. A name
like CharacterCounter could also work, and may make the role of each
method obvious. It's up to you; that's why they're called style guides.

Resources for Naming:
(https://www.youtube.com/watch?v=5cafjDPPtJ0)
(https://www.youtube.com/watch?v=n0Ak6xtVXno)

Finally, and kind of restating a previous point, your class will
struggle to compute over very large strings. As well, saving the string
in the instance, while necessary in some cases, should be avoided if
there isn't a good reason for it. Python doesn't copy a string when it
is passed as an argument or assigned to a variable; only a reference is
copied. But a reference stored in the instance keeps the whole string
alive for as long as the instance exists, and operations such as
slicing or concatenation do create new strings. If the string is very
large, this can become a big problem in memory overhead.
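
One way to keep memory bounded for very large inputs is to feed a
Counter in chunks, so the full text never has to be held at once. A
minimal sketch: count_characters and the chunk size are assumptions for
illustration, and io.StringIO stands in for a real file. It needs
Python 3.8+ for the := operator.

```python
import io
from collections import Counter


def count_characters(stream, chunk_size=4096):
    """Hypothetical helper: tally characters from a text stream in chunks."""
    counts = Counter()
    # read() returns "" at end of stream, which ends the loop.
    while chunk := stream.read(chunk_size):
        counts.update(chunk)
    return counts


stream = io.StringIO("ab " * 10_000)
counts = count_characters(stream)
print(counts["a"], counts[" "], len(counts))  # 10000 10000 3
```

The Counter only ever holds one entry per distinct character, however
long the input is.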

Below I'm posting an example of how I might write a class to get this
job done. Consider it MIT licensed.

Best,
rmlibre


import string
from collections import Counter

from functools import lru_cache
# See also:
# https://docs.python.org/3/library/functools.html#functools.cached_property


class StringAnalysis:
    """
    Creates objects which calculate the number of varying kinds of 
    characters in a given string.

    Usage example:

    string = "some string to be analyzed"
    string_analysis = StringAnalysis(string)

    print(string_analysis.lowercase_count)
    22

    print(string_analysis.whitespace_count)
    4
    """

    _LETTERS = frozenset(string.ascii_letters)
    _LOWERCASE = frozenset(string.ascii_lowercase)
    _UPPERCASE = frozenset(string.ascii_uppercase)
    _WHITESPACE = frozenset(string.whitespace)
    _VOWELS = frozenset("AEIOUaeiou")
    _CONSONANTS = _LETTERS.difference(_VOWELS)

    def __init__(self, string: str):
        """
        Saves memory by keeping only the count of each character
        within the instance, rather than a reference to the entire
        string. Analysis is done within the instance properties.
        """
        self._raw_count = Counter(string)

    @property
    @lru_cache()
    def vowel_count(self):
        vowels = self._VOWELS.intersection(self._raw_count)
        return sum(self._raw_count[char] for char in vowels)

    @property
    @lru_cache()
    def consonant_count(self):
        consonants = self._CONSONANTS.intersection(self._raw_count)
        return sum(self._raw_count[char] for char in consonants)

    @property
    @lru_cache()
    def uppercase_count(self):
        uppercase = self._UPPERCASE.intersection(self._raw_count)
        return sum(self._raw_count[char] for char in uppercase)

    @property
    @lru_cache()
    def lowercase_count(self):
        lowercase = self._LOWERCASE.intersection(self._raw_count)
        return sum(self._raw_count[char] for char in lowercase)

    @property
    @lru_cache()
    def whitespace_count(self):
        whitespace = self._WHITESPACE.intersection(self._raw_count)
        return sum(self._raw_count[char] for char in whitespace)
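

The class above stacks property on top of lru_cache. As far as I know,
lru_cache on a method keys its cache on self, so the cache holds on to
the instance for as long as the entry stays in it. A possible
alternative on Python 3.8+ is functools.cached_property (linked in the
imports), which stores the result on the instance itself. A sketch,
trimmed to one category; the class name and the computations counter
are made up for the demo:

```python
from collections import Counter
from functools import cached_property  # Python 3.8+


class CachedStringAnalysis:
    """Hypothetical variant of StringAnalysis, one category only."""

    _VOWELS = frozenset("AEIOUaeiou")

    def __init__(self, string: str):
        self._raw_count = Counter(string)
        self.computations = 0  # bookkeeping for this demo only

    @cached_property
    def vowel_count(self):
        self.computations += 1
        vowels = self._VOWELS.intersection(self._raw_count)
        return sum(self._raw_count[char] for char in vowels)


analysis = CachedStringAnalysis("some string to be analyzed")
print(analysis.vowel_count)   # 8
print(analysis.vowel_count)   # 8, served from the cached value
print(analysis.computations)  # 1
```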

