[Tutor] Readability: To use certain Python features or not?

Peter Otten __peter__ at web.de
Mon Jun 28 04:28:22 EDT 2021


On 28/06/2021 00:21, boB Stepp wrote:
> On Sun, Jun 27, 2021 at 4:49 PM <alan.gauld at yahoo.co.uk> wrote:
>>
>> Personally I think the second is more reliable and maintainable, so I prefer it. If a reader is so new to Python that they don't know about get(), then they need to look it up and learn. But OTOH a defaultdict might be better still!
>> There is a difference between writing clever code that is unreadable and using standard language or library features that might be less well known. One is easy to look up, the other is just downright hard work and therefore fragile.
>>
>> On 27 Jun 2021 22:41, boB Stepp <robertvstepp at gmail.com> wrote:
>>
>> Questions inspired by an example from "Practical Programming, 3rd ed."
>> by Gries, Campbell and Montojo.
>>
>> p. 221 example.  Compare:
>>
>> [...]
>> bird_to_observations = {}
>> for line in observations_file:
>>      bird = line.strip()
>>      if bird in bird_to_observations:
>>          bird_to_observations[bird] = bird_to_observations[bird] + 1
>>      else:
>>          bird_to_observations[bird] = 1
>> [...]
>>
>> to
>>
>> [...]
>> bird_to_observations = {}
>> for line in observations_file:
>>      bird = line.strip()
>>      bird_to_observations[bird] = bird_to_observations.get(bird, 0) + 1
>> [...]
> 
> Hmm.  So, Alan, I guess you are suggesting the following for this
> concrete instance:
> 
> from collections import defaultdict
> [...]
> bird_to_observations = defaultdict(int)
> for line in observations_file:
>      bird = line.strip()
>      bird_to_observations[bird] += 1
> [...]
> 
> That does look to my eye clearer and more expressive.  The cognitive
> load on the reader is to know how to use default dictionaries and know
> that int() always returns 0.  But as you point out the reader can
> always look up defaultdict and the collections module is very popular
> and well-used AFAIK.
> 
> So if I am understanding your answer to the more general questions,
> you believe that even using less well-known Python standard features
> is desirable if it simplifies the code presentation and is more
> expressive of intent?

You didn't ask me, but I always try to use the "best fit" that I know of 
rather than the "best known fit" -- at least when that best fit is 
provided by the stdlib.

In this case that would be

import collections
birds = (line.strip() for line in observations_file)
bird_to_observations = collections.Counter(birds)

If that isn't self-explanatory, wrap it in a function with proper 
documentation:

def count_birds(birds):
     """Count birds by species.
     >>> count_birds(
     ... ["norwegian blue", "norwegian blue", "unladen swallow"])
     Counter({'norwegian blue': 2, 'unladen swallow': 1})
     """
     return collections.Counter(birds)

This has the advantage that the function could contain any 
implementation that satisfies the doctest. As a consequence you can 
start with something that barely works and refine it as you learn more, 
without affecting the rest of the script.
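
To illustrate, here is a rough sketch of what such a "barely works" first 
draft might look like. The explicit loop with dict.get() is only my 
assumption of a plausible starting point; the one constraint is that the 
result is still a Counter, because the doctest compares the printed 
representation.

```python
import collections


def count_birds(birds):
    """Count birds by species.

    >>> count_birds(
    ... ["norwegian blue", "norwegian blue", "unladen swallow"])
    Counter({'norwegian blue': 2, 'unladen swallow': 1})
    """
    # First-draft implementation: an explicit loop using dict.get().
    counts = {}
    for bird in birds:
        counts[bird] = counts.get(bird, 0) + 1
    # Wrap the plain dict so the return value prints like the doctest expects.
    return collections.Counter(counts)
```

Running "python -m doctest yourscript.py" (the file name is a placeholder) 
should stay silent for this version as well as for the one-line Counter 
version, so the body can be swapped later without touching the callers.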


