[Tutor] Type annotation errors

Fri Jun 5 05:13:40 EDT 2020

> In terms of following along with Beazley's course material this would be
> cheating.  Classes have not been presented yet.  Even very simple ones.
> ~(:>))
> 
>> PPS: For someone who knows a little C the obvious fix for your version 
>> is to
>> cast the union to the expected type:

This is a little difficult to rationalise:
- we are not trying to get ahead of the course material
- but we are prepared to launch into complex (if not complicated) mypy

Apologies, I can see the 'why', but can't help feeling that it is only 
that imbalance which has led to the problem...

I'd like to discuss the (actual) problem, and then dispute the 
claim-pursuant:

[from another part of the thread]
<<<
In this exercise set he wishes to create a module fileparse which can
accept a csv-like file and allow one to obtain a list of dictionaries (one
per file row) if the file has headers or a list of tuples if not,
representing the collection of records.  Additionally he wishes to allow
for setting arguments to choose only certain data columns (If the file has
headers), do type conversions by providing a list of functions to do the
conversions, and setting a different delimiter than the default comma.  At
the end of the exercises he comments:

"If you’ve made it this far, you’ve created a nice library function that’s
genuinely useful. You can use it to parse arbitrary CSV files, select out
columns of interest, perform type conversions, without having to worry too
much about the inner workings of files or the csv module."
>>>

I'm firmly joining with others who have suggested that rather than 
"nice" it is actually quite 'ugly'. The routine, as-is, violates the 
Single Responsibility Principle (SRP). It is trying to deal with CSV 
files that have column headings AND those which don't. That's not a 
horrendous crime, per se (but see later). However, the idea that the 
routine will output either a list of dicts or a list of tuples most 
certainly is a major transgression!

The Zen of Python says "Simple is better than complex. Complex is better 
than complicated." Which of the three describes that code?

To be fair, you weren't intending to discuss the course 
materials/output, and IIRC there has been no explanation as to how the 
calling routine 'knows' whether this .CSV has headers or not. Similarly, 
it does not say how it will deal with the two-format output issue when 
it comes time to actually use the extracted-data.

Regarding the first, wouldn't it make sense to ascertain whether headers 
are present AND to note any headings, as a single task?

Now, instead of varying the function's output according to the presence 
of headers (or not), the data could be extracted (only) as a tuple.

Lastly, when it comes time to further-process the extracted-data, it can 
be paired/zipped with the headings, if-possible, as-required...
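
A minimal sketch of those three steps, in case it helps to see them 
side by side. The names read_headings, read_rows and pair_with_headings 
are my own inventions (not from the course), and io.StringIO stands in 
for a real file:

```python
import csv
import io
from typing import Dict, Iterator, List, Tuple

Row = Tuple[str, ...]


def read_headings(reader: Iterator[List[str]]) -> Row:
    """Consume the first row and return it as the headings."""
    return tuple(next(reader))


def read_rows(reader: Iterator[List[str]]) -> List[Row]:
    """Return all remaining rows as tuples; the output type never varies."""
    return [tuple(row) for row in reader]


def pair_with_headings(headings: Row, rows: List[Row]) -> List[Dict[str, str]]:
    """Zip each row with the headings, only when the caller needs dicts."""
    return [dict(zip(headings, row)) for row in rows]


# Made-up sample data standing in for a file with a heading row.
sample = io.StringIO("name,shares\nAA,100\nIBM,50\n")
reader = csv.reader(sample)

headings = read_headings(reader)              # step 1: note the headings
rows = read_rows(reader)                      # step 2: extract tuples only
records = pair_with_headings(headings, rows)  # step 3: pair up, if required
```

For a file without headings, step 1 and step 3 are simply skipped and 
read_rows is unchanged. Each function has exactly one return type.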

Alternatively, write the basic function (no headings) and then add a 
decorator to handle headings - separation in a different fashion.
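
One way the decorator idea might look, purely as a sketch. Both 
parse_rows and with_headings are hypothetical names, and the decorator 
assumes (without checking) that the first row really is a heading row:

```python
import csv
import io
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[str, ...]
RowParser = Callable[[Iterable[str]], List[Row]]
RecordParser = Callable[[Iterable[str]], List[Dict[str, str]]]


def parse_rows(lines: Iterable[str]) -> List[Row]:
    """Basic function: every row becomes a tuple, no notion of headings."""
    return [tuple(row) for row in csv.reader(lines)]


def with_headings(parser: RowParser) -> RecordParser:
    """Decorator: treat the first parsed row as headings for the rest."""
    def wrapper(lines: Iterable[str]) -> List[Dict[str, str]]:
        rows = parser(lines)
        if not rows:
            return []
        headings, data = rows[0], rows[1:]
        return [dict(zip(headings, row)) for row in data]
    return wrapper


# Applied as a plain call (rather than with @) so both versions stay usable.
parse_records = with_headings(parse_rows)

text = "name,shares\nAA,100\nIBM,50\n"  # made-up sample data
plain = parse_rows(io.StringIO(text))
records = parse_records(io.StringIO(text))
```

Here plain is a list of tuples (heading row included, since parse_rows 
knows nothing about headings) and records is a list of dicts. The caller 
picks one function or the other, and each has a single output type.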

To those, apply the Zen?

Now apply mypy to the function (even design-level stubs)?
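
To make that concrete, here are design-level stubs for both shapes. 
These are sketches only: parse_either, read_rows and to_records are 
names I have made up, and the signatures are assumptions rather than 
the course's actual code:

```python
from typing import Dict, Iterable, List, Tuple, Union, get_type_hints

Row = Tuple[str, ...]
Record = Dict[str, str]


# Two-format design: the return type has to be a Union, so every caller
# must narrow it (isinstance checks, or a cast) before using the result.
def parse_either(
    lines: Iterable[str], has_headers: bool = True
) -> Union[List[Record], List[Row]]:
    ...


# Separated design: each stub has one return type, nothing to narrow.
def read_rows(lines: Iterable[str]) -> List[Row]:
    ...


def to_records(headings: Row, rows: List[Row]) -> List[Record]:
    ...


# The annotations are ordinary objects and can be inspected at run-time.
either_hint = get_type_hints(parse_either)["return"]
rows_hint = get_type_hints(read_rows)["return"]
```

mypy should accept either set of stubs. The difference shows up in the 
calling code, which is where the Union has to be untangled.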

Per the comment "nice library function", I'm hoping that you will later 
report that the course builds-upon this/these function(s) and makes them 
even more useful/"nice". Speaking for myself, and because I have 
frequent needs to extract data from worksheets or .CSV files, I have a 
bunch of classes ready for re-use (sub-classing per file/application) 
and find them *very* useful.

As a general rule, I find that the greatest re-use is of simple classes 
rather than those more complex or complicated. The complexity comes when 
the simple 'framework' of a base-class is adapted and expanded to suit 
the application. So, SRP rules!
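
As an illustration only (this is not my actual code, and both class 
names are invented for the example), a simple base-class and one 
application-specific sub-class might look like:

```python
import csv
import io
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, ...]


class CsvExtractor:
    """Simple base class: one responsibility, lines in, tuples out."""

    delimiter = ","

    def extract(self, lines: Iterable[str]) -> List[Row]:
        reader = csv.reader(lines, delimiter=self.delimiter)
        return [tuple(row) for row in reader]


class PortfolioExtractor(CsvExtractor):
    """Hypothetical sub-class: adapts the framework to one application."""

    delimiter = ";"
    headings: Row = ("name", "shares")  # known to this application

    def records(self, lines: Iterable[str]) -> List[Dict[str, str]]:
        return [dict(zip(self.headings, row)) for row in self.extract(lines)]


# Made-up sample data.
basic = CsvExtractor().extract(io.StringIO("AA,100\n"))
portfolio = PortfolioExtractor().records(io.StringIO("AA;100\nIBM;50\n"))
```

The base-class stays small enough to re-use anywhere; whatever 
complexity the application needs lives in the sub-class.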

It might be a little early to throw at you what is possibly the hardest 
part of "SOLID" to understand and implement by-habit (but I know you've 
used multiple languages over the years, before tackling Python): the 
Dependency Inversion Principle (DIP) wherein we say that "details should 
depend on abstractions" not "abstractions depend upon details".

In this mode, we abstract the process of taking data from a worksheet. 
Then isn't the absence/presence of headings a 'detail'?

(just as another hint for the course's future: we might only wish to 
extract specific columns ("project", in relational-algebra terms) or 
rows ("select")...) Each of these 'details' refines the extraction 
process rather than calling for an entirely different method of 
extraction or a different presentation of the results!
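
A rough sketch of what I mean by 'details' refining one extraction 
process. The function names (pick_columns, pick_rows, convert) and the 
sample data are invented for the example; the point is only that each 
refinement takes tuples in and gives tuples out:

```python
from typing import Callable, Iterable, Iterator, Sequence, Tuple

Row = Tuple[str, ...]


def pick_columns(rows: Iterable[Row], indices: Sequence[int]) -> Iterator[Row]:
    """Keep only the columns at the given positions."""
    for row in rows:
        yield tuple(row[i] for i in indices)


def pick_rows(rows: Iterable[Row], keep: Callable[[Row], bool]) -> Iterator[Row]:
    """Keep only the rows for which keep(row) is true."""
    return (row for row in rows if keep(row))


def convert(
    rows: Iterable[Row], converters: Sequence[Callable[[str], object]]
) -> Iterator[Tuple[object, ...]]:
    """Apply one conversion function per column."""
    for row in rows:
        yield tuple(func(value) for func, value in zip(converters, row))


# Made-up rows, as the basic extraction step would deliver them.
rows = [("AA", "100", "32.20"), ("IBM", "50", "91.10"), ("CAT", "150", "83.44")]

big_holdings = pick_rows(rows, lambda row: int(row[1]) >= 100)
name_and_shares = pick_columns(big_holdings, [0, 1])
result = list(convert(name_and_shares, [str, int]))
# result is [("AA", 100), ("CAT", 150)]
```

None of these steps cares whether the file had headings, and none of 
them changes the shape of what comes out.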
-- 
Regards =dn