[Tutor] this group and one liners

dn PythonList at DancesWithMice.info
Thu Jul 7 19:14:29 EDT 2022


On 08/07/2022 03.38, Mats Wichmann wrote:
> On 7/6/22 11:02, avi.e.gross at gmail.com wrote:
> Boy, we "old-timers" do get into these lengthy discussions...  having
> read several comments here, of which this is only one:

Who is being called "old"?

[this conversation is operating at an 'advanced' level. If you are more
of a Beginner and would like an explanation of any of what I've written
here, please don't hesitate to request expansion or clarification!]


>> The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense.

Rather than "excuse", is it a (misplace?) sense of pride or ego?

That said, sometimes there are efficiency-gains. The problem, though, is
that such gains may only apply on a particular system. So, when the code
is moved to my PC, an alternative approach may actually be 'better' (by
whichever criteria are pertinent).


>> Although that can be true, I think it is reasonable to say that often the exact opposite is true.
>>
>> In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on. I did this a few days ago until I was reminded max(..., default=0) handled my need. Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length and then max() processed a longer list with lots of zeroes.

Is this a good point to talk 'testability'?

Normally, I would mention "TDD" - my preferred dev-approach. However, at
this time I'm working with a bunch of 'Beginners' and showing the use of
Python's Interactive Mode, aka the REPL - so, experiment-first rather
than test-first - or, more testing of code-constructs than of the
data-processing.

The problem with dense units of code is that they are, by-definition, a
'black box'. We can test 'around' them, but not 'through' them/within them.

Accordingly, the TDD approach would be to develop step-by-step, assuring
each step in the process as it is built (and as it should be built). This
produces a line-by-line result. From there, some tools will suggest that
a for-loop be turned into a comprehension (for example), as in the sketch
below. It is then part of one's skill in refactoring to decide whether
such a change is actually a good idea - and because 'testing' has been
integral from the start, that may discourage too much integration.
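
For example, a loop developed (and verified) step-by-step, and the
comprehension a tool might then suggest in its place (the names and data
here are purely illustrative):

prices = [1.50, 2.00, 3.25]

# developed and checked line-by-line:
doubled = []
for price in prices:
    doubled.append(price * 2)

# the suggested refactoring - more compact, but now a single, opaque step:
doubled = [price * 2 for price in prices]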

Trainees, using the REPL, particularly those with caution as an
attribute (cf arrogance (?) ), are also more inclined to develop
multi-stage processes (such as the R example given earlier),
stage-by-stage. They will similarly test as they go, even if only after
writing each line of code (cf 'pure' TDD). Once a single worked-example
has been developed, the code is likely to be copied into an editor/IDE,
and now that the coder's confidence is raised, the code (one hopes) will
be subjected to a wider range of test-cases.

In the former case, there is some caution before consolidation - tests
may need to be removed, or become impossible to apply. In the latter, it is less
likely such thinking will apply, because the "confidence" and code-first
thinking may lead to over-confidence, without regard for 'testability'.

Maybe?


> After a period when they felt awkward because lots of my programming
> background was in languages that didn't have these, I now use simple
> one-liners extensively - comprehensions and ternary expressions.  If
> they start nesting, probably not.  Someone mentioned a decent metric -
> if a code formatter like Black starts breaking your comprehension into
> four lines, it probably got too complex.

Agreed. There is nothing inherently 'wrong' with the likes of
comprehensions, generators, ternary operators, etc. So, why not use
them? In short, they are idioms within the Python language, and would
not have been made available if they hadn't been deemed useful and
appropriate. (see PEP process)

Any argument that they are difficult to understand is going to be
correct - at least, on the face of it. This applies to natural-languages
as well. For example, the American "yeah, right" is an exclamation
consisting of two positive words - apparently a statement of agreement
("yes"), and an affirmation of correctness ("right", ie correct). Yet it
is actually used as an expression of disagreement through derision, eg
someone says "dn is the best-looking person in the room" but another
person disputes with feeling by sarcastically intoning "yeah, right!".

Idioms are learned (and often can't be easily or literally translated),
and part of the necessity for ("deliberate") practice in the use of
Python. Sure, someone who is only a few chapters into an intro-book will
not readily recognise a list-comprehension. However, that is not a good
reason why those of us who have finished the whole book should not use
these facilities!

[x + 2 for x in iterator]
- is a reasonable Python idiom

max(items or [0])
- appears to be a reasonable use of short-circuiting 'or' (in the spirit
of a ternary), except that
max(items, default=0)
- is more 'pythonic' (a feature built into the language for this express
purpose/situation) and thus idiomatic Python
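
A quick sketch of the difference (using made-up data):

items = []
max(items or [0])          # 0 - 'or' substitutes [0] because the empty list is falsy
max(items, default=0)      # 0 - the default is used because the iterable is empty

items = (n for n in [])    # an (empty) generator is nonetheless truthy...
max(items, default=0)      # 0 - still safe
# max(items or [0])        # ...whereas this would raise ValueError

Note that 'default=' copes with any empty iterable, which is one reason
it is the more idiomatic choice.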

How about the popular 'gotcha' of using a mutable-collection as a
function-parameter, eg

def function_name(mutable_collection=None):
    if mutable_collection is None:
        mutable_collection = []

or is this commonly-used idiom more pythonic?

def function_name(mutable_collection=None):
    mutable_collection = mutable_collection if mutable_collection else []
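
For anyone newer to the gotcha itself, a brief illustration (the names
are invented for the purpose):

def broken(collection=[]):      # the default list is created once, at definition-time
    collection.append(1)
    return collection

print(broken())    # [1]
print(broken())    # [1, 1] - the same list object, growing with every call!

Note, too, that the two None-handling forms above are not quite
equivalent: 'is None' preserves a caller-supplied empty list, whereas the
truthiness test quietly replaces it with a new one.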



Can concise-forms be over-used?
Yes!

Rather than relying upon some external tool, I find that the IDE (or my
own) formatting of the various clauses provides immediate feedback that
things are becoming too complex (for my future-self to handle).

A longer ternary operator, or one that is contained within a more
complex construct, could be spread over two lines (which would highlight
the two 'choices'). A list comprehension could split its expression, its
for-clause, and its conditional-clause over three lines.
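
For example (layout only - the names and data are invented):

size, threshold = 10, 5

# a ternary spread over two lines, highlighting the two 'choices':
label = ("large" if size > threshold
         else "small")

words = ["alpha", "beta", "gamma", "delta"]

# a comprehension with its expression, for-clause, and condition on separate lines:
long_words = [word.upper()
              for word in words
              if len(word) > 4]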

Spreading a single such construct over more than three lines starts to
look 'complex' (to my "old" eyes). The indentation requires more thought
than I'd care to devote (I need my brain-power for problem-solving
rather than 'art work'!). Thus, these lead to the idea that 'simple is
better than complex...complicated' - regardless of one's interpretation
of "beautiful" [Zen of Python].


> My take on these is you can write a more compact function this way -
> you're more likely to have the meat of what's going on right there
> together in a few lines, rather than building mountain ranges of
> indentation - and this can actually *improve* readability, not obscure
> it.  I agree "clever" one-liners may be a support burden, but anyone
> with a reasonable amount of Python competency (which I'd expect of
> anyone in a position to maintain my code at a later date) should have no
> trouble recognizing the intent of simple ones.

Could we also posit that there is more than one definition of "complex"?
As well as "testability" being lost as groups of steps in a multi-stage
process are combined and/or compressed, there is a longer-term impact.
When (not "if") the code needs to be changed, how easy would you rate
that task? I guess the answer can't escape 'testability' in that some
code which comes with (good) tests already in-place can be changed with
a greater sense of confidence - the idea that the code can be altered
and those alterations will not cause 'breakage' (because the tests still
'pass') - regression testing.

However, the main point here, is that someone charged with changing a
piece of code will take a certain amount of time to read and understand
it (before (s)he starts to make changes). Even assuming that-someone is
a Python-Master, comprehending existing code requires an understanding
of the domain and the algorithm (etc). Accordingly, the more 'dense' the
code, the harder such a task is likely to become.

If the algorithm is commonplace within the domain, the 'level' at
which things will be deemed 'complex' will change. For example, a recent
thread on the list dealt with calculating a standard deviation.
Statisticians will readily recognise such a calculation (more likely
there is a library routine/function to call, but...). Accordingly, the
domain doesn't require such code to be 'simplified' - even though 'mere
mortals' might scratch their heads trying to decipher its purpose, the
steps within it, and its place in the overall routine.
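
To illustrate (a sketch only - the data values are made up):

import statistics

data = [2.5, 3.1, 2.8, 3.6, 2.9]

# the library routine a statistician would reach for (population standard deviation):
spread = statistics.pstdev(data)

# a denser, hand-rolled equivalent that 'mere mortals' may need to decipher:
mean = sum(data) / len(data)
spread_too = (sum((x - mean) ** 2 for x in data) / len(data)) ** 0.5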


> Sometimes thinking about how to write a concise one-liner exposes a
> failure to have thought through what you're doing completely - so unlike
> what is mentioned above - twisting a problem around unnaturally (no
> argument that happens too), you might actually realize that there's a
> simpler way to structure a step.

Unnatural twisting sounds like something to avoid!
Twist again, like we did last summer...
Play Twister again, like we...


I prefer to break complex calculations into their working parts (not
just as a characteristic of TDD, as above). It can also be useful to
separate working-parts into their own function. In both cases, we need
'names', either for intermediate-results (see also the use of _ as a
placeholder-identifier) or to be able to call the function. Taking care
to choose a 'good' name improves readability and decreases apparent
complexity.
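
A small sketch of the idea (the function and its names are invented for
illustration):

def shipping_charge(weight_kg, distance_km):
    # each working-part gets a descriptive name...
    base_rate = 5.00
    weight_component = 0.75 * weight_kg
    distance_component = 0.02 * distance_km
    return base_rate + weight_component + distance_component

# ...rather than leaving the reader to unpick a single dense expression:
# charge = 5.00 + 0.75 * weight_kg + 0.02 * distance_km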

A 'problem' I often see trainees evidencing (aka 'the rush to code',
'I'm not working if I'm not coding', etc) is that a name must be chosen
when the identifier is first defined ("LHS"). However, its use may not
be properly appreciated until that value is subsequently used ("RHS").
It is at this later step that the importance of the name becomes most
obvious (to the writer). If that is so, the assessment goes double for a
subsequent reader attempting to divine the code's meaning and workings!

In the past, realising that the first choice of name might not be the
best may have led us to say 'oh well', sigh, and quietly (try to)
carry-on, because of the (considerable) effort of changing a name
(without introducing regression errors). These days, such "technical
debt" is quite avoidable. Capable IDEs enable one to quickly and easily
refactor a choice of name and to (find-replace) update the code to
utilise a better name, everywhere it is mentioned, with minimal manual
effort! Thus, the effort of ensuring the future-reader/maintainer has
competent implicit and fundamental documentation loses yet another
'excuse' and reaches the level of professional expectation.

-- 
Regards,
=dn

