[Python-ideas] Objectively Quantifying Readability

Nathaniel Smith njs at pobox.com
Tue May 1 04:29:16 EDT 2018

On Mon, Apr 30, 2018 at 8:46 PM, Matt Arcidy <marcidy at gmail.com> wrote:
> On Mon, Apr 30, 2018 at 5:42 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> (If we know that, let's say, really_long_descriptive_identifier_names
>> hurt readability, how does that help us judge whether adding a new kind
>> of expression will hurt or help readability?)
> A new feature can remove symbols or add them.  It can increase density
> on a line, or remove it.  It can be a policy of variable naming, or it
> can specifically note that variable naming has no bearing on a new
> feature.  This is not limited in application.  It's just scoring.
> When anyone complains about readability, break out the scoring
> criteria and assess how good the _comparative_ readability claim is:
> 2 vs 10?  4 vs 5?  The arguments will no longer be singularly about
> "readability," nor will they be about the question of a single score for
> a specific statement.  The comparative scores of applying the same
> function over two inputs gives a relative difference.  This is what
> measures do in the mathematical sense.

Unfortunately, the kind of study they did here can't support this
kind of argument at all; it's the wrong kind of design. (I'm totally
in favor of making more evidence-based decisions about language design,
but interpreting evidence is tricky!) Technically speaking, the issue
is that this is an observational/correlational study, so you can't use
it to infer causality. Or put another way: just because they found
that unreadable code tended to have a high max variable length,
doesn't mean that taking those variables and making them shorter would
make the code more readable.

This sounds like a finicky technical complaint, but it's actually a
*huge* issue in this kind of study. Maybe the reason long variable
length was correlated with unreadability was that there was one
project in their sample that had terrible style *and* super long
variable names, so the two were correlated even though they might not
otherwise be related. Maybe if you looked at Perl, then the worst
coders would tend to be the ones who never ever used long variable
names. Maybe long lines on their own are actually fine, but in this
sample, the only people who used long lines were ones who didn't read
the style guide, so their code is also less readable in other ways.
(In fact they note that their features are highly correlated, so they
can't tell which ones are driving the effect.) We just don't know.
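
To make the "one project with terrible style" scenario concrete, here is
a toy simulation. Everything in it is made up for illustration (the two
projects, their means, the sample sizes, the helper names); none of it
comes from the study. Within each project, name length and unreadability
are drawn independently, so shortening a name changes nothing. Pooling
the two projects still produces a strong correlation.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, stdlib only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_project(rng, n, mean_name_len, mean_unreadability):
    """Hypothetical project: the two quantities are independent draws,
    so within a project name length has no effect on unreadability."""
    name_lens = [rng.gauss(mean_name_len, 2.0) for _ in range(n)]
    unreadability = [rng.gauss(mean_unreadability, 1.0) for _ in range(n)]
    return name_lens, unreadability


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed numbers: a tidy project with short names, and a messy project
# that happens to have both long names and hard-to-read code.
tidy = simulate_project(rng, 500, mean_name_len=8.0, mean_unreadability=3.0)
messy = simulate_project(rng, 500, mean_name_len=20.0, mean_unreadability=7.0)

r_tidy = pearson(*tidy)
r_messy = pearson(*messy)
r_pooled = pearson(tidy[0] + messy[0], tidy[1] + messy[1])

print(f"within tidy project:  r = {r_tidy:+.2f}")
print(f"within messy project: r = {r_messy:+.2f}")
print(f"pooled sample:        r = {r_pooled:+.2f}")
```

The within-project correlations should come out near zero while the
pooled one is strongly positive. An observational study that only sees
the pooled sample can't distinguish this situation from one where long
names really do hurt readability.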

And yeah, it doesn't help that they're only looking at 3 line blocks
of code and asking random students to judge readability – hard to say
how that generalizes to real code being read by working developers.


Nathaniel J. Smith -- https://vorpus.org
