On Mon, Apr 30, 2018 at 8:46 PM, Matt Arcidy wrote:
On Mon, Apr 30, 2018 at 5:42 PM, Steven D'Aprano wrote:
(If we know that, let's say, really_long_descriptive_identifier_names hurt readability, how does that help us judge whether adding a new kind of expression will hurt or help readability?)
A new feature can remove symbols or add them. It can increase density on a line, or reduce it. It can be a policy of variable naming, or it can specifically note that variable naming has no bearing on a new feature. This is not limited in application. It's just scoring. When anyone complains about readability, break out the scoring criteria and assess how good the _comparative_ readability claim is: 2 vs 10? 4 vs 5? The arguments will no longer be solely about "readability," nor will they be about the question of a single score for a specific statement. The comparative scores from applying the same function to two inputs give a relative difference. This is what measures do in the mathematical sense.
Unfortunately, the kind of study they did here can't support this kind of argument at all; it's the wrong kind of design. (I'm totally in favor of making more evidence-based decisions about language design, but interpreting evidence is tricky!)

Technically speaking, the issue is that this is an observational/correlational study, so you can't use it to infer causality. Or put another way: just because they found that unreadable code tended to have a high max variable length, doesn't mean that taking those variables and making them shorter would make the code more readable.

This sounds like a finicky technical complaint, but it's actually a *huge* issue in this kind of study. Maybe the reason long variable length was correlated with unreadability was that there was one project in their sample that had terrible style *and* super long variable names, so the two were correlated even though they might not otherwise be related. Maybe if you looked at Perl, then the worst coders would tend to be the ones who never ever used long variable names. Maybe long lines on their own are actually fine, but in this sample, the only people who used long lines were ones who didn't read the style guide, so their code is also less readable in other ways. (In fact they note that their features are highly correlated, so they can't tell which ones are driving the effect.) We just don't know.

And yeah, it doesn't help that they're only looking at 3-line blocks of code and asking random students to judge readability; it's hard to say how that generalizes to real code being read by working developers.

-n

--
Nathaniel J. Smith -- https://vorpus.org
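To make the confounding point above concrete, here is a small simulation sketch. Everything in it is invented for illustration: the `sloppy` flag stands in for "came from a project with terrible style", and the numbers for name length and readability are made up, not taken from the study. By construction, readability never depends on name length, yet the pooled data still shows a strong negative correlation between the two.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_snippets=2000, seed=0):
    """Generate (sloppy, name_len, readability) rows with a hidden confounder."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_snippets):
        # Hypothetical confounder: the snippet comes from a poorly styled project.
        sloppy = rng.random() < 0.3
        # Sloppy projects happen to use longer names (made-up numbers)...
        name_len = rng.gauss(30 if sloppy else 12, 4)
        # ...and are less readable, but readability never looks at name_len.
        readability = rng.gauss(2.0 if sloppy else 4.0, 0.7)
        rows.append((sloppy, name_len, readability))
    return rows


def corr(rows):
    return pearson([r[1] for r in rows], [r[2] for r in rows])


rows = simulate()
pooled = corr(rows)
within_sloppy = corr([r for r in rows if r[0]])
within_tidy = corr([r for r in rows if not r[0]])

print(f"pooled:        {pooled:+.2f}")
print(f"within sloppy: {within_sloppy:+.2f}")
print(f"within tidy:   {within_tidy:+.2f}")
```

The pooled correlation comes out strongly negative, while the correlation inside each group is close to zero. Shortening the names in this toy world would change nothing about readability, which is exactly why a correlational result can't tell us what an intervention would do.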