[Python-ideas] [PEP8] Predicate consistency

Fri Feb 6 11:16:07 CET 2015

On Feb 6, 2015, at 0:06, Marco Buttu <marco.buttu at gmail.com> wrote:

> On 06/02/2015 07:45, Andrew Barnert wrote:

You've skipped over the most important point, the fact that the guideline explicitly says to separate words with underscores "as necessary to improve readability", not "always". Given that the entire PEP is a set of sometimes contradictory rules of thumb, very few parts of it explicitly call out the fact that they're not universal, hard and fast rules, so it's presumably important that this one actually does so. And yet you're insisting that we need extra special-case rules to give people permission to violate this rule?

>> Elsewhere the PEP talks about consistency with related code. If "isnan" is documented to do effectively the same as the C function of the same name it makes sense to use the same name. That's presumably also why "inspect.isgeneratorfunction" doesn't use underscores--they're arguably necessary, but consistency with "isfunction" trumps that.
> 
> You are right that we have to keep consistency with other code, but I think we should prefer *horizontal consistency* (I mean with our own codebase's API) over *vertical* consistency.

Do you have some examples where they give different answers?

Also, note that functions like math.isnan and os.getpid predate PEP 8, and aren't likely to be changed for backward compat reasons, which means any new function that's a thin wrapper around a C/POSIX stdlib function likely has as much of a horizontal consistency reason as a vertical one--e.g., if you were to add math.isnormal as a wrapper around C isnormal, you'd probably want it to be consistent with math.isnan and friends just as much as with C isnormal, right?

In fact, if you look at newer functions that aren't very close parallels to existing functions in the same module or very thin wrappers around C functions, your rule doesn't seem to work. For example, in os again, getresgid matches the BSD C function and matches existing os.getgid, while get_exec_path and get_terminal_size don't have such major consistency constraints and they have underscores.
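If you want to check this against your own interpreter, here's a rough sketch that splits the get* names in os by whether they contain an underscore. The helper name split_by_underscore is made up for illustration, and the exact lists depend on platform and Python version.

```python
import os


def split_by_underscore(module, prefix):
    """Return (plain, underscored) names in `module` that start with `prefix`.

    Hypothetical helper for illustration; not part of any library.
    """
    names = [n for n in dir(module) if n.startswith(prefix)]
    plain = sorted(n for n in names if "_" not in n)
    underscored = sorted(n for n in names if "_" in n)
    return plain, underscored


plain, underscored = split_by_underscore(os, "get")
print("no underscore:", plain)        # legacy/POSIX-style names such as getpid
print("underscore:   ", underscored)  # newer names such as get_exec_path
```

This only shows which names exist; it says nothing about why each one was named that way.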

Note that your new rule would leave us in the same position--we'd have some original-PEP-8-named legacy functions that we'd sometimes have to stay consistent with.

Also note that this seems to work the same way for global variables/constants as for functions, as you'd expect. sys.maxunicode has a strong parallel with legacy sys.maxsize, while other attributes added around the same time mostly have underscores.

>> Also, are you sure it's really "predicate" that's the relevant difference, as opposed to, say, "short word and really short word" (which is really just a special case of "not necessary for readability").
> 
> Looking at the core and the standard library, I see we basically do not use underscore in get/set and predicates.

Except when we do. As in the os examples above, or all the predicate methods in pathlib.Path, or many other places where there are no relevant consistency issues.
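As a quick illustration of the pathlib case (just a sketch; the list you get depends on your Python version), you can pull out the "is"-prefixed names on pathlib.Path and see which ones use an underscore:

```python
import pathlib

# Names on pathlib.Path that look like predicates (start with "is").
predicates = sorted(n for n in dir(pathlib.Path) if n.startswith("is"))

# Split them by whether "is" is separated from the rest by an underscore.
with_underscore = [n for n in predicates if n.startswith("is_")]
without_underscore = [n for n in predicates if not n.startswith("is_")]

print("is_* names:", with_underscore)
print("is* names without underscore:", without_underscore)
```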

> This may be a clear general rule, and not "short word" that is not clear at all. How can you explain the following?
> 
> sys.getswitchinterval()
> sys.setswitchinterval()
> asyncio.iscoroutinefunction()

The first two are consistent with the legacy (pre-PEP8) get/setcheckinterval functions, and the last is consistent with isgeneratorfunction, itself consistent with isfunction as you (I think) accepted above.
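To see the family that isgeneratorfunction is staying consistent with, here's a small sketch that lists the callable is* names in inspect and tries two of them on a throwaway generator function. The name countdown is made up for the example.

```python
import inspect

# Callable names in inspect that start with "is" (the predicate family).
is_names = sorted(
    n for n in dir(inspect)
    if n.startswith("is") and callable(getattr(inspect, n))
)
print(is_names)


def countdown(n):
    """Throwaway generator function, only here to exercise the predicates."""
    while n > 0:
        yield n
        n -= 1


# Both predicates hold for a generator function: it is a plain function
# object whose body contains a yield.
print(inspect.isfunction(countdown), inspect.isgeneratorfunction(countdown))
```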

> And my point is that we break it because of a lack in the PEP8.

Or we break it because sometimes consistency beats readability, and maybe occasionally some other guideline, and in rare cases because the "necessary for readability" clause is a tough judgment call.

> But perhaps I am alone in thinking that it is ugly to have this kind of inconsistency in predicates and get/set function/method names

Given that both Python 2.1 and 3.0 decided it wasn't worth changing all the legacy names for backward-compat reasons, and you're not suggesting changing them today, I don't think there's any way to avoid having that inconsistency.

But following the existing PEP 8 guidelines--adding underscores when necessary for readability except when another guideline like consistency trumps it--hasn't been a problem for the last 13 years, so I don't see it making things worse over the next 13.

>> You use "dont_write_bytecode" as an example of something that obviously needs the underscores.
> 
> I think it is not obvious at all. It is obvious only if we have a rule about predicates and get/set; otherwise I do not understand how it is possible that sys.dont_write_bytecode() should obviously have underscores but sys.setswitchinterval() should not.

Once again you've skipped over the most important point. Are you seriously suggesting that if we have a pair of related functions, they should be named "dont_write_bytecode" and "doesntwritebytecode" because the latter is a predicate?

>> And beyond your own example, "is_greater_than_one" or "can_fit_on_filesystem" seem like they need underscores, while "dotwice" seems fine without.
> 
> I said the opposite: is_greater_than_one() should have underscores

No you didn't. You said predicates should _not_ have underscores. But under current PEP 8, this one probably should, and I think that's better. (If you don't agree with that one, consider a name with vowels at the word edges, like "is_any_one_among" and imagine that without the underscores.)

> , and as the PEP8 says, the naming rule is the same for functions and methods. In the standard library we do not have any can_* function or method,

From a quick ack I see we have can_write and friends in asyncio, can_fs_encode in distutils, cannot_convert in lib2to3, can_fetch in urllib/robotparser, a variety of can_ functions in the test suite... and not a single can\w except for cancel, canvas, and canonical.
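For reference, the same kind of check can be sketched in Python. This runs over a hand-picked sample of the names mentioned above, not a real scan of the stdlib. Note that can\w taken literally also matches can_ (underscore is a word character), so the sketch uses can[a-z] for the run-together case.

```python
import re

# Sample names copied from the message above; this is not a scan of the stdlib.
SAMPLE_NAMES = [
    "can_write", "can_fs_encode", "can_fetch",
    "cancel", "canvas", "canonical",
]

CAN_UNDERSCORE = re.compile(r"^can_\w+$")  # "can" written as a separate word
CAN_RUN_ON = re.compile(r"^can[a-z]\w*$")  # "can" followed directly by a letter

separate = [n for n in SAMPLE_NAMES if CAN_UNDERSCORE.match(n)]
run_on = [n for n in SAMPLE_NAMES if CAN_RUN_ON.match(n)]

print("can_ + word:", separate)
print("can + letter:", run_on)
```

The run-together matches here are ordinary words that happen to start with "can", not predicates.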

> but if in the future we will have one, then yes, I propose it should be consistent with the "predicate rule" (no underscores).

So you'd make every current can* function in the stdlib a legacy function that doesn't follow the current rules?

I think you're going too far trying to make the rules complete, unambiguous, and inviolable when keeping them simple and letting humans use their judgment on the edge cases makes a lot more sense. Adding a sweeping exception to the rule that itself needs exceptions that will ultimately have a less compelling legacy justification doesn't seem like a step forward.