[Python-Dev] Argument Clinic: what to do with builtins with non-standard signatures?

Tres Seaver tseaver at palladion.com
Fri Jan 24 17:10:24 CET 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/24/2014 10:07 AM, Larry Hastings wrote:
> THE SPECIFICS
> 
> I'm sorting the problems we see into four rough categories.
> 
> a) Functions where there's a static Python value that behaves 
> identically to not passing in that parameter (aka "the NULL problem")
> 
> Example: _sha1.sha1().  Its optional parameter has a default value in
> C of NULL. We can't express NULL in a Python signature.  However, it
> just so happens that _sha1.sha1(b'') is exactly equivalent to
> _sha1.sha1(). b'' makes for a fine replacement default value.
> 
> Same holds for list.__init__().  Its optional "sequence" parameter
> has a default value in C of NULL.  But this signature: 
> list.__init__(sequence=()) works fine.
> 
> The way Clinic works, we can actually still use the NULL as the 
> default value in C.  Clinic will let you use completely different
> values as the published default value in Python and the real default
> value in C. (Consenting adults rule and all that.)  So we could lie to
> Python and everything works just the way we want it to.
> 
> Possible Solutions: 0) Do nothing, don't convert the function. 1) Use
> that clever static value as the default.

I prefer #1.
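
The equivalence that option 1 relies on can be checked from Python. This is a minimal sketch, assuming hashlib.sha1 behaves like the _sha1.sha1 constructor named above (going through hashlib is my assumption; the variable names are illustrative only):

```python
import hashlib

# Omitting the argument (NULL in C) versus passing the proposed
# published default b'' explicitly.
digest_without_arg = hashlib.sha1().hexdigest()
digest_with_empty = hashlib.sha1(b"").hexdigest()

# The same comparison for list(): no argument versus an empty tuple.
list_without_arg = list()
list_with_empty = list(())
```

If both pairs compare equal, publishing b'' and () as the Python-level defaults changes nothing a caller can observe.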

> b) Functions where there's no static Python value that behaves 
> identically to not passing in that parameter (aka "the dynamic default
> problem")
> 
> There are functions with parameters whose defaults are mildly
> dynamic, responding to other parameters.
> 
> Example: I forget its name, but someone recently showed me a builtin
> that took a list as its first parameter, and its optional second
> parameter defaulted to the length of the list.  As I recall this
> function didn't allow negative numbers, so -1 wasn't a good fit.
> 
> Possible solutions:
> 0) Do nothing, don't convert the function.
> 1) Use a magic value as the default.  Preferably of the same type as
>    the function accepts, but failing that use None.  If they pass in
>    the magic value use the previous default value.  Guido himself
>    suggested this in
> 2) Use an Argument Clinic "optional group".  This only works for
>    functions that don't support keyword arguments.  Also, I hate this,
>    because "optional groups" are not expressible in Python syntax, so
>    these functions automatically have invalid signatures.

I prefer #1.
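
For illustration, the magic-value approach looks roughly like this in pure Python. The function take_prefix and its parameter names are hypothetical; it stands in for the unnamed builtin described above, whose second parameter defaults to the length of the list and rejects negative numbers:

```python
def take_prefix(items, count=None):
    """Hypothetical example: return the first `count` items of `items`.

    None is the published default; it is replaced at call time by the
    dynamic default, len(items).
    """
    if count is None:
        count = len(items)
    if count < 0:
        raise ValueError("count must be non-negative")
    return items[:count]
```

Passing None explicitly and omitting the argument behave the same way, which is what lets the signature be written as take_prefix(items, count=None).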

> c) Functions that accept an 'int' when they mean 'boolean' (aka the 
> "ints instead of bools" problem)
> 
> This is specific but surprisingly common.
> 
> Before Python 3.3 there was no PyArg_ParseTuple format unit that
> meant "boolean value".  Functions generally used "i" (int).  Even
> older functions accepted an object and called PyLong_AsLong() on it. 
> Passing in True or False for "i" (or PyLong_AsLong()) works, because 
> boolean inherits from long.   But anything other than ints and bools 
> throws an exception.
> 
> In Python 3.3 I added the "p" format unit for boolean arguments. This
> calls PyObject_IsTrue() which accepts nearly any Python value.
> 
> I assert that Python has a crystal clear definition of what 
> constitutes "true" and "false".  These parameters are clearly intended
> as booleans but they don't conform to the boolean protocol.  So I
> suggest every instance of this is a (very mild!) bug.  But changing
> these parameters to use "p" is a change: they'll accept many more
> values than before.
> 
> Right now people convert these using 'int' because that's an exact 
> match.  But sometimes they are optional, and the person doing the 
> conversion wants to use True or False as a default value, and it 
> doesn't work: Argument Clinic's type enforcement complains and they
> have to work around it.  (Argument Clinic has to enforce some 
> type-safety here because the values are used as defaults for C 
> variables.)  I've been asked to allow True and False as defaults for
> "int" parameters specifically because of this.
> 
> Example: str.splitlines(keepends)
> 
> Solution: 1) Use "bool". 2) Use "int", and I'll go relax Argument
> Clinic so they can use bool values as defaults for int parameters.

I prefer #1.
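
To make the behavior change concrete, here is a rough pure-Python approximation of the two conversions. The helper names are hypothetical, and operator.index() is only a stand-in for the "i" format unit, not its exact C semantics:

```python
import operator


def parse_int_flag(value):
    # Approximates "i": integer-like objects (including True and False)
    # are accepted; anything else raises TypeError.
    return operator.index(value)


def parse_bool_flag(value):
    # Approximates "p": any object is reduced to its truth value,
    # much as PyObject_IsTrue() does.
    return bool(value)
```

With these helpers, parse_int_flag("yes") raises TypeError while parse_bool_flag("yes") returns True, which is the "accepts many more values than before" change described above.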

> d) Functions with behavior that deliberately defies being expressed as
> a Python signature (aka the "untranslatable signature" problem)
> 
> Example: itertools.repeat(), which behaves differently depending on
> whether "times" is supplied as a positional or keyword argument.  (If 
> "times" is <0, and was supplied via position, the function yields 0
> times. If "times" is <0, and was supplied via keyword, the function
> yields infinitely-many times.)
> 
> Solution: 0) Do nothing, don't convert the function. 1) Change the
> signature until it is Python compatible.  This new signature *must*
> accept a superset of the arguments accepted by the existing signature.
> (This is being discussed right now in issue #19145.)

I can't imagine justifying such an API design in the first place, but
sometimes things "jest grew", rather than being designed.  I'm in favor
of #1, in any case.  If real backward compatibility is not feasible
for some reason, then I would favor the following:

       2) Deprecate the manky builtin, and leave it unconverted for AC;
          then add a new builtin with a sane signature, and re-implement
          the deprecated version as an impedance-matching shim over the
          new one.
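
A sketch of what I mean, in pure Python rather than C. Both repeat_sane and legacy_repeat are hypothetical names, and the shim reproduces the itertools.repeat() quirk only as Larry describes it above:

```python
import itertools
import warnings

_MISSING = object()  # distinguishes "not passed" from any real value


def repeat_sane(obj, times=None):
    """Hypothetical new API: None repeats forever, negatives yield nothing."""
    if times is None:
        return itertools.repeat(obj)
    return itertools.repeat(obj, max(times, 0))


def legacy_repeat(obj, *args, times=_MISSING):
    """Hypothetical deprecated shim that keeps the old quirk."""
    warnings.warn("legacy_repeat is deprecated; use repeat_sane",
                  DeprecationWarning, stacklevel=2)
    if args:
        if len(args) > 1 or times is not _MISSING:
            raise TypeError("legacy_repeat takes at most 2 arguments")
        # Count supplied by position: a negative count yields nothing.
        return repeat_sane(obj, max(args[0], 0))
    if times is _MISSING or times < 0:
        # Count omitted, or negative and supplied by keyword: infinite.
        return repeat_sane(obj)
    return repeat_sane(obj, times)
```

The odd positional-versus-keyword distinction lives only in the shim, so the new function keeps a signature that Python syntax can express.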


Tres.
- -- 
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEUEARECAAYFAlLikGgACgkQ+gerLs4ltQ5UEgCYu13+7HfmwWw2hq7GrsBGM4I3
UACgz3WKVvqG1QkOsx8C3tiCjp5PkL0=
=2tLW
-----END PGP SIGNATURE-----