There's a whole matrix of these and I'm wondering why the matrix is
currently sparse rather than implementing them all. Or rather, why we
can't stack them as:
class foo(object):
    @classmethod
    @property
    def bar(cls, ...):
        ...
Essentially the permutations are, I think:
{'unadorned'|abc.abstract} {'normal'|static|class} {method|property|non-callable attribute}.
concreteness | implicit first arg | type | name | comments
{unadorned} | {unadorned} | method | def foo(): | exists now
{unadorned} | {unadorned} | property | @property | exists now
{unadorned} | {unadorned} | non-callable attribute | x = 2 | exists now
{unadorned} | static | method | @staticmethod | exists now
{unadorned} | static | property | @staticproperty | proposing
{unadorned} | static | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
{unadorned} | class | method | @classmethod | exists now
{unadorned} | class | property | @classproperty or @classmethod;@property | proposing
{unadorned} | class | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | {unadorned} | method | @abc.abstractmethod | exists now
abc.abstract | {unadorned} | property | @abc.abstractproperty | exists now
abc.abstract | {unadorned} | non-callable attribute | @abc.abstractattribute or @abc.abstract;@attribute | proposing
abc.abstract | static | method | @abc.abstractstaticmethod | exists now
abc.abstract | static | property | @abc.abstractstaticproperty | proposing
abc.abstract | static | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | class | method | @abc.abstractclassmethod | exists now
abc.abstract | class | property | @abc.abstractclassproperty | proposing
abc.abstract | class | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
I think the meanings of the new ones are pretty straightforward, but in
case they are not...
@staticproperty - like @property only without an implicit first
argument. Allows the property to be called directly from the class
without requiring a throw-away instance.
@classproperty - like @property, only the implicit first argument to the
method is the class. Allows the property to be called directly from the
class without requiring a throw-away instance.
@abc.abstractattribute - a simple, non-callable variable that must be
overridden in subclasses
@abc.abstractstaticproperty - like @abc.abstractproperty only for
@staticproperty
@abc.abstractclassproperty - like @abc.abstractproperty only for
@classproperty
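For concreteness, a plain (non-abstract) classproperty can already be sketched as a
descriptor today; this is only an illustration of the intended semantics, not a proposed
implementation:

class classproperty:
    """Read-only property whose getter receives the class, not an instance."""
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, obj, objtype=None):
        # Invoked for both Foo.count and Foo().count; pass the class along.
        if objtype is None:
            objtype = type(obj)
        return self.fget(objtype)

class Foo:
    _count = 3

    @classproperty
    def count(cls):
        return cls._count

Foo.count     # -> 3, no throw-away instance needed
Foo().count   # -> 3 as well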
--rich
At the moment, the array module of the standard library allows creating
arrays of different numeric types and initializing them from an iterable
(eg, another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, eg the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow
simply specifying the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array construction in such a way that you
could pass an iterable (as now) as the second argument, but if you pass a
single integer value, it would be treated as the number of items to
allocate.
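Under that proposal, allocating the suffix array would reduce to something
like this (hypothetical behaviour, just to illustrate the suggested signature):

import array

# hypothetical: a single integer as the second argument means
# "allocate this many items" (contents unspecified or zero-filled)
sa = array.array('l', 6000000000)

# today this raises TypeError, because an int is not an iterable of items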
Here is my current workaround (which is slow):
import array

def filled_array(typecode, n, value=0, bsize=(1 << 22)):
    """Return a new array with the given typecode
    (eg, "l" for long int, as in the array module)
    with n entries, initialized to the given value (default 0).
    """
    a = array.array(typecode, [value] * bsize)  # one pre-filled block
    x = array.array(typecode)
    r = n
    while r >= bsize:            # extend in blocks of bsize items
        x.extend(a)
        r -= bsize
    x.extend([value] * r)        # remainder
    return x
Hi,
For technical reasons, many functions of the Python standard libraries
implemented in C have positional-only parameters. Example:
-------
$ ./python
Python 3.7.0a0 (default, Feb 25 2017, 04:30:32)
>>> help(str.replace)
replace(self, old, new, count=-1, /) # <== notice "/" at the end
...
>>> "a".replace("x", "y") # ok
'a'
>>> "a".replace(old="x", new="y") # ERR!
TypeError: replace() takes at least 2 arguments (0 given)
-------
When converting the methods of the builtin str type to the internal
"Argument Clinic" tool (tool to generate the function signature,
function docstring and the code to parse arguments in C), I asked if
we should add support for keyword arguments in str.replace(). The
answer was quick: no! It's a deliberate design choice.
Quote of Yury Selivanov's message:
"""
I think Guido explicitly stated that he doesn't like the idea to
always allow keyword arguments for all methods. I.e. `str.find('aaa')`
just reads better than `str.find(needle='aaa')`. Essentially, the idea
is that for most of the builtins that accept one or two arguments,
positional-only parameters are better.
"""
http://bugs.python.org/issue29286#msg285578
I just noticed a module on PyPI to implement this behaviour on Python functions:
https://pypi.python.org/pypi/positional
My question is: would it make sense to implement this feature in
Python directly? If yes, what should be the syntax? Use "/" marker?
Use the @positional() decorator?
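For comparison, a decorator-based approach could look roughly like the
following (a hypothetical sketch, not the API of the PyPI package mentioned
above):

import functools
import inspect

def positional_only(n):
    """Reject keyword use of the first n parameters of the decorated function."""
    def decorator(func):
        names = list(inspect.signature(func).parameters)[:n]

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Any of the first n parameters not supplied positionally must
            # not appear as a keyword argument.
            bad = [name for name in names[len(args):] if name in kwargs]
            if bad:
                raise TypeError("%s() got positional-only arguments passed "
                                "by keyword: %s" % (func.__name__, ", ".join(bad)))
            return func(*args, **kwargs)
        return wrapper
    return decorator

@positional_only(2)
def replace(old, new, count=-1):
    ...

replace("x", "y")            # ok
replace(old="x", new="y")    # TypeError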
Do you see concrete cases where it's a deliberate choice to deny
passing arguments as keywords?
Don't you like writing int(x="123") instead of int("123")? :-) (I know
that Serhiy Storchaka hates the name of the "x" parameter of the int
constructor ;-))
By the way, I read that the "/" marker is unknown to almost all Python
developers, and that the [...] syntax should be preferred, but
inspect.signature() doesn't support that syntax. Maybe we should fix
signature() and use the [...] format instead?
Replace "replace(self, old, new, count=-1, /)" with "replace(self,
old, new[, count=-1])" (or maybe even not document the default
value?).
Python 3.5 help (docstring) uses "S.replace(old, new[, count])".
Victor
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
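For the curious, the constant folding mentioned above is easy to observe on
current CPython, where the peephole optimizer turns '+' between two short
string literals into a single constant:

import dis

dis.dis(compile("x = 'a' + 'b'", "<example>", "exec"))
# shows a single LOAD_CONST 'ab' -- the concatenation happens at compile
# time, so spelling it with an explicit '+' costs nothing at runtime here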
Would it be reasonable to start deprecating this and eventually remove
it from the language?
--
--Guido van Rossum (python.org/~guido)
Hi, do you have an opinion on the following?
Wouldn't it be nice to define classes via a simple constructor function (as
below) instead of a conventional class definition?
*conventional*:
class MyClass(ParentClass):
    def __init__(self, x):
        self._x = x

    def my_method(self, y):
        z = self._x + y
        return z
*proposed*:
def MyClass(x):
    self = ParentClass()

    def my_method(y):
        z = x + y
        return z

    self.my_method = my_method  # that's cumbersome (see comments below)
    return self
Here are the pros and cons I could come up with for the proposed method:
(+) Simpler and more explicit.
(+) No need to create attributes (like `self._x`) just to pass something
from `__init__` to another method.
(+) Default arguments / annotations for methods could be different for each
class instance. Adaptive defaults wouldn't have to be simulated with a None
sentinel.
(+) Class/instance level imports would work.
(-/+) Speed: The `def`-based objects take 0.6 μs to create while the
`class`-based objects take only 0.4 μs. For method execution however the
closure takes only 0.15 μs while the proper method takes 0.22 μs (script
<https://gist.github.com/rmst/78b2b0f56a3d9ec13b1ec6f3bd50aa9c>).
(-/+) Checking types: In the proposed example above the returned object
wouldn't know that it has been created by `MyClass`. There are a couple of
solutions to that, though. The easiest to implement would be to change the
first line to `self = subclass(ParentClass())`, where the `subclass` function
looks at the next item in the call stack (i.e. `MyClass`) and makes it the
type of the object (a minimal sketch of such a helper appears after the
examples below). Another solution would be a special rule for functions whose
name starts with a capital letter and that return a single object: the
function would append itself to the list of types of the returned object.
Alternatively there could be a special keyword, e.g. `classdef`, that would be
used instead of `def` if we didn't want to rely on the name.
(-) The current syntax for adding a function to an object is cumbersome.
That's what is preventing me from actually using the proposed pattern. But
is this really the only reason for not using it? And if so, wouldn't that
be a good argument for enabling something like below?
*attribute function definitions*:
def MyClass(x):
    self = ParentClass()

    def self.my_method(y):
        z = x + y
        return z

    return self
or alternatively *multiline lambdas*:
def MyClass(x):
    self = ParentClass()

    self.my_method = (y):
        z = x + y
        return z

    return self
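As mentioned in the pros/cons above, the stack-inspecting `subclass` helper
could be sketched roughly like this (illustrative only; it assumes the parent
is an ordinary Python class whose instances allow `__class__` reassignment,
and a real version would cache the created types):

import sys

class ParentClass:
    pass

def subclass(obj):
    # Name of the calling constructor function, e.g. "MyClass".
    caller = sys._getframe(1).f_code.co_name
    # Re-type the object as a freshly created subclass carrying that name.
    obj.__class__ = type(caller, (type(obj),), {})
    return obj

def MyClass(x):
    self = subclass(ParentClass())
    self.x = x
    return self

type(MyClass(1)).__name__            # -> 'MyClass'
isinstance(MyClass(1), ParentClass)  # -> True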
Cheers,
Simon
Hello Everyone!
First time writing to python-ideas.
*Overview*
Add a new mock class within the mock module
<https://github.com/python/cpython/blob/master/Lib/unittest/mock.py>,
SealedMock (or RestrictedMock), that allows restricting, dynamically and
recursively, the addition of attributes to it. The new class just defines a
special attribute "sealed"; once it is set to True, the automatic creation of
child mocks is blocked for the mock and for all of its "submocks".
See sealedmock
<https://github.com/mariocj89/sealedmock/blob/master/README.md>. Don't
focus on the implementation, it is ugly, it would be much simpler within
*mock.py*.
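To make the intended behaviour concrete, usage would look roughly like this
(assuming the SealedMock class described above; the attribute names are only
illustrative):

m = SealedMock()
m.gateway.get_user.return_value = "mario"   # define the allowed interface
m.sealed = True                             # freeze the mock and its submocks

m.gateway.get_user()    # -> "mario"
m.gateway.get_usr()     # typo: raises AttributeError instead of silently
                        # returning a brand-new child mock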
*Rationale*
Inspired by GMock's
<https://github.com/google/googletest/tree/master/googlemock> RestrictedMock,
SealedMock aims to allow the developer to define a narrow interface for the
mock that specifies what is allowed to be called on it.
The feature of mocks returning mocks by default is extremely useful, but not
always desired. Quite often you rely on it only while you are writing the
test, but you want it disabled by the time the mock is passed into your code;
that is what SealedMock aims to address.
This solution also prevents user errors when mocking incorrect paths or
having typos when calling attributes/methods of the mock.
We have tried it internally in our company and it gives a considerably nicer
user experience for many use cases, especially for new users of mock, as it
helps out when you mock the wrong path.
*Alternatives*
- Using auto_spec/spec is a possible solution but removes flexibility
and is rather painful to write for each of the mocks and submocks being
used.
- Leaving it outside of the mock.py as it is not interesting enough. I
am fine with it :) just proposing it in case you think otherwise.
- Make it part of the standard Mock base class. Works for me, but I'd be
concerned about how we could do it in a backward-compatible way. (Say someone
is mocking something that has a "sealed" attribute already.)
Let me know what you think; I'm happy to open an enhancement in
https://bugs.python.org/ and send a PR.
Regards,
Mario
Hi All,
I find importing defaultdict from collections to be clunky and it seems
like having a default should just be an optional keyword to dict. Thus,
something like,
d = dict(default=int)
would be the same as
from collections import defaultdict
d = defaultdict(int)
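If it helps pin down the semantics, the requested behaviour can be emulated
today with a small dict subclass (illustrative sketch only; a real
dict(default=...) would of course be built in):

class DefaultDict(dict):
    """dict with an optional default factory, like the proposed dict(default=...)."""
    def __init__(self, *args, default=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.default = default

    def __missing__(self, key):
        # Called by dict.__getitem__ for missing keys on subclasses.
        if self.default is None:
            raise KeyError(key)
        self[key] = value = self.default()
        return value

d = DefaultDict(default=int)
d['spam'] += 1    # d['spam'] == 1, as with defaultdict(int)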
Any thoughts?
Thanks,
++Steve
Hi all,
The itertools.tee function can hold on to objects "unnecessarily".
In particular, if you do
iter2 = itertools.tee(iter1, 2)[0]
i.e. you "leak" one of the returned iterators, then the objects produced by
the underlying iterator are not collected until iter2 itself is collected.
I propose a different implementation, namely the one in:
https://github.com/stephanh42/streamtee
streamtee.tee is a drop-in alternative for itertools.tee but
as you can see from the test in the repo, it will not hold
on to the generated objects as long.
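The underlying idea can be sketched in a few lines: share a forward-linked
chain of cells between the returned iterators, so each iterator only keeps a
reference to its current position and everything behind the slowest *live*
iterator becomes collectable (this is just a sketch of the approach, not the
streamtee code itself):

def tee(iterable, n=2):
    it = iter(iterable)

    def gen(cell):
        # `cell` is an empty list until someone needs the value at this
        # position; it is then filled with a (value, next_cell) pair that
        # all sibling iterators share.
        while True:
            if not cell:
                try:
                    cell.append((next(it), []))
                except StopIteration:
                    return
            value, cell = cell[0]
            yield value

    start = []
    return tuple(gen(start) for _ in range(n))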
I propose this as an improved implementation of itertools.tee.
Thanks,
Stephan
What do you think of this idea for a slight modification to os.fspath:
the current version checks whether its arg is an instance of str, bytes
or any subclass and, if so, returns the arg unchanged. In all other
cases it tries to call the type's __fspath__ method to see if it can get
str, bytes, or a subclass thereof this way.
My proposal is to change this to:
1) check whether the type of the argument is str or bytes *exactly*; if
so, return the argument unchanged
2) check whether __fspath__ can be called on the type and returns an
instance of str, bytes, or any subclass (just like in the current version)
3) check whether the type is a subclass of str or bytes and, if so,
return it unchanged
This would have the following implications:
a) it would speed up the very common case when the arg is either a str
or a bytes instance exactly
b) user-defined classes that inherit from str or bytes could control
their path representation just like any other class
c) subclasses of str/bytes that don't define __fspath__ would still work
like they do now, but their processing would be slower
d) subclasses of str/bytes that accidentally define a __fspath__ method
would change their behavior
I think cases c) and d) could be sufficiently rare that the pros
outweigh the cons?
Here's how the proposal could be implemented in the pure Python version
(os._fspath):
def _fspath(path):
    path_type = type(path)
    if path_type is str or path_type is bytes:
        return path

    # Work from the object's type to match method resolution of other magic
    # methods.
    try:
        path_repr = path_type.__fspath__(path)
    except AttributeError:
        if hasattr(path_type, '__fspath__'):
            raise
        elif issubclass(path_type, (str, bytes)):
            return path
        else:
            raise TypeError("expected str, bytes or os.PathLike object, "
                            "not " + path_type.__name__)

    if isinstance(path_repr, (str, bytes)):
        return path_repr
    else:
        raise TypeError("expected {}.__fspath__() to return str or bytes, "
                        "not {}".format(path_type.__name__,
                                        type(path_repr).__name__))
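To illustrate implications b) and d), consider a str subclass that defines
__fspath__ (a hypothetical class, used with the _fspath above):

class Redacted(str):
    """A str subclass that wants to control its own path representation."""
    def __fspath__(self):
        return "/tmp/redacted"

p = Redacted("/home/user/secret.txt")
_fspath(p)   # proposed behaviour: "/tmp/redacted" (the __fspath__ result wins)
# current os.fspath(p) returns p unchanged, because p is an instance of str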
Although I haven't been able to achieve the pace that I originally
wanted, I have been able to work on my multi-core Python idea
little-by-little. Most notably, some of the blockers have been
resolved at the recent PyCon sprints and I'm ready to move onto the
next step: exposing multiple interpreters via a stdlib module.
Initially I just want to expose basic support via 3 successive
changes. Below I've listed the corresponding (chained) PRs, along
with what they add. Note that the 2 proposed modules take some cues
from the threading module, but don't try to be any sort of
replacement. Threading and subinterpreters are two different features
that are used together rather than as alternatives to one another.
At the very least I'd like to move forward with the _interpreters
module sooner rather than later. Doing so will facilitate more
extensive testing of subinterpreters, in preparation for further use
of them in the multi-core Python project. We can iterate from there,
but I'd at least like to get the basic functionality landed early.
Any objections to (or feedback about) the low-level _interpreters
module as described? Likewise for the high-level interpreters module?
Discussion of any expanded functionality for the modules, or of the broader
topic of the multi-core project, is also welcome, but please start other
threads for those topics.
-eric
basic low-level API: https://github.com/python/cpython/pull/1748

    _interpreters.create() -> id
    _interpreters.destroy(id)
    _interpreters.run_string(id, code)
    _interpreters.run_string_unrestricted(id, code, ns=None) -> ns

extra low-level API: https://github.com/python/cpython/pull/1802

    _interpreters.enumerate() -> [id, ...]
    _interpreters.get_current() -> id
    _interpreters.get_main() -> id
    _interpreters.is_running(id) -> bool

basic high-level API: https://github.com/python/cpython/pull/1803

    interpreters.enumerate() -> [Interpreter, ...]
    interpreters.get_current() -> Interpreter
    interpreters.get_main() -> Interpreter
    interpreters.create() -> Interpreter
    interpreters.Interpreter(id)
    interpreters.Interpreter.is_running()
    interpreters.Interpreter.destroy()
    interpreters.Interpreter.run(code)
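For a sense of how the high-level API above would read in user code (based
only on the names listed; details may of course change):

import interpreters   # the proposed high-level module

interp = interpreters.create()
try:
    interp.run("print('hello from a subinterpreter')")
finally:
    interp.destroy()

print(interpreters.get_current())   # the Interpreter we are running in
print(interpreters.enumerate())     # all existing Interpreters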