There's a whole matrix of these and I'm wondering why the matrix is
currently sparse rather than implementing them all. Or rather, why we
can't stack them as:
class foo(object):
    @classmethod
    @property
    def bar(cls, ...):
        ...
Essentially the permutations are, I think:
{'unadorned'|abc.abstract} {'normal'|static|class} {method|property|non-callable attribute}.
concreteness | implicit first arg | type                   | name                                               | comments
{unadorned}  | {unadorned}        | method                 | def foo():                                         | exists now
{unadorned}  | {unadorned}        | property               | @property                                          | exists now
{unadorned}  | {unadorned}        | non-callable attribute | x = 2                                              | exists now
{unadorned}  | static             | method                 | @staticmethod                                      | exists now
{unadorned}  | static             | property               | @staticproperty                                    | proposing
{unadorned}  | static             | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
{unadorned}  | class              | method                 | @classmethod                                       | exists now
{unadorned}  | class              | property               | @classproperty or @classmethod;@property           | proposing
{unadorned}  | class              | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | {unadorned}        | method                 | @abc.abstractmethod                                | exists now
abc.abstract | {unadorned}        | property               | @abc.abstractproperty                              | exists now
abc.abstract | {unadorned}        | non-callable attribute | @abc.abstractattribute or @abc.abstract;@attribute | proposing
abc.abstract | static             | method                 | @abc.abstractstaticmethod                          | exists now
abc.abstract | static             | property               | @abc.abstractstaticproperty                        | proposing
abc.abstract | static             | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | class              | method                 | @abc.abstractclassmethod                           | exists now
abc.abstract | class              | property               | @abc.abstractclassproperty                         | proposing
abc.abstract | class              | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
I think the meanings of the new ones are pretty straightforward, but in
case they are not...
@staticproperty - like @property only without an implicit first
argument. Allows the property to be called directly from the class
without requiring a throw-away instance.
@classproperty - like @property, only the implicit first argument to the
method is the class. Allows the property to be called directly from the
class without requiring a throw-away instance.
@abc.abstractattribute - a simple, non-callable variable that must be
overridden in subclasses
@abc.abstractstaticproperty - like @abc.abstractproperty only for
@staticproperty
@abc.abstractclassproperty - like @abc.abstractproperty only for
@classproperty
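For what it's worth, a read-only @classproperty can already be approximated
today with a small descriptor. A minimal sketch (no setter support, and no
abc integration, which is the harder part):

    class classproperty:
        """Read-only sketch: forwards class-level attribute access to fget."""
        def __init__(self, fget):
            self.fget = fget

        def __get__(self, obj, objtype=None):
            if objtype is None:
                objtype = type(obj)
            return self.fget(objtype)

    class Foo:
        _registry = {"a": 1, "b": 2}

        @classproperty
        def registry_size(cls):
            return len(cls._registry)

    print(Foo.registry_size)    # -> 2, no throw-away instance needed
    print(Foo().registry_size)  # -> 2 as well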
--rich
At the moment, the array module of the standard library allows you to
create arrays of different numeric types and to initialize them from
an iterable (e.g., another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, eg the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB of memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow you
to simply specify the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array construction in such a way that you
could pass an iterable (as now) as the second argument, but if you pass a
single integer value, it would be treated as the number of items to
allocate.
Here is my current workaround (which is slow):
import array

def filled_array(typecode, n, value=0, bsize=(1 << 22)):
    """Return a new array with the given typecode
    (e.g., "l" for long int, as in the array module)
    with n entries, initialized to the given value (default 0).
    """
    a = array.array(typecode, [value] * bsize)
    x = array.array(typecode)
    r = n
    while r >= bsize:
        x.extend(a)
        r -= bsize
    x.extend([value] * r)
    return x
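For what it's worth, the zero-fill case can already be done in one step by
building the array from a pre-sized bytes buffer; a rough sketch (it relies
on array's buffer-initialization behaviour and the itemsize of the chosen
typecode):

    import array

    def filled_array_fast(typecode, n, value=0):
        """Allocate an array of n items without growing it in chunks."""
        itemsize = array.array(typecode).itemsize
        if value == 0:
            # bytes(k) creates k zero bytes; array() reinterprets the
            # buffer as n machine values of the given typecode.
            return array.array(typecode, bytes(n * itemsize))
        # For non-zero fill values, sequence repetition avoids the
        # Python-level extend loop.
        return array.array(typecode, [value]) * n

But a constructor that accepts the item count directly would still be
clearer than either workaround.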
Hi folks,
I normally wouldn't bring something like this up here, except I think
that there is possibility of something to be done--a language
documentation clarification if nothing else, though possibly an actual
code change as well.
I've been having an argument with a colleague over the last couple of
days over the proper order of statements when setting up a
try/finally to perform cleanup of some action. On some level we're
both being stubborn I think, and I'm not looking for resolution as to
who's right/wrong or I wouldn't bring it to this list in the first
place. The original argument was over setting and later restoring
os.environ, but we ended up arguing over
threading.Lock.acquire/release which I think is a more interesting
example of the problem, and he did raise a good point that I do want
to bring up.
</prologue>
My colleague's contention is that given
lock = threading.Lock()
this is simply *wrong*:
lock.acquire()
try:
do_something()
finally:
lock.release()
whereas this is okay:
with lock:
do_something()
Ignoring other details of how threading.Lock is actually implemented,
assuming that Lock.__enter__ calls acquire() and Lock.__exit__ calls
release() then as far as I've known ever since Python 2.5 first came
out these two examples are semantically *equivalent*, and I can't find
any way of reading PEP 343 or the Python language reference that would
suggest otherwise.
However, there *is* a difference, and it has to do with how signals are
handled, particularly w.r.t. context managers implemented in C (hence
we are talking about CPython specifically):
If Lock.__enter__ is a pure Python method (even if it maybe calls some
C methods), and a SIGINT is handled during execution of that method,
then in almost all cases a KeyboardInterrupt exception will be raised
from within Lock.__enter__--this means the suite under the with:
statement is never evaluated, and Lock.__exit__ is never called. You
can be fairly sure the KeyboardInterrupt will be raised from somewhere
within a pure Python Lock.__enter__ because there will usually be at
least one remaining opcode to be evaluated, such as RETURN_VALUE.
Because of how delayed execution of signal handlers is implemented in
the pyeval main loop, this means the signal handler for SIGINT will be
called *before* RETURN_VALUE, resulting in the KeyboardInterrupt
exception being raised. Standard stuff.
However, if Lock.__enter__ is a PyCFunction things are quite
different. If you look at how the SETUP_WITH opcode is implemented,
it first calls the __enter__ method with _PyObject_CallNoArg. If this
returns NULL (i.e. an exception occurred in __enter__) then "goto
error" is executed and the exception is raised. However if it returns
non-NULL the finally block is set up with PyFrame_BlockSetup and
execution proceeds to the next opcode. At this point a potentially
waiting SIGINT is handled, resulting in KeyboardInterrupt being raised
while inside the with statement's suite, so the finally block, and
hence Lock.__exit__, is still entered.
Long story short, because Lock.__enter__ is a C function, assuming
that it succeeds normally then
with lock:
do_something()
always guarantees that Lock.__exit__ will be called if a SIGINT was
handled inside Lock.__enter__, whereas with
lock.acquire()
try:
...
finally:
lock.release()
there is at least a small possibility that the SIGINT handler is called
after the CALL_FUNCTION op but before the try/finally block is entered
(e.g. before executing POP_TOP or SETUP_FINALLY). So the end result
is that the lock is held and never released after the
KeyboardInterrupt (whether or not it's handled somehow).
Whereas, again, if Lock.__enter__ is a pure Python function there's
less likely to be any difference (though I don't think the possibility
can be ruled out entirely).
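If you want to see the window concretely, disassembling the two forms shows
it (opcode names below are from CPython 3.6-era bytecode and differ in other
versions):

    import dis

    def manual(lock):
        lock.acquire()
        try:
            pass
        finally:
            lock.release()

    def managed(lock):
        with lock:
            pass

    # In manual() there are extra opcodes (POP_TOP, SETUP_FINALLY) between
    # the CALL_FUNCTION for acquire() and the start of the protected block;
    # a signal handled on that boundary leaves the lock held with no
    # release scheduled.
    dis.dis(manual)
    # In managed() a single SETUP_WITH both calls __enter__ and pushes the
    # cleanup block, so there is no such gap once __enter__ has returned.
    dis.dis(managed)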
At the very least I think this quirk of CPython should be mentioned
somewhere (since in all other cases the semantic meaning of the
"with:" statement is clear). However, I think it might be possible to
gain more consistency between these cases if pending signals are
checked/handled after any direct call to PyCFunction from within the
ceval loop.
Sorry for the tl;dr; any thoughts?
Hi,
For technical reasons, many functions of the Python standard library
implemented in C have positional-only parameters. Example:
-------
$ ./python
Python 3.7.0a0 (default, Feb 25 2017, 04:30:32)
>>> help(str.replace)
replace(self, old, new, count=-1, /) # <== notice "/" at the end
...
>>> "a".replace("x", "y") # ok
'a'
>>> "a".replace(old="x", new="y") # ERR!
TypeError: replace() takes at least 2 arguments (0 given)
-------
When converting the methods of the builtin str type to the internal
"Argument Clinic" tool (tool to generate the function signature,
function docstring and the code to parse arguments in C), I asked if
we should add support for keyword arguments in str.replace(). The
answer was quick: no! It's a deliberate design choice.
Quote of Yury Selivanov's message:
"""
I think Guido explicitly stated that he doesn't like the idea to
always allow keyword arguments for all methods. I.e. `str.find('aaa')`
just reads better than `str.find(needle='aaa')`. Essentially, the idea
is that for most of the builtins that accept one or two arguments,
positional-only parameters are better.
"""
http://bugs.python.org/issue29286#msg285578
I just noticed a module on PyPI to implement this behaviour on Python functions:
https://pypi.python.org/pypi/positional
My question is: would it make sense to implement this feature in
Python directly? If yes, what should be the syntax? Use "/" marker?
Use the @positional() decorator?
Do you see concrete cases where it's a deliberate choice to deny
passing arguments as keywords?
Don't you like writing int(x="123") instead of int("123")? :-) (I know
that Serhiy Storchaka hates the name of the "x" parameter of the int
constructor ;-))
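For comparison, here is roughly what a decorator-based approach could look
like in pure Python (a sketch of my own, not the actual API of the
"positional" package on PyPI):

    import functools

    def positional_only(n):
        """Reject keyword use of the first n parameters of a function."""
        def decorator(func):
            names = func.__code__.co_varnames[:n]
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                bad = [name for name in names if name in kwargs]
                if bad:
                    raise TypeError(
                        "{}() got positional-only arguments passed as "
                        "keywords: {}".format(func.__name__, ", ".join(bad)))
                return func(*args, **kwargs)
            return wrapper
        return decorator

    @positional_only(2)
    def find(haystack, needle, start=0):
        return haystack.find(needle, start)

    find("aaa", "a")                    # ok
    # find(haystack="aaa", needle="a")  # -> TypeError

A syntax-level "/" marker would of course be cheaper than a wrapper like this.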
By the way, I read that the "/" marker is unknown to almost all Python
developers, and that the [...] syntax should be preferred, but
inspect.signature() doesn't support that syntax. Maybe we should fix
signature() and use the [...] format instead?
Replace "replace(self, old, new, count=-1, /)" with "replace(self,
old, new[, count=-1])" (or maybe even not document the default
value?).
Python 3.5 help (docstring) uses "S.replace(old, new[, count])".
Victor
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
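A quick check with dis confirms the constant folding (this is
CPython-specific behaviour of the peephole optimizer):

    import dis

    # The literal '+' concatenation is folded at compile time, so the
    # disassembly shows a single LOAD_CONST for 'ab' and no runtime add.
    dis.dis(compile("x = 'a' + 'b'", '<demo>', 'exec'))

So explicit '+' between string literals costs nothing at runtime.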
Would it be reasonable to start deprecating this and eventually remove
it from the language?
--
--Guido van Rossum (python.org/~guido)
The current syntax for Callable types is unwieldy, particularly when
extended to include varargs and keyword args as in
http://mypy.readthedocs.io/en/latest/kinds_of_types.html#extended-callable-types.
Why not introduce a signature literal? Proposed
syntax:
>>> from inspect import Signature, Parameter
>>> () ->
Signature()
>>> (arg0, arg1, arg2=None, arg3=None) ->
Signature(
    [Parameter('arg0', Parameter.POSITIONAL_OR_KEYWORD),
     Parameter('arg1', Parameter.POSITIONAL_OR_KEYWORD),
     Parameter('arg2', Parameter.POSITIONAL_OR_KEYWORD, default=None),
     Parameter('arg3', Parameter.POSITIONAL_OR_KEYWORD, default=None)],
    return_annotation=str
)
>>> (arg0, arg1: int, arg2=None, arg3: float=None) -> str
Signature(
    [Parameter('arg0', Parameter.POSITIONAL_OR_KEYWORD),
     Parameter('arg1', Parameter.POSITIONAL_OR_KEYWORD, annotation=int),
     Parameter('arg2', Parameter.POSITIONAL_OR_KEYWORD, default=None),
     Parameter('arg3', Parameter.POSITIONAL_OR_KEYWORD, annotation=float,
               default=None)],
    return_annotation=str
)
>>> (:, :, :, arg1, *, arg2) ->
Signature(
    [Parameter('', Parameter.POSITIONAL_ONLY),
     Parameter('', Parameter.POSITIONAL_ONLY),
     Parameter('', Parameter.POSITIONAL_ONLY),
     Parameter('arg1', Parameter.POSITIONAL_OR_KEYWORD),
     Parameter('arg2', Parameter.KEYWORD_ONLY)]
)
>>> (:int, :float, *, keyword: complex) -> str
Signature(
    [Parameter('', Parameter.POSITIONAL_ONLY, annotation=int),
     Parameter('', Parameter.POSITIONAL_ONLY, annotation=float),
     Parameter('keyword', Parameter.KEYWORD_ONLY, annotation=complex)],
    return_annotation=str
)
Compare the above to their equivalents using Callable (and the experimental
extension to Mypy):
>>> Callable[[], Any]
>>> Callable[[Arg(Any, 'arg0'), Arg(int, 'arg1'), DefaultArg(Any, 'arg2'),
...           DefaultArg(float, 'kwarg3')], str]
>>> Callable[[Arg(), Arg(), Arg(), Arg(Any, 'arg1'), NamedArg(Any, 'arg2')], Any]
>>> Callable[[int, float, NamedArg(complex, 'keyword')], Any]
The proposed signature literal syntax is shorter, just as clear and imo
nicer to read.
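For reference, the objects these literals would evaluate to can already be
built by hand with today's inspect module (positional-only parameters
currently require a name, so 'a0' and 'a1' below are stand-ins):

    from inspect import Signature, Parameter

    # What the proposed (:int, :float, *, keyword: complex) -> str would
    # evaluate to, spelled with the existing API:
    sig = Signature(
        [Parameter('a0', Parameter.POSITIONAL_ONLY, annotation=int),
         Parameter('a1', Parameter.POSITIONAL_ONLY, annotation=float),
         Parameter('keyword', Parameter.KEYWORD_ONLY, annotation=complex)],
        return_annotation=str)
    print(sig)  # prints something like
                # (a0: int, a1: float, /, *, keyword: complex) -> str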
Here is what it looks like in annotations:
from typing import TypeVar, Callable

A = TypeVar('A')

def apply_successor(func: Callable[[A], A], init: A,
                    n_applications: int) -> A: ...

def apply_successor(func: (:A) -> A, init: A,
                    n_applications: int) -> A: ...

import tensorflow as tf
import numpy as np

def run(policy: Callable[[np.ndarray,
                          Arg(Dict[tf.Tensor, np.ndarray], 'updated_feeds')],
                         np.ndarray]) -> bool: ...

def run(policy: (:np.ndarray, updated_feeds: Dict[tf.Tensor, np.ndarray])
        -> np.ndarray) -> bool: ...

# If Mypy accepted literals for container types (dict, set, list, tuple,
# etc.) this would be nicer still:
def run(policy: (:np.ndarray, updated_feeds: {tf.Tensor: np.ndarray})
        -> np.ndarray) -> bool: ...
Initial thoughts:
- () -> is ugly, but the -> would be necessary to distinguish it from
the empty tuple (). Actually, it can be difficult to tell the difference
between the proposed signature literals and tuples, especially for long
signatures with no annotations or defaults. An alternative would be to
prefix the arguments with an @ or other uncommon symbol (maybe &). () ->
becomes @(), and it is clear from the start that you're reading a
signature.
- Supposing the syntax for function definitions was changed to match the
proposed signature literals, one could make something like the following
possible:
>>> def add(:, :):
...     arg0, arg1 = __call_signature__.args
...     return arg0.value + arg1.value
>>> add(1, 2)
3
>>> add('hello', 'world')
'helloworld'
Where __call_signature__ is a magic name that evaluates to an
inspect.BoundArguments instance representing the signature of the function
call. I'm not sure why you'd want functions with positional-only arguments,
but now you could have them.
- You could further extend the function definition syntax to allow an
expression that evaluates to a signature instead of a literal
>>> signature = (:, :) ->
>>> def add signature:
...     arg0, arg1 = __call_signature__.args
...     return arg0 + arg1
Again, not sure how useful this would be.
Hello,
I often use ArgumentParser in my projects, as well as its ability to
read argument lists from files. However, the problem is that nested
includes of such argument files have to specify paths relative to
os.getcwd(), no matter where the file containing the include statement
is located. Currently, this can be circumvented by always using absolute
paths. But imho that is not a practical solution, due to the obvious
portability issues it causes.
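For context, here is the existing mechanism the proposal builds on; today a
nested "@file" reference inside an argument file is resolved relative to
os.getcwd(), not relative to the file that contains it:

    import argparse

    parser = argparse.ArgumentParser(fromfile_prefix_chars='@')
    parser.add_argument('--name')

    # Assuming a file "args.txt" containing the two lines "--name" and "demo":
    #     args = parser.parse_args(['@args.txt'])
    #     print(args.name)   # -> 'demo'
    # If args.txt itself contains a line "@more.txt", that path is looked up
    # relative to the current working directory, not relative to args.txt.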
I suggest adding a new parameter to argparse.ArgumentParser that
controls the behaviour:
* fromfile_parent_relative - Whether to treat paths of included argument
files as relative to the location of the file they are specified in
(``True``) or to the current working directory (``False``)
(default: ``False``)
Doing so would allow users to choose between the two different strategies
while keeping backwards compatibility.
I made a pull request [1] which adds the functionality + docs to
demonstrate a possible solution.
What do you think about this enhancement?
Please note this is my first contribution to cpython. I now know that I
should have presented it to python-ideas before starting a pull request.
Sorry for doing it the wrong way around.
Best regards
Robert
[1] https://github.com/python/cpython/pull/1698
Hi all,
while using `unittest` I see the pattern of creating an error message
with the test context for the case that some `assert...` method fails
(to get a good error message), along the lines of:
class Test...(unittest.TestCase):
    longMessage = True

    def test_(self):
        ...
        for a, b, c, ... in zip(A, B, C, ...):
            # call the function under test and get the result
            msg = "Some headline: {}{} ...".format(a, b, c, ...)
            self.assert...(..., msg)
The `msg` is just used in case the assert fails but its creation takes
time and adds up.
What is the best practice/pattern you use here?
Or, are there ideas for a lazy mechanism that avoids that creation and
only builds the message in the case the assert fails?
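One possibility I can think of is to pass an object whose __str__ does the
formatting, since unittest only stringifies the msg argument when it
actually builds a failure message. A rough sketch (LazyMsg is my own name,
not anything from unittest):

    class LazyMsg:
        """Defer message formatting until the assertion actually fails."""
        def __init__(self, fmt, *args, **kwargs):
            self.fmt, self.args, self.kwargs = fmt, args, kwargs

        def __str__(self):
            return self.fmt.format(*self.args, **self.kwargs)

    # usage inside a test method:
    #     self.assertEqual(result, expected,
    #                      LazyMsg("Some headline: {} {}", a, b))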
Thanks in advance!
--francis
TBH you're completely right. Every time I see someone using unittest
andItsHorriblyUnpythonicNames, I want to kill a camel.
Sometimes, though, I feel like part of the struggle is the alternative. If
you dislike unittest, but pytest is too "magical" for you, what do you use?
Many Python testing tools like nose are just test *runners*, so you still
need something else. In the end, many just end up back at unittest, maybe
with nose on top.
As much as I hate JavaScript, their testing libraries are leagues above
what Python has.
--
Ryan (ライアン)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
http://refi64.com
On Aug 22, 2017 at 5:09 PM, Chris Barker <chris.barker(a)noaa.gov> wrote:
** Caution: cranky curmudgeonly opinionated comment ahead: **
unittest is such an ugly Java-esque static mess of an API that there's
really no point in trying to clean it up and make it more pythonic -- go
off and use pytest and be happier.
-CHB
On Tue, Aug 22, 2017 at 5:42 AM, Nick Coghlan <ncoghlan(a)gmail.com> wrote:
> On 22 August 2017 at 15:34, Nick Coghlan <ncoghlan(a)gmail.com> wrote:
> > On 21 August 2017 at 11:32, Neil Girdhar <mistersheik(a)gmail.com> wrote:
> >> This question describes an example of the problem:
> >> https://stackoverflow.com/questions/8416208/in-python-is-there-a-good-idiom-for-using-context-managers-in-setup-teardown.
> >> You want to invoke a context manager in your setup/tearing-down, but the
> >> easiest way to do that is to override run, which seems ugly.
> >
> > Using context managers when you can't use a with statement is one of
> > the main use cases for contextlib.ExitStack():
> >
> >     def setUp(self):
> >         self._resource_stack = stack = contextlib.ExitStack()
> >         self._resource = stack.enter_context(MyResource())
> >
> >     def tearDown(self):
> >         self._resource_stack.close()
> >
> > I posted that as an additional answer to the question:
> > https://stackoverflow.com/questions/8416208/in-python-is-there-a-good-idiom-for-using-context-managers-in-setup-teardown/45809502#45809502
>
> Sjoerd pointed out off-list that this doesn't cover the case where
> you're acquiring multiple resources and one of the later acquisitions
> fails, so I added the ExitStack idiom that covers that case (using
> stack.pop_all() as the last operation in a with statement):
>
>     def setUp(self):
>         with contextlib.ExitStack() as stack:
>             self._resource1 = stack.enter_context(GetResource())
>             self._resource2 = stack.enter_context(GetOtherResource())
>             # Failures before here -> immediate cleanup
>             self.addCleanup(stack.pop_all().close)
>             # Now cleanup won't happen until the cleanup functions run
>
> I also remember that using addCleanup lets you avoid defining tearDown
> entirely.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker(a)noaa.gov
I know I'm not the only one who is confused by at least some of the
alternative terminology choices. I suspect I'm not the only one who
sometimes missed part of the argument because I was distracted
figuring out what the objects were, and forgot to verify what was
being done and why. I also suspect that it could be much simpler to
follow if the API were designed in the abstract, with the
implementation left for later.
So is the following API missing anything important?
(1) Get the current (writable) context.
Currently proposed as a sys.* call, but I think injecting to
__builtins__ or globals would work as well.
(2) Get a value from the current context, by string key.
Currently proposed as key.get, rather than env.__getitem__
(3) Write a value to the current context, by string key.
Currently proposed as key.set, rather than env.__setitem__
(4) Create a new (writable) empty context.
(5) Create a copy of the current context, so that changes can be
isolated. The copy will not be able to change anything in the current
context, though it can shadow keys.
(6) Choose which context to use when calling another
function/generator/iterator/etc.
At this point, it looks an awful lot like a subset of ChainMap, except that:
(A) The current mapping is available through a series of sys.* calls.
(why not a builtin? Or at least a global, injected when a different
environment is needed?)
(B) Concurrency APIs are supposed to ensure that each
process/thread/Task/worker is using its own private context, unless
the call explicitly requests a shared or otherwise different context.
(C) The current API requires users to initialize every key before it
can be added to a context. This is presumably to support limits of
the proposed implementation.
If the semantics are right, and collections.ChainMap is rejected only
for efficiency, please say so in the PEP. If the semantics are wrong,
please explain how they differ.
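For concreteness, here is a rough ChainMap-flavoured sketch of the env
object used in the sample code below; this is my own reading of points
(1)-(5), not anything from the PEP (point (6) would additionally need a way
to install a different Env as the current one):

    import contextlib
    from collections import ChainMap

    class Env:
        """Sketch only: a current context with copy()/empty() scopes."""

        def __init__(self):
            self._map = ChainMap()

        def __getitem__(self, key):         # (2) read by string key
            return self._map[key]

        def __setitem__(self, key, value):  # (3) write by string key
            self._map[key] = value

        def __contains__(self, key):
            return key in self._map

        @contextlib.contextmanager
        def copy(self):                     # (5) isolated child context
            saved = self._map
            self._map = saved.new_child()   # writes shadow the parent
            try:
                yield self
            finally:
                self._map = saved

        @contextlib.contextmanager
        def empty(self):                    # (4) fresh, unrelated context
            saved = self._map
            self._map = ChainMap()
            try:
                yield self
            finally:
                self._map = saved

    env = Env()                             # (1) the current writable context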
Sample code:
olduser = env["username"]
env["reason"] = "Spanish Inquisition"
with env.copy():
    env["username"] = "secret admin"
    foo()
    print("debugging", env["foodebug"])
    bar()
    with env.empty():
        assert "username" not in env
assert env["username"] is olduser
-jJ