Good day all,
as a continuation of the thread "OS related file operations (copy, move,
delete, rename...) should be placed into one module"
https://mail.python.org/pipermail/python-ideas/2017-January/044217.html
please consider making pathlib a central file system module by moving
file operations (copy, move, delete, rmtree, etc.) into pathlib.
BR,
George
> On Friday, April 6, 2018 at 8:14:30 AM UTC-7, Guido van Rossum wrote:
> On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor <peter.ed...(a)gmail.com> wrote:
>> So some more humble proposals would be:
>>
>> 1) An initializer to itertools.accumulate
>> functools.reduce already has an initializer; I can't see any controversy in adding an initializer to itertools.accumulate
>
> See if that's accepted in the bug tracker.
It did come up once but was closed for a number of reasons, including lack of use cases. However, Peter's signal processing example does sound interesting, so we could re-open the discussion.
For those who want to think through the pluses and minuses, I've put together a Q&A as food for thought (see below). Everybody's design instincts are different -- I'm curious what you all think about the proposal.
Raymond
---------------------------------------------
Q. Can it be done?
A. Yes, it wouldn't be hard.
import operator

_sentinel = object()

def accumulate(iterable, func=operator.add, start=_sentinel):
    it = iter(iterable)
    if start is _sentinel:
        try:
            total = next(it)
        except StopIteration:
            return
    else:
        total = start
    yield total
    for element in it:
        total = func(total, element)
        yield total
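For example, assuming the semantics of the sketch above (the start value, when given, is emitted first):

>>> list(accumulate([1, 2, 3]))
[1, 3, 6]
>>> list(accumulate([1, 2, 3], start=10))
[10, 11, 13, 16]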
Q. Do other languages do it?
A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
* http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.…
* https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
* http://microapl.com/apl/apl_concepts_chapter5.html
+\ 1 2 3 4 5
1 3 6 10 15
* https://reference.wolfram.com/language/ref/Accumulate.html
* https://www.haskell.org/hoogle/?hoogle=mapAccumL
Q. How much work is it for a person to do this currently?
A. Almost zero effort to write a simple helper function:

from itertools import accumulate, chain

myaccum = lambda it, func, start: accumulate(chain([start], it), func)
Q. How common is the need?
A. Rare.
Q. Which would be better, a simple for-loop or a customized itertool?
A. The itertool is shorter but more opaque (especially with respect
to the argument order for the function call):
result = [start]
for x in iterable:
    y = func(result[-1], x)
    result.append(y)
versus:
result = list(accumulate(iterable, func, start=start))
Q. How readable is the proposed code?
A. Look at the following code and ask yourself what it does:
accumulate(range(4, 6), operator.mul, start=6)
Now test your understanding:
How many values are emitted?
What is the first value emitted?
Are the two sixes related?
What is this code trying to accomplish?
Q. Are there potential surprises or oddities?
A. Is it readily apparent which of the following assertions will succeed?
a1 = sum(range(10))
a2 = sum(range(10), 0)
assert a1 == a2

a3 = functools.reduce(operator.add, range(10))
a4 = functools.reduce(operator.add, range(10), 0)
assert a3 == a4

a5 = list(accumulate(range(10), operator.add))
a6 = list(accumulate(range(10), operator.add, start=0))
assert a5 == a6
Q. What did the Python 3.0 Whatsnew document have to say about reduce()?
A. "Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable."
Q. What would this look like in real code?
A. We have almost no real-world examples, but here is one from a StackExchange post:
# assumes: from itertools import cycle, count, plus the proposed accumulate(..., start=...)
def wsieve():       # wheel-sieve, by Will Ness.  ideone.com/mqO25A->0hIE89
    wh11 = [ 2,4,2,4,6,2,6,4,2,4,6,6, 2,6,4,2,6,4,6,8,4,2,4,2,
             4,8,6,4,6,2,4,6,2,6,6,4, 2,4,6,2,6,4,2,4,2,10,2,10]
    cs = accumulate(cycle(wh11), start=11)
    yield(next(cs))         # cf. ideone.com/WFv4f
    ps = wsieve()           # codereview.stackexchange.com/q/92365/9064
    p = next(ps)            # 11
    psq = p*p               # 121
    D = dict(zip(accumulate(wh11, start=0), count(0)))  # start from
    sieve = {}
    for c in cs:
        if c in sieve:
            wheel = sieve.pop(c)
            for m in wheel:
                if m not in sieve:
                    break
            sieve[m] = wheel        # sieve[143] = wheel@187
        elif c < psq:
            yield c
        else:   # (c==psq)
            # map (p*) (roll wh from p) = roll (wh*p) from (p*p)
            x = [p*d for d in wh11]
            i = D[(p-11) % 210]
            wheel = accumulate(cycle(x[i:] + x[:i]), start=psq)
            p = next(ps); psq = p*p
            next(wheel); m = next(wheel)
            sieve[m] = wheel
I'm not sure if there is any interest from others, but I have frequently come across cases where I would like to compare the items in one list against another, similar to relational algebra. For example: are the file names in A also in B, and if so, return a new list with those items? Long story short, I wrote some functions to do that. They are quite simple and fast (due to timsort in no small part). Even the plain Python code is faster than the built-in set operations (afaik). I created a GitHub repository and put the ones I thought the community might like in there: https://github.com/ponderpanda/listex
An example would be:

a = [1, 2, 3, 4, 5, 6]
b = [1, 3, 7, 4]
list_intersection(a, b, sorta=True, sortb=True)

returns [1, 3, 4].
The complexity ends up being roughly linear in the longer of the two lists. Once they grow beyond a few thousand items, the speed difference compared with set() operations becomes significant. There are some details about naming, unique values (not necessary), and sort order that would probably need to be ironed out if they were to be included with the built-in list type. I'm not qualified to do that work, but I'd be happy to help.
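For illustration, here is a minimal merge-based sketch of the idea (the listex repository's actual implementation may differ; sorta/sortb are assumed to mean "sort this input first", mirroring the example above):

def list_intersection(a, b, sorta=False, sortb=False):
    if sorta:
        a = sorted(a)
    if sortb:
        b = sorted(b)
    out = []
    i = j = 0
    while i < len(a) and j < len(b):   # single linear pass over both lists
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:                          # common item found
            out.append(a[i])
            i += 1
            j += 1
    return out

print(list_intersection([1, 2, 3, 4, 5, 6], [1, 3, 7, 4], sorta=True, sortb=True))
# [1, 3, 4]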
Best Regards,
Richard Higginbotham
I'm parsing configs for domain filtering rules, and they come as a list.
However, str.endswith requires a tuple. So I need to use
str.endswith(tuple(list)). I don't know the reasoning for this, but why
not just accept a list as well?
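For context, the current workaround looks something like this (the names below are hypothetical):

suffixes = [".example.com", ".example.org"]   # loaded from the config
domains = ["a.example.com", "b.example.net"]

# str.endswith accepts a str or a tuple of str, but not a list,
# so the list has to be converted first:
blocked = [d for d in domains if d.endswith(tuple(suffixes))]
print(blocked)   # ['a.example.com']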
I keep finding myself needing to test for objects that support
subscripting. This is one case where EAFP is *not* actually easier:
try:
    obj[0]
except TypeError:
    subscriptable = False
except (IndexError, KeyError):
    subscriptable = True
else:
    subscriptable = True

if subscriptable:
    ...
But I don't like manually testing for it like this:
if getattr(obj, '__getitem__', None) is not None: ...
because it is wrong. (It's wrong because an object with __getitem__
defined as an instance attribute isn't subscriptable; it has to be defined
on the class, or a superclass.)
But doing it correctly is too painful:
if any(getattr(T, '__getitem__', None) is not None for T in type(obj).mro()): ...
and besides I've probably still got it wrong in some subtle way. What
I'd really like to do is use the collections.abc module to do the check:
if isinstance(obj, collections.abc.Subscriptable): ...
in the same way we can check for Sized, Hashable etc.
Alternatively, if we had a getclassattr that skipped the instance
attributes, I could say:
if getclassattr(obj, '__getitem__', None) is not None: ...
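For what it's worth, here is a minimal sketch of such a getclassattr, assuming it simply walks the MRO and ignores instance attributes (the name and signature are hypothetical):

_missing = object()

def getclassattr(obj, name, default=_missing):
    # Look the attribute up on the class and its superclasses only,
    # deliberately skipping anything set on the instance itself.
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    if default is _missing:
        raise AttributeError(name)
    return default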
(1) Am I doing it wrong? Perhaps I've missed some already existing
solution to this.
(2) If not, is there any reason why we shouldn't add Subscriptable to
the collection.abc module? I think I have the implementation:
from abc import ABCMeta, abstractmethod

class Subscriptable(metaclass=ABCMeta):
    __slots__ = ()

    @abstractmethod
    def __getitem__(self, idx):
        return None

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Subscriptable:
            return _check_methods(C, "__getitem__")
        return NotImplemented
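To try this outside the collections.abc module itself, here is a minimal stand-in for the module's private _check_methods helper (same logic as the real one), followed by the checks I'd expect to hold:

def _check_methods(C, *methods):
    # The method must appear somewhere in the class's MRO
    # and not be explicitly set to None.
    mro = C.__mro__
    for method in methods:
        for B in mro:
            if method in B.__dict__:
                if B.__dict__[method] is None:
                    return NotImplemented
                break
        else:
            return NotImplemented
    return True

assert isinstance([], Subscriptable)
assert isinstance({}, Subscriptable)
assert not isinstance(42, Subscriptable)

class Sneaky:
    pass

s = Sneaky()
s.__getitem__ = lambda idx: None          # instance attribute only
assert not isinstance(s, Subscriptable)   # correctly rejected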
Comments, questions, flames?
--
Steven
Among other places, python-ideas was recommended as a place to go.
In the meantime I have been discussing this on pypa/pip (mainly), and also on wheel and packaging. I even submitted PRs, but the PRs are only needed if the tag is inadequate.
So, part of my reason for being here is to figure out whether I can call the current tag algorithm a bug, or whether correcting it is a feature.
I am trying to approach this the Python way rather than be seen as a bull in a china shop.
Thanks for replying!
Sent from my iPhone
It's pretty wasteful to use a dynamic storage dictionary to hold the data
of a "struct-like data container".
Users can currently add `__slots__` manually to a `@dataclass` class,
but that means you can no longer use default values, and the manual typing
gets very tedious.
I compared the RAM usage and benchmarked the popular attrs library vs.
dataclass, and saw the following result: slots win heavily in the memory
usage department, regardless of whether you use dataclass or attrs. And
a dataclass with manually written slots uses 8 bytes less than
attrs-with-slots (a static number; it does not change based on how many
fields the class has). But dataclass loses with its lack of features, lack
of default values if slots are used, and its tedious way of writing slots
manually (see class "D").
Here are the per-instance sizes in bytes for the classes:
```
attrs size 512
attrs-with-slots size 200
dataclass size 512
dataclass-with-slots size 192
```
As for data access benchmarks: the results varied too much between runs to
draw any conclusions, except to say that slots were slightly faster than
dictionary-based storage, and that there's no real difference between the
dataclass and attrs libraries in access speed.
Here is the full benchmark code:
```
import attr
from dataclasses import dataclass
from pympler import asizeof
import time

# every additional field adds 88 bytes
@attr.s
class A:
    a = attr.ib(type=int, default=0)
    b = attr.ib(type=int, default=4)
    c = attr.ib(type=int, default=2)
    d = attr.ib(type=int, default=8)

# every additional field adds 40 bytes
@attr.s(slots=True)
class B:
    a = attr.ib(type=int, default=0)
    b = attr.ib(type=int, default=4)
    c = attr.ib(type=int, default=2)
    d = attr.ib(type=int, default=8)

# every additional field adds 88 bytes
@dataclass
class C:
    a: int = 0
    b: int = 4
    c: int = 2
    d: int = 8

# every additional field adds 40 bytes
@dataclass
class D:
    __slots__ = {"a", "b", "c", "d"}
    a: int
    b: int
    c: int
    d: int

Ainst = A()
Binst = B()
Cinst = C()
Dinst = D(0, 4, 2, 8)
print("attrs size", asizeof.asizeof(Ainst))                  # 512 bytes
print("attrs-with-slots size", asizeof.asizeof(Binst))       # 200 bytes
print("dataclass size", asizeof.asizeof(Cinst))              # 512 bytes
print("dataclass-with-slots size", asizeof.asizeof(Dinst))   # 192 bytes

s = time.perf_counter()
for i in range(250000000):
    x = Ainst.a
elapsed = time.perf_counter() - s
print("elapsed attrs:", elapsed * 1000, "milliseconds")

s = time.perf_counter()
for i in range(250000000):
    x = Binst.a
elapsed = time.perf_counter() - s
print("elapsed attrs-with-slots:", elapsed * 1000, "milliseconds")

s = time.perf_counter()
for i in range(250000000):
    x = Cinst.a
elapsed = time.perf_counter() - s
print("elapsed dataclass:", elapsed * 1000, "milliseconds")

s = time.perf_counter()
for i in range(250000000):
    x = Dinst.a
elapsed = time.perf_counter() - s
print("elapsed dataclass-with-slots:", elapsed * 1000, "milliseconds")
```
Also note that it IS possible to annotate attrs classes using PEP 526
annotations (i.e. `a: int = 0` instead of `a = attr.ib(type=int, default=0)`),
but then you lose out on a bunch of the extra features that are
specified as named parameters to attr.ib (such as validators, kw_only
parameters, etc.).
Anyway, the gist of everything is: Slots heavily beat dictionaries,
reducing the RAM usage to less than half of the current dataclass
implementation.
My proposal: implement `@dataclass(slots=True)`, which does the same thing
as attrs: replace the class with a modified class that has a `__slots__`
attribute instead of a `__dict__`, while fully supporting default values
in the process.
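As a proof of concept, here is a rough sketch of how a decorator could retrofit slots onto a dataclass while keeping default values. It follows the known rebuild-the-class recipe and only illustrates what `@dataclass(slots=True)` might do internally; `add_slots` is a name invented here:
```
from dataclasses import dataclass, fields

def add_slots(cls):
    # Build a copy of the class dict with __slots__ generated from the
    # dataclass fields. Class-level defaults are removed because they
    # would shadow the slot descriptors; the generated __init__ already
    # carries the default values, so nothing is lost.
    cls_dict = dict(cls.__dict__)
    field_names = tuple(f.name for f in fields(cls))
    cls_dict["__slots__"] = field_names
    for name in field_names:
        cls_dict.pop(name, None)
    cls_dict.pop("__dict__", None)       # drop the old __dict__ descriptor
    cls_dict.pop("__weakref__", None)
    new_cls = type(cls.__name__, cls.__bases__, cls_dict)
    new_cls.__qualname__ = cls.__qualname__
    return new_cls

@add_slots
@dataclass
class E:
    a: int = 0
    b: int = 4

e = E()
assert not hasattr(e, "__dict__")   # instances now use slots
assert (e.a, e.b) == (0, 4)         # defaults still work
```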
Hello,
So when coding, at different stages we need different levels of error handling: for example, at the proof-of-concept stage, initial implementation, alpha, release testing, etc.
With some bits of code we can simulate enabling/disabling error handling.
But it would be very helpful to have such a thing built into the language, with some more pythonic salt 😁.
For example, assume this behavior:
************
SetExceptionLevel(50)
try:
    x = 1 / 0
except(0) Exception as e:
    print(e)
except(40) ZeroDivisionError as e1:
    x = math.inf
except(40) ValueError as e2:
    x = None
except(60) Exception as e3:
    raise e3
****************
That is, one exception can have more than one handler, and each handler has a level that enables it.
With the level set to 50, only the first two handlers run in this example;
if we want the last one to run as well, we would change the first line to "SetExceptionLevel(70)", for example.
The third handler, for ValueError, will never be triggered here.
And if no handler meets the requirements (level and exception type), the error is raised outward.
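For comparison, here is a minimal sketch of how this could be simulated in today's Python. EXCEPTION_LEVEL, handle_leveled, and the handler-table layout are all names invented for this illustration:

import math

EXCEPTION_LEVEL = 50          # plays the role of SetExceptionLevel(50)

def reraise(e):
    raise e

def handle_leveled(exc, handlers, level=EXCEPTION_LEVEL):
    # Run every handler whose level is enabled and whose type matches;
    # re-raise if no enabled handler matched.
    matched = False
    for min_level, exc_type, handler in handlers:
        if level >= min_level and isinstance(exc, exc_type):
            handler(exc)
            matched = True
    if not matched:
        raise exc

results = {}
try:
    x = 1 / 0
except Exception as e:
    handle_leveled(e, [
        (0, Exception, print),                                          # except(0) Exception
        (40, ZeroDivisionError, lambda e: results.update(x=math.inf)),  # except(40) ZeroDivisionError
        (40, ValueError, lambda e: results.update(x=None)),             # except(40) ValueError
        (60, Exception, reraise),                                       # except(60) Exception
    ])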
Hello,
Several libraries have complex objects but no comparison operators for them
other than "is", which only checks whether we are comparing an object with itself.
It would be quite nice to be able to compare any two objects together.
I made this function in Python as a starting point:
https://gist.github.com/SebastienEske/5a9c04e718becd93b7928514e80f0211
I know that it needs some improvement to protect against infinite loops, and
I don't like the hardcoded check for strings; it should probably be a
parameter that lets the function directly compare certain types.
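For reference, here is a minimal sketch of the general shape such a function could take. This is not the gist's code; deep_eq and its cycle guard are invented here purely for illustration:

def deep_eq(a, b, _seen=None):
    # Recursively compare two objects by type, contents, and attributes.
    if _seen is None:
        _seen = set()
    key = (id(a), id(b))
    if key in _seen:               # guard against reference cycles
        return True
    _seen.add(key)
    if type(a) is not type(b):
        return False
    if isinstance(a, (int, float, complex, str, bytes, bool, type(None))):
        return a == b
    if isinstance(a, dict):
        return (a.keys() == b.keys()
                and all(deep_eq(a[k], b[k], _seen) for k in a))
    if isinstance(a, (set, frozenset)):
        return a == b
    if isinstance(a, (list, tuple)):
        return (len(a) == len(b)
                and all(deep_eq(x, y, _seen) for x, y in zip(a, b)))
    # fall back to comparing instance attributes, then plain equality
    da, db = getattr(a, "__dict__", None), getattr(b, "__dict__", None)
    if da is not None and db is not None:
        return deep_eq(da, db, _seen)
    return a == b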
What do you think?
Best regards,
*Sébastien Eskenazi*
<https://www.pixelz.com>
Pixelz
15th Floor, Detech Tower 2 Building, 107 Nguyen Phong Sac,
Hanoi, Vietnam
<https://www.linkedin.com/in/sebastieneskenazi/>
Idea: how about an alternative syntax specifically for argument list
definitions, to be able to write down a dict holding an argument list in
a simpler form, namely with function-like syntax:

mystyle = dict ** (linestyle="dotted", marker="s", color=(0.1, 0.1, 0.0), markersize=4)

as a synonym for the common

mystyle = {"linestyle": "dotted", "marker": "s", "color": (0.1, 0.1, 0.0), "markersize": 4}

(The example is from Matplotlib plot line styles.)
Pros: the same look as function arguments, thus less visual confusion and
less re-typing. With such syntax I could simply copy-paste, say, examples
from library docs into a separate dict "mystyle", ready to insert into
other functions via dict unpacking. If I need to move some argument from
"mystyle" into the function call, no re-typing is needed. It also looks
significantly cleaner, with fewer quotes and fewer colons.
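For comparison, here is a short sketch of what is already possible today with the plain dict() constructor, which accepts keyword arguments directly (the plot() stand-in below is hypothetical):

mystyle = dict(linestyle="dotted", marker="s", color=(0.1, 0.1, 0.0), markersize=4)

def plot(xs, ys, *, linestyle="solid", marker="", color=(0, 0, 0), markersize=1):
    # stand-in for a plotting call such as Matplotlib's plot()
    print(linestyle, marker, color, markersize)

plot([1, 2, 3], [4, 5, 6], **mystyle)   # dict unpacking into the call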
Mikhail