For basic slices, the normal "slice(start, stop, step)" syntax works well.
But creating more complicated slices that you want to reuse across
multiple multidimensional data structures (NumPy, pandas, xarray, etc.)
quickly becomes verbose.
One idea I had was to allow creating slices by using indexing on the slice
class. So for example:
x = slice[5:1:-1, 10:20:2, 5:]
Would be equivalent to:
x = (slice(5, 1, -1), slice(10, 20, 2), slice(5, None))
Note that this wouldn't be done on a slice instance, it would be done on
the slice class. The basic idea is that it would simply return whatever is
given to __getitem__.
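For what it's worth, NumPy already ships helpers with these semantics
(np.s_ and np.index_exp), and the idea can be sketched today with a plain
helper class; the name "S" below is illustrative only, and
__class_getitem__ needs Python 3.7+:

class S:
    # __class_getitem__ receives exactly what was written between the
    # brackets (a slice, or a tuple of slices) and returns it as-is.
    def __class_getitem__(cls, item):
        return item

x = S[5:1:-1, 10:20:2, 5:]
print(x)  # (slice(5, 1, -1), slice(10, 20, 2), slice(5, None, None))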
Aargh, I hate Google Groups with a vengeance. If people *have* to post
from there, can they please change the reply-to so that replies don't get
messed up? Or is that not possible, and yet another way that GG is just
broken?
Paul
---------- Forwarded message ----------
From: Paul Moore <p.f.moore(a)gmail.com>
Date: 22 July 2018 at 12:05
Subject: Re: [Python-ideas] PEP 505: None-aware operators
To: Grégory Lielens <gregory.lielens(a)gmail.com>
Cc: python-ideas <python-ideas(a)googlegroups.com>
On 22 July 2018 at 11:54, Grégory Lielens <gregory.lielens(a)gmail.com> wrote:
> Except that the third possibility is not possible...if a is None, a[2] will throw an exception...
> For now at least ;-)
Doh. True, I should have said "If a is not None and a[2] is not None, use a[2]".
But my point about unintended behaviour if a[2] is None stands. And the
wider point, that these operators are hard to reason about correctly, is
probably emphasised by my mistake.
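A quick desugaring (my own sketch of PEP 505's approximate semantics)
shows the trap:

a = [0, 1, None]       # a exists, and a[2] is deliberately None
default = 'fallback'

maybe = None if a is None else a[2]          # roughly a?[2]
val = default if maybe is None else maybe    # roughly maybe ?? default

print(val)  # 'fallback', even though a itself was not None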
Paul
This was previously proposed here in 2014
<https://mail.python.org/pipermail/python-ideas/2014-January/025091.html>,
but the discussion fizzled out. To me, str.rreplace() is an obvious and
necessary complement to str.replace(), just as str.rsplit() is a
complement to str.split(). It would bring Python closer to the goal of
the Zen of Python: "There should be one-- and preferably only one
--obvious way to do it." To support its usefulness: this question has
been asked on Stack Overflow a number of times
(<https://stackoverflow.com/q/2556108>,
<https://stackoverflow.com/q/14496006>,
<https://stackoverflow.com/q/9943504>),
with a non-trivial number of votes for the answers (>100 for the top
answer of the first two questions, and >50 for the third; the voters are
not necessarily mutually exclusive across questions, but probably largely
are, and even assuming the worst, >100 is nothing to scoff at). While
anonymous Stack Overflow votes are not necessarily the best arbiter of
what's a good idea, they at least show interest.
My proposed behavior would be as follows (probably not these exact
implementation details; the real implementation would obviously be a
string method rather than a function, with 'self' replacing 's'):
def rreplace(s, old, new, count=-1):
    '''
    Return a copy with all occurrences of substring old replaced by new.

    count
        Maximum number of occurrences to replace.
        -1 (the default value) means replace all occurrences.

    Substrings are replaced starting at the end of the string and working
    to the front.
    '''
    return new.join(s.rsplit(old, count))
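For example, with the function above (the method form would read
s.rreplace(old, new, count)):

>>> rreplace('one two three two one', 'two', 'TWO', 1)
'one two three TWO one'
>>> 'one two three two one'.replace('two', 'TWO', 1)
'one TWO three two one'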
Addressing some (paraphrased) issues from the previous discussion, in the
form of an FAQ:
> rreplace() looks weird.
This maintains consistency with the existing trend of 'r' prefixes for
'reverse' methods. Introducing a 'reverse_' prefix would be inconsistent
and, worse, would encourage backwards-incompatible changes to existing
methods. I think such a prefix naming change warrants its own separate
discussion.
> There are already ways to do this. Why should we add another string
> method?
My main motivation is having one clear and efficient way to do this. I
explain this in greater detail in my introduction above.
> The default of count=-1 has the same behavior as str.replace(), right?
Actually, it doesn't have the same behavior, as MRAB pointed out in the
previous discussion
<https://mail.python.org/pipermail/python-ideas/2014-January/025102.html>.
The default of -1 also keeps the signature consistent with str.rsplit().
> If we're adding this, shouldn't we also add bytes.rreplace,
> bytearray.rreplace, bytearray.rremove, tuple.rindex, list.rindex, and
> list.rremove?
Honestly, I don't know. I would prefer not to dilute this discussion too
much, or start making a slippery slope argument, but if it's directly
relevant, I think any thoughts are welcome.
> Couldn't we just add a traverse_backwards parameter to str.replace()?
> In fact, why don't we just get rid of str.rfind() and str.rindex()
> entirely and just add new parameters to str.find() and str.index()?
I think Steven D'Aprano explained this well in the previous discussion
here:
<https://mail.python.org/pipermail/python-ideas/2014-January/025132.html>,
and addressed counterarguments here:
<https://mail.python.org/pipermail/python-ideas/2014-January/025135.html>.
Basically, different functions for different purposes make each purpose
clearer (no confusing or complicated parameter names and overloaded
functions), and str.rreplace() and str.replace() will usually be used in
situations where the traversal direction is known at edit time, so there
is no need to make the method determine the direction at runtime.
Thoughts? Support/oppose?
While I am aware of projects like Cython and mypy, it seems to make sense
for CPython to allow optional enforcement of type hints, with related
compiler optimizations. While this would not receive the same level of
performance benefit as using ctypes directly, there do appear to be
various gains available here.
My main concern is how to make enforcement opt-in: for maximum benefit,
it shouldn't prevent you from type-hinting functions that you don't want
treated this way. While I don't have an answer for that yet, I thought
I'd toss the idea out there first. If it needs to be seen in action
before deciding whether it makes sense to add, I can work on a potential
implementation soon, but right now this is just an idea.
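As a very rough sketch of what opt-in enforcement could mean
semantically, here is a runtime check via a decorator. The decorator
name is made up, only plain classes are handled, and a real CPython
version would presumably live in the compiler rather than in a wrapper:

import functools
import inspect
import typing

def enforce_types(func):
    hints = typing.get_type_hints(func)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only check plain classes; typing generics need more work.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError('%s must be %s, not %s' % (
                    name, expected.__name__, type(value).__name__))
        return func(*args, **kwargs)
    return wrapper

@enforce_types
def double(x: int) -> int:
    return x * 2

double(2)    # fine
double('2')  # TypeError: x must be int, not str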
On Sat, Jul 21, 2018 at 11:05:43AM +0100, Daniel Moisset wrote:
[snip interesting and informative discussion, thank you]
> @Steven D'Aprano: you mentioned something about race conditions but I don't
> think this algorithm has any (the article you linked just said that doing
> refcounting in the traditional way and without synchronization in a multi
> core environment has race conditions, which is not what is being discussed
> here). Could you expand on this?
Certainly not. I already said that my view of this was very naive.
Now that I have a better understanding that Jonathan doesn't have a
specific issue to solve, other than "improved performance through better
ref counting", I'll most likely just sit back and lurk.
--
Steve
In the past I have personally viewed Python as difficult to use for
parallel applications, which need to do multiple things simultaneously
for increased performance:
* The old Threads, Locks, & Shared State model is inefficient in Python
due to the GIL, which limits CPU usage to only one thread at a time
(ignoring certain functions implemented in C, such as I/O).
* The Actor model can be used with some effort via the “multiprocessing”
module, but it doesn’t seem that streamlined and forces there to be a
separate OS process per line of execution, which is relatively expensive.
I was thinking it would be nice if there was a better way to implement
the Actor model, with multiple lines of execution in the same process,
yet avoiding contention from the GIL. This implies a separate GIL for
each line of execution (to eliminate contention) and a controlled way to
exchange data between different lines of execution.
So I was thinking of proposing a design for implementing such a system.
Or at least get interested parties thinking about such a system.
With some additional research I notice that [PEP 554] (“Multiple
subinterpreters in the stdlib”) appears to be putting forward a design
similar to the one I described. I notice however it mentions that
subinterpreters currently share the GIL, which would seem to make them
unusable for parallel scenarios due to GIL contention.
I'd like to solicit some feedback on what might be the most efficient
way to make forward progress on efficient parallelization in Python
inside the same OS process. The most promising areas appear to be:
1. Make the current subinterpreter implementation in Python have more
complete isolation, sharing almost no state between subinterpreters. In
particular not sharing the GIL. The "Interpreter Isolation" section of
PEP 554 enumerates areas that are currently shared, some of which
probably shouldn't be.
2. Give up on making things work inside the same OS process and rather
focus on implementing better abstractions on top of the existing
multiprocessing API so that the actor model is easier to program
against. For example, providing some notion of Channels to communicate
between lines of execution, a way to monitor the number of Messages
waiting in each channel for throughput profiling and diagnostics,
Supervision, etc. (a rough sketch follows below). In particular I could
do this by using an existing library like Pykka or Thespian and extending
it where necessary.
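As a rough sketch of option 2 (the Channel class and its method names
are illustrative assumptions, not an existing API), built on nothing but
the existing multiprocessing module:

import multiprocessing as mp

class Channel:
    """One-way message channel between lines of execution."""
    def __init__(self):
        self._queue = mp.Queue()

    def send(self, message):
        self._queue.put(message)

    def receive(self):
        return self._queue.get()

    def pending(self):
        # Approximate number of waiting messages, for throughput
        # diagnostics (note: qsize() is unimplemented on macOS).
        return self._queue.qsize()

def actor(inbox, outbox):
    # An actor owns its state and communicates only through channels.
    for message in iter(inbox.receive, None):  # None is a stop sentinel
        outbox.send(message * 2)

if __name__ == '__main__':
    inbox, outbox = Channel(), Channel()
    worker = mp.Process(target=actor, args=(inbox, outbox))
    worker.start()
    inbox.send(21)
    print(outbox.receive())  # 42
    inbox.send(None)
    worker.join()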
Thoughts?
[PEP 554]: https://www.python.org/dev/peps/pep-0554/
--
David Foster | Seattle, WA, USA
The goal of this idea is to make it easier to find out when someone has
installed packages for the wrong python installation. I'm coming across
quite a few StackOverflow posts and emails where beginners are using pip to
install a package, but then finding they can't import it because they have
multiple python installations and used the wrong pip.
For example, this guy has this problem:
https://stackoverflow.com/questions/37662012/which-pip-is-with-which-python
I'd propose adding a simple line to the output of "pip install" that
changes this:
user@user:~$ pip3 install pyperclip
Collecting pyperclip
Installing collected packages: pyperclip
Successfully installed pyperclip-1.6.2
...to something like this:
user@user:~$ pip3 install pyperclip
Running pip for /usr/bin/python3
Collecting pyperclip
Installing collected packages: pyperclip
Successfully installed pyperclip-1.6.2
This way, when they copy/paste their output to StackOverflow, it'll be
somewhat more obvious to their helper that they used pip for the wrong
python installation.
This info would also be useful in the output of "pip info", but that
would break scripts that read that output.
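(In the meantime, "pip3 --version" already reports which interpreter a
given pip is bound to, with output along these lines, details varying by
system:

user@user:~$ pip3 --version
pip 9.0.1 from /usr/lib/python3/dist-packages/pip (python 3.6)

and running "python3 -m pip install pyperclip" sidesteps the ambiguity
entirely.)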
Any thoughts?
-Al
On Thu, Jul 19, 2018 at 10:07 AM, Stephan Houben <stephanh42(a)gmail.com>
wrote:
> You are aware of numba?
>
> https://numba.pydata.org/
>
> Stephan
>
> On Thu, 19 Jul 2018 at 16:03, Eric Fahlgren <ericfahlgren(a)gmail.com> wrote:
>
>> On Thu, Jul 19, 2018 at 6:52 AM Michael Hall
>> <python-ideas(a)michaelhall.tech> wrote:
>>
>>> While I am aware of projects like Cython and mypy, it seems to make
>>> sense for CPython to allow optional enforcement of type hints, with
>>> related compiler optimizations. While this would not receive the same
>>> level of performance benefit as using ctypes directly, there do appear
>>> to be various gains available here.
>>>
>>
>> Just to make sure I understand: In other words, they would no longer be
>> "hints" but "guarantees". This would allow an optimizer pass much greater
>> latitude in code generation, somehow or other.
>>
>> For purposes of illustration (this is not a proposal, just for
>> clarification):
>>
>> @guaranteed_types
>> def my_sqrt(x:c_double) -> c_double:
>> ...
>>
>> would tell the compiler that it's now possible to replace the general
>> PyObject marshalling of this function with a pure-C one that only accepts
>> doubles and woe be unto those who use it otherwise.
>>
Less so than I probably should have been, given the idea I'm pitching.
I've given it a quick look, and at a glance it already seems capable of
the desired behavior, specifically centered around performance. I do
still think it *may* be beneficial to have this in the CPython reference
implementation, alongside a standard grammar, which could further enable
the various libraries and Python implementations to make use of the
knowledge that a type hint is intended as more than a hint.
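For the record, numba's eager compilation mode is already quite close to
the "more than a hint" idea; a small sketch, assuming numba is installed:

from numba import jit, float64

# Compile eagerly for float64 only; unlike a plain type hint, the
# signature here is a guarantee the compiler specializes against.
@jit(float64(float64), nopython=True)
def my_sqrt(x):
    return x ** 0.5

print(my_sqrt(2.0))  # 1.4142135623730951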
Hi,
I use list and dict comprehensions a lot, and a problem I often have is
doing the equivalent of a group_by operation (to use SQL terminology).
For example, if I have a list of (student, school) tuples and I want the
list of students by school, the only option I'm left with is to write:

from collections import defaultdict

student_by_school = defaultdict(list)
for student, school in student_school_list:
    student_by_school[school].append(student)
What I would expect is a comprehension syntax allowing me to write
something along the lines of:

student_by_school = {group_by(school): student
                     for school, student in student_school_list}

or any other syntax that allows me to regroup items from an iterable.
Small FAQ:
Q: Why include something in comprehensions when you can do it in a small
number of lines?
A: A really appreciable part of list and dict comprehensions is that they
let the developer be explicit about what a given line is meant to do.
If you see a comprehension, you know the developer wanted to produce an
iterable with no side effect other than depleting the iterator (assuming
reasonable code guidelines are respected).
Initializing an object and filling it with a for loop is both longer and
less explicit about what is intended. That pattern should be reserved for
intrinsically complex operations, not for one of the base operations one
might want to perform on lists and dicts.
Q: Why group by in particular?
A: If we take SQL queries (https://en.wikipedia.org/wiki/SQL_syntax#Queries)
as a reasonable picture of how people need to manipulate data on a
day-to-day basis, we can see that dict comprehensions already cover most
of the base operations, the only missing ones being GROUP BY and HAVING.
Q: Why not use it on lists, with a syntax such as the following?
student_by_school = [
    school, student
    for school, student in student_school_list
    group by school
]
A: It would create either a discrepancy with iterators or a perhaps
misleading semantic (that of itertools.groupby, which requires the
iterable to be sorted in order to be useful).
Having the option to do it with a dict removes any ambiguity and should be
enough to cover most "group by" applications.
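To illustrate that pitfall: with today's itertools.groupby, unsorted
input silently yields fragmented groups:

from itertools import groupby

pairs = [('fruit', 'orange'), ('meat', 'eggs'), ('fruit', 'apple')]
print([(key, [v for _, v in grp])
       for key, grp in groupby(pairs, key=lambda p: p[0])])
# [('fruit', ['orange']), ('meat', ['eggs']), ('fruit', ['apple'])]
# 'fruit' appears twice because groupby only merges adjacent keys.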
Examples:
edible_list = [('fruit', 'orange'), ('meat', 'eggs'), ('meat', 'spam'),
               ('fruit', 'apple'), ('vegetable', 'fennel'),
               ('fruit', 'pineapple'), ('fruit', 'pineapple'),
               ('vegetable', 'carrot')]
edible_list_by_food_type = {group_by(food_type): edible
                            for food_type, edible in edible_list}
print(edible_list_by_food_type)
{'fruit': ['orange', 'apple', 'pineapple', 'pineapple'],
 'meat': ['eggs', 'spam'], 'vegetable': ['fennel', 'carrot']}
bank_transactions = [200.0, -357.0, -9.99, -15.6, 4320.0, -1200.0]
split_bank_transactions = {group_by('credit' if amount > 0 else 'debit'): amount
                           for amount in bank_transactions}
print(split_bank_transactions)
{'credit': [200.0, 4320.0], 'debit': [-357.0, -9.99, -15.6, -1200.0]}
--
Nicolas Rolin
In the CPython repository, there is an unparse module in the Tools section.
https://github.com/python/cpython/blob/master/Tools/parser/unparse.py
However, as it is not part of the standard library, it cannot be easily
used; to do so, one needs to make a local copy in a place from where it can
be imported.
This module can be useful for people using the ast module to create and
parse trees, modify them ... and who want to convert the result back into
source code. Since it is obviously maintained to be compatible with the
current Python version, would it be possible to include the unparse module
in the standard library?
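For what it's worth, the intended round-trip looks roughly like this,
assuming unparse.py has been copied somewhere importable (its Unparser
class writes the source for a tree to a file object):

import ast
import io
import unparse  # a local copy of Tools/parser/unparse.py

tree = ast.parse('x = 1 + 2')
tree.body[0].targets[0].id = 'y'  # modify the tree: rename the target

buf = io.StringIO()
unparse.Unparser(tree, file=buf)
print(buf.getvalue())  # y = 1 + 2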
André Roberge