Mailman 3 February 2014 - Python-ideas

Caching iterators
by Ryan Gonzalez Feb. 28, 2014

Note: I PROMISE this is a better idea than my last 20 bad ones (raise_if, shlex extra argument, etc.).

I'll use an example to illustrate the idea first. Let's use a completely non-realistic and contrived example. Say you have a lexer that's really slow. Now, this lexer might be a function that uses generators, i.e.:

    def mylexer(input):
        while input:
            ...
            if xyz:
                yield SomeToken()

Now, if we have a parser that uses that lexer, it always has to wait for the lexer to yield the next token *after* it has already parsed the current token. That can be somewhat time consuming.

Caching iterators are based on the idea: what if the iterator is running at the same time as the function getting the iterator elements? Or, better yet, it's an iterator wrapper that takes an iterator and continues to take its elements while the function that uses the iterator is running? This is easily accomplished using multiprocessing and pipes.

Since that was somewhat vague, here's an example:

    import time

    def my_iterator():
        for i in range(0,5):
            time.sleep(0.2)
            yield i

    for item in my_iterator():
        time.sleep(0.5)
        print(item)

Now with a normal iterator, the flow is like this:

- Wait for my_iterator to return an element (0.2s)
- Wait 0.5s and print the element (0.5s)

In total, that takes 0.7s per element. What a waste! What if the iterator was yielding elements at the same time as the for loop was using them? Well, for every for loop iteration, the iterator could generate ~2.5 elements (0.5s / 0.2s). That's what a caching iterator does. It runs both at the same time using multiprocessing. It's thread safe as long as the iterator doesn't depend on whatever is using it. An example:

    def my_iterator():
        for i in range(0,5):
            time.sleep(0.2)
            yield i

    for item in itertools.CachingIterator(my_iterator()): # this is the only change
        time.sleep(0.5)
        print(item)

Now the flow is like this:

- Wait for my_iterator to return the very first element.
- While that first element is looped over, continue receiving elements from my_iterator(), storing them in an intermediate space (similar to a deque).
- When the loop is completed, take the next element from the intermediate space and loop over it.
- While that element is looped over, continue receiving elements...

...and so forth. That way, time isn't wasted waiting for the loop to finish.

I have a working implementation. Although there is a very slight overhead, in the above example, about 0.4s is still saved.

There could also be an lmap function, which just does this:

    def lmap(f, it):
        yield from map(f, CachingIterator(it))

Thoughts?

-- 
Ryan
If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated."
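The post does not include the implementation, so the following is only a minimal sketch of the prefetching idea. It makes one loud assumption: it uses a background thread and a `queue.Queue` instead of the multiprocessing and pipes the author mentions, because a generator object cannot simply be handed to another process. The class name mirrors the proposed `CachingIterator`; the `maxsize` parameter and the `_DONE` sentinel are hypothetical additions for illustration.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the wrapped iterator


class CachingIterator:
    """Sketch only: drains `iterable` on a background thread (not a process)."""

    def __init__(self, iterable, maxsize=0):
        # maxsize=0 means the intermediate buffer is unbounded.
        self._queue = queue.Queue(maxsize)
        self._thread = threading.Thread(
            target=self._fill, args=(iter(iterable),), daemon=True
        )
        self._thread.start()

    def _fill(self, it):
        # Runs in the background thread: keep pulling elements while the
        # consumer is busy with the previous one.
        try:
            for item in it:
                self._queue.put((item, None))
        except BaseException as exc:
            # Hand the producer's exception over to the consumer.
            self._queue.put((None, exc))
        finally:
            self._queue.put((_DONE, None))

    def __iter__(self):
        return self

    def __next__(self):
        item, exc = self._queue.get()  # blocks until an element is ready
        if exc is not None:
            raise exc
        if item is _DONE:
            # Put the sentinel back so later next() calls also stop.
            self._queue.put((_DONE, None))
            raise StopIteration
        return item
```

Wrapping the `my_iterator()` example from the post in this class lets the 0.2s sleeps in the producer overlap with the 0.5s sleeps in the loop body, because `time.sleep` releases the GIL. For CPU-bound producers a thread would not give that overlap, which is presumably why the original proposal reaches for multiprocessing.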

Method chaining notation
by Chris Angelico Feb. 27, 2014

Yeah, I'm insane, opening another theory while I'm busily championing a PEP. But it was while writing up the other PEP that I came up with a possible syntax for this.

In Python, as in most languages, method chaining requires the method to return its own object.

    class Count:
        def __init__(self):
            self.n = 0
        def inc(self):
            self.n += 1
            return self

    dracula = Count()
    dracula.inc().inc().inc()
    print(dracula.n)

It's common in languages like C++ to return *this by reference if there's nothing else useful to return. It's convenient, it doesn't cost anything much, and it allows method chaining. The Python convention, on the other hand, is to return self only if there's a very good reason to, and to return None any time there's mutation that could plausibly return a new object of the same type (compare list.sort() vs sorted()). Method chaining is therefore far less common than it could be, with the result that, often, intermediate objects need to be separately named and assigned to. I pulled up one file from Lib/tkinter (happened to pick filedialog) and saw what's fairly typical of Python GUI code:

    ...
    self.midframe = Frame(self.top)
    self.midframe.pack(expand=YES, fill=BOTH)

    self.filesbar = Scrollbar(self.midframe)
    self.filesbar.pack(side=RIGHT, fill=Y)
    self.files = Listbox(self.midframe, exportselection=0,
                         yscrollcommand=(self.filesbar, 'set'))
    self.files.pack(side=RIGHT, expand=YES, fill=BOTH)
    ...

Every frame has to be saved away somewhere (incidentally, I don't see why self.midframe rather than just midframe - it's not used outside of __init__). With Tkinter, that's probably necessary (since the parent is part of the construction of the children), but in GTK, widget parenting is done in a more method-chaining-friendly fashion. Compare these examples of PyGTK and Pike GTK:

    # Cut down version of http://pygtk.org/pygtk2tutorial/examples/helloworld2.py
    import pygtk
    pygtk.require('2.0')
    import gtk

    def callback(widget, data):
        print("Hello again - %s was pressed" % data)

    def delete_event(widget, event, data=None):
        gtk.main_quit()
        return False

    window = gtk.Window(gtk.WINDOW_TOPLEVEL)
    window.set_title("Hello Buttons!")
    window.connect("delete_event", delete_event)
    window.set_border_width(10)
    box1 = gtk.HBox(False, 0)
    window.add(box1)
    button1 = gtk.Button("Button 1")
    button1.connect("clicked", callback, "button 1")
    box1.pack_start(button1, True, True, 0)
    button2 = gtk.Button("Button 2")
    button2.connect("clicked", callback, "button 2")
    box1.pack_start(button2, True, True, 0)
    window.show_all()

    gtk.main()

    //Pike equivalent of the above:
    void callback(object widget, string data) {write("Hello again - %s was pressed\n", data);}
    void delete_event() {exit(0);}

    int main()
    {
        GTK2.setup_gtk();
        object button1, button2;
        GTK2.Window(GTK2.WINDOW_TOPLEVEL)
            ->set_title("Hello Buttons!")
            ->set_border_width(10)
            ->add(GTK2.Hbox(0,0)
                ->pack_start(button1 = GTK2.Button("Button 1"), 1, 1, 0)
                ->pack_start(button2 = GTK2.Button("Button 2"), 1, 1, 0)
            )
            ->show_all()
            ->signal_connect("delete_event", delete_event);
        button1->signal_connect("clicked", callback, "button 1");
        button2->signal_connect("clicked", callback, "button 2");
        return -1;
    }

Note that in the Pike version, I capture the button objects, but not the Hbox. There's no name ever given to that box. I have to capture the buttons, because signal_connect doesn't return the object (it returns a signal ID). The more complicated the window layout, the more noticeable this is: The structure of code using chained methods mirrors the structure of the window with its widgets containing widgets; but the structure of the Python equivalent is strictly linear.

So here's the proposal. Introduce a new operator to Python, just like the dot operator but behaving differently when it returns a bound method. We can possibly use ->, or maybe create a new operator that currently makes no sense, like .. or .> or something. Its semantics would be:

1) Look up the attribute following it on the object, exactly as per the current . operator
2) If the result is not a function, return it, exactly as per current.
3) If it is a function, though, return a wrapper which, when called, calls the inner function and then returns self.

This can be done with an external wrapper, so it might be possible to do this with MacroPy. It absolutely must be a compact notation, though.

This probably wouldn't interact at all with __getattr__ (because the attribute has to already exist for this to work), and definitely not with __setattr__ or __delattr__ (mutations aren't affected). How it interacts with __getattribute__ I'm not sure; whether it adds the wrapper around any returned functions or applies only to something that's looked up "the normal way" can be decided by ease of implementation.

Supposing this were done, using the -> token that currently is used for annotations as part of 'def'. Here's how the PyGTK code would look:

    import pygtk
    pygtk.require('2.0')
    import gtk

    def callback(widget, data):
        print("Hello again - %s was pressed" % data)

    def delete_event(widget, event, data=None):
        gtk.main_quit()
        return False

    window = (gtk.Window(gtk.WINDOW_TOPLEVEL)
        ->set_title("Hello Buttons!")
        ->connect("delete_event", delete_event)
        ->set_border_width(10)
        ->add(gtk.HBox(False, 0)
            ->pack_start(
                gtk.Button("Button 1")->connect("clicked", callback, "button 1"),
                True, True, 0)
            ->pack_start(
                gtk.Button("Button 2")->connect("clicked", callback, "button 2"),
                True, True, 0)
        )
        ->show_all()
    )

    gtk.main()

Again, the structure of the code would match the structure of the window. Unlike the Pike version, this one can even connect signals as part of the method chaining.

Effectively, x->y would be equivalent to chain(x.y):

    def chain(func):
        def chainable(self, *args, **kwargs):
            func(self, *args, **kwargs)
            return self
        return chainable

Could be useful in a variety of contexts.

Thoughts?

ChrisA
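The three-step semantics can be approximated in today's Python with a proxy object. This is only a sketch of the "external wrapper" route the post mentions, not the proposed operator: the names `Chained` and `unwrap` are hypothetical, and it treats any callable attribute as a method, which is slightly broader than "is a function".

```python
class Chained:
    """Hypothetical proxy: calls made through it return the proxy, so they chain."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Step 1: ordinary attribute lookup on the wrapped object.
        attr = getattr(self._target, name)
        if not callable(attr):
            # Step 2: non-callables are returned unchanged.
            return attr

        # Step 3: wrap the method so its return value is discarded.
        def chainable(*args, **kwargs):
            attr(*args, **kwargs)
            return self

        return chainable

    def unwrap(self):
        """Return the wrapped object once the chain is finished."""
        return self._target


class Count:
    def __init__(self):
        self.n = 0

    def inc(self):
        self.n += 1  # note: returns None, the usual Python convention


dracula = Chained(Count()).inc().inc().inc().unwrap()
print(dracula.n)  # prints 3
```

The same proxy works on built-in mutators that return None, for example `Chained([3, 1, 2]).sort().reverse().unwrap()`. What it cannot provide is the compact notation the post insists on, since every chain has to be opened with `Chained(...)` and closed with `unwrap()`.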

Infix functions
by Andrew Barnert Feb. 26, 2014

While we're discussing crazy ideas inspired by a combination of a long-abandoned PEP and Haskell idioms (see the implicit lambda thread), here's another: arbitrary infix operators:

    a `foo` b == foo(a, b)

I'm not sure there's any use for this, I just have a nagging feeling there _might_ be, based on thinking about how Haskell uses them to avoid the need for special syntax in a lot of cases where Python can't.

This isn't a new idea; it came up a lot in the early days of Numeric. PEP 225 (http://legacy.python.org/dev/peps/pep-0225/) has a side discussion on "Impact on named operators" that starts off with:

    The discussions made it generally clear that infix operators is a scarce resource in Python, not only in numerical computation, but in other fields as well. Several proposals and ideas were put forward that would allow infix operators be introduced in ways similar to named functions. We show here that the current extension does not negatively impact on future extensions in this regard.

The future extension was never written as a PEP because 225 and its competitors were all deferred/abandoned. Also, most of the anticipated use cases for it back then were solved in other ways. The question is whether there are _other_ use cases that make the idea worth reviving.

The preferred syntax at that time was @opname. There are other alternatives in that PEP, but they all look a lot worse. Nobody proposed the `opname` because it meant repr(opname), but in 3.x that isn't a problem, so I'm going to use that instead, because…

In Haskell, you can turn a prefix function into an infix operator by enclosing it in backticks, and turn any infix operator into a prefix function by enclosing it in parens. (Ignore the second half of that, because Python has a small, fixed set of operators, and they all have short, readable names in the operator module.) And, both in the exception-expression discussion and the while-clause discussion, I noticed that this feature is essential to the way Haskell deals with both of these features without requiring lambdas all over the place.

The Numeric community wanted this as a way of defining new mathematical operators. For example, there are three different ways you can "multiply" two vectors (element-wise, dot-product, or cross-product), and you can't spell all of them as a * b. So, how do you spell the others? There were proposals to add a new @ operator, or to double the set of operators by adding a ~-prefixed version of each, or to allow custom operators made up of any string of symbols (which Haskell allows), but none of those are remotely plausible extensions to Python. (There's a reason those PEPs were all deferred/abandoned.) However, you could solve the problem easily with infix functions:

    m `cross` n
    m `dot` n

In Haskell, it's used for all kinds of things beyond that, from type constructors:

    a `Pair` b
    a `Tree` (b `Tree` c `Tree` d) `Tree` e

… to higher-order functions. The motivating example here is that exception handling is done with the catch function, and instead of this:

    catch func (\e -> 0)

… you can write:

    func `catch` \e -> 0

Or, in Python terms, instead of this:

    catch(func, lambda e: 0)

… it's:

    func `catch` lambda e: 0

… which isn't miles away from:

    func() except Exception as e: 0

… and that's (part of) why Haskell doesn't have or need custom exception expression syntax.

PEP 225 assumed that infix functions would be defined in terms of special methods. The PEP implicitly assumed they were going to convince Guido to rename m.__add__(n) to m."+"(n), so m @cross n would obviously be m."@cross"(n). But failing that, there are other obvious possibilities, like m.__@cross__(n), m.__cross__(n), m.__infix__('cross')(n), etc. But really, there's no reason for special methods at all, especially with PEP 443 generic functions. Instead, it could just mean this:

    cross(m, n)

… just as in Haskell.

In fact, in cases where infix functions map to methods, there's really no reason not to just _write_ them as methods. That's how NumPy solves the vector-multiplication problem; the dot product of m and n is just:

    m.dot(n)

(The other two meanings are disambiguated in a different way: m*n means cross-product for perpendicular vectors, element-wise multiplication for parallel vectors.)

But this can still get ugly for long expressions. Compare:

    a `cross` b + c `cross` (d `dot` e)
    a.cross(b).add(c.cross(d.dot(e)))
    add(cross(a, b), cross(c, dot(d, e)))

The difference between the first and second isn't as stark as between the second and third, but it's still pretty clear.

And consider the catch example again:

    func.catch(lambda e: 0)

Unless catch is a method on all callables, this makes no sense, which means method syntax isn't exactly extensible.

There are obviously lots of questions raised. The biggest one is: are there actually real-life use cases (especially given that NumPy has for the most part satisfactorily solved this problem for most numeric Python users)?

Beyond that:

What can go inside the backticks? In Haskell, it's an identifier, but the Haskell wiki (http://www.haskell.org/haskellwiki/Infix_expressions) notes that "In ABC the stuff between backquotes is not limited to an identifier, but any expression may occur here" (presumably that's not Python's ancestor ABC, which I'm pretty sure used backticks for repr, but some other language with the same name) and goes on to show how you can build that in Haskell if you really want to… but I think that's an even worse idea for Python than for Haskell. Maybe attribute references would be OK, but anything beyond that, even slicing (to get functions out of a table) looks terrible:

    a `self.foo` b
    a `funcs['foo']` b

Python 2.x's repr backticks allowed spaces inside the ticks. For operator syntax this would look terrible… but it does make parsing easier, and there's no reason to actually _ban_ it, just strongly discourage it.

What should the precedence and associativity be? In Haskell, it's customizable (which is impossible in Python, where functions are defined at runtime but calls are parsed at compile time) but defaults to left-associative and highest-precedence. In Python, I think it would be more readable as coming between comparisons and bitwise ops. In grammar terms:

    infix_expr ::= or_expr | or_expr "`" identifier "`" infix_expr
    comparison ::= infix_expr ( comparison_operator infix_expr ) *

That grammar could easily be directly evaluated into a Call node in the AST, or it could have a different node (mainly because I'm curious whether MacroPy could do something interesting with it…), like:

    InfixCall(left=Num(n=1), op=Name(id='foo', ctx=Load()), right=Num(n=2))

Either way, it ultimately just compiles to the normal function-call bytecode, so there's no need for any change to the interpreter. (That does mean that "1 `abs` 2" would raise a normal "TypeError: abs expected at most 1 arguments, got 2" instead of a more specific "TypeError: abs cannot be used as an infix operator", but I don't think that's a problem.)

This theoretically could be expanded from operators to augmented assignment… but it shouldn't be; anyone who wants to write this needs to be kicked in the head:

    a `func`= b
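For readers who want to experiment with the idea today, a well-known workaround emulates infix functions by overloading an existing operator. This sketch is not the backtick syntax proposed above: it spells a `dot` b as `a |dot| b`, and the `Infix` class, `dot`, and `cross` are hypothetical helpers written for this illustration. It relies on the left operand's type not handling `|` with an `Infix` object, which holds for tuples and numbers.

```python
class Infix:
    """Hypothetical helper: lets `a |f| b` evaluate to f(a, b)."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # `left | self`: remember the left operand and wait for the right one.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # `partial | right`: the left operand is already bound, so call now.
        return self.func(right)


@Infix
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


@Infix
def cross(a, b):
    """Cross product of two 3-element sequences."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


print((1, 2, 3) |dot| (4, 5, 6))    # prints 32
print((1, 0, 0) |cross| (0, 1, 0))  # prints (0, 0, 1)
```

Because this piggybacks on `|`, it inherits that operator's precedence (tighter than comparisons, looser than arithmetic) and left associativity, so the precedence question raised in the post is settled by accident rather than by design.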

Allow __len__ to return infinity
by Ram Rachum Feb. 26, 2014

I'd like to have some objects that return infinity from their __len__ method. Unfortunately the __len__ method may only return an int, and it's impossible to represent an infinity as an int. Do you think that Python could allow returning infinity from __len__? Thanks, Ram.
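The restriction described here can be observed directly. The snippet below is a small sketch run against CPython's current behaviour; `len_result` is a hypothetical helper written for this illustration. It shows that `len()` rejects a float infinity, and also that an integer too large for an index-sized value is rejected, so a "very big int" is not a workaround either.

```python
import sys


def len_result(value):
    """Return len() of an object whose __len__ returns `value`,
    or the name of the exception that len() raised."""

    class Sized:
        def __len__(self):
            return value

    try:
        return len(Sized())
    except (TypeError, OverflowError, ValueError) as exc:
        return type(exc).__name__


print(len_result(3))                # prints 3
print(len_result(float("inf")))     # prints TypeError
print(len_result(sys.maxsize + 1))  # prints OverflowError
```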

time.Timer
by anatoly techtonik Feb. 26, 2014

UX improvement fix for the common case:

    import time

    class Timer(object):
        def __init__(self, seconds):
            self.seconds = seconds
            self.restart()

        def restart(self):
            # (Re)arm the timer: it expires `seconds` from now.
            self.end = time.time() + self.seconds

        @property
        def expired(self):
            return (time.time() > self.end)

Example:

    class FPS(object):
        def __init__(self):
            self.counter = 0
            self.timer = Timer(1)

        def process(self):
            self.counter += 1
            if self.timer.expired:
                print("FPS: %s" % self.counter)
                …

Re: [Python-ideas] hey Stephanie, this is DJ, how are you?
by Jason Bursey Feb. 25, 2014

I went to Amsterdam about a year ago, but am looking to move from Dallas to Seattle where I could win a case. I think a 501(c) to help relocation of patients is needed.

Cheers,
j

On Friday, January 31, 2014, Stephanie Bishop (Googlersa with search r s a primes with shorsalgorthm=1&Flintstones findprimenumber. Low kelvin +) <replyto-a98dd37a(a)plus.google.com> wrote:
> < https://lh6.googleusercontent.com/-aRFDqfZ-Llo/AAAAAAAAAAI/AAAAAAAAAi8/7uQs… >
> I am fabulous. …

Re: [Python-ideas] Unify global and nonlocal
by Manuel Cerón Feb. 24, 2014

On Mon, Feb 24, 2014 at 4:22 AM, Saket Dandawate <newton3143(a)gmail.com> wrote:
> Also why can't it be like this too
> >>> global x = 3
>
> rather than only
> >>> global x
> >>> x=3

I can't find the discussion thread, but originally PEP 3104 proposed this kind of syntax for nonlocal, but it was never actually implemented because of ambiguity with the unpacking assignment case:

    nonlocal x, y, z = 1, 2, 3

Does this mean that x, y and z are non …

Allowing breaks in generator expressions by overloading the while keyword
by Carl Smith Feb. 24, 2014

Sometimes you need to build a list in a loop, but break from the loop if some condition is met, keeping the list up to that point. This is so common it doesn't really need an example.

Trying to shoehorn the break keyword into the generator expression syntax doesn't look pretty, unless you add `and break if expr` to the end, but that has its own issues. Besides, overloading `while` is much cuter...

    ls = [ expr for name in iterable while expr ]

    ls = [ expr for name in iterable if expr …
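For comparison, the "build a list but stop at the first failing element" pattern can already be written with `itertools.takewhile` from the standard library. The sketch below is not the proposed `while` clause; `squares_until` and its arguments are hypothetical names chosen for this illustration.

```python
from itertools import takewhile


def squares_until(values, limit):
    """Collect squares of `values`, stopping at the first value >= limit."""
    return [v * v for v in takewhile(lambda v: v < limit, values)]


print(squares_until([1, 2, 3, 10, 4], 5))  # prints [1, 4, 9]
```

Note the difference from an `if` clause: `if v < limit` would skip 10 and still include 4, whereas `takewhile` ends the loop at 10, which is the behaviour the post asks for. The cost is the lambda, which is the kind of noise a dedicated clause would remove.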

Re: [Python-ideas] [Python-Dev] Tangent on class level scoping rules
by Greg Ewing Feb. 23, 2014

Nick Coghlan wrote:
> Dealing with references from nested closures is the hard part.

I think that could be handled by creating new cells for the inner variables each time the inner scope is entered.

-- 
Greg

Joining dicts again
by haael＠interia.pl Feb. 23, 2014

Hello

I know this has been mangled a thousand times, but let's do it once again.

Why does Python not have a simple dict joining operator?

From what I read, it seems the biggest concern is: which value to pick up if both dicts have the same key.

    a = {'x':1}
    b = {'x':2}
    c = a | b
    print(c['x']) # 1 or 2?

My proposal is: the value should be determined as the result of the operation 'or'. The 'or' operator returns the first operand that evaluates to boolean True, or the last operand if all are …
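The message is cut off here, so the following is only a guess at the intended semantics: for a key present in both dicts, the joined value is `a[key] or b[key]`, and keys present in only one dict are copied as they are. `join_dicts` is a hypothetical function name used for this sketch.

```python
def join_dicts(a, b):
    """Join two dicts; on a shared key keep `a[key] or b[key]` (assumed semantics)."""
    result = dict(a)
    for key, value in b.items():
        if key in a:
            # `or` yields the first truthy operand, else the last operand.
            result[key] = a[key] or value
        else:
            result[key] = value
    return result


print(join_dicts({'x': 1}, {'x': 2}))           # prints {'x': 1}
print(join_dicts({'x': 0}, {'x': 2, 'y': 3}))   # prints {'x': 2, 'y': 3}
```

Under this reading the answer to "1 or 2?" in the example is 1, but a falsy value on the left (0, an empty string, None) is silently replaced by the right-hand value.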
