My original dict unpacking proposal was very short and lacked a motivating
use case. Toy examples made the proposal look unnecessarily verbose and
suggested obvious alternatives that already have easy spellings in current
syntax.
Nested/recursive unpacking is much more troublesome, especially when
combined with name-binding. I wrote an example to compare my proposal with
current syntax.
Example usage.
https://github.com/selik/destructure/blob/master/examples/fips.py
Implementation.
https://github.com/selik/destructure/blob/master/destructure.py
The part of my module's design I'm least happy with is the name-binding. I
extended a SimpleNamespace to create an Erlang-style distinction between
bound and unbound names. Though the API is a bit awkward, now that the
module is built I'm less enthusiastic about introducing new syntax. Funny
how that works.
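For concreteness, here's a minimal sketch of the bound/unbound trick (not
the actual destructure API, just the idea):

from types import SimpleNamespace

class Unbound:
    """Marker for a name that has not been bound yet."""
    def __init__(self, name):
        self.name = name

class Binding(SimpleNamespace):
    """Reading a missing attribute yields an Unbound marker instead of
    raising, giving an Erlang-ish bound/unbound distinction."""
    def __getattr__(self, name):
        # Only invoked when normal lookup fails, i.e. the name is unbound.
        return Unbound(name)

def match(pattern, data, binding):
    """Match a flat dict pattern; Unbound slots bind by name."""
    for key, expected in pattern.items():
        if key not in data:
            return False
        if isinstance(expected, Unbound):
            setattr(binding, expected.name, data[key])
        elif expected != data[key]:
            return False
    return True

o = Binding()
if match({'kind': 'user', 'id': o.name}, {'kind': 'user', 'id': 'guido'}, o):
    print(o.name)   # guido -- o.name is now bound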
I haven't yet decided how to add post-binding guards to the cases.
On Wed, May 18, 2016 at 8:17 PM, Nick Coghlan <ncoghlan(a)gmail.com> wrote:
> On 19 May 2016 at 08:53, Guido van Rossum <guido(a)python.org> wrote:
>> The one thing that Python doesn't have (and mypy doesn't add) would be
>> a match statement. The design of a Pythonic match statement would be
>> an interesting exercise; perhaps we should see how far we can get with
>> that for Python 3.7.
>
> If it's syntactic sugar for a particular variety of if/elif/else
> statement, I think that may be feasible, but you'd presumably want to
> avoid the "Can we precompute a lookup table?" quagmire that doomed PEP
> 3103's switch statement.
>
> That said, for the pre-computed lookup table case, whatever signature
> deconstruction design you came up with for a match statement might
> also be usable as the basis for a functools.multidispatch() decorator
> design.
Let's give up on the pre-computed lookup table. Neither PEP 3103 nor
PEP 275 (to which it compares itself) even gets into the unpacking
part, which would be the interesting thing from the perspective of
learning from Sum types and matching in other languages. Agreed on the
idea of trying to reuse this for multidispatch!
A few things that might be interesting to explore:
- match by value or set of values (like those PEPs)
- match by type (isinstance() checks)
- match on tuple structure, including nesting and * unpacking
(essentially, try a series of destructuring assignments until one
works)
- match on dict structure? (extension of destructuring to dicts)
- match on instance variables or attributes by name?
- match on generalized condition (predicate)?
The idea is that many of these on their own are better served by a
classic if/elif/else structure, but a powerful match statement should
allow alternating between e.g. destructuring matches and value or
predicate matches. IIUC Haskell allows pattern matches as well as
conditions, which it seems to call guards or where-clauses (see
https://www.haskell.org/tutorial/patterns.html, or
http://learnyouahaskell.com/syntax-in-functions if you like your pages
more colorful). Or maybe we should be able to combine structure
matches with guards.
I guess the tuple structure matching idea is fairly easy to grasp.
The attribute idea would be similar to a duck-type check, though more
emphasizing data attributes. It would be nice if we could write a
match that said "if it has attributes x and y, assign those to local
variables p and q, and ignore other attributes". Strawman syntax could
be like this:
def demo(arg):
    switch arg:
        case (x=p, y=q): print('x=', p, 'y=', q)
        case (a, b, *_): print('a=', a, 'b=', b)
        else: print('Too bad')
Now suppose we had a Point defined like this:
Point = namedtuple('Point', 'x y z')
and some variables like this:
a = Point(x=1, y=2, z=3)
b = (1, 2, 3, 4)
c = 'hola'
d = 42
then we could call demo with these variables:
>>> demo(a)
x= 1 y= 2
>>> demo(b)
a= 1 b= 2
>>> demo(c)
a= h b= o
>>> demo(d)
Too bad
>>>
(Note the slightly unfortunate outcome for 'hola', but that's what a
destructuring assignment would do. Water under the bridge.)
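For concreteness, here's one way the strawman could desugar into current
Python (my gloss, not part of the proposal): try the attribute match
first, then fall back to tuple destructuring.

from collections import namedtuple

def demo(arg):
    # Attribute match first, then tuple destructuring, then fallback.
    if hasattr(arg, 'x') and hasattr(arg, 'y'):
        p, q = arg.x, arg.y
        print('x=', p, 'y=', q)
        return
    try:
        a, b, *_ = arg
    except (TypeError, ValueError):
        print('Too bad')
    else:
        print('a=', a, 'b=', b)

Point = namedtuple('Point', 'x y z')
demo(Point(x=1, y=2, z=3))   # x= 1 y= 2
demo((1, 2, 3, 4))           # a= 1 b= 2
demo('hola')                 # a= h b= o
demo(42)                     # Too bad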
Someone else can try to fit simple value equality, set membership,
isinstance, and guards into that same syntax.
--
--Guido van Rossum (python.org/~guido)
Currently, the (*) operator on a list repeats the sequence by creating
multiple references to the same objects. While this appears to work
intuitively for immutable objects (as in [True] * 5, where the shared
references are simply replaced when elements are assigned), it makes the
operator nigh unusable for mutable objects.
The most obvious case is when the operator is applied twice in a nested
sequence like this:

arr = [[True] * 5] * 5

This does not create a matrix-like arrangement of 25 independent truth
values; instead it creates a list of 5 references to the same inner list,
so that a subsequent assignment like arr[2][3] = False changes not just
that one index, but the fourth element of every list in the outer list.
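A quick demonstration of the aliasing, and the usual comprehension
workaround:

arr = [[True] * 5] * 5
arr[2][3] = False
print(arr[0][3])                          # False -- rows are aliases
print(all(row is arr[0] for row in arr))  # True -- one inner list, 5 refs

# The usual workaround builds independent rows:
arr = [[True] * 5 for _ in range(5)]
arr[2][3] = False
print(arr[0][3])                          # True -- only row 2 changed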
This also makes sequence construction with a mutable type a problem.
For example, assume a class Foo:

class Foo:
    def __init__(self):
        self.val = True
    def set(self):
        self.val = False
    def __repr__(self):
        return str(self.val)
If I then use sequence repetition to create a list of these like so:

arr = [Foo()] * 5

this creates a list of references to the same Foo instance, making the
list construction itself effectively meaningless: running the set()
method on any of the instances in the list is the same as running it on
all of them.
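Demonstrating with the class above:

arr = [Foo()] * 5
arr[0].set()
print(arr)    # [False, False, False, False, False] -- one shared instance

arr = [Foo() for _ in range(5)]
arr[0].set()
print(arr)    # [False, True, True, True, True] -- independent instances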
It is my opinion that the sequence repetition operator should be
modified to make copies of the objects it is repeating, rather than
copying references alone. I believe this would both be more intuitive
from a semantic point of view and more useful for the developer.
This would change the operator in a way that is mostly unseen in current
usage ([5] * 3 would still result in [5, 5, 5]) while treating mutable
nesting in a way that is more understandable from the apparent intent of
the syntax construction.
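Under today's semantics, the closest spelling of the proposed behaviour
is an explicit copy; for example (repeat_copies is an invented name, not
an existing function):

import copy

def repeat_copies(item, n):
    # What the proposal would have [item] * n do for mutable items:
    # n independent copies rather than n references.
    return [copy.deepcopy(item) for _ in range(n)]

grid = repeat_copies([True] * 5, 5)
grid[2][3] = False
print(sum(row.count(False) for row in grid))   # 1 -- only one cell changed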
Reference regarding previous discussion: https://bugs.python.org/issue27135
So you're referring to function currying, correct? I changed the thread
title to reflect that.
--
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something’s wrong.
http://kirbyfan64.github.io/
On May 31, 2016 8:04 AM, "Yongsheng Cheng" <cyscoyote(a)gmail.com> wrote:
To begin with: I am a deep Python fan.
I want to suggest a feature for Python to better support functional
programming. Here is some toy code: I need a more powerful decorator in
the functools module than functools.partial. Do you think it is a nice
idea? Waiting for your reply.
import re
import functools

def minpara(func_1, args):
    # Call the function; if it fails for lack of arguments, pull the
    # expected/given counts out of the TypeError message. Note that a
    # successful call runs the function body twice (once here and once
    # in new_func) -- this is a toy.
    try:
        func_1(*args)
    except TypeError as error:
        num, given = re.findall(r'\d+', str(error))[:2]
        return (num, given)
    return (0, 0)

def curried(func):
    def new_func(*args):
        num, given = minpara(func, args)
        if (num, given) == (0, 0):
            return func(*args)
        else:
            # Not enough arguments yet: bind what we have and wait.
            return curried(functools.partial(func, *args))
    return new_func

@curried
def sun_1(a, b, c):
    return a + b + c

@curried
def sun_2(a, b, c):
    return a * b * c

if __name__ == "__main__":
    print(sun_2(1,)(2)(44))   # 88
    print(sun_1(1,)(2)(44))   # 47
    print(sun_2(1, 2)(23))    # 46
On 28 May 2016 3:36 am, "Paul Moore" <p.f.moore(a)gmail.com> wrote:
>
> On 28 May 2016 at 05:38, Ethan Furman <ethan(a)stoneleaf.us> wrote:
> > On 05/27/2016 08:22 PM, Nick Coghlan wrote:
> >
> >> The main advantage I'd see to stdlib inclusion is providing "one obvious
> >> way to do it" - it isn't immediately obvious to a newcomer that PEP 440
> >> and the implementation in packaging are the preferred approach for
> >> SemVer-like version numbers in Python projects.
> >
> >
> > Yup, I'll second that!
>
> Agreed. Having a version object in the stdlib would be a good way of
> directing people to the standard approach. Some points:
>
> - It pretty much has to be PEP 440, as that's the standard in the
> Python packaging ecosystem. Adding a *different* standard to the
> stdlib would be a big problem IMO.
> - Packaging tools like pip will still need to bundle an external
> implementation (at the moment we use the "packaging" library) as
> they'll need to support older versions of Python for some time yet. So
> there would need to be a maintained backport version on PyPI (or the
> packaging library could be sanctioned as that backport).
> - Assuming a separate backport, it's not clear to me how we'd manage
> the transition between packaging and the backport (maintain the 2 in
> parallel? packaging falls back to the stdlib/backport if present?
> packaging gains a conditional dependency on the backport?)
> - PEP 440, and the implementation, is pretty stable. But are we happy
> for it to move to the development pace of the stdlib?
The other issue is the intended use case - I *think* PEP 440 (including the
normalisation steps) will cover the version numbering for external
dependencies like Tkinter and zlib, but it's not the precise context that
spec was designed to cover (i.e. versioning distribution packages for
Python projects).
I still think the idea is worth considering, it's just a more complex
problem than it appears to be at first glance.
Cheers,
Nick.
>
> The distutils version classes are pretty old, and fairly limited - the
> strict version is frequently *too* strict, but the loose version is
> rather too lenient...
>
> Paul
I propose to add support for version objects in the stdlib. A version
object is a namedtuple-like object with named attributes major, minor,
etc., representing a semantic version of the form used by most open
source (and other) software. The sys module already contains two
version objects: sys.version_info and the result of sys.getwindowsversion().
There is support for version objects in Perl [1] and .NET [2], and there
are third-party Python implementations ([3], [4], [5]).
Version objects can be used for representing:
1. Version of Python itself (sys.version_info).
2. Version of OS and system libraries.
3. Static and runtime versions of wrapped libraries (zlib, expat,
sqlite, curses, Tcl/Tk, libffi, libmpdec).
4. Versions of third-party modules.
The benefit of version objects against plain strings is that version
objects are comparable with expected rules (1.2.3 < 1.10.1).
The benefit of version objects against tuples is human readable string
representation and named attributes for components.
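To make the comparison point concrete, here is a minimal sketch of such
an object (hypothetical API, not a concrete proposal; a real stdlib
version would presumably be a structseq like sys.version_info):

from functools import total_ordering

@total_ordering
class Version:
    # Hypothetical sketch only -- not a concrete API proposal.
    def __init__(self, major, minor=0, micro=0):
        self.major, self.minor, self.micro = major, minor, micro

    def _key(self):
        return (self.major, self.minor, self.micro)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()

    def __repr__(self):
        return '{}.{}.{}'.format(*self._key())

# Components compare numerically, not lexicographically:
assert Version(1, 2, 3) < Version(1, 10, 1)
assert str(Version(1, 10, 1)) == '1.10.1'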
[1] http://search.cpan.org/perldoc?version
[2]
https://msdn.microsoft.com/en-us/library/system.version%28v=vs.110%29.aspx
[3] https://pypi.python.org/pypi/semantic_version
[4] https://pypi.python.org/pypi/versions
[5] https://pypi.python.org/pypi/version
    https://pypi.python.org/pypi/SemVerPy
On the tutor mailing list, somebody asked a question about
decimal.InvalidOperation. They were converting strings to Decimal using
something like this:
try:
    d = Decimal(the_string)
except ValueError:
    handle_error()
and were perplexed by the error that they got:
decimal.InvalidOperation: [<class 'decimal.ConversionSyntax'>]
This got me thinking. Normally, such a conversion error should raise
ValueError, as ints, floats and Fractions do. Presumably decimal does
something different as it must match the General Decimal Arithmetic
Specification. But, is there any reason why InvalidOperation couldn't be
a subclass of ValueError? Exception class hierarchies are not part of
the GDAS, so this should be allowed.
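Note that decimal already mixes a builtin exception into one of its
signals: DivisionByZero inherits from both DecimalException and
ZeroDivisionError. A minimal sketch of the analogous change
(illustrative only, not the actual module source):

class DecimalException(ArithmeticError):
    """Stand-in for decimal's base signal class."""

class InvalidOperation(DecimalException, ValueError):
    """Proposed: mix in ValueError, mirroring DivisionByZero's
    existing mix-in of ZeroDivisionError."""

try:
    raise InvalidOperation('invalid string conversion')
except ValueError:
    print('caught as ValueError')   # works under the proposal
except ArithmeticError:
    print('only caught as a decimal signal')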
Looking at the docs, there are nine examples given of things which can
raise InvalidOperation (if not trapped, in which case they return a
NAN):
Infinity - Infinity
0 * Infinity
Infinity / Infinity
x % 0
Infinity % x
sqrt(-x) and x > 0
0 ** 0
x ** (non-integer)
x ** Infinity
plus invalid string conversion. To my mind, ValueError would be an
acceptable error for all of these things. E.g. the root of a negative
value raises ValueError in Python 2:
>>> (-2.0)**0.5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: negative number cannot be raised to a fractional power
(In Python 3, it returns a complex number.)
So I propose that InvalidOperation be changed to inherit from
ValueError, to match the expected behaviour from other numeric types.
The only tricky bit is that division by zero doesn't raise ValueError,
but ZeroDivisionError instead. But that's no worse than the current
situation:

# current
- people expect Decimal(0)/0 to raise ZeroDivisionError, but it
raises InvalidOperation;

# proposal
- people expect Decimal(0)/0 to raise ZeroDivisionError, but it
raises InvalidOperation (now a subclass of ValueError).
Oh, a further data point: if you pass an invalid list or tuple to
Decimal, you get a ValueError:
py> decimal.Decimal([])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: argument must be a sequence of length 3
Thoughts?
--
Steve
I'm currently working on a project for Intel involving Python and directly
addressable non-volatile memory. See https://nvdimm.wiki.kernel.org/
for links to the Linux features underlying this work, and http://pmem.io/
for the Intel initiated open source project I'm working with whose intent
is to bring support for DAX NVRAM to the application programming layer
(nvml currently covers C, with C++ support in process, and there are
python bindings for the lower level (non-libpmemobj) libraries).
tldr: In the future (and the future is now) application programs will
want to be aware of what class of memory they are allocating: normal DRAM,
persistent RAM, fast DRAM, (etc?). High level languages like Python that
manage memory for the application programmer will need to expose a way
for programmers to declare which kind of memory backs an object, and
define rules for interactions between objects that reside in different
memory types. I'm interested in starting a discussion (not necessarily
here, though I think that probably makes sense) about the challenges
this presents and how Python-the-language and CPython the implementation
might solve them.
Feel free to tell me this is the wrong forum; if so, we can find a new
forum to continue the conversation if enough people are interested.
Long version:
While I mentioned fast DRAM as well as Persistent RAM above, really it is
Persistent RAM that provides the most interesting challenges. This is
because you actually have to program differently when your objects are
backed by persistent memory. Consider the python script ('pop' is short
for Persistent Object Pool):
source_account = argv[1].lower()
target_account = argv[2].lower()
delta = float(argv[3])

pop = pypmemobj('/path/to/pool')
if not hasattr(pop.ns, 'accounts'):
    pop.ns.accounts = dict()

for acct in (source_account, target_account):
    if acct not in pop.ns.accounts:
        pop.ns.accounts[acct] = 0

pop.ns.accounts[source_account] -= delta
pop.ns.accounts[target_account] += delta

print("{:<10s} {:10.2f} {:<10s} {:10.2f}".format(
    source_account, pop.ns.accounts[source_account],
    target_account, pop.ns.accounts[target_account]
    ))
(I haven't bothered to test this code, forgive stupid errors.)
This is a simple CLI app that lets you manage a set of bank accounts:

> acct deposits checking 10.00
deposits -10.00 checking 10.00
> acct checking savings 5.00
checking 5.00 savings 5.00
Obviously you'd have a lot more code in a real ap. The point here
is that we've got *persistent* account objects, with values that are
preserved between runs of the program.
There's a big problem with this code. What happens if the program crashes
or the machine crashes? Specifically, what happens if the machine crashes
between the -= and the += lines? You could end up with a debit from one
account with no corresponding credit to the other, leaving your accounts
in an inconsistent state with no way to recover.
The authors of the nvml library documented at the pmem.io site have
discovered via theory and practice that what you need for reliable
programming with persistent memory is almost exactly same kind of
atomicity that is required when doing multi-threaded programming.
(I can sense the asyncio programmers groaning...didn't we just solve
that problem and now you are bringing it back? As you'll see below
if you combine asyncio and persistent memory, providing the atomicity
isn't nearly as bad as managing threading locks; although it does take
more thought than writing DRAM-backed procedural code, it is relatively
straightforward to reason about.)
What we need for the above program fragment to work reliably in the
face of crashes is for all of the operations that Python programmers
think of as atomic (assignment, dictionary update, etc) to be atomic
at the persistent memory level, and to surround any code that does
multiple-object updates that are interdependent with a 'transaction'
guard. Thus we'd rewrite the end of the above program like this:
with pop:
    pop.ns.accounts[source_account] -= delta
    pop.ns.accounts[target_account] += delta

print("{:<10s} {:10.2f} {:<10s} {:10.2f}".format(
    source_account, pop.ns.accounts[source_account],
    target_account, pop.ns.accounts[target_account]
    ))
If this were in fact a threaded program, we'd be holding a write lock on
pop.ns.accounts in the same scope; but in that case (a) we'd also want
to include the print statement and (b) we'd need to pay attention to
which lock we were holding. The pmem transaction doesn't have to worry
about locks: it guards *whatever* memory changes are made during the
transaction, no matter which object they are made to. This is what I
mean by asyncio + pmem transactions being simpler than threading locks.
The transaction that is implied by 'with pop' is the heart of what
the nvml libpmemobj does: it provides a transaction scope where any
registered changes to the persistent memory will be rolled back on abort,
or rolled back when the object pool is next opened if the transaction
did not complete before the crash.
Of course, in Python as it stands today, the above program would not
do anything practical, since when the dict that becomes accounts is
allocated, it gets allocated in DRAM, not in persistent memory. To use
persistent memory, I'm about to start working on an extension module that
basically re-implements most of the Python object classes so that they
can be stored in persistent memory instead of RAM. Using my proposed API,
the above program would actually look like this:
source_account = argv[1].lower()
target_account = argv[2].lower()
delta = float(argv[3])

pop = pypmemobj('/path/to/pool')
if not hasattr(pop.ns, 'accounts'):
    pop.ns.accounts = PersistentDict(pop)

for acct in (source_account, target_account):
    if acct not in pop.ns.accounts:
        pop.ns.accounts[acct] = 0.0

with pop:
    pop.ns.accounts[source_account] -= delta
    pop.ns.accounts[target_account] += delta

print("{:<10s} {:10.2f} {:<10s} {:10.2f}".format(
    source_account, pop.ns.accounts[source_account],
    target_account, pop.ns.accounts[target_account]
    ))
The difference here is creating a PersistentDict object, and telling it
what persistent memory to allocate itself in.
But what happens when we do:
pop.ns.accounts[acct] = 0.0
We don't want to have to write
pop.ns.accounts[acct] = PersistentFloat(pop, 0.0)
but if we don't, we'd be trying to store a pointer to a float object
that lives in normal RAM into our persistent dict, and that pointer
would be invalid on the next program run.
Instead, for immutable objects we can make a copy in persistent ram when
the assignment happens. But we have to reject any attempt to assign a
mutable object in to a persistent object unless the mutable is itself
persistent...we can't simply make a copy of a mutable object and make
a persistent version, because of the following Python idiom:
pop.ns.accounts = accounts = dict()
If we copied the dict into a PersistentDict automatically, pop.ns.accounts
and accounts would point to different objects, and the Python programmer's
expectations about his program would be violated.
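To make the store-time rule concrete, here is a toy pure-Python model
(all names hypothetical; the real extension module would enforce this at
the C level):

class Persistent:
    """Marker base for objects allocated in the persistent pool (toy)."""

def copy_to_pmem(pool, value):
    # Toy stand-in: a real implementation would allocate the copy
    # inside the persistent object pool and return that object.
    return value

IMMUTABLE_TYPES = (int, float, complex, str, bytes, tuple, frozenset,
                   type(None))

class PersistentDict(Persistent, dict):
    """Immutables are copied into pmem on store; mutables are rejected
    unless they are already persistent objects."""
    def __init__(self, pool):
        dict.__init__(self)
        self.pool = pool
    def __setitem__(self, key, value):
        if isinstance(value, Persistent):
            dict.__setitem__(self, key, value)   # store the pmem pointer
        elif isinstance(value, IMMUTABLE_TYPES):
            dict.__setitem__(self, key, copy_to_pmem(self.pool, value))
        else:
            raise TypeError('cannot store non-persistent mutable object')

d = PersistentDict(pool=None)
d['balance'] = 0.0       # immutable: copied into the pool
try:
    d['log'] = []        # plain mutable: rejected, no silent copy
except TypeError as exc:
    print(exc)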
So that's a brief description of what I'm planning to try to implement as
an extension module (thoughts and feedback welcome).
It's pretty clear that having to re-implement every mutable Python C type,
as well as immutable collection types (so that we can handle pointers
to other persistent objects correctly) will be a pain.
What would it take for Python itself to support the notion of objects
being backed by different kinds of memory?
It seems to me that it would be possible, but that in CPython at
least the cost would be a performance penalty for the all-DRAM case.
I'm guessing this is not acceptable, but I'd like to present the notion
anyway, in the hopes that someone cleverer than me can see a better way :)
At the language level, support would mean two things, I think: there
would need to be a way to declare which backing store an object belongs
to, and there would need to be a notion of a memory hierarchy where, at
least in the case of persistent vs non-persistent RAM, an object higher in
the hierarchy could not point to an object lower in the hierarchy that
was mutable, and that immutables would be copied from lower to higher.
(Here I'm notionally defining "lower" as "closer to the CPU", since RAM
is 'just there' whereas persistent memory goes through a driver layer
before direct access can get exposed, and fast DRAM would be closer yet
to the CPU :).
In CPython I envision this being implemented by having every object
be associated with a 'storage manager', with the DRAM storage manager
obviously being the default. I think that which storage manager an
object belongs to can be deduced from its address at runtime, although
whether that would be better than storing a pointer is an open question.
Object method implementations would then need to be expanded as follows:
(1) any operations that comprise an 'atomic' operation would be wrapped
in 'start transaction' and 'end transaction' calls on the storage
manager; (2) any memory management function (malloc, etc) would be
called indirectly through the storage manager; (3) any object pointer to
be stored or retrieved would be passed through an appropriate call on
the storage manager, giving it an opportunity to block or transform it;
and (4) any memory range about to be modified would be reported to the
storage manager so that it can be registered with the transaction.
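Expressed in Python for clarity (the real hooks would be at the C level,
and all names here are invented), the storage-manager contract might
look something like:

class DRAMStorageManager:
    """Default no-op manager for ordinary RAM-backed objects (sketch)."""

    def begin_transaction(self):
        pass                # (1) nothing to do for plain DRAM

    def end_transaction(self):
        pass

    def allocate(self, size):
        # (2) memory management routes through the manager; the DRAM
        # manager would just defer to the normal allocator.
        pass

    def check_store(self, container, value):
        # (3) chance to block or transform a pointer being stored.
        return value

    def register_range(self, start, length):
        pass                # (4) report a range the transaction will touch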
I *think* that's the minimum set of changes. Clearly, however, making
such calls is going to be less efficient in the DRAM case than the
current code, even though the RAM implementation would be a noop.
Summary:
Providing Python language level support for directly addressable
persistent RAM requires addressing the issues of how the application
indicates when an object is persistent, and managing the interactions
between objects that are persistent and those that are not. In addition,
operations that a Python programmer expects to be atomic need to be
made atomic from the viewpoint of persistent memory by bracketing them
with implicit transactions, and a way to declare an explicit transaction
needs to be exposed to the application programmer. These are all
interesting design challenges, and I may not have managed to identify
all the issues involved.
Directly addressable persistent memory is going to become much more
common, and probably relatively soon. An extension module such as I
plan to work on can provide a way for Python to be used in this space,
but are there things that we are able and willing to do to support it
more directly in the language and, by implication, in the various Python
implementations? Are there design considerations for this extension
module that would make it easier or harder for more integrated language
support to be added later?
I realize this could be viewed as early days to be bringing up this
subject, since I haven't even started the extension module yet. On the
other hand, it seems to me that the community may have architectural
insights that could inform the development, and that starting the
conversation now without trying to nail anything down could be very
helpful for facilitating the long term support of variant memory types
in Python.