I don't know if this would make sense to try to push to 2.6 or 3.0. I
was talking with some people about how the ability to split or replace
on multiple substrings could be added to Python, without adding new
methods or having ugly tuple-passing requirements like s.split(('foo',
'bar'), 4). This idea came to mind, so I wanted to toss it out there
for scrutiny. It would be a builtin, but it can be implemented in
Python like this, basically:
class oneof(list):
    def __init__(self, *args):
        list.__init__(self)
        self.extend(args)
    def __eq__(self, o):
        # compares equal to any single value it contains
        return o in self

assert 'bar' == oneof('bar', 'baz')
In addition to the new type, .replace, .split, and other appropriate
functions would be updated to accept it as the substring argument to
locate, matching any one of the substrings it contains. I've
asked a few people and gotten good responses on the general idea so
far, but what do you all think?
1) Would the multi-substring operations be welcomed?
2) Could this be a good way to add those to the API without breaking things?
3) What version would it target?
4) Which functions and methods should support this, or might generally
gain value from a similar solution?
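To make the proposal concrete, here is a rough pure-Python sketch of a
split that accepts a oneof. The split_oneof name is made up for the
example, and it happens to lean on re internally; the point is only the
user-facing behaviour, not the implementation:

import re

def split_oneof(s, sep, maxsplit=0):
    # Approximates what s.split(oneof(...), n) might do: split on
    # whichever of the alternatives appears next in the string.
    if isinstance(sep, oneof):
        pattern = '|'.join(re.escape(alt) for alt in sep)
        return re.split(pattern, s, maxsplit=maxsplit)
    return s.split(sep, maxsplit or -1)

assert split_oneof('a-b_c', oneof('-', '_')) == ['a', 'b', 'c']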
--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/
On 1/23/07, Josiah Carlson <jcarlson(a)uci.edu> wrote:
> Whether it is a tuple being passed, or a "magic" container, I don't
> think it matters; though I would lean towards a tuple because it is 5
> fewer characters to type out, and one fewer data type to worry about.
I had talked to others and the consensus was against tuples on the
grounds of ugliness. s.split((a, b), c) wasn't a popular choice, but
s.split(oneof(a, b), c) reads better.
> This has been discussed before in python-dev, I believe the general
> consensus was that it would be convenient at times, but I also believe
> the general consensus was "use re";
It seems like we have a history of useful string operations moving from
"use re" to "don't use re", such as the relatively recent startswith and
endswith methods, and even split and replace themselves. I would like to
see fewer reasons for people to bother with regular expressions until
they actually need them. If we can provide a better way to get the job
done, that seems like a great idea.
ALSO
The oneof type isn't just a single-use thing. User code may often make
use of it, and other types could benefit, such as doing a lookup with
d[oneof(1, 2, 3)] (where order would matter for priority). I think this
semantic collection type would be very useful in a number of contexts
where we would currently just loop or duplicate code.
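To be clear, a plain dict couldn't do that lookup as-is (it would try to
hash the oneof first), so the mapping would have to cooperate. Here is a
purely hypothetical sketch of what I mean by priority order; OneofDict is
an illustrative name, not a proposal for a new builtin:

class OneofDict(dict):
    def __getitem__(self, key):
        if isinstance(key, oneof):
            for candidate in key:        # first listed key wins
                if dict.__contains__(self, candidate):
                    return dict.__getitem__(self, candidate)
            raise KeyError(key)
        return dict.__getitem__(self, key)

d = OneofDict({2: 'two', 3: 'three'})
assert d[oneof(1, 2, 3)] == 'two'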
--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/
Guido asked me to move it here. anyway, i might as well state my case
better.
in the formal definitions of object oriented programming, objects
are said to encapsulate state and behavior. behavior is largely dependent
on the type of the object, but the state is important nonetheless.
for example, file objects are stateful: they can be opened for reading-only,
writing-only, both, or be closed altogether. still, they are all instances
of the file type.
since generic functions/multi-dispatch come to help the programmer
by reducing boilerplate early type-checking code and providing a
more capable dispatch mechanism -- we can't overlook the state of the
object.
this early type-checking goes against the spirit of pure duck typing (which
i'm fond of), but is crucial when the code has side effects. in this case,
you can't just start executing the code and "hope it works", as the
resources (i.e., files) involved are modified. in this kind of code, you
want to check up front that everything is all right, *instead* of suddenly
getting an AttributeError/TypeError somewhere.
here's an example:
@dispatch
def copy(src: file, dst: file):
    while True:
        buf = src.read(1000)
        if not buf: break
        dst.write(buf)
suppose now that dst is mistakenly opened for reading.
this means src would have already been modified, while
dst.write is bound to fail. if src is a socket, for instance,
this would be destructive.
so if we already go as far as having multiple dispatch, which imposes
constraints on the arguments a function accepts, we might as well
base it on the type and state of the object, rather than only on its type.
we could say, for example:
@dispatch
def copy(src: file_for_reading, dst: file_for_writing):
    ...  # body as before

file_for_reading is not a type -- it's a checker. it may be defined as

def file_for_reading(obj: file):
    return obj.mode == "r" and not obj.closed
types, by default, would check using isinstance(), but custom
checkers could check for stateful requirements too.
and a note about performance: this check is required whether
it's done explicitly or by the dispatch mechanism, and since
most functions don't require so many overloads, i don't think
it's an issue.
besides, we can have a different decorator for dispatching
by type or by checkers, i.e., @dispatch vs @stateful_dispatch,
or something. the simple @dispatch would use a dictionary,
while the stateful version would use a loop.
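to illustrate the loop-based variant, here's a minimal sketch. the names
stateful_dispatch and _matches are made up for the example, it only
handles positional arguments, and annotations are taken in parameter
order -- a sketch of the idea, not a real implementation:

_overloads = {}

def _matches(check, arg):
    # a plain type is tested with isinstance(); any other callable is
    # treated as a state checker that returns True or False
    if isinstance(check, type):
        return isinstance(arg, check)
    return check(arg)

def stateful_dispatch(func):
    impls = _overloads.setdefault(func.__name__, [])
    impls.append((list(func.__annotations__.values()), func))
    def wrapper(*args):
        for checks, impl in impls:        # the loop mentioned above
            if len(checks) == len(args) and all(
                    _matches(c, a) for c, a in zip(checks, args)):
                return impl(*args)
        raise TypeError("no overload accepts these arguments")
    return wrapper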
-tomer
---------- Forwarded message ----------
From: tomer filiba <tomerfiliba(a)gmail.com>
Date: Jan 14, 2007 2:28 PM
Subject: multi-dispatch again
To: Python-3000(a)python.org
i just thought of a so-to-speak counter-example for ABCs... it's not really
a counter-example, but i believe it shows a deficiency in the concept.
theoretically speaking, objects are made of type and state. ABCs,
isinstance() and interfaces in general only check the type part. for example:
@dispatch
def log_to_file(text: str, device: file):
    device.write(text)

this will constrain the *type* of the device, but not its *state*.
practically speaking, i can pass a closed file, or a file open for
reading-only, and it would pass silently.
basing multi-dispatch on types is of course a leap forward, but if we
already plan to take this leap, why not make it general enough to support
more complex use-cases?
this way we could rewrite the snippet above as

@dispatch
def log_to_file(text: str, device: open_file):
    device.write(text)

where open_file isn't a type, but rather a "checker" that may also examine
the state. by default, type objects would check for inheritance (via a
special method), but checkers could extend this behavior.
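for instance, open_file (again just an illustrative name, written here as
a duck-typed sketch rather than a real proposal) could be as simple as:

def open_file(obj):
    # a checker, not a type: any writable file-like object that isn't closed
    return hasattr(obj, "write") and not getattr(obj, "closed", True)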
for efficiency purposes, we can have two decorators:
@type_dispatch - dispatches based on type only
@full_dispatch - dispatches based on type and state
bottom line -- we can't just look at the type of the object for dispatching,
overlooking its state. the state is meaningful, and we'd want the function
not to be called at all if the state of the object is wrong.
-tomer
> I was bitten by the urge to play with this today, and modified my
> previous "self" hack to handle "super" also, so that the following
> code works:
>
> class D (C):
>     @method
>     def sum(n):
>         return super.sum(n * 2) - self.base
>
> Posted as "evil2.py" here:
>
> http://www.lag.net/robey/code/surf/
>
> Because hacking "super" requires having the class object handy, this
> one needs a metaclass to do its magic, which is a shame. I guess if
> it was implemented inside the cpython compiler, it would be less of a
> problem.
BTW, a "super-only" version of this decortor (that I think could be
called "implement") has some more chances in Python. But think
belongs more to Python-Ideas list, ok?
--
EduardoOPadoan (eopadoan->altavix::com)
Bookmarks: http://del.icio.us/edcrypt
Blog: http://edcrypt.blogspot.com
Jabber: edcrypt at jabber dot org
ICQ: 161480283
GTalk: eduardo dot padoan at gmail dot com
MSN: eopadoan at altavix dot com
A while ago, I developed a small PyGtk programme that could dynamically
reload all the working callbacks and logic while the GUI was still
running. I could get away with this because of the flexible way Python
loads modules at runtime, but it ended up being a waste of time, as
implementing it took more time than actually using it saved. For sanity's
sake it quickly becomes clear that you almost never want to rely on being
able to refer to a half-initialized module. And wouldn't it be nice if
Python enforced this?
My suggestion is that module importing occur in a temporary local
namespace that exists only until the end of the module code is executed,
then a small function could copy everything from the temporary namespace
into the module object. The usual closure semantics would guarantee
that top-level functions could still call each other, but they would
effectively become immutable after the namespace wraps up. The 'global'
keyword could be used at the top level in a module to force it to be
defined in the module immediately, and to ensure internal references to
the object go through the module object.
This would be a big change in module import semantics, but should have
remarkably few consequences, as it really is an enforcement mechanism
for good style. The copying from the temporary namespace into the
module object would be a good place to insert a hook function to filter
which objects are actually published to the module. By default, you could
skip any object identified by a leading underscore.
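As a rough sketch of that wrap-up step (publish_namespace and the default
underscore filter here are purely hypothetical, not a concrete API
proposal):

import types

def publish_namespace(name, temp_namespace, publish_filter=None):
    # Copy the temporary namespace into a fresh module object,
    # skipping anything the filter rejects.
    if publish_filter is None:
        publish_filter = lambda key: not key.startswith('_')
    module = types.ModuleType(name)
    for key, value in temp_namespace.items():
        if publish_filter(key):
            setattr(module, key, value)
    return module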
In discussing the Function Annotations PEP on python-list, interoperability
between schemes came up again:
http://mail.python.org/pipermail/python-list/2006-December/420645.html
John Roth wrote:
> Third, it's half of a proposal. Type checking isn't the only use
> for metadata about functions/methods, classes, properties
> and other objects, and the notion that there are only going to
> be a small number of non-intersecting libraries out there is
> an abdication of responsibility to think this thing through.
This issue came up before in
http://mail.python.org/pipermail/python-3000/2006-August/002796.html
and a rather long thread followed. Here is the paragraph in the PEP that
needs updating, at the least:
There is no worry that these libraries will assign semantics at
random, or that a variety of libraries will appear, each with
varying semantics and interpretations of what, say, a tuple of
strings means. The difficulty inherent in writing annotation
interpreting libraries will keep their number low and their
authorship in the hands of people who, frankly, know what they're
doing.
The notion that libraries don't intersect should be stripped from the
PEP. The question in my mind is whether this PEP needs to take
responsibility for interoperability.
I contend that people who design an annotation-consuming library
that ends up intersecting several others will likely be capable of
finding a solution, even if not an ideal one, without a central mechanism,
and will be far better situated to define a solution for a central
mechanism.
Any thoughts?
Thanks
-Tony
Chris Rebert wrote:
> In Haskell, foo `baz` bar means (baz foo bar), which translates to
> baz(foo, bar) in Python. This allows Haskell programmers to use
> functions as infix operators.
> If I recall correctly, in Py3k, enclosing something in backticks will no
> longer cause it to be repr()-ed, leaving the backtick without a meaning
> in Python.
Perhaps you're not aware of the "best hack of 2005", which works in
current Python and goes a step further by allowing you to even
"override" the operator's delimiter:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/384122
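The core of the trick is roughly this (a stripped-down sketch in the
spirit of the recipe, not the recipe verbatim):

class Infix(object):
    def __init__(self, func):
        self.func = func
    def __ror__(self, left):               # handles:  left |op
        return Infix(lambda right: self.func(left, right))
    def __or__(self, right):               # handles:  op| right
        return self.func(right)

x = Infix(max)
assert (2 |x| 5) == 5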
George
In Haskell, foo `baz` bar means (baz foo bar), which translates to
baz(foo, bar) in Python. This allows Haskell programmers to use
functions as infix operators.
If I recall correctly, in Py3k, enclosing something in backticks will no
longer cause it to be repr()-ed, leaving the backtick without a meaning
in Python.
Thus, I propose one of the following as the new use for the backtick (`):
[Note: In both, the characters between the backticks must be a valid
Python identifier.]
(A) `baz` is treated as an operator, named "baz", just as / is "div".
foo `baz` bar thus causes python to try to call foo.__baz__(bar), and
failing that, bar.__rbaz__(foo), and if both those fail, raise
TypeError. This is, if I understand correctly, how the builtin operators
work.
(B) `baz` is a special way to call a callable. foo `baz` bar is
translated to baz(foo, bar), with the standard lookup rules for
resolving "baz".
Example use cases, stolen from Haskell: The Craft of Functional Programming:
2 `max` 5 => 5
7 `cons` tail => ConsCell(val=7, next=tail)
matrix1 `crossproduct` matrix2 => cross-product of the matrices
[1, 2, 3] `zip` ['a', 'b', 'c'] => [[1, 'a'], [2, 'b'], [3, 'c']]
I believe that this would improve the readability of code, such as
Numeric, without going off the deep end and offering programmable syntax.
- Chris Rebert
I'd like to propose annotations and docstrings on attributes for Python 3000,
with the following grammar and properties:
expr_stmt: test (':' test ['=' (yield_expr|testlist)] |
                 augassign (yield_expr|testlist) |
                 [',' testlist] ('=' (yield_expr|testlist))*
                )
* annotations can appear for attributes that are not defined yet
* code to generate and populate __annotations__ and __attrdoc__
would appear for all modules and class bodies, not for functions.
* attribute annotations allow a name as target only
* attribute annotations without assignment are illegal for functions
* attribute annotations with assignment should probably be illegal
for functions
* docstring annotations only apply to the first target, and only if
it is a name
* docstring annotations do not apply to augmented assignments
* docstring and annotations on functions do not get populated in
__annotations__
* the class docstring is not reused as a function docstring
The basic rationale for annotations on attributes is completeness with
PEP 3107. I'm proposing attribute docstrings as well because I think it's
preferable to have a spot for documentation that isn't tied to annotations,
like functions' __doc__. Also, attribute docstrings as specified look
similar to function docstrings; both have statements consisting of only a
string that is taken as documentation.
Here is an interactive session and the code that might be generated.
What do you think?
Thanks
-Tony
>>> class X:
...     "class docstring"
...     foo: 1 = 1
...     bar: 2
...     "attribute docstring"
...     attr = None
...     "another attribute docstring"
...     fields = __slots__ = ['fields', 'attr']
...     "docstring ignored"
...     x, y = 1, 2
...
>>> X.__attrdoc__
{'fields': 'another attribute docstring', 'attr': 'attribute docstring'}
>>> X.__annotations__
{'foo': 1, 'bar': 2}
>>> X.foo
1
>>> X.bar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'X' has no attribute 'bar'
>>> def f():
...     x: 1
...
  File "<stdin>", line 2
SyntaxError: annotation without assignment not allowed in function
2 0 LOAD_NAME 0 (__name__)
3 STORE_NAME 1 (__module__)
6 BUILD_MAP 0
9 STORE_NAME 2 (__annotations__)
12 BUILD_MAP 0
15 STORE_NAME 3 (__attrdoc__)
3 18 LOAD_CONST 0 ('class docstring')
21 STORE_NAME 4 (__doc__)
5 24 LOAD_CONST 1 ('attribute docstring')
27 LOAD_NAME 3 (__attrdoc__)
30 LOAD_CONST 2 ('attr')
33 STORE_SUBSCR
34 LOAD_NAME 5 (None)
37 STORE_NAME 6 (attr)
6 40 LOAD_CONST 3 (1)
44 LOAD_CONST 3 (1)
48 STORE_NAME 7 (foo)
51 LOAD_NAME 2 (__annotations__)
54 LOAD_CONST 4 ('foo')
57 STORE_SUBSCR
7 58 LOAD_CONST 5 (2)
61 LOAD_NAME 2 (__annotations__)
64 LOAD_CONST 6 ('bar')
67 STORE_SUBSCR
9 68 LOAD_CONST 7 ('another attribute
docstring')
71 LOAD_NAME 3 (__attrdoc__)
74 LOAD_CONST 8 ('fields')
77 STORE_SUBSCR
78 LOAD_CONST 8 ('fields')
81 LOAD_CONST 2 ('attr')
84 BUILD_LIST 2
87 DUP_TOP
88 STORE_NAME 8 (fields)
91 STORE_NAME 9 (__slots__)
94 LOAD_LOCALS
95 RETURN_VALUE
To Tony and Kay, my short answer is: use __returns__.
Moving to python-ideas as per Guido's request:
"Guido van Rossum" <guido(a)python.org> wrote:
> This is sufficiently controversial that I believe it ought to go to
> python-ideas first. If it comes to a PEP it should be a separate one
> from PEP 3107.
> On 1/1/07, Talin <talin(a)acm.org> wrote:
> > Tony Lownds wrote:
> > >> From: Tony Lownds <tony(a)pagedna.com>
> > > What do people here think?
> > 1) Normally, we don't name operators based on their shape - we don't
> > call '/' the __slash__ operator, for example, nor do we call '|' the
> > "__vbar__" operator.
Certainly, but those two operators, and basically every other operator
used in Python have long-standing semantics in basically every language
that Python is even remotely related to.
> > 2) I think that all operators should have a "suggested semantic". When
> > someone overloads the '+' operator, it's a good bet that the meaning of
> > the overload has something to do with addition or accumulation in a
> > general sense. This won't *always* be true, but it will be true often
> > enough.
I don't buy your "suggested semantic" argument. And even if I did,
operator overloading allows people to choose the semantics of operations
for themselves; suggesting a semantic for an operation would be a set of
documentation that would never be read, and if it was read, ignored.
> > But an arbitrary operator with no guidelines as to what it means is
> > anyone's guess; It means that when we see a '->' operator embedded in
> > the code, we have no idea what is being said.
Ahh, but in this case there is precisely one place where the '->'
operator is planned to appear in Py3k:

def <name>(<arglist with or without annotations>) -> <annotation>:
    <body>
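For example (the annotation objects here are arbitrary strings, chosen
only to show the shape of the syntax):

def scale(value: "a number", factor: "a number") -> "the scaled result":
    return value * factor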
In that sense, we don't need a fcn_obj.__becomes__(<annotation>) method,
and that wasn't what the discussion was about; it was "what should the
attribute be called for this already agreed-upon *function annotation*?"
Now, because this *particular* annotation was created to allow for the
annotation of "returns", I agree with Kay's last suggestion, the
attribute should be called __returns__, as that is *exactly* what the
annotation was meant to convey.
> > From an HCI perspective, punctuation symbols improve code readability
> > only if their meanings are familiar to the reader; An operator whose
> > meaning is constantly changing is a hindrance to readability rather than
> > a help.
Claiming "we want a particular operation to always refer to the same
method/attribute" is only applicable if a particular operation has a
chance of meaning more than one thing. Currently it means *exactly* one
thing in the context of Python, 'this function returns X', so in my
opinion, your argument isn't applicable.
If you can manage to convince more people in python-ideas of the value of
arbitrary operators (as per your previous message(s) on the subject), and/or
you can convince Guido to say "I would like more operators in Python", then
your argument is applicable.
However, I don't believe that you will be able to convince Guido that
the large set of operators that you have previously posted about would
be a good idea, and I certainly don't believe it would happen with
sufficient time to make it into the first Py3k release.
- Josiah