I don't know if this would make sense to try to push to 2.6 or 3.0. I
was talking with some people about how the ability to split or replace
on multiple substrings could be added to Python, without adding new
methods or having ugly tuple-passing requirements like s.split(('foo',
'bar'), 4). This idea came to mind, so I wanted to toss it out there
for scrutiny. It would be a builtin, but it can be implemented in
Python like this, basically:
class oneof(list):
    def __init__(self, *args):
        list.__init__(self)
        self.extend(args)
    def __eq__(self, o):
        # compares equal to any single value it contains
        return o in self

assert 'bar' == oneof('bar', 'baz')
In addition to the new type, .replace, .split, and other appropriate
functions would be updated to accept it as the substring argument to
locate, matching any one of the substrings it contains. I've
asked a few people and gotten good responses on the general idea so
far, but what do you all think?
1) Would the multi-substring operations be welcomed?
2) Could this be a good way to add those to the API without breaking things?
3) What version would it target?
4) Which functions and methods should support this, or might generally
gain value from a similar solution?
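To make the proposal concrete, here is a rough pure-Python sketch of a
split that accepts a oneof. The split_oneof name is made up for the
example, and it happens to lean on re internally; the point is only the
user-facing behaviour, not the implementation:

import re

def split_oneof(s, sep, maxsplit=0):
    # Approximates what s.split(oneof(...), n) might do: split on
    # whichever of the alternatives appears next in the string.
    if isinstance(sep, oneof):
        pattern = '|'.join(re.escape(alt) for alt in sep)
        return re.split(pattern, s, maxsplit=maxsplit)
    return s.split(sep, maxsplit or -1)

assert split_oneof('a-b_c', oneof('-', '_')) == ['a', 'b', 'c']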
--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/
On 1/23/07, Josiah Carlson <jcarlson(a)uci.edu> wrote:
> Whether it is a tuple being passed, or a "magic" container, I don't
> think it matters; though I would lean towards a tuple because it is 5
> fewer characters to type out, and one fewer data type to worry about.
I had talked to others and the consensus was against tuples on the
grounds of ugliness. s.split((a, b), c) wasn't a popular choice, but
s.split(oneof(a, b), c) reads better.
> This has been discussed before in python-dev, I believe the general
> consensus was that it would be convenient at times, but I also believe
> the general consensus was "use re";
It seems like we have a history of useful string operations moving from
"use re" to "don't use re", such as the relatively recent startswith and
endswith methods, and even split and replace themselves. I would like to
see fewer reasons for people to bother with regular expressions until
they actually need them. If we can provide a better way to get the job
done, that seems like a great idea.
ALSO
The oneof type isn't just a single-use thing. User code may often make
use of it, and other types could benefit, such as doing a lookup with
d[oneof(1, 2, 3)] (where order would matter for priority). I think this
semantic collection type would be very useful in a number of contexts
where we would currently just loop or duplicate code.
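To be clear, a plain dict couldn't do that lookup as-is (it would try to
hash the oneof first), so the mapping would have to cooperate. Here is a
purely hypothetical sketch of what I mean by priority order; OneofDict is
an illustrative name, not a proposal for a new builtin:

class OneofDict(dict):
    def __getitem__(self, key):
        if isinstance(key, oneof):
            for candidate in key:        # first listed key wins
                if dict.__contains__(self, candidate):
                    return dict.__getitem__(self, candidate)
            raise KeyError(key)
        return dict.__getitem__(self, key)

d = OneofDict({2: 'two', 3: 'three'})
assert d[oneof(1, 2, 3)] == 'two'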
--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/
Guido asked me to move it here. anyway, i might as well state my case
better.
in the formal definitions of object oriented programming, objects
are said to encapsulate state and behavior. behavior is largely dependent
on the type of the object, but the state is important nonetheless.
for example, file objects are stateful: they can be opened for reading-only,
writing-only, both, or be closed altogether. still, they are all instances
of the file type.
since generic functions/multi-dispatch come to help the programmer
by reducing boilerplate early type-checking code and providing a
more capable dispatch mechanism -- we can't overlook the state of the
object.
this early type-checking goes against the spirit of pure duck typing (which
i'm fond of), but is crucial when the code has side effects. in this case,
you can't just start executing the code and "hope it works", as the
resources (i.e., files) involved are modified. in this kind of code, you
want to check up front that everything is all right, *instead* of suddenly
getting an AttributeError/TypeError somewhere.
here's an example:
@dispatch
def copy(src: file, dst: file):
    while True:
        buf = src.read(1000)
        if not buf: break
        dst.write(buf)
suppose now that dst is mistakenly opened for reading.
this means src would have already been modified, while
dst.write is bound to fail. if src is a socket, for instance,
this would be destructive.
so if we already go as far as having multiple dispatch, which imposes
constraints on the arguments a function accepts, we might as well
base it on the type and state of the object, rather than only on its type.
we could say, for example:
@dispatch
def copy(src: file_for_reading, dst: file_for_writing):
    ...  # body as before

file_for_reading is not a type -- it's a checker. it may be defined as

def file_for_reading(obj: file):
    return obj.mode == "r" and not obj.closed
types, by default, would check using isinstance(), but custom
checkers could check for stateful requirements too.
and a note about performance: this check is required whether
it's done explicitly or by the dispatch mechanism, and since
most functions don't require so many overloads, i don't think
it's an issue.
besides, we can have a different decorator for dispatching
by type or by checkers, i.e., @dispatch vs @stateful_dispatch,
or something. the simple @dispatch would use a dictionary,
while the stateful version would use a loop.
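to illustrate the loop-based variant, here's a minimal sketch. the names
stateful_dispatch and _matches are made up for the example, it only
handles positional arguments, and annotations are taken in parameter
order -- a sketch of the idea, not a real implementation:

_overloads = {}

def _matches(check, arg):
    # a plain type is tested with isinstance(); any other callable is
    # treated as a state checker that returns True or False
    if isinstance(check, type):
        return isinstance(arg, check)
    return check(arg)

def stateful_dispatch(func):
    impls = _overloads.setdefault(func.__name__, [])
    impls.append((list(func.__annotations__.values()), func))
    def wrapper(*args):
        for checks, impl in impls:        # the loop mentioned above
            if len(checks) == len(args) and all(
                    _matches(c, a) for c, a in zip(checks, args)):
                return impl(*args)
        raise TypeError("no overload accepts these arguments")
    return wrapper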
-tomer
---------- Forwarded message ----------
From: tomer filiba <tomerfiliba(a)gmail.com>
Date: Jan 14, 2007 2:28 PM
Subject: multi-dispatch again
To: Python-3000(a)python.org
i just thought of a so-to-speak counter-example for ABCs... it's not really
a counter-example, but i believe it shows a deficiency in the concept.
theoretically speaking, objects are made of type and state. ABCs,
isinstance() and interfaces in general only check the type part. for example:
@dispatch
def log_to_file(text: str, device: file):
    device.write(text)

this will constrain the *type* of the device, but not its *state*.
practically speaking, i can pass a closed file, or a file open for
reading-only, and it would pass silently.
basing multi-dispatch on types is of course a leap forward, but if we
already plan to take this leap, why not make it general enough to support
more complex use-cases?
this way we could rewrite the snippet above as

@dispatch
def log_to_file(text: str, device: open_file):
    device.write(text)

where open_file isn't a type, but rather a "checker" that may also examine
the state. by default, type objects would check for inheritance (via a
special method), but checkers could extend this behavior.
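for instance, open_file (again just an illustrative name, written here as
a duck-typed sketch rather than a real proposal) could be as simple as:

def open_file(obj):
    # a checker, not a type: any writable file-like object that isn't closed
    return hasattr(obj, "write") and not getattr(obj, "closed", True)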
for efficiency purposes, we can have two decorators:
@type_dispatch - dispatches based on type only
@full_dispatch - dispatches based on type and state
bottom line -- we can't just look at the type of the object for dispatching,
overlooking its state. the state is meaningful, and we'd want the function
not to be called at all if the state of the object is wrong.
-tomer
> I was bitten by the urge to play with this today, and modified my
> previous "self" hack to handle "super" also, so that the following
> code works:
>
> class D (C):
>     @method
>     def sum(n):
>         return super.sum(n * 2) - self.base
>
> Posted as "evil2.py" here:
>
> http://www.lag.net/robey/code/surf/
>
> Because hacking "super" requires having the class object handy, this
> one needs a metaclass to do its magic, which is a shame. I guess if
> it was implemented inside the cpython compiler, it would be less of a
> problem.
BTW, a "super-only" version of this decortor (that I think could be
called "implement") has some more chances in Python. But think
belongs more to Python-Ideas list, ok?
--
EduardoOPadoan (eopadoan->altavix::com)
Bookmarks: http://del.icio.us/edcrypt
Blog: http://edcrypt.blogspot.com
Jabber: edcrypt at jabber dot org
ICQ: 161480283
GTalk: eduardo dot padoan at gmail dot com
MSN: eopadoan at altavix dot com
A while ago, I developed a small PyGtk programme that could dynamically
reload all the working callbacks and logic while the GUI was still
running. I could get away with this because of the flexible way Python
loads modules at runtime, but it ended up being a waste of time, as
implementing it took more time than actually using it saved. For sanity's
sake it quickly becomes clear that you almost never want to rely on being
able to refer to a half-initialized module. And wouldn't it be nice if
Python enforced this?
My suggestion is that module importing occur in a temporary local
namespace that exists only until the end of the module code is executed,
then a small function could copy everything from the temporary namespace
into the module object. The usual closure semantics would guarantee
that top-level functions could still call each other, but they would
effectively become immutable after the namespace wraps up. The 'global'
keyword could be used at the top level in a module to force it to be
defined in the module immediately, and to ensure internal references to
the object go through the module object.
This would be a big change in module import semantics, but should have
remarkably few consequences, as it really is an enforcement mechanism
for good style. The copying from the temporary namespace into the
module object would be a good place to insert a hook function to filter
which objects are actually published to the module. By default, you could
skip any object identified by a leading underscore.
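As a rough sketch of that wrap-up step (publish_namespace and the default
underscore filter here are purely hypothetical, not a concrete API
proposal):

import types

def publish_namespace(name, temp_namespace, publish_filter=None):
    # Copy the temporary namespace into a fresh module object,
    # skipping anything the filter rejects.
    if publish_filter is None:
        publish_filter = lambda key: not key.startswith('_')
    module = types.ModuleType(name)
    for key, value in temp_namespace.items():
        if publish_filter(key):
            setattr(module, key, value)
    return module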
In discussing the Function Annotations PEP on python-list, interoperability
between schemes came up again:
http://mail.python.org/pipermail/python-list/2006-December/420645.html
John Roth wrote:
> Third, it's half of a proposal. Type checking isn't the only use
> for metadata about functions/methods, classes, properties
> and other objects, and the notion that there are only going to
> be a small number of non-intersecting libraries out there is
> an abdication of responsibility to think this thing through.
This issue came up before in
http://mail.python.org/pipermail/python-3000/2006-August/002796.html
and a rather long thread followed. Here is the paragraph in the PEP that
needs updating, at the least:
There is no worry that these libraries will assign semantics at
random, or that a variety of libraries will appear, each with
varying semantics and interpretations of what, say, a tuple of
strings means. The difficulty inherent in writing annotation
interpreting libraries will keep their number low and their
authorship in the hands of people who, frankly, know what they're
doing.
The notion that libraries don't intersect should be stripped from the
PEP. The question in my mind is whether this PEP needs to take
responsibility for interoperability.
I contend that people who design an annotation-consuming library
that ends up intersecting several others will likely be capable of
finding a solution, even if not an ideal one, without a central mechanism,
and will be far better situated to define a solution for a central
mechanism.
Any thoughts?
Thanks
-Tony
Chris Rebert wrote:
> In Haskell, foo `baz` bar means (baz foo bar), which translates to
> baz(foo, bar) in Python. This allows Haskell programmers to use
> functions as infix operators.
> If I recall correctly, in Py3k, enclosing something in backticks will no
> longer cause it to be repr()-ed, leaving the backtick without a meaning
> in Python.
Perhaps you're not aware of the "best hack of 2005", which works in
current Python and goes a step further by allowing you to even
"override" the operator's delimiter:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/384122
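The core of the trick is roughly this (a stripped-down sketch in the
spirit of the recipe, not the recipe verbatim):

class Infix(object):
    def __init__(self, func):
        self.func = func
    def __ror__(self, left):               # handles:  left |op
        return Infix(lambda right: self.func(left, right))
    def __or__(self, right):               # handles:  op| right
        return self.func(right)

x = Infix(max)
assert (2 |x| 5) == 5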
George
In Haskell, foo `baz` bar means (baz foo bar), which translates to
baz(foo, bar) in Python. This allows Haskell programmers to use
functions as infix operators.
If I recall correctly, in Py3k, enclosing something in backticks will no
longer cause it to be repr()-ed, leaving the backtick without a meaning
in Python.
Thus, I propose one of the following as the new use for the backtick (`):
[Note: In both, the characters between the backticks must be a valid
Python identifier.]
(A) `baz` is treated as an operator, named "baz", just as / is "div".
foo `baz` bar thus causes python to try to call foo.__baz__(bar), and
failing that, bar.__rbaz__(foo), and if both those fail, raise
TypeError. This is, if I understand correctly, how the builtin operators
work.
(B) `baz` is a special way to call a callable. foo `baz` bar is
translated to baz(foo, bar), with the standard lookup rules for
resolving "baz".
Example use cases, stolen from Haskell: The Craft of Functional Programming:
2 `max` 5 => 5
7 `cons` tail => ConsCell(val=7, next=tail)
matrix1 `crossproduct` matrix2 => cross-product of the matrices
[1, 2, 3] `zip` ['a', 'b', 'c'] => [[1, 'a'], [2, 'b'], [3, 'c']]
I believe that this would improve the readability of code, such as
Numeric, without going off the deep end and offering programmable syntax.
- Chris Rebert
I'd like to propose annotations and docstrings on attributes for Python 3000,
with the following grammar and properties:
expr_stmt: test (':' test ['=' (yield_expr|testlist)] |
                 augassign (yield_expr|testlist) |
                 [',' testlist] ('=' (yield_expr|testlist))*
                )
* annotations can appear for attributes that are not defined yet
* code to generate and populate __annotations__ and __attrdoc__
would appear for all modules and class bodies, not for functions.
* attribute annotations allow a name as target only
* attribute annotations without assignment are illegal for functions
* attribute annotations with assignment should probably be illegal
for functions
* docstring annotations only apply to the first target, and only if
it is a name
* docstring annotations do not apply to augmented assignments
* docstring and annotations on functions do not get populated in
__annotations__
* the class docstring is not reused as a function docstring
The basic rationale for annotations on attributes is completeness with
PEP 3107. I'm proposing attribute docstrings as well because I think it's
preferable to have a spot for documentation that isn't tied to annotations,
like functions' __doc__. Also, attribute docstrings as specified look
similar to function docstrings; both have statements consisting of only a
string that is taken as documentation.
Here is an interactive session and the code that might be generated.
What do you think?
Thanks
-Tony
>>> class X:
...     "class docstring"
...     foo: 1 = 1
...     bar: 2
...     "attribute docstring"
...     attr = None
...     "another attribute docstring"
...     fields = __slots__ = ['fields', 'attr']
...     "docstring ignored"
...     x, y = 1, 2
...
>>> X.__attrdoc__
{'fields': 'another attribute docstring', 'attr': 'attribute docstring'}
>>> X.__annotations__
{'foo': 1, 'bar': 2}
>>> X.foo
1
>>> X.bar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'X' has no attribute 'bar'
>>> def f():
...     x: 1
...
  File "<stdin>", line 2
SyntaxError: annotation without assignment not allowed in function
2 0 LOAD_NAME 0 (__name__)
3 STORE_NAME 1 (__module__)
6 BUILD_MAP 0
9 STORE_NAME 2 (__annotations__)
12 BUILD_MAP 0
15 STORE_NAME 3 (__attrdoc__)
3 18 LOAD_CONST 0 ('class docstring')
21 STORE_NAME 4 (__doc__)
5 24 LOAD_CONST 1 ('attribute docstring')
27 LOAD_NAME 3 (__attrdoc__)
30 LOAD_CONST 2 ('attr')
33 STORE_SUBSCR
34 LOAD_NAME 5 (None)
37 STORE_NAME 6 (attr)
6 40 LOAD_CONST 3 (1)
44 LOAD_CONST 3 (1)
48 STORE_NAME 7 (foo)
51 LOAD_NAME 2 (__annotations__)
54 LOAD_CONST 4 ('foo')
57 STORE_SUBSCR
7 58 LOAD_CONST 5 (2)
61 LOAD_NAME 2 (__annotations__)
64 LOAD_CONST 6 ('bar')
67 STORE_SUBSCR
9 68 LOAD_CONST 7 ('another attribute
docstring')
71 LOAD_NAME 3 (__attrdoc__)
74 LOAD_CONST 8 ('fields')
77 STORE_SUBSCR
78 LOAD_CONST 8 ('fields')
81 LOAD_CONST 2 ('attr')
84 BUILD_LIST 2
87 DUP_TOP
88 STORE_NAME 8 (fields)
91 STORE_NAME 9 (__slots__)
94 LOAD_LOCALS
95 RETURN_VALUE
To Tony and Kay, my short answer is: use __returns__.
Moving to python-ideas as per Guido's request:
"Guido van Rossum" <guido(a)python.org> wrote:
> This is sufficiently controversial that I believe it ought to go to
> python-ideas first. If it comes to a PEP it should be a separate one
> from PEP 3107.
> On 1/1/07, Talin <talin(a)acm.org> wrote:
> > Tony Lownds wrote:
> > >> From: Tony Lownds <tony(a)pagedna.com>
> > > What do people here think?
> > 1) Normally, we don't name operators based on their shape - we don't
> > call '/' the __slash__ operator, for example, nor do we call '|' the
> > "__vbar__" operator.
Certainly, but those two operators, and basically every other operator
used in Python have long-standing semantics in basically every language
that Python is even remotely related to.
> > 2) I think that all operators should have a "suggested semantic". When
> > someone overloads the '+' operator, it's a good bet that the meaning of
> > the overload has something to do with addition or accumulation in a
> > general sense. This won't *always* be true, but it will be true often
> > enough.
I don't buy your "suggested semantic" argument. And even if I did,
operator overloading allows people to choose the semantics of operations
for themselves; suggesting a semantic for an operation would be a set of
documentation that would never be read, and if it was read, ignored.
> > But an arbitrary operator with no guidelines as to what it means is
> > anyone's guess; It means that when we see a '->' operator embedded in
> > the code, we have no idea what is being said.
Ahh, but in this case there is precisely one place where the '->'
operator is planned to appear in Py3k:

def <name>(<arglist with or without annotations>) -> <annotation>:
    <body>
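For example (the annotation objects here are arbitrary strings, chosen
only to show the shape of the syntax):

def scale(value: "a number", factor: "a number") -> "the scaled result":
    return value * factor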
In that sense, we don't need a fcn_obj.__becomes__(<annotation>) method,
and that wasn't what the discussion was about; it was "what should the
attribute be called for this already agreed-upon *function annotation*?"
Now, because this *particular* annotation was created to allow for the
annotation of "returns", I agree with Kay's last suggestion, the
attribute should be called __returns__, as that is *exactly* what the
annotation was meant to convey.
> > From an HCI perspective, punctuation symbols improve code readability
> > only if their meanings are familiar to the reader; An operator whose
> > meaning is constantly changing is a hindrance to readability rather than
> > a help.
Claiming "we want a particular operation to always refer to the same
method/attribute" is only applicable if a particular operation has a
chance of meaning more than one thing. Currently it means *exactly* one
thing in the context of Python, 'this function returns X', so in my
opinion, your argument isn't applicable.
If you can manage to convince more people in python-ideas of the value of
arbitrary operators (as per your previous message(s) on the subject), and/or
you can convince Guido to say "I would like more operators in Python", then
your argument is applicable.
However, I don't believe that you will be able to convince Guido that
the large set of operators that you have previously posted about would
be a good idea, and I certainly don't believe it would happen with
sufficient time to make it into the first Py3k release.
- Josiah