Hi.
[Mark Hammond]
> The point isn't about my suffering as such. The point is more that
> python-dev owns a tiny amount of the code out there, and I don't believe we
> should put Python's users through this.
>
> Sure - I would be happy to "upgrade" all the win32all code, no problem. I
> am also happy to live in the bleeding edge and take some pain that will
> cause.
>
> The issue is simply the user base, and giving Python a reputation of not
> being able to painlessly upgrade even dot revisions.
I agree with all this.
[As I imagined, explicit syntax did not catch on and would require
a lot of discussion.]
[GvR]
> > Another way is to use special rules
> > (similar to those for class defs), e.g. having
> >
> > <frag>
> > y=3
> > def f():
> >     exec "y=2"
> >     def g():
> >         return y
> >     return g()
> >
> > print f()
> > </frag>
> >
> > # prints 3.
> >
> > Is that confusing for users? Maybe they will more naturally expect 2
> > as the outcome (given nested scopes).
>
> This seems the best compromise to me. It will lead to the least
> broken code, because this is the behavior that we had before nested
> scopes! It is also quite easy to implement given the current
> implementation, I believe.
>
> Maybe we could introduce a warning rather than an error for this
> situation though, because even if this behavior is clearly documented,
> it will still be confusing to some, so it is better if we outlaw it in
> some future version.
>
Yes, this would be easy to implement, but more confusing situations can arise:
<frag>
y=3
def f():
    y=9
    exec "y=2"
    def g():
        return y
    return y, g()
print f()
</frag>
What should this print? Unlike class-def scopes, the situation does not
admit a canonical solution.
or
<frag>
def f():
    from foo import *
    def g():
        return y
    return g()
print f()
</frag>
[Mark Hammond]
> > This probably won't be a very popular suggestion, but how about pulling
> > nested scopes (I assume they are at the root of the problem)
> > until this can be solved cleanly?
>
> Agreed. While I think nested scopes are kinda cool, I have lived without
> them, and really without missing them, for years. At the moment the cure
> appears worse than the symptoms in at least a few cases. If nothing else,
> it compromises the elegant simplicity of Python that drew me here in the
> first place!
>
> Assuming that people really _do_ want this feature, IMO the bar should be
> raised so there are _zero_ backward compatibility issues.
I won't say anything about pulling nested scopes (I don't think my opinion
can change things in this respect), but I must insist that without explicit
syntax, IMO raising the bar has too high an implementation cost (both
performance and complexity) or creates confusion.
[Andrew Kuchling]
> >Assuming that people really _do_ want this feature, IMO the bar should be
> >raised so there are _zero_ backward compatibility issues.
>
> Even at the cost of additional implementation complexity? At the cost
> of having to learn "scopes are nested, unless you do these two things
> in which case they're not"?
>
> Let's not waffle. If nested scopes are worth doing, they're worth
> breaking code. Either leave exec and from..import illegal, or back
> out nested scopes, or think of some better solution, but let's not
> introduce complicated backward compatibility hacks.
IMO breaking code would be OK if we issue warnings today and implement
nested scopes issuing errors tomorrow. But this is simply a statement
about principles and the impression it raises.
IMO import * in an inner scope should end up being an error;
I'm not sure about 'exec'.
We will need a final BDFL statement.
regards, Samuele Pedroni.
I thought it would be nice to try to improve the mimetypes module by having
it, on Windows, query the Registry to get the mapping of filename extensions
to media types, since the mimetypes code currently just blindly checks
posix-specific paths for httpd-style mapping files. However, it seems that the
way to get mappings from the Windows registry is excessively slow in Python.
I'm told that the reason has to do with the limited subset of APIs that are
exposed in the _winreg module: apparently EnumKey(key, index) queries for
the entire list of subkeys for the given key every time you call it, or
something along those lines. Whatever the situation is, the code I tried
below is far slower than I think it ought to be.
Does anyone have any suggestions (besides "write it in C")? Could _winreg
possibly be improved to provide an iterator or better interface to get the
subkeys? (or certain ones? There are a lot of keys under HKEY_CLASSES_ROOT,
and I only need the ones that start with a period). Should I file this as a
feature request?
Thanks
-Mike
from _winreg import HKEY_CLASSES_ROOT, OpenKey, EnumKey, QueryValueEx

i = 0
typemap = {}
try:
    while 1:
        subkeyname = EnumKey(HKEY_CLASSES_ROOT, i)
        try:
            subkey = OpenKey(HKEY_CLASSES_ROOT, subkeyname)
            if subkeyname[:1] == '.':
                data = QueryValueEx(subkey, 'Content Type')[0]
                print subkeyname, '=', data
                typemap[subkeyname] = data  # data will be unicode
        except EnvironmentError:
            # note: "except EnvironmentError, WindowsError" would bind the
            # exception to the name WindowsError, masking the builtin
            pass
        i += 1
except WindowsError:
    # EnumKey raises WindowsError once the index runs past the last subkey
    pass
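For what it's worth, here is a minimal sketch of the kind of iterator
interface I have in mind, as a generator wrapped around the existing
EnumKey call (the helper name is made up). By itself it wouldn't fix the
per-call cost, but it shows the shape of the API:

    from _winreg import HKEY_CLASSES_ROOT, EnumKey

    def subkeys(key):
        """Generate the subkey names of an open registry key (hypothetical)."""
        i = 0
        while 1:
            try:
                yield EnumKey(key, i)
            except WindowsError:  # raised once the index runs past the end
                return
            i += 1

    dotted = [name for name in subkeys(HKEY_CLASSES_ROOT)
              if name[:1] == '.']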
I've noticed several times now, in both debug and release builds, that
if I run regrtest.py with -uall, *sometimes* it just stops after
running test_compiler:
$ python_d regrtest.py -uall
test_grammar
test_opcodes
...
test_compare
test_compile
test_compiler
$
There's no indication of error, it just ends. It's not consistent.
It happened once when I was running with -v, and test_compiler's output ended here:
...
compiling C:\Code\python\lib\test\test_operator.py
compiling C:\Code\python\lib\test\test_optparse.py
compiling C:\Code\python\lib\test\test_os.py
compiling C:\Code\python\lib\test\test_ossaudiodev.py
compiling C:\Code\python\lib\test\test_parser.py
In particular, there's no
Ran M tests in Ns
output, so it doesn't look like unittest (let alone regrtest) ever got
control back.
Hmm. os.listdir() is in sorted order on NTFS, so test_compiler should
be chewing over a lot more files after test_parser.py.
*This* I could blame on a blown C stack -- although in that case I'd
expect a much nastier symptom than just premature termination.
Anyone else?
Before it's too late and the API gets frozen, I would like to propose an
alternate implementation for PEP292 that exposes two functions instead
of two classes.
Current way:  print Template('Turn $direction') % dict(direction='right')

Proposed:     print dollarsub('Turn $direction', dict(direction='right'))
          or: print dollarsub('Turn $direction', direction='right')
My main issue with the current implementation is that we get no leverage
from using a class instead of a function. Though the API is fairly
simple either way, it is easier to learn and document functions than
classes. We gain nothing from instantiation -- the underlying data
remains immutable and no additional state is attached. The only new
behavior is the ability to apply the mod operator. Why not do it in one
step?
I had thought a possible advantage of classes was that they could be
usefully subclassed. However, a little dickering around showed that to
do anything remotely interesting (beyond changing the pattern alphabet)
you have to rewrite everything by overriding both the method and the
pattern. Subclassing gained you nothing, but added a little bit of
complexity. A couple of simple exercises show this clearly: write a
subclass using a different escape character or one using dotted
identifiers for attribute lookup in the local namespace -- either way
subclasses provide no help and only get in the way.
One negative effect of the class implementation is that it inherits from
unicode and always returns a unicode result even if all of the inputs
(mapping values and template) are regular strings. With a function
implementation, that can be avoided (returning unicode only if one of
the inputs is unicode).
The function approach also makes it possible to have keyword arguments
(see the example above) as well as a mapping. This isn't a big win, but
it is nice to have and reads well in code that is looping over multiple
substitutions (mailmerge style):
    for girl in littleblackbook:
        print dollarsub(loveletter, name=girl[0].title(),
                        favoritesong=girl[3])
Another minor advantage for a function is that it is easier to look up in
the reference. If a reader sees the % operator being applied and looks
it up in the reference, it is going to steer them in the wrong
direction. This is doubly true if the Template instantiation is remote
from the operator application.
Summary for functions:
* more appropriate when there is no state
* no unnecessary instantiation
* can be applied in a single step
* a little easier to learn/use/document
* doesn't force result to unicode
* allows keyword arguments
* easy to find in the docs
Raymond
----------- Sample Implementation -------------
import re

# The placeholder pattern is assumed below; this is a sketch modeled on
# the PEP 292 reference implementation:
_pattern = re.compile(r"""\$(?:
    (?P<escaped>\$)                |   # $$ escapes to a single $
    (?P<named>[_a-z][_a-z0-9]*)    |   # $identifier
    {(?P<braced>[_a-z][_a-z0-9]*)} |   # ${identifier}
    (?P<bogus>)                        # ill-formed delimiter expression
    )""", re.IGNORECASE | re.VERBOSE)

def dollarsub(template, mapping=None, **kwds):
    """A function for supporting $-substitutions."""
    typ = type(template)
    if mapping is None:
        mapping = kwds
    def convert(mo):
        escaped, named, braced, bogus = mo.groups()
        if escaped is not None:
            return '$'
        if bogus is not None:
            raise ValueError('Invalid placeholder at index %d' %
                             mo.start('bogus'))
        val = mapping[named or braced]
        return typ(val)
    return _pattern.sub(convert, template)
def safedollarsub(template, mapping=None, **kwds):
    """A function for $-substitutions.

    This function is 'safe' in the sense that you will never get
    KeyErrors if there are placeholders missing from the interpolation
    dictionary.  In that case, you will get the original placeholder
    in the value string.
    """
    typ = type(template)
    if mapping is None:
        mapping = kwds
    def convert(mo):
        escaped, named, braced, bogus = mo.groups()
        if escaped is not None:
            return '$'
        if bogus is not None:
            raise ValueError('Invalid placeholder at index %d' %
                             mo.start('bogus'))
        if named is not None:
            try:
                return typ(mapping[named])
            except KeyError:
                return '$' + named
        try:
            return typ(mapping[braced])
        except KeyError:
            return '${' + braced + '}'
    return _pattern.sub(convert, template)
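A quick interactive sketch of the proposed calls (the output assumes the
placeholder pattern sketched above):

    >>> dollarsub('Turn $direction', direction='right')
    'Turn right'
    >>> safedollarsub('Hello $name, your $item is ready', name='Guido')
    'Hello Guido, your $item is ready'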
The thread about bytes() is about a Python 3.0 feature. Guido's
presentations have mentioned various changes he'd like to make in 3.0,
but there's no master list of things that would change.
I think it would be useful to have such a list, because it would focus
our language development effort on changes that are steps toward 3.0, and
maybe people can invent ways to smooth the transition. I've started a
list in the Wiki at http://www.python.org/moin/Python3.0 , but should
it be a PEP? (PEP 3000, perhaps?)
--amk
A Yet Simpler Proposal, modifying that of PEP 292
I propose that the Template module not use $ to set off
placeholders; instead, placeholders are delimited by braces {}.
The following rules for {}-placeholders apply:
1. {{ and }} are escapes; they are replaced with a single { or }
respectively.
2. {identifier} names a substitution placeholder matching a
mapping key of "identifier". By default, "identifier" must
spell a Python identifier as defined in Identifiers and
Keywords[1].
No other characters have special meaning.
If the left-brace { is unmatched, appears at the end of the
string, or is followed by any non-identifier character, a
ValueError will be raised at interpolation time[2].
If a single, unmatched right-brace } occurs in the string, a
ValueError will be raised at interpolation time. This avoids
ambiguity: did the user want a single right-brace, or did they
inadvertently omit the left-brace? This will also cause a
probably erroneous "{foo}}" or "{{foo}" to raise a ValueError.
Rationale
There are several reasons for preferring the paired delimiters {}
to a single prefixed $:
1. The placeholder name stands out more clearly from its
surroundings, due to the presence of a closing delimiter, and
also to the fact that the braces bear less resemblance to any
alphabetic characters than the dollar sign:
"Hello, {name}, how are you?" vs "Hello, $name, how are you?"
2. Only two characters have special meanings in the string, as
opposed to three. Additionally, dollar signs are expected to
be more often used in templated strings (e.g. for currency
values) than braces:
"The {item} costs ${price}." vs "The $item costs $$$price."
3. The placeholder syntax is consistent, and does not change even
when valid identifier characters follow the placeholder but
are not part of the placeholder:
"Look at the {plural}s!" vs "Look at the ${plural}s!"
4. The template substitution could be changed in future to
support dotted names without breaking existing code. The
example below would break if the $-template were changed to
allow dotted names:
"Evaluate {obj}.__doc__" vs "Evaluate $obj.__doc__"
There are two drawbacks to the proposal when compared with
$-templates:
1. An extra character must be typed for each placeholder in the
common case:
"{name}, welcome to {site}!" vs "$name, welcome to $site!"
2. Templated strings containing braces become more complicated:
"dict = {{'{key}': '{value}'}}" vs "dict = {'$key': '$value'}"
The first is not a real issue; the extra closing braces needed
for the placeholder when compared with the number of other
characters in the templated string will usually be insignificant.
Furthermore, the {}-placeholders require fewer characters to be
typed in the less common case when valid identifier characters
follow the placeholder but are not part of it.
The need for braces in a templated string is not expected to
occur frequently. Because of this, the second drawback is
considered of minor importance.
Reference Implementation
If the general nature of feedback on this proposal is positive,
or expressive of interest in an implementation, then a reference
implementation will be created forthwith.
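In the meantime, here is a minimal sketch of what such an
implementation might look like (the function name and pattern details
are illustrative only, not part of the proposal):

    import re

    _brace_pattern = re.compile(r"""
        ({{)                          |   # rule 1: escaped open brace
        (}})                          |   # rule 1: escaped close brace
        {(?P<named>[_a-z][_a-z0-9]*)} |   # rule 2: {identifier}
        (?P<bogus>[{}])                   # any other brace is an error
        """, re.IGNORECASE | re.VERBOSE)

    def bracesub(template, mapping):
        def convert(mo):
            if mo.group(1):
                return '{'
            if mo.group(2):
                return '}'
            if mo.group('bogus') is not None:
                raise ValueError('Invalid placeholder at index %d'
                                 % mo.start())
            return mapping[mo.group('named')]  # values assumed to be strings
        return _brace_pattern.sub(convert, template)

The ValueError cases fall out of the alternation order: any brace not
consumed by an escape or a well-formed placeholder matches the bogus
group, which covers the unmatched, trailing, "{foo}}", and "{{foo}"
cases described above.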
References and Notes
[1] Identifiers and Keywords
http://www.python.org/doc/current/ref/identifiers.html
[2] The requirement for interpolation-time error raising is the
same as in PEP 292. Although not a part of this proposal, I
suggest that it would be better if the error occurred when the
Template instance is created.
It is worth noting that the PEP 292 reference implementation
(in python/nondist/sandbox/string/string/template.py) does
not fully conform to the PEP as regards raising errors for
invalid template strings:
>>> t = Template("This should cause an error: $*")
>>> t % {}
u'This should cause an error: $*'
>>> t = Template("This also should cause an error: $")
>>> t % {}
u'This also should cause an error: $'
> > * doesn't force result to unicode
[Aahz]
> This is the main reason I'm +0, pending further arguments.
Employing class logic for a function problem is my main reason. Any
function can be wrapped in class logic and use an overloaded operator
(string.template for example) -- it's invariably the wrong thing to do
for reasons of complexity, separation of instantiation and application,
and the issues that always arise when an operator is overloaded (a good
technique that should only be applied sparingly).
[Aahz]
> OTOH, I also
> like using %, so you'd have to come up with more points to move me
> beyond +0.
The reasons center around the remoteness of instantiation from
application, the differences from current uses of %, and an issue with
the % operator itself.
When application is remote from instantiation (for instance, when a
template argument is supplied to a function), we get several problems.
Given a line like "r = t % m", it is hard to verify that the code is
correct. Should there be KeyError trapping? Can m be a tuple? Is t a
%template, a safetemplate, or dollartemplate? Is the result a str or
Unicode object? Should the decision of safe vs regular substitution be
made at the point of instantiation or application?
The temptation will be to reuse existing code that uses the % operator
which unfortunately is not the same (especially with respect to applying
tuples, return types, and auto-stringizing). The % operator is hard to
search for in the docs and has precedence issues arising from its
primary use for modulo arithmetic. Also, using a function makes it
possible to use both % formatting and $ formatting (I do have a use case
for this). Further, the % operator mnemonically only makes sense with %
identifier tags -- it makes less sense with $ tags.
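For example, a hypothetical sketch of mixing the two styles with the
proposed function (the variable names are invented for illustration):

    print dollarsub('%d new messages in $mailbox' % msgcount,
                    mailbox=boxname)

The %-step fills in the computed count up front; the $-substitution is
deferred until the per-user data is known.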
Whatever answer is chosen, it should be forward looking and consider
that others will want to add functionality (local namespace lookups for
example) and that some poor bastards are going to waste time figuring
out how to subclass the current implementation. In time, there will
likely be more than two of these -- do you want more classes or more
functions?
Raymond
From: Aahz
>On Mon, Aug 30, 2004, Peter Harris wrote:
>>
>> I have updated PEP309 a little,
[...]
>
>> I hope there will still be time to squeeze it in before 2.4 beta 1, and
>> if there is any reason why not (apart from there has to be a cut-off
>> point somewhere, which I accept), please someone let me know what work
>> is still needed to make it ready to go in.
> What's PEP309?
PEP 309 is the "Partial Function Application" proposal. (It used to be
referred to as "Currying", but that was agreed not to be an accurate name.)
Paul
> List some possible reasons why arriving at consensus about
> decorators has been so hard (or impossible) to achieve
Thanks. I think that was an important contribution to the discussion.
At this point, looking at the meta-discussion is likely the best way to
help cut through the current morass.
+There is no one clear reason why this should be so, but a
+few problems seem especially problematic.
A possible social issue is that decorators can be used in a tremendous
variety of ways, each of which is only needed or appreciated by small,
disjoint groups of users. For instance, applications to ctypes have
unique needs that not understood or shared by others. Some users want to
use decorators for metaclass magic that scares the socks off of the
rest.
> + Almost everyone agrees that decorating/transforming a function at
> + the end of its definition is suboptimal.
One other point of agreement is that no one likes having to write the
function name three times:
    def f():
        ...
    f = deco(f)
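For comparison, the @-prefix proposal moves the transformation up front
and names the function only once:

    @deco
    def f():
        ...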
> +* Syntactic constraints.
There is an existing (though clumsy) syntax. Most of the proposals are
less flexible but more elegant. This trade-off has created much more
consternation than if there were no existing ways to apply decorations.
+* Overall unfamiliarity with the concept. For people who have a
+ passing acquaintance with algebra (or even basic arithmetic) or have
+ used at least one other programming language, much of Python is
+ intuitive. Very few people will have had any experience with the
+ decorator concept before encountering it in Python. There's just no
+ strong preexisting meme that captures the concept.
My take on this one is that there are some existing memes from C# and
Java and that in the future they will be more widely known.
However, there are no good existing, thought-out concepts of what
exactly decorators should do in terms of preserving docstrings,
attributes, using parameters, etc. Most have only a single application
in mind and the full implications (and possible applications) of a given
syntax are shrouded in the mists of the future.
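As a concrete illustration of the docstring/attribute question, here is
a sketch of the bookkeeping every decorator author currently has to
reinvent by hand (nothing standard exists for this yet):

    def deco(func):
        def wrapper(*args, **kwds):
            # ... transformed behavior goes here ...
            return func(*args, **kwds)
        # the bookkeeping that is easy to forget
        # (__name__ is assignable as of 2.4):
        wrapper.__name__ = func.__name__
        wrapper.__doc__ = func.__doc__
        wrapper.__dict__.update(func.__dict__)
        return wrapper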
Like metaclasses, everyone knew they were powerful when they were put
in, but no one knew how they would be used or whether they were
necessary. In fact, two versions later, we still don't know those
answers.
Raymond
Tim Peters <tim.peters(a)gmail.com>:
> [François Revol]
> > I'm updating the BeOS port of python to include the latest version
> > in
> > Zeta, the next incarnation of BeOS.
> >
> > I'm having some problem when enabling pymalloc:
> > [zapped]
> > 0 bytes originally requested
> > The 4 pad bytes at p-4 are FORBIDDENBYTE, as expected.
> > The 4 pad bytes at tail=0x80010fb8 are not all FORBIDDENBYTE
> > (0xfb):
> > at tail+0: 0xdb *** OUCH
> > at tail+1: 0xdb *** OUCH
> > at tail+2: 0xdb *** OUCH
> > at tail+3: 0xdb *** OUCH
> > The block was made by call #3688627195 to debug malloc/realloc.
>
> No it wasn't. The four bytes following the 4 trailing 0xfb hold the
> call number, and they're apparently corrupt too.
Eh...
>
> > Fatal Python error: bad trailing pad byte
> >
> > indeed, there seem to be 4 deadblock marks between the forbidden
> > ones,
> > while the len is supposed to be null:
>
> That's reliable. If there actually was a request for 0 bytes (that
> is, assuming this pointer isn't just pointing at a random memory
> address), the debug pymalloc allocates 16 bytes for it, filled with
>
> 00000000 fbfbfbfb fbfbfbfb serial
>
> where "serial" is a big-endian 4-byte "serial number" uniquely
> identifying which call to malloc (or realloc) this was. The address
> of the second block of fb's is returned.
Yes, that's what I deduced from the code of pymalloc.
>
> > python:dm 0x80010fb8-8 32
> > 80010fb0 00000000 fbfbfbfb dbdbdbdb fbfbdbdb
> > .................
>
>
> > 80010fc0 0100fbfb 507686ef 04000000 fbfbfbfb
> > .......vP........
> > 80010fd0 8013cbc8 fbfbfbfb 44ee0100 ffed0100
> > ............D....
> >
> > Any clue ?
>
> Try gcc without -O. Nobody has reported anything like this before --
> you're in for a lot of fun <wink>.
>
OK, I tried -O0 -g, but got the same result.
I suspect it might be a bad interaction with fork(), as it crashes in a
child, quite badly: no images are reported as loaded in the team (= no
binaries are mapped in the process), although the areas (= pages) are
there.
Now, I don't see why malloc itself would give such a result; it's
pymalloc which places those marks, so whatever malloc does, it wouldn't
place them 4 bytes from each other for no reason, or report 0 bytes
where 4 are allocated.
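(A side note: the reported call number is just more of the same debris;
a quick check in Python:

    >>> '%x' % 3688627195
    'dbdbfbfb'

which is consistent with Tim's remark that the serial bytes are corrupt
too.)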
François.