Perhaps this is more appropriate for python-list, but it looks like a
bug to me. Example code:
class A:
    def __str__(self):
        return u'\u1234'

'%s' % u'\u1234'  # this works
'%s' % A()        # this doesn't work
It will work if 'A' subclasses 'unicode', but that should not be
necessary, IMHO. Any reason why this shouldn't be fixed?
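For what it's worth, the failure is specific to byte-string interpolation having to convert the result of str(); under a str-is-unicode model (as in later Python versions) the very same class works directly, a minimal check:

```python
# Sketch under a str-is-unicode model (Python 3): '%s' calls str() on
# the object and the unicode result needs no byte conversion.
class A:
    def __str__(self):
        return '\u1234'

result = '%s' % A()   # no subclassing of the string type required
```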
Neil
Hi.
[Mark Hammond]
> The point isn't about my suffering as such. The point is more that
> python-dev owns a tiny amount of the code out there, and I don't believe we
> should put Python's users through this.
>
> Sure - I would be happy to "upgrade" all the win32all code, no problem. I
> am also happy to live in the bleeding edge and take some pain that will
> cause.
>
> The issue is simply the user base, and giving Python a reputation of not
> being able to painlessly upgrade even dot revisions.
I agree with all this.
[As I imagined, the explicit-syntax idea did not catch on and would
require a lot of discussion.]
[GvR]
> > Another way is to use special rules
> > (similar to those for class defs), e.g. having
> >
> > <frag>
> > y = 3
> > def f():
> >     exec "y=2"
> >     def g():
> >         return y
> >     return g()
> >
> > print f()
> > </frag>
> >
> > # prints 3.
> >
> > Is that confusing for users? Maybe they will more naturally expect 2
> > as the outcome (given nested scopes).
>
> This seems the best compromise to me. It will lead to the least
> broken code, because this is the behavior that we had before nested
> scopes! It is also quite easy to implement given the current
> implementation, I believe.
>
> Maybe we could introduce a warning rather than an error for this
> situation though, because even if this behavior is clearly documented,
> it will still be confusing to some, so it is better if we outlaw it in
> some future version.
>
Yes, this would be easy to implement, but more confusing situations can arise:
<frag>
y = 3
def f():
    y = 9
    exec "y=2"
    def g():
        return y
    return y, g()
print f()
</frag>
What should this print? Unlike class def scopes, this situation has no
canonical solution. Or consider:
<frag>
def f():
    from foo import *
    def g():
        return y
    return g()
print f()
</frag>
[Mark Hammond]
> > This probably won't be a very popular suggestion, but how about pulling
> > nested scopes (I assume they are at the root of the problem)
> > until this can be solved cleanly?
>
> Agreed. While I think nested scopes are kinda cool, I have lived without
> them, and really without missing them, for years. At the moment the cure
> appears worse then the symptoms in at least a few cases. If nothing else,
> it compromises the elegant simplicity of Python that drew me here in the
> first place!
>
> Assuming that people really _do_ want this feature, IMO the bar should be
> raised so there are _zero_ backward compatibility issues.
I'm not saying anything about pulling nested scopes (I don't think my opinion
can change things in this respect), but I must insist that, without explicit
syntax, IMO raising the bar has too high an implementation cost (both
performance and complexity) or creates confusion.
[Andrew Kuchling]
> >Assuming that people really _do_ want this feature, IMO the bar should be
> >raised so there are _zero_ backward compatibility issues.
>
> Even at the cost of additional implementation complexity? At the cost
> of having to learn "scopes are nested, unless you do these two things
> in which case they're not"?
>
> Let's not waffle. If nested scopes are worth doing, they're worth
> breaking code. Either leave exec and from..import illegal, or back
> out nested scopes, or think of some better solution, but let's not
> introduce complicated backward compatibility hacks.
IMO breaking code would be OK if we issued warnings today and implemented
nested scopes issuing errors tomorrow. But this is simply a statement
about principles and the impression raised.
IMO import * in an inner scope should end up being an error;
I'm not sure about 'exec'.
We will need a final BDFL statement.
regards, Samuele Pedroni.
I thought it would be nice to try to improve the mimetypes module by having
it, on Windows, query the Registry to get the mapping of filename extensions
to media types, since the mimetypes code currently just blindly checks
posix-specific paths for httpd-style mapping files. However, it seems that the
way to get mappings from the Windows registry is excessively slow in Python.
I'm told that the reason has to do with the limited subset of APIs that are
exposed in the _winreg module. I think it is that EnumKey(key, index) is
querying for the entire list of subkeys for the given key every time you call
it. Or something. Whatever the situation is, the code I tried below is way
slower than I think it ought to be.
Does anyone have any suggestions (besides "write it in C")? Could _winreg
possibly be improved to provide an iterator or better interface to get the
subkeys? (or certain ones? There are a lot of keys under HKEY_CLASSES_ROOT,
and I only need the ones that start with a period). Should I file this as a
feature request?
Thanks
-Mike
from _winreg import HKEY_CLASSES_ROOT, OpenKey, EnumKey, QueryValueEx

i = 0
typemap = {}
try:
    while 1:
        subkeyname = EnumKey(HKEY_CLASSES_ROOT, i)
        try:
            subkey = OpenKey(HKEY_CLASSES_ROOT, subkeyname)
            if subkeyname[:1] == '.':
                data = QueryValueEx(subkey, 'Content Type')[0]
                print subkeyname, '=', data
                typemap[subkeyname] = data  # data will be unicode
        except EnvironmentError:
            # WindowsError subclasses EnvironmentError; note that the
            # original "except EnvironmentError, WindowsError:" merely
            # bound the instance to the name WindowsError.
            pass
        i += 1
except WindowsError:
    # EnumKey raises WindowsError once the index runs past the last subkey
    pass
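As a shape for the requested iterator interface, any index-based enumerator can be wrapped in a generator; sketched here with a stand-in enumeration function (iter_enum and fake_enum are hypothetical names, and IndexError stands in for the WindowsError that _winreg raises, since _winreg is Windows-only):

```python
def iter_enum(enum_func, key):
    """Yield enum_func(key, 0), enum_func(key, 1), ... until it raises."""
    i = 0
    while True:
        try:
            yield enum_func(key, i)
        except IndexError:  # stand-in for the WindowsError EnumKey raises
            return
        i += 1

# Stand-in for EnumKey over a fake list of subkey names.
_FAKE_SUBKEYS = ['.txt', '.py', 'CLSID']
def fake_enum(key, index):
    return _FAKE_SUBKEYS[index]   # IndexError past the end

# Filter to only the extension keys, as the registry scan wants to do.
extensions = [k for k in iter_enum(fake_enum, 'HKCR') if k.startswith('.')]
```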
Failure running the test suite today with -u compiler enabled on Windows XP.
test_logging
Assertion failed: bp != NULL, file
\code\python\dist\src\Objects\obmalloc.c, line 604
The debugger says the error is here:
msvcr71d.dll!_assert(const char * expr=0x1e22bcc0, const char *
filename=0x1e22bc94, unsigned int lineno=604) Line 306 C
python24_d.dll!PyObject_Malloc(unsigned int nbytes=100) Line 604 + 0x1b C
python24_d.dll!_PyObject_DebugMalloc(unsigned int nbytes=84) Line
1014 + 0x9 C
python24_d.dll!PyThreadState_New(_is * interp=0x00951028) Line 136 + 0x7 C
python24_d.dll!PyGILState_Ensure() Line 430 + 0xc C
python24_d.dll!t_bootstrap(void * boot_raw=0x02801d48) Line 431 + 0x5 C
python24_d.dll!bootstrap(void * call=0x04f0d264) Line 166 + 0x7 C
msvcr71d.dll!_threadstart(void * ptd=0x026a2320) Line 196 + 0xd C
I've been seeing this sort of error on-and-off for at least a year
with my Python 2.3 install. It's the usual reason my spambayes
popproxy dies. I can't recall seeing it before on Windows or while
running the test suite.
Jeremy
My thesis (which, for those who don't know, was to come up with a way to do
type inferencing in the compiler without requiring any semantic changes;
basically type inferencing atomic types assigned to local variables) is now
far enough along that I have the algorithm done and I can generate statistics
on what opcodes are compiled with the most common types that I can
specifically infer (I can also do static type checking on occasion; the only
triggers were actual unit tests making sure TypeError was raised for certain
things like ``~4.2`` and such). Thought some of you might get a kick out of
this, since the numbers are rather blatant for certain opcodes and methods.
To read the stats, the number to the left is the number of times the opcode was
compiled (not executed!) with the specific type(s) known for the opcode (if it
took two args, then both types are listed; order was considered irrelevant).
Now they are listed as integers, so here is the conversion::

    Basestring      4
    IntegralType    8
    FloatType      16
    ImagType       32
    DictType       64
    ListType      128
    TupleType     256
For the things named "meth_<something>" that is the method being called
immediately on the type.
Now realize these numbers are only for opcodes where I could definitely infer
the type; where the possibilities spanned more than one type, no matter how
specific those possibilities were, I just ignored the opcode and left it out
of the stats.
I also tweaked some opcodes, knowing how they are most often used. So, for
instance, BINARY_MODULO checks specifically for the case where the left side
is a basestring and then just doesn't worry about the other args. For other
opcodes I didn't bother with all the args, since they were not interesting to
me in terms of deciding which type-specific opcodes to come up with.
Anyway, here are the numbers on Lib sans Lib/test (129,814 lines according to
SLOCCount) for the ones above 100::
    [(101, ('BINARY_MULTIPLY', (8, 4))),
     (106, ('BINARY_SUBSCR', 128)),
     (118, ('GET_ITER', 128)),
     (124, ('BINARY_MODULO', None)),
     (195, ('meth_join', 4)),
     (204, ('BINARY_ADD', (8, 8))),
     (331, ('BINARY_ADD', (4, 4))),
     (513, ('BINARY_LSHIFT', (8, 8))),
     (840, ('meth_append', 128)),
     (1270, ('PRINT_ITEM', 4)),
     (1916, ('BINARY_MODULO', 4)),
     (12302, ('STORE_SUBSCR', 64))]
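Reading the table means flipping back to the integer conversion above, so here is a small sketch of a decoder for these entries (TYPE_NAMES and decode are my names, not anything from the thesis code):

```python
# Map the integer codes from the conversion table back to type names.
TYPE_NAMES = {4: 'Basestring', 8: 'IntegralType', 16: 'FloatType',
              32: 'ImagType', 64: 'DictType', 128: 'ListType',
              256: 'TupleType'}

def decode(entry):
    """Turn (count, (opcode, types)) into a readable triple."""
    count, (opcode, types) = entry
    if types is None:                    # e.g. BINARY_MODULO, args ignored
        names = None
    elif isinstance(types, tuple):       # two-argument opcodes
        names = tuple(TYPE_NAMES[t] for t in types)
    else:                                # single known type
        names = TYPE_NAMES[types]
    return (count, opcode, names)

decoded = decode((1916, ('BINARY_MODULO', 4)))
```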
We sure like our dictionaries (for those that don't know, dictionaries are
created by making an empty dict and then basically doing an individual
assignment for each value). We also seem to love string interpolation and
printing stuff. Using list.append is also popular. Now, BINARY_LSHIFT is
rather interesting, and that ties into the whole issue of how much I can
actually infer; since binary work tends to be with all constants, I can infer
it really easily, and so its compile-time frequency is rather high. Its
actual frequency of use compared to other things probably is not high,
though. Plus I doubt Plone, for instance, uses ``<<`` very often, so I
suspect the opcode will get weeded out when I incorporate stats from the
other apps I am gathering stats from.
As for the stuff I cut out, the surprising thing from those numbers was how
few mathematical expressions could be inferred. I checked my numbers with
grep, and there really are only 3 places where a float constant is divided by
a float constant (and they are all in colorsys). I was not expecting that at
all. I guess global variables or object attributes tend to hold the floats,
or I just can't infer the values. Either way, I just wasn't expecting that.
Anyway, as I said I just thought some people might find this interesting.
Don't read too much into this, since I am just using these numbers as
guidelines for which type-specific opcodes to write, as a quantifiable
measurement of the usefulness of this kind of type inferencing.
-Brett
P.S.: anyone who is *really* interested I can send you the full stats for the
apps I have run my modified version of compile.c against.
I spent some time the other day looking at the use of bare except statements in
the standard library.
Many of them seemed to fall into the category of 'need to catch anything user
code is likely to throw, but shouldn't be masking SystemExit, StopIteration,
KeyboardInterrupt, MemoryError, etc'.
Changing them to "except Exception:" doesn't help, since all of the above
still fit into that category. (Tim posted a message recently about
rearranging the exception hierarchy to fix this; backwards compatibility woes
pretty much killed the discussion, though.)
However, another possibility occurred to me:
try:
    # Do stuff
except sys.special_exceptions:
    raise
except:
    # Deal with all the mundane stuff
With an appropriately defined tuple, that makes it easy for people to "do the
right thing" with regards to critical exceptions. Such a tuple could also be
useful for invoking isinstance() and issubclass().
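A sketch of how such a tuple might be defined and used today, without waiting for a blessed sys attribute (special_exceptions here is an ordinary module-level name standing in for the proposed sys.special_exceptions, and the members listed are plausible choices, not a decided set):

```python
# Plausible members under the current exception hierarchy.
special_exceptions = (SystemExit, KeyboardInterrupt, MemoryError,
                      StopIteration)

def call_guarded(func, *args):
    try:
        return func(*args)
    except special_exceptions:
        raise               # never mask the critical exceptions
    except:                 # the mundane stuff the bare except was for
        return None

result = call_guarded(int, 'not a number')   # ValueError is mundane
```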
Who knows? If something like this caught on, it might some day be possible to
kill a Python script with a single press of Ctrl-C };>
Cheers,
Nick.
--
Nick Coghlan
Brisbane, Australia
Patch # 1035498 attempts to implement the semantics suggested by Ilya and
Anthony and co.
"python -m module"
Means:
- find the source file for the relevant module (using the standard locations for
module import)
- run the located script as __main__ (note that containing packages are NOT
imported first - it's as if the relevant module was executed directly from the
command line)
- as with '-c', anything before the option is an argument to the interpreter;
anything after it is an argument to the script
The allowed modules are those whose associated source file meets the normal
rules for a command line script. I believe that means .py and .pyc files only
(e.g. "python -m profile" works, but "python -m hotshot" does not).
Special import hooks (such as zipimport) almost certainly won't work (since I
don't believe they work with the current command line script mechanism).
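The locate-then-run step can be sketched with modern machinery (importlib here is a stand-in for the patch's 2.4-era walk over sys.path; locate_module_source is my name):

```python
import importlib.util

def locate_module_source(modname):
    """Find the file that 'python -m modname' would execute as __main__."""
    spec = importlib.util.find_spec(modname)
    if spec is None or spec.origin in (None, 'built-in'):
        # nothing file-like behind the module: the equivalent of -m failing
        raise ImportError('%s has no runnable source file' % modname)
    return spec.origin

path = locate_module_source('profile')   # a stdlib .py file
```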
Cheers,
Nick.
--
Nick Coghlan
Brisbane, Australia
I've noticed several times now, in both debug and release builds, that
if I run regrtest.py with -uall, *sometimes* it just stops after
running test_compiler:
$ python_d regrtest.py -uall
test_grammar
test_opcodes
...
test_compare
test_compile
test_compiler
$
There's no indication of error, it just ends. It's not consistent.
Happened once when I was running with -v, and test_compiler's output ended here:
...
compiling C:\Code\python\lib\test\test_operator.py
compiling C:\Code\python\lib\test\test_optparse.py
compiling C:\Code\python\lib\test\test_os.py
compiling C:\Code\python\lib\test\test_ossaudiodev.py
compiling C:\Code\python\lib\test\test_parser.py
In particular, there's no
Ran M tests in Ns
output, so it doesn't look like unittest (let alone regrtest) ever got
control back.
Hmm. os.listdir() is in sorted order on NTFS, so test_compiler should
be chewing over a lot more files after test_parser.py.
*This* I could blame on a blown C stack -- although I'd expect a much
nastier symptom then than just premature termination.
Anyone else?
At 11:36 PM 9/30/04 +0100, Michael Sparks wrote:
>On Thu, 30 Sep 2004, Phillip J. Eby wrote:
>...
> > A mechanism to pass values or exceptions into generators
>
>[ Possibly somewhat off topic, and apologies if it is, and I'm positive
> someone's done something similar before, but I think it's relevant to
> the discussion in hand -- largely because the above use case *doesn't*
> require changes to python... ]
I know it doesn't; peak.events does this now, but in order to have a decent
programmer interface, the implementation involves a fair amount of
magic. Similarly, PEP 334 doesn't call for anything you can't do with a
bit of work and magic. I was suggesting, however, that a true "simple
co-routine" (similar to a generator but with bidirectional communication of
values and exceptions) would be a valuable addition to the language, in the
area of simplifying async programming in e.g. Twisted and peak.events.
To put it another way: Slap the current title of PEP 334 onto the body of
PEP 288, and change its syntax so you have a way to pass values and
exceptions *in* to a suspended "coroutine" (a new and different animal from
a generator), and I'm sold.
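For the record, what is asked for here (passing values and exceptions *in* to a suspended generator) is exactly what later landed in the language as generator send()/throw(); a minimal sketch of that bidirectional protocol:

```python
def accumulator():
    """A simple coroutine: values passed in, running total handed back."""
    total = 0
    while True:
        value = yield total      # suspend; receive the next value via send()
        if value is None:
            break
        total += value

acc = accumulator()
next(acc)                        # prime: run to the first yield
first = acc.send(10)             # total becomes 10, yielded back
second = acc.send(5)             # total becomes 15
```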
I've packaged up the idea of a coroutine facility using iterators and an
exception, SuspendIteration. This would require some rather deep
changes to how generators are implemented; however, it seems backwards
compatible, implementable with the JVM or CLR, and would make most of my
database/web development work far more pleasant.
http://www.python.org/peps/pep-0334.html
Cheers!
Clark
...
PEP: 334
Title: Simple Coroutines via SuspendIteration
Version: $Revision: 1.1 $
Last-Modified: $Date: 2004/09/08 00:11:18 $
Author: Clark C. Evans <info(a)clarkevans.com>
Status: Draft
Type: Standards Track
Python-Version: 3.0
Content-Type: text/x-rst
Created: 26-Aug-2004
Post-History:
Abstract
========
Asynchronous application frameworks such as Twisted [1]_ and Peak
[2]_ are based on cooperative multitasking via event queues or
deferred execution.  While this approach to application development
does not involve threads and thus avoids a whole class of problems
[3]_, it creates a different sort of programming challenge. When an
I/O operation would block, a user request must suspend so that other
requests can proceed. The concept of a coroutine [4]_ promises to
help the application developer grapple with this state management
difficulty.
This PEP proposes a limited approach to coroutines based on an
extension to the iterator protocol [5]_. Currently, an iterator may
raise a StopIteration exception to indicate that it is done producing
values. This proposal adds another exception to this protocol,
SuspendIteration, which indicates that the given iterator may have
more values to produce, but is unable to do so at this time.
Rationale
=========
There are two current approaches to bringing co-routines to Python.
Christian Tismer's Stackless [6]_ involves a ground-up restructuring
of Python's execution model by hacking the 'C' stack. While this
approach works, its operation is hard to describe and keep portable. A
related approach is to compile Python code to Parrot [7]_, a
register-based virtual machine, which has coroutines.  Unfortunately,
neither of these solutions is portable to IronPython (CLR) or Jython
(JavaVM).
It is thought that a more limited approach, based on iterators, could
provide a coroutine facility to application programmers and still be
portable across runtimes.
* Iterators keep their state in local variables that are not on the
"C" stack. Iterators can be viewed as classes, with state stored in
member variables that are persistent across calls to its next()
method.
* While an uncaught exception may terminate a function's execution, an
uncaught exception need not invalidate an iterator. The proposed
  exception, SuspendIteration, uses this feature.  In other words,
  just because one call to next() results in an exception does not
  necessarily imply that the iterator itself is no longer capable of
  producing values.
There are four places where this new exception has an impact:
* The simple generator [8]_ mechanism could be extended to safely
'catch' this SuspendIteration exception, stuff away its current
state, and pass the exception on to the caller.
* Various iterator filters [9]_ in the standard library, such as
  itertools.izip, should be made aware of this exception so that they
  can transparently propagate SuspendIteration.
* Iterators generated from I/O operations, such as a file or socket
reader, could be modified to have a non-blocking variety. This
option would raise a subclass of SuspendIteration if the requested
operation would block.
* The asyncore library could be updated to provide a basic 'runner'
that pulls from an iterator; if the SuspendIteration exception is
caught, then it moves on to the next iterator in its runlist [10]_.
External frameworks like Twisted would provide alternative
implementations, perhaps based on FreeBSD's kqueue or Linux's epoll.
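The 'runner' bullet above can be sketched concretely. The names run and Flaky below are mine, and SuspendIteration is defined locally since it is only proposed; Flaky is a class-based iterator (not a generator) precisely because, per the PEP, only class-based iterators can survive a raised exception today:

```python
class SuspendIteration(Exception):
    """Proposed: 'not done, but no value right now'."""

def run(iterators):
    """Round-robin over a runlist, revisiting any iterator that suspends."""
    pending = [iter(it) for it in iterators]
    results = []
    while pending:
        for it in list(pending):
            try:
                results.append(next(it))
            except SuspendIteration:
                continue                 # not ready; try again next pass
            except StopIteration:
                pending.remove(it)       # exhausted; drop from the runlist

    return results

class Flaky:
    """Suspends once, then produces 2 and 3."""
    def __init__(self):
        self.calls = 0
    def __iter__(self):
        return self
    def __next__(self):
        self.calls += 1
        if self.calls == 1:
            raise SuspendIteration()     # first poll: no value yet
        if self.calls <= 3:
            return self.calls
        raise StopIteration

merged = run([Flaky(), iter([10])])
```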
While these may seem like dramatic changes, it is a very small amount
of work compared with the utility provided by continuations.
Semantics
=========
This section will explain, at a high level, how iterators would behave
with the introduction of this new SuspendIteration exception.
Simple Iterators
----------------
The current functionality of iterators is best seen with a simple
example which produces two values 'one' and 'two'. ::
    class States:
        def __iter__(self):
            self._next = self.state_one
            return self
        def next(self):
            return self._next()
        def state_one(self):
            self._next = self.state_two
            return "one"
        def state_two(self):
            self._next = self.state_stop
            return "two"
        def state_stop(self):
            raise StopIteration

    print list(States())
An equivalent iteration could, of course, be created by the
following generator::
    def States():
        yield 'one'
        yield 'two'

    print list(States())
Introducing SuspendIteration
----------------------------
Suppose that between producing 'one' and 'two', the generator above
could block on a socket read. In this case, we would want to raise
SuspendIteration to signal that the iterator is not done producing,
but is unable to provide a value at the current moment. ::
    from random import randint
    from time import sleep

    class SuspendIteration(Exception):
        pass

    class NonBlockingResource:
        """Randomly unable to produce the second value"""
        def __iter__(self):
            self._next = self.state_one
            return self
        def next(self):
            return self._next()
        def state_one(self):
            self._next = self.state_suspend
            return "one"
        def state_suspend(self):
            rand = randint(1, 10)
            if 2 == rand:
                self._next = self.state_two
                return self.state_two()
            raise SuspendIteration()
        def state_two(self):
            self._next = self.state_stop
            return "two"
        def state_stop(self):
            raise StopIteration

    def sleeplist(iterator, timeout=.1):
        """
        Do other things (e.g. sleep) while resource is
        unable to provide the next value
        """
        it = iter(iterator)
        retval = []
        while True:
            try:
                retval.append(it.next())
            except SuspendIteration:
                sleep(timeout)
                continue
            except StopIteration:
                break
        return retval

    print sleeplist(NonBlockingResource())
In a real-world situation, the NonBlockingResource would be a file
iterator, socket handle, or other I/O based producer. The sleeplist
would instead be an async reactor, such as those found in asyncore or
Twisted. The non-blocking resource could, of course, be written as a
generator::
    def NonBlockingResource():
        yield "one"
        while True:
            rand = randint(1, 10)
            if 2 == rand:
                break
            raise SuspendIteration()
        yield "two"
It is not necessary to add a keyword, 'suspend', since most real
content generators will not be in application code; they will be in
low-level, I/O based operations.  Since most programmers need not be
exposed to the SuspendIteration() mechanism, a keyword is not needed.
Application Iterators
---------------------
The previous example is rather contrived; a more 'real-world' example
would be a web page generator which yields HTML content, and pulls
from a database. Note that this is an example of neither the
'producer' nor the 'consumer', but rather of a filter. ::
    def ListAlbums(cursor):
        cursor.execute("SELECT title, artist FROM album")
        yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
        for (title, artist) in cursor:
            yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
        yield '</table></body></html>'
The problem, of course, is that the database may block for some time
before any rows are returned, and that during execution, rows may be
returned in blocks of 10 or 100 at a time. Ideally, if the database
blocks for the next set of rows, another user connection could be
serviced.  Note the complete absence of SuspendIteration in the above
code. If done correctly, application developers would be able to
focus on functionality rather than concurrency issues.
The iterator created by the above generator should do the magic
necessary to maintain state, yet pass the exception through to a
lower-level async framework. Here is an example of what the
corresponding iterator would look like if coded up as a class::
    class ListAlbums:
        def __init__(self, cursor):
            self.cursor = cursor
        def __iter__(self):
            self.cursor.execute("SELECT title, artist FROM album")
            self._iter = iter(self.cursor)
            self._next = self.state_head
            return self
        def next(self):
            return self._next()
        def state_head(self):
            self._next = self.state_cursor
            return "<html><body><table><tr><td>Title</td><td>Artist</td></tr>"
        def state_tail(self):
            self._next = self.state_stop
            return "</table></body></html>"
        def state_cursor(self):
            try:
                (title, artist) = self._iter.next()
                return '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
            except StopIteration:
                self._next = self.state_tail
                return self.next()
            except SuspendIteration:
                # just pass-through
                raise
        def state_stop(self):
            raise StopIteration
Complicating Factors
--------------------
While the above example is straightforward, things are a bit more
complicated if the intermediate generator 'condenses' values, that is,
it pulls in two or more values for each value it produces. For
example, ::
    def pair(iterLeft, iterRight):
        rhs = iter(iterRight)
        lhs = iter(iterLeft)
        while True:
            yield (rhs.next(), lhs.next())
In this case, the corresponding iterator behavior has to be a bit more
subtle to handle the case of either the right or left iterator raising
SuspendIteration. It seems to be a matter of decomposing the
generator to recognize intermediate states where a SuspendIteration
exception from the producing context could happen. ::
    class pair:
        def __init__(self, iterLeft, iterRight):
            self.iterLeft = iterLeft
            self.iterRight = iterRight
        def __iter__(self):
            self.rhs = iter(self.iterRight)
            self.lhs = iter(self.iterLeft)
            self._temp_rhs = None
            self._temp_lhs = None
            self._next = self.state_rhs
            return self
        def next(self):
            return self._next()
        def state_rhs(self):
            self._temp_rhs = self.rhs.next()
            self._next = self.state_lhs
            return self.next()
        def state_lhs(self):
            self._temp_lhs = self.lhs.next()
            self._next = self.state_pair
            return self.next()
        def state_pair(self):
            self._next = self.state_rhs
            return (self._temp_rhs, self._temp_lhs)
This proposal assumes that a corresponding iterator written using
this class-based method is possible for existing generators. The
challenge seems to be the identification of distinct states within
the generator where suspension could occur.
Resource Cleanup
----------------
The current generator mechanism has a strange interaction with
exceptions where a 'yield' statement is not allowed within a
try/finally block.  The SuspendIteration exception raises a similar
issue.  The impacts of this issue are not yet clear; however, it may
be that re-writing the generator into a state machine, as the
previous section did, could resolve it, leaving the situation no
worse than today and perhaps even removing the yield/finally
restriction.  More investigation is needed in this area.
API and Limitations
-------------------
This proposal only covers 'suspending' a chain of iterators, and does
not cover (of course) suspending general functions, methods, or "C"
extension functions.  While there could be no direct support for
creating generators in "C" code, native "C" iterators which comply
with the SuspendIteration semantics are certainly possible.
Low-Level Implementation
========================
The author of the PEP is not yet familiar enough with the Python
execution model to comment in this area.
References
==========
.. [1] Twisted
(http://twistedmatrix.com)
.. [2] Peak
(http://peak.telecommunity.com)
.. [3] C10K
(http://www.kegel.com/c10k.html)
.. [4] Coroutines
(http://c2.com/cgi/wiki?CallWithCurrentContinuation)
.. [5] PEP 234, Iterators
(http://www.python.org/peps/pep-0234.html)
.. [6] Stackless Python
(http://stackless.com)
.. [7] Parrot /w coroutines
(http://www.sidhe.org/~dan/blog/archives/000178.html)
.. [8] PEP 255, Simple Generators
(http://www.python.org/peps/pep-0255.html)
.. [9] itertools - Functions creating iterators
(http://docs.python.org/lib/module-itertools.html)
.. [10] Microthreads in Python, David Mertz
(http://www-106.ibm.com/developerworks/linux/library/l-pythrd.html)
Copyright
=========
This document has been placed in the public domain.
--
Clark C. Evans Prometheus Research, LLC.
http://www.prometheusresearch.com/
o office: +1.203.777.2550
~/ , mobile: +1.203.444.0557