>> I've just noticed that the StringIO in CVS hasn't been updated to be an
>> iterator, like real file objects. This probably should be fixed before
>> 2.3 is released.
>I think you mean cStringIO.
Actually, I meant StringIO, but I see that I had my details wrong -
I was just accessing it incorrectly (it's not a self-iterator, and has
no next() method). I'll get back in my box now. Thanks.
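
To illustrate the distinction drawn above: an object can be iterable without being its own iterator. The sketch below uses a hypothetical LineSource class (an assumption for illustration, not the StringIO source itself) whose __iter__ hands back a separate iterator, so a for loop works while calling next() on the object directly fails.

```python
class LineSource:
    """Hypothetical file-like object: iterable, but not a self-iterator."""

    def __init__(self, text):
        self._lines = text.splitlines(True)  # keep the line endings
        self._pos = 0

    def readline(self):
        # Return the next line, or "" at end of data, like a file object.
        if self._pos >= len(self._lines):
            return ""
        line = self._lines[self._pos]
        self._pos += 1
        return line

    def __iter__(self):
        # Hand back a separate iterator that calls readline() until "".
        return iter(self.readline, "")


src = LineSource("spam\neggs\n")
lines = [line for line in src]      # works: the for loop calls iter(src) first

try:
    next(LineSource("spam\n"))      # fails: the object itself has no __next__
    direct_next_ok = True
except TypeError:
    direct_next_ok = False
```

Wrapping the object in iter() first, as in next(iter(src)), is the access pattern that works for this kind of object.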
Andrew McNamara, Senior Developer, Object Craft
Ben Laurie [mailto:firstname.lastname@example.org] wrote:
> > I've asked Ben Laurie to review the patch for me. Once he says it
> > looks OK I'll assign both the bug and the patch to Guido to deal
> > with. Hopefully it's finally right.
> Doesn't seem quite right to me yet - the problem is that if data
> arrives 1 byte at a time with just less than the timeout between each
> byte, then you can get n*timeout as the actual timeout (where n is
> potentially very large). You need to reduce the timeout after each
> select, surely?
Hmmm. The question is, how should the timeout be interpreted?
The patch I submitted interprets it as the timeout for activity on the
underlying socket itself, since that's the object on which you set the
timeout by calling sock.settimeout(). You're suggesting that it should be
interpreted as the timeout for the call to read() or write() on the ssl
object, which involves potentially many separate socket operations.
Anyone else have any opinion on the correct interpretation? I lean toward
the one I've already implemented, since it requires no new work :-)
> Apart from that, it looks good. Apologies for delay, I managed to
> overlook this.
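
For readers comparing the two interpretations, here is a minimal sketch of the second one (a single deadline for the whole call, with the timeout reduced after each select). It uses a plain socket rather than an ssl object, and recv_with_deadline is a hypothetical helper name, not anything from the patch under discussion.

```python
import select
import socket
import time


def recv_with_deadline(sock, nbytes, total_timeout):
    """Read up to nbytes, giving up once total_timeout seconds have passed overall.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + total_timeout
    chunks = []
    remaining_bytes = nbytes
    while remaining_bytes > 0:
        # Shrink the wait on every pass so slow trickles cannot extend it.
        remaining_time = deadline - time.monotonic()
        if remaining_time <= 0:
            raise socket.timeout("overall deadline exceeded")
        readable, _, _ = select.select([sock], [], [], remaining_time)
        if not readable:
            raise socket.timeout("overall deadline exceeded")
        data = sock.recv(remaining_bytes)
        if not data:
            break  # peer closed the connection
        chunks.append(data)
        remaining_bytes -= len(data)
    return b"".join(chunks)
```

Under the first interpretation, the select call would instead be given the full timeout on every pass, which is where the n*timeout worst case comes from.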
Ben Laurie [mailto:email@example.com] wrote:
> The point is that in the standard case, a byte on the network is a byte
> in the application, so you either get a byte or you time out in the
> specified time.
> In the SSL case, you could neither get a byte nor time out, at the
> application layer, until much, much later than you thought you specified.
> This seems broken to me, and POLA would suggest I'm right (i.e. if I say
> time out in 1 second, I'll be pretty astonished when that turns into a
I would agree if we were talking about a timeout that was explicitly set on
the SSL layer. But if I set a timeout on a socket, I expect it to apply to
the individual low-level socket operations, not to the higher level SSL
I've uploaded a new patch to set BIO to be non-blocking when necessary and
to retry properly when the SSL error code indicates it:
And, as a result of properly retrying on SSL_connect, it happens to also fix
this bug when socket.setdefaulttimeout() is used in conjunction with ssl:
I've asked Ben Laurie to review the patch for me. Once he says it looks OK
I'll assign both the bug and the patch to Guido to deal with. Hopefully
it's finally right.
> -----Original Message-----
> From: Ben Laurie [mailto:firstname.lastname@example.org]
> Sent: Tuesday, January 28, 2003 11:53 AM
> To: Geoffrey Talvola
> Cc: 'python-dev(a)python.org'
> Subject: Re: [Python-Dev] the new 2.3a1 settimeout() with httplib and
> Geoffrey Talvola wrote:
> > Ben Laurie [mailto:email@example.com] wrote:
> >>Guido van Rossum wrote:
> >>>Hm, from that page it looks like the internal implementation may
> >>>actually repeatedly read from the socket, until it has processed a
> >>>full 16K block. But I may be mistaken, since it also refers to a
> >>>non-blocking underlying "BIO", whatever that is. :-(
> >>BIO is OpenSSL's I/O abstraction - if you have a nonblocking one, then
> >>SSL_read() will return when a read returns nothing, and if you want
> >>SSL_read() to not block, then you pretty much have to use a
> >>BIO (because even if select() says there's data, there may not be
> >>enough to actually return any via SSL_read()).
> > That's OK, I think, because what we care about with timeouts is
> > detecting when there is _no_ activity on the socket for more than N
> > seconds, and select() does detect that situation properly.
> >>I can help out here if there's still a problem.
> > If you'd like, you could quickly review the latest checkin here -- I
> > have no prior experience with OpenSSL so that might be prudent:
> > But it seems to work fine.
Yeah, but there are corner cases where it won't. If the other end dies
partway through sending an SSL record, then your select will succeed,
but the SSL_read will block forever (or at least until the socket
closes). You do need to put the socket and BIO into a non-blocking mode
for this to work properly.
I can't remember whether you get an error or a 0 back (I think it's an
error) when the socket would block, but in any case, that would need to
be handled (presumably by going back around for the remaining time).
If you need more info, I can find it :-)
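
A rough sketch of the retry loop described above, written against Python's present-day ssl module rather than the C code in the patch. It assumes a non-blocking socket and an operation that raises ssl.SSLWantReadError or ssl.SSLWantWriteError when it would block; retry_ssl_op and its arguments are hypothetical names used only for illustration.

```python
import select
import socket
import ssl
import time


def retry_ssl_op(sock, op, timeout):
    """Call op() until it succeeds, waiting on sock for the remaining time.

    Hypothetical helper: op is any zero-argument callable, for example a
    lambda wrapping a read on a non-blocking SSL socket.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return op()
        except ssl.SSLWantReadError:
            want_read = True
        except ssl.SSLWantWriteError:
            want_read = False
        # The operation would have blocked: go back around for the time left.
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("timed out")
        rlist = [sock] if want_read else []
        wlist = [] if want_read else [sock]
        readable, writable, _ = select.select(rlist, wlist, [], remaining)
        if not (readable or writable):
            raise socket.timeout("timed out")
```

Because the socket never blocks inside op(), a peer that dies partway through an SSL record leads to a timeout instead of an indefinite hang.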
"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
A new PEP (305), "CSV File API", is available for reader feedback. This PEP
describes an API and implementation for reading and writing CSV files.
There is a sample implementation available as well which you can take out
for a spin. The PEP is available at
(The latest version as of this note is 1.9. Please wait until that is
available to grab a copy on which to comment.)
The sample implementation, which is heavily based on Object Craft's existing
csv module, is available at
To those people who are already using the Object Craft module, make sure you
rename your csv.so file before trying this one out.
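
For a feel of the reader/writer style of API, here is a small round trip using the csv module as it ships in the current Python standard library. The draft described in the PEP at the time of this note may differ in detail, so treat this as an illustration only; the sample rows are made up.

```python
import csv
import io

# Made-up rows, including a field with quotes and commas and one with a newline.
rows = [
    ["name", "comment"],
    ["spam", 'said "hi", then left'],
    ["eggs", "two\nlines"],
]

# newline="" leaves line endings alone so the csv module can manage them.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows(rows)

# Read the same text back and compare with what was written.
buffer.seek(0)
round_tripped = list(csv.reader(buffer))
```

The quoting and embedded newline survive the round trip, which is the kind of detail a hand-rolled split(",") gets wrong.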
Please send feedback to csv(a)mail.mojam.com. You can subscribe to that list
That page contains a pointer to the list archives.
(Many thanks BTW to Barry Warsaw and the Mailman crew for Mailman 2.1. It
Guido van Rossum wrote:
> > def C as class:
> > suite
> Um, the whole point of this syntax is that property, synchronized
> etc. do *not* have to be keywords -- they are just callables.
How many different types of parsing are needed? Could
some small number of keywords that determine how the
thunk is parsed be intermixed in the same context as
the thunk-aware objects?
def C(B) as class:
def spam() as staticmethod function: ...
def spamspam(klass) as classmethod generator: ...
def spamspamspam as class property: ...
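
For comparison, this is the existing spelling that the proposals in this thread aim to shorten: staticmethod, classmethod and property are ordinary callables applied after the def. The class and method names below are invented for the example, and none of this is the proposed syntax.

```python
class C:
    def spam():
        return "static"
    spam = staticmethod(spam)          # plain callable applied after the def

    def spamspam(klass):
        return klass.__name__
    spamspam = classmethod(spamspam)   # same pattern for class methods

    def _get_eggs(self):
        return 42
    eggs = property(_get_eggs)         # property built from a getter function


static_result = C.spam()
class_result = C.spamspam()
property_result = C().eggs
```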
Guido van Rossum [mailto:firstname.lastname@example.org] wrote:
> > On Fri, Jan 31, 2003, Geoffrey Talvola wrote:
> > > Ben Laurie [mailto:email@example.com] wrote:
> > >>
> > >> Doesn't seem quite right to me yet - the problem is that if data
> > >> arrives 1 byte at a time with just less than the timeout between
> > >> each byte, then you can get n*timeout as the actual timeout (where
> > >> n is potentially very large). You need to reduce the timeout after
> > >> each select, surely?
> > >
> > > Hmmm. The question is, how should the timeout be interpreted?
> > >
> > > The patch I submitted interprets it as the timeout for activity on
> > > the underlying socket itself, since that's the object on which you
> > > set the timeout by calling sock.settimeout(). You're suggesting
> > > that it should be interpreted as the timeout for the call to read()
> > > or write() on the ssl object, which involves potentially many
> > > separate socket operations.
> > >
> > > Anyone else have any opinion on the correct interpretation? I lean
> > > toward the one I've already implemented, since it requires no new
> > > work :-)
> > I'm in favor of it referring to network activity, since that's the
> > interpretation used by non-SSL timeouts.
> > --
> > Aahz (aahz(a)pythoncraft.com) <*>
> I don't understand what Aahz meant, but if this is the only issue, I'm
> with Geoff. In general, the timeout gets reset whenever a byte is
I think that's exactly what Aahz meant.
In that case, my patch on SF should be good. I've assigned it to you to
While you're at it, you might also take a look at patch 678257 which fixes a
case where socket.sendall() would return before its timeout period expired.
Basically, it was only waiting for timeout on the first pass through the
loop instead of on every iteration.
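
The behaviour described above can be sketched in Python as follows. This is not the patch itself (which changes the C implementation); send_all is a hypothetical helper and it assumes the socket has been put into non-blocking mode.

```python
import select
import socket


def send_all(sock, data, timeout):
    """Send all of data on a non-blocking socket.

    Hypothetical helper: allows up to `timeout` seconds of inactivity
    before each send, not just before the first one.
    """
    view = memoryview(data)
    while len(view) > 0:
        # Wait for writability on every iteration of the loop.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise socket.timeout("timed out waiting to send")
        try:
            sent = sock.send(view)
        except BlockingIOError:
            continue  # spurious wakeup: wait again
        view = view[sent:]
```

Moving the select call above the loop reproduces the reported bug: only the first pass would wait on the socket.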
> But the question is incredibly ill-defined, so it's not clear that
> Raymond will be able to calculate a useful answer. (See Tim's post.)
Even a simple measure which double-counted everything might be useful to me.
Or perhaps a simple measure which counted only memory that would be freed by
destroying this object.
I am after all looking for a needle (excessive memory usage) which is two orders
of magnitude bigger than the hay (memory usage that I am sure I want).
I don't know what the very *best* kind of answer would be. Perhaps a
recursively defined "weight", where an object which holds the *only* reference
to another object carries the weight of that other object, and an object which
holds one of K references to another object carries 1/Kth of its weight. I'm
not sure that would be useful, but it might be.
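
To make the recursive "weight" idea concrete, here is a toy sketch over an explicit, acyclic reference graph. It does not inspect real Python objects; the sizes and refs mappings, and the name shared_weights, are assumptions made up for the example.

```python
from collections import Counter
from functools import lru_cache


def shared_weights(sizes, refs):
    """Compute the proposed weight for each node of a toy object graph.

    sizes: name -> the object's own size
    refs:  name -> list of names it references (assumed to be acyclic)
    """
    # K for each object: how many references point at it.
    incoming = Counter(child for children in refs.values() for child in children)

    @lru_cache(maxsize=None)
    def weight(name):
        total = sizes[name]
        for child in refs.get(name, ()):
            # A holder of one of K references carries 1/K of the child's weight.
            total += weight(child) / incoming[child]
        return total

    return {name: weight(name) for name in sizes}


sizes = {"a": 10, "b": 10, "c": 10, "d": 10, "e": 30}
refs = {"a": ["b"], "c": ["e"], "d": ["e"]}
weights = shared_weights(sizes, refs)
```

Here "a" holds the only reference to "b" and so carries all of it, while "c" and "d" each carry half of "e". The weights of the unreferenced objects then add up to the total size, so nothing is double-counted.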
Raymond Hettinger <raymond.hettinger(a)verizon.net> wrote:
> I'm about to start working on this one and wanted
> to check here first to make sure there is still a
> demand for it [...]
Yes, please! I have wished for this many times as I struggled to figure out
which part of my program was using up too much RAM.
Guido van Rossum wrote:
> Note that 'property' and 'synchronized' used as examples are not new
> keywords! They are just built-in objects that have the desired
> semantics and know about thunks.
This seems really elegant. I hope it can be worked out.
One thing I find mildly disconcerting is the restriction
that nothing can follow the thunk.
If you can do this:
a = thunkuser:
It seems like you should be able to do this (however
unwieldy it looks):
a,b = (thunkuser:
And once you allow one-line thunks, following them in
an expression won't seem quite as unwieldy:
a,b = (thunkuser: thunk), (thunkuser: thunk)
My discomfort is an artifact of allowing thunks to be
used in expressions. The restriction seems perfectly
reasonable in these contexts:
> What's still missing is a way to add formal parameters to the thunk --
> I presume that J and K are evaluated before interface.interface is
> called. The thunk syntax could be extended to allow this; maybe this
> can too. E.g.:
> e:(x, y):
> would create a thunk with two formal parameters, x and y; and when e
> calls the thunk, it has to call it with two arguments which will be
> placed in x and y. But this is a half-baked idea, and the syntax I
> show here is ambiguous.
Doesn't lambda already accomplish this? Everything between
"lambda" and ":" is interpreted as an argument list.
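
As a small illustration of that point in existing Python (not of any proposed thunk syntax), the names between lambda and the colon become ordinary formal parameters, just as in a def. The function names here are made up.

```python
# Everything between "lambda" and ":" is the argument list.
add_lambda = lambda x, y: x + y


def add_def(x, y):
    # The def form takes the same parameters but also records a name.
    return x + y


lambda_params = add_lambda.__code__.co_varnames[:2]
def_params = add_def.__code__.co_varnames[:2]
```

The visible difference is that add_def.__name__ is "add_def" while the lambda's __name__ is "<lambda>".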
In the same light, def stores the function name, interprets
the argument list, and sets the function body to the thunk
code. Perhaps the following could be equivalent:
def(func) lambda args: