I've received some enthusiastic emails from someone who wants to
revive restricted mode. He started out with a bunch of patches to the
CPython runtime using ctypes, which he attached to an App Engine bug:
http://code.google.com/p/googleappengine/issues/detail?id=671
Based on his code (the file secure.py is all you need, included in
secure.tar.gz) it seems he believes the only security leaks are
__subclasses__, gi_frame and gi_code. (I have since convinced him that
if we add "restricted" guards to these attributes, he doesn't need the
functions added to sys.)
I don't recall the exploits that Samuele once posted that caused the
death of rexec.py -- does anyone recall, or have a pointer to the
threads?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Alright, I will re-submit with the contents pasted. I never use double
backquotes as I think them rather ugly; that is the work of an editor
or some automated program in the chain. Plus, it also messed up my
line formatting and now I have lines with one word on them... Anyway,
the contents of PEP 3145:
PEP: 3145
Title: Asynchronous I/O For subprocess.Popen
Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
Type: Standards Track
Content-Type: text/plain
Created: 04-Aug-2009
Python-Version: 3.2
Abstract:
In its present form, the subprocess.Popen implementation is prone to
dead-locking and blocking of the parent Python script while waiting on data
from the child process.
Motivation:
A search for "python asynchronous subprocess" will turn up numerous
accounts of people wanting to execute a child process and communicate with
it from time to time reading only the data that is available instead of
blocking to wait for the program to produce data [1] [2] [3]. The current
behavior of the subprocess module is that when a user sends or receives
data via the stdin, stderr and stdout file objects, dead locks are common
and documented [4] [5]. While communicate can be used to alleviate some of
the buffering issues, it will still cause the parent process to block while
attempting to read data when none is available to be read from the child
process.
Rationale:
There is a documented need for asynchronous, non-blocking functionality in
subprocess.Popen [6] [7] [2] [3]. Inclusion of the code would improve the
utility of the Python standard library that can be used on Unix based and
Windows builds of Python. Practically every I/O object in Python has a
file-like wrapper of some sort. Sockets already act as such and for
strings there is StringIO. Popen can be made to act like a file by simply
using the methods attached the the subprocess.Popen.stderr, stdout and
stdin file-like objects. But when using the read and write methods of
those options, you do not have the benefit of asynchronous I/O. In the
proposed solution the wrapper wraps the asynchronous methods to mimic a
file object.
Reference Implementation:
I have been maintaining a Google Code repository that contains all of my
changes including tests and documentation [9] as well as blog detailing
the problems I have come across in the development process [10].
I have been working on implementing non-blocking asynchronous I/O in the
subprocess.Popen module as well as a wrapper class for subprocess.Popen
that makes it so that an executed process can take the place of a file by
duplicating all of the methods and attributes that file objects have.
There are two base functions that have been added to the subprocess.Popen
class: Popen.send and Popen._recv, each with two separate implementations,
one for Windows and one for Unix based systems. The Windows
implementation uses ctypes to access the functions needed to control pipes
in the kernel 32 DLL in an asynchronous manner. On Unix based systems,
the Python interface for file control serves the same purpose. The
different implementations of Popen.send and Popen._recv have identical
arguments to make code that uses these functions work across multiple
platforms.
When calling the Popen._recv function, it requires the pipe name be
passed as an argument so there exists the Popen.recv function that passes
selects stdout as the pipe for Popen._recv by default. Popen.recv_err
selects stderr as the pipe by default. "Popen.recv" and "Popen.recv_err"
are much easier to read and understand than "Popen._recv('stdout' ..." and
"Popen._recv('stderr' ..." respectively.
Since the Popen._recv function does not wait on data to be produced
before returning a value, it may return empty bytes. Popen.asyncread
handles this issue by returning all data read over a given time
interval.
The ProcessIOWrapper class uses the asyncread and asyncwrite functions to
allow a process to act like a file so that there are no blocking issues
that can arise from using the stdout and stdin file objects produced from
a subprocess.Popen call.
References:
[1] [ python-Feature Requests-1191964 ] asynchronous Subprocess
http://mail.python.org/pipermail/python-bugs-list/2006-December/
036524.html
[2] Daily Life in an Ivory Basement : /feb-07/problems-with-subprocess
http://ivory.idyll.org/blog/feb-07/problems-with-subprocess
[3] How can I run an external command asynchronously from Python? - Stack
Overflow
http://stackoverflow.com/questions/636561/how-can-i-run-an-external-
command-asynchronously-from-python
[4] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
http://docs.python.org/library/subprocess.html#subprocess.Popen.wait
[5] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
http://docs.python.org/library/subprocess.html#subprocess.Popen.kill
[6] Issue 1191964: asynchronous Subprocess - Python tracker
http://bugs.python.org/issue1191964
[7] Module to allow Asynchronous subprocess use on Windows and Posix
platforms - ActiveState Code
http://code.activestate.com/recipes/440554/
[8] subprocess.rst - subprocdev - Project Hosting on Google Code
http://code.google.com/p/subprocdev/source/browse/doc/subprocess.rst?spec=s…
[9] subprocdev - Project Hosting on Google Code
http://code.google.com/p/subprocdev
[10] Python Subprocess Dev
http://subdev.blogspot.com/
Copyright:
This P.E.P. is licensed under the Open Publication License;
http://www.opencontent.org/openpub/.
On Tue, Sep 8, 2009 at 22:56, Benjamin Peterson <benjamin(a)python.org> wrote:
> 2009/9/7 Eric Pruitt <eric.pruitt(a)gmail.com>:
>> Hello all,
>>
>> I have been working on adding asynchronous I/O to the Python
>> subprocess module as part of my Google Summer of Code project. Now
>> that I have finished documenting and pruning the code, I present PEP
>> 3145 for its inclusion into the Python core code. Any and all feedback
>> on the PEP (http://www.python.org/dev/peps/pep-3145/) is appreciated.
>
> Hi Eric,
> One of the reasons you're not getting many response is that you've not
> pasted the contents of the PEP in this message. That makes it really
> easy for people to comment on various sections.
>
> BTW, it seems like you were trying to use reST formatting with the
> text PEP layout. Double backquotes only mean something in reST.
>
>
> --
> Regards,
> Benjamin
>
Which I noticed since it's cited in the BeOpen license we still refer
to in LICENSE. Since pythonlabs.com itself is still up, it probably
isn't much work to make the logos.html URI work again, but I don't know
who maintains that page.
cheer,
Georg
--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
Hello everyone.
I see several problems with the two hex-conversion function pairs that
Python offers:
1. binascii.hexlify and binascii.unhexlify
2. bytes.fromhex and bytes.hex
Problem #1:
bytes.hex is not implemented, although it was specified in PEP 358.
This means there is no symmetrical function to accompany bytes.fromhex.
Problem #2:
Both pairs perform the same function, although The Zen Of Python suggests
that
"There should be one-- and preferably only one --obvious way to do it."
I do not understand why PEP 358 specified the bytes function pair although
it mentioned the binascii pair...
Problem #3:
bytes.fromhex may receive spaces in the input string, although
binascii.unhexlify may not.
I see no good reason for these two functions to have different features.
Problem #4:
binascii.unhexlify may receive both input types: strings or bytes, whereas
bytes.fromhex raises an exception when given a bytes parameter.
Again there is no reason for these functions to be different.
Problem #5:
binascii.hexlify returns a bytes type - although ideally, converting to hex
should
always return string types and converting from hex should always return
bytes.
IMO there is no meaning of bytes as an output of hexlify, since the output
is a
representation of other bytes.
This is also the suggested behavior of bytes.hex in PEP 358
Problems #4 and #5 call for a decision about the input and output of the
functions being discussed:
Option A : Strict input and output
unhexlify (and bytes.fromhex) may only receives string and may only return
bytes
hexlify (and bytes.hex) may only receives bytes and may only return strings
Option B : Robust input and strict output
unhexlify (and bytes.fromhex) may receive bytes and strings and may only
return bytes
hexlify (and bytes.hex) may receive bytes or strings and may only return
strings
Of course we may also consider a third option, which will allow the return
type of
all functions to be robust (perhaps specified in a keyword argument), but as
I wrote in
the description of problem #5, I see no sense in that.
Note that PEP 3137 describes: "... the more strict definitions of encoding
and decoding in
Python 3000: encoding always takes a Unicode string and returns a bytes
sequence, and decoding
always takes a bytes sequence and returns a Unicode string." - suggesting
option A.
To repeat problems #4 and #5, the current behavior does not match any
option:
* The return type of binascii.hexlify should be string, and this is not the
current behavior.
As for the input:
* Option A is not the current behavior because binascii.unhexlify may
receive both input types.
* Option B is not the current behavior because bytes.fromhex does not allow
bytes as input.
To fix these issues, three changes should be applied:
1. Deprecate bytes.fromhex. This fixes the following problems:
#4 (go with option B and remove the function that does not allow bytes
input)
#2 (the binascii functions will be the only way to "do it")
#1 (bytes.hex should not be implemented)
2. In order to keep the functionality that bytes.fromhex has over unhexlify,
the latter function should be able to handle spaces in its input (fix #3)
3. binascii.hexlify should return string as its return type (fix #5)
At 10:59 AM 3/7/2010 -0800, Jeffrey Yasskin wrote:
>So is it that you just don't like the idea of blocking, and want to
>stop anything that relies on it from getting into the standard library?
Um, no. As I said before, call it a "parallel task queue" or
"parallel task manager" or something to that general effect and I'm on board.
It may not be in the Zen of Python, but ISTM that names should
generally follow use cases. It is something of a corollary to "one
obvious way to do it", in that if you see something whose name
matches what you want to do, then it should be obvious that that's
the way in question. ;-)
The use cases for "parallel task queues", however, are a subset of
those for "futures" in the general case. Since the proposed module
addresses most of the former but very little of the latter, calling
it futures is inappropriate.
Specifically, it's:
1. Confusing to people who don't know what futures are (see e.g R.D.
Murray's post), and
2. Underpowered for people who expect/want a more fully-featured
futures system along the lines of E or Deferreds.
It seems that the only people for whom it's an intuitively correct
description are people who've only had experience with more limited
futures models (like Java's). However, these people should not have
a problem understanding the notion of parallel task queueing or task
management, so changing the name isn't really a loss for them, and
it's a gain for everybody else.
> Given the set_result and set_exception methods, it's pretty
> straightforward to fill in the value of a future from something
> that isn't purely computational.
Those are described as "internal" methods in the PEP; by contrast,
the Deferred equivalents are part of the public API.
> Given a way to register "on-done" callbacks with the future, it
> would be straightforward to wait for a future without blocking, too.
Yes, and with a few more additions besides that one, you might be on
the way to an actual competitor for Deferreds. For example: retry
support, chaining, logging, API for transparent result processing,
coroutine support, co-ordination tools like locks, sempaphores and queues, etc.
These are all things you would very likely want or need if you
actually wanted to write a program using futures as *your main
computational model*, vs. just needing to toss out some parallel
tasks in a primarily synchronous program.
Of course, Deferreds are indeed overkill if all you're ever going to
want is a few parallel tasks, unless you're already skilled in using
Twisted or some wrapper for it.
So, I totally support having a simple task queue in the stdlib, as
there are definitely times I would've used such a thing for a quick
script, if it were available.
However, I've *also* had use cases for using futures as a
computational model, and so that's what I originally thought this PEP
was about. After the use cases were clarified, though, it seems to
me that *calling* it futures is a bad idea, because it's really just
a nice task queuing system.
I'm +1 on adding a nice task queuing system, -1 on calling it by any
other name. ;-)
Hey all,
This seems to happen whenever we do a new release (we've had a couple of
emails to webmaster(a)python.org about it since 2.6.5 was released). The
main download page for Python has a broken link for the Mac installer
(because it hasn't been built yet I assume):
http://python.org/download/
The link 404s, with no explanation or alternate link - so for the casual
user who wants to install Python 2.6 on Mac OS X they are
sorely-out-of-luck.
Not being able to provide a mac installer at the same time as other
platforms is one thing (and I accept that is unavoidable), breaking the
download links for Mac users for unspecified lengths of time is just bad
practise. If we create a new stable release without a Mac installer can
we at least provide a brief explanation and link to the *previous
version* until the new version is ready?
All the best,
Michael
-------- Original Message --------
Subject: Broken link to down
Date: Sun, 21 Mar 2010 13:40:36 +0000
From: Ben Hodgson <ben(a)benhodgson.com>
To: webmaster(a)python.org
Hey there,
In case you don't know, the link on http://www.python.org/download/ to
the Python 2.6.5 Mac Installer Disk Image
(http://www.python.org/ftp/python/2.6.5/python-2.6.5_macosx10.3.dmg) is
broken.
Cheers,
Ben
--
Ben Hodgson
http://benhodgson.com/
I'd appreciate some opinions on this. Personally, I'm in the "the current
code is buggy" camp. :-) I can code up the changes to the syslog module
if we decide that's the right way to go. Looks like Raymond, Guido, and I
are the last ones to do syslog-specific changes to this module in the last
15 years.
If you call:
from syslog import syslog, openlog
syslog('My error message')
Something like the following gets logged:
Mar 18 05:20:22 guin python: My error message
^^^^^^
Where I'm annoyed by the "python" in the above. This is pulled from
the C argv[0]. IMHO, what we really want is the Python sys.argv[0].
Because of this feature, I always have to do the following when I'm using
syslog:
openlog(os.path.basename(sys.argv[0]))
I would propose changing the Python syslog() call to do the C equivalent of:
if openlog_hasnt_been_called_before_now:
if sys.argv[0]:
syslog.openlog(os.path.basename(sys.argv[0]))
In other words, if there's a script name and openlog hasn't already been
called, call openlog with the basename of the script name.
This is effectively what happens in normal C code that calls syslog.
Note that the Python syslog.openlog says that the default for ident is
"(usually) 'syslog'".
The benefit of this change is that you get a more identifiable ident string
in Python programs, so when you look at the logs you can tell what script
it came from, not just that it came from some Python program. One way of
looking at it would be that the syslog module logging "python" as the
program name is a bug.
The downside I see is that there might be some users doing log scraping
depending on this (IMHO, buggy) log ident.
Something else just occurred to me though, a nice enhancement would be for
openlog() to have a "None" default for ident, something like:
orig_openlog = openlog
def openlog(ident = None, logopt = 0, facility = LOG_USER)
if ident == None:
if sys.argv[0]:
ident = os.path.basename(sys.argv[0])
else:
ident = 'python'
orig_openlog(ident, logopt, facility)
So you could call openlog and rely on the default ident (sys.argv[0]),
and set the logopt and facility.
The gory details of why this is occurring are: If you don't call
"openlog()" before "syslog()", the system syslog() function calls
something similar to: "openlog(basename(argv[0]))", which causes the
"ident" of the syslog messages to be "python" rather than the specific
program.
Thoughts?
Thanks,
Sean
--
"Every increased possession loads us with new weariness."
-- John Ruskin
Sean Reifschneider, Member of Technical Staff <jafo(a)tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
On Mon, Mar 29, 2010 at 4:50 PM, Tarek Ziadé <ziade.tarek(a)gmail.com> wrote:
> Anatoly, I am now answering only in Distutils-SIG.
>
> On Mon, Mar 29, 2010 at 3:45 PM, anatoly techtonik <techtonik(a)gmail.com> wrote:
> [..]
Seems like I start to hate mailing lists even more with all this
message duplication and thread following nightmare. Why can't people
here create a Google Groups mirror?
--
anatoly t.
Hi all,
Having been active in bug triage and patch writing/reviewing since late
2009, it was suggested in the python-dev IRC channel that I request commit
access to the repository. I'm primarily a Windows user and have worked with
many of the other active contributors to diagnose issues and test patches
when they don't have direct access to Windows.
Brian Curtin
p.s. My contributor form in on file as of 2010-01-31.
At 09:58 AM 3/16/2010 -0500, Facundo Batista wrote:
>I'm +0 to allow these comparisons, being "Decimal(1) < .3" the same as
>"Decimal(1) < Decimal.from_float(.3)"
Does Decimal.from_float() use the "shortest decimal representation" approach?
If not, it might be confusing if a number that prints as '.1'
compares unequal to Decimal('.1').