I've received some enthusiastic emails from someone who wants to
revive restricted mode. He started out with a bunch of patches to the
CPython runtime using ctypes, which he attached to an App Engine bug:
http://code.google.com/p/googleappengine/issues/detail?id=671
Based on his code (the file secure.py is all you need, included in
secure.tar.gz) it seems he believes the only security leaks are
__subclasses__, gi_frame and gi_code. (I have since convinced him that
if we add "restricted" guards to these attributes, he doesn't need the
functions added to sys.)
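(For anyone who doesn't remember why __subclasses__ is a leak, here is a
minimal sketch of the well-known escape pattern; which dangerous class an
attacker picks from the resulting list varies by interpreter version:)

    # Starting from a harmless literal, walk up to object and enumerate
    # every class loaded in the interpreter, including classes the
    # sandbox never meant to expose.
    object_class = ().__class__.__bases__[0]      # <class 'object'>
    candidates = object_class.__subclasses__()    # all loaded classes
    # An attacker then searches `candidates` for a class whose methods
    # reach files, os functionality, etc.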
I don't recall the exploits that Samuele once posted that caused the
death of rexec.py -- does anyone recall, or have a pointer to the
threads?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Alright, I will re-submit with the contents pasted. I never use double
backquotes as I think them rather ugly; that is the work of an editor
or some automated program in the chain. Plus, it also messed up my
line formatting and now I have lines with one word on them... Anyway,
the contents of PEP 3145:
PEP: 3145
Title: Asynchronous I/O For subprocess.Popen
Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
Type: Standards Track
Content-Type: text/plain
Created: 04-Aug-2009
Python-Version: 3.2
Abstract:
In its present form, the subprocess.Popen implementation is prone to
deadlocking and blocking of the parent Python script while waiting for data
from the child process.
Motivation:
A search for "python asynchronous subprocess" will turn up numerous
accounts of people wanting to execute a child process and communicate with
it from time to time, reading only the data that is available instead of
blocking to wait for the program to produce data [1] [2] [3]. The current
behavior of the subprocess module is that when a user sends or receives
data via the stdin, stderr and stdout file objects, deadlocks are common
and documented [4] [5]. While communicate() can be used to alleviate some of
the buffering issues, it still causes the parent process to block while
attempting to read data when none is available from the child process.
Rationale:
There is a documented need for asynchronous, non-blocking functionality in
subprocess.Popen [6] [7] [2] [3]. Inclusion of the code would improve the
utility of the Python standard library on both Unix-based and Windows builds
of Python. Practically every I/O object in Python has a file-like wrapper of
some sort. Sockets already act as such, and for strings there is StringIO.
Popen can be made to act like a file by simply using the methods attached to
the subprocess.Popen.stderr, stdout and stdin file-like objects. But when
using the read and write methods of those objects, you do not have the
benefit of asynchronous I/O. In the proposed solution the wrapper wraps the
asynchronous methods to mimic a file object.
Reference Implementation:
I have been maintaining a Google Code repository that contains all of my
changes, including tests and documentation [9], as well as a blog detailing
the problems I have come across in the development process [10].
I have been working on implementing non-blocking asynchronous I/O in the
subprocess module as well as a wrapper class for subprocess.Popen that
makes it so that an executed process can take the place of a file by
duplicating all of the methods and attributes that file objects have.
Two base methods have been added to the subprocess.Popen class:
Popen.send and Popen._recv, each with two separate implementations,
one for Windows and one for Unix-based systems. The Windows
implementation uses ctypes to access the functions needed to control pipes
in the kernel32 DLL in an asynchronous manner. On Unix-based systems, the
Python interface for file control (fcntl) serves the same purpose. The
different implementations of Popen.send and Popen._recv have identical
arguments to make code that uses these functions work across multiple
platforms.
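(For illustration only -- not the PEP's actual code -- a minimal sketch of
the Unix mechanism referred to above, using fcntl to put a pipe into
non-blocking mode so that reads return immediately:)

    import fcntl
    import os

    def set_nonblocking(fd):
        # Add O_NONBLOCK to the descriptor's flags so read() returns
        # immediately instead of blocking when no data is available.
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)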
When calling the Popen._recv function, the pipe name must be passed as an
argument, so the Popen.recv function exists to select stdout as the pipe for
Popen._recv by default, and Popen.recv_err selects stderr as the pipe by
default. "Popen.recv" and "Popen.recv_err" are much easier to read and
understand than "Popen._recv('stdout' ..." and "Popen._recv('stderr' ..."
respectively.
Since the Popen._recv function does not wait for data to be produced
before returning a value, it may return empty bytes. Popen.asyncread
handles this issue by returning all data read over a given time
interval.
The ProcessIOWrapper class uses the asyncread and asyncwrite functions to
allow a process to act like a file so that there are no blocking issues
that can arise from using the stdout and stdin file objects produced from
a subprocess.Popen call.
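A hypothetical usage sketch (the method names are taken from this PEP; the
argument details are illustrative guesses, and none of this is part of the
current subprocess module):

    import subprocess

    proc = subprocess.Popen(['cat'], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    proc.send(b'hello\n')   # write without risking a deadlock
    data = proc.recv()      # returns b'' if no data is available yet
    # asyncread returns whatever arrives within a given time interval
    more = proc.asyncread(timeout=1.0)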
References:
[1] [ python-Feature Requests-1191964 ] asynchronous Subprocess
http://mail.python.org/pipermail/python-bugs-list/2006-December/036524.html
[2] Daily Life in an Ivory Basement : /feb-07/problems-with-subprocess
http://ivory.idyll.org/blog/feb-07/problems-with-subprocess
[3] How can I run an external command asynchronously from Python? - Stack
Overflow
http://stackoverflow.com/questions/636561/how-can-i-run-an-external-command-asynchronously-from-python
[4] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
http://docs.python.org/library/subprocess.html#subprocess.Popen.wait
[5] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
http://docs.python.org/library/subprocess.html#subprocess.Popen.kill
[6] Issue 1191964: asynchronous Subprocess - Python tracker
http://bugs.python.org/issue1191964
[7] Module to allow Asynchronous subprocess use on Windows and Posix
platforms - ActiveState Code
http://code.activestate.com/recipes/440554/
[8] subprocess.rst - subprocdev - Project Hosting on Google Code
http://code.google.com/p/subprocdev/source/browse/doc/subprocess.rst?spec=s…
[9] subprocdev - Project Hosting on Google Code
http://code.google.com/p/subprocdev
[10] Python Subprocess Dev
http://subdev.blogspot.com/
Copyright:
This PEP is licensed under the Open Publication License:
http://www.opencontent.org/openpub/
On Tue, Sep 8, 2009 at 22:56, Benjamin Peterson <benjamin@python.org> wrote:
> 2009/9/7 Eric Pruitt <eric.pruitt@gmail.com>:
>> Hello all,
>>
>> I have been working on adding asynchronous I/O to the Python
>> subprocess module as part of my Google Summer of Code project. Now
>> that I have finished documenting and pruning the code, I present PEP
>> 3145 for its inclusion into the Python core code. Any and all feedback
>> on the PEP (http://www.python.org/dev/peps/pep-3145/) is appreciated.
>
> Hi Eric,
> One of the reasons you're not getting many responses is that you've not
> pasted the contents of the PEP in this message. Pasting it makes it really
> easy for people to comment on various sections.
>
> BTW, it seems like you were trying to use reST formatting with the
> text PEP layout. Double backquotes only mean something in reST.
>
>
> --
> Regards,
> Benjamin
>
Which I noticed since it's cited in the BeOpen license we still refer
to in LICENSE. Since pythonlabs.com itself is still up, it probably
isn't much work to make the logos.html URI work again, but I don't know
who maintains that page.
Cheers,
Georg
--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
Hello everyone.
I see several problems with the two hex-conversion function pairs that
Python offers:
1. binascii.hexlify and binascii.unhexlify
2. bytes.fromhex and bytes.hex
Problem #1:
bytes.hex is not implemented, although it was specified in PEP 358.
This means there is no symmetrical function to accompany bytes.fromhex.
Problem #2:
Both pairs perform the same function, although the Zen of Python suggests
that "There should be one-- and preferably only one --obvious way to do it."
I do not understand why PEP 358 specified the bytes function pair although
it mentioned the binascii pair...
Problem #3:
bytes.fromhex may receive spaces in the input string, although
binascii.unhexlify may not.
I see no good reason for these two functions to have different features.
Problem #4:
binascii.unhexlify may receive both input types: strings or bytes, whereas
bytes.fromhex raises an exception when given a bytes parameter.
Again, there is no reason for these functions to be different.
Problem #5:
binascii.hexlify returns a bytes type, although ideally converting to hex
should always return string types and converting from hex should always
return bytes. IMO there is no meaning to bytes as the output of hexlify,
since the output is a representation of other bytes. This is also the
suggested behavior of bytes.hex in PEP 358.
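An interactive session illustrating problems #3, #4 and #5 (the behavior
described above):

    >>> import binascii
    >>> binascii.hexlify(b'\x01\x02')    # problem #5: returns bytes, not str
    b'0102'
    >>> bytes.fromhex('01 02 03')        # problem #3: spaces accepted here...
    b'\x01\x02\x03'
    >>> binascii.unhexlify('01 02 03')   # ...but not here (raises binascii.Error)
    >>> binascii.unhexlify('010203')     # problem #4: str and bytes both work here,
    b'\x01\x02\x03'
    >>> bytes.fromhex(b'010203')         # ...but here bytes raises an exception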
Problems #4 and #5 call for a decision about the input and output of the
functions being discussed:
Option A: Strict input and output
unhexlify (and bytes.fromhex) may only receive strings and may only return
bytes.
hexlify (and bytes.hex) may only receive bytes and may only return strings.
Option B: Robust input and strict output
unhexlify (and bytes.fromhex) may receive bytes or strings and may only
return bytes.
hexlify (and bytes.hex) may receive bytes or strings and may only return
strings.
Of course we may also consider a third option, which would allow the return
type of all functions to be robust (perhaps specified via a keyword
argument), but as I wrote in the description of problem #5, I see no sense
in that.
Note that PEP 3137 describes "... the more strict definitions of encoding
and decoding in Python 3000: encoding always takes a Unicode string and
returns a bytes sequence, and decoding always takes a bytes sequence and
returns a Unicode string." -- suggesting option A.
To repeat problems #4 and #5, the current behavior does not match any
option:
* The return type of binascii.hexlify should be a string, and this is not
the current behavior.
As for the input:
* Option A is not the current behavior because binascii.unhexlify may
receive both input types.
* Option B is not the current behavior because bytes.fromhex does not allow
bytes as input.
To fix these issues, three changes should be applied:
1. Deprecate bytes.fromhex. This fixes the following problems:
   #4 (go with option B and remove the function that does not allow bytes
   input)
   #2 (the binascii functions will be the only way to "do it")
   #1 (bytes.hex should not be implemented)
2. In order to keep the functionality that bytes.fromhex has over unhexlify,
   the latter function should be able to handle spaces in its input (fix #3).
3. binascii.hexlify should return a string as its return type (fix #5).
I have two somewhat unrelated thoughts about PEPs.
* Accepted: header
When PEP 3147 was accepted, I had a few folks ask that this be recorded in the
PEP by including a link to the BDFL pronouncement email. I realized that
there's no formal way to express this in a PEP, and many PEPs in fact don't
record more than the fact that they were accepted. I'd like to propose
officially adding an Accepted: header which should include a URL to the email
or other web resource where the PEP is accepted. I've come as close as
possible to this (without modifying the supporting scripts or PEP 1) in PEP
3147:
http://www.python.org/dev/peps/pep-3147/
I'd be willing to update things if there are no objections.
* EOL schedule for older releases.
We have semi-formal policies for the lifetimes of Python releases, though I'm
not sure this policy is written down in any of the existing informational
PEPs. However, we have release schedule PEPs going back to Python 1.6. It
seems reasonable to me that we include end-of-life information in those PEPs.
For example, we could state that Python 2.4 is no longer even being maintained
for security, and we could state the projected date that Python 2.6 will go
into security-only maintenance mode.
I would not mandate that we go back and update all previous PEPs for either of
these ideas. We'd adopt them moving forward and allow anyone who's motivated
to backfill information opportunistically.
Thoughts?
-Barry
Hello, everybody.
I've been searching for a data structure like a tuple/list *but* unordered --
like a set, but duplicated elements shouldn't be removed. I have not even
found a recipe, so I'd like to write an implementation and contribute it to
the "collections" module in the standard library.
This is the situation I have: I have a data structure that represents an
arithmetic/boolean operation. Operations can be commutative, which means
that the order of their operands doesn't change the result of the operation.
That is, the following operations are equivalent:
operation(a, b, c) == operation(c, b, a) == operation(b, a, c)
operation(a, b, a) == operation(a, a, b) == operation(b, a, a)
operation(a, a) == operation(a, a)
So, I need a type to store the arguments/operands so that if two of these
collections have the same elements with the same multiplicity, they are
equivalent, regardless of the order.
A multiset is not exactly what I need: I still need to use the elements in
the order they were given. For example, the logical conjunction (aka the
"and" operator) has left and right operands; I need to evaluate the
first/left one and, if it returns True, then evaluate the second/right one.
They must not be evaluated in a random order.
To sum up, it would behave like a tuple or a list, except when it's compared
with another object: they would be equivalent if they're both unordered
tuples/lists and have the same elements. There could be mutable and immutable
variants (UnorderedList and UnorderedTuple, respectively).
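For the sake of discussion, here is a minimal sketch of what the immutable
variant could look like (assuming hashable elements, and using
collections.Counter for the multiset comparison):

    from collections import Counter

    class UnorderedTuple(tuple):
        """Tuple that preserves iteration order but compares by element
        multiplicity, ignoring order."""

        def __eq__(self, other):
            if not isinstance(other, UnorderedTuple):
                return NotImplemented
            return Counter(self) == Counter(other)

        def __ne__(self, other):
            result = self.__eq__(other)
            return result if result is NotImplemented else not result

        def __hash__(self):
            # Equal multisets must hash equal, so hash the element counts.
            return hash(frozenset(Counter(self).items()))

    assert UnorderedTuple((1, 2, 3)) == UnorderedTuple((3, 2, 1))
    assert UnorderedTuple((1, 1, 2)) != UnorderedTuple((1, 2, 2))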
I will write a PEP to elaborate on this if you think it'd be nice to have. Or,
should I have written the PEP first?
Cheers,
--
Gustavo Narea <xri://=Gustavo>.
| Tech blog: =Gustavo/(+blog)/tech ~ About me: =Gustavo/about |
Debugging a strange problem today, I got the following result:
Sockets opened by stdlib libraries are created without the "keepalive"
option, so the system default is used. The system default under Linux is
"no keepalive".
So, if you are using a urllib connection, POP3 connection, IMAP
connection, etc. -- any stdlib module that internally creates a socket --
and your server goes away suddenly (you lose network connectivity, for
instance), the library will wait FOREVER for the server. The client can't
detect that the server is no longer available.
The "keepalive" option will send a probe packed every X minutes of
inactivity, to check if the other side is still alive, even if the
connection is idle.
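(For reference, enabling the option on an individual socket is already a
one-liner; the point is that no stdlib client library does it:)

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to send keepalive probes on this connection, so a
    # vanished peer is eventually detected even while the socket is idle.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)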
The issue is bad, but the solution is simple enough. Options:
1. All "client" libraries should create sockets with the "KEEPALIVE" option.
2. Modify the socket C module to create all sockets as "Keepalive" by
default.
3. Add a global variable in the socket module to change the default for
future sockets, something like the current "socket.setdefaulttimeout()".
The default should be "keepalive".
4. Modify client libraries to accept a new optional socket-like object
as an optional parameter. This would allow things like transparent
compression or encryption, or to replace the socket connection by
anything else (read/write to shared memory or database, for example).
This is an issue in Linux because by default the sockets are not
"keepalive". In other Unix systems, the default is "keepalive". I don't
know about MS Windows.
What do you think? The solution seems trivial, once we decide the right
way to go.
PS: "socket.setdefaulttimeout()" is not enough, because it could
shutdown a perfectly functional connection, just because it was idle for
too long.
--
Jesus Cea Avion _/_/ _/_/_/ _/_/_/
jcea@jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/
jabber / xmpp:jcea@jabber.org _/_/ _/_/ _/_/_/_/_/
. _/_/ _/_/ _/_/ _/_/ _/_/
"Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/
"My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
Hi,
when embedding Python 3.1, I have my own free-method in tp_dealloc.
The allocated memory is in host memory, not in Python (the DLL). Now, the
problem is that Python appears to still read-access the deallocated memory
after tp_dealloc: I get an access violation if the PyObject header fields
have been modified inside tp_dealloc, but if I leave the header unmodified,
no access violation occurs. Accessing deallocated memory sounds like a bug,
or is this intended design?
Thank you
Marvin
In issue 1759169 people have been demanding for quite some time that the
definition of _XOPEN_SOURCE on Solaris should be dropped, as it was
unneeded and caused problems for other software.
Now, issue 8864 reports that the multiprocessing module fails to
compile, and indeed, if _XOPEN_SOURCE is not defined, control messages
stop working. Several of the CMSG interfaces are only available if
_XPG4_2 is defined (and, AFAICT, under no other condition); this, in
turn, apparently is only defined if _XOPEN_SOURCE is 500 or 600, or if
_XOPEN_SOURCE has an arbitrary value and _XOPEN_SOURCE_EXTENDED is 1.
So how should I go about fixing that?
a) revert the patch for #1759169, documenting that Python compilation
actually requires _XOPEN_SOURCE to be defined, or
b) define _XOPEN_SOURCE only for the multiprocessing module.
Any input appreciated.
Regards,
Martin
At 06:18 PM 5/30/2010 -0700, Brett Cannon wrote:
>On Sun, May 30, 2010 at 00:40, P.J. Eby <pje@telecommunity.com> wrote:
> >
> > Which would completely break one of the major use cases of the PEP,
> > which is precisely to ensure that you can install two pieces of code to
> > the same namespace without either one overwriting the other's files.
>
>The PEP says the goal is to span packages across directories.
The goal of namespace packages is to allow separately-distributed
pieces of code to live in the same package namespace. That this is
sometimes achieved by installing them to different paths is an
implementation detail.
In the case of e.g. Linux distributions and other system packaging
scenarios, the code will all be installed to the *same* directory --
so there cannot be any filename collisions among the
separately-distributed modules. That's why we want to get rid of the
need for an __init__.py to mark the directory as a package: it's a
collision point for system package management tools.
> > pkgutil doesn't have such a limitation, except in the case of extend_path, and
> > that limitation is one that PEP 382 intends to remove.
>
>It's because pkgutil.extend_path has that limitation I am asking as
>that's what the PEP refers to. If the PEP wants to remove the
>limitation it should clearly state how it is going to do that.
I'm flexible on it either way. The only other importer I know of
that does anything else is one that actually allows (unsafely)
importing from URLs.
If we allow for other things, then we need to extend the PEP 302
protocol to have a way to ask an importer for a subpath string.
>As for adding to the PEP 302 protocols, it's a question of how much we
>want importer implementors to have control over this versus us. I
>personally would rather keep any protocol extensions simple and have
>import handle as many of the details as possible.
I lean the other way a bit, in that the more of the importer
internals you expose, the harder you make it for an importer to be
anything other than a mere virtual file system. (As it is, I think
there is too much "file-ness" coupling in the protocol already, what
with file extensions and the like.)
Indeed, now that I'm thinking about it, it actually seems to make
more sense to just require the importers to implement PEP 382, and
provide some common machinery in imp or pkgutil for reading .pth
strings, setting up __path__, and hunting down all the other directories.
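(For context, this is the existing pkgutil idiom the thread keeps coming
back to -- each separately-distributed portion of the namespace package
ships an __init__.py containing just:)

    # __init__.py of each portion of the namespace package
    from pkgutil import extend_path
    # Scan sys.path for other directories that also contain this package
    # and append them to __path__, so submodules from every portion are
    # importable.
    __path__ = extend_path(__path__, __name__)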