I've received some enthusiastic emails from someone who wants to
revive restricted mode. He started out with a bunch of patches to the
CPython runtime using ctypes, which he attached to an App Engine bug:
http://code.google.com/p/googleappengine/issues/detail?id=671
Based on his code (the file secure.py is all you need, included in
secure.tar.gz) it seems he believes the only security leaks are
__subclasses__, gi_frame and gi_code. (I have since convinced him that
if we add "restricted" guards to these attributes, he doesn't need the
functions added to sys.)
I don't recall the exploits that Samuele once posted that caused the
death of rexec.py -- does anyone recall, or have a pointer to the
threads?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Alright, I will re-submit with the contents pasted. I never use double
backquotes as I think them rather ugly; that is the work of an editor
or some automated program in the chain. Plus, it also messed up my
line formatting and now I have lines with one word on them... Anyway,
the contents of PEP 3145:
PEP: 3145
Title: Asynchronous I/O For subprocess.Popen
Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
Type: Standards Track
Content-Type: text/plain
Created: 04-Aug-2009
Python-Version: 3.2
Abstract:
In its present form, the subprocess.Popen implementation is prone to
deadlocking and blocking of the parent Python script while waiting on data
from the child process.
Motivation:
A search for "python asynchronous subprocess" will turn up numerous
accounts of people wanting to execute a child process and communicate with
it from time to time, reading only the data that is available instead of
blocking to wait for the program to produce data [1] [2] [3]. The current
behavior of the subprocess module is that when a user sends or receives
data via the stdin, stderr and stdout file objects, deadlocks are common
and documented [4] [5]. While communicate() can be used to alleviate some of
the buffering issues, it will still cause the parent process to block while
attempting to read data when none is available to be read from the child
process.
Rationale:
There is a documented need for asynchronous, non-blocking functionality in
subprocess.Popen [6] [7] [2] [3]. Inclusion of the code would improve the
utility of the Python standard library on both Unix-based and Windows
builds of Python. Practically every I/O object in Python has a
file-like wrapper of some sort. Sockets already act as such, and for
strings there is StringIO. Popen can be made to act like a file by simply
using the methods attached to the subprocess.Popen.stderr, stdout and
stdin file-like objects. But when using the read and write methods of
those objects, you do not have the benefit of asynchronous I/O. In the
proposed solution the wrapper wraps the asynchronous methods to mimic a
file object.
Reference Implementation:
I have been maintaining a Google Code repository that contains all of my
changes including tests and documentation [9] as well as a blog detailing
the problems I have come across in the development process [10].
I have been working on implementing non-blocking asynchronous I/O in the
subprocess.Popen module as well as a wrapper class for subprocess.Popen
that makes it so that an executed process can take the place of a file by
duplicating all of the methods and attributes that file objects have.
Two base methods have been added to the subprocess.Popen
class: Popen.send and Popen._recv, each with two separate implementations,
one for Windows and one for Unix-based systems. The Windows
implementation uses ctypes to access the functions needed to control pipes
in the kernel32 DLL in an asynchronous manner. On Unix-based systems,
the Python interface for file control serves the same purpose. The
different implementations of Popen.send and Popen._recv have identical
arguments to make code that uses these functions work across multiple
platforms.
Since Popen._recv requires the pipe name to be passed as an argument,
two convenience methods exist: Popen.recv, which selects stdout as the
pipe for Popen._recv by default, and Popen.recv_err, which selects
stderr as the pipe by default. "Popen.recv" and "Popen.recv_err"
are much easier to read and understand than "Popen._recv('stdout' ..." and
"Popen._recv('stderr' ..." respectively.
Since the Popen._recv function does not wait on data to be produced
before returning a value, it may return empty bytes. Popen.asyncread
handles this issue by returning all data read over a given time
interval.
The ProcessIOWrapper class uses the asyncread and asyncwrite functions to
allow a process to act like a file so that there are no blocking issues
that can arise from using the stdout and stdin file objects produced from
a subprocess.Popen call.
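For illustration, a rough sketch of how the proposed methods might be
used (the program name and the asyncread argument are illustrative;
exact signatures are those of the reference implementation):

    import subprocess

    p = subprocess.Popen(['some_program'], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    p.send(b'input data\n')   # write without blocking the parent
    out = p.recv()            # whatever stdout has right now, maybe b''
    err = p.recv_err()        # same for stderr
    data = p.asyncread(2.0)   # all data readable within a 2 second window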
References:
[1] [ python-Feature Requests-1191964 ] asynchronous Subprocess
http://mail.python.org/pipermail/python-bugs-list/2006-December/036524.html
[2] Daily Life in an Ivory Basement : /feb-07/problems-with-subprocess
http://ivory.idyll.org/blog/feb-07/problems-with-subprocess
[3] How can I run an external command asynchronously from Python? - Stack
Overflow
http://stackoverflow.com/questions/636561/how-can-i-run-an-external-command-asynchronously-from-python
[4] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
http://docs.python.org/library/subprocess.html#subprocess.Popen.wait
[5] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
http://docs.python.org/library/subprocess.html#subprocess.Popen.kill
[6] Issue 1191964: asynchronous Subprocess - Python tracker
http://bugs.python.org/issue1191964
[7] Module to allow Asynchronous subprocess use on Windows and Posix
platforms - ActiveState Code
http://code.activestate.com/recipes/440554/
[8] subprocess.rst - subprocdev - Project Hosting on Google Code
http://code.google.com/p/subprocdev/source/browse/doc/subprocess.rst?spec=s…
[9] subprocdev - Project Hosting on Google Code
http://code.google.com/p/subprocdev
[10] Python Subprocess Dev
http://subdev.blogspot.com/
Copyright:
This PEP is licensed under the Open Publication License;
http://www.opencontent.org/openpub/.
On Tue, Sep 8, 2009 at 22:56, Benjamin Peterson <benjamin(a)python.org> wrote:
> 2009/9/7 Eric Pruitt <eric.pruitt(a)gmail.com>:
>> Hello all,
>>
>> I have been working on adding asynchronous I/O to the Python
>> subprocess module as part of my Google Summer of Code project. Now
>> that I have finished documenting and pruning the code, I present PEP
>> 3145 for its inclusion into the Python core code. Any and all feedback
>> on the PEP (http://www.python.org/dev/peps/pep-3145/) is appreciated.
>
> Hi Eric,
> One of the reasons you're not getting many responses is that you've not
> pasted the contents of the PEP in this message. Pasting it makes it much
> easier for people to comment on specific sections.
>
> BTW, it seems like you were trying to use reST formatting with the
> text PEP layout. Double backquotes only mean something in reST.
>
>
> --
> Regards,
> Benjamin
>
Which I noticed since it's cited in the BeOpen license we still refer
to in LICENSE. Since pythonlabs.com itself is still up, it probably
isn't much work to make the logos.html URI work again, but I don't know
who maintains that page.
cheers,
Georg
--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
Hello everyone.
I see several problems with the two hex-conversion function pairs that
Python offers:
1. binascii.hexlify and binascii.unhexlify
2. bytes.fromhex and bytes.hex
Problem #1:
bytes.hex is not implemented, although it was specified in PEP 358.
This means there is no symmetrical function to accompany bytes.fromhex.
Problem #2:
Both pairs perform the same function, although The Zen Of Python suggests
that
"There should be one-- and preferably only one --obvious way to do it."
I do not understand why PEP 358 specified the bytes function pair although
it mentioned the binascii pair...
Problem #3:
bytes.fromhex may receive spaces in the input string, although
binascii.unhexlify may not.
I see no good reason for these two functions to have different features.
Problem #4:
binascii.unhexlify may receive both input types: strings or bytes, whereas
bytes.fromhex raises an exception when given a bytes parameter.
Again there is no reason for these functions to be different.
Problem #5:
binascii.hexlify returns a bytes type, although ideally converting to hex
should always return string types and converting from hex should always
return bytes. IMO there is no meaning to bytes as the output of hexlify,
since the output is a representation of other bytes. This is also the
suggested behavior of bytes.hex in PEP 358.
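To make the asymmetries concrete (the behavior described above,
illustrated; exact exception types may vary between versions):

    import binascii

    binascii.hexlify(b'hi')      # -> b'6869': bytes out (problem #5)
    bytes.fromhex('68 69')       # -> b'hi': spaces accepted here...
    binascii.unhexlify('68 69')  # ...but rejected here (problem #3)
    binascii.unhexlify('6869')   # -> b'hi': str input accepted (problem #4)
    bytes.fromhex(b'6869')       # raises: bytes input rejected (problem #4)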
Problems #4 and #5 call for a decision about the input and output of the
functions being discussed:
Option A : Strict input and output
unhexlify (and bytes.fromhex) may only receive strings and may only return
bytes.
hexlify (and bytes.hex) may only receive bytes and may only return strings.
Option B : Robust input and strict output
unhexlify (and bytes.fromhex) may receive bytes and strings and may only
return bytes.
hexlify (and bytes.hex) may receive bytes or strings and may only return
strings.
Of course we may also consider a third option, which will allow the return
type of
all functions to be robust (perhaps specified in a keyword argument), but as
I wrote in
the description of problem #5, I see no sense in that.
Note that PEP 3137 describes: "... the more strict definitions of encoding
and decoding in
Python 3000: encoding always takes a Unicode string and returns a bytes
sequence, and decoding
always takes a bytes sequence and returns a Unicode string." - suggesting
option A.
To repeat problems #4 and #5, the current behavior does not match any
option:
* The return type of binascii.hexlify should be string, and this is not the
current behavior.
As for the input:
* Option A is not the current behavior because binascii.unhexlify may
receive both input types.
* Option B is not the current behavior because bytes.fromhex does not allow
bytes as input.
To fix these issues, three changes should be applied:
1. Deprecate bytes.fromhex. This fixes the following problems:
#4 (go with option B and remove the function that does not allow bytes
input)
#2 (the binascii functions will be the only way to "do it")
#1 (bytes.hex should not be implemented)
2. In order to keep the functionality that bytes.fromhex has over unhexlify,
the latter function should be able to handle spaces in its input (fix #3)
3. binascii.hexlify should return a string as its return type (fix #5)
Hi,
recently I had a use case where I wanted to use logging in two
completely separate parts of the same process. One of them
needs to create instances of a specific Logger subclass, while the
other is fine with the default loggers.
I got around the problem of the unique root node by using two
Managers (and then using Manager.getLogger() instead of
getLogger()), but I can only set the loggerClass globally.
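Roughly, what I'm doing now and what I'd like to be able to do (the
per-manager loggerClass attribute at the end is the proposed,
hypothetical part):

    import logging

    class CustomLogger(logging.Logger):
        # the Logger subclass one part of the process needs
        pass

    # two independent root/manager pairs, one per subsystem
    manager_a = logging.Manager(logging.RootLogger(logging.WARNING))
    manager_b = logging.Manager(logging.RootLogger(logging.WARNING))

    log_b = manager_b.getLogger('subsys.b')  # default Logger: fine

    # today the logger class can only be set globally, affecting
    # *both* managers:
    logging.setLoggerClass(CustomLogger)

    # proposed: a per-manager setting, e.g.
    manager_a.loggerClass = CustomLogger     # hypothetical
    log_a = manager_a.getLogger('subsys.a')  # would be a CustomLogger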
Making the loggerClass configurable per manager would solve the
problem for me, and AFAICS since most applications don't use
different managers anyway, there should not be any detrimental
effects. What do you think?
cheers,
Georg
2009/12/30 Martin (gzlist) <gzlist(a)googlemail.com>:
> Hi Benjamin,
Hi!
>
> In rev 74094 of Python, you split the unittest module up, could you
> point me at any bug entries or discussion over this revision so I can
> catch up?
This was mostly a discussion on IRC between Michael Foord and myself.
>
> As a side-effect you seem to have changed the method of marking a
> module as not worth including in a traceback to be no longer
> extensible.
>
> Before:
> <http://svn.python.org/view/python/trunk/Lib/unittest.py?view=markup&pathrev…>
>
> A global was set at the top of the module:
>
> __unittest = 1
>
> Which is then checked for when constructing traceback output:
>
> def _is_relevant_tb_level(self, tb):
> return '__unittest' in tb.tb_frame.f_globals
>
> After:
> <http://svn.python.org/view/python/trunk/Lib/unittest/__init__.py?revision=7…>
>
> def _is_relevant_tb_level(self, tb):
> globs = tb.tb_frame.f_globals
> is_relevant = '__name__' in globs and \
> globs["__name__"].startswith("unittest")
> del globs
> return is_relevant
>
> Only packages actually named "unittest" can be excluded.
>
> What is now the preferred method of marking a module as test-internal?
> Overriding the leading-underscore _is_relevant_tb_level method? How
> can this be done cooperatively by different packages?
When I made that change, I didn't know that the __unittest "hack" was
being used elsewhere outside of unittest, so I felt fine replacing it
with another. While I still consider it an implementation detail, I
would be ok with exposing an "official" API for this. Perhaps
__unittest_ignore_traceback?
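A sketch of how such an opt-in marker could work (the flag name is the
one suggested above; this is not an existing unittest API):

    # in a test framework's internal module:
    __unittest_ignore_traceback = True

    # and in unittest's frame filter, roughly:
    def _is_relevant_tb_level(self, tb):
        return '__unittest_ignore_traceback' in tb.tb_frame.f_globals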
> I would have CCed a mailing list with this question but don't like
> getting yelled at for posting on the wrong one, please feel free to do
> so with your reply if you feel it's appropriate (the CCing, not the
> yelling).
python-dev is perfect for this discussion.
--
Regards,
Benjamin
So there wasn't really any more feedback on the last post of the
argparse PEP other than a typo fix and another +1.
http://www.python.org/dev/peps/pep-0389/
Can I get a pronouncement? Here's a summary of the responses. (Please
correct me if I misinterpreted anyone.)
* Floris Bruynooghe +1
* Brett Cannon +1
* Nick Coghlan +1
* Michael Foord +1
* Yuval Greenfield +1
* Doug Hellmann +1
* Kevin Jacobs +1
* Paul Moore +1
* Jesse Noller +1
* Fernando Perez +1
* Jon Ribbens +1
* Vinay Sajip +1
* Barry Warsaw +1
* Antoine Pitrou -0
* Martin v. Löwis -0
* M.-A. Lemburg -1
Note that I've interpreted those who were opposed to the deprecation
of getopt as -0 since the PEP no longer proposes that, only the
deprecation of optparse. (People who opposed optparse's deprecation
are still -1.)
If there's any other information that would be helpful for a
pronouncement, please let me know.
Steve
--
Where did you get that preposterous hypothesis?
Did Steve tell you that?
--- The Hiphopopotamus
Hi,
On behalf of the Distutils-SIG, I would like to propose the addition of
PEP 345 (once and *if* PEP 386 is accepted).
It's the metadata v1.2: http://www.python.org/dev/peps/pep-0345/
PEP 345 was initiated a while ago by Richard Jones, and reworked since
then together with PEP 386, at Pycon last year and in Distutils-SIG.
The major enhancements are:
- being able to express dependencies on other *distribution* names,
rather than package or module names. This
enhancement comes from Setuptools and has been used successfully for
years by the community.
- being able to express fields whose values are specific to certain
platforms. For example, being able to define "pywin32"
as a dependency *only* on win32. This enhancement will allow any
tool to query PyPI and to get the metadata for a particular
execution context, without having to download, build, or install the
project itself.
- being able to provide a list of browsable URLs for the project, like
a bug tracker, a repository, etc., in addition to the home URL.
This will allow UIs like PyPI to display a list of URLs for a
project. A side effect will be that project maintainers will be able
to drive their end users to the right places when they need to find
detailed documentation or provide some feedback. This enhancement
was driven by the discussions about the rating/comment system at
PyPI on catalog-sig.
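For illustration, metadata using these enhancements might look like
this (field syntax as defined in PEP 345; the project details are made
up):

    Metadata-Version: 1.2
    Name: ExampleProject
    Version: 0.1
    Requires-Dist: pywin32 (>=1.0); sys.platform == 'win32'
    Project-URL: Bug Tracker, http://example.org/exampleproject/bugs
    Project-URL: Repository, http://example.org/exampleproject/repo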
We believe that having PEP 386 and PEP 345 accepted will be a major
improvement for the Python packaging eco-system. The next PEP in the
series we are working on is PEP 376.
As a side note, I would really like to see them (hopefully) accepted
before the first beta of Python 2.7 so we can add these features in
2.7/3.2 and start to work on third-party tools (Distribute, Pip, a
standalone version of Distutils for 2.6/2.5, etc.) to get ready to
support them by the time 2.7 final is out.
Regards
Tarek
--
Tarek Ziadé | http://ziade.org
Hello,
When I ported gmpy (a Python interface to the GMP multiple-precision
library) to Python 3.x, I began to use PyLong_AsLongAndOverflow frequently. I
found the code to be slightly faster and cleaner than using PyLong_AsLong
and checking for overflow. I looked at making PyLong_AsLongAndOverflow
available to Python 2.x. http://bugs.python.org/issue7528 includes a
patch that adds PyLong_AsLongAndOverflow to Python 2.7.
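For reference, the usual calling pattern in C looks like this (as
documented for Python 3.x; this is a sketch, not code from the patch):

    int overflow;
    long value = PyLong_AsLongAndOverflow(obj, &overflow);
    if (value == -1 && PyErr_Occurred()) {
        return NULL;  /* a genuine error, e.g. obj is not an integer */
    }
    if (overflow) {
        /* obj did not fit in a C long; overflow is 1 or -1
           depending on the sign */
    }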
I also included a file (py3intcompat.c) that can be included with an
extension's source code and provides PyLong_AsLongAndOverflow to
earlier versions of Python 2.x. In the bug report, I suggested that
py3intcompat.c could be included in the Misc directory and be made
available to extension authors. This follows the precedent of
pymemcompat.h. But there may be more "compatibility" files that could
benefit extension authors. Mark Dickinson suggested that I bring the
topic up on python-dev.
Several questions come to mind:
1) Is it reasonable to provide backward-compatibility files (either as
.h or .c) that give extension authors support for new API calls?
2) If yes, should they be included with the Python source or
distributed as a separate entity? (2to3 and/or 3to2 projects, a Wiki
page)
3) If not, and extension authors can create their own compatibility
files, are there any specific attribution or copyright messages that
must be included? (I assume the compatibility file was created by extracting
the code for the new API and tweaking it to run on older versions of
Python.)
Thanks in advance for your attention,
Case Van Horsen
Hello,
A while ago I proposed refactoring the APIs that provide access to
the installation paths and configuration variables at runtime into a
single module called "sysconfig", and make it easier for all
implementations to work with them.
I've started a branch and worked on it, and I'd like to ask here for
some feedback. And in particular from Jython and IronPython people
because they would probably need to work in that file for their
implementation and/or propose things to add. My understanding is that
we have people like Phillip (Jenvey), Michael F., Frank W. in this
list so they can comment directly and I don't need to cross-post this
mail elsewhere.
== Installation schemes ==
First, the module contains the installation schemes for each platform
CPython uses. An install scheme is a mapping where the key is the "code"
name for a directory, and the value is the path of that directory, with
some $variables that can be expanded.
Install schemes are stored in a private mapping, where the keys are
usually the value of os.name, and the values are the mappings I've
mentioned earlier.
So, for example, the paths for win32 are:
_INSTALL_SCHEMES = {
    ...
    'nt': {
        'stdlib': '$base/Lib',
        'platstdlib': '$base/Lib',
        'purelib': '$base/Lib/site-packages',
        'platlib': '$base/Lib/site-packages',
        'include': '$base/include',
        'platinclude': '$base/include',
        'scripts': '$base/Scripts',
        'data': '$base',
    },
    ...
}
where each key corresponds to a directory that contains some Python files:
- stdlib : root of the standard library
- platstdlib: root of platform-specific elements of the standard library
- purelib: the site-packages directory for pure python modules
- platlib: the site-packages directory for platform-specific modules
- include: the include dir
- platinclude: the include dir for platform-specific files
- scripts: the directory where scripts are added
- data: the directory where data files are added
All these directories are read and used by:
- distutils when a package is installed, so the install command can
dispatch files in the right place
- site.py, when Python is initialized
IOW, any part of the stdlib can use these paths to locate and work
with Python files.
The public APIs are:
* get_path_names() : returns a list of the path names ("stdlib",
"platstdlib", etc.)
* get_paths(scheme, vars) : Returns a mapping containing an install scheme.
- "scheme" is the name of the scheme, if not provided will get the
default scheme of the current platform
- vars is an optional mapping that can provide values for the
various $variables. Notice that they all have
default values, for example $base == sys.prefix.
for example: get_paths('nt')
* get_path(name, scheme, vars): Returns one path corresponding to the scheme.
for example: get_path('stdlib', 'nt')
These APIs are generic, but maybe we could add specific APIs like:
* get_stdlib_path('nt')
These APIs are basically a refactoring of what already exists in
distutils/command/install.py
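For instance (assuming the APIs described above; the expanded values
are made up):

    import sysconfig

    sysconfig.get_path_names()
    # -> ['stdlib', 'platstdlib', 'purelib', 'platlib', 'include',
    #     'platinclude', 'scripts', 'data']

    sysconfig.get_paths('nt')['purelib']
    # -> '$base/Lib/site-packages' with $base expanded, e.g.
    #    'C:\\Python27\\Lib\\site-packages'

    sysconfig.get_path('stdlib', 'nt', vars={'base': 'C:\\Python27'})
    # -> 'C:\\Python27\\Lib'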
== Configuration variables ==
distutils.sysconfig currently provides some APIs to read values from
files like Makefile and pyconfig.h.
These APIs have been placed in the global sysconfig module:
* get_config_vars(): return a dictionary of all configuration
variables relevant for the current platform.
* get_config_var(name): Return the value of a single variable
* get_platform(): Return a string that identifies the current
platform. (this one is used by site.py for example)
* get_python_version() : return the short python version
(sys.version[:3]) -- this one could probably go away but is useful
because that's the marker used by Python in some paths.
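For instance (same caveats as above; the values are made up):

    import sysconfig

    sysconfig.get_config_var('CC')  # e.g. 'gcc' on a Unix build
    sysconfig.get_platform()        # e.g. 'linux-x86_64' or 'win32'
    sysconfig.get_python_version()  # e.g. '2.7'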
== code, status, next steps ==
The code of the module can be viewed here; it's a revamp of distutils.sysconfig:
http://svn.python.org/view/*checkout*/python/branches/tarek_sysconfig/Lib/s…
I've refactored distutils/ and site.py so they work with this new
module, and added deprecation warnings in distutils.sysconfig.
All tests pass in the branch, but note that the code is still using
the .h and Makefile files. This will probably be removed later in
favor of a static _sysconfig.py file generated when Python is built,
containing the variables sysconfig reads. I'll do this second step
after I get some feedback on the proposal.
Regards
Tarek
--
Tarek Ziadé | http://ziade.org