I've received some enthusiastic emails from someone who wants to
revive restricted mode. He started out with a bunch of patches to the
CPython runtime using ctypes, which he attached to an App Engine bug:
http://code.google.com/p/googleappengine/issues/detail?id=671
Based on his code (the file secure.py is all you need, included in
secure.tar.gz) it seems he believes the only security leaks are
__subclasses__, gi_frame and gi_code. (I have since convinced him that
if we add "restricted" guards to these attributes, he doesn't need the
functions added to sys.)
I don't recall the exploits that Samuele once posted that caused the
death of rexec.py -- does anyone recall, or have a pointer to the
threads?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Alright, I will re-submit with the contents pasted. I never use double
backquotes as I think them rather ugly; that is the work of an editor
or some automated program in the chain. Plus, it also messed up my
line formatting and now I have lines with one word on them... Anyway,
the contents of PEP 3145:
PEP: 3145
Title: Asynchronous I/O For subprocess.Popen
Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
Type: Standards Track
Content-Type: text/plain
Created: 04-Aug-2009
Python-Version: 3.2
Abstract:
In its present form, the subprocess.Popen implementation is prone to
deadlocking and blocking of the parent Python script while waiting on data
from the child process.
Motivation:
A search for "python asynchronous subprocess" will turn up numerous
accounts of people wanting to execute a child process and communicate with
it from time to time reading only the data that is available instead of
blocking to wait for the program to produce data [1] [2] [3]. The current
behavior of the subprocess module is that when a user sends or receives
data via the stdin, stderr and stdout file objects, deadlocks are common
and documented [4] [5]. While communicate() can be used to alleviate some
of the buffering issues, it still causes the parent process to block while
attempting to read data when none is available from the child process.
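As a minimal illustration of the problem (a sketch added for clarity, not
part of the reference implementation): the parent only wants to check
whether any output is available yet, but the read blocks until the child
actually produces a line.

    import subprocess
    import sys

    # Child process that stays silent for a while before writing anything.
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import time; time.sleep(30); print('late output')"],
        stdout=subprocess.PIPE)

    # The parent blocks here for the full 30 seconds, even though it only
    # wanted to poll for whatever output was currently available.
    line = child.stdout.readline()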
Rationale:
There is a documented need for asynchronous, non-blocking functionality in
subprocess.Popen [6] [7] [2] [3]. Including this code would improve the
utility of the Python standard library on both Unix-based and Windows
builds of Python. Practically every I/O object in Python has a file-like
wrapper of some sort. Sockets already act as such, and for strings there
is StringIO. Popen can be made to act like a file by simply using the
methods attached to the subprocess.Popen.stderr, stdout and stdin
file-like objects. But when using the read and write methods of those
objects, you do not get the benefit of asynchronous I/O. In the proposed
solution the wrapper wraps the asynchronous methods to mimic a file
object.
Reference Implementation:
I have been maintaining a Google Code repository that contains all of my
changes, including tests and documentation [9], as well as a blog detailing
the problems I have come across in the development process [10].
I have been working on implementing non-blocking asynchronous I/O in the
subprocess.Popen module as well as a wrapper class for subprocess.Popen
that makes it so that an executed process can take the place of a file by
duplicating all of the methods and attributes that file objects have.
There are two base functions that have been added to the subprocess.Popen
class: Popen.send and Popen._recv, each with two separate implementations,
one for Windows and one for Unix-based systems. The Windows
implementation uses ctypes to access the functions needed to control pipes
in the kernel32 DLL in an asynchronous manner. On Unix-based systems,
the Python interface for file control serves the same purpose. The
different implementations of Popen.send and Popen._recv have identical
arguments so that code using these functions works across multiple
platforms.
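As an illustration of the Unix approach (a rough sketch, assuming that "the
Python interface for file control" refers to the fcntl module; this is not
the reference implementation itself), the pipe can be switched to
non-blocking mode so that a read returns whatever is buffered, or nothing
at all, instead of blocking:

    import errno
    import fcntl
    import os

    def read_nonblocking(pipe, maxsize=1024):
        # Put the pipe's file descriptor into non-blocking mode.
        fd = pipe.fileno()
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        try:
            return os.read(fd, maxsize)
        except OSError as e:
            if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
                return b''              # no data available right now
            raise

The Windows implementation achieves the same effect through ctypes calls
into the kernel32 DLL, as described above.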
The Popen._recv function requires the pipe name to be passed as an
argument, so the Popen.recv function exists to select stdout as the pipe
for Popen._recv by default. Popen.recv_err selects stderr as the pipe by
default. "Popen.recv" and "Popen.recv_err" are much easier to read and
understand than "Popen._recv('stdout' ..." and "Popen._recv('stderr' ..."
respectively.
Since the Popen._recv function does not wait on data to be produced
before returning a value, it may return empty bytes. Popen.asyncread
handles this issue by returning all data read over a given time
interval.
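A sketch of how such a time-bounded read might behave (hypothetical helper;
proc.recv() stands in for the proposed non-blocking Popen.recv and is
assumed to return empty bytes when nothing is available):

    import time

    def asyncread_sketch(proc, timeout=0.5, poll=0.05):
        # Collect whatever the child writes during `timeout` seconds.
        deadline = time.time() + timeout
        chunks = []
        while time.time() < deadline:
            data = proc.recv()          # non-blocking; b'' when no data yet
            if data:
                chunks.append(data)
            else:
                time.sleep(poll)        # avoid spinning while idle
        return b''.join(chunks)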
The ProcessIOWrapper class uses the asyncread and asyncwrite functions to
allow a process to act like a file so that there are no blocking issues
that can arise from using the stdout and stdin file objects produced from
a subprocess.Popen call.
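To make the idea concrete, here is a toy, Unix-only stand-in for such a
wrapper (it uses select() rather than the PEP's fcntl/ctypes machinery, and
is only meant to show the file-like surface, not the reference
implementation):

    import os
    import select
    import subprocess

    class FileLikeProcess(object):
        def __init__(self, args):
            self.proc = subprocess.Popen(args, stdin=subprocess.PIPE,
                                         stdout=subprocess.PIPE)

        def write(self, data):
            self.proc.stdin.write(data)
            self.proc.stdin.flush()

        def read(self, size=1024):
            # Return b'' instead of blocking when nothing is available yet.
            ready, _, _ = select.select([self.proc.stdout], [], [], 0)
            if not ready:
                return b''
            return os.read(self.proc.stdout.fileno(), size)

        def close(self):
            self.proc.stdin.close()
            return self.proc.wait()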
References:
[1] [ python-Feature Requests-1191964 ] asynchronous Subprocess
    http://mail.python.org/pipermail/python-bugs-list/2006-December/036524.html
[2] Daily Life in an Ivory Basement : /feb-07/problems-with-subprocess
    http://ivory.idyll.org/blog/feb-07/problems-with-subprocess
[3] How can I run an external command asynchronously from Python? - Stack Overflow
    http://stackoverflow.com/questions/636561/how-can-i-run-an-external-command-asynchronously-from-python
[4] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
    http://docs.python.org/library/subprocess.html#subprocess.Popen.wait
[5] 18.1. subprocess - Subprocess management - Python v2.6.2 documentation
    http://docs.python.org/library/subprocess.html#subprocess.Popen.kill
[6] Issue 1191964: asynchronous Subprocess - Python tracker
    http://bugs.python.org/issue1191964
[7] Module to allow Asynchronous subprocess use on Windows and Posix platforms - ActiveState Code
    http://code.activestate.com/recipes/440554/
[8] subprocess.rst - subprocdev - Project Hosting on Google Code
    http://code.google.com/p/subprocdev/source/browse/doc/subprocess.rst?spec=s…
[9] subprocdev - Project Hosting on Google Code
    http://code.google.com/p/subprocdev
[10] Python Subprocess Dev
    http://subdev.blogspot.com/
Copyright:
This PEP is licensed under the Open Publication License:
http://www.opencontent.org/openpub/
On Tue, Sep 8, 2009 at 22:56, Benjamin Peterson <benjamin(a)python.org> wrote:
> 2009/9/7 Eric Pruitt <eric.pruitt(a)gmail.com>:
>> Hello all,
>>
>> I have been working on adding asynchronous I/O to the Python
>> subprocess module as part of my Google Summer of Code project. Now
>> that I have finished documenting and pruning the code, I present PEP
>> 3145 for its inclusion into the Python core code. Any and all feedback
>> on the PEP (http://www.python.org/dev/peps/pep-3145/) is appreciated.
>
> Hi Eric,
> One of the reasons you're not getting many responses is that you've not
> pasted the contents of the PEP in this message. Pasting it makes it really
> easy for people to comment on various sections.
>
> BTW, it seems like you were trying to use reST formatting with the
> text PEP layout. Double backquotes only mean something in reST.
>
>
> --
> Regards,
> Benjamin
>
Which I noticed since it's cited in the BeOpen license we still refer
to in LICENSE. Since pythonlabs.com itself is still up, it probably
isn't much work to make the logos.html URI work again, but I don't know
who maintains that page.
cheers,
Georg
--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
Hello everyone.
I see several problems with the two hex-conversion function pairs that
Python offers:
1. binascii.hexlify and binascii.unhexlify
2. bytes.fromhex and bytes.hex
Problem #1:
bytes.hex is not implemented, although it was specified in PEP 358.
This means there is no symmetrical function to accompany bytes.fromhex.
Problem #2:
Both pairs perform the same function, although The Zen of Python suggests
that "There should be one-- and preferably only one --obvious way to do
it."
I do not understand why PEP 358 specified the bytes function pair although
it mentioned the binascii pair...
Problem #3:
bytes.fromhex may receive spaces in the input string, although
binascii.unhexlify may not.
I see no good reason for these two functions to have different features.
Problem #4:
binascii.unhexlify may receive both input types: strings or bytes, whereas
bytes.fromhex raises an exception when given a bytes parameter.
Again there is no reason for these functions to be different.
Problem #5:
binascii.hexlify returns a bytes type - although ideally, converting to hex
should
always return string types and converting from hex should always return
bytes.
IMO there is no meaning of bytes as an output of hexlify, since the output
is a
representation of other bytes.
This is also the suggested behavior of bytes.hex in PEP 358
Problems #4 and #5 call for a decision about the input and output of the
functions being discussed:
Option A : Strict input and output
unhexlify (and bytes.fromhex) may only receive strings and may only return
bytes
hexlify (and bytes.hex) may only receive bytes and may only return strings
Option B : Robust input and strict output
unhexlify (and bytes.fromhex) may receive bytes or strings and may only
return bytes
hexlify (and bytes.hex) may receive bytes or strings and may only return
strings
Of course we may also consider a third option, which would allow the return
type of all functions to be robust (perhaps specified in a keyword
argument), but as I wrote in the description of problem #5, I see no sense
in that.
Note that PEP 3137 describes: "... the more strict definitions of encoding
and decoding in
Python 3000: encoding always takes a Unicode string and returns a bytes
sequence, and decoding
always takes a bytes sequence and returns a Unicode string." - suggesting
option A.
To repeat problems #4 and #5, the current behavior does not match any
option:
* The return type of binascii.hexlify should be string, and this is not the
current behavior.
As for the input:
* Option A is not the current behavior because binascii.unhexlify may
receive both input types.
* Option B is not the current behavior because bytes.fromhex does not allow
bytes as input.
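The asymmetries behind problems #3-#5 can be checked directly at the
interpreter (a quick sketch, assuming the current behavior described above):

    import binascii

    # Problem #3: bytes.fromhex tolerates spaces, unhexlify does not.
    bytes.fromhex("01 02")                   # b'\x01\x02'
    try:
        binascii.unhexlify("01 02")
    except binascii.Error:
        pass                                 # non-hexadecimal digit found

    # Problem #4: unhexlify accepts str or bytes, bytes.fromhex only str.
    binascii.unhexlify("0102") == binascii.unhexlify(b"0102")   # True
    try:
        bytes.fromhex(b"0102")
    except TypeError:
        pass                                 # fromhex() wants a str

    # Problem #5: hexlify returns bytes rather than str.
    binascii.hexlify(b"\x01\x02")            # b'0102', not '0102'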
To fix these issues, three changes should be applied:
1. Deprecate bytes.fromhex. This fixes the following problems:
   #4 (go with option B and remove the function that does not allow bytes
   input)
   #2 (the binascii functions will be the only way to "do it")
   #1 (bytes.hex should not be implemented)
2. In order to keep the functionality that bytes.fromhex has over unhexlify,
   the latter function should be able to handle spaces in its input (fix #3)
3. binascii.hexlify should return a string as its return type (fix #5)
Hi,
recently I had a use case where I wanted to use logging in two
completely separate parts of the same process. One of them
needs to create instances of a specific Logger subclass, while the
other is fine with the default loggers.
I got around the problem of the unique root node by using two
Managers (and then using Manager.getLogger() instead of
getLogger()), but I can only set the loggerClass globally.
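For concreteness, here is a minimal sketch of what I mean (it leans on
logging internals -- Manager and RootLogger -- rather than the documented
API, and the per-manager call at the end is the hypothetical feature, not
something that exists today):

    import logging

    class MyLogger(logging.Logger):
        """Custom logger needed by one part of the process."""

    # Two independent hierarchies, each with its own root and manager.
    manager_a = logging.Manager(logging.RootLogger(logging.WARNING))
    manager_b = logging.Manager(logging.RootLogger(logging.WARNING))

    # Today the logger class can only be set globally...
    logging.setLoggerClass(MyLogger)
    log_a = manager_a.getLogger("part_a")   # MyLogger instance, as wanted
    log_b = manager_b.getLogger("part_b")   # also MyLogger -- not wanted

    # ...whereas something per-manager (see below) would solve it, e.g.:
    # manager_b.setLoggerClass(logging.Logger)
    # log_b = manager_b.getLogger("part_b")  # plain Logger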
Making the loggerClass configurable per manager would solve the
problem for me, and AFAICS since most applications don't use
different managers anyway, there should not be any detrimental
effects. What do you think?
cheers,
Georg
Hi Mart,
I'm back with some news about wpython. I completed all the work that I had
committed to do by the end of the year. I made a lot of changes to the
code, which I'll report here.
First, I added several conditional compilation sections that enable or
disable almost every optimization I introduced in the project. Everything
is controlled by a new include file, wpython.h, which holds a #DEFINE for
each one of them.
Every #DEFINE has a brief explanation, and some report an example with
Python code disassembled, showing what happens.
It can be useful both to document the code (and to help navigate to the
relevant parts), and to let people test the effect of each optimization.
There are also a couple of #DEFINEs which are useful to enable or disable
all superinstructions, or to make wpython work like CPython (with all new
optimizations and superinstructions disabled).
Full tracing support required a big effort, due to the missing
SETUP_LOOP/POP_BLOCK instructions used in FOR_ITER blocks. It was a pain in
the neck to get them to work, but I think I have found a good solution for it.
If I remember correctly, Collin asked in the past about performance with
tracing enabled. I believe that speed is comparable to CPython, since I can
trace entry/exit of FOR_ITER blocks with very little time spent intercepting
them; stack unrolling (for forward-jump cases) is fast too.
Restoring the Python object model required much of the work. I reverted all
the changes that I had made to many PyObjects, and just added some accessory
code to a few of them. There are no more hacks, and the code is quite clean;
only CodeObject required a one-line change in the hash function, so that it
calculates the hash correctly for the constants tuple (because it can now
hold lists and dictionaries, which usually aren't hashable).
Every file in Include/ and Objects/ that I modified has only 1 diff (except
frameobject.c, for tracing code), so it's easy to see what has changed and
the extra helper functions that I added to introduce lists and dictionaries
in the consts tuple.
In the meanwhile I've added a little optimization for lists and dictionaries
used in for loops. Writing this:
def f():
    for x in ['a', 'b', 'c']: print x
generates the following (word)code with the previous wpython:
LOAD_CONST (['a', 'b', 'c'])
DEEP_LIST_COPY
GET_ITER
FOR_ITER
because ['a', 'b', 'c'] is a mutable object, and a copy must be made before
using it.
Now it'll be:
LOAD_CONST (['a', 'b', 'c'])
GET_ITER
FOR_ITER
So the code is reduced, and memory consumption too, because there's no need
to clone the list. The trick works only for lists and dictionaries that hold
immutable objects, but I found it's a common pattern in Python code.
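For comparison, one can disassemble the same function on stock CPython with
the dis module (a quick sketch; on CPython 2.6 the list is rebuilt on every
call, with one LOAD_CONST per element followed by BUILD_LIST, instead of
being loaded as a single prebuilt constant as in wpython):

    import dis

    def f():
        for x in ['a', 'b', 'c']: print x

    dis.dis(f)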
I've also updated the source to the latest Python 2.x version, 2.6.4.
All tests pass, both with Debug and Release code, on Visual Studio Express
with 32 bit code (I can't compile 64 bits versions with it).
There are only a few open issues.
test_syntax.py required some changes in the doctests (adding the full
filesystem path) to let them pass correctly. It's really strange, but... it
works now!
test_compile.py has 2 tests disabled in test_compile_ast:
#['<forblock>', """for n in [1, 2, 3]:\n print n\n"""],
#[fname, fcontents],
that's because there's no support for constants (except Num_kind and
Str_kind) in the current AST code. However the code compiles well, except
that it cannot make use of the new constant-folding code.
I haven't updated Doc/library/dis.rst, which is exactly the same as
CPython's. I'll do it when I stop introducing or changing opcodes.
Right now wpython requires manual patching of Include/Python-ast.h, with the
following lines:
enum _expr_kind {BoolOp_kind=1, BinOp_kind=2, UnaryOp_kind=3, Lambda_kind=4,
[...]
List_kind=18, Tuple_kind=19, Const_kind=20};
enum _expr_const {no_const=0, mutable_const=1, content_const=3,
pure_const=7};
struct _expr {
enum _expr_kind kind;
union {
[...]
struct {
object c;
enum _expr_const constant;
} Const;
} v;
int lineno;
int col_offset;
};
They are needed to let ast.c handle constants for the new constant-folding
code.
I would greatly appreciate any help to have them generated automatically
from the ASDL grammar.
That's all about the new code. Now the weird and stupid part. A few days ago
I got a new gmail account, but I accidentally removed the google account that
I had used to create wpython at Google Code. I definitely lost project
ownership, so I can't tag the old code and put the new one in trunk.
I'd be very thankful if someone who works at or has contacts with Google
could ask for ownership to be moved from my old account (cesare at pronto do
it) to my new one (the one I'm using now to write this mail), so I can commit
ASAP. Alternatively, I'll need to create a new project at Google Code.
I hope that the community will appreciate the work (once I upload it :-).
I know that it's a young project, but I think it's mature enough to take a
look at.
Last but not least, think of it as a starting point. I have many ideas
on how to optimize several other parts of Python, and the wordcode structure
gives me room to do it in an elegant and efficient way, thanks to the
superinstructions (when needed).
For the next release I plan to clean up opcode.h and ceval.c, grouping some
instructions into single superinstructions (CALL_FUNCTIONs and IMPORT_NAME),
adding a few opcodes, and tweaking the VM main loop a bit (primarily
targeted at reducing the jump table that compilers produce for the big
switch statement).
Then I'll consider porting it to Python 2.7 and/or Python 3.1/3.2 if
there's interest and feedback about it.
I'm also at your disposal to discuss any detail about wpython source code,
since I know that it isn't a simple patch to apply to some files.
Cheers,
Cesare
2009/11/4 Mart Sõmermaa <mrts.pydev(a)gmail.com>
> Thanks for the recap and for the good work on wpython!
>
> Best, eagerly waiting for the results of your work to land in mainline
> python,
> MS
>
PyCon 2010 registration has opened! Register by January 6 for the best
rates!
http://us.pycon.org/2010/registration/
Registering early gets you early-bird registration rates, guarantees you
the tutorials you want, and helps the PyCon volunteers plan better.
Scheduled talk and tutorial lists:
http://us.pycon.org/2010/conference/talks/
http://us.pycon.org/2010/tutorials/
We'll see you in Atlanta! Spread the word!
--
Aahz (aahz(a)pythoncraft.com) <*> http://www.pythoncraft.com/
The best way to get information on Usenet is not to ask a question, but
to post the wrong information.
Hi all,
/trunk test_distutils is failing with the following error on Mac OS X 10.5:
test_distutils
test test_distutils failed -- Traceback (most recent call last):
File
"/private/tmp/tmp8UfLPT/python27/Lib/distutils/tests/test_sdist.py",
line 342, in test_make_distribution_owner_group
self.assertEquals(member.gid, os.getgid())
AssertionError: 0 != 20
It has been a problem for over a week, perhaps longer. I've filed it as:
http://bugs.python.org/issue7408
... So why am I posting this to python-dev?
I went to double-check this on the buildbots and noticed that there
aren't any Mac OS X buildbots. I would be happy to give one or two
people remote access to my Mac OS X 10.5 iMac if someone wanted to
set up a buildbot and/or debug this issue further.
Tarek, I can give you access immediately through your lyorn account, too.
cheers,
--titus
--
C. Titus Brown, ctb(a)msu.edu
Hi all,
I got an MSI build working on my WinXP VM just now, and I wanted to
touch base with whoever it is that is maintaining this (wonderful!)
set of scripts...
I ran into three problems, and I managed to figure out two of them; the third
wasn't fatal. Note, the diff of my fixed checkout is attached.
First, the script that finds & builds the external dependencies has two
minor problems.
* it puts Tcl in tcl-8.*, and Tk in tk-8.*, but msi.py looks for them in
tcl8.* and tk8.* to grab the license text. I changed the glob strings
appropriately and that seemed to work.
* Tix isn't downloaded/installed/built automatically like everything else,
and msi.py looks for its license file, too. I just removed the
Tix reference. I can't figure out how to build Tix appropriately; any
tips?
Second, the buildmsi.bat file refers to python26a3.hhp instead of
python27a0.hhp.
Third, I could not get _tkinter to build properly, although it wasn't fatal
to the endeavor. It couldn't find ..\..\tcltk\lib\tcl85.lib, although
tcl85g.lib existed.
Oh, and there were a bunch of missing commands that (as a non-Windows
expert) I had to figure out with google -- things like nasm/nasmw, for
example. Are these documented somewhere, or would it be helpful to document
them? I think I had to install:
- Microsoft HTML Help Compiler
- cygwin with make and python2.5 to build the docs
- nasm (and copy nasm.exe to nasmw.exe)
- cabarc
Errm, and the 'buildmsi.bat' file has 'build' misspelled as 'buold' ;)
I'd love to get this build process working completely automatically and
100% correctly, too.
Hat tip to Trent Nelson, who helped me figure out where the scripts are
and what other things I needed...
cheers,
--titus
--
C. Titus Brown, ctb(a)msu.edu