On 6 Feb, 11:53 pm, guido(a)python.org wrote:
>On Sat, Feb 6, 2010 at 3:22 PM, <exarkun(a)twistedmatrix.com> wrote:
>>On 10:29 pm, guido(a)python.org wrote:
>>>
>>>[snip]
>>>
>>>I haven't tried to repro this particular example, but the reason is
>>>that we don't want to have to call getpwd() on every import nor do we
>>>want to have some kind of in-process variable to cache the current
>>>directory. (getpwd() is relatively slow and can sometimes fail
>>>outright, and trying to cache it has a certain risk of being wrong.)
>>
>>Assuming you mean os.getcwd():
>
>Yes.
>>exarkun@boson:~$ python -m timeit -s 'def f(): pass' 'f()'
>>10000000 loops, best of 3: 0.132 usec per loop
>>exarkun@boson:~$ python -m timeit -s 'from os import getcwd' 'getcwd()'
>>1000000 loops, best of 3: 1.02 usec per loop
>>exarkun@boson:~$
>>So it's about 7x more expensive than a no-op function call. I'd call this
>>pretty quick. Compared to everything else that happens during an import,
>>I'm not convinced this wouldn't be lost in the noise. I think it's at least
>>worth implementing and measuring.
>
>But it's a system call, and its speed depends on a lot more than the
>speed of a simple function call. It depends on the OS kernel, possibly
>on the filesystem, and so on.
Do you know of a case where it's actually slow? If not, how convincing
should this argument really be? Perhaps we can measure it on a few
platforms before passing judgement.
For reference, my numbers are from Linux 2.6.31 and my filesystem
(though I don't think it really matters) is ext3. I have eglibc 2.10.1
compiled by gcc version 4.4.1.
>Also "os.getcwd()" abstracts away
>various platform details that the C import code would have to
>replicate.
That logic can all be hidden behind a C API which os.getcwd() can then
be implemented in terms of. There's no reason for it to be any harder
to invoke from C than it is from Python.
>Really, the approach of preprocessing sys.path makes much
>more sense. If an app wants sys.path[0] to be an absolute path too
>they can modify it themselves.
That may turn out to be the less expensive approach. I'm not sure in
what other ways it is the approach that makes much more sense.
Quite the opposite: centralizing the responsibility for normalizing this
value makes a lot of sense if you consider things like reducing code
duplication and, in turn, removing the possibility for bugs.
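For concreteness, the app-level fix being suggested is a one-liner (a
minimal sketch; it has to run early, before anything relies on sys.path[0]
or changes the working directory):

import os.path
import sys

# Absolutize the script's directory entry so it survives a later chdir.
sys.path[0] = os.path.abspath(sys.path[0])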
Adding better documentation for __file__ is another task which I think
is worth undertaking, regardless of whether any change is made to how
its value is computed. At the moment, the two or three sentences about
it in PEP 302 are all I've been able to find, and they don't really get
the job done.
Jean-Paul
I have a minor concern about certain corner cases with math.hypot and complex.__abs__, namely when one component is infinite and the other is a NaN. If we execute the following code:
import math
inf = float('inf')
nan = float('nan')
print math.hypot(inf, nan)
print abs(complex(nan, inf))
... then we see that 'inf' is printed in both cases. The standard library tests (for example, test_cmath.py:test_abs()) seem to test for this behavior as well, and FWIW, I personally agree with this convention. However, the math module's documentation for both 2.6 and 3.1 states, "All functions return a quiet NaN if at least one of the args is NaN."
math.pow(1.0, nan) is another such exception to the rule. Perhaps the documentation should be updated to reflect this.
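For reference, a quick check of the pow case on my build (this goes through
the platform libm, so results could in principle vary):

import math

nan = float('nan')
print math.pow(1.0, nan)   # prints 1.0, not nan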
Thanks,
- David
I've got another idea: adding a release timer to the
http://python.org/dev/ page, together with a link to a generated release
calendar.
This would make it possible to track deadlines for feature fixes in alpha
releases automatically, instead of manually monitoring this mailing list.
There is already a navigation box on the right side where this
information fits like a glove.
Does anybody else find this feature useful for Python development?
--
anatoly t.
Hello all,
The next 'big' change to unittest will (may?) be the introduction of
class and module level setUp and tearDown. This was discussed on
Python-ideas and Guido supported them. They can be useful but are also
very easy to abuse (too much shared state, monolithic test classes and
modules). Several authors of other Python testing frameworks spoke up
*against* them, but several *users* of test frameworks spoke up in
favour of them. ;-)
I'm pretty sure I can introduce setUpClass and setUpModule without breaking
compatibility with existing unittest extensions or introducing
backwards-compatibility issues - with the possible exception of test sorting.
Where you have a class level setUp (for example creating a database
connection) you don't want the tearDown executed a *long* time after the
setUp. In the presence of class or module level setUp/tearDown (but
only if they are used) I would expect test sorting to only sort within
the class or module [1]. I will introduce the setUp and tearDown as new
'tests' - so failures are reported separately, and all tests in the
class / module will have an explicit skip in the event of a setUp failure.
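To make that concrete, here is roughly the shape I have in mind for the
class-level version (a sketch only - the final spelling is undecided;
sqlite3 stands in here for whatever expensive shared fixture you have):

import sqlite3
import unittest

class DatabaseTests(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # Runs once, before any test in this class; every test method
        # shares this single connection.
        cls.connection = sqlite3.connect(':memory:')

    @classmethod
    def tearDownClass(cls):
        # Runs once, after the last test in this class.
        cls.connection.close()

    def test_select(self):
        cursor = self.connection.execute('SELECT 1')
        self.assertEqual(cursor.fetchone(), (1,))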
A *better* (more general) solution for sharing and managing resources
between tests is to use something like TestResources by Robert Collins.
http://pypi.python.org/pypi/testresources/
A minimal example of using test resources shows very little boilerplate
beyond what setUpClass (etc.) would need, and with the addition of some
helper functions there could be almost no overhead. I've challenged
Robert that if he can provide examples of using Test Resources to meet
the class and module level use-cases then I would support bringing Test
Resources into the standard library as part of unittest (modulo
licensing issues which he is happy to work on).
I'm not sure what response I expect from this email, and neither option
will be implemented without further discussion - possibly at the PyCon
sprints - but I thought I would make it clear what the possible
directions are.
All the best,
Michael Foord
[1] I *could* allow sorting of all tests within a module, inserting the
setUpClass / tearDownClass in the right place after the sort. It would
probably be better to group tests per class anyway and in fact the
existing suite sorting support may do this already (in which case it
isn't an issue) - I haven't looked into the implementation.
--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog
READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.
It's about time for another 3.1 bug fix release. I propose this schedule:
March 6: Release Candidate (same day as 2.7a4)
March 20: 3.1.2 Final release
--
Regards,
Benjamin
Hi,
I'm working on a new sandbox project. The goal is to create an empty namespace
and define strict rules for its interaction with the existing (full-featured)
Python namespace.
By default, you cannot read a file, use print, import a module or exit Python.
But you can enable certain functions by turning on config "features".
Example: the "regex" feature allows you to import the re module, which will
contain a subset of the real re module, just enough to match a regex.
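A sketch of the intended usage (the exact feature names and SandboxConfig
spelling may differ from what is in the repository; "stdout" is assumed here
so that print works inside the sandbox):

from sandbox import Sandbox, SandboxConfig

config = SandboxConfig("regex", "stdout")
with Sandbox(config):
    import re
    print re.match("a+", "aaab").group(0)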
To protect the sandbox namespace, some attributes are "hidden": function
closure and globals, frame locals and type subclasses. __builtins__ is also
replaced by a read-only dictionary. Objects are not directly injected into the
sandbox namespace: a proxy is used to get a read-only view of each object.
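As a rough illustration of the read-only idea (this is not pysandbox's actual
code, just a minimal sketch of a mapping that rejects mutation):

class ReadOnlyDict(dict):
    def _blocked(self, *args, **kw):
        raise ValueError("read-only dictionary")
    # every mutating method fails; reads behave like a normal dict
    __setitem__ = __delitem__ = _blocked
    clear = update = pop = popitem = setdefault = _blocked

A real proxy has to cover much more than this (attribute access, nested
objects, and so on), which is where most of the work is.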
pysandbox is based on safelite.py, a project written by tav a year ago (search
for tav in the python-dev archive, February 2009). I tested RestrictedPython,
but its approach is different (it rewrites bytecode) and the project has not
been maintained for three or four years (only minor updates to the
documentation and the copyright header). pysandbox differs from
RestrictedPython in that it blocks everything by default and has simple config
options to enable sets of features. pysandbox differs from safelite.py in that
it contains unit tests (to ensure that blocked features really are blocked).
Only the attributes and functions that allow escaping the sandbox are blocked.
E.g. frames are still accessible; only the frame locals are blocked. This
blacklist policy is broken by design, but it's a nice way to quickly get a
working sandbox without having to modify CPython too much.
pysandbox's status is closer to a proof of concept than a beta version; there
are open issues (see below). Please test it and try to break it!
--
To try pysandbox, download the latest version using git clone or a tarball at:
http://github.com/haypo/pysandbox/
You don't need to install it; use "python interpreter.py" or "python
execfile.py yourscript.py". Use --help to get more options.
I tested pysandbox on Linux with Python 2.5, 2.6 and 2.7. I guess that it
should work on Python 3.0 with minor changes.
--
The current syntax is:
config = SandboxConfig(...)
with Sandbox(config):
    ... execute untrusted code here ...
This syntax has a problem: local frame variables are not protected by a proxy,
nor removed from the sandbox namespace. I tried to remove the frame locals,
but Python uses STORE_FAST/LOAD_FAST bytecodes inside a function, and this
fast cache is not accessible from Python. Clearing this cache might introduce
unexpected behaviour.
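You can see the fast-locals bytecodes directly (just an illustration of why a
function's locals aren't reachable as an ordinary dictionary):

import dis

def f():
    x = 1       # compiled to STORE_FAST, not a dictionary store
    return x    # compiled to LOAD_FAST

dis.dis(f)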
pysandbox modifies some structure attributes (frame.f_builtins and
frame.f_tstate.interp.builtins) directly in memory using ctypes tricks. I did
that to avoid patching CPython and to get a working proof of concept faster.
Set USE_CPYTHON_HACK to False (in sandbox/__init__.py) to disable these hacks,
but they are needed to protect __builtins__ (see the related tests).
--
By default, pysandbox doesn't use CPython restricted mode, because this mode
is too restrictive (it's not possible to read a file or import a module). But
pysandbox can use it with SandboxConfig(cpython_restricted=True).
--
See README file for more information and TODO file for a longer status.
Victor
Starting around 14:00 UTC today, we will take the trackers at
bugs.python.org, bugs.jython.org, and psf.upfronthosting.co.za offline
for a system upgrade. The outage should not last longer than four hours
(probably much shorter).
Regards,
Martin
I recently set up a Mercurial hosting solution myself, and noticed that
there is no audit trail of who has been writing to the "master" clone.
There are commit messages, but they could be faked (even attributing a
change to a different committer).
The threat I'm concerned about is that of a stolen SSH key. If that is
abused to push suspicious changes into the repository, it is really
difficult to find out whose key had been used.
The solution I came up with is to define an "incoming" hook on the
repository which will log the SSH user along with the pack ID of the
pack being pushed.
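For illustration, the hook has roughly this shape (a sketch of my setup: the
log location, and relying on the USER environment variable set by the SSH
login, are choices specific to my installation):

# audit.py, referenced from the repository's .hg/hgrc as:
#   [hooks]
#   incoming.audit = python:audit.log_pusher
import os
import time

def log_pusher(ui, repo, node=None, **kwargs):
    # "node" identifies the changeset just added; the SSH user comes
    # from the environment of the pushing login.
    user = os.environ.get("USER", "unknown")
    logfile = os.path.join(repo.root, ".hg", "push.log")
    f = open(logfile, "a")
    try:
        f.write("%s user=%s node=%s\n" % (time.asctime(), user, node))
    finally:
        f.close()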
I'd like to propose that a similar hook be installed on repositories
hosted at hg.python.org (unless Mercurial offers something better
already). Whether or not this log should be publicly visible can be
debated; IMO it would be sufficient if only sysadmins can inspect it in
case of doubt.
Alternatively, the email notification sent to python-checkins could report
who the pusher was.
Dirkjan: if you agree to such a strategy, please mention that in the PEP.
Regards,
Martin
Has anyone come up with rules of thumb for what to intern and what the
performance implications of interning are?
I'm working on profiling App Engine again, and since they don't allow the
marshal module I have to modify pstats to save the profile via pickle.
While trying to get profiles under 1MB, I noticed that each function
has its own copy of the filename in which it is defined, and sometimes
these strings can be rather long.
Creating a code object already interns a bunch of stuff: argument
names, variable names, etc. Interning the filename will add some CPU
overhead during function creation, should save a decent amount of
memory, and ought to have minimal overall performance impact.
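A quick illustration of the effect (the strings are built at runtime so the
compiler can't share them as constants):

part = "pstats"
a = "/usr/lib/python2.6/" + part + ".py"
b = "/usr/lib/python2.6/" + part + ".py"

print a is b                   # False: two equal but distinct strings
print intern(a) is intern(b)   # True: interning collapses them into one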
I have a local patch, but wanted to see if anyone had ideas or
experience weighing these tradeoffs.
-jake