Please hold off on any checkins after 1200 UTC on Thursday 18th
(about 22 hours after this message was sent). We're cutting 2.4rc1
then. Assuming all goes well, we'll be looking at a 2.4 final for
Could people please be _very_ conservative with checkins between
now and 2.4 final? A brown-paper-bag 2.4.1 would suck :-(
I posted a message on c.l.py a couple of days ago about a
Python patch which adds a member __pycode__ to functions and
classes. This member is a string holding the Python source code of
the function/class. (It works for interactively defined code
and exec'd definitions.)
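For concreteness, a tiny sketch of what the attribute would give you
once the patch is applied (the output shown is only what I would
expect; nothing below works in stock Python):

    ns = {}
    exec "def double(x):\n    return x * 2\n" in ns
    # with the patch, even an exec'd function carries its own source:
    print ns['double'].__pycode__
    # expected (hypothetical) output:
    # def double(x):
    #     return x * 2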
Supposing I ported this to 2.4b2 and made it have zero overhead
when Python runs in normal mode, is there any chance it would be
considered a candidate for inclusion in mainline Python?
If yes, I'd appreciate it if somebody took a look at the patch
because there may be a better way to do it.
Patch vs 2.3.4:
I posted this question on c.l.py yesterday but received no
answers. Given the complexity of the topic, I hope people will
not mind if I ask it here.
I am looking for information on packages & import hooks, including
simple examples of how to implement a simple import hook. Quick
googling turns up some documents, such as:
-- PEP 302:
-- What's new on Python 2.3:
-- Import SIG:
-- PyDoc for the ihooks module:
However, after reading them, I ended up more confused than before.
There are lots of references to the ihooks standard module, but there
is no documentation for it in the standard Python docs. The "Modules
That Need Docs" Wiki entry
(http://www.python.org/moin/ModulesThatNeedDocs) lists ihooks and
imputil, but no one has volunteered to do it yet (worse, not even a
bug report was filed; that is something I'm going to do).
Packages are also not documented (the Python 2.3 docs still point to
an essay from the 1.5 days).
From what I could see, the import system has gone through several
iterations and patches since the 1.5 release (when packages were
added). It is now hard to tell which hooking mechanism is preferred as
far as Python 2.4 (and future versions) are concerned. For example, it
is not clear from a simple reading whether PEP 302 is up to date, or
whether it is considered the preferred approach for import hooks. For
this reason, I would like to know if there is any document which can be
considered "authoritative" for the import system (besides ihooks.py and
import.c). If *all* that exists is the source code, well, I guess I'll
have to read it. But anyway... pointers to simple examples are also
helpful. Thanks in advance.
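For reference, the closest thing to a minimal example I could piece
together from PEP 302 is the sketch below: a meta_path finder that only
logs imports and then declines them, so the normal machinery still does
the work (sys.meta_path exists from Python 2.3 on):

    import sys

    class ImportLogger(object):
        # PEP 302 meta path finder: find_module() must return a loader
        # object or None; returning None hands the module back to the
        # default import mechanism.
        def find_module(self, fullname, path=None):
            print "importing:", fullname
            return None

    sys.meta_path.append(ImportLogger())
    import xml.dom   # logs 'xml' and 'xml.dom' (if not already imported)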
> Please fix your Thunderbird to stop adding spurious ^M characters:
Sorry about that. It looks like either Google's SMTP server is doing something
odd or Thunderbird's TLS implementation has issues - I switched back to my ISP's
SMTP server and the problem went away.
Nick Coghlan | Brisbane, Australia
Email: ncoghlan(a)email.com | Mobile: +61 409 573 268
> Well, not necessarily pushback, but I'd like a clarification at least.
> What kind of memory overhead does this introduce? If every function
> running around is holding a full copy of all its source, is this
> overhead potentially significant?
The overhead I'm worried about is the performance cost in
the parser. The memory doesn't worry me, because computers get
more memory every year while Python source stays compact.
It's the usual debate of maintainability vs. optimization.
> What happens with decorators, which modify functions but are
> not explicit source-level transformations?
I don't know about that. I guess decorator declarations are
included in the source of a function.
> Since this is already fairly straightforward to implement via inspect,
> I'd like to see a pretty strong justification for its real need before
> seeing it go in.
'inspect' can't recover the source of functions defined with 'exec'
or interactively (or can it?). Moreover, if you edit a function's
source file at runtime, inspect shows you the edited file rather than
the code that is actually running.
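To illustrate the exec case (this is stock CPython behaviour, no patch
involved):

    import inspect
    ns = {}
    exec "def f(x): return x + 1" in ns
    try:
        inspect.getsource(ns['f'])
    except IOError, e:
        # typically "could not get source code"
        print "inspect.getsource failed:", e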
As I stated in the OP on c.l.py, this can all be done in Python
as it is, by using some framework (Zope, whatever). In fact I started
from such a framework, but then it occurred to me that it's
the job of the parser to do this.
In other words, with __pycode__ Python itself becomes the framework.
But I'm not here to sell Amway products :)
When/if the pythoneers decide they like it, let me know.
> > > It would be awfully nice (on posix platforms, for my
> > > use-case) to find out whether a file is inaccessible
> > > due to permission restrictions, or due to non-existence.
Guido van Rossum:
> > Why can't you solve this by doing a stat() when access()
> > returns False?
> In my current case, stat() gives an error (even though the
> file exists):
> OSError: [Errno 2] No such file or directory:
> Running the same stat() call in the interpreter (as opposed
> to inside my mod_python app) gives no such error; I get normal stat
> output, so the file does exist. I figured since the app was running
> as LocalSystem it was a permissions issue. [One quick sanity check
> later] Yes, if I run Apache2 under my account, stat() does not error.
> Hm. I see now I'm following the wrong issue. It has more to
> do with how Windows shares mapped drives between users (it doesn't).
> If I use the UNC path, I don't have an issue.
Bah. Spoke too soon. I still have the issue even if I use UNC paths.
In a Pythonwin interactive session (under my logon):
os.stat() returns a tuple,
os.path.exists() returns True, and
os.access() returns True.
Inside the app (running under LocalSystem on the same Win2k machine):
os.stat() raises OSError: [Errno 2] No such file or directory,
os.path.exists() returns False, and
os.access() returns False.
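Coming back to the original POSIX question at the top of this
sub-thread: a minimal sketch of the stat()-after-access() approach
Guido suggested might look like this (the helper name is just for
illustration, and the except clause is Python 2 syntax):

    import os, errno

    def why_unreadable(path):
        # If access() says no, ask stat() why.
        if os.access(path, os.R_OK):
            return "readable"
        try:
            os.stat(path)
        except OSError, e:
            if e.errno == errno.ENOENT:
                return "does not exist"
            if e.errno == errno.EACCES:
                return "blocked by permissions on a parent directory"
            raise
        return "exists, but read permission is denied"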
I was just wondering about the status of PEP 310 ("with" statement) -
has there been any consensus/plan to implement it? (I tried to google
the answer, but failed ;-)
How about the potential inclusion of user-defined "blocks"? I suppose
this would only be a Python 3000 thing, if it's ever included?
"Martin v. Löwis" wrote:
> If you meant to parse
> as "<integer 1>" "." "<identifier __class__>", not as "<float 1.0>"
> "<identifier __class_>"
On a related note ...
All this would just go away if, in Python 3.0 (or even earlier ;), floats required something after the decimal point, i.e. to get '<float 1.0>' you had to type '1.0', and '1.' by itself was a syntax error.
I've never actually seen the use case for having '1.' parse as '<float 1.0>', although I'm sure there is one.
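For anyone who hasn't bumped into it, the current behaviour looks
roughly like this in an interactive session (Python 2.x, output
abbreviated):

    >>> 1 .__class__        # the space keeps '1' an integer token
    <type 'int'>
    >>> (1).__class__
    <type 'int'>
    >>> 1.__class__         # '1.' is consumed as a float literal first
    SyntaxError: invalid syntax
    >>> 1.                  # the bare trailing-dot literal in question
    1.0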
Nick Coghlan wrote:
> Ralf W. Grosse-Kunstleve wrote:
> > A pure Python program will spend the bulk of the time interpreting
> > bytecode.
> Perhaps, perhaps not.
Right. But remember what the actual goal is: we want to answer the
question "Is it worth reimplementing a piece currently written
in Python in C/C++?"
> A heck of a lot of the Python core isn't written
> in Python - time spent executing builtins, or running methods of builtin
> objects usually doesn't involve the main interpreter loop (we're
> generally in pure C-code at that point).
If a piece of Python code leads to heavy use of complex, time-consuming
builtin operations it will be of less benefit to reimplement that code
in C/C++. This is exactly what we want to learn.
> I'm curious how the suggested feature can provide any information that
> is actually useful for optimisation purposes. Just because a low
> proportion of time is spent in Python code, doesn't mean the Python code
> isn't at fault for poor performance.
> As an example, in CPython 2.3 and earlier, this:
> result = ""
> for x in strings:
>     result += x
> is a lot worse performance-wise than:
> result = "".join(strings)
> The first version does spend more time in Python code, but the
> performance killer is actually in the string concatenation C code. So
> the time is spent in the C code, but the fault lies in the Python code
> (In Python 2.4, the latter version is still faster, but the difference
> isn't as dramatic as it used to be).
Exactly. If you try out my patch and look at time/ticks you will see
immediately that there is no point in reimplementing "".join(strings)
in C/C++. Importantly, you don't have to look at the code to arrive at
this conclusion. The time/tick alone will tell you. This is very
helpful if you are working with third-party code.
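If anyone wants to see Nick's join-vs-concatenation point in numbers
without my patch, a quick sketch with the stdlib timeit module does the
trick; absolute timings are machine- and version-dependent, only the
ratio is interesting:

    import timeit
    setup = "strings = ['spam'] * 10000"
    # join the list in one go vs. repeated += concatenation
    print timeit.Timer("''.join(strings)", setup).timeit(100)
    print timeit.Timer("r = ''\nfor x in strings: r += x", setup).timeit(100)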
> Knowing "I'm spending x% of the time executing Python code" just isn't
> really all that interesting,
Right. Sorry if I gave the wrong impression that this could be
interesting. It is indeed not. What is interesting is the estimated
benefit of reimplementing a piece of Python in C/C++. This is in
fact highly correlated with the time/tick.
> I'd rather encourage people to write appropriate benchmark scripts and
> execute them using "python -m profile <benchmark> ",
This approach is impractical/impossible in the real world.
For example, this is the problem that prompted me to implement the
patch. It involves a library where:
- It is not our code, i.e. it is difficult for us to know where
  the time is spent.
- It makes heavy use of Numeric.
- It has a few innermost loops implemented in C.
We are using only some parts of this library.
Question: if we reimplement these parts completely in C++, what speedup
can we expect?
So we run a whole calculation and print the time/tick, which you can do
with less than a one-minute investment by adding

    print time.time() / sys.gettickeraccumulation()

as the last statement of your code. If the printed value is close to 0.15
on our reference platform, we know that the speedup will be in the
neighborhood of 100. Any value higher than 0.15 indicates that the
expected speedup will be less. In our case the value was 0.35, and after
we did the reimplementation in C++ we found a speedup of about 10. We
have other applications with time/tick around 10. Just looking at this
number tells us that there is not much to gain unless we completely
eliminate Python. Bring in the cost of the C++ programmer and the
increased cost of maintaining the C++ code compared to Python, and you
know what we decided (not) to do.
> rather than lead
> them up the garden path with a global "Python/non-Python" percentage
> estimation utility.
Please consider that the utility is simply printing
time.time()/sys.gettickeraccumulation(), that my patch is trivial, and
that the runtime penalty is close to nonexistent.