I've been thinking about some ideas for reducing the
amount of refcount adjustment that needs to be done,
with a view to making GIL removal easier.
1) Permanent objects
In a typical Python program there are many objects
that are created at the beginning and exist for the
life of the program -- classes, functions, literals,
etc. Refcounting these is a waste of effort, since
they're never going to go away.
So perhaps there could be a way of marking such
objects as "permanent" or "immortal". Any refcount
operation on a permanent object would be a no-op,
so no locking would be needed. This would also have
the benefit of eliminating any need to write to the
object's memory at all when it's only being read.
2) Objects owned by a thread
Python code creates and destroys temporary objects
at a high rate -- stack frames, argument tuples,
intermediate results, etc. If the code is executed
by a thread, those objects are rarely if ever seen
outside of that thread. It would be beneficial if
refcount operations on such objects could be carried
out by the thread that created them without locking.
To achieve this, two extra fields could be added
to the object header: an "owning thread id" and a
"local reference count". (The existing refcount
field will be called the "global reference count"
in what follows.)
An object created by a thread has its owning thread
id set to that thread. When adjusting an object's
refcount, if the current thread is the object's owning
thread, the local refcount is updated without locking.
If the object has no owning thread, or belongs to
a different thread, the object is locked and the
global refcount is updated.
The object is considered garbage only when both
refcounts drop to zero. Thus, after a decref, both
refcounts would need to be checked to see if they
are zero. When a decrement takes the local refcount
to zero, the global refcount can be checked without
locking, since a zero will never be written to it
until the object truly has no non-local references
remaining.
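To make this concrete, here is a rough Python model of the bookkeeping
for both ideas (purely illustrative; the real version would live in C
in the object header, and threading.get_ident stands in for whatever
thread id the runtime would use):

import threading

def dealloc(obj):
    print("object reclaimed")      # stand-in for real deallocation

class Obj:
    def __init__(self, permanent=False):
        self.permanent = permanent             # idea 1: immortal flag
        self.owner = threading.get_ident()     # idea 2: owning thread id
        self.local_rc = 1                      # local reference count
        self.global_rc = 0                     # global reference count
        self.lock = threading.Lock()

def incref(obj):
    if obj.permanent:
        return                                 # no-op: no write at all
    if obj.owner == threading.get_ident():
        obj.local_rc += 1                      # owner thread: no locking
    else:
        with obj.lock:
            obj.global_rc += 1

def decref(obj):
    if obj.permanent:
        return
    if obj.owner == threading.get_ident():
        obj.local_rc -= 1
        # Unlocked read is safe: a zero is never written to global_rc
        # while non-local references remain.
        if obj.local_rc == 0 and obj.global_rc == 0:
            dealloc(obj)
    else:
        with obj.lock:
            obj.global_rc -= 1
            if obj.global_rc == 0 and obj.local_rc == 0:
                dealloc(obj)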
I suspect that these two strategies together would
eliminate a very large proportion of refcount-related
activities requiring locking, perhaps to the point
where those remaining are infrequent enough to make
GIL removal practical.
--
Greg
Hi everyone,
In Python, the humble colon (:) has multiple uses:
1. as a signal of an indentation increase, introducing a block of code, such as
1a) for function or class definitions
1b) for while/for/if/elif/else blocks
1c) for try/except/finally blocks
In these cases, the majority opinion (to which I subscribe) is that
using a colon increases readability. I am NOT suggesting removing
the colon in those instances.
However, the colon has also some other uses.
2. in slices [a:b:c]
3. in dict assignments {a:b}
4. in lambda assignments (lambda x: x+1)
I would argue that, in these last three examples, there might be better choices.
(some of these choices have been inspired by reading
http://www.resolversystems.com/documentation/index.php/Differences_Between_…)
I don't expect the following suggestions to immediately convince
everyone (or anyone!) ... but, at least they will be on record.
Slices:
---------
I would argue that the usual slicing notation would be more readable
if it were as follows:
[a -> b; c]
Thus
[1:10:2] would become [1 -> 10; 2]
[1:10] would become [1 -> 10]
The "shorter" combinations would not gain in terms of readability;
they would be as follows:
[ :10 : 2] would become [10; 2]
[ :10] would become [10;]
[:: -1] would become [; -1]
[:] would become [;]
If such a change were to be made, a second slicing notation, *with a
different meaning*, could be introduced:
[a => b; c]
This would be an inclusive range, i.e.
[a => b] is equivalent to [a -> b+1]
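For comparison, here is how those forms line up with today's colon
notation (an illustrative session):

>>> lst = list(range(20))
>>> lst[1:10:2]       # proposed: lst[1 -> 10; 2]
[1, 3, 5, 7, 9]
>>> lst[1:10]         # proposed: lst[1 -> 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> lst[1:10+1]       # proposed inclusive form: lst[1 => 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]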
dict assignments
------------------------
Here again, I would argue that using "->" instead of ":" would make
the code more readable - at least for beginners.
numbers = {'one' -> 1, 'two' -> 2} instead of
numbers = {'one': 1, 'two': 2}
lambda assignments
---------------------------
Once again, same choice.
lambda x -> x+1
is, I think, more readable than
lambda x: x+1
(but perhaps the last two [dicts and lambda] largely depend on the
font choice...)
======
Other considerations:
If "->" were to be adopted for dict or lambda assignments, then the
"naturalness" of their choice for slices would be reduced. An
alternative might be inspired from the mathematical notation
[a, ..., b; c]
I realize that this is "much" longer than [a: b: c].
Final comment:
I have seen other alternatives for simple slices suggested in the past such as
[a..b] and [a...b] which would be the equivalent of [a->b] and [a=>b];
however, the extra "." might sometimes be difficult to read, whereas
the difference between "->" and "=>" is much easier to see.
Cheers,
André
Since PEP 3132 gives us:
>>> x = [1,2,3]
>>> a, *b = x
>>> a
1
>>> b
[2, 3]
it seems natural that we should be able to do it the other way too:
(doesn't actually work)
>>> a, b = 1, [2,3]
>>> x = [a,*b]
>>> x
[1, 2, 3]
This is essentially itertools.chain, but of course it isn't nearly as much fun:
>>> [n for n in itertools.chain([1],[2,3])]
[1, 2, 3]
Now you might be thinking, "yeah, that's cool, but you don't really
need it" but this actually came up in practice: I have a function that
has a certain behavior for standard types, but when it sees a type it
doesn't recognize, it calls a protocol function (like __iter__ or
__next__) and expects to receive an iterable whose 0th element is a
string. Now I'm probably not going to use itertools.chain because the
only members of itertools I can reliably remember are count() and
izip(), so my next alternative (which is perfectly acceptable) is
>>> a, b = 1, [2,3]
>>> x = [a] + b
This isn't too bad, but it is slightly less clear than x = [a,*b] and
much less efficient when b is long.
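That said, when x only needs to be iterated over rather than indexed,
chain avoids building a new list at all:

>>> import itertools
>>> a, b = 1, [2, 3]
>>> for n in itertools.chain([a], b):   # no new list is built
...     print(n)
...
1
2
3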
It seems so simple...makes me think it came up before and I missed it.
David
hi!
I made that proposal back in 2001 (stated here:
http://bugs.python.org/issue449227 ); the main points, quoting
myself:
<<<
I use rlcompleter extensively in interactive Python
mode.
I think it could be cool if a "(" were appended
to callable objects
when completed. This way it will be much faster to
program, without looking up __doc__. For example:
>>> f.fil<TAB>
will give:
>>> f.fileno(_
("_" is to mark cursor position)
and:
>>> f.so<TAB>
will (as before) give:
>>> f.softspace _
One more illustration:
>>> f = open("myfile", "w")
>>> f.
f.__class__( f.__repr__( f.next(
f.__delattr__( f.__setattr__( f.read(
f.__doc__ f.__str__( f.readinto(
f.__enter__( f.close( f.readline(
f.__exit__( f.closed f.readlines(
f.__getattribute__( f.encoding f.seek(
f.__hash__( f.fileno( f.softspace
f.__init__( f.flush( f.tell(
f.__iter__( f.isatty( f.truncate(
f.__new__( f.mode f.write(
f.__reduce__( f.name f.writelines(
f.__reduce_ex__( f.newlines f.xreadlines(
>>> f.
- nice to remember which attributes are methods and which
aren't.
>>>
There are patches recently made by Manuel Muradás and Facundo Batista
for 2.6 (so I assume some people are interested in the patch, not only
me), but I have no idea how (and whether) it will ever get into Python
2.6 and 3k. It's too small a feature to warrant a PEP, so what is the
process? I hope rlcompleter will not get obsoleted before the patch is
accepted ;-)
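In the meantime, the behaviour can be approximated from within a
session by subclassing rlcompleter.Completer; a rough sketch (not the
actual patch):

import __main__
import readline
import rlcompleter

class CallableCompleter(rlcompleter.Completer):
    """Append "(" to completions that name callables."""

    def _with_parens(self, matches):
        out = []
        for m in matches:
            try:
                obj = eval(m, self.namespace)  # note: may trigger properties
            except Exception:
                obj = None
            out.append(m + "(" if callable(obj) else m)
        return out

    def global_matches(self, text):
        return self._with_parens(
            rlcompleter.Completer.global_matches(self, text))

    def attr_matches(self, text):
        return self._with_parens(
            rlcompleter.Completer.attr_matches(self, text))

readline.set_completer(CallableCompleter(__main__.__dict__).complete)
readline.parse_and_bind("tab: complete")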
Thanks!
Regards,
Roman
Hello,
I've made some improvements to PyUnit which add the concept of a
skipped test (when, for instance, setUp fails), and some other
stuff. These changes are test-code-wise backward compatible, in the
sense that old tests will just work. But since the output of running
the tests is different, tools using that output at any level (reading
the text, or running the tests programmatically) will fail. I think it
would be possible to have a switch somewhere to make even those tools
backward compatible.
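To give an idea of the shape of the change, here is a rough sketch of
a skip-aware result object (names hypothetical, not the actual code):

import unittest

class SkipAwareResult(unittest.TestResult):
    def __init__(self):
        unittest.TestResult.__init__(self)
        self.skipped = []          # list of (test, reason) pairs

    def addSkip(self, test, reason):
        """Record a test that never ran, e.g. because setUp failed."""
        self.skipped.append((test, reason))

    def wasSuccessful(self):
        # Skips are reported but do not fail the run.
        return not (self.errors or self.failures)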
Are there any chances of getting this code in Python 3k? I'd need to
work on it, polish it, write more tests for it, etc, so I'd like to
know if it'd be welcome or not before investing the time.
Thank you.
--
J. Pablo Fernández <pupeno(a)pupeno.com> (http://pupeno.com)
Temporarily using pupeno(a)gmail.com.
In the spirit of going in the reverse direction of turning print into
a function, what does python-ideas think of the sort statement?
numlist = [2,5,4]
sort numlist
sort numlist asc # borrowed from SQL ORDER BY statement
sort numlist desc
sort by employee.last_name for employee in employee_list # this uses key sorting
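For reference, today's spellings of the statements above:

numlist = [2, 5, 4]
numlist.sort()                    # sort numlist  /  sort numlist asc
numlist.sort(reverse=True)        # sort numlist desc

employee_list = []                # placeholder for the example
employee_list.sort(key=lambda employee: employee.last_name)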
The main advantage is that it is impossible to make this mistake:
x = y.sort()
when you really mean
x = sorted(y)
Cheers,
David
I would like to propose that the + operator be a synonym for | (union)
on sets. IMHO, the succinct power of being able to say
sum(list_of_sets, set())
as opposed to
import operator
reduce(operator.or_, list_of_sets, set())
far outweighs any problems with having two operators for union. The
sum paradigm is certainly more readable as well. I realize that a
function named "unionall" could be defined trivially, but with a
built-in already serving the purpose in a readable and reasonable way,
it seems silly to not use it.
I anticipate that someone will ask why such a paradigm should be
available for union as opposed to the other set operations. My answer
is that several standard algorithms rely on a union over a sequence of
sets; one example is the construction of the "follow" set in
constructing an LL or SLR parser, and I know I've seen others that I
cannot remember off the top of my head. There is even a mathematical
symbol (big-U) for it akin to the summation sign (uppercase sigma).
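To make the comparison concrete, here is the reduce spelling next to
what the proposal would allow (union_all is just an invented name):

import operator
try:
    from functools import reduce   # 3.x; a builtin in 2.x
except ImportError:
    pass

def union_all(list_of_sets):
    return reduce(operator.or_, list_of_sets, set())

# Under the proposal this would simply be: sum(list_of_sets, set())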
Any feedback?
Brandon
>>>>> "Arnaud" == Arnaud Delobelle <arnodel(a)googlemail.com> writes:
Arnaud> I've written a patch [1] that does that. Following the suggestion
Arnaud> of Raymond Hettinger, I've implemented set.intersection by sorting
Arnaud> all its sets/frozensets/dicts in increasing order of size first,
Arnaud> then iterating over the smallest.
Hi Arnaud
I don't know if you'll do any benchmarking on this, but I'd suggest:
- First find the set with the smallest size (this is O(n) work).
- If that size is sufficiently small and the number of sets is
sufficiently large (numbers to be determined by testing), don't sort the
sets by size - just go for it.
- Else do the sorting.
The point being that if the smallest set is already quite small, the size
of the intersection is already tightly bounded and you're possibly going to
do an expensive sort that's really not needed. The O(n) work to find the
smallest is tiny compared to just blindly doing O(n lg n) immediately. Most
of the juice you get from moving from small to big sets comes from starting
with the smallest.
A few benchmarks should give an idea of when to sort.
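In code, the heuristic I have in mind is something like this (a
sketch; the thresholds are invented and would be set by those
benchmarks):

def order_for_intersection(sets, small_enough=8, many_enough=20):
    smallest = min(sets, key=len)        # O(n) scan, no sort
    if len(smallest) <= small_enough and len(sets) >= many_enough:
        # The result is already tightly bounded; skip the O(n lg n)
        # sort and just put the smallest set first.
        return [smallest] + [s for s in sets if s is not smallest]
    return sorted(sets, key=len)         # otherwise sort by size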
BTW, having a quick look at your diff (not the patched source) it looks
like you're testing each of the elements of the smallest set against all
other hashtables. I haven't thought about it much, but that seems to partly
defeat the purpose of sorting. Speed will depend on the inputs, but I'd
have guessed that in general you should be testing each member of the
smallest for presence in the next set, short-circuiting if empty, then
testing each of the survivors against the next set, etc. That's more of a
"vertical" approach than the horizontal one you take across all the
hashtables (possibly also with a speed benefit due to locality of
reference).
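In pure Python, the vertical approach I mean is roughly this (the real
code would of course be at the C level):

def intersection(*sets):
    ordered = sorted(sets, key=len)      # smallest first
    survivors = set(ordered[0])
    for s in ordered[1:]:
        survivors &= s                   # keep members also present in s
        if not survivors:                # short-circuit when empty
            break
    return survivors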
Also, why not test first against the iterables that are not hashtables?
Wouldn't that be faster in the (common?) case of many sets being passed for
intersection?
Sorry if this is all clueless - it's just my thinking as I looked at your
diff.
Regards,
Terry
Hello,
To remove bottlenecks I usually instrument some functions in my application
inside a dedicated test and set speed goals there until they are met.
Then, when doable, I leave the test in place to guard against speed
regressions, translating times into pystones.
Unless I missed something in the standard library, I feel like there's a
missing tool to do it simply:
- the timeit module is nice to try out small code snippets but is not really
adapted to manually profile the code of an existing application
- the profile module is nice to profile an application as a whole but is not
very handy to gather statistics on specific functions in their real
execution context
What about adding a decorator that fills a statistics mapping in memory
(time+stones), like this:
>===========
import time
import sys
import logging
from test import pystone

benchtime, stones = pystone.pystones()

def secs_to_kstones(seconds):
    return (stones * seconds) / 1000

stats = {}

def reset_stats():
    global stats
    stats = {}

def log_stats():
    template = '%s : %.2f kstones, %.3f seconds'
    for key, v in stats.items():
        logging.debug(template % (key, v['stones'], v['time']))

if sys.platform == 'win32':
    timer = time.clock
else:
    timer = time.time

def profile(name='stats', stats=stats):
    def _profile(function):
        def __profile(*args, **kw):
            start_time = timer()
            try:
                return function(*args, **kw)
            finally:
                total = timer() - start_time
                kstones = secs_to_kstones(total)
                stats[name] = {'time': total, 'stones': kstones}
        return __profile
    return _profile
>===========
This allows instrumenting the application by decorating some functions,
either inside the application or in a dedicated test:
>======
def my_test():
    my.app.slow_stuff = profile('seem slow')(my.app.slow_stuff)
    my.app.other_slow_stuff = profile('seem slow too')(my.app.other_slow_stuff)

    # should not take more than 40k pystones!
    assert stats['seem slow too']['stones'] < 40

    # let's log them
    log_stats()
>======
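Inside the application itself, the same instrumentation is just a
decorator line, for instance:

>======
@profile('parse')
def parse(data):
    return data.splitlines()
>======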
Regards,
Tarek
--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/