I've been thinking about some ideas for reducing the
amount of refcount adjustment that needs to be done,
with a view to making GIL removal easier.
1) Permanent objects
In a typical Python program there are many objects
that are created at the beginning and exist for the
life of the program -- classes, functions, literals,
etc. Refcounting these is a waste of effort, since
they're never going to go away.
So perhaps there could be a way of marking such
objects as "permanent" or "immortal". Any refcount
operation on a permanent object would be a no-op,
so no locking would be needed. This would also have
the benefit of eliminating any need to write to the
object's memory at all when it's only being read.
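As a rough illustration of the idea, here is a toy model in pure Python. It is only a sketch: the names ToyObject, IMMORTAL, incref and decref are made up for this example, and a real implementation would live in the C object header.

```python
# Toy model of "permanent" objects; all names here are hypothetical.
IMMORTAL = -1  # assumed sentinel meaning "never refcounted"


class ToyObject:
    def __init__(self, name, permanent=False):
        self.name = name
        self.refcount = IMMORTAL if permanent else 1
        self.writes = 0  # counts writes to the simulated object header


def incref(obj):
    if obj.refcount == IMMORTAL:
        return  # no-op: nothing written, so nothing to lock
    obj.refcount += 1
    obj.writes += 1


def decref(obj):
    """Return True if the object has become garbage."""
    if obj.refcount == IMMORTAL:
        return False  # permanent objects never go away
    obj.refcount -= 1
    obj.writes += 1
    return obj.refcount == 0


none_like = ToyObject("None", permanent=True)
for _ in range(1000):
    incref(none_like)
    decref(none_like)
print(none_like.writes)  # 0: the header was never touched
```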
2) Objects owned by a thread
Python code creates and destroys temporary objects
at a high rate -- stack frames, argument tuples,
intermediate results, etc. If the code is executed
by a thread, those objects are rarely if ever seen
outside of that thread. It would be beneficial if
refcount operations on such objects could be carried
out by the thread that created them without locking.
To achieve this, two extra fields could be added
to the object header: an "owning thread id" and a
"local reference count". (The existing refcount
field will be called the "global reference count"
in what follows.)
An object created by a thread has its owning thread
id set to that thread. When adjusting an object's
refcount, if the current thread is the object's owning
thread, the local refcount is updated without locking.
If the object has no owning thread, or belongs to
a different thread, the object is locked and the
global refcount is updated.
The object is considered garbage only when both
refcounts drop to zero. Thus, after a decref, both
refcounts would need to be checked to see if they
are zero. When the local refcount is decremented and
reaches zero, the global refcount can be checked
without locking, since a zero will never be written
to it until the object truly has zero non-local
references remaining.
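The scheme can be sketched as a toy model in pure Python. This is only an illustration under stated assumptions: ToyObject, incref and decref are hypothetical names, the per-object lock stands in for whatever locking the interpreter would use, and the memory-ordering details a real C implementation would need are ignored.

```python
import threading


class ToyObject:
    """Hypothetical object header with an owner and two refcounts."""

    def __init__(self):
        self.owner = threading.get_ident()  # owning thread id
        self.local_refs = 1                 # the creator holds one reference
        self.global_refs = 0                # references held by other threads
        self.lock = threading.Lock()


def incref(obj):
    if obj.owner == threading.get_ident():
        obj.local_refs += 1                 # owner: no locking
    else:
        with obj.lock:                      # anyone else: lock, go global
            obj.global_refs += 1


def decref(obj):
    """Return True if the object is now garbage (both counts are zero)."""
    if obj.owner == threading.get_ident():
        obj.local_refs -= 1
        return obj.local_refs == 0 and obj.global_refs == 0
    with obj.lock:
        obj.global_refs -= 1
        return obj.global_refs == 0 and obj.local_refs == 0


obj = ToyObject()
seen = []


def worker():
    incref(obj)                 # takes the locked, global path
    seen.append(decref(obj))    # owner still holds a local reference


t = threading.Thread(target=worker)
t.start()
t.join()
print(seen)          # [False]
print(decref(obj))   # True: the owner dropped the last reference
```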
I suspect that these two strategies together would
eliminate a very large proportion of refcount-related
activities requiring locking, perhaps to the point
where those remaining are infrequent enough to make
GIL removal practical.
--
Greg
AFAIK if you want enumerations you must either create your own, or use a
third party module, or use namedtuple:
import collections
Files = collections.namedtuple("Files", "minimum maximum")(1, 200)
...
x = Files.minimum
Using:
MINIMUM, MAXIMUM = 1, 200
is often inconvenient, since you might have several different ones. Of
course you could do:
MIN_FILES = 1
MIN_DIRS = 0
Personally, I like enums and consider them to be fundamental, but I
don't like the above approaches.
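To make the comparison concrete, here is a small sketch of the namedtuple approach with two separate groups of constants. The Dirs group and its maximum value are invented for the example.

```python
from collections import namedtuple

# One namedtuple instance per group of related constants.
Files = namedtuple("Files", "minimum maximum")(1, 200)
Dirs = namedtuple("Dirs", "minimum maximum")(0, 50)  # 50 is a made-up value

print(Files.minimum, Dirs.minimum)  # 1 0

# The values cannot be rebound by accident, unlike module-level names.
try:
    Files.minimum = 5
except AttributeError:
    print("Files.minimum is read-only")
```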
There is an enum module in PyPI
http://pypi.python.org/pypi/enum/
and there are several versions in the Python Cookbook.
Wouldn't one of these be worth adopting for the standard library?
--
Mark Summerfield, Qtrac Ltd., www.qtrac.eu
Maybe dictionary unpacking would be a nice thing?
>>> d = {'foo': 42, 'egg': 23}
>>> {'foo': bar, 'egg': spam} = d
>>> print bar, spam
42 23
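For comparison, the closest spelling I know of without new syntax uses operator.itemgetter (shown here in Python 3 form, so print is a function):

```python
from operator import itemgetter

d = {'foo': 42, 'egg': 23}

# itemgetter with several keys returns a tuple, which unpacks normally.
bar, spam = itemgetter('foo', 'egg')(d)
print(bar, spam)  # 42 23
```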
What do you think? Bad idea? Good idea?
-panzi
Greg wrote:
> What's needed is something very concise and unobtrusive,
> such as
>
> x, y => x + y
As inspired by Prolog:
x, y :- x + y
So this:
f = lambda x, y: x ** 2 + y ** 2
Or this:
def f(x, y): return x ** 2 + y ** 2
Becomes this:
f = x, y :- x ** 2 + y ** 2
And would logically transpose into this:
f(x, y) :- x ** 2 + y ** 2
Oooo...this rabbit hole is fun!
Joel
[follow up from py3k.devel list]
Neil Toronto wrote:
> Yep. In my seven years of CS instruction so far, I've only come across
> this once, in a theory of programming languages course. "Lambda" simply
> doesn't show up unless you do language theory or program in a Lisp... or
> in Python.
Since you mention Haskell below:
> It's a little less terse than Haskell's "\->"
it's worth pointing out that Haskell uses the backslash syntax because
it is the nearest ASCII equivalent to the (lower-case) letter lambda.
For example, see http://en.wikibooks.org/wiki/Haskell/More_on_functions
or the Google results for "haskell lambda backslash" (without the quotes).
Malte
Hello,
inspired by Greg's post about ideas on making the lambda syntax more
concise, like:
x,y => x+y
I was wondering if using unnamed arguments had already been debated.
Something like:
\(_1+_2)
where basically you're implicitly declaring that your lambda
takes two arguments. You wouldn't be able to call it with keyword
arguments, nor accept a variable number of arguments (nor accept
more arguments than are actually used), but wouldn't it cover most
use cases and be really compact?
Other examples:
k.sort(key=\(_1.foo))
k.sort(key=\(_1[0]))
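For reference, these particular cases can already be written without lambda by using the operator module. In this sketch the Item type and the sample data are invented for the example.

```python
from collections import namedtuple
from operator import add, attrgetter, itemgetter

Item = namedtuple("Item", "foo")  # hypothetical record type

k = [Item(3), Item(1), Item(2)]
k.sort(key=attrgetter("foo"))     # same effect as key=lambda x: x.foo
print([item.foo for item in k])   # [1, 2, 3]

pairs = [(2, "b"), (1, "a")]
pairs.sort(key=itemgetter(0))     # same effect as key=lambda x: x[0]
print(pairs)                      # [(1, 'a'), (2, 'b')]

print(add(1, 2))                  # 3, same as (lambda x, y: x + y)(1, 2)
```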
--
Giovanni Bajo
Hello,
(this is an idea for Python 3)
Is there any reason for keeping the directory lib-tk in Lib? I
believe renaming it to tkinter and making it a package would make more
sense. The Tkinter module's code could then reside in __init__.py.
Another change that could be made in this package would be renaming some
modules:
Dialog -> dialog
FileDialog -> filedialog
FixTk -> fixtk
...
Also, I believe tkSimpleDialog and dialog could be in a single module.
There are other modules like tkColorChooser and tkCommonDialog and
even tkSimpleDialog (and some others) that I'm not totally sure what
to do about, but I think they should possibly reside in a single
module.
That's all. Thanks,
--
-- Guilherme H. Polo Goncalves
Hello,
I've read the idea about "preparing an existing RPC mechanism for the
standard library" at StandardLibrary ideas and I would be interested
in doing it, but as you all know, getting something included in the
stdlib is not exactly easy, nor should it be. Also, I'm not even sure
if this idea is still wanted.
I'm considering the inclusion of rpyc, with appropriate changes
(possibly lots), and would like to know your opinions on this.
Thanks,
--
-- Guilherme H. Polo Goncalves
(with apologies for the random extra level of quoting in the below...)
> On Thu, Mar 13, 2008 at 11:09 AM, Imri Goldberg <lorgandon(a)gmail.com>
> wrote:
>
> > As I said earlier, I'd like static checkers (like Python-Lint) to catch
> > this sort of cases, whatever the decision may be.
> >
>
Hmm. Isn't that tricky? How does the static checker decide
whether the objects being compared are floats? I guess one could
be content with catching some cases where the operands to ==
are clearly floats... Wouldn't you have to have run-time warnings
to be really sure of catching all the cases?
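To show what "catching some cases" might look like, here is a minimal sketch built on the standard ast module. It only flags == and != comparisons where one operand is a float literal. The function name find_float_equality is made up, and this is not how any existing lint tool is known to work.

```python
import ast


def find_float_equality(source):
    """Return line numbers of ==/!= comparisons with a float literal operand."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        # A chained comparison a == b == c has operands [a, b, c].
        operands = [node.left] + node.comparators
        for i, op in enumerate(node.ops):
            if not isinstance(op, (ast.Eq, ast.NotEq)):
                continue
            pair = (operands[i], operands[i + 1])
            if any(isinstance(o, ast.Constant) and isinstance(o.value, float)
                   for o in pair):
                hits.append(node.lineno)
                break
    return hits


sample = "a = 1.0\nif a == 0.0:\n    pass\nif a == b:\n    pass\n"
print(find_float_equality(sample))  # [2]
```

The second comparison (a == b) goes unreported even though a is a float at run time, which is exactly the limitation described above.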
> > It's already too late for Python 3.0.
> > Still, I believe it is worth discussing.
> >
>
Sure. I didn't mean that to come out in quite the dismissive way it did :).
Apologies. Maybe a PEP aimed at Python 4.0 is in order. If you're open
to the idea of just having some way to enable warnings, it could be
much sooner.
> > While checking against a==0.0 (and other similar conditions) before
> > dividing will indeed protect from outright division by zero, it will
> > enlarge any error you will have in the computation. I guess it would be
> > better to do the same check for 'a is small' for appropriate values of
> > 'small'.
>
>
Still, a check for 0.0 is good enough in some cases: if a is tiny,
large intermediate values may appear and then disappear happily
before giving a sensible final result. These are usually the sort
of cases where just having division by 0.0 return an infinity
would have "just worked" too (making the whole "if" redundant), but
that's not (currently!) an option in Python.
It's a truism that floating-point equality tests should be avoided, but
it's just not true that floating-point equality testing is *always* wrong,
and I don't think that Python should make it so.
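A small sketch of the kind of case I mean (the function ratio is invented for this example): the only input that needs guarding is exactly 0.0, and a tiny a produces a huge intermediate value that then disappears again.

```python
import math


def ratio(a):
    """Compute a / (1 + a) by way of the intermediate value 1 / a."""
    if a == 0.0:
        # An exact equality test is the right guard here: only 0.0 itself
        # makes the division below raise ZeroDivisionError.
        return 0.0
    return 1.0 / (1.0 + 1.0 / a)


print(ratio(1.0))   # 0.5
tiny = 1e-300
# The intermediate 1 / tiny is about 1e300, yet the result is sensible.
print(math.isclose(ratio(tiny), tiny))  # True

try:
    1.0 / 0.0
except ZeroDivisionError:
    print("division by 0.0 raises instead of returning an infinity")
```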
> > Actually, one of the reasons I thought about this subject in the first
> > place, was dict lookup for floating point numbers. It seems to me that
> > it's something you just shouldn't do.
> >
>
So your proposal would presumably include making
x in dict
and
x not in dict
errors for any float x, regardless of the contents of the dictionary
(or list, or set, or frozenset, or...) dict?
What would you do about Decimals? A Decimal is just another
floating point format (albeit base 10 instead of base 2); so
presumably all these warnings/errors should apply equally
to Decimal instances? If not, why not?
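To make the two questions concrete, here is a short demonstration of how float keys can miss in a dict lookup, and how Decimal shows the same kind of behaviour once rounding is involved (default decimal context assumed):

```python
from decimal import Decimal

table = {0.3: "three tenths"}
x = 0.1 + 0.2
print(x in table)    # False: 0.1 + 0.2 is not exactly 0.3 in binary
print(0.3 in table)  # True: the very same float is found without trouble

# Decimal is exact for short decimal fractions...
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
# ...but it still rounds, so equality can fail in the same way.
third = Decimal(1) / Decimal(3)
print(third * 3 == Decimal(1))  # False
```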
I'm not trying to be negative here---as Aahz says, this is an
interesting idea; I'm just trying to understand exactly how
things might work.
Mark
I would suggest adding question 4.28 to the FAQ. Everybody who learns Python
reads it, and then nobody will need to ask about "one-element tuples". I have
compiled the answer from responses to my last question about one-element tuples.
Question: Why does Python allow a comma at the end of a list? This looks
ugly and seems to break common rules...
Answer: There are many reasons, as follows.
1. If you define a multiline dictionary
d = {
"A": [1, 5],
"B": [6, 7], # last trailing comma is optional but good style
}
it is easier to add more elements, because you don't have to care
about commas -- you always put a comma at the end of the line and don't
have to re-edit other lines. It eases sorting of such lines too -- just
cut a line and paste it above.
2. A missing comma can lead to errors that are hard to diagnose. For example:
x = [
"fee",
"fie"
"foo",
"fum"
]
contains three elements: "fee", "fiefoo" and "fum". So a programmer who
always puts a comma at the end of the line saves a lot of trouble in the future.
3. Nearly all reasonable programming languages (C, C++, Java) allow an
extra trailing comma in comma-delimited lists such as array initializers,
for consistency and to make programmatic code generation easier. So for
example [1,2,3,] is intentionally correct.
4. Creating one-element tuples using the tuple(['hello']) syntax is much
slower (a factor of 20x here) than writing just ('hello',). A trailing
comma after a single item is how Python's tuple syntax is defined, which
lets tuples be written with commas alone instead of needing other tokens.
If Python didn't allow a comma at the end of a tuple, you would have to use
the slow syntax.
5. The same rule applies to other types of lists where a delimiter can occur
at the end. For example, both strings "alfa\nbeta\n" and "alfa\nbeta" contain
two lines.
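The points above are easy to check in an interpreter. A short sketch in Python 3 (the variable names are only for illustration):

```python
# Missing comma: adjacent string literals are silently concatenated.
x = [
    "fee",
    "fie"
    "foo",
    "fum"
]
print(x)  # ['fee', 'fiefoo', 'fum']

# An extra trailing comma does not change the list.
print([1, 2, 3,] == [1, 2, 3])  # True

# One-element tuple: the comma makes the tuple, not the parentheses.
single = ("hello",)
not_a_tuple = ("hello")
print(type(single).__name__, type(not_a_tuple).__name__)  # tuple str
print(single == tuple(["hello"]))  # True

# Trailing delimiter: both strings split into the same two lines.
print("alfa\nbeta\n".splitlines() == "alfa\nbeta".splitlines())  # True
```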
Sources:
-- http://mail.python.org/pipermail/python-list/2003-October/231419.html
-- http://mail.python.org/pipermail/python-ideas/2008-March/001478.html
-- http://mail.python.org/pipermail/python-ideas/2008-March/001475.html