Recently-ish on c.l.py3k (iirc) folks were discussing how to write a
script that exited with a human-friendly warning message if run under an
incompatible version of the language. The problem with this code:
if sys.version_info[0] < 3: sys.exit("Sorry, this script needs Python 3000")
is that the code only executes once tokenization is finished--if your
script uses any incompatible syntax, it will fail in the tokenizer, most
likely with an error message that doesn't make it particularly clear
what is going on.
After thinking about the problem for a while, it hit me--this is best
expressed as a "pragma". For Python's purposes, I would define a
"pragma" as an instruction to the tokenizer / compiler, executed
immediately upon its complete tokenization. The use case here is
pragma version >= 3  # Python version must be 3.0 or later
Again, this would be executed immediately, aborting before the tokenizer
has a chance to see some old syntax it didn't like.
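For comparison, the workaround available today is to keep the check in a
bootstrap file whose own syntax is valid under every interpreter you might be
run on, and only import the real code after the check passes. A sketch (the
function and module names here are made up for illustration):

```python
import sys

def check_version(version_info=sys.version_info):
    # Return an error message, or None if the interpreter is new enough.
    if version_info[0] < 3:
        return "Sorry, this script needs Python 3000"
    return None

# The real module would be imported only after the check passes, so an
# old tokenizer never sees the incompatible syntax:
# import real_script  (hypothetical module name)
print(check_version((2, 7)))
print(check_version((3, 0)))
```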
What else might we use "pragma" for? Well, consider that Python already
has two specialized syntaxes that are really pragmas: "from __future__
import" and "# -*- coding: ". I think this functionality would be more
clearly expressed with a "pragma" syntax, for example:
pragma encoding latin-1
pragma enable floatdivision
It's a matter of taste, but I've never liked it when languages hide
important directives in comments--isn't the compiler supposed to
*ignore* comments?--nor do I like how "from __future__ import" doesn't
really have anything to do with importing modules. Your tastes may vary.
There was some discussion back in 2000 about adding a "pragma" statement
to the language, and it sounds like GvR wasn't wholly against the idea.
But nothing seems to have come of it. The discussion died out in early
September of 2000, and I didn't find any subsequent revivals.
There was some worry back then that pragmas would be a slippery slope,
resulting in increasingly elaborate pragma syntaxes until
we--shudder!--wake up one day and have preprocessor macros. I agree
that we don't want to go too far down this slippery slope. I have some
specific suggestions on how we could obviate the temptation, but they
are predicated on having pragmas at all, so I might as well keep quiet
until such point as pragmas get traction.
If you'd like to discuss this in person--or just give me a good hard
slap for even suggesting it--I'm bouncing around PyCon until Wednesday
afternoon. I'm the guy with the Facebook logos plastered around his person.
Tiago A.O.A. wrote:
> I would suggest something like a ~= b, for "approximately equal to". How
> approximately? Well, there would be a default that could be changed.
> Don't know if it's all that useful, though.
Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__,
__nsim__, __ltsim__, and __gtsim__ slots.
I'm not at all sure how serious I am right now. It's late, and I have
fuzzy recollections of how those kinds of things might have been nice in
some past numerical code.
And then =~ and !~ could be defined for strings and do regular
expression matching! Woo! More operators! With pronouns!
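For what it's worth, the behavior a ~= operator (and a __sim__ slot) would
need can be sketched today with a wrapper class. The class name and the
relative-tolerance rule below are made up for illustration; note that
defining __eq__ this way breaks hashing, which is exactly the dict-key
problem raised elsewhere in this thread:

```python
class Approx:
    """Wraps a float; == means 'approximately equal'.  A sketch only."""
    def __init__(self, value, rel_tol=1e-9):
        self.value = value
        self.rel_tol = rel_tol

    def __eq__(self, other):
        a, b = self.value, float(other)
        # Relative comparison: the difference must be tiny compared
        # with the larger of the two magnitudes.
        return abs(a - b) <= self.rel_tol * max(abs(a), abs(b))

print(Approx(1.0) == 1.0 + 1e-12)   # True
print(Approx(1.0) == 1.1)           # False
```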
This might be a minor thing, but I kind of wish that I could write this:
sys.stderr.print('another line here')
instead of this:
print('first line', file=sys.stderr)
print('another line here', file=sys.stderr)
print('and again', file=sys.stderr)
It's a lot easier for me to read. Of course you can always add
spaces to make the lines line up, but with a long print statement your
eye has to travel a long distance to figure out what file, if any, you're
printing to. It could be pretty simple to add, as a method on file objects:
def print(self, *args, **kwargs):
    builtins.print(*args, file=self, **kwargs)
I haven't been able to find any discussion on this -- has this already
been proposed?
For systems programming I often use floats as timestamps in dictionaries,
and in this case I never do calculations and all I care about is "same"
or "different", meaning "any single bit difference". If you change the
way == works and also follow through and change the way floats in
dictionaries work, you would probably break very many applications like
this. I think any "almost equal" should be implemented using a new
method or syntax x~=y rather than by changing the meaning of ==.
- Aaron Watters
As per Aahz's suggestion, I'm moving this discussion here, from Python-Dev.
Mark Dickinson wrote:
> On Thu, Mar 13, 2008 at 4:20 AM, Imri Goldberg <lorgandon(a)gmail.com> wrote:
> My suggestion is to do either of the following:
> 1. Change floating point == to behave like a valid floating point
> comparison. That means using precision and some error measure.
> 2. Change floating point == to raise an exception, with an error
> suggesting using precision comparison, or the decimal module.
> I don't much like either of these; I think option 1 would cause
> a lot of confusion and difficulty---it changes a conceptually
> simple operation into something more complicated.
> As for option 2., I'd agree that there are situations where having
> a warning (not an exception) for floating-point equality (and
> inequality) tests might be helpful; but that warning should be
> off by default, or at least easily turned off.
As I said earlier, I'd like static checkers (like Python-Lint) to catch
these sorts of cases, whatever the decision may be.
> Some Fortran compilers have such a (compile-time) warning,
> I believe. But Fortran's users are much more likely to be
> writing the sort of code that cares about this.
> Since this change is not backwards compatible, I suggest it be added
> only to Python 3.
> It's already too late for Python 3.0.
Still, I believe it is worth discussing.
> 3. Programmers will still need the regular ==:
> Maybe, and even then, only for very rare cases. For these, a special
> function/method might be used, which could be named floating_exact_eq.
> I disagree with the 'very rare' here. I've seen, and written, code like:
> if a == 0.0:
> # deal with exceptional case
> b = c/a
> or similarly, a test (a==b) before doing a division by a-b. That
> one's kind of dodgy, by the way: a != b doesn't always guarantee
> that a-b is nonzero, though you're okay if you're on an IEEE 754
> platform and a and b are both finite numbers.
While checking against a==0.0 (and other similar conditions) before
dividing will indeed protect from outright division by zero, it will
enlarge any error you have in the computation. I guess it would be
better to check instead that 'a is small', for an appropriate value of
'small'.
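A hedged sketch of that distinction (the function name and the eps value
are placeholders; a real tolerance has to come from the problem's own
error analysis):

```python
def safe_divide(c, a, eps=1e-12):
    # Guard against *small* a, not just exactly-zero a: when a is tiny,
    # c/a amplifies whatever rounding error a already carries, so we
    # treat it as the exceptional case rather than return a huge,
    # mostly-noise result.
    if abs(a) < eps:
        raise ZeroDivisionError("denominator too small: %r" % a)
    return c / a

print(safe_divide(1.0, 0.5))   # 2.0
```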
> Or what if you wanted to generate random numbers in the open interval
> (0.0, 1.0). random.random gives you numbers in [0.0, 1.0), so a
> careful programmer might well write:
> while True:
>     x = random.random()
>     if x != 0.0:
>         break
> (A less fussy programmer might just say that the chance
> of getting 0.0 is about 1 in 2**53, so it's never going to happen...)
> Other thoughts:
> - what should x == x do?
If suggestion no. 1 is accepted, always return True. If no. 2 is
accepted, raise an exception.
Checking x==x is as meaningful as checking x==y.
> - what should
>   1.0 in set([0.0, 1.0, 2.0])
>   and
>   3.0 in set([0.0, 1.0, 2.0])
>   do?
Actually, one of the reasons I thought about this subject in the first
place, was dict lookup for floating point numbers. It seems to me that
it's something you just shouldn't do.
As for your examples, I believe these two should both raise an
exception. This is even worse than normal comparison - here you are
checking against the hash of a floating point number. So if you do that
in the current implementation, there's a good chance you'll get
unexpected results. If you do that given the implementation of
suggestion 1, you'll have a hard time making sets work.
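The dict/set-lookup hazard with today's exact == is easy to reproduce:

```python
table = {0.3: "found"}
key = 0.1 + 0.2                 # actually 0.30000000000000004
print(key == 0.3)               # False
print(key in table)             # False: dict lookup uses hash and ==
print(abs(key - 0.3) < 1e-9)    # True: "approximately" the same key
```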
Insert Signature Here
Hi. Some months ago I complained on the python-list
that python gc did too much work for apps that allocate
and deallocate lots of structures. In fact one of my apps
was spending about 1/3 of its time garbage collecting
and not finding anything to collect (before i disabled gc).
My proposal was that python
should have some sort of a smarter strategy for garbage
collection, perhaps involving watching the global
high water mark for memory allocation or other tricks.
The appropriate response was:
"great idea! patch please!" :)
Unfortunately dealing with cross platform
memory management internals is beyond my
C-level expertise, and I'm not having a lot of
luck finding good information sources. Does anyone
have any clues on this or other ideas for improving
the gc heuristic? For example, how do you find
out the allocated heap size(s) in a cross-platform way?
This link provides some clues, but I don't really
understand that code well enough to hope to improve it.
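As a stopgap that needs no C-level work at all, the collector's thresholds
can be tuned (or cyclic collection disabled outright) from Python; the
numbers below are arbitrary:

```python
import gc

# Raise the generation-0 threshold so cycle collection runs far less
# often in code that allocates heavily but creates no garbage cycles.
print(gc.get_threshold())          # the current (gen0, gen1, gen2) tuple
gc.set_threshold(100000, 10, 10)
print(gc.get_threshold())

# Or, if the program provably creates no reference cycles at all:
# gc.disable()
```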
-- Aaron Watters
Alexandre Vassalotti wrote:
> On Sun, Mar 9, 2008 at 7:21 PM, Forrest Voight <voights(a)gmail.com> wrote:
>> This would simplify the handling of list slices.
>> Slice objects that are produced in a list index area would be different,
>> and optionally the syntax for slices in list indexes would be expanded
>> to work everywhere. Instead of being containers for the start, end,
>> and step numbers, they would be generators, similar to xranges.
> I am not sure what you are trying to propose here. The slice object
> isn't special, it's just a regular built-in type.
> >>> slice(1,4)
> slice(1, 4, None)
> >>> [1,2,3,4,5,6][slice(1,4)]
> [2, 3, 4]
> I don't see how introducing new syntax would simplify indexing.
Likewise. It would simplify looping, though:
>>> for i in 1:5:
... print i
Since this kind of loop happens frequently, it makes sense to shorten
it. Slice objects (and syntax) seem ready-made for that - it wouldn't be
*new* syntax, just repurposed syntax.
Though Forrest didn't bring this up directly, I've often thought that
Python's having both xrange and slice (and in 3000, range and slice) is
mostly vestigial. Their information content is identical and their
purposes are highly analogous. Unifying them would reduce the number of
new concepts for a beginner by one, and these are frequently-used
concepts at that.
Negative indexes could throw the idea for a loop, though. (Pun! Ha ha!)
And this makes the colons look like some kind of enclosure:
>>> for i in :5:
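The "identical information content" claim is easy to check in today's
Python 3 (3.3 and later, where range objects grew start/stop/step
attributes):

```python
s = slice(1, 10, 2)
r = range(1, 10, 2)

# Both types expose the same three fields with the same values.
print((s.start, s.stop, s.step) == (r.start, r.stop, r.step))  # True

# So a slice can already drive a loop -- just not with the ":" syntax:
for i in range(s.start, s.stop, s.step):
    print(i)   # 1 3 5 7 9
```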
I just checked the python site documentation on marshal and pickle and I
consider them to be irresponsibly and dangerously misleading.
For example. Suppose Mercurial is implemented using pickle.load (I sure
hope it isn't -- is it?).
1) I send someone a "patch" for their software claiming it makes their
package run faster.
2) That person uses mercurial to "unpack" the patch, and mercurial uses
pickle.load to read it.
BAM! That person's filesystem is GONE! AND I'M NOT ASSUMING
THAT THERE IS ANY BUG IN MERCURIAL!
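The scenario is not hypothetical: the pickle protocol lets any object in
the stream name an arbitrary callable to invoke at load time. Here
os.getcwd stands in for something destructive like os.system:

```python
import os
import pickle

class Evil:
    # __reduce__ tells pickle "to rebuild this object, call this
    # callable with these arguments".  The callable can be anything
    # importable.
    def __reduce__(self):
        return (os.getcwd, ())   # harmless stand-in for a destructive call

payload = pickle.dumps(Evil())
result = pickle.loads(payload)   # runs os.getcwd(); no Evil() is built
print(result)
```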
Now: suppose Mercurial is implemented using marshal: no such scenario is
possible, unless there is a security bug in mercurial where they explicitly
execute the untrusted data.
RESOLVED: pickle should come with a large red label:
WARNING: LARK'S VOMIT --
NEVER USE PICKLE TO IMPLEMENT UNTRUSTED ARCHIVING OF ANY KIND.
It doesn't have one.
Marshal needs no such label: but it has one:
*Warning:* The marshal module is not intended to be secure against erroneous
or maliciously constructed data. Never unmarshal data received from an
untrusted or unauthenticated source.
This is bullshit.
Sorry for the French and the caps, but this is REALLY IMPORTANT.
-- Aaron Watters