Noam Raphael wrote:
> I don't think that every type that supports equality
> comparison should support order comparison. I think
> that if there's no meaningful comparison (whether
> equality or order), an exception should be raised.
Just to keep myself sane...
import datetime

def date_range(start=None, end=None):
    if start == None:
        start = datetime.date.today()
    if end == None:
        end = datetime.date.today()
    return end - start
Are you saying the "if" statements will raise TypeError if start or end are dates? That would be a sad day for Python. Perhaps you're saying that there is a "meaningful comparison" between None and anything else, but please clarify if so.
Robert Brewer
System Architect
Amor Ministries
fumanchu(a)amor.org
Is it finally time in Python 2.5 to allow the "obvious" use of, say,
str(5,2) to give '101', just the converse of the way int('101',2)
gives 5? I'm not sure why str has never allowed this obvious use --
any bright beginner assumes it's there and it's awkward to explain
why it's not!-). I'll be happy to propose a patch if the BDFL
blesses this, but I don't even think it's worth a PEP... it's an
inexplicable though long-standing omission (given the argumentative
nature of this crowd I know I'll get pushback, but I still hope the
BDFL can Pronounce about it anyway;-).
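In the meantime, a rough pure-Python stand-in is easy enough (to_base is a
hypothetical helper name, not an existing builtin):

def to_base(n, base):
    # behaves the way str(5, 2) is being proposed to: to_base(5, 2) == '101'
    digits = '0123456789abcdefghijklmnopqrstuvwxyz'
    if n == 0:
        return '0'
    sign = ''
    if n < 0:
        sign, n = '-', -n
    out = []
    while n:
        n, r = divmod(n, base)
        out.append(digits[r])
    return sign + ''.join(reversed(out))

assert to_base(5, 2) == '101'
assert int(to_base(5, 2), 2) == 5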
Alex
What happened to the CurrentVersion registry entry documented at
http://www.python.org/windows/python/registry.html
AFAICT, even the python15.wse file did not fill a value in this
entry (perhaps I'm misinterpreting the wse file, though).
So was this ever used? Why is it documented, and who documented it
(unfortunately, registry.html is not in cvs/subversion, either)?
Regards,
Martin
From comp.lang.python:
chrisperkins99(a)gmail.com wrote:
> It seems to me that str.count is awfully slow. Is there some reason
> for this?
> Evidence:
>
> ######## str.count time test ########
> import string
> import time
> import array
>
> s = string.printable * int(1e5) # 10**7 character string
> a = array.array('c', s)
> u = unicode(s)
> RIGHT_ANSWER = s.count('a')
>
> def main():
>     print 'str: ', time_call(s.count, 'a')
>     print 'array: ', time_call(a.count, 'a')
>     print 'unicode:', time_call(u.count, 'a')
>
> def time_call(f, *a):
>     start = time.clock()
>     assert RIGHT_ANSWER == f(*a)
>     return time.clock()-start
>
> if __name__ == '__main__':
>     main()
>
> ###### end ########
>
> On my machine, the output is:
>
> str: 0.29365715475
> array: 0.448095498171
> unicode: 0.0243757237303
>
> If a unicode object can count characters so fast, why should an str
> object be ten times slower? Just curious, really - it's still fast
> enough for me (so far).
>
> This is with Python 2.4.1 on WinXP.
>
>
> Chris Perkins
Your evidence points to some unoptimized code in the underlying C
implementation of Python. As such, this should probably go to the
python-dev list (http://mail.python.org/mailman/listinfo/python-dev).
The problem is that the C library function memcmp is slow, and
str.count calls it frequently. See lines 2165+ in stringobject.c
(inside function string_count):
r = 0;
while (i < m) {
    if (!memcmp(s+i, sub, n)) {
        r++;
        i += n;
    } else {
        i++;
    }
}
This could be optimized as:
r = 0;
while (i < m) {
    if (s[i] == *sub && !memcmp(s+i, sub, n)) {
        r++;
        i += n;
    } else {
        i++;
    }
}
This tactic typically avoids most (sometimes all) of the calls to
memcmp. Other string search functions, including unicode.count,
unicode.index, and str.index, use this tactic, which is why you see
unicode.count performing better than str.count.
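To make the tactic concrete, here is a pure-Python analogue of the two
inner loops (illustrative only -- the real work happens in C, and these
helpers assume a non-empty needle):

def count_no_precheck(s, sub):
    # full comparison at every position (like calling memcmp each time)
    n, r, i = len(sub), 0, 0
    m = len(s) - n + 1
    while i < m:
        if s[i:i+n] == sub:
            r += 1
            i += n
        else:
            i += 1
    return r

def count_precheck(s, sub):
    # cheap first-character test before the full comparison
    n, r, i = len(sub), 0, 0
    m = len(s) - n + 1
    first = sub[0]
    while i < m:
        if s[i] == first and s[i:i+n] == sub:
            r += 1
            i += n
        else:
            i += 1
    return r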
That C version might be optimized further for cases such as yours, where a
single character appears many times in the string:
r = 0;
if (n == 1) {
    /* optimize for a single character */
    while (i < m) {
        if (s[i] == *sub)
            r++;
        i++;
    }
} else {
    while (i < m) {
        if (s[i] == *sub && !memcmp(s+i, sub, n)) {
            r++;
            i += n;
        } else {
            i++;
        }
    }
}
Note that there might be some subtle reason, which I'm unaware of, why
neither of these optimizations is done... in which case a comment
in the C source would help. :-)
--Ben
just noticed an embarrassing misspelling in one of my recent checkins, only
to find that I cannot fix it:
$ svn propedit --revprop -r 41759 svn:log
svn: Repository has not been enabled to accept revision propchanges;
ask the administrator to create a pre-revprop-change hook
$
would it be a good idea to ask the administrator to do this ?
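For reference, the hook can be as small as a script that allows svn:log
edits and rejects everything else (a sketch, assuming the standard hook
arguments REPOS REV USER PROPNAME):

#!/usr/bin/env python
import sys

if sys.argv[4] == 'svn:log':
    sys.exit(0)    # allow log message edits
sys.exit(1)        # refuse all other revprop changes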
</F>
A minor related point about on_missing():
Haven't we learned from regrets over the .next() method of iterators
that all "magically" invoked methods should be named using the __xxx__
pattern? Shouldn't it be named __on_missing__() instead?
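Purely as illustration, the dunder spelling on a user-defined subclass
might look like this (the hook name and the wiring here are hypothetical,
not the proposed dict implementation):

class autodict(dict):
    def __on_missing__(self, key):
        value = self[key] = []
        return value
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__on_missing__(key)

d = autodict()
d['x'].append(1)    # no KeyError; __on_missing__ supplies the default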
-- Michael Chermside
Bill Janssen wrote:
> bytes -> base64 -> text
> text -> de-base64 -> bytes
It's nice to hear I'm not out of step with
the entire world on this. :-)
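In today's 2.x terms (with unicode standing in for the text type), the
round trip being agreed on looks roughly like this:

import base64

data = 'arbitrary binary\x00\xff'       # stands in for bytes
text = unicode(base64.b64encode(data))  # bytes -> base64 -> text
back = base64.b64decode(str(text))      # text -> de-base64 -> bytes
assert back == data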
--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury, | Carpe post meridiam! |
Christchurch, New Zealand | (I'm not a morning person.) |
greg.ewing(a)canterbury.ac.nz +--------------------------------------+
Instead of byte literals, how about a classmethod bytes.from_hex(), which
works like this:
# two equivalent things
expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227,
                           131, 79, 229, 201, 46, 106])
It's just a nicety; the former fits my brain a little better. This would
work fine both in 2.5 and in 3.0.
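For what it's worth, a rough sketch of how such a classmethod could behave,
leaning on binascii (bytes_ is a stand-in name, since the bytes type
doesn't exist yet):

import binascii

class bytes_(list):
    @classmethod
    def from_hex(cls, hexstr):
        return cls([ord(c) for c in binascii.unhexlify(hexstr)])

assert (bytes_.from_hex('5c535024cac5199153e3834fe5c92e6a') ==
        [92, 83, 80, 36, 202, 197, 25, 145, 83, 227,
         131, 79, 229, 201, 46, 106])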
I thought about unicode.encode('hex'), but obviously it will continue to
return a str in 2.x, not bytes. Also the pseudo-encodings ('hex', 'rot13',
'zip', 'uu', etc.) generally scare me. And now that bytes and text are
going to be two very different types, they're even weirder than before.
Consider:
text.encode('utf-8') ==> bytes
text.encode('rot13') ==> text
bytes.encode('zip') ==> bytes
bytes.encode('uu') ==> text (?)
This state of affairs seems kind of crazy to me.
Actually users trying to figure out Unicode would probably be better served
if bytes.encode() and text.decode() did not exist.
-j
I ran Fredrik's listmodules script in my current sandbox and got a
deprecation warning for the regex module. According to PEP 4 it is already
obsolete. I saw nothing there about the timeframe for actual removal. Will
it ever go away?
Skip
I am considering developing a PEP for enabling a mechanism to assign to free
variables in a closure (nested function). My rationale is that with the
advent of PEP 227 <http://www.python.org/peps/pep-0227.html>, Python has
proper nested lexical scopes, but can have undesirable behavior (especially
with new developers) when a user makes wants to make an assignment to a free
variable within a nested function. Furthermore, after seeing numerous
kludges to "solve" the problem with a mutable object, like a list, as the
free variable do not seem "Pythonic." I have also seen mention that the use
of classes can mitigate this, but that seems, IMHO, heavy handed in cases
when an elegant solution using a closure would suffice and be more
appropriate--especially when Python already has nested lexical scopes.
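To make the problem and the usual kludge concrete (a minimal sketch):

# the behavior that trips up new developers:
def incgen_broken(inc=1):
    a = 6
    def incrementer():
        a += inc    # rebinding makes 'a' local -> UnboundLocalError at runtime
        return a
    return incrementer

# the common mutable-container workaround criticized above:
def incgen_kludge(inc=1):
    a = [6]         # a one-element list stands in for the variable
    def incrementer():
        a[0] += inc # mutation, not rebinding, so the closure works
        return a[0]
    return incrementer

inc = incgen_kludge()
print inc(), inc()  # prints: 7 8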
I propose two possible approaches to solve this issue:
1. Adding a keyword such as "use" that would follow semantics similar to
"global" today. A nested scope could declare names with this keyword to
enable assignment to such names to change the closest parent's binding. The
semantic would be to keep the behavior we experience today but tell the
compiler/interpreter that a name declared with the "use" keyword would
explicitly use an enclosing scope. I personally like this approach the most
since it would seem to be in keeping with the current way the language works
and would probably be the most backwards compatible. The semantics for how
this interacts with the global scope would also need to be defined (should
"use" be equivalent to "global" when the name exists in no enclosing scope,
etc.)
def incgen( inc = 1 ) :
    a = 6
    def incrementer() :
        use a
        # use a, inc  <-- list of names okay too
        a += inc
        return a
    return incrementer
Of course, this approach suffers from a downside that every nested scope
that wanted to assign to a parent scope's name would need to have the "use"
keyword for those names--but one could argue that this is in keeping with
one of Python's philosophies that "Explicit is better than implicit"
(PEP 20<http://www.python.org/peps/pep-0020.html>).
This approach also has to deal with a user declaring a name with "use" that
is a named parameter--this would be a semantic error that could be handled
like "global" does today with a SyntaxError.
2. Adding a keyword such as "scope" that would behave similarly to
JavaScript's "var" keyword. A name could be declared with such a keyword
optionally and all nested scopes would use the declaring scope's binding
when accessing or assigning to a particular name. This approach has similar
benefits to my first approach, but is clearly more top-down than the first
approach. Subsequent "scope" declarations would create a new binding at the
declaring scope for the declaring and child scopes to use. This could
potentially be a gotcha for users expecting the binding semantics in place
today. Also the scope keyword would have to be allowed to be used on
parameters to allow such parameter names to be used in a similar fashion in
a child scope.
def incgen( inc = 1 ) :
    # scope inc  <-- allow scope declaration for bound parameters
    #               (not a big fan of this)
    scope a = 6
    def incrementer() :
        a += inc
        return a
    return incrementer
This approach would be similar to languages like JavaScript that allow for
explicit scope binding with the use of "var" or more static languages that
allow re-declaring names at lower scopes. I am less in favor of this,
because I don't think it feels very "Pythonic".
As a point of reference, some languages such as Ruby will only bind a new
name to a scope on assignment when an enclosing scope does not have the name
bound. I do believe the Python name binding semantics have issues (for
which the "global" keyword was born), but I feel that the "fixing" the
Python semantic to a more "Ruby-like" one adds as many problems as it solves
since the "Ruby-like" one is just as implicit in nature. Not to mention the
backwards compatibility impact is probably much larger.
I would like the community's opinion: is there enough interest out there to
make this a worthwhile endeavour, or is there already an initiative that I
missed? Please let me know your questions and comments.
Best Regards,
Almann
--
Almann T. Goo
almann.goo(a)gmail.com