question about xrange performance
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Fri Apr 17 23:09:18 EDT 2009
On Fri, 17 Apr 2009 13:58:54 -0700, ~flow wrote:
>> One might wonder why you are even writing code to test for existence
>> "in" a range list, when "blee <= blah < bloo" is obviously going to
>> outperform this kind of code.
>> -- Paul
>
> the reason is simply the idiomacy and convenience. i use (x)xranges to
> implement unicode blocks and similar things. it is natural to write `if
> cid in latin_1` and so on.
[soapbox]
Speaking about idiomacy, it is grammatically incorrect to start sentences
in English with lower-case letters, and it is rude because it harms the
reading ability of people reading your posts. If it saves you 0.01ms of
typing time to avoid pressing the shift key, and 100 people reading your
post take 0.01ms more mental processing time to comprehend your writing
because of the lack of a clear sentence break, then the harm you do to
others is 100 times greater than the saving you make for yourself. You're
not e.e. cummings, who was a dick anyway, and as a programmer you're
supposed to understand about precision in language, syntax and grammar.
[end soapbox]
I think testing y in xrange() is a natural thing to do, but as I recall,
it was actually removed from xrange a few years ago to simplify the code.
I thought that was a silly decision, because the code was written and
working and it's not like the xrange object was likely to change, but
what do I know?
> i always assumed it would be about the
> fastest and memory-efficient to use xrange for this.
If you don't need to iterate over the actual codepoints, the most
memory-efficient approach would be to just store the start and end
positions, as a tuple or possibly even a slice object, and then test
t[0] <= codepoint < t[1] (or s.start <= codepoint < s.stop for a slice).
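
A minimal sketch of that tuple approach follows. The block boundaries and the helper name in_block are illustrative assumptions, not anything from the original poster's code:

```python
# Hypothetical block: Latin-1 Supplement as a half-open (start, end) pair.
LATIN_1_SUPPLEMENT = (0x80, 0x100)

def in_block(codepoint, block):
    """Return True if codepoint lies in the half-open range [start, end)."""
    start, end = block
    return start <= codepoint < end

print(in_block(0xE9, LATIN_1_SUPPLEMENT))      # U+00E9 falls inside the block
print(in_block(ord("A"), LATIN_1_SUPPLEMENT))  # U+0041 falls below it
```

The test costs two integer comparisons regardless of how large the block is, and the tuple stores nothing beyond the two endpoints.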
If you do need to iterate over them, perhaps some variant of this would
suit your needs:
# Untested
class UnicodeBlock(object):
    def __init__(self, start, end):
        self.start = start
        self.end = end
        self._current = start

    def __contains__(self, value):
        # Only integers can be codepoints; anything else is not in the block.
        if isinstance(value, (int, long)):
            return self.start <= value < self.end
        return False

    def __iter__(self):
        return self

    def next(self):
        # Return the current codepoint first, then advance past it.
        if self._current < self.end:
            value = self._current
            self._current += 1
            return value
        raise StopIteration

    def reset(self):
        self._current = self.start
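
The class above keeps its iteration position on the instance, so it can only be walked once between reset() calls. One possible variant, sketched here as an assumption rather than as part of the original suggestion, makes __iter__ a generator so that every loop starts afresh. The name CodepointBlock is hypothetical, and numbers.Integral is used so the same code accepts int (and long on Python 2):

```python
import numbers

class CodepointBlock(object):
    """Hypothetical name: a half-open block of codepoints [start, end)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __contains__(self, value):
        # Accept any integer type; reject floats, strings and everything else.
        if isinstance(value, numbers.Integral):
            return self.start <= value < self.end
        return False

    def __iter__(self):
        # A generator gives a fresh iterator on each call, so no reset() is needed.
        current = self.start
        while current < self.end:
            yield current
            current += 1

basic_latin = CodepointBlock(0x00, 0x80)
print(0x41 in basic_latin)
print(len(list(basic_latin)))
```

Membership stays a constant-time comparison, and nested or repeated loops over the same block do not interfere with each other.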
[...]
> the `( x == int( x ) )` is not easily being done away with if emulation
> of present x/range behavior is desired (x.0 floats are in, all others
> are out).
x.0 floats working with xrange is an accident, not a deliberate design
decision, and has been deprecated in Python 2.6, which means it will
probably be gone in a few years:
>>> r = xrange(2.0, 5.0)
__main__:1: DeprecationWarning: integer argument expected, got float
--
Steven