Can I rely on...
tjreedy at udel.edu
Thu Mar 19 17:45:50 CET 2009
Emanuele D'Arrigo wrote:
> Hi everybody,
> I just had a bit of a shiver for something I'm doing often in my code
> but that might be based on a wrong assumption on my part. Take the
> following code:
> pattern = "aPattern"
> compiledPatterns = [ ]
> if(re.compile(pattern) in compiledPatterns):
Note that this membership test generally takes time proportional to the
length of the list. And as MRAB said, drop the parens.
>     print("The compiled pattern is stored.")
> As you can see I'm effectively assuming that every time re.compile()
> is called with the same input pattern it will return the exact same
> object rather than a second, identical, object. In interactive tests
> via python shell this seems to be the case but... can I rely on it -
> always- being the case? Or is it one of those implementation-specific
As MRAB indicated, this only works because the CPython re module itself
keeps a cache of compiled patterns, so you do not have to make one. That
cache is, however, limited to 100 or so entries, since programs that use
patterns repeatedly generally use a limited number of patterns. Caches
usually use a dict so that cache[input] == output and lookup is O(1).
> And what about any other function or class/method? Is there a way to
> discriminate between methods and functions that when invoked twice
> with the same arguments will return the same object and those that in
> the same circumstances will return two identical objects?
In general, a function that calculates and returns an object will return
a new object. The exceptions are exceptions.
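
A small illustration of the general rule, using a made-up function
make_list that builds its result on every call:

```python
def make_list():
    """Build and return a fresh list on every call."""
    return [1, 2, 3]

a = make_list()
b = make_list()

print(a == b)  # equal contents
print(a is b)  # but two distinct objects

# Mutating one does not affect the other, which confirms they are separate.
a.append(4)
print(b)
```

Here the two results compare equal with '==' but are not the same
object, so an 'is' test between them fails.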
> If the answer is no, am I right to state that in the case portrayed
> above the only way to be safe is to use the following code instead?
> for item in compiledPatterns:
>     if(item.pattern == pattern):
Yes. Unless you are comparing against None (or True or False in Py3) or
specifically know otherwise, you probably want '==' rather than 'is'.
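
As a sketch, the same check can be written with any() and a comparison
on the .pattern attribute of each compiled object. The helper name
is_stored and the sample patterns are only for illustration.

```python
import re

def is_stored(pattern, compiled_patterns):
    """Return True if some compiled pattern was built from this string."""
    return any(item.pattern == pattern for item in compiled_patterns)

compiled_patterns = [re.compile("aPattern"), re.compile("anotherPattern")]

if is_stored("aPattern", compiled_patterns):
    print("The compiled pattern is stored.")
```

This compares the source strings with '==', so it does not matter
whether re.compile() hands back the same object twice. It is still a
linear scan of the list.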
Terry Jan Reedy