[Python-bugs-list] [ python-Bugs-625698 ] Errors with recursive objects
SourceForge.net
noreply@sourceforge.net
Mon, 20 Jan 2003 12:44:29 -0800
Bugs item #625698, was opened at 2002-10-19 15:27
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=625698&group_id=5470
Category: Python Interpreter Core
Group: None
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Erik Andersén (erik_andersen)
Assigned to: Jeremy Hylton (jhylton)
Summary: Errors with recursive objects
Initial Comment:
List and dictionaries put Python into an infinite loop if they contain
more than one reference to itself
>>> d = {}
>>> d[1] =
d
>>> d == d # OK Python can handle one
recursion
1
>>> d[2] = d
>>> d == d # Crash with
two
Lists are similar
>>> l=[]
>>>
l.append(l)
>>> l==l # OK
1
>>> l.append(l)
>>>
l==l # Crash
>>> l>l # Also crash
Tested with
ActivePython 2.2.1 under Windows 98
and (the first part)
Python 2.2.2 under Linux
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one)
Date: 2003-01-20 15:44
Message:
Logged In: YES
user_id=31435
Cool. He's right that they can take a horrendously long time.
The 2-element list example takes c*2**20 units of time, for
some suitable constant c, and if it were extended to 3
elements, it would take c*3**20 units of time, thousands of
times longer.
As Erik's latest example shows, the outcome isn't always
particularly well defined either. An alternative to speeding this
silliness is to raise an exception instead when recursive
objects are detected. There was some hack value in doing
the graph isomorphism bit, but no real practical value I can
see.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-20 15:20
Message:
Logged In: YES
user_id=33168
Sorry Tim, I was going to reply to one of your first msgs
and forgot. Erik sent me mail which said there wasn't a
crash, just that "the statements take horrendously long time
to execute."
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2003-01-20 15:09
Message:
Logged In: YES
user_id=31435
BTW, Erik, what did you mean by "crash"? Nobody else has
seen a crash here. You started the report by saying "infinite
loop", which isn't the same thing as a crash. There isn't an
infinite loop here either, although the algorithm can take
astronomical amounts of time to finish (so may appear to be
in an infinite loop).
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2003-01-20 14:50
Message:
Logged In: YES
user_id=31435
Equailty for self-consistent recursive objects does a graph
isomorphism computation.
In your self-inconsistent example, it so happens that False is
returned, although True would be returned if the compile-time
constant (in object.c) NESTING_LIMIT were odd instead of
even.
But if you worry about this, you need to get out more <wink>.
----------------------------------------------------------------------
Comment By: Erik Andersén (erik_andersen)
Date: 2003-01-20 14:24
Message:
Logged In: YES
user_id=364358
Excuse me for interupting the efficiency discussion, but what is the
definition of equality for recursive containers. I am particularly worrided
about cases like
>>> class A:
... def __eq__(self,other):
return not self.u==other.u
...
>>> a=A()
>>> l=[a]
>>>
a.u=l
>>> l==l # ????
If l==l then a!=a and l[0]!=l[0], so
l!=l.
If l!=l then a==a and l[0]==l[0], so l==l !!!!!!!
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2003-01-20 11:12
Message:
Logged In: YES
user_id=31435
I deleted my patch -- it didn't make sense.
Martin, AFAIK the docs say nothing about the relationship (if
any) between "is" and "==". It was deliberate intent that "is"
not imply "==" for rich comparisons, though, in part so that
IEEE-754 rules for NaN could be implemented by users.
>>> class NaN:
... def __eq__(self, other): return False
...
>>> n = NaN()
>>> n is n
True
>>> n == n
False
>>>
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2003-01-20 03:20
Message:
Logged In: YES
user_id=21627
Where does the documentation say that "is" may not imply ==?
The check for recursion assumes this implication, anyway: if
we meet the very same object, we infer that they are equal.
What is the purpose of clear_inprogress_dict in your patch?
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2003-01-19 23:28
Message:
Logged In: YES
user_id=31435
The patch assumes "is" implies "==", which isn't necessarily
so for rich comparisons.
The attached patch cuts the time about in half, via:
1. Getting rid of the per-thread inprogress dicts, in favor of
a shared inprogress dict.
2. Restoring the intent of the code that tuples be exempted
(tuples can't be recursive). Rev 2.65 of tupleobject.c gave
tuples a tp_as_mapping slot, breaking the intent of the
code.
3. Changing the inprogress dict to map a tuple to the
smallest value of compare_nesting at which the tuple was
seen. delete_token() was changed to leave the token alone
if the current compare_nesting value is greater than that.
This allows tuples to stay in the dict longer.
4. Cutting NESTING_LIMIT from 20 to 19. This has a huge
effect, because the algorithm is still basically exponential-
time.
The problem remaining is that the inprogress dict is still
useless (and empty) so long as compare_nesting is less
than NESTING_LIMIT. This can leave an exponential
amount of work to be done to fight back up to
NESTING_LIMIT.
Leaving stuff in inprogress longer, and consulting inprogress
regardless of the value of compare_nesting (provided that's
gone over NESTING_LIMIT at least once) slashes the time
to triviality. However, it isn't safe: we can't know that the
addresses in the dict keys refer to the same objects then.
We could if started storing into inprogress from the very
start, but that would be a big speed hit for usual-case
comparisons. I'd be happy to knock NESTING_LIMIT down
more, though.
Assigned to Jeremy, as IIRC this was his code originally.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2003-01-19 12:30
Message:
Logged In: YES
user_id=21627
The patch is fine, except that a comment needs to be added
explaining what this does and why it does that.
It should be understood that this can't fix the very similar
case
a = []
a.append(a)
a.append(a)
b = []
b.append(b)
b.append(b)
a==b
This would still take a long time, right?
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-19 11:05
Message:
Logged In: YES
user_id=33168
Attached is a patch which fixes the problem for me.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-07 19:59
Message:
Logged In: YES
user_id=33168
Erik, do you still have this problem with 2.2.2? I can't
reproduce your problem under Linux. mly can't reproduce the
problems with 2.2.1 on Windows (2000). Do you only have the
problem on Windows?
----------------------------------------------------------------------
Comment By: Magnus Lyckå (mly)
Date: 2002-10-26 19:12
Message:
Logged In: YES
user_id=95217
I tried the above under
Python 2.2.1 (#34, Sep 27 2002, 18:37:42) [MSC 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more
information.
(ActiveState on Win 2000)
and
Python 2.1.1 (#1, Aug 20 2001, 20:23:29)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.60mdk)] on
linux-i386
In both cases it works correctly. No crash.
It takes a lot of time though, Several seconds
on Duron 700 w/ Win2k, and tens of seconds
on K6-233 w/ Linux.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=625698&group_id=5470