[ python-Bugs-1080424 ] Inplace set merge produces wrong results

SourceForge.net noreply at sourceforge.net
Tue Dec 7 12:35:05 CET 2004


Bugs item #1080424, was opened at 2004-12-07 00:36
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1080424&group_id=5470

Category: Python Interpreter Core
Group: Not a Bug
Status: Closed
Resolution: Invalid
Priority: 6
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: Inplace set merge produces wrong results

Initial Comment:
[forwarded from http://bugs.debian.org/284490]

the inplace set merge can produce wrong results
compared to the a = a | b non in-place (and slower)
version. 
 
Please see the attached tarball: running the "test"
script shows the problem. 


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-07 06:35

Message:
Logged In: YES 
user_id=80475

On line 39, replace
    self.items[t] = add_elements
with
    self.items[t] = add_elements.copy()

That will fix all three.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-12-07 02:46

Message:
Logged In: YES 
user_id=31435

I can pretty much guarantee this isn't a bug in Python, 
but is in some aspect of your algorithm that *relies* on 
not sharing mutable sets.

For example, if I leave Debtags1.py's

	self.items[t] |= add_elements

alone but add this right after it:

	self.items[t] = self.items[t].copy()

then Debtags1.py produces the same output as 
Debtags.py.  Same thing with Debtags2.py:  adding that 
line also makes Debtag2.py's output the same.

That proves the problem isn't in the implementations 
of "|=" or .update().  It strongly suggests that you're 
mutating a shared set that you're not expecting to 
mutate, or aren't expecting is shared.

For example, your driver does

s = db.elset(sys.argv[1])
for t in sys.argv[2:]:
	s &= db.elset(t)

and that mutates the set in self.items[sys.argv[1]].  If 
you don't intend that computing output will mutate the 
sets in db, then that code is confused.  That's not the 
source of your differing output, but "something like it" 
probably is.

In fact, the problem is probably here:

self.items[t] = add_elements

That can assign the same add_elements as the value 
associated with *many* distinct values of t.  Then you 
try to update those later in place, but "those" is a single 
shared set.  Changing the value associated with one of 
the t's then changes the value associated with all of the 
t's that originally got assigned the same "add_elements" 
set.

If I go back to the original Debtags1.py, and replace

self.items[t] = add_elements

with

self.items[t] = add_elements.copy()

then the later updates-in-place do no harm, and it 
produces the output you said you expected.

If you don't understand this, here's a dead simple 
example:

>>> x = set([1])
>>> y = x  # y and x are the *same* set now
>>> x |= set([2])  # so mutating x in place ...
>>> x
set([1, 2])
>>> y   # ... also mutates the set bound to y
set([1, 2])
>>>

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1080424&group_id=5470


More information about the Python-bugs-list mailing list