[Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)

Sat, 09 Dec 2000 18:44:14 +0200

Guido van Rossum wrote:
> 
> After the last round of discussion, I was left with the idea that the
> best thing we could do to help destructive iteration is to introduce a
> {}.popitem() that returns an arbitrary (key, value) pair and deletes
> it.  I wrote about this:
> 
> > > One more concern: if you repeatedly remove the *first* item, the hash
> > > table will start looking lobsided.  Since we don't resize the hash
> > > table on deletes, maybe picking an item at random (but not using an
> > > expensive random generator!) would be better.
> 
> and Tim replied:
> 
> > Which is the reason SETL doesn't specify *which* set item is removed:  if
> > you always start looking at "the front" of a dict that's being consumed, the
> > dict fills with turds without shrinking, you skip over them again and again,
> > and consuming the entire dict is still quadratic time.
> >
> > Unfortunately, while using a random start point is almost always quicker
> > than that, the expected time for consuming the whole dict remains quadratic.
> >
> > The clearest way around that is to save a per-dict search finger, recording
> > where the last search left off.  Start from its current value.  Failure if
> > it wraps around.  This is linear time in non-pathological cases (a
> > pathological case is one in which it isn't linear time <wink>).
> 
> I've implemented this, except I use a static variable for the finger
> intead of a per-dict finger.  I'm concerned about adding 4-8 extra
> bytes to each dict object for a feature that most dictionaries never
> need.  So, instead, I use a single shared finger.  This works just as
> well as long as this is used for a single dictionary.  For multiple
> dictionaries (either used by the same thread or in different threads),
> it'll work almost as well, although it's possible to make up a
> pathological example that would work qadratically.
> 
> An easy example of such a pathological example is to call popitem()
> for two identical dictionaries in lock step.
> 
> Comments please!  We could:
> 
> - Live with the pathological cases.
> 
> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.
> 
> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

That algorithm is really a gem which you should know,
so let me try to explain it.

Intro: A little story about finite field theory (very basic).
-------------------------------------------------------------

For every prime p and every power p^n, there
exists a Galois Field ( GF(p^n) ), which is
a finite field.
The additive group is called "elementary Abelian",
it is commutative, and it looks a little like a
vector space, since addition works in cycles modulo p
for every p cell.
The multiplicative group is cyclic, and it never
touches 0. Cyclic groups are generated by a single
primitive element. The powers of that element make up all the
other elements. For all elements of the multiplication
group GF(p^n)* the equality    x^(p^n -1) == 1 . A generator
element is therefore a primitive (p^n-1)th root of unity.

>From another point of view, the elements of GF(p^n)
can be seen as coefficients of polynomials over
GF(p). It can be easily shown that every generator
of the multiplicative group is an irreducible
polynomial of degree n with coefficients in GF(p).

An irreducible polynomial over a field has the property
not to vanish for any value of the field. It has no
zero in the field. By adjoining such a zero to the
field, we turn it into an extension field: GF(p^n).

Now on the dictionary case.
---------------------------

The idea is to conceive every non-zero bit pattern
as coefficients of a polynomial over GF(2)[x].
We need to find an irreducible polynomial of degee
n over the prime field GF(2). There exists a primitive
n'th root µ of unity in GF(2^n) which generates every
non-zero bit pattern of length n, being coefficients
of a polynomial over µ.
That means, every given bit pattern can be seen as
some power of µ. µ enumerates the whole multiplicative
group, and the given pattern is just one position in
that enumeration. We can go to the next position
in this cycle simply by multiplying the pattern by µ.
But since we are dealing with polynomials over µ,
this multiplication isn't much more that adding one
to very exponent in the polynomial, hence a left
shift of our pattern.
Adjusting the overflow of this pattern involves
a single addition, which is just an XOR in GF(2^n).

Example: p=2  n=3  G = GF(8) = GF(2^3)
----------------------------------------
"""
Since 8 = 2^3, the prime field is GF(2) and we need to
find a monic irreducible cubic polynomial over that
field. Since the coefficients can only be 0 and 1, the
list of irreducible candidates is easily obtained. 

     x^3 + 1 
     x^3 + x + 1 
     x^3 + x^2 + 1 
     x^3 + x^2 + x + 1

Now substituting 0 gives 1 in all cases, and substituting
1 will give 0 only if there are an odd number of x terms,
so the irreducible cubics are just x^3 + x + 1 and x^3 + x^2 + 1.
Now the multiplicative group of this field is a cyclic
group of order 7 and so every nonidentity element is a
generator. Letting µ be a root of the first polynomial, we
have µ^3 + µ + 1 = 0, or µ^3 = µ + 1, so the powers of µ are: 

     µ^1 = µ 
     µ^2 = µ^2 
     µ^3 = µ + 1 
     µ^4 = µ^2 + µ 
     µ^5 = µ^2 + µ + 1 
     µ^6 = µ^2 + 1 
     µ^7 = 1 

"""

We could of course equally choose the second polynomial with
an isomorphic result.

The above example was taken from
http://www-math.cudenver.edu/~wcherowi/courses/finflds.html
Note that finding the irreducible polynomial was so easy
since a reducible cubic always has a linear factor. In the
general case, we have to check against all possible
subpolynomials or use much more of the theory.

Application of the example to the dictionary algorithm (DA)
-----------------------------------------------------------

We stay in GF(8) and use the above example.
The maximum allowed pattern value in our system is
2^n - 1. This is the variable "mask" in the program.

We assume a given non-zero bit pattern with coefficients
   (a2, a1, a0)
and write down a polynomial in µ for it:

    p = a2*µ^2 + a1*µ + a0

To obtain the next pattern in the group's enumeration,
we multiply by the generator polynomial µ:

    p*µ = a2*µ^3 + a1*µ^2 + a0*µ

In the program, this is line 210:
		incr = incr << 1;

a simple shift.
But in the case that a2 is not zero, we get an overflow,
and we have to fold the bit back, by the following identity:

    µ^3 = µ+1

That means, we have to subtract µ^3 from the pattern and
add µ+1 instead. But since addition and subtraction is
identical in GF(2), we just add the whole polynomial.
>From a different POV, we just add zero here, since

    µ^3 + µ + 1  =  0

The full progam to perform the polynomial multiplication
gets down to just a shift, a test and an XOR:

		incr = incr << 1;
		if (incr > mask)
			incr ^= mp->ma_poly;

Summary
=======

For every power n of two, we can find a generator element
µ of GF(2^n). Every non-zero bit pattern can be taken as
coefficient of a polynomial in µ. 
The powers of µ reach all these patterns. Therefore, each
pattern *is* some power of µ. By multiplication with µ
we can reach every possible pattern exactly once.
Since these patterns are used as distances from the
primary hash-computed slot modulo 2^n, and the distances
are never zero, all slots can be reached.

-----------------------------------

Appendix, on the use of finger:
-------------------------------

Instead of using a global finger variable, you can do the
following (involving a cast from object to int) :

- if the 0'th slot of the dict is non-empty:
  return this element and insert the dummy element
  as key. Set the value field to the Dictionary Algorithm
  would give for the removed object's hash. This is the
  next finger.
- else:
  treat the value field of the 0'th slot as the last finger.
  If it is zero, initialize it with 2^n-1.
  Repetitively use the DA until you find an entry. Save
  the finger in slot 0 again.

This dosn't cost an extra slot, and even when the dictionary
is written between removals, the chance to loose the finger
is just 1:(2^n-1) on every insertion.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com