[Python-checkins] CVS: python/dist/src/Lib Cookie.py,1.8,1.9

Tim Peters tim_one@users.sourceforge.net
Sat, 12 May 2001 17:19:33 -0700


Update of /cvsroot/python/python/dist/src/Lib
In directory usw-pr-cvs1:/tmp/cvs-serv1323/python/dist/src/Lib

Modified Files:
	Cookie.py 
Log Message:
Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask".
The comment following used to say:
	/* We use ~hash instead of hash, as degenerate hash functions, such
	   as for ints <sigh>, can have lots of leading zeros. It's not
	   really a performance risk, but better safe than sorry.
	   12-Dec-00 tim:  so ~hash produces lots of leading ones instead --
	   what's the gain? */
That is, there was never a good reason for doing it.  And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collision) the same "too often" across distinct hashes.
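
As a rough illustration of the "same too often" effect, here is a small
sketch.  It assumes the increment for the second probe is derived as
(hash ^ (hash >> 3)) & mask, with 0 replaced by mask; that rule is an
assumption made for this example and is not stated in this message, and
first_two_probes is a made-up name.  Under that assumption, for small
hashes ~hash and the increment cancel out, so every key is sent to the
same second slot.

```python
def first_two_probes(h, mask, invert):
    """Return (first index, second index) for hash h.

    invert=True models the old "i = (~hash) & mask" rule,
    invert=False models the new "i = hash & mask" rule.
    """
    i = (~h if invert else h) & mask
    # Assumed increment rule (not taken from this message).
    incr = (h ^ (h >> 3)) & mask
    if incr == 0:
        incr = mask  # the increment must be nonzero or probing never moves
    return i, (i + incr) & mask


mask = 31                 # table of 32 slots
small_hashes = range(1, 8)

# Second slot examined for each hash, under the old and the new rule.
old_second = {first_two_probes(h, mask, True)[1] for h in small_hashes}
new_second = {first_two_probes(h, mask, False)[1] for h in small_hashes}

print(sorted(old_second))  # one shared slot for all seven hashes
print(sorted(new_second))  # seven distinct slots
```

In this toy model the old rule funnels all seven hashes to slot 31 on the
second probe, while the new rule spreads them over seven different slots.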

Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== number of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.

Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items().  A number
of std tests suffered bogus failures as a result.  For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,

>>> d = {}
>>> for i in range(10):
...    d[i] = i
...
>>> d
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>

Unfortunately, people may latch on to that in small examples and draw a
bogus conclusion.
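
Why small-int keys now tend to come out in increasing order can be seen
in a toy model.  The sketch below is not the real dict implementation;
slot_order is a hypothetical helper, and it only handles the simple case
where no two keys want the same slot.  It relies on hash(i) == i for
small non-negative ints in CPython.

```python
def slot_order(keys, mask, invert):
    """Return keys in the order a slot-by-slot scan of the table meets them.

    Simplifying assumption: every key stays in its first-choice slot,
    so the sketch refuses inputs that would collide.
    """
    table = {}
    for k in keys:
        h = hash(k)
        slot = (~h if invert else h) & mask
        if slot in table:
            raise ValueError("sketch only handles collision-free cases")
        table[slot] = k
    # Walk the slots from index 0 upward, as a table scan would.
    return [table[s] for s in sorted(table)]


new_order = slot_order(range(10), 15, invert=False)
old_order = slot_order(range(10), 15, invert=True)
print(new_order)  # increasing order of key
print(old_order)  # decreasing order of key in this model
```

The increasing order falls out of where the keys happen to land; it is
not something the dict promises, which is why it should not be relied on.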

test_support.py
    Moved test_extcall's sortdict() into test_support, made it stronger,
    and imported sortdict into other std tests that needed it.
test_unicode.py
    Excluded cp875 from the "roundtrip over range(128)" test, because
    cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
    See Python-Dev for excruciating details.
Cookie.py
    Changed various output functions to sort dicts before building
    strings from them.
test_extcall
    Fiddled the expected-result file.  This remains sensitive to native
    dict ordering, because, e.g., if there are multiple errors in a
    keyword-arg dict (and test_extcall sets up many cases like that), the
    specific error Python complains about first depends on native dict
    ordering.
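
The common thread in the test and Cookie.py changes is to sort the items
before turning a dict into text, so the output no longer depends on
native dict ordering.  A minimal sketch of that idea follows;
sorted_dict_repr is a hypothetical name and this is not the actual
test_support.sortdict() code.

```python
def sorted_dict_repr(d):
    """Like repr(d) for a dict, but with the items in sorted key order."""
    # Sort the (key, value) pairs so the result is independent of the
    # order in which the dict happens to hand them out.
    pairs = sorted(d.items())
    return "{%s}" % ", ".join("%r: %r" % pair for pair in pairs)


a = {"sugar": "wafer", "fig": "newton"}
b = {"fig": "newton", "sugar": "wafer"}
print(sorted_dict_repr(a))
print(sorted_dict_repr(a) == sorted_dict_repr(b))
```

Expected-output files and doctests that compare against a string built
this way keep passing even if the hashing details change again.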


Index: Cookie.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/Cookie.py,v
retrieving revision 1.8
retrieving revision 1.9
diff -C2 -r1.8 -r1.9
*** Cookie.py	2001/04/06 19:39:11	1.8
--- Cookie.py	2001/05/13 00:19:31	1.9
***************
*** 71,76 ****
     >>> C["sugar"] = "wafer"
     >>> print C
-    Set-Cookie: sugar=wafer;
     Set-Cookie: fig=newton;
  
  Notice that the printable representation of a Cookie is the
--- 71,76 ----
     >>> C["sugar"] = "wafer"
     >>> print C
     Set-Cookie: fig=newton;
+    Set-Cookie: sugar=wafer;
  
  Notice that the printable representation of a Cookie is the
***************
*** 94,99 ****
     >>> C.load("chips=ahoy; vienna=finger")
     >>> print C
-    Set-Cookie: vienna=finger;
     Set-Cookie: chips=ahoy;
  
  The load() method is darn-tootin smart about identifying cookies
--- 94,99 ----
     >>> C.load("chips=ahoy; vienna=finger")
     >>> print C
     Set-Cookie: chips=ahoy;
+    Set-Cookie: vienna=finger;
  
  The load() method is darn-tootin smart about identifying cookies
***************
*** 494,498 ****
          if attrs is None:
              attrs = self._reserved_keys
!         for K,V in self.items():
              if V == "": continue
              if K not in attrs: continue
--- 494,500 ----
          if attrs is None:
              attrs = self._reserved_keys
!         items = self.items()
!         items.sort()
!         for K,V in items:
              if V == "": continue
              if K not in attrs: continue
***************
*** 587,591 ****
          """Return a string suitable for HTTP."""
          result = []
!         for K,V in self.items():
              result.append( V.output(attrs, header) )
          return string.join(result, sep)
--- 589,595 ----
          """Return a string suitable for HTTP."""
          result = []
!         items = self.items()
!         items.sort()
!         for K,V in items:
              result.append( V.output(attrs, header) )
          return string.join(result, sep)
***************
*** 596,600 ****
      def __repr__(self):
          L = []
!         for K,V in self.items():
              L.append( '%s=%s' % (K,repr(V.value) ) )
          return '<%s: %s>' % (self.__class__.__name__, string.join(L))
--- 600,606 ----
      def __repr__(self):
          L = []
!         items = self.items()
!         items.sort()
!         for K,V in items:
              L.append( '%s=%s' % (K,repr(V.value) ) )
          return '<%s: %s>' % (self.__class__.__name__, string.join(L))
***************
*** 603,607 ****
          """Return a string suitable for JavaScript."""
          result = []
!         for K,V in self.items():
              result.append( V.js_output(attrs) )
          return string.join(result, "")
--- 609,615 ----
          """Return a string suitable for JavaScript."""
          result = []
!         items = self.items()
!         items.sort()
!         for K,V in items:
              result.append( V.js_output(attrs) )
          return string.join(result, "")