[Python-checkins] CVS: python/dist/src/Lib/test test_cookie.py,1.9,1.10 test_extcall.py,1.15,1.16 test_pyexpat.py,1.8,1.9 test_regex.py,1.9,1.10 test_support.py,1.21,1.22 test_unicode.py,1.32,1.33

Tim Peters tim_one@users.sourceforge.net
Sat, 12 May 2001 17:19:33 -0700


Update of /cvsroot/python/python/dist/src/Lib/test
In directory usw-pr-cvs1:/tmp/cvs-serv1323/python/dist/src/Lib/test

Modified Files:
	test_cookie.py test_extcall.py test_pyexpat.py test_regex.py 
	test_support.py test_unicode.py 
Log Message:
Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask".
The comment following used to say:
	/* We use ~hash instead of hash, as degenerate hash functions, such
	   as for ints <sigh>, can have lots of leading zeros. It's not
	   really a performance risk, but better safe than sorry.
	   12-Dec-00 tim:  so ~hash produces lots of leading ones instead --
	   what's the gain? */
That is, there was never a good reason for doing it.  And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collision) the same "too often" across distinct hashes.
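
As a rough illustration of that point, the sketch below compares the first
index examined after a collision under both schemes.  The increment formula
used here, incr = (hash ^ (hash >> 3)) & mask with 0 replaced by mask, is an
assumption about the lookup code of that era and is not stated in this
message; second_probe() and its arguments are hypothetical names.

```python
def second_probe(h, mask, invert):
    """Return (i + incr) & mask, the first index tried after a collision."""
    # invert=True models the old "i = (~hash) & mask"; False models the new code.
    i = (~h if invert else h) & mask
    # Assumed increment derivation (not given in this message).
    incr = (h ^ (h >> 3)) & mask
    if incr == 0:
        incr = mask  # the increment must never be zero
    return (i + incr) & mask

mask = 31             # hypothetical table of 32 slots
hashes = range(1, 8)  # small non-negative ints hash to themselves

old = [second_probe(h, mask, invert=True) for h in hashes]
new = [second_probe(h, mask, invert=False) for h in hashes]
print("old:", old)
print("new:", new)
```

Under these assumptions every hash in the old list lands on the same second
index, while the new list spreads them out.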

Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== number of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.

Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items().  A number
of std tests suffered bogus failures as a result.  For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,

>>> d = {}
>>> for i in range(10):
...    d[i] = i
...
>>> d
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>

Unfortunately, people may latch on to that in small examples and draw a
bogus conclusion.
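
The ordering effect can be sketched by computing the first slot each small
int key would occupy.  This is only an illustration: the table size is a
hypothetical choice, and it assumes that keys() and repr() walk the table
from slot 0 upward and that no collisions occur for these keys.

```python
mask = 15         # hypothetical table of 16 slots
keys = range(10)  # small non-negative ints hash to themselves

old_slots = [(~k) & mask for k in keys]  # old: i = (~hash) & mask
new_slots = [k & mask for k in keys]     # new: i = hash & mask

# Walking the table from slot 0 upward yields keys in slot order.
old_order = [k for _, k in sorted(zip(old_slots, keys))]
new_order = [k for _, k in sorted(zip(new_slots, keys))]
print("old order:", old_order)
print("new order:", new_order)
```

With the old index the keys come out in decreasing order; with the new one
they come out in increasing order, which is an accident of the hash function
and not something to rely on.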

test_support.py
    Moved test_extcall's sortdict() into test_support, made it stronger,
    and imported sortdict into other std tests that needed it.
test_unicode.py
    Excluded cp875 from the "roundtrip over range(128)" test, because
    cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
    See Python-Dev for excruciating details.
Cookie.py
    Changed various output functions to sort dicts before building
    strings from them.
test_extcall
    Fiddled the expected-result file.  This remains sensitive to native
    dict ordering, because, e.g., if there are multiple errors in a
    keyword-arg dict (and test_extcall sets up many cases like that), the
    specific error Python complains about first depends on native dict
    ordering.
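
For readers on a current interpreter, the sortdict() helper described above
can be approximated as follows.  This is a Python 3 adaptation, not the code
checked in here (that version is in the test_support.py diff in this
message); it assumes the keys are mutually comparable.

```python
def sortdict(d):
    """Like repr(d), but with the items in sorted key order."""
    # sorted() replaces the items()/sort() pair used in the original helper.
    pairs = ["%r: %r" % pair for pair in sorted(d.items())]
    return "{%s}" % ", ".join(pairs)

print(sortdict({"b": 2, "a": 1}))
```

Printing sortdict(d) instead of d keeps expected-output files stable no
matter how the dict orders its entries internally.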


Index: test_cookie.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_cookie.py,v
retrieving revision 1.9
retrieving revision 1.10
diff -C2 -r1.9 -r1.10
*** test_cookie.py	2001/04/06 21:20:58	1.9
--- test_cookie.py	2001/05/13 00:19:31	1.10
***************
*** 21,25 ****
      print repr(C)
      print str(C)
!     for k, v in dict.items():
          print ' ', k, repr( C[k].value ), repr(v)
          verify(C[k].value == v)
--- 21,27 ----
      print repr(C)
      print str(C)
!     items = dict.items()
!     items.sort()
!     for k, v in items:
          print ' ', k, repr( C[k].value ), repr(v)
          verify(C[k].value == v)

Index: test_extcall.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_extcall.py,v
retrieving revision 1.15
retrieving revision 1.16
diff -C2 -r1.15 -r1.16
*** test_extcall.py	2001/05/05 03:56:37	1.15
--- test_extcall.py	2001/05/13 00:19:31	1.16
***************
*** 1,13 ****
! from test_support import verify, verbose, TestFailed
  from UserList import UserList
  
- def sortdict(d):
-     keys = d.keys()
-     keys.sort()
-     lst = []
-     for k in keys:
-         lst.append("%r: %r" % (k, d[k]))
-     return "{%s}" % ", ".join(lst)
- 
  def f(*a, **k):
      print a, sortdict(k)
--- 1,5 ----
! from test_support import verify, verbose, TestFailed, sortdict
  from UserList import UserList
  
  def f(*a, **k):
      print a, sortdict(k)
***************
*** 229,234 ****
                  if vararg: arglist.append('*' + vararg)
                  if kwarg: arglist.append('**' + kwarg)
!                 decl = 'def %s(%s): print "ok %s", a, b, d, e, v, k' % (
!                     name, ', '.join(arglist), name)
                  exec(decl)
                  func = eval(name)
--- 221,227 ----
                  if vararg: arglist.append('*' + vararg)
                  if kwarg: arglist.append('**' + kwarg)
!                 decl = (('def %s(%s): print "ok %s", a, b, d, e, v, ' +
!                          'type(k) is type ("") and k or sortdict(k)')
!                          % (name, ', '.join(arglist), name))
                  exec(decl)
                  func = eval(name)

Index: test_pyexpat.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_pyexpat.py,v
retrieving revision 1.8
retrieving revision 1.9
diff -C2 -r1.8 -r1.9
*** test_pyexpat.py	2001/04/25 16:03:54	1.8
--- test_pyexpat.py	2001/05/13 00:19:31	1.9
***************
*** 6,12 ****
  from xml.parsers import expat
  
  class Outputter:
      def StartElementHandler(self, name, attrs):
!         print 'Start element:\n\t', repr(name), attrs
  
      def EndElementHandler(self, name):
--- 6,14 ----
  from xml.parsers import expat
  
+ from test_support import sortdict
+ 
  class Outputter:
      def StartElementHandler(self, name, attrs):
!         print 'Start element:\n\t', repr(name), sortdict(attrs)
  
      def EndElementHandler(self, name):

Index: test_regex.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_regex.py,v
retrieving revision 1.9
retrieving revision 1.10
diff -C2 -r1.9 -r1.10
*** test_regex.py	2001/01/17 21:51:36	1.9
--- test_regex.py	2001/05/13 00:19:31	1.10
***************
*** 1,3 ****
! from test_support import verbose
  import warnings
  warnings.filterwarnings("ignore", "the regex module is deprecated",
--- 1,3 ----
! from test_support import verbose, sortdict
  import warnings
  warnings.filterwarnings("ignore", "the regex module is deprecated",
***************
*** 41,45 ****
  print cre.group('one', 'two')
  print 'realpat:', cre.realpat
! print 'groupindex:', cre.groupindex
  
  re = 'world'
--- 41,45 ----
  print cre.group('one', 'two')
  print 'realpat:', cre.realpat
! print 'groupindex:', sortdict(cre.groupindex)
  
  re = 'world'

Index: test_support.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_support.py,v
retrieving revision 1.21
retrieving revision 1.22
diff -C2 -r1.21 -r1.22
*** test_support.py	2001/03/23 18:04:02	1.21
--- test_support.py	2001/05/13 00:19:31	1.22
***************
*** 91,94 ****
--- 91,102 ----
          raise TestFailed(reason)
  
+ def sortdict(dict):
+     "Like repr(dict), but in sorted order."
+     items = dict.items()
+     items.sort()
+     reprpairs = ["%r: %r" % pair for pair in items]
+     withcommas = ", ".join(reprpairs)
+     return "{%s}" % withcommas
+ 
  def check_syntax(statement):
      try:

Index: test_unicode.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_unicode.py,v
retrieving revision 1.32
retrieving revision 1.33
diff -C2 -r1.32 -r1.33
*** test_unicode.py	2001/05/02 14:21:52	1.32
--- test_unicode.py	2001/05/13 00:19:31	1.33
***************
*** 6,10 ****
  
  """#"
! from test_support import verify, verbose
  import sys
  
--- 6,10 ----
  
  """#"
! from test_support import verify, verbose, TestFailed
  import sys
  
***************
*** 494,501 ****
  
      'mac_greek', 'mac_iceland','mac_roman', 'mac_turkish',
!     'cp1006', 'cp875', 'iso8859_8',
  
      ### These have undefined mappings:
      #'cp424',
  
      ):
--- 494,504 ----
  
      'mac_greek', 'mac_iceland','mac_roman', 'mac_turkish',
!     'cp1006', 'iso8859_8',
  
      ### These have undefined mappings:
      #'cp424',
+ 
+     ### These fail the round-trip:
+     #'cp875'
  
      ):