[Python-checkins] python/dist/src/Doc/lib libdifflib.tex,1.11,1.12

tim_one@sourceforge.net
Sun, 28 Apr 2002 18:37:34 -0700


Update of /cvsroot/python/python/dist/src/Doc/lib
In directory usw-pr-cvs1:/tmp/cvs-serv32409/python/Doc/lib

Modified Files:
	libdifflib.tex 
Log Message:
Mostly in SequenceMatcher.{__chain_b, find_longest_match}:
This now does a dynamic analysis of which elements are so frequently
repeated as to constitute noise.  The primary benefit is an enormous
speedup in find_longest_match, as the innermost loop can have hundreds
of times fewer potential matches to worry about, in cases where the
sequences have many duplicate elements.  In effect, this zooms in on
sequences of non-ubiquitous elements now.

While I like what I've seen of the effects so far, I still consider
this experimental.  Please give it a try!
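
For readers who want to see the effect, here is a minimal sketch against
the difflib that ships with current Python 3.  It assumes the autojunk
constructor argument and the bpopular attribute, both of which were added
in later releases and are not part of this checkin.  The sequence contents
are made up for illustration, and the exact popularity threshold is an
implementation detail that may differ from the one in this revision.

```python
import difflib

# Illustrative data (an assumption, not from the checkin): 400 elements,
# half of them the ubiquitous "x", the other half unique.
b = ["x" if i % 2 else f"line{i}" for i in range(400)]
a = list(b)

# Default behaviour in current difflib: the popularity heuristic is on.
with_heuristic = difflib.SequenceMatcher(None, a, b)

# Later releases allow switching the heuristic off explicitly.
without_heuristic = difflib.SequenceMatcher(None, a, b, autojunk=False)

# b2j maps each element of b to the indices where it occurs; elements
# judged "popular" are dropped from it, so the inner loop of
# find_longest_match never iterates over their many positions.
print("x" in with_heuristic.bpopular)
print("x" in with_heuristic.b2j)
print("x" in without_heuristic.b2j)
```

If the heuristic misfires on a particular input, passing autojunk=False
(where available) keeps every element in the index, at the cost of the
slower inner loop described above.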


Index: libdifflib.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libdifflib.tex,v
retrieving revision 1.11
retrieving revision 1.12
diff -C2 -d -r1.11 -r1.12
*** libdifflib.tex	29 Nov 2001 19:04:50 -0000	1.11
--- libdifflib.tex	29 Apr 2002 01:37:31 -0000	1.12
***************
*** 91,101 ****
    for filter functions (or \code{None}):
  
!   \var{linejunk}: A function that should accept a single string
!   argument, and return true if the string is junk (or false if it is
!   not). The default is module-level function
    \function{IS_LINE_JUNK()}, which filters out lines without visible
    characters, except for at most one pound character (\character{\#}).
  
!   \var{charjunk}: A function that should accept a string of length 1.
    The default is module-level function \function{IS_CHARACTER_JUNK()},
    which filters out whitespace characters (a blank or tab; note: bad
--- 91,107 ----
    for filter functions (or \code{None}):
  
!   \var{linejunk}: A function that accepts a single string
!   argument, and returns true if the string is junk, or false if not.
!   The default is \code{None}, starting with Python 2.3.  Before then,
!   the default was the module-level function
    \function{IS_LINE_JUNK()}, which filters out lines without visible
    characters, except for at most one pound character (\character{\#}).
+   As of Python 2.3, the underlying \class{SequenceMatcher} class
+   does a dynamic analysis of which lines are so frequent as to
+   constitute noise, and this usually works better than the pre-2.3
+   default.
  
!   \var{charjunk}: A function that accepts a character (a string of
!   length 1), and returns true if the character is junk, or false if not.
    The default is module-level function \function{IS_CHARACTER_JUNK()},
    which filters out whitespace characters (a blank or tab; note: bad
***************
*** 151,155 ****
    if \var{line} is blank or contains a single \character{\#},
    otherwise it is not ignorable.  Used as a default for parameter
!   \var{linejunk} in \function{ndiff()}.
  \end{funcdesc}
  
--- 157,161 ----
    if \var{line} is blank or contains a single \character{\#},
    otherwise it is not ignorable.  Used as a default for parameter
!   \var{linejunk} in \function{ndiff()} before Python 2.3.
  \end{funcdesc}
  
***************
*** 444,457 ****
    for filter functions (or \code{None}):
  
!   \var{linejunk}: A function that should accept a single string
!   argument, and return true if the string is junk.  The default is
!   module-level function \function{IS_LINE_JUNK()}, which filters out
!   lines without visible characters, except for at most one pound
!   character (\character{\#}).
  
!   \var{charjunk}: A function that should accept a string of length 1.
!   The default is module-level function \function{IS_CHARACTER_JUNK()},
!   which filters out whitespace characters (a blank or tab; note: bad
!   idea to include newline in this!).
  \end{classdesc}
  
--- 450,461 ----
    for filter functions (or \code{None}):
  
!   \var{linejunk}: A function that accepts a single string
!   argument, and returns true if the string is junk.  The default is
!   \code{None}, meaning that no line is considered junk.
  
!   \var{charjunk}: A function that accepts a single character argument
!   (a string of length 1), and returns true if the character is junk.
!   The default is \code{None}, meaning that no character is
!   considered junk.
  \end{classdesc}
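
As a rough illustration of the filter-function arguments documented in
the hunks above, the sketch below passes a hypothetical is_blank_line
helper as linejunk and the module-level IS_CHARACTER_JUNK as charjunk to
ndiff().  The helper name and the sample lines are assumptions made for
the example and are not part of the patch.

```python
import difflib

def is_blank_line(line):
    """Hypothetical linejunk filter: true if the line is only whitespace."""
    return not line.strip()

# Made-up input: two short files that differ in one line.
old = ["one\n", "\n", "two\n"]
new = ["one\n", "\n", "too\n"]

# linejunk takes a whole line; charjunk takes a string of length 1.
delta = list(difflib.ndiff(old, new,
                           linejunk=is_blank_line,
                           charjunk=difflib.IS_CHARACTER_JUNK))

print("".join(delta), end="")
```

Passing None for either argument means nothing is treated as junk by that
filter, which is the linejunk default the patch documents for Python 2.3.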