[Python-checkins] python/dist/src/Doc/tools prechm.py,1.3,1.4

tim_one@sourceforge.net
Fri, 19 Apr 2002 11:41:49 -0700


Update of /cvsroot/python/python/dist/src/Doc/tools
In directory usw-pr-cvs1:/tmp/cvs-serv16402

Modified Files:
	prechm.py 
Log Message:
Added a stop-list to reduce the size of the full text search index.  Fred,
populate the "stop_list" triple-quoted string with your favorite handful
of stop words.
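
For anyone extending the list, here is a minimal standalone sketch (Python 3,
not the code in prechm.py) of what the stop-list step does: split the
triple-quoted string on whitespace, sort the words, and write one word per
line. The names stoplist_bytes, write_stoplist and MAX_STP_BYTES are
hypothetical, and the size check is an assumption built on the 256-byte
figure quoted in the source comment; the checked-in script does no such check.

```python
# Illustrative sketch only; the helper names and the size check are assumptions.
STOP_LIST = """
a  an  and
is
near
not
of
or
the
"""

# Assumed conservative limit: the source comment says the MS docs claim
# either 256 or 512 bytes, counting the \r\n at the end of each line.
MAX_STP_BYTES = 256


def stoplist_bytes(text):
    """Return the stop-list file contents: sorted words, one per CRLF line."""
    words = sorted(text.split())
    return "".join(word + "\r\n" for word in words).encode("ascii")


def write_stoplist(path, text, limit=MAX_STP_BYTES):
    """Write the stop list to path, refusing to exceed the assumed limit."""
    data = stoplist_bytes(text)
    if len(data) > limit:
        raise ValueError(
            "stop list is %d bytes, over the %d-byte limit" % (len(data), limit)
        )
    # Binary mode keeps the explicit \r\n line endings on every platform.
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```

With the nine words currently in the list this comes to 40 bytes, so there is
room for more before the smaller of the two claimed limits is reached.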


Index: prechm.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/tools/prechm.py,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** prechm.py	19 Apr 2002 18:07:52 -0000	1.3
--- prechm.py	19 Apr 2002 18:41:46 -0000	1.4
***************
*** 1,3 ****
! '''
      Makes the necesary files to convert from plain html of
      Python 1.5 and 1.5.x Documentation to
--- 1,3 ----
! """
      Makes the necesary files to convert from plain html of
      Python 1.5 and 1.5.x Documentation to
***************
*** 14,18 ****
      and Fred Drake.  Obtained from Robin Dunn's .chm packaging of the
      Python 2.2 docs, at <http://alldunn.com/python/>.
! '''
  
  import sys
--- 14,18 ----
      and Fred Drake.  Obtained from Robin Dunn's .chm packaging of the
      Python 2.2 docs, at <http://alldunn.com/python/>.
! """
  
  import sys
***************
*** 39,43 ****
  project_template = '''
  [OPTIONS]
- Compatibility=1.1
  Compiled file=%(arch)s.chm
  Contents file=%(arch)s.hhc
--- 39,42 ----
***************
*** 45,48 ****
--- 44,48 ----
  Default topic=index.html
  Display compile progress=No
+ Full text search stop list file=%(arch)s.stp
  Full-text search=Yes
  Index file=%(arch)s.hhk
***************
*** 81,84 ****
--- 81,101 ----
  '''
  
+ 
+ # List of words the full text search facility shouldn't index.  This
+ # becomes file ARCH.stp.  Note that this list must be pretty small!
+ # Different versions of the MS docs claim the file has a maximum size of
+ # 256 or 512 bytes (including \r\n at the end of each line).
+ # Note that "and", "or", "not" and "near" are operators in the search
+ # language, so there's no point indexing them even if we wanted to.
+ stop_list = '''
+ a  an  and
+ is
+ near
+ not
+ of
+ or
+ the
+ '''
+ 
  # Library Doc list of tuples:
  # each 'book' : ( Dir, Title, First page, Content page, Index page)
***************
*** 336,339 ****
--- 353,365 ----
  
      if not (('-p','') in optlist) :
+         fname = arch + '.stp'
+         f = openfile(fname)
+         print "Building stoplist", fname, "..."
+         words = stop_list.split()
+         words.sort()
+         for word in words:
+             print >> f, word
+         f.close()
+ 
          f = openfile(arch + '.hhp')
          print "Building Project..."