[spambayes-dev] Website bug and proposed fix

Tue Nov 11 16:58:41 EST 2003

Hi,

Jens Rantil has kindly pointed out that we have some broken links on
our website, in particular the "SF Project Page" link that appears
throughout the site.

I've never looked at the website code before, so I could be way off
base, but the problem seems to be that we're applying posixpath.normpath
to a whole URL.  normpath collapses the "//" after the scheme into a
single slash, with results that look like this:

>>> import posixpath
>>> posixpath.normpath("http://sourceforge.net/projects/spambayes")
'http:/sourceforge.net/projects/spambayes'

I'd say the fix is to break the URL apart and run only the path
component through normpath.  Here's a patch - I don't want to commit it
myself, partly because I don't know the code, and partly because the
website build system doesn't fully work on my machine, so I can't test
it thoroughly.  I've also removed a rather cryptic comment that seems to
refer to history rather than the current state of play.
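
To show the idea in isolation, here's a minimal standalone sketch of
"split the URL, normalize only the path, put it back together".  The
function name normalize_url is made up for illustration and isn't part
of LinkFixer.py; it also leaves out the rootdir/relthis joining that the
real method does.  The try/except import is an assumption so the snippet
runs both where the module is called urlparse and where the same
functions live in urllib.parse.

```python
import posixpath  # use posix semantics for urls

try:
    from urlparse import urlparse, urlunparse      # module name in Python 2
except ImportError:
    from urllib.parse import urlparse, urlunparse  # same functions in Python 3


def normalize_url(url):
    """Hypothetical helper: normalize only the path component of a URL."""
    scheme, addr, path, params, query, frag = urlparse(url)
    if path:
        # normpath('') returns '.', so leave an empty path alone.
        path = posixpath.normpath(path)
    # The scheme and host never pass through normpath, so "//" survives.
    return urlunparse((scheme, addr, path, params, query, frag))
```

With this, "http://sourceforge.net/projects/spambayes" should come back
unchanged, while a relative link such as "docs/../index.html" should
still be collapsed to "index.html".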

Index: scripts/ht2html/LinkFixer.py
===================================================================
RCS file: /cvsroot/spambayes/website/scripts/ht2html/LinkFixer.py,v
retrieving revision 1.2
diff -c -r1.2 LinkFixer.py
*** scripts/ht2html/LinkFixer.py	28 Oct 2003 04:37:08 -0000	1.2
--- scripts/ht2html/LinkFixer.py	11 Nov 2003 21:57:40 -0000
***************
*** 8,13 ****
--- 8,15 ----
  """

  import sys
+ import urlparse
+ import posixpath # use posix semantics for urls
  from types import StringType

  SLASH = '/'
***************
*** 37,49 ****
              url = 'index.html'
          elif url[-1] == '/':
              url = url + 'index.html'
!         absurl = SLASH.join([self.__rootdir, self.__relthis, url])
          # normalize the path, kind of the way os.path.normpath() does.
!         # urlparse ought to have something like this...
!         # hrm - MarkH thinks this is broken, so it has been replaced
!         # with normpath - what is the problem with normpath?
!         import posixpath # use posix semantics for urls
!         absurl = posixpath.normpath(absurl)
          self.msg('absurl= %s', absurl)
          return absurl

--- 39,51 ----
              url = 'index.html'
          elif url[-1] == '/':
              url = url + 'index.html'
!         
          # normalize the path, kind of the way os.path.normpath() does.
!         # urlparse ought to have something like this built in...
!         scheme, addr, path, params, query, frag = urlparse.urlparse(url)
!         abspath = SLASH.join([self.__rootdir, self.__relthis, path])
!         path = posixpath.normpath(abspath)
!         absurl = urlparse.urlunparse((scheme, addr, path, params, query, frag))
          self.msg('absurl= %s', absurl)
          return absurl

-- 
Richie Hindle
richie at entrian.com