[Spambayes-checkins] spambayes/scripts sb_dbexpimp.py,1.2,1.3

Tony Meyer anadelonbrin at users.sourceforge.net
Tue Nov 25 19:12:46 EST 2003


Update of /cvsroot/spambayes/spambayes/scripts
In directory sc8-pr-cvs1:/tmp/cvs-serv592/scripts

Modified Files:
	sb_dbexpimp.py 
Log Message:
Import/Export data as utf-8.  Part of patch [ 824651 ] Japanese (and/or other CJK languages) message support

Index: sb_dbexpimp.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/scripts/sb_dbexpimp.py,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** sb_dbexpimp.py	10 Sep 2003 04:33:17 -0000	1.2
--- sb_dbexpimp.py	26 Nov 2003 00:12:44 -0000	1.3
***************
*** 102,105 ****
--- 102,114 ----
  import sys, os, getopt, errno, re
  import urllib
+ from types import UnicodeType
+ 
+ def uquote(s):
+     if isinstance(s, UnicodeType):
+         s = s.encode('utf-8')
+     return urllib.quote(s)
+ 
+ def uunquote(s):
+     return unicode(urllib.unquote(s), 'utf-8')
  
  def runExport(dbFN, useDBM, outFN):
***************
*** 132,136 ****
          hamcount = wi.hamcount
          spamcount = wi.spamcount
!         word = urllib.quote(word)
          fp.write("%s`%s`%s`\n" % (word, hamcount, spamcount))
          
--- 141,145 ----
          hamcount = wi.hamcount
          spamcount = wi.spamcount
!         word = uquote(word)
          fp.write("%s`%s`%s`\n" % (word, hamcount, spamcount))
          
***************
*** 190,194 ****
      for line in lines:
          (word, hamcount, spamcount, junk) = re.split('`', line)
!         word = urllib.unquote(word)
         
          try:
--- 199,203 ----
      for line in lines:
          (word, hamcount, spamcount, junk) = re.split('`', line)
!         word = uunquote(word)
         
          try:
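
For readers on a current interpreter, the round trip this patch introduces can be sketched in Python 3. This is an illustrative port, not the committed code: the patch itself targets Python 2 (`urllib.quote`, `unicode`, `types.UnicodeType`), and the helpers `export_line` and `import_line` are hypothetical names that only mirror the `fp.write(...)` and `re.split(...)` lines shown in the diff. Converting the counts to `int` on import is also an assumption made for the example.

```python
from urllib.parse import quote, unquote


def uquote(s):
    """Percent-encode a token; text is encoded to UTF-8 bytes first."""
    if isinstance(s, str):
        s = s.encode("utf-8")
    return quote(s)


def uunquote(s):
    """Reverse uquote: percent-decode and interpret the bytes as UTF-8."""
    return unquote(s, encoding="utf-8")


def export_line(word, hamcount, spamcount):
    # Hypothetical helper: same backtick-delimited layout as the diff's fp.write().
    return "%s`%s`%s`\n" % (uquote(word), hamcount, spamcount)


def import_line(line):
    # Hypothetical helper: the trailing newline lands in the fourth field,
    # as with the "junk" variable in the diff.
    word, hamcount, spamcount, _junk = line.split("`")
    # Assumption: counts are parsed as integers here for illustration.
    return uunquote(word), int(hamcount), int(spamcount)


if __name__ == "__main__":
    line = export_line("日本語", 3, 1)
    print(repr(line))
    print(import_line(line))
```

Because percent-encoding also escapes the backtick, a token that contains the field delimiter still survives the export and import cycle.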




