[Spambayes-checkins] spambayes/scripts sb_dbexpimp.py,1.2,1.3
Tony Meyer
anadelonbrin at users.sourceforge.net
Tue Nov 25 19:12:46 EST 2003
Update of /cvsroot/spambayes/spambayes/scripts
In directory sc8-pr-cvs1:/tmp/cvs-serv592/scripts
Modified Files:
sb_dbexpimp.py
Log Message:
Import/Export data as utf-8. Part of patch [ 824651 ] Japanese (and/or other CJK languages) message support
Index: sb_dbexpimp.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/scripts/sb_dbexpimp.py,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** sb_dbexpimp.py 10 Sep 2003 04:33:17 -0000 1.2
--- sb_dbexpimp.py 26 Nov 2003 00:12:44 -0000 1.3
***************
*** 102,105 ****
--- 102,114 ----
  import sys, os, getopt, errno, re
  import urllib
+ from types import UnicodeType
+ 
+ def uquote(s):
+     if isinstance(s, UnicodeType):
+         s = s.encode('utf-8')
+     return urllib.quote(s)
+ 
+ def uunquote(s):
+     return unicode(urllib.unquote(s), 'utf-8')
  
  def runExport(dbFN, useDBM, outFN):
***************
*** 132,136 ****
hamcount = wi.hamcount
spamcount = wi.spamcount
! word = urllib.quote(word)
fp.write("%s`%s`%s`\n" % (word, hamcount, spamcount))
--- 141,145 ----
hamcount = wi.hamcount
spamcount = wi.spamcount
! word = uquote(word)
fp.write("%s`%s`%s`\n" % (word, hamcount, spamcount))
***************
*** 190,194 ****
for line in lines:
(word, hamcount, spamcount, junk) = re.split('`', line)
! word = urllib.unquote(word)
try:
--- 199,203 ----
for line in lines:
(word, hamcount, spamcount, junk) = re.split('`', line)
! word = uunquote(word)
try:
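
For readers following the change on a current interpreter, here is a rough Python 3 sketch of the round trip this revision introduces. It is not the committed code: the script above targets Python 2 (urllib.quote, unicode, types.UnicodeType), while this sketch assumes urllib.parse. The helpers export_line and import_line are hypothetical names used only for illustration, and the int() conversion of the counts is likewise an assumption, since the diff does not show how the script handles them after the split.

```python
from urllib.parse import quote, unquote


def uquote(s):
    """Percent-encode a token, encoding text to UTF-8 bytes first."""
    if isinstance(s, str):
        s = s.encode("utf-8")
    return quote(s)


def uunquote(s):
    """Reverse uquote: percent-decode and interpret the bytes as UTF-8."""
    return unquote(s, encoding="utf-8")


def export_line(word, hamcount, spamcount):
    # Hypothetical helper: same backtick-delimited layout as the fp.write call.
    return "%s`%s`%s`\n" % (uquote(word), hamcount, spamcount)


def import_line(line):
    # Hypothetical helper: the trailing backtick leaves a fourth "junk" field,
    # which here holds only the newline.
    word, hamcount, spamcount, _junk = line.split("`")
    # Assumption: counts are converted to int for this illustration.
    return uunquote(word), int(hamcount), int(spamcount)


if __name__ == "__main__":
    line = export_line("日本語", 3, 5)
    print(repr(line))
    print(import_line(line))
```

Because percent-encoding also escapes the backtick, a token that itself contains the field delimiter should survive the export and import cycle.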