[spambayes-bugs] [ spambayes-Bugs-1187208 ] import into CDB chokes on 8-bit chars

SourceForge.net noreply at sourceforge.net
Thu Apr 21 10:45:22 CEST 2005


Bugs item #1187208, was opened at 2005-04-21 01:45
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=1187208&group_id=61702

Category: hammie
Group: 1.1.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Leonid (leobru)
Assigned to: Nobody/Anonymous (nobody)
Summary: import into CDB chokes on 8-bit chars

Initial Comment:
If the CSV file contains a non-ASCII ISO-8859-1 character,
the import into CDB fails:

File csv (2 lines, saved as ISO-8859-1):
1,1
fiancée,1,1

sb_dbexpimp.py -i -o Storage:persistent_use_database:cdb -o Storage:persistent_storage_file:cdb -v -f csv
Importing file csv into database /.../cdb
Storing database, please be patient.  Even moderately sized
databases may take a very long time to store.
Traceback (most recent call last):
  File "./sb_dbexpimp.py", line 248, in ?
    runImport(dbFN, useDBM, newDBM, flatFN)
  File "./sb_dbexpimp.py", line 200, in runImport
    bayes.store()
  File "/usr/home/leob/spambayes-1.1a1/scripts/spambayes/storage.py", line 649, in store
    cdb.cdb_make(db, items)
  File "/usr/home/leob/spambayes-1.1a1/scripts/spambayes/cdb.py", line 166, in cdb_make
    outfile.write(key)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 5: ordinal not in range(128)
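
The traceback suggests that the key reaches outfile.write() as a unicode
string and is implicitly encoded as ASCII, which fails on the é. A minimal
sketch of one possible workaround follows, written for current Python rather
than the Python 2 of the report. It is not the actual SpamBayes code or fix:
the helper name to_bytes and the choice of UTF-8 are assumptions made for
illustration only.

```python
import io


def to_bytes(key, encoding="utf-8"):
    """Hypothetical helper: encode text keys explicitly, pass bytes through."""
    if isinstance(key, str):
        return key.encode(encoding)
    return key


# The token from the CSV file in the report, as decoded text.
token = "fianc\u00e9e"

# Encoding as ASCII is what fails in the traceback.
try:
    token.encode("ascii")
    ascii_failed = False
    ascii_error = ""
except UnicodeEncodeError as exc:
    ascii_failed = True
    ascii_error = str(exc)

# Encoding explicitly before writing to a binary stream avoids the error.
outfile = io.BytesIO()
outfile.write(to_bytes(token))
written = outfile.getvalue()
```

Whatever encoding is chosen when writing would also have to be used when the
keys are read back, otherwise lookups for non-ASCII tokens will not match.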

