davea at davea.name
Thu Mar 7 14:10:17 CET 2013
On 03/07/2013 01:33 AM, John Nagle wrote:
> Here's a traceback that's not helping:
A bit more context would be helpful. Starting with Python version.
> Traceback (most recent call last):
> File "InfoCompaniesHouse.py", line 255, in <module>
> File "InfoCompaniesHouse.py", line 251, in main
> loader.dofile(infile) # load this file
> File "InfoCompaniesHouse.py", line 213, in dofile
> self.dofilezip(infilename) # do ZIP file
> File "InfoCompaniesHouse.py", line 198, in dofilezip
> self.dofilecsv(infile, infd) # as a CSV file
> File "InfoCompaniesHouse.py", line 182, in dofilecsv
> for fields in reader : # read entire CSV file
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in
> position 14: ordinal not in range(128)
> This is weird, because "for fields in reader" isn't directly
> doing a decode. That's further down somewhere, and the backtrace
> didn't tell me where.
> The program is converting some .CSV files that come packaged in .ZIP
> files. The files are big, so rather than expanding them, they're
> read directly from the ZIP files and processed through the ZIP
> and CSV modules.
> Here's the code that's causing the error above:
> decoder = codecs.getreader('utf-8')
> with decoder(infdraw,errors="replace") as infd :
> with codecs.open(outfilename, encoding='utf-8', mode='w') as outfd :
> headerline = infd.readline()
> reader = csv.reader(infd, delimiter=',', quotechar='"')
> for fields in reader :
> pass
> Normally, the "pass" is a call to something that
> uses the data, but for test purposes, I put a "pass" in there. It still
> fails. With that "pass", nothing is ever written to the
> output file, and no "encoding" should be taking place.
> "infdraw" is a stream from the zip module, created like this:
> with inzip.open(zipelt.filename,"r") as infd :
You probably need a 'rb' rather than 'r', since the file is not ASCII.
> self.dofilecsv(infile, infd)
> This works for data records that are pure ASCII, but as soon as some
> non-ASCII character comes through, it fails.
> Where is the error being generated? I'm not seeing any place
> where there's a conversion to ASCII. Not even a print.
> John Nagle
If that isn't enough, then please give the whole context, such as where
zipelt and filename came from. And don't forget to specify Python
version. Version 3.x treats nonbinary files very differently than 2.x.
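
For illustration, here is a minimal sketch of how this kind of pipeline might look under Python 3, where the csv module works on text. In Python 2 the csv module does not support Unicode input, so feeding it already-decoded lines is one plausible source of the implicit ASCII encode seen in the traceback. The names read_csv_rows, ascii_encode_fails, zip_path, and member_name are hypothetical and not taken from the original program.

```python
import csv
import io
import zipfile


def read_csv_rows(zip_path, member_name, encoding="utf-8"):
    """Yield CSV rows from one member of a ZIP archive (Python 3 sketch).

    zip_path and member_name are hypothetical parameters for illustration.
    """
    with zipfile.ZipFile(zip_path) as inzip:
        # ZipFile.open() returns a binary stream of the member's bytes.
        with inzip.open(member_name, "r") as raw:
            # Decode exactly once; newline="" lets csv handle line endings.
            with io.TextIOWrapper(raw, encoding=encoding,
                                  errors="replace", newline="") as text:
                reader = csv.reader(text, delimiter=",", quotechar='"')
                for fields in reader:
                    yield fields


def ascii_encode_fails(text):
    """Return True if encoding text as ASCII raises UnicodeEncodeError."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False
```

The second helper only reproduces the class of error in the traceback: the pound sign (U+00A3) has no ASCII encoding, so any hidden conversion to ASCII fails as soon as it appears in the data.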