[Tutor] extract uri from beautiful soup string

Sander Sweers sander.sweers at gmail.com
Mon Oct 15 02:35:05 CEST 2012


Please don't top post.

> On Mon, Oct 15, 2012 at 12:12 AM, Sander Sweers <sander.sweers at gmail.com> wrote:
> > Norman Khine wrote on Sun 14-10-2012 at 23:10 [+0100]:
> >> One thing is that when I try to write the assoc_data into a CSV file,
> >> it croaks on
> >>
> >> UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 0:
> >
> > It looks like python is doing an implicit decode/encode on one of your
> > strings. It may be caused in codecs.open. You will have to hunt down
> > where this implicit decode/encode is done, see
> > http://nedbatchelder.com/text/unipain.html for more info.
> >
> >> here some sample data from the print:
> >
> > These strings don't cause any errors for me and fit in ascii. Add some
> > print statements before you write the string to find which string is
> > causing you grief.

> Norman Khine wrote on Mon 15-10-2012 at 00:17 [+0100]:
> i tried this: http://pastie.org/5059153
> 
> but now i get a
> 
> Traceback (most recent call last):
>   File "nimes_extract.py", line 75, in <module>
>     c.writerow([item.encode("UTF-8")])
> TypeError: 'NoneType' object is not callable


You have several str() calls, probably added to work around real bugs,
but now they are biting back. I don't see any use for them in your code.

An example of how str() is hiding bugs for you:

>>> str(None).encode('UTF-8')
'None'
>>> None.encode('UTF-8')

Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    None.encode('UTF-8')
AttributeError: 'NoneType' object has no attribute 'encode'

Get rid of all the str() calls, make sure you have only unicode strings
*everywhere*, and then start fixing the bugs that were hidden because of
them.
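
A minimal sketch of what I mean, assuming your rows are lists of unicode
strings. The helper name encode_row and the sample values are made up for
illustration and are not taken from your script. The idea is to keep text
as unicode inside the program, encode only at the point where you write
it out, and fail loudly on None instead of papering over it with str():

```python
# -*- coding: utf-8 -*-

def encode_row(items, encoding="UTF-8"):
    """Encode each unicode item of a row to bytes (hypothetical helper).

    A None in the row raises an error, so the real bug shows up
    instead of being turned into the text 'None' by str().
    """
    encoded = []
    for item in items:
        if item is None:
            raise ValueError("missing value in row: %r" % (items,))
        encoded.append(item.encode(encoding))
    return encoded

# Sample data (assumed): unicode in, UTF-8 encoded bytes out.
row = encode_row([u"\xc7a va", u"Nimes"])
```

In Python 2 the csv module works with byte strings, so the result of
encode_row() is what you would hand to c.writerow().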

Do make sure to watch the video as it explains the pain you are having
with unicode.

Greets
Sander