csv reader

Tue Dec 15 17:12:01 EST 2009

Then my problem is different!

In fact I'm reading a CSV file saved from OpenOffice Calc using
UTF-8 encoding. I get a list of lists (let's call it tab) with the CSV
data.
If I do:

print tab[2][4]
In ipython, I get:
equação de Toricelli. Tarefa exercícios PVR 1 e 2 ; PVP 1

If I only do:
tab[2][4]

In ipython, I get:
'equa\xc3\xa7\xc3\xa3o de Toricelli. Tarefa exerc\xc3\xadcios PVR 1 e 2 ; PVP 1'

Does that mean my problem is not the one I thought it was?
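
A small sketch of what the two outputs above suggest, assuming tab[2][4] is a UTF-8 encoded byte string (which is what the Python 2 csv module returns): print writes the raw bytes to a UTF-8 terminal, so they look right, while the bare expression shows the repr with the bytes escaped. The names raw and text are only for illustration.

```python
# -*- coding: utf-8 -*-
# Assumption: this mimics the start of tab[2][4] as a UTF-8 byte string.
raw = b'equa\xc3\xa7\xc3\xa3o'

# Decoding turns the bytes into real text (a unicode object in Python 2).
text = raw.decode('utf-8')

print(len(raw))   # counts bytes
print(len(text))  # counts characters
```

The accented letters take two bytes each in UTF-8, which is why the two lengths differ.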

My real problem appears when I use that kind of UTF-8 encoded (?) string with
selenium here.
Here is a small code example of a non-working case that gives the same
error as my bigger program:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from selenium import selenium
import sys,os,csv,re

class test:
    '''classe para interagir com o sistema acadêmico'''
    def __init__(self):
        self.webpage=''
        self.arquivo=''
        self.script=[]
        self.sel = selenium('localhost', 4444, '*firefox',
                            'http://www.google.com.br')
        self.sel.start()
        self.sel.open('/')
        self.sel.wait_for_page_to_load(30000)
        self.sel.type("q", "equação")
        #self.sel.type("q", u"equacao")
        self.sel.click("btnG")
        self.sel.wait_for_page_to_load("30000")

def main():
    teste=test()

if __name__ == "__main__":
    main()

If I just replace the following line:
self.sel.type("q", "equação")

with:
self.sel.type("q", u"equação")

It works fine!
The problem is that csv.reader gives me a plain "equação" and not a
u"equação".
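
One possible workaround, sketched under the assumption that the file really is UTF-8: decode every cell that csv.reader returns before handing it to selenium. The helper name decode_row is hypothetical, not part of the csv module.

```python
# -*- coding: utf-8 -*-

def decode_row(row, encoding='utf-8'):
    """Return a copy of row with byte-string cells decoded to unicode text.

    Hypothetical helper; cells that are already text are left untouched.
    """
    return [cell.decode(encoding) if isinstance(cell, bytes) else cell
            for cell in row]

# Stand-in for one row as the Python 2 csv.reader would return it.
row = [b'equa\xc3\xa7\xc3\xa3o de Toricelli', b'PVR 1']
decoded = decode_row(row)
```

With something like tab = [decode_row(r) for r in csv.reader(f)], tab[2][4] would then be a unicode string, like the u"equação" literal that works.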

Here is the error given by the failing code (with "equação"):
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (1202, 0))

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)

/home/manu/Labo/Cefetes_Colatina/Scripts/
20091215_test_acentuated_caracters.py in <module>()
     27
     28 if __name__ == "__main__":
---> 29     main()
     30
     31

/home/manu/Labo/Cefetes_Colatina/Scripts/
20091215_test_acentuated_caracters.py in main()
     23
     24 def main():
---> 25     teste=test()
     26
     27

/home/manu/Labo/Cefetes_Colatina/Scripts/
20091215_test_acentuated_caracters.py in __init__(self)
     16         self.sel.open('/')
     17         self.sel.wait_for_page_to_load(30000)
---> 18         self.sel.type("q", "equação")
     19         #self.sel.type("q", u"equacao")
     20         self.sel.click("btnG")

/home/manu/Labo/Cefetes_Colatina/Scripts/selenium.pyc in type(self,
locator, value)
    588         'value' is the value to type
    589         """
--> 590         self.do_command("type", [locator,value,])
    591
    592

/home/manu/Labo/Cefetes_Colatina/Scripts/selenium.pyc in do_command
(self, verb, args)
    201         body = u'cmd=' + urllib.quote_plus(unicode(verb).encode('utf-8'))
    202         for i in range(len(args)):
--> 203             body += '&' + unicode(i+1) + '=' + urllib.quote_plus(unicode(args[i]).encode('utf-8'))
    204         if (None != self.sessionId):
    205             body += "&sessionId=" + unicode(self.sessionId)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
WARNING: Failure executing file:
<20091215_test_acentuated_caracters.py>
Python 2.6.4 (r264:75706, Oct 27 2009, 06:16:59)
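
Reading the traceback, the failure seems to come from unicode(args[i]) in do_command: given a byte string and no encoding, Python 2 falls back to the ASCII codec. A minimal sketch that reproduces the same message without selenium, using an explicit ASCII decode to stand in for that implicit step (an assumption about what happens inside, based only on the traceback above):

```python
# -*- coding: utf-8 -*-
raw = b'equa\xc3\xa7\xc3\xa3o'  # UTF-8 bytes for the word typed into "q"

try:
    # Stand-in for the implicit ASCII decoding done by unicode(raw) in Python 2.
    raw.decode('ascii')
    message = None
except UnicodeDecodeError as err:
    message = str(err)

print(message)
```

Byte 0xc3 at position 4 is the first byte of the "ç", which matches the position reported in the error.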