Spanish Accents

Rami Chowdhury rami.chowdhury at gmail.com
Thu Dec 22 10:30:21 EST 2011


On Thu, Dec 22, 2011 at 15:25, Stan Iverson <iversonstan at gmail.com> wrote:

> On Thu, Dec 22, 2011 at 10:58 AM, Chris Angelico <rosuav at gmail.com> wrote:
>
>> Firstly, are you using Python 2 or Python 3? Things will be slightly
>> different, since the default 'str' object in Py3 is Unicode.
>>
>
> 2
>
>>
>> I would guess that your page is being output as UTF-8; you may find
>> that the solution is as easy as declaring the encoding of your text
>> file when you read it in.
>>
>
> So I tried this:
>
> file = open(p + "2.txt")
> for line in file:
>   print unicode(line, 'utf-8')
>

Could you try using the 'open' function from the 'codecs' module?

import codecs

# "r" is the mode; the encoding is the third argument.
# Use whatever encoding your file is actually written in.
file = codecs.open(p + "2.txt", "r", "utf-8")
for line in file:
    print line
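
As a rough sketch of the codecs.open suggestion (assuming the file really is UTF-8; the sample text, the temporary file and the read_lines helper are made up for illustration and are not from your script), the following writes a small file containing accented characters and reads it back as unicode. It is written so that it should run on both Python 2 and Python 3:

```python
from __future__ import print_function

import codecs
import os
import tempfile

# Hypothetical sample text; stands in for the contents of "2.txt".
SAMPLE = u"Espa\u00f1a\ncami\u00f3n\n"


def read_lines(path, encoding="utf-8"):
    """Return the file's lines decoded to unicode text."""
    # The arguments are (filename, mode, encoding); mode comes second.
    f = codecs.open(path, "r", encoding)
    try:
        return [line for line in f]
    finally:
        f.close()


# Write the sample as UTF-8 bytes to a temporary file.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
out = codecs.open(path, "w", "utf-8")
out.write(SAMPLE)
out.close()

lines = read_lines(path)
os.remove(path)

for line in lines:
    # Encode explicitly so the bytes written match the page's declared charset.
    print(repr(line.encode("utf-8")))
```

The explicit encode at the end is a judgment call: on Python 2, when stdout is not a terminal (as is typical under CGI), printing a unicode object directly may fall back to ASCII and raise UnicodeEncodeError on accented characters.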



>
> and got this error:
>
>  142   print unicode(line, 'utf-8')
>    143
>    144 print '''<br /><br /><form id="signup" action="
> http://13gems.com/Sign_Up.py" method="post" target="_blank">
>  *builtin* *unicode* = <type 'unicode'>, *line* = '<span class="text">\r\n
> '   /usr/lib64/python2.4/encodings/utf_8.py in *decode*(input=<read-only
> buffer ptr 0x2b197e378454, size 21>, errors='strict')    14
>     15 def decode(input, errors='strict'):
>     16     return codecs.utf_16_decode(input, errors, True)
>     17
>     18 class StreamWriter(codecs.StreamWriter):
>  *global* *codecs* = <module 'codecs' from
> '/usr/lib64/python2.4/codecs.pyc'>, codecs.*utf_16_decode* = <built-in
> function utf_16_decode>, *input* = <read-only buffer ptr 0x2b197e378454,
> size 21>, *errors* = 'strict', *builtin* *True* = True
>
> *UnicodeDecodeError*: 'utf16' codec can't decode byte 0x0a in position
> 20: truncated data
>       args = ('utf16', '<span class="text">\r\n', 20, 21, 'truncated
> data')
>       encoding = 'utf16'
>       end = 21
>       object = '<span class="text">\r\n'
>       reason = 'truncated data'
>       start = 20
>
> Tried it with utf-16 with same results.
>
> TIA,
>
> Stan
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
>


-- 
Rami Chowdhury
"Never assume malice when stupidity will suffice." -- Hanlon's Razor
+44-7581-430-517 / +1-408-597-7068 / +88-0189-245544