Spanish Accents
Rami Chowdhury
rami.chowdhury at gmail.com
Thu Dec 22 10:30:21 EST 2011
On Thu, Dec 22, 2011 at 15:25, Stan Iverson <iversonstan at gmail.com> wrote:
> On Thu, Dec 22, 2011 at 10:58 AM, Chris Angelico <rosuav at gmail.com> wrote:
>
>> Firstly, are you using Python 2 or Python 3? Things will be slightly
>> different, since the default 'str' object in Py3 is Unicode.
>>
>
> 2
>
>>
>> I would guess that your page is being output as UTF-8; you may find
>> that the solution is as easy as declaring the encoding of your text
>> file when you read it in.
>>
>
> So I tried this:
>
> file = open(p + "2.txt")
> for line in file:
>     print unicode(line, 'utf-8')
>
Could you try using the 'open' function from the 'codecs' module?
import codecs
# Use whatever encoding your file is actually written in.
file = codecs.open(p + "2.txt", "r", "utf-8")
for line in file:
    print line
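Note that codecs.open takes the mode as its second positional argument and the encoding as its third, so the encoding has to be passed third or by keyword. Here is a minimal, self-contained sketch of the idea. The temporary file and the sample text are made up for illustration and stand in for p + "2.txt"; it is written to run on both Python 2 and Python 3.

```python
import codecs
import os
import tempfile

# Hypothetical sample file standing in for p + "2.txt".
sample_path = os.path.join(tempfile.mkdtemp(), "2.txt")
sample_text = u"Se\u00f1or Garc\u00eda\nma\u00f1ana\n"

# Write the sample as UTF-8 so there is something to read back.
out = codecs.open(sample_path, "w", "utf-8")
out.write(sample_text)
out.close()

# Mode is the second positional argument; the encoding comes third.
infile = codecs.open(sample_path, "r", "utf-8")
lines = [line for line in infile]  # each line is already decoded text
infile.close()

for line in lines:
    # repr() avoids depending on the encoding of stdout.
    print(repr(line))
```

On Python 2, if the decoded lines are then written to a page served as UTF-8, encoding them explicitly (for example line.encode("utf-8")) should avoid relying on the default ASCII codec.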
>
> and got this error:
>
> 142 print unicode(line, 'utf-8')
> 143
> 144 print '''<br /><br /><form id="signup" action="
> http://13gems.com/Sign_Up.py" method="post" target="_blank">
> *builtin* *unicode* = <type 'unicode'>, *line* = '<span class="text">\r\n
> ' /usr/lib64/python2.4/encodings/utf_8.py in *decode*(input=<read-only
> buffer ptr 0x2b197e378454, size 21>, errors='strict') 14
> 15 def decode(input, errors='strict'):
> 16 return codecs.utf_16_decode(input, errors, True)
> 17
> 18 class StreamWriter(codecs.StreamWriter):
> *global* *codecs* = <module 'codecs' from
> '/usr/lib64/python2.4/codecs.pyc'>, codecs.*utf_16_decode* = <built-in
> function utf_16_decode>, *input* = <read-only buffer ptr 0x2b197e378454,
> size 21>, *errors* = 'strict', *builtin* *True* = True
>
> *UnicodeDecodeError*: 'utf16' codec can't decode byte 0x0a in position
> 20: truncated data
> args = ('utf16', '<span class="text">\r\n', 20, 21, 'truncated
> data')
> encoding = 'utf16'
> end = 21
> object = '<span class="text">\r\n'
> reason = 'truncated data'
> start = 20
>
> Tried it with utf-16 with same results.
>
> TIA,
>
> Stan
>
>
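The traceback quoted above names the 'utf16' codec, a 21-byte line, and a failure at position 20. That is the pattern one would expect when a single line with an odd number of bytes is decoded as UTF-16, since UTF-16 consumes two bytes per code unit and the final byte is left over. A small sketch that reproduces it, assuming the line is exactly the bytes shown in the traceback (needs Python 2.6 or later for the b'' literal and "except ... as"):

```python
# The bytes of the line shown in the traceback: 21 bytes, an odd count.
raw_line = b'<span class="text">\r\n'

# Decoding as UTF-16 pairs the bytes up and leaves the last one dangling.
try:
    raw_line.decode("utf-16")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The same bytes are plain ASCII, so they decode cleanly as UTF-8.
text = raw_line.decode("utf-8")

print(len(raw_line))
print(error)
print(repr(text))
```

So the error by itself does not say the file is not UTF-8; it only says that this particular line could not be read as UTF-16.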
--
Rami Chowdhury
"Never assume malice when stupidity will suffice." -- Hanlon's Razor
+44-7581-430-517 / +1-408-597-7068 / +88-0189-245544