[Tutor] Unicode trouble

Kent Johnson kent37 at tds.net
Thu Dec 1 11:48:53 CET 2005


Øyvind wrote:
> I tried errors='replace' as you suggested and the program made it through
> the list. However, here are some results:
> 
> the gjenoppl�et gjenoppl�
> from
> the gjenoppløst	det gjenoppløste
> 
> kan v� konsentrert
> from
> kan være konsentrert

It seems pretty clear that you are using the wrong encoding somewhere.
> 
> I did check the site http://www.columbia.edu/kermit/utf8.html and the
> letters that are the problem here are part of utf-8.

That doesn't mean anything. Pretty much every letter used in every natural language of the world is part of Unicode; that's the point of it. UTF-8 is just one way to encode Unicode as bytes, so it can represent every Unicode character.
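
To make the distinction concrete, here is a minimal sketch. It uses current Python 3 syntax rather than the 2005-era Python of this thread, and the variable names are only illustrative. The same Unicode character becomes different bytes depending on which encoding is used:

```python
# One Unicode character, two different byte encodings.
char = "\u00f8"  # LATIN SMALL LETTER O WITH STROKE, the letter in 'gjenoppløst'

as_utf8 = char.encode("utf-8")      # two bytes
as_latin1 = char.encode("latin-1")  # one byte

print(as_utf8)    # b'\xc3\xb8'
print(as_latin1)  # b'\xf8'

# Each byte string decodes back to the same character only when the
# matching encoding is used.
assert as_utf8.decode("utf-8") == char
assert as_latin1.decode("latin-1") == char
```

So knowing that a letter "is part of utf-8" says nothing about which bytes are actually in the data.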

The important question is, what is the actual encoding of your source data?
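
One rough way to investigate, sketched in Python 3 and assuming the source data is available as raw bytes, is to attempt a strict decode with each candidate encoding. The helper name `strict_decodings` and the sample bytes are hypothetical, not taken from the original program:

```python
def strict_decodings(raw, candidates=("utf-8", "latin-1")):
    """Return {encoding: text} for each candidate that decodes raw strictly."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)  # errors='strict' is the default
        except UnicodeDecodeError:
            pass  # raw is not valid in this encoding
    return results

# Hypothetical sample: 'kan være konsentrert' stored as latin-1 bytes.
sample = "kan v\u00e6re konsentrert".encode("latin-1")

found = strict_decodings(sample)
print(sorted(found))  # ['latin-1']

# errors='replace' does not fix a wrong guess; it only hides the failure
# by substituting U+FFFD for the bytes it could not decode.
hidden = sample.decode("utf-8", errors="replace")
print("\ufffd" in hidden)  # True
```

Note that latin-1 assigns a character to every byte value, so a successful latin-1 decode proves nothing by itself. The informative result is the failed strict utf-8 decode.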
> 
> Is there anything else I could try?

Understand why the above question is important, then answer it. Until you do, you are just thrashing around in the dark.

Do you know what a character encoding is? Do you understand the difference between utf-8 and latin-1?

Kent
-- 
http://www.kentsjohnson.com


