UTF-8 and latin1
Barry Scott
barry at barrys-emacs.org
Tue Oct 25 13:59:06 EDT 2022
> On 25 Oct 2022, at 11:16, Stefan Ram <ram at zedat.fu-berlin.de> wrote:
>
> ram at zedat.fu-berlin.de (Stefan Ram) writes:
>> You can let Python guess the encoding of a file.
>> def encoding_of( name ):
>>     path = pathlib.Path( name )
>>     for encoding in( "utf_8", "cp1252", "latin_1" ):
>>         try:
>>             with path.open( encoding=encoding, errors="strict" )as file:
>
> I also read a book which claimed that the tkinter.Text
> widget would accept bytes and guess whether these are
> encoded in UTF-8 or "ISO 8859-1" and decode them
> accordingly. However, today I found that here it does
> accept bytes but it always guesses "ISO 8859-1".
The best you can do is assume that if the text does not decode as UTF-8, it may be ISO 8859-1.
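
A minimal sketch of that fallback, assuming the data is already in memory as bytes. The name decode_guess is made up for illustration and is not part of any library:

```python
def decode_guess(data):
    """Return (text, encoding_used), trying strict UTF-8 first."""
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return data.decode("utf-8", errors="strict"), "utf-8"
    except UnicodeDecodeError:
        # latin-1 maps every byte value to a code point, so this cannot fail.
        return data.decode("latin-1"), "latin-1"


sample = "AÄäÖöÜüß"
print(decode_guess(sample.encode("utf-8")))
print(decode_guess(sample.encode("latin-1")))
```

Because the latin-1 step always succeeds, the result is only a guess: bytes written in some other 8-bit encoding will decode without error but may show the wrong characters. Decoding to str this way before calling text.insert() avoids relying on whatever tkinter does with bytes.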
Barry
>
> main.py
>
> import tkinter
>
> text = tkinter.Text()
> text.insert( tkinter.END, "AÄäÖöÜüß".encode( encoding='ISO 8859-1' ))
> text.insert( tkinter.END, "AÄäÖöÜüß".encode( encoding='UTF-8' ))
> text.pack()
> print( text.get( "1.0", "end" ))
>
> output
>
> AÄäÖöÜüßAÄäÖöÜüß
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list