[Tutor] How does len() compute length of a string in UTF-8, 16, and 32?

Thu Aug 10 23:27:50 EDT 2017

On Fri, Aug 11, 2017 at 2:34 AM, Cameron Simpson <cs at cskk.id.au> wrote:
>
> In files however, the default encoding for text files is 'utf-8': Python
> will read the file's bytes as UTF-8 data and will write Python string
> characters in UTF-8 encoding when writing.

The default encoding for source files is UTF-8. The default encoding
for text files is the locale's preferred encoding, i.e.
locale.getpreferredencoding(False). On Windows this is the locale's
ANSI codepage (e.g. 1252 in Western Europe). Also, the default newline
string when writing text files is os.linesep: each "\n" written is
translated to it. On Windows this is "\r\n".
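
A quick way to check these defaults on a given machine is sketched
below. It is only an illustration: the file name demo.txt is made up,
and the printed values depend on the platform and the Python version
(newer releases can run in UTF-8 mode, which overrides the locale).

```python
import locale
import os
import tempfile

# Encoding the locale prefers; open() falls back to this when no
# encoding= argument is given (unless UTF-8 mode is active).
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode, no encoding= and no newline= arguments: both defaults apply.
with open(path, "w") as f:
    print(f.encoding)        # encoding actually chosen for this file
    f.write("line one\n")    # "\n" is translated to os.linesep on write

# Binary mode shows the bytes that really ended up on disk.
with open(path, "rb") as f:
    raw = f.read()

print(repr(os.linesep))
print(raw)
```

To avoid depending on either default, pass both explicitly, e.g.
open(path, "w", encoding="utf-8", newline="\n").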