Guido has asked me to do some research in aid of a file encoding detection/defaulting PEP.<br><br>I only have access to a small number of operating systems and language variants so I need help.<br><br>If you have access to "German Windows XP", "Japanese Windows XP", "Spanish OS X", "Japanese OS X", "German Ubuntu" etc., I would appreciate answers to the following questions.
<br><br>1. On US English Windows, Notepad defaults to an encoding called "ANSI". "ANSI" is not a real encoding at all (and certainly not one from the American National Standards Institute -- they should sue!). "ANSI" is just the default Windows character set for your locale. What does "ANSI" map to in European and Asian versions of Windows?
<br><br>2. On my English Mac, the default character set for TextEdit is "Mac OS Roman". What is it for foreign-language Macs? What API does an application use to query this default character set? What setting is it derived from? Is it the Unix-level locale (seems not!) or some GUI-level setting (and if so, which one)?
<br><br>3. In general, how do modern versions of Linux and other Unix systems handle this issue? In particular: what is your default encoding and how did your operating system determine it? Did you install a locale-specific version? Did the installer ask you? Did you edit a configuration file? Did you change a GUI setting? What is the relationship between your localization of GNOME/KDE and your default encoding?
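<br><br>If it helps when answering, here is a minimal sketch of a script that prints what Python itself believes about the default encoding. The function name `report_default_encoding` is made up for this post. The script only reports what the standard `locale`, `os` and `sys` modules expose, so it will not necessarily match what Notepad or TextEdit use. On Windows, the preferred encoding it reports is usually the "ANSI" code page, but please treat that as an assumption to confirm rather than a fact.

```python
import locale
import os
import sys


def report_default_encoding():
    """Collect the settings Python can see that influence the default text encoding."""
    info = {
        "platform": sys.platform,
        # Encoding Python would pick for text data, based on the user's locale.
        # Passing False avoids changing the process locale as a side effect.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used to convert file names to and from bytes.
        "filesystem_encoding": sys.getfilesystemencoding(),
    }
    # Unix-level locale variables; None means the variable is not set.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        info[name] = os.environ.get(name)
    return info


if __name__ == "__main__":
    for key, value in report_default_encoding().items():
        print(f"{key}: {value}")
```

Pasting the output together with your OS name, version and language variant would make the answers much easier to compare.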
<br><br> Paul Prescod<br><br>