Try this

mensanator at aol.com
Sun Sep 16 18:53:53 EDT 2007


On Sep 16, 5:28 pm, John Machin <sjmac... at lexicon.net> wrote:
> On Sep 17, 7:54 am, "mensana... at aol.com" <mensana... at aol.com> wrote:
>
> > On Sep 16, 2:22 pm, Steve Holden <st... at holdenweb.com> wrote:
>
> > > mensana... at aol.com wrote:
> > > > On Sep 16, 1:10 pm, Dennis Lee Bieber <wlfr... at ix.netcom.com> wrote:
> > > >> On Sun, 16 Sep 2007 01:46:34 -0700, GeorgeRXZ <george... at gmail.com>
> > > >> declaimed the following in comp.lang.python:
>
> > > >>> Then Open the Notepad and type the following sentence, and save the
> > > >>> file and close the notepad. Now reopen the file and you will find out
> > > >>> that, Notepad is not able to save the following text line.
> > > >>> Well you are speed
> > > >>> This occurs not only with above sentence but any sentence that has
> > > >>> 4 3 3 5 (sequence of characters: Well=4 you=3 are=3 speed=5)
> > > >>         I tried. I also opened the saved file in SciTE...
> > > >> And the text WAS there...
>
> > > >>         It is Notepad that can not properly render what it,
> > > >> itself, saved.
>
> > > > C:\Documents and Settings\mensanator\My Documents>type huh.txt
> > > > Well you are speed
>
> > > > Yes, file was saved correctly.
> > > > But reopening it shows 9 unprintable characters.
> > > > If I copy those to a new file (huh1.txt):
>
> > > > C:\Documents and Settings\mensanator\My Documents>type huh1.txt
> > > > ?????????
>
> > > > But wait...the new file is 20 characters, not 9.
>
> > > > 09/16/2007  01:44 PM                18 huh.txt
> > > > 09/16/2007  01:54 PM                20 huh1.txt
>
> > > > C:\Documents and Settings\mensanator\My Documents>dump huh.txt
> > > > huh.txt:
> > > > 00000000  5765 6c6c 2079 6f75 2061 7265 2073 7065 Well you are spe
> > > > 00000010  6564                                    ed
>
> > > > Here's what it's actually doing:
>
> > > > C:\Documents and Settings\mensanator\My Documents>dump huh1.txt
> > > > huh1.txt:
> > > > 00000000  fffe 5765 6c6c 2079 6f75 2061 7265 2073 .~Well you are s
> > > > 00000010  7065 6564                               peed
>
> > > One word: Unicode.
>
> > > The "open" and "save" dialogs allow you to specify an encoding.
>
> > And the encoding specified was ANSI.
>
> > > If you
> > > specify Unicode then you will get what you see above.
>
> > And if you specify ANSI _before_ you click the file name,
> > the specification switches to Unicode and has to then
> > be manually switched back to ANSI.
>
> > > If you specify ANSI
> > > you will get the text you entered.
>
> > It's still a bug in the "open" dialog.
>
> It's more like a bug/feature in its encoding detector.

It is NOT a feature. If I save something as ANSI,
there is no excuse for it not to re-open in ANSI.

> I can get it to
> switch to Unicode only if there's an even number of characters AND the
> line is NOT terminated by CRLF -- add/remove one alpha character, or
> hit the enter key at the end of the line, and it won't detect it as
> Unicode when you open it again.
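
The even-length part, at least, is easy to check from Python. This is only a sketch of the byte arithmetic, not of whatever Notepad's detector really does; fits_utf16 is a made-up helper name. An odd number of bytes can't be split into 16-bit code units, so it can never decode as UTF-16.

```python
def fits_utf16(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-16 little-endian.

    Hypothetical helper for illustration; it does not model Notepad.
    """
    try:
        data.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


even_line = b"Well you are speed"     # 18 bytes: pairs up into 9 code units
odd_line = b"Well you are speeds"     # 19 bytes: one byte left over

print(fits_utf16(even_line))
print(fits_utf16(odd_line))
```

Note that the same line with CRLF appended is 20 bytes and still decodes, so the CRLF case must be rejected by some other check that this sketch doesn't try to guess at.
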
>
> You only get the BOM (0xfffe) if you are silly enough to save it while
> it's open in Unicode mode.

That was a test. I wasn't so stupid as to save
to the original file, but to make a copy.
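
For anyone who wants to check the byte counts without Notepad, here is a minimal Python sketch. It assumes the "Unicode" option means UTF-16 little-endian with a 2-byte BOM, which is what the huh1.txt dump suggests; the variable names are mine.

```python
# The 18 bytes that were saved to huh.txt as plain ANSI (here: ASCII).
ansi_bytes = "Well you are speed".encode("ascii")

# Reading the same bytes as UTF-16 little-endian pairs them up:
# 18 bytes become 9 code units, none of them Latin letters.
misread = ansi_bytes.decode("utf-16-le")

# Saving the misread text as "Unicode" puts the BOM (ff fe) in front
# and writes the 9 code units back out unchanged.
resaved = b"\xff\xfe" + misread.encode("utf-16-le")

print(len(ansi_bytes), len(misread), len(resaved))
print(resaved.hex())
```

That gives 18 bytes in, 9 characters on screen, and 20 bytes in the copy, matching the directory listing and the dump of huh1.txt.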

>
> > > By the way, this has precisely what to do with Python?
>
> > I've been known to use Notepad to create Python
> > source code.
>
> Your source code would have to be trivially short to trigger the
> strange behaviour.

Makes you wonder what other edge cases aren't
handled properly.

Makes you wonder why Microsoft doesn't employ
professional programmers.



