python33, windows, UnicodeEncodeError: 'charmap' codec can't encode characters in position, to print out the file contents to stdout,

wxjmfauth at gmail.com
Mon Jul 7 08:58:05 CEST 2014


On Sunday, 6 July 2014 21:37:36 UTC+2, Rick Johnson wrote:
> On Sunday, July 6, 2014 1:14:38 PM UTC-5, wxjm... at gmail.com wrote:
> 
> > On Sunday, 6 July 2014 18:53:34 UTC+2, Rick Johnson wrote:
> 
> > [...]
> 
> > 
> 
> > > Seems like she'd better do the decoding before printing
> 
> > No
> 
> > 
> 
> > > or am i wrong again?
> 
> > Yes
> 
> > 
> 
> > >>> s = 'abc需'
> 
> > >>> sys.stdout.encoding
> 
> > '<unicode>'
> 
> > >>> print(s)
> 
> > abc需
> 
> > >>> sys.stdout.encoding = 'cp437'
> 
> > >>> sys.stdout.encoding
> 
> > 'cp437'
> 
> > >>> print(s)
> 
> > Traceback (most recent call last):
> 
> >   File "<eta last command>", line 1, in <module>
> 
> >   File "D:\jm\jmpy\eta\eta40beta2\etastdio.py", line 158, in write
> 
> >     s = s.encode(self.pencoding).decode('cp1252')
> 
> >   File "C:\Python32\lib\encodings\cp437.py", line 12, in encode
> 
> >     return codecs.charmap_encode(input,errors,encoding_map)
> 
> > UnicodeEncodeError: 'charmap' codec can't encode characters in position 4-5: character maps to <undefined>
> 
> > >>> print(s.encode(sys.stdout.encoding, 'replace'))
> 
> > 'abcé??'
> 
> > >>> sys.stdout.encoding = 'cp850'
> 
> > >>> sys.stdout.encoding
> 
> > 'cp850'
> 
> > >>> print(s)
> 
> > Traceback (most recent call last):
> 
> >   File "<eta last command>", line 1, in <module>
> 
> >   File "D:\jm\jmpy\eta\eta40beta2\etastdio.py", line 158, in write
> 
> >     s = s.encode(self.pencoding).decode('cp1252')
> 
> >   File "C:\Python32\lib\encodings\cp850.py", line 12, in encode
> 
> >     return codecs.charmap_encode(input,errors,encoding_map)
> 
> > UnicodeEncodeError: 'charmap' codec can't encode characters in position 4-5: character maps to <undefined>
> 
> > >>> print(s.encode(sys.stdout.encoding, 'replace'))
> 
> > 'abcé??'
> 
> > >>> # and so on
> 
> > >>> sys.stdout.encoding = 'cp1252')
> 
> >   File "<eta last command>", line 1
> 
> >     sys.stdout.encoding = 'cp1252')
> 
> >                                   ^
> 
> > SyntaxError: invalid syntax
> 
> > >>> # oops
> 
> > >>> sys.stdout.encoding = 'cp1252'
> 
> > >>> print(s)
> 
> > abc需
> 
> > >>> sys.stdout.encoding = 'mac-roman'
> 
> > >>> print(s)
> 
> > abcŽÏÛ
> 
> > >>> print(s.encode(sys.stdout.encoding, 'replace'))
> 
> > 'abc需'
> 
> > >>> sys.stdout.encoding = 'utf-8'
> 
> > >>> print(s.encode(sys.stdout.encoding, 'replace'))
> 
> > 'abc需'
> 
> > >>> sys.stdout.encoding = 'utf-16-le'
> 
> > >>> print(s.encode(sys.stdout.encoding, 'replace'))
> 
> > 'abc需'
> 
> > >>> sys.stdout.encoding = 'utf-32-be'
> 
> > >>> print(s.encode(sys.stdout.encoding, 'replace'))
> 
> > 'abc需'
> 
> > jmf
> 
> 
> 
> Oh my, all that code just so you can handle glyphs with
> 
> squiggly little accent marks? Are you so afraid you'll
> 
> forget how to pronounce the words without them? I wonder how
> 
> those simpleton Americans are able to pronounce words without
> 
> a tutorial? Boggles the mind really, BOGGLES THE MIND!
> 
> 
> 
> For the remainder of your bloated Unicode char set that
> 
> defines symbols, and cute little miniature fractions, and
> 
> snowmen, and all sorts of ridiculous scrawl... what a waste
> 
> of time!
> 
> 
> 
> You know, instead of bending over backwards to "include"
> 
> every selfish char man in his complete and utter stupidity
> 
> can muster, when are we going to realize that keyboards can
> 
> only contain a very *finite* number of keys before they
> 
> become unusable. If you want to draw shapes and scrawl to a
> 
> screen, USE THE CORRECT DAMN TOOL!
> 
> 
> 
> In this day and age of "globalism", when are we going to
> 
> unite under a single form of written communication. And if
> 
> not for the sake of programmers, i can think of an even more
> 
> important reason,,, for the sake of miscommunications and
> 
> animosity. How many fights and wars have been started simply
> 
> on the grounds of a miscommunication?
> 
> 
> 
>     "THE MINDS OF LITTLE MEN ARE CONSUMED WITH 
> 
>     EMOTIONAL IDENTITIES AND THE PURSUIT OF CREATING 
> 
>     SYMBOLS THAT DEFINE THOSE IDENTITIES"
> 
>     
> 
> The true free man does not belong to any "imaginary" group,
> 
> he does not pay allegiance to any one country, or any one
> 
> religion, or any one sports team (if at all), no, he is free
> 
> and belongs to a group defined by a single word, a group from
> 
> which he does NOT choose, but a group from which he is
> 
> *BOUND*:
> 
> 
> 
>     THAT GROUP BE HUMANITY!
> 
> 
> 
> My keyboard has every char i could ever need to express
> 
> myself sufficiently across a medium that requires redundant
> 
> "pecking" of keys on a keyboard. This is the form of
> 
> communication we have achieved thus far, so until we can
> 
> evolve past this "pecking olympics", we would be wise to keep
> 
> our pecking to a minimum and employ a ubiquitous written
> 
> language that is elegant and simplistic, and optimized for
> 
> typing.
> 
> 
> 
> I'm sorry but i guess i'm just too practical for all this
> 
> nonsense. To me Unicode is just like automobiles. Everyone
> 
> has their own color and brand, and this one has a butt
> 
> warmer and that one has a vanity mirror, when in fact all
> 
> automobiles serve the same purpose of transportation.
> 
> 
> 
> Although, unlike automobiles where i can choose to drive
> 
> ONLY my car, with Unicode i'm forced to interface with your
> 
> selfish idea of what a car should be. I just want to get
> 
> from point A to point B without being pulled over and
> 
> arrested because you were "partying" with three prostitutes
> 
> who forgot their crack pipe under the seat!
> 
> 
> 
> But don't bother listening to me anyway, and go on with your
> 
> selfish pursuits, continue dividing people instead of
> 
> uniting them, continue creating a world that is superfluously
> 
> complex by your own hand -- but don't be surprised when this
> 
> monstrosity comes crashing down on top of you!
> 
> 
> 
>     THERE IS PRIDE BEFORE THE FALL

--------

I forgot to mention: the example I gave came from
one of my interactive Python interpreters (a GUI
app).

As you have certainly noticed, it can change on
the fly the encoding of the "device" that will
receive a Unicode string. In effect, it mimics any
device that may receive Unicode text (terminal,
file, db, GUI text widget, ...), for which that
text has to be encoded.
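
A minimal sketch of the same idea in plain Python 3, for
readers without that interpreter. It assumes an in-memory
byte buffer is an acceptable stand-in for a "device";
write_to_device is a hypothetical helper written for this
illustration, not part of the interpreter shown in the
session above.

```python
import io

# The string from the quoted session: 'abc' followed by
# e-acute, the oe ligature and the euro sign.
s = 'abc\u00e9\u0153\u20ac'


def write_to_device(text, encoding, errors='strict'):
    """Write text to an in-memory "device" and return the bytes it received.

    The device is a BytesIO wrapped in a TextIOWrapper, which encodes
    text the same way a file or a terminal stream does on write().
    """
    raw = io.BytesIO()
    device = io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline='')
    device.write(text)
    device.flush()
    return raw.getvalue()


for enc in ('cp437', 'cp850', 'cp1252', 'mac-roman', 'utf-8'):
    try:
        write_to_device(s, enc)  # strict: raises if a character has no mapping
        status = 'ok'
    except UnicodeEncodeError as exc:
        status = 'cannot encode positions %d-%d' % (exc.start, exc.end - 1)
    # With errors='replace', unencodable characters are written as '?'.
    print(enc, status, write_to_device(s, enc, errors='replace'))
```

With the strict handler, cp437 and cp850 reject the last two
characters, as in the traceback quoted above; with 'replace'
the write succeeds and those characters are written as '?'.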

The characters I'm using are simply very
representative of Unicode, and I deliberately chose
them with care.
Glyphs have nothing to do with this. But if you
want examples with other chars/fonts, such as
Junicode (medieval scripts), STIX (math symbols),
polytonic Greek or, why not, Urdu script (nastaliq,
naskh [*]), ...

https://medium.com/@eteraz/the-death-of-the-urdu-script-9ce935435d90

[*] not yet tested.

jmf



