Pound sign problem
Tim Chase
python.list at tim.thechases.com
Tue Apr 11 12:55:06 EDT 2017
On 2017-04-12 02:29, Steve D'Aprano wrote:
> >> In 2017, unless you are reading from old legacy files created
> >> using a non-Unicode encoding, you should just use UTF-8.
> >
> > Thanks for your opinion. My opinion differs.
>
> What would you suggest then, if not UTF-8?
>
> My personal favourite legacy encoding is MacRoman, but I wouldn't
> recommend anyone use it except to interoperate with legacy Mac
> applications and/or data from the 80s and 90s.
>
> What's your recommendation? "Anything but ASCII"?
Heh, how about "Unicode as ASCII-compatible-Python-strings"? ;-)
Got this from Peter Otten a while back in response to my request for
functionality along these lines:
http://www.mail-archive.com/python-list@python.org/msg420100.html
-tkc
$ cat codecs_mynamereplace.py
# -*- coding: utf-8 -*-
import codecs
import unicodedata

try:
    codecs.namereplace_errors
except AttributeError:
    print("using mynamereplace")
    def mynamereplace(exc):
        return u"".join(
            "\\N{%s}" % unicodedata.name(c)
            for c in exc.object[exc.start:exc.end]
        ), exc.end
    codecs.register_error("namereplace", mynamereplace)

print(u"mañana".encode("ascii", "namereplace").decode())
$ python3.5 codecs_mynamereplace.py
ma\N{LATIN SMALL LETTER N WITH TILDE}ana
$ python3.4 codecs_mynamereplace.py
using mynamereplace
ma\N{LATIN SMALL LETTER N WITH TILDE}ana
$ python2.7 codecs_mynamereplace.py
using mynamereplace
ma\N{LATIN SMALL LETTER N WITH TILDE}ana
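
For what it's worth, the escaped output above can usually be turned back into the original text with the standard "unicode_escape" codec, since it understands \N{...} names. Below is a minimal Python 3 sketch of that round trip. The handler name "demo_namereplace" and the function name_replace are made up for illustration (so the built-in "namereplace" is left alone); this is not part of Peter's script. It assumes the text contains no literal backslashes and that every non-ASCII character has a Unicode name, otherwise unicodedata.name() raises ValueError.

```python
import codecs
import unicodedata


def name_replace(exc):
    # Error handler: swap each unencodable character for a \N{NAME} escape.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    replacement = "".join(
        "\\N{%s}" % unicodedata.name(c)
        for c in exc.object[exc.start:exc.end]
    )
    # Return the replacement text and the position at which to resume encoding.
    return replacement, exc.end


# Hypothetical handler name, chosen so it cannot clash with the built-in one.
codecs.register_error("demo_namereplace", name_replace)

text = "ma\u00f1ana"
encoded = text.encode("ascii", "demo_namereplace")
print(encoded.decode("ascii"))

# "unicode_escape" interprets \N{...}, so this recovers the original string
# as long as the input had no literal backslashes of its own.
restored = encoded.decode("unicode_escape")
print(restored == text)
```

The round trip is not safe in general: a backslash already present in the text is passed through unescaped, so decoding with "unicode_escape" may misread it.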