[Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

Victor Stinner victor.stinner at gmail.com
Thu Dec 8 05:27:48 EST 2016


FYI you can also get a character by its name:

>>> import unicodedata
>>> unicodedata.name(chr(1040))
'CYRILLIC CAPITAL LETTER A'
>>> "\N{CYRILLIC CAPITAL LETTER A}"
'А'
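
Going the other way, from a name to a character at run time, is also possible. A minimal sketch, assuming only the standard unicodedata module; the variable name ch is just for illustration:

```python
import unicodedata

# unicodedata.lookup() is the inverse of unicodedata.name():
# it maps a character name to the character itself.
ch = unicodedata.lookup("CYRILLIC CAPITAL LETTER A")

# The result is the same character as the decimal and \N{...} spellings.
assert ch == chr(1040) == "\N{CYRILLIC CAPITAL LETTER A}"
print(ch, ord(ch))
```

lookup() raises KeyError when the name is unknown, so it is best wrapped in try/except if the name comes from user input.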

Victor

2016-12-08 0:52 GMT+01:00 Mikhail V <mikhailwas at gmail.com>:
> In a past discussion about inputting and printing characters,
> I proposed decimal notation instead of hex.
> Since the discussion got lost in off-topic talk, I'll try to
> summarise my idea better.
>
> I use ASCII only for code input (there are good reasons for that).
> Here I'll use Python 3.6 and Windows 7, so I can use print() with Unicode
> directly and it now works in the system console.
>
> Suppose I am just starting to program and want to do some character
> manipulation. The very first thing I would probably start with is a simple
> output of the Latin and Cyrillic capital letters:
>
> caps_lat = ""
> for o in range(65, 91):
>     caps_lat = caps_lat + chr(o)
> print(caps_lat)
>
> caps_cyr = ""
> for o in range(1040, 1072):
>     caps_cyr = caps_cyr + chr(o)
> print(caps_cyr)
>
>
> Which prints:
> ABCDEFGHIJKLMNOPQRSTUVWXYZ
> АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
>
>
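
The same two strings can be built without the explicit loops. This is only a sketch of one common idiom (str.join over a generator), using the same decimal ranges as the quoted code:

```python
# Build each alphabet in one expression from the same decimal ranges.
# range() excludes its end value, so 91 and 1072 are not included.
caps_lat = "".join(chr(o) for o in range(65, 91))
caps_cyr = "".join(chr(o) for o in range(1040, 1072))

print(caps_lat)
print(caps_cyr)
```

Note that the Cyrillic range 1040..1071 does not contain Ё, which sits at code point 1025.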
> Say I now want to enter something directly in code:
>
> s = "first cyrillic letters: " + chr(1040) + chr(1041) + chr(1042)
>
> This works fine and looks clean. However, it is not very convenient
> because it takes a lot of typing, and if I generate such strings
> it adds a bit more complexity. But in general it is fine, and I
> currently use this method.
>
> =========
> Proposal: I would like to be able to enter characters *by decimals*:
>
> s = "first cyrillic letters: \{1040}\{1041}\{1042}"
> or:
> s = "first cyrillic letters: \(1040)\(1041)\(1042)"
>
> =========
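
The proposed notation can be approximated today with a small helper that post-processes a raw string. This is only a sketch: expand_decimal and _DECIMAL_ESCAPE are hypothetical names, not an existing Python feature, and only the \{N} spelling from the proposal is handled.

```python
import re

# Matches a literal backslash followed by {digits}, e.g. \{1040}.
_DECIMAL_ESCAPE = re.compile(r"\\\{(\d+)\}")


def expand_decimal(text):
    # Hypothetical helper: replace each \{N} with chr(N),
    # where N is a decimal code point.
    return _DECIMAL_ESCAPE.sub(lambda m: chr(int(m.group(1))), text)


# A raw string keeps the backslashes for the helper to process.
s = expand_decimal(r"first cyrillic letters: \{1040}\{1041}\{1042}")
print(s)
```

chr() raises ValueError for numbers above 1114111 (0x10FFFF), so the helper fails loudly on an out-of-range code point.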
>
> This is more compact and does not seem to conflict with the
> current Python escape sequences in string literals,
> since a backslash already starts an escape in most cases.
>
> For me the most important point is that this way I would avoid
> any hex numbers in strings, which I find very good
> for readability. It is also very convenient for me, since I use decimals
> for processing everywhere (and encourage everyone to do so).
>
> So this is my proposal, any comments on this are appreciated.
>
>
> PS:
>
> Currently Python 3 supports these in addition to \x:
> (from https://docs.python.org/3/howto/unicode.html)
> """
> If you can’t enter a particular character in your editor or want to keep
> the source code ASCII-only for some reason, you can also use escape
> sequences in string literals.
>
> >>> "\N{GREEK CAPITAL LETTER DELTA}"  # Using the character name
> >>> "\u0394"                          # Using a 16-bit hex value
> >>> "\U00000394"                      # Using a 32-bit hex value
>
> """
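
For comparison with the decimal style, all three documented escapes denote the same character as chr(916), because hex 394 is decimal 916. A short check, assuming nothing beyond built-ins:

```python
# chr() takes the code point as an integer, here written in decimal.
delta = chr(916)

# The name-based and hex-based escapes all spell the same character.
assert delta == "\N{GREEK CAPITAL LETTER DELTA}" == "\u0394" == "\U00000394"

# Converting between the two notations for the code point.
print(hex(916), int("0394", 16))
```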
> So I have many possibilities, and all of them strangely contradict
> my idea of intuitive and readable. Well, using the character name is
> readable, but it is not really a practical solution for input, although
> it could be very useful for printing a description of a character.
>
>
> Mikhail
