Mikhail V wrote:
if "\u1230" <= c <= "\u123f":
o = ord(c)
if 100 <= o <= 150:
Note that, if need be, you could also write that as
if 0x64 <= o <= 0x96:
So yours is valid code, but to me it's freaky, and I'll surely stick to the second variant.
The thing is, where did you get those numbers from in the first place?
If you got them in some way that gives them to you in decimal, such as print(ord(c)), there is nothing to stop you from writing them as decimal constants in the code.
But if you got them e.g. by looking up a character table that gives them to you in hex, you can equally well put them in as hex constants. So there is no particular advantage either way.
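To make the point concrete, here is a minimal sketch (my own illustration, not code from the thread) showing that Python treats the two spellings identically, and that converting between them is a one-liner each way:

```python
# The same code point can be written in decimal or hex;
# the choice is purely about readability.
o = ord("d")            # 100
assert o == 100 == 0x64

# Converting between the two spellings is trivial:
assert hex(150) == "0x96"
assert int("0x96", 16) == 150
```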
You said, I can better see which Unicode page I am in by looking at the hex ordinal, but I hardly need that; I just need to know one integer, namely where some range begins, that's it. Furthermore, this is the code an average programmer would better read and maintain.
To a maintainer who is familiar with the layout of the Unicode code space, the hex representation of a character is likely to have some meaning, whereas the decimal representation will not. So for that person, using decimal would make the code *harder* to maintain.
To a maintainer who doesn't have that familiarity, it makes no difference either way.
So your proposal would result in a *decrease* of maintainability overall.
If I make a mistake or typo, or want to expand the range by some value, I need to do sum and subtract operations in my head to progress with my code effectively. Is it clear now what I mean by conversions back and forth?
Yes, but in my experience the number of times I've had to do that kind of arithmetic with character codes is very nearly zero. And when I do, I'm more likely to get the computer to do it for me than work out the numbers and then type them in as literals. I just don't see this as being anywhere near being a significant problem.
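A sketch of what "getting the computer to do it" might look like: widening a range by 5 on each side without doing any mental arithmetic (the bounds and the helper name are hypothetical, chosen just for illustration):

```python
# Start from the known bounds and let Python do the arithmetic.
lo, hi = 0x64, 0x96          # 100..150
lo, hi = lo - 5, hi + 5      # widen to 95..155, no mental math needed

def in_range(c, lo=lo, hi=hi):
    """Check whether a character's code point falls in the range."""
    return lo <= ord(c) <= hi

assert in_range("a")             # ord("a") == 97, inside 95..155
assert not in_range("\u1230")    # 0x1230 == 4656, well outside
```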
In standard ASCII there are enough glyphs that would work way better together,
Out of curiosity, what glyphs do you have in mind?
ұұ-ұ ---- ---- ---ұ
you can downscale the strings, so a 16-bit value would be ~60 pixels wide
Yes, you can make the characters narrow enough that you can take 4 of them in at once, almost as though they were a single glyph... at which point you've effectively just substituted one set of 16 glyphs for another. Then you'd have to analyse whether the *combined* 4-element glyphs were easier to distinguish from each other than the ones they replaced. Since the new ones are made up of repetitions of just two elements, whereas the old ones contain a much more varied set of elements, I'd be skeptical about that.
BTW, your choice of ұ because of its "peak readability" seems to be a case of taking something out of context. The readability of a glyph can only be judged in terms of how easy it is to distinguish from other glyphs. Here, the only thing that matters is distinguishing it from the other symbol, so something like "|" would perhaps be a better choice.
||-| ---- ---- ---|
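For what it's worth, the two-symbol display being discussed is easy to generate mechanically. A minimal sketch (the function name is my own, not from the thread) rendering a 16-bit value with "|" for 1 and "-" for 0, grouped in fours:

```python
def bar_notation(value):
    """Render a 16-bit value as |/- groups of four bits."""
    bits = format(value, "016b")                      # e.g. "1101000000000001"
    groups = [bits[i:i + 4] for i in range(0, 16, 4)]
    return " ".join(groups).replace("1", "|").replace("0", "-")

print(bar_notation(0xD001))  # -> "||-| ---- ---- ---|"
```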
So if you are more than 40 years old (sorry for the familiarity), this can be a really strong issue, and unfortunately one that is hardly changeable.
Sure, being familiar with the current system means that it would take me some effort to become proficient with a new one.
What I'm far from convinced of is that I would gain any benefit from making that effort, or that a fresh person would be noticeably better off if they learned your new system instead of the old one.
At this point you're probably going to say "Greg, it's taken you 40 years to become that proficient in hex. Someone learning my system would do it much faster!"
Well, no. When I was about 12 I built a computer whose only I/O devices worked in binary. From the time I first started toggling programs into it to the time I had the whole binary/hex conversion table burned into my neurons was maybe about 1 hour. And I wasn't even *trying* to memorise it, it just happened.
It is not about speed, it is about brain load. Chinese readers can read their characters fast, but the cognitive load on the brain is 100 times higher than with the current Latin set.
Has that been measured? How?
This one sets off my skepticism alarm too, because people that read Latin scripts don't read them a letter at a time -- they recognise whole *words* at once, or at least large chunks of them. The number of English words is about the same order of magnitude as the number of Chinese characters.
I know people who can read bash scripts fast, but would you claim that bash syntax can be any good compared to Python syntax?
For the things that bash was designed to be good for, yes, it can. Python wins for anything beyond very simple programming, but bash wasn't designed for that. (The fact that some people use it that way says more about their dogged persistence in the face of adversity than it does about bash.)
I don't doubt that some sets of glyphs are easier to distinguish from each other than others. But the letters and digits that we currently use have already been pretty well optimised by scribes and typographers over the last few hundred years, and I'd be surprised if there's any *major* room left for improvement.
Mixing up letters and digits is certainly jarring to many people, but I'm not sure that isn't largely just because we're so used to mentally categorising them into two distinct groups. Maybe there is some objective difference that can be measured, but I'd expect it to be quite small compared to the effect of these prior "habits" as you call them.