Instagram: 40% Py3 to 99% Py3 in 10 months (Posting On Python-List Prohibited)
Steve D'Aprano
steve+python at pearwood.info
Thu Jun 22 09:33:26 EDT 2017
On Wed, 21 Jun 2017 09:23 am, Lawrence D’Oliveiro wrote:
> Though the Perl 6 folks claim their approach (encoding “characters” rather
> than “code points”) is superior.
Can you explain what you are referring to precisely?
According to the Perl 6 docs, they do encode code points, not "characters"
(which is an ill-defined concept, and besides some Unicode code points are not
characters at all).
http://www.unicode.org/faq/private_use.html#noncharacters
For example:
https://docs.perl6.org/language/unicode
talks about code points. The very first section is titled "Entering Unicode
Codepoints and Codepoint Sequences".
Likewise there is a method "codes" which returns the number of code points in a
string:
https://docs.perl6.org/routine/codes
On the other hand there is also a method "chars" which returns the number
of "characters" (graphemes? grapheme clusters? it doesn't specify) in the
string.
https://docs.perl6.org/routine/chars
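To make the distinction concrete on the Python side, here is a rough sketch, not anything taken from the Perl 6 docs. Python's len() counts code points. The helper count_base_characters is a hypothetical name, and it only approximates "characters" by skipping combining marks; it is not a real grapheme cluster segmentation as defined by Unicode UAX #29.

```python
import unicodedata

def count_code_points(s):
    # A Python 3 str is a sequence of code points, so len() counts them.
    return len(s)

def count_base_characters(s):
    # Crude approximation only: count code points whose canonical
    # combining class is 0, i.e. skip combining marks. Real grapheme
    # cluster rules (UAX #29) are more involved than this.
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)

# 'e' followed by U+0301 COMBINING ACUTE ACCENT
sample = "e\u0301"
code_points = count_code_points(sample)
base_characters = count_base_characters(sample)
```

For that sample the two counts differ, which is presumably the kind of difference the "codes" and "chars" methods are meant to expose.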
Anyone here got Perl 6 installed and can try it out? How many "characters" does
it think the string "a\uFDD5\uFDD6z" contains?
- if it says 4, that's the number of code points;
- if it says 2, that's the number of characters less the number of
noncharacters.
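For comparison, this is how the two candidate answers fall out in Python 3. It is only a sketch: is_noncharacter is a hypothetical helper written for this example, based on the Unicode definition of the 66 noncharacters (U+FDD0..U+FDEF plus the last two code points of each plane). It says nothing about what Perl 6 actually reports.

```python
def is_noncharacter(ch):
    # Noncharacters are U+FDD0..U+FDEF, plus the code points ending in
    # FFFE or FFFF in each of the 17 planes.
    cp = ord(ch)
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

s = "a\uFDD5\uFDD6z"

# Number of code points in the string.
code_point_count = len(s)

# Number of code points that are not noncharacters.
character_count = sum(1 for ch in s if not is_noncharacter(ch))
```

Here code_point_count corresponds to the first answer above and character_count to the second.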
--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.