On Wed, Oct 23, 2019 at 12:02:37PM -0400, David Mertz wrote:
colors2 = "cyan forest green burnt umber".split() # oops, not what I wanted, quote each separately
Ha, speaking about "Oops" moments, I *totally* failed to notice that "forest green" is intended to be a single colour. The perils of posting in the wee hours of the morning, sorry.
It isn't shared by the proposal.
colors2 = %w[cyan forest green burnt\x20umber]
I don't get it. There is weird escaping of spaces that aren't split?
The source code has spaces between cyan and "forest-green" (let's pretend that's what it said all along...) and between forest-green and "burnt\x20umber". The parser/lexer splits on whitespace in the source code, giving three tokens:
cyan forest-green burnt\x20umber
each of which is treated as a string, complete with standard string escaping.
That is confusing and a bug magnet.
David, you literally wrote the book on text processing in Python. I think you are being disingenuous here, and below when you describe a standard string hex-escape \x20 that has been in Python forever and in just about all C-like languages as "weird".
If you can understand why this works:
string = "Single\n quoted\n string\n containing newlines!"
you can understand the burnt\x20umber example.
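To make the comparison concrete, here is a small check using only ordinary string literals as they exist today; nothing in it depends on the proposed syntax. The escape is resolved when the literal is compiled, so the resulting string holds a real newline or space, not a backslash sequence.

```python
# Ordinary string literals in current Python; no new syntax involved.
string = "Single\n quoted\n string\n containing newlines!"

# The source is a single line, but the resulting string spans four lines.
assert len(string.splitlines()) == 4

# \x20 is the hex escape for the space character (code point 0x20).
assert "burnt\x20umber" == "burnt umber"
assert ord("\x20") == 0x20
```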
What are the rules for escaping all whitespace, exactly? All the Unicode space-like code points, or just x20?
(1) I am assuming that we don't change any of the existing string escapes. That would be a backwards-incompatible change that would change the meaning of existing strings.
(2) The parser splits on whitespace in the source code. After that, the tokens are treated as normal string tokens except that you don't need to put start/end delimiters (quotes) on them.
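Those two rules can be approximated in current Python. This is only a sketch of the semantics as I have described them, not an implementation of the proposal: the helper name `w` is made up for illustration, it takes a raw string standing in for the source text between the brackets, and it leans on the `unicode_escape` codec as a stand-in for the compiler's own escape handling.

```python
def w(source):
    """Hypothetical emulation of %w[...]: split first, then process escapes."""
    # Rule (2), first half: split on whitespace as it appears in the source.
    tokens = source.split()
    # Rule (2), second half: treat each token like an unquoted string literal.
    # Encoding with backslashreplace turns characters outside Latin-1 into
    # \uXXXX / \UXXXXXXXX escapes, so they survive the unicode_escape decode.
    return [
        tok.encode("latin-1", "backslashreplace").decode("unicode_escape")
        for tok in tokens
    ]


# A raw string is used so that \x20 reaches w() as four literal characters,
# just as it would appear in the source code.
colors2 = w(r"cyan forest-green burnt\x20umber")
print(colors2)  # ['cyan', 'forest-green', 'burnt umber']
```

The order of operations is the whole point: the split sees `burnt\x20umber` as a single token because there is no whitespace in it yet, and the space only comes into existence when the escape is processed afterwards.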