Re: [Python-ideas] Proposal: Tuple of str with w'list of words'

13 Nov 2016

On Sat, Nov 12, 2016 at 12:06 PM Steven D'Aprano wrote:
...
I consider the need for that to indicate a possibly poor design of
pandas. Unless there is a good reason not to, I believe that any
function that requires a list of strings should also accept a single
space-delimited string instead. Especially if the strings are intended
as names or labels. So that:
func(['fe', 'fi', 'fo', 'fum'])
and
func('fe fi fo fum')
should be treated the same way.
They aren't treated the same because df['Column Name'] is a valid way to get
a single column's worth of data when the column name contains spaces (not
encouraged, but it is valid).
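
To make the ambiguity concrete, here is a minimal sketch in plain Python. The helper name as_names is hypothetical (it is not a pandas function); it only illustrates what "accept a list or a space-delimited string" would do to a label that itself contains a space.

```python
def as_names(names):
    """Hypothetical helper: accept a sequence of names or one space-delimited string."""
    if isinstance(names, str):
        # A single string is split on whitespace into separate names.
        return names.split()
    return list(names)


# The two spellings from the quoted text normalize to the same list.
same = as_names(['fe', 'fi', 'fo', 'fum']) == as_names('fe fi fo fum')

# A single label that contains a space is silently split into two names,
# which is why the string form cannot also mean "one column".
split_label = as_names('Column Name')
```

Under this rule a column called 'Column Name' could only be requested through the list form, e.g. as_names(['Column Name']).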
...
...
mydf = df[ ('field1', 'field2', 'field3') ]
Are your field names usually constants known when you write the script?
Yes.  All the time.  When I'm on the side of creating APIs for data
analysts to use, I think of the columns abstractly.  When they're writing
scripts to analyze data, it's all very explicit and in the domain of the
data. Things like:

df[df.age > 10]
adf = df.pivot_table(['runid', 'block'])

are common and the "right" way to do things in the problem domain.
...
So not only do we have to learn yet another special kind of string:
- unicode strings
- byte strings
- raw strings (either unicode or bytes)
- f-strings
- and now w-strings
Very valid point.  I also considered (and rejected) a 'wb' prefix for a
tuple of bytes.
...
I would prefer a simple, straightforward rule: it unconditionally
splits on whitespace. If you need to include non-splitting spaces, use a
proper non-breaking space \u00A0, or split the words into a tuple by
hand, like you're doing now. I don't think it is worth complicating the
feature to support non-splitting spaces.
You're right there.  If there are spaces in the columns, make it explicit
and don't use the w''.  I withdraw the <backspace><space> "feature".  And I
think you're right that all the existing escape rules should work in the
same way they do for regular unicode strings (don't go the raw strings
route).  Basically, w'foo bar' == tuple('foo bar'.split())
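
As a rough illustration of that equivalence, the sketch below emulates the proposed literal with an ordinary function. The names w and w_keep_nbsp are hypothetical stand-ins, not part of any proposal text. One detail worth noting: in Python 3, str.split() with no arguments treats U+00A0 as whitespace, so a variant that keeps non-breaking spaces inside a word has to split on a narrower set of characters. The ASCII-only set used here is an assumption about what such a rule might look like.

```python
import re

# Assumption: "whitespace excluding NBSP" is taken to mean ASCII whitespace only.
_ASCII_WS = re.compile(r"[ \t\n\r\f\v]+")


def w(text):
    """Hypothetical stand-in for w'...': split on any whitespace."""
    return tuple(text.split())


def w_keep_nbsp(text):
    """Variant that does not split on U+00A0 (non-breaking space)."""
    # re.split can yield empty strings at the edges; drop them.
    return tuple(part for part in _ASCII_WS.split(text) if part)


plain = w('foo bar')
nbsp_split = w('first\u00a0name age')           # NBSP acts as a separator here
nbsp_kept = w_keep_nbsp('first\u00a0name age')  # NBSP stays inside the word
```

With the plain version, 'first\u00a0name age' yields three words; with the narrower split it yields two, the first of which still contains the non-breaking space.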
...
The fact that other languages do something like this is a (weak) point
in its favour. But I see that there are a few questions on Stack Overflow
asking what %w means, how it is different from %W, etc. For example:
http://stackoverflow.com/questions/1274675/what-does-warray-mean
http://stackoverflow.com/questions/690794/ruby-arrays-w-vs-w
Well, I'd lean towards not having a W'fields' that does something funky
:-).   But your point is well taken.
...
...
I'm rather lukewarm on this proposal, although I might be convinced to
support it if:
- w'...' unconditionally split on any whitespace (possibly
  excluding NBSP);
- and normal escapes worked.
Even then I'm not really convinced this needs to be a language feature.
I'm realizing that much of the reason I keep running into this is that it
seems to be an issue particular to using Python for data science.  In some
ways, data scientists are pushing the language a bit beyond what it was
designed to do (the df[(df.age > 10) & (df.gender == "F")] idiom is amazing
and troubling).  Since I'm doing a lot of this, these little language issues
loom a bit larger than they would with "normal" programming.

Thanks for responding.

Gary Godfrey