On Wed, Jan 30, 2019 at 10:14 PM Chris Angelico <rosuav@gmail.com> wrote:

> I didn't, but I don't know if Chris Barker did.

nope -- not me either :-)
 
> (Can't swing a cat without hitting someone named Steve or Chris, in
> some spelling or another!)

good thing there aren't a lot of cats being swung around, then.

One more note about this whole thread:

I do a lot of numerical programming: I used to use MATLAB, and now use numpy a lot. So I am very used to "vectorization" -- i.e. having operations that work on a whole collection of items at once.

Example:

a_numpy_array * 5

multiplies every item in the array by 5

In pure Python, you would do something like:

[i * 5 for i in a_regular_list]

You can imagine that for more complex expressions the "vectorized" approach can make for much clearer and easier to parse code. Also much faster, which is what is usually talked about, but I think the readability is the bigger deal.

So what does this have to do with the topic at hand?

I know that when I'm used to working with numpy and then need to do some string processing or some such, I find myself missing this "vectorization" -- if I want to do the same operation on a whole bunch of strings, why do I need to write a loop or comprehension or map? That is, why write:

[s.lower() for s in a_list_of_strings]

rather than:

a_list_of_strings.lower()

(NOTE: I prefer comprehension syntax to map, but map would work fine here, too)
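For concreteness, here is a small sketch of the two spellings that work today. The sample list is made up purely for illustration:

```python
# Made-up sample data, just for illustration.
a_list_of_strings = ["Steve", "CHRIS", "Stephen"]

# Comprehension: spell out the per-item operation explicitly.
lowered = [s.lower() for s in a_list_of_strings]

# map: pass the unbound str method and materialize the result as a list.
lowered_via_map = list(map(str.lower, a_list_of_strings))

print(lowered)          # ['steve', 'chris', 'stephen']
print(lowered_via_map)  # ['steve', 'chris', 'stephen']
```

Either way, the "for each item" part has to be written out by hand, which is the bit that vectorization hides.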

It strikes me that that is the direction some folks want to go.

If so, then I think the way to do it is not to add a bunch of stuff to Python's str or sequence types, but rather to make a new library that provides quick and easy manipulation of sequences of strings -- kind of a stringpy, analogous to numpy.

At the core of numpy is the ndarray: "a multidimensional, homogeneous array
of fixed-size items".

A strarray could be simpler -- I don't see any reason for more than 1-D, nor for more than one datatype. But it could be a "vector" of strings that was guaranteed to be all strings, and provide operations that acted on the entire collection in one fell swoop.
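To make the idea a bit more concrete, here is a rough pure-Python sketch of what such a thing might look like. To be clear, this is hypothetical: the StrArray name, the choice to forward unknown attributes to str methods via __getattr__, and the rule that all-string results come back as a new StrArray are all assumptions for illustration, not an existing library or a worked-out design.

```python
class StrArray:
    """Hypothetical 1-D "vector" of strings (illustration only)."""

    def __init__(self, items):
        items = list(items)
        # Guarantee that every element really is a str.
        for item in items:
            if not isinstance(item, str):
                raise TypeError(
                    f"StrArray items must be str, got {type(item).__name__}"
                )
        self._items = items

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __getitem__(self, index):
        # Slicing returns another StrArray; integer indexing returns a str.
        if isinstance(index, slice):
            return StrArray(self._items[index])
        return self._items[index]

    def __repr__(self):
        return f"StrArray({self._items!r})"

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        # Assumption: forward public str method names to every element.
        if name.startswith("_"):
            raise AttributeError(name)
        method = getattr(str, name, None)
        if not callable(method):
            raise AttributeError(name)

        def vectorized(*args, **kwargs):
            results = [method(s, *args, **kwargs) for s in self._items]
            # All-string results stay "vectorized"; anything else
            # (bools, ints, lists) comes back as a plain list.
            if all(isinstance(r, str) for r in results):
                return StrArray(results)
            return results

        return vectorized


names = StrArray(["Steve", "Chris", "Stephen"])
print(names.lower())           # StrArray(['steve', 'chris', 'stephen'])
print(names.startswith("St"))  # [True, False, True]
```

Because string results come back as a StrArray, calls can be chained (names.strip().lower()), which is where the readability win over nested comprehensions would show up.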

If it turned out to be useful, you could even make a version in C or Cython that might give significant performance benefits.

I don't have a use case for this -- but if someone does, it's an idea.

-CHB






Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov