On Fri, Feb 1, 2019 at 5:00 PM David Mertz <mertz@gnosis.cx> wrote:
This is certainly doable. But why would it be better than:

map(str.lower, my_string_vector)
map(compute_grad, my_student_vector)

or [s.lower() for s in my_string_vector]

Side note: It's really interesting to me that Python introduced comprehension syntax some years ago, and even "hid" reduce(), and now there seems to be a big interest in / revival of "map".

Even numpy supports inhomogeneous data:
py> a = np.array([1, 'spam'])
py> a
array(['1', 'spam'],

well, no -- it doesn't -- look carefully, that is an array of type '!S4' -- i.e., a string 4 characters long -- every element in that array has that same type. Also note that numpy's support for strings is not very complete.

numpy does support an "object" type that can be inhomogeneous -- it's still a single type, but that type is a Python object (under the hood it's an array of pointers to pyobjects):

In [3]: a = np.array([1, 'spam'], dtype=object)
In [4]: a
Out[4]: array([1, 'spam'], dtype=object)

And it does support vectorization to some extent:
In [5]: a * 5
Out[5]: array([5, 'spamspamspamspamspam'], dtype=object)

But not with any performance benefits.
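For what it's worth, here is a rough pure-Python picture of that last operation. It assumes numpy simply applies the operator to each stored object in turn, which is my understanding of object arrays rather than something shown in the session above:

```python
# Plain-list stand-in for the object array above (no numpy needed).
items = [1, "spam"]

# Each element's own __mul__ is called, one Python-level call per element,
# which is why there is no speedup over an ordinary loop.
result = [item * 5 for item in items]
```

So the object dtype buys the notation, but the work is still done one Python object at a time.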

I think there are good reasons to have a "string_vector" that is known to be homogeneous:

Performance -- it could be significantly optimized (are there many use cases for that? I don't know).

Clear API: a string_vector would have all the relevant string methods. 

You could easily write a list subclass that passed on method calls to the enclosed objects, but then you'd have a fair bit of confusion as to what might be a vector method vs a method on the objects.
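To make that confusion concrete, here is a minimal sketch of such a list subclass. The name ForwardingList is made up for illustration, and forwarding through __getattr__ is just one way to do it:

```python
class ForwardingList(list):
    """Hypothetical list subclass that forwards unknown methods to its elements."""

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails, so any name that
        # list itself defines (count, index, ...) never reaches the elements.
        def forwarded(*args, **kwargs):
            return ForwardingList(
                getattr(item, name)(*args, **kwargs) for item in self
            )
        return forwarded


words = ForwardingList(["spam", "eggs", "spam"])

upper_words = words.upper()                 # forwarded: str.upper on each element
list_count = words.count("spam")            # NOT forwarded: this is list.count
str_counts = [w.count("s") for w in words]  # what a forwarded str.count would give
```

Here words.upper() is element-wise, but words.count("spam") quietly means "how many elements equal 'spam'", not "count 'spam' inside each string". That is exactly the vector-method vs. element-method ambiguity.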

which I suppose leaves us with something like:


list.elements * 5

hmm -- not sure how much I like this, but it's pretty doable.
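A sketch of how that spelling might be wired up, assuming a list subclass with an "elements" property. Vector and _Elements are invented names, and only multiplication and method calls are handled here:

```python
class _Elements:
    """Hypothetical proxy: applies operators and method calls to each element."""

    def __init__(self, items):
        self._items = items

    def __mul__(self, other):
        # Element-wise multiplication, unlike list repetition.
        return Vector(item * other for item in self._items)

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)

        def call(*args, **kwargs):
            return Vector(
                getattr(item, name)(*args, **kwargs) for item in self._items
            )
        return call


class Vector(list):
    """Hypothetical list subclass; ordinary list behaviour is left untouched."""

    @property
    def elements(self):
        return _Elements(self)


v = Vector([1, "spam"])
scaled = v.elements * 5      # element-wise: each item is multiplied by 5
repeated = v * 2             # plain list repetition, as for any list
shouted = Vector(["spam", "eggs"]).elements.upper()
```

The explicit .elements hop keeps list methods and element methods in separate namespaces, at the cost of having to write it again for every chained step.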

I still haven't seen any examples that aren't already spelled 'map(fun, it)'

and I don't think you will -- I *think* I get credit for starting this part of the thread, and I started by saying I have often longed for essentially a more concise way to spell map() or comprehensions. Performance aside, I use numpy because:

c = np.sqrt(a**2 + b**2)

is a heck of a lot easier to read, write, and get correct than:

c = list(map(math.sqrt, map(lambda x, y: x + y, map(lambda x: x**2, a),
                                                map(lambda x: x**2, b))))

or:

[math.sqrt(x) for x in (a + b for a, b in zip((x**2 for x in a),
                                              (x**2 for x in b)))]

Note: it took me quite a while to get those right! (and I know I could have used the operator module to get the map version maybe a bit cleaner, but the point stands)
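For reference, the operator-module spelling could look something like the sketch below. The sample values for a and b are made up, and itertools.repeat supplies the constant exponent:

```python
import math
import operator
from itertools import repeat

# Made-up sample data standing in for the numpy arrays a and b.
a = [3.0, 5.0, 8.0]
b = [4.0, 12.0, 15.0]

# map() with two iterables calls operator.pow(x, 2) for each x.
squares_a = map(operator.pow, a, repeat(2))
squares_b = map(operator.pow, b, repeat(2))
c_map = list(map(math.sqrt, map(operator.add, squares_a, squares_b)))

# A single comprehension over zip(a, b), as a cross-check.
c_comp = [math.sqrt(x**2 + y**2) for x, y in zip(a, b)]
```

It drops the lambdas, but the formula is still spread across several names rather than written as one expression.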

Does this apply to string processing? I'm not sure, though I do a fair bit of chaining of string operations:

a_string.strip().lower().title()

if you wanted to do that to a list of strings:

a_list_of_strings.strip().lower().title()

is a lot nicer than:

[s.title() for s in (s.lower() for s in [s.strip() for s in a_list_of_strings])]

or:

list(map(str.title, map(str.lower, map(str.strip, a_list_of_strings))))  # untested
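Chaining string methods over a whole list is not something a plain list supports today. A minimal sketch of a wrapper that would allow it is below. StringVector is a hypothetical name, it is not a list subclass, and it only forwards names that str actually defines:

```python
class StringVector:
    """Hypothetical homogeneous vector of str; not an existing library class."""

    def __init__(self, items):
        self._items = list(items)
        for item in self._items:
            if not isinstance(item, str):
                raise TypeError(f"expected str, got {type(item).__name__}")

    def __getattr__(self, name):
        # Forward only public names that str defines, so typos fail early.
        if name.startswith("_") or not hasattr(str, name):
            raise AttributeError(name)

        def call(*args, **kwargs):
            results = [getattr(s, name)(*args, **kwargs) for s in self._items]
            # Stay chainable while the results are strings; else hand back a list.
            if all(isinstance(r, str) for r in results):
                return StringVector(results)
            return results
        return call

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


# Made-up sample data.
a_list_of_strings = ["  hello WORLD ", "PYTHON ideas  "]

chained = StringVector(a_list_of_strings).strip().lower().title()
comprehension = [s.strip().lower().title() for s in a_list_of_strings]
mapped = list(map(str.title, map(str.lower, map(str.strip, a_list_of_strings))))
```

All three spellings give the same strings. The single comprehension that chains the methods per element is also fairly readable, which may be part of why this use case hasn't pushed anyone to build such a class.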

How common is that use case? Not common enough for me to go any further with this.



Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython