Many apologies if people got one or more encrypted versions of this.
On 2/7/19 12:13 AM, Steven D'Aprano wrote:
It wasn't a concrete proposal, just food for thought. Unfortunately the thinking seems to have missed the point of the Julia syntax and run off with the idea of a wrapper class.
I did not miss the point! I think adding new syntax à la Julia is a bad idea—or at very least, not something we can experiment with today (and wrote as much).
Therefore, something we CAN think about and experiment with today is a wrapper class. This approach is pretty much exactly the same thing I tried in a discussion of PEP 505 a while back (None-aware operators). In the same vein as that—where I happen to dislike PEP 505 pretty strongly—one approach to simulate or avoid new syntax is precisely to use a wrapper class.
As a footnote, I think my demonstration of PEP 505 got derailed by lots of comments along the lines of "Your current toy library gets the semantics of the proposed new syntax wrong in these edge cases." Those comments were true (and I think I didn't fix all the issues since my interest faded with the active thread)... but none of them were impossible to fix, just small errors I had made.
With my *very toy* stringpy.Vector class, I'm just experimenting with usage ideas. I have shown a number of uses that I think could be useful to capture most or all of what folks want in "string vectorization." Most of what I've put in this list is what the little module does already, but some is just ideas for what it might do if I add the code (or someone else makes a PR at https://github.com/DavidMertz/stringpy).
One of the principles I had in mind in my demonstration is that I want to wrap the original collection type (or keep it an iterator if it started as one). A number of other ideas here, whether for built-in syntax or different behaviors of a wrapper, effectively always reduce every sequence to a list under the hood. This makes my approach less intrusive to move things in and out of "vector mode." For example:
    v1 = Vector(set_of_strings)
    set_of_strings = v1.lower().apply(my_str_fun)._it  # Get a set back

    v2 = Vector(list_of_strings)
    list_of_strings = v2.lower().apply(my_str_fun)._it  # Get a list back

    v3 = Vector(deque_of_strings)
    deque_of_strings = v3.lower().apply(my_str_fun)._it  # Get a deque back

    v4 = Vector(iter_of_strings)
    iter_of_strings = v4.lower().apply(my_str_fun)._it  # stays lazy!
So this is round-tripping through vector-land.
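To make the round trip concrete, here is a minimal sketch of how a type-preserving wrapper could work. It is not the actual stringpy code, only an illustration under stated assumptions: the helper `_rewrap` is a hypothetical name, and it assumes the wrapped collection's type can be rebuilt from an iterable (true for `set`, `list`, `tuple`, and `deque`, but not for every container).

```python
from collections import deque
from collections.abc import Iterator


class Vector:
    """Toy wrapper for illustration only; not the real stringpy.Vector."""

    def __init__(self, it):
        self._it = it  # the wrapped "sequential thing"

    def _rewrap(self, results):
        # Hypothetical helper: iterators stay lazy, concrete collections
        # are rebuilt as the same type they started as.
        if isinstance(self._it, Iterator):
            return Vector(results)
        # Assumption: type(self._it) accepts a single iterable argument.
        return Vector(type(self._it)(results))

    def apply(self, fn, *args, **kws):
        # Call an arbitrary function on each item.
        return self._rewrap(fn(x, *args, **kws) for x in self._it)

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so `_it` never lands here
        # once set. Refuse private names to avoid surprising recursion.
        if name.startswith("_"):
            raise AttributeError(name)

        def method(*args, **kws):
            # Vectorize the named method across the items.
            return self._rewrap(
                getattr(x, name)(*args, **kws) for x in self._it
            )

        return method
```

With this sketch, `Vector({"A ", "B "}).lower().apply(str.strip)._it` yields a set, while wrapping `iter(...)` yields a generator that does no work until it is consumed.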
Small note: I use the attribute `._it` to store the "sequential thing." That feels internal, so maybe some better way of spelling "get the wrapped thing" would be desirable.
I've also lost track of whether anyone is proposing a "vector of strings" as opposed to a vector of arbitrary objects.
Nothing I wrote is actually string-specific. That is just the main use case stated. My `stringpy.Vector` might be misnamed in that it is happy to contain any kind of items. But we hope they are all items with the particular methods we want to vectorize. I showed an example where a list might contain a custom string-like object that happens to have methods like `.lower()` as an illustration.
Inasmuch as I want to handle iterators here, it is impossible to do any type check upon creating a Vector. For concrete `collections.abc.Sequence` objects we could check, in principle. But I'd rather it be "we're all adults here" ... or at most provide some `check_type_uniformity()` function or method that has to be called explicitly.
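As a rough sketch of what such an explicit check might look like: the module does not have this function, and the signature and the "exactly one type" rule below are assumptions made only for illustration. A looser rule, such as "every item has a `.lower()` method", would be just as plausible.

```python
from collections.abc import Sequence


def check_type_uniformity(items):
    """Return True if every item has the same exact type (hypothetical)."""
    if not isinstance(items, Sequence):
        # An iterator cannot be inspected without consuming it, so refuse
        # rather than silently exhausting the caller's data.
        raise TypeError("uniformity can only be checked on a concrete Sequence")
    # An empty sequence is treated as trivially uniform.
    return len({type(x) for x in items}) <= 1
```

Because the check is opt-in, creating a Vector stays cheap and lazy; only callers who want the guarantee pay for the extra pass over the data.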