Hello,
I'm thinking of this idea that we have a pseudo-operator called "Respectively" and shown maybe with ;
Some examples first:
a;b;c = x1;y1;z1 + x2;y2;z2 is equivalent to a=x1+x2 b=y1+y2 c=z1+z2
So it means: for each position in the statement, do the operation respectively. It's what I call a vertical expansion, i.e. running the statements one by one. Then there is another unpacking operator, which maybe we can show with the $ sign; it operates on lists and tuples and creates the "Respectively" version of them. So for instance,
vec=[]*10
$vec = $u + $v
will add two 10-dimensional vectors to each other and put the result in vec.
I think this is a syntax that can make many things more concise, and it makes component-wise operations on a list, done one by one, easy.
For example, we can calculate the inner product of two vectors as follows (the inner product is the sum of the component-wise products of two vectors):
innerProduct = 0
innerProduct += $a * $b
which is equivalent to
innerProduct = 0
for i in range(len(a)):
    innerProduct += a[i]*b[i]
For example, let's say we want to apply a function to all elements in a list; we can do: f($a)
The $ and ; take precedence over anything except ().
Also, an important point: wherever the respectively operator is absent, as on the left-hand side of the statement above, we use the same variable, value, or operator for each statement; equivalently, you can think of that whole thing as repeated with ;;;;. For example:
s=0
s;s;s += a;b;c * d;e;f
which results in s being a*d + b*e + c*f.
Also, I didn't spot any ambiguity (at least for now). For example, one might ask what happens if we do this recursively, such as in:
x;y;z + (a;b;c);(d;e;f);(g;h;i)
Using the rule above, this is equivalent to
(x;x;x);(y;y;y);(z;z;z) + (a;b;c);(d;e;f);(g;h;i)
and if we apply print to the statement above, the result will be:
x+a x+b x+c y+d y+e y+f z+g z+h z+i
Beware that in all of these, ; or $ does not create a new list. Rather, they are like creating new lines in the program and executing those lines one by one (in the case of $, to be more accurate, we create for-loops).
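As a rough illustration of the expansion described above, here is a sketch in current Python of what `$vec = $u + $v` and the inner-product example would unroll to. The sample values for u and v are assumptions made for the example, and the result list is pre-sized with [None] * len(u) so that indexed assignment works.

```python
# Sample data (hypothetical; any two equal-length sequences would do).
u = [1, 2, 3]
v = [10, 20, 30]

# Proposed:  $vec = $u + $v
# Expansion: one assignment per position, written as an explicit loop.
vec = [None] * len(u)
for i in range(len(u)):
    vec[i] = u[i] + v[i]

# Proposed:  innerProduct = 0 ; innerProduct += $u * $v
# Expansion: the same accumulator is reused at every position.
inner_product = 0
for i in range(len(u)):
    inner_product += u[i] * v[i]

print(vec)            # [11, 22, 33]
print(inner_product)  # 140
```

Note that no intermediate list of products is built in the second loop; each position is one executed statement, which matches the "new lines in the program" description.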
I appreciate your time and look forward to hearing your thoughts.
Cheers, Moj
On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
Hello,
I'm thinking of this idea that we have a pseudo-operator called "Respectively" and shown maybe with ;
Hopefully, you're already aware of sequence unpacking? Search for 'unpacking' at https://docs.python.org/2/tutorial/datastructures.html . Unfortunately, it does not have its own section I can directly link to.
x, y = 3, 5
would give the same result as
x = 3
y = 5
But it's more robust, as it can also deal with things like
x, y = y + 1, x + 4
Some examples first:
a;b;c = x1;y1;z1 + x2;y2;z2 is equivalent to a=x1+x2 b=y1+y2 c=z1+z2
So what would happen with the following?
a; b = x1;a + x2;5
So it means for each position in the statement, do something like respectively. It's like what I call a vertical expansion, i.e. running statements one by one. Then there is another unpacking operator which maybe we can show with $ sign and it operates on lists and tuples and creates the "Respectively" version of them. So for instance, vec=[]*10 $vec = $u + $v will add two 10-dimensional vectors to each other and put the result in vec.
I think this is a syntax that can make many things more concise plus it makes component wise operation on a list done one by one easy.
For example, we can calculate the inner product between two vectors like follows (inner product is the sum of component wise multiplication of two vectors):
innerProduct = 0
innerProduct += $a * $b
which is equivalent to
innerProduct = 0
for i in range(len(a)):
    innerProduct += a[i]*b[i]
From what I can see, it would be very beneficial for you to look into numpy: http://www.numpy.org/ . It already provides inner product, sums of arrays and such. I myself am not very familiar with it, but I think it provides what you need.
For example, let's say we want to apply a function to all element in a list, we can do: f($a)
The $ and ; take precedence over anything except ().
Also, an important point: wherever the respectively operator is absent, as on the left-hand side of the statement above, we use the same variable, value, or operator for each statement; equivalently, you can think of that whole thing as repeated with ;;;;. For example:
s=0
s;s;s += a;b;c * d;e;f
which results in s being a*d + b*e + c*f.
Also, I didn't spot (at least for now any ambiguity). For example one might think what if we do this recursively, such as in: x;y;z + (a;b;c);(d;e;f);(g;h;i) using the formula above this is equivalent to (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i) if we apply print on the statement above, the result will be: x+a x+b x+c y+d y+e y+f z+g z+h z+i
Beware that in all of these ; or $ does not create a new list. Rather, they are like creating new lines in the program and executing those lines one by one( in the case of $, to be more accurate, we create for loops).
I'll appreciate your time and looking forward to hearing your thoughts.
Again, probably you should use numpy. I'm not really sure it warrants a change to the language, because it seems like it would really only be beneficial to those working with matrices. Numpy already supports it, and I suspect that the use case for a;b = c;d + e;f can already be satisfied by a, b = c + e, d + f, which already has clearly documented semantics and still works fine when one of the names on the left also appears on the right: first all the calculations on the right are performed, then they are assigned to the names on the left.
Cheers, Moj
Kind regards, Sjoerd Job
Yes, I'm aware of sequence unpacking. There is an overlap, like you mentioned, but there are things that can't be done with sequence unpacking that can be done here.
For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy. For example:
$StudentFullName = $FirstName + " " + $LastName
So, in effect, I think one big part of it is component-wise operations.
Another thing that can't be achieved with sequence unpacking is: f($x), i.e. applying f to each component of x.
About your question above, it's not ambiguous here either:
a; b = x1;a + x2;5
is exactly "equivalent" to
a = x1+x2
b = a + 5
Also, there is a difference in style between sequence unpacking and this. In sequence unpacking, you have to pair up the right variables and repeat the operator, for example:
x,y,z = x1+x2, y1+y2, z1+z2
Here you don't have to repeat it and pair up the right variables, i.e.
x;y;z = x1;y1;z1 + x2;y2;z2
I think it's good that you (kind of) don't break the encapsulation-ish thing we have for the three values here. Also, by centralizing the operator, you don't risk making a mistake in the operator for one of the values. For example, you could make the mistake:
x,y,z = x1+x2, y1-y2, z1+z2
Also, there are all sorts of other things that are less of a motivation for me but that cannot be done with sequence unpacking. For instance:
add ; prod = a +;* y (this one I'm not sure can be achieved without ambiguity)
x;y = f;g (a;b)
On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi mojtaba.gharibi@gmail.com wrote:
Yes, I'm aware sequence unpacking. There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here.
For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy.
Yes, you can do it with numpy.
Obviously you don't get the performance benefits when you aren't using "native" types (like int32) and operations that have vectorized implementations (like adding two arrays of int32 or taking the dot product of float64 matrices), but you do still get the same elementwise operators, and even a way to apply arbitrary callables over arrays, or even other collections:
>>> import numpy as np
>>> firsts = ['John', 'Jane']
>>> lasts = ['Smith', 'Doe']
>>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
array(['Smith, John', 'Doe, Jane'], dtype='<U11')
That's everything you're asking for, with even more flexibility, with no need for any new ugly perlesque syntax: just use at least one np.array type in an operator expression, call a method on an array type, or wrap a function in vectorize, and everything is elementwise.
And of course when you actually _are_ using numbers, as in every single one of your examples, using numpy also gives you around a 6:1 space and 20:1 time savings, which is a nice bonus.
For example:
$StudentFullName = $FirstName + " " + $LastName
So, in effect, I think one big part of is component wise operations.
Another thing that can't be achieved with sequence unpacking is: f($x) i.e. applying f for each component of x.
That's a very different operation, which I think is more readably spelled map(f, x).
About your question above, it's not ambiguous here either: a; b = x1;a + x2;5 is exactly "Equivalent" to a = x1+x2 b = a + 5
Also, there is a difference in style in sequence unpacking, and here. In sequence unpacking, you have to pair up the right variables and repeat the operator, for example: x,y,z = x1+x2 , y1+y2, z1+z2 Here you don't have to repeat it and pair up the right variables, i.e. x;y;z = x1;y1;z1 + x2;y2;z2
If you only have two or three of these, that isn't a problem. Although in this case, it sure looks like you're trying to add two 3D vectors, so maybe you should just be storing 3D vectors as instances of a class (with an __add__ method, of course), or as arrays, or as columns in a larger array, rather than as 3 separate variables. What could be more readable than this:
v = v1 + v2
And if you have more than about three separate variables, you _definitely_ want some kind of array or iterable, not a bunch of separate variables. You're worried about accidentally typing "y1-y2" when you meant "+", but you're far more likely to screw up one of the letters or numbers than the operator. You also can't loop over separate variables, which means you can't factor out some logic and apply it to all three axes, or to both vectors. Also consider how you'd do something like transposing or pivoting or anything even fancier. If you've got a 2D array or iterable of iterables, that's trivial: transpose or zip, etc. If you've got N*M separate variables, you have to write them all individually. Your syntax at best cuts the source length and opportunity for errors in half; using collections cuts it down to 1.
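To make the "store the vector as one object" suggestion concrete, here is a minimal sketch. Vec3 is a hypothetical class written only for this illustration (it is not from any library), and the sample numbers are made up. The second half shows the zip-based transpose mentioned above.

```python
class Vec3:
    """Hypothetical minimal 3D vector, for illustration only."""

    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __add__(self, other):
        # Component-wise addition lives in one place, so the operator
        # cannot be mistyped for a single axis.
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __repr__(self):
        return "Vec3({}, {}, {})".format(self.x, self.y, self.z)


v1 = Vec3(1, 2, 3)
v2 = Vec3(10, 20, 30)
v = v1 + v2
print(v)  # Vec3(11, 22, 33)

# With an iterable of iterables, transposing is a single zip call.
rows = [(1, 2, 3), (4, 5, 6)]
columns = list(zip(*rows))
print(columns)  # [(1, 4), (2, 5), (3, 6)]
```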
On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert abarnert@yahoo.com wrote:
On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi mojtaba.gharibi@gmail.com wrote:
Yes, I'm aware sequence unpacking. There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here.
For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy.
Yes, you can do it with numpy.
Obviously you don't get the performance benefits when you aren't using "native" types (like int32) and operations that have vectorized implementations (like adding two arrays of int32 or taking the dot product of float64 matrices), but you do still get the same elementwise operators, and even a way to apply arbitrary callables over arrays, or even other collections:
>>> import numpy as np
>>> firsts = ['John', 'Jane']
>>> lasts = ['Smith', 'Doe']
>>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
array(['Smith, John', 'Doe, Jane'], dtype='<U11')
I think the form I am suggesting is simpler and more readable. I'm happy you brought vectorize to my attention, though. I think as soon as you make the statement just a bit complex, it would become really complicated with vectorize.
For example, let's say you have
x=[1,2,3,4,5,...]
y=['A','BB','CCC',...]
p=[2,3,4,6,6,...]
r=[]*n
$r = str(len($y*$p)+$x)
It would be really complex to calculate such a thing with vectorize. All I am saving on is basically a for-loop and the indexing. We don't really have to use numpy, etc. I think it's much easier to just use a for-loop and indexing, if you don't like the syntax. So I think the question is: does my syntax bring enough convenience to justify avoiding the for-loop and indexing? For example, the above could be equivalently written as
for i in range(0,len(r)):
    r[i] = str(len(y[i]*p[i])+x[i])
So that's the whole saving: just a for-loop and the indexing operator.
On Wed, Jan 27, 2016 at 12:55:33PM -0500, Mirmojtaba Gharibi wrote:
On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert abarnert@yahoo.com wrote:
For example lets say you have x=[1,2,3,4,5,...] y=['A','BB','CCC',...] p=[2,3,4,6,6,...] r=[]*n
$r = str(len($y*$p)+$x)
Several (current-Python) solutions are there I can already see:
r = [str(len(yv * pv) + xv) for xv, yv, pv in zip(x, y, p)]
r = map(lambda xv, yv, pv: str(len(yv * pv) + xv), x, y, p)
# Assuming x, y, p are numpy arrays
r = np.vectorize(lambda xv, yv, pv: str(len(yv * pv) + xv))(x, y, p)
Furthermore, the str(len(y * p) + x) is supposed to actually do something, I presume. Why does that not have a name? Foobarize?
r = [foobarize(xv, yv, pv) for xv, yv, pv in zip(x, y, p)]
r = map(foobarize, x, y, p)
r = np.vectorize(foobarize)(x, y, p)
or in your syntax
$r = foobarize($x, $y, $p)
I assume?
Also, supposing f is a function of two arguments:
$r = f(x, $y)
means
r = [f(x, y_val) for y_val in y]
And
$r = f($x, y)
means
r = [f(x_val, y) for x_val in x]
Then what does
$r = f($x, $y)
mean? I suppose you want it to mean
r = [f(x_val, y_val) for x_val, y_val in zip(x, y)]
= map(f, x, y)
which can be confusing if x and y have different lengths.
Maybe
r = [f(x_val, y_val) for x_val in x for y_val in y]
or
r = [f(x_val, y_val) for y_val in y for x_val in x]
?
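The two readings give different results, which a small sketch in current Python makes visible. The function f and the sample lists are assumptions chosen only for the example; itertools.product is used for the nested-loop reading.

```python
from itertools import product


def f(a, b):
    # Hypothetical stand-in for any two-argument function.
    return "{}{}".format(a, b)


x = ["a", "b", "c"]
y = [1, 2]

# Pairwise reading: zip stops at the shorter input, silently dropping "c".
pairwise = [f(xv, yv) for xv, yv in zip(x, y)]

# Nested-loop reading: every combination, with x varying slowest.
nested = [f(xv, yv) for xv, yv in product(x, y)]

print(pairwise)  # ['a1', 'b2']
print(nested)    # ['a1', 'a2', 'b1', 'b2', 'c1', 'c2']
```

With existing syntax the choice between zip and a nested loop is spelled out at the call site, so the reader does not have to know which rule a sigil implies.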
Besides the questionable benefit of shorter syntax, I think this would actually not be a good case. Numpy, list/generator comprehensions, and the map/zip builtins already provide more than enough ways to do it. Why add yet another syntax?
No, you don't have to use numpy. If you don't need it, please don't use it. But do not forget that the standard set of builtins is already powerful enough to give you what you want.
Python is a general-purpose programming language (though often used for sciency stuff). Matlab is a 'matrix lab' language. If a language's only purpose is working with matrices: please, go ahead and build matrix-specific syntax.
In my experience, Python has a lot more purposes than just matrix manipulation. The codebases I've worked on had use for the $ operator you're suggesting in too few lines of code to bother learning the extra syntax.
I'm definitely -1 on yet another syntax when there are already multiple obvious ways to solve the same problem: numpy, comprehensions, map.
(Not sure if I even have the right to vote here, given that I'm not a core developer, but just giving my opinion.)
It would be really complex to calculate such a thing with vectorize. All I am saving on is basically a for-loop and the indexing. We don't really have to use numpy, etc. I think it's much easier to just use a for-loop and indexing, if you don't like the syntax. So I think the question is: does my syntax bring enough convenience to justify avoiding the for-loop and indexing? For example, the above could be equivalently written as
for i in range(0,len(r)):
    r[i] = str(len(y[i]*p[i])+x[i])
So that's the whole saving: just a for-loop and the indexing operator.
And I listed some of the ways you can save the loop + indexing. That doesn't need new syntax.
On Jan 27, 2016, at 09:55, Mirmojtaba Gharibi mojtaba.gharibi@gmail.com wrote:
On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert abarnert@yahoo.com wrote:
On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi mojtaba.gharibi@gmail.com wrote:
Yes, I'm aware sequence unpacking. There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here.
For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy.
Yes, you can do it with numpy.
Obviously you don't get the performance benefits when you aren't using "native" types (like int32) and operations that have vectorized implementations (like adding two arrays of int32 or taking the dot product of float64 matrices), but you do still get the same elementwise operators, and even a way to apply arbitrary callables over arrays, or even other collections:
>>> import numpy as np
>>> firsts = ['John', 'Jane']
>>> lasts = ['Smith', 'Doe']
>>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
array(['Smith, John', 'Doe, Jane'], dtype='<U11')
I think the form I am suggesting is simpler and more readable.
But the form you're suggesting doesn't work for vectorizing arbitrary functions, only for operator expressions (including simple function calls, but that doesn't help for more general function calls). The fact that numpy is a little harder to read for cases that your syntax can't handle at all is hardly a strike against numpy.
And, as I already explained, for the cases where your form _does_ work, numpy already does it, without all the sigils:
c = a + b
c = a*a + 2*a*b + b*b
c = (a * b).sum()
It also works nicely over multiple dimensions. For example, if a and b are both arrays of N 3-vectors instead of just being 3-vectors, you can still elementwise-add them just with +; you can sum all of the results with sum(axis=1); etc. How would you write any of those things with your $-syntax?
I'm happy you brought vectorize to my attention though. I think as soon you make the statement just a bit complex, it would become really complicated with vectorize.
For example lets say you have x=[1,2,3,4,5,...] y=['A','BB','CCC',...] p=[2,3,4,6,6,...] r=[]*n
$r = str(len($y*$p)+$x)
As a side note, []*n is always just []. Maybe you meant [None for _ in range(n)] or [None]*n? Also, where does n come from? It doesn't seem to have anything to do with the lengths of x, y, and p. So, what happens if it's shorter than them? Or longer? With numpy, of course, that isn't a problem: there's no magic being attempted on the = operator (which is good, because = isn't an operator in Python, and I'm not sure how you'd even properly define your design, much less implement it); the operators just create arrays of the right length.
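The point about the empty list can be checked directly. In this sketch, n is an arbitrary value picked for the example:

```python
n = 5  # hypothetical length, chosen only for this example

empty = [] * n        # repeating an empty list still gives an empty list
filled = [None] * n   # a list with n placeholder slots

print(empty)   # []
print(filled)  # [None, None, None, None, None]

# Indexed assignment only works when the slot already exists.
raised = False
try:
    empty[0] = "x"
except IndexError:
    raised = True
print(raised)  # True
```

So a loop of the form r[i] = ... over a list created with []*n fails on the first iteration, whereas a comprehension builds a result of the right length without any pre-sizing.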
Anyway, that's still mostly just operators. You _could_ wrap up an operator expression in a function to vectorize, but you almost never want to. Just use the operators directly on the arrays.
So, let's try a case that has even some minimal amount of logic, where translating to operators would be clumsy at best:
@np.vectorize
def sillyslice(y, x, p):
    if x < p: return y[x:p]
    return y[p:x]
r = sillyslice(y, x, p)
Being a separate function provides all the usual benefits: sillyslice is reusable, debuggable, unit-testable, usable as a first-class object, etc. But forget that; how would you do this at all with your $-syntax?
Since you didn't answer any of my other questions, I'll snip them and repost shorter versions:
On Wed, Jan 27, 2016 at 01:19:56AM -0500, Mirmojtaba Gharibi wrote:
Yes, I'm aware sequence unpacking. There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here.
For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy. For example:
$StudentFullName = $FirstName + " " + $LastName
So, in effect, I think one big part of is component wise operations.
Another thing that can't be achieved with sequence unpacking is: f($x) i.e. applying f for each component of x.
map(f, x)
About your question above, it's not ambiguous here either: a; b = x1;a + x2;5 is exactly "Equivalent" to a = x1+x2 b = a + 5
Now that's confusing, that it differs from sequence unpacking.
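The difference can be shown side by side in current Python. This is a sketch with made-up starting values (x1, x2, and the initial a are assumptions); the second half writes out by hand the one-by-one expansion that the proposal describes.

```python
x1, x2 = 1, 2

# Sequence unpacking: the whole right-hand side is evaluated first,
# so `a` on the right still refers to the old value (100).
a = 100
a, b = x1 + x2, a + 5
unpacked = (a, b)

# Expansion described for `a; b = x1;a + x2;5`: statements run one by
# one, so the second statement already sees the new value of `a`.
a = 100
a = x1 + x2
b = a + 5
sequential = (a, b)

print(unpacked)    # (3, 105)
print(sequential)  # (3, 8)
```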
Also, there is a difference in style in sequence unpacking, and here. In sequence unpacking, you have to pair up the right variables and repeat the operator, for example: x,y,z = x1+x2 , y1+y2, z1+z2 Here you don't have to repeat it and pair up the right variables, i.e. x;y;z = x1;y1;z1 + x2;y2;z2 It's I think good that you (kind of) don't break the encapsulation-ish thing we have for the three values here. Also, you don't risk, making a mistake in the operator for one of the values by centralizing the operator use. For example you could make the mistake: x,y,z = x1+x2, y1-y2, z1+z2
Also there are all sort of other things that are less of a motivation for me but that cannot be done with sequence unpacking. For instance: add ; prod = a +;* y (This one I'm not sure how can be achieved without ambiguity) x;y = f;g (a;b)
On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus sjoerdjob@sjec.nl wrote:
On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
Hello,
I'm thinking of this idea that we have a pseudo-operator called "Respectively" and shown maybe with ;
Hopefully, you're already aware of sequence unpacking? Search for 'unpacking' at https://docs.python.org/2/tutorial/datastructures.html . Unfortunately, it does not have its own section I can directly link to.
x, y = 3, 5
would give the same result as
x = 3
y = 5
But it's more robust, as it can also deal with things like
x, y = y + 1, x + 4
>
Some examples first:
a;b;c = x1;y1;z1 + x2;y2;z2 is equivalent to a=x1+x2 b=y1+y2 c=z1+z2
So what would happen with the following?
a; b = x1;a + x2;5
>
So it means for each position in the statement, do something like respectively. It's like what I call a vertical expansion, i.e. running statements one by one. Then there is another unpacking operator which maybe we can show with $ sign and it operates on lists and tuples and creates the "Respectively" version of them. So for instance, vec=[]*10 $vec = $u + $v will add two 10-dimensional vectors to each other and put the result in vec.
I think this is a syntax that can make many things more concise plus it makes component wise operation on a list done one by one easy.
For example, we can calculate the inner product between two vectors like follows (inner product is the sum of component wise multiplication of two vectors):
innerProduct = 0
innerProduct += $a * $b
which is equivalent to
innerProduct = 0
for i in range(len(a)):
    innerProduct += a[i]*b[i]
Thinking about this some more:
How do you know if this is going to return a list of products, or the sum of those products?
That is, why is innerProduct += $a * $b not equivalent to innerProduct = $innerProduct + $a * $b? Or is it? Not quite sure.
A clearer solution would be
innerProduct = sum(map(operator.mul, a, b))
But that's current-Python syntax.
To be honest, I still haven't seen an added benefit that the new syntax would gain. Maybe you could expand on that?
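For reference, a runnable version of the current-Python spelling, next to a generator-expression variant. The sample vectors are assumptions made for the example:

```python
import operator

# Sample vectors (hypothetical values).
a = [1, 2, 3]
b = [4, 5, 6]

# The spelling suggested above: multiply pairwise, then sum.
by_map = sum(map(operator.mul, a, b))

# The same computation with a generator expression and zip.
by_genexp = sum(x * y for x, y in zip(a, b))

print(by_map)     # 32  (1*4 + 2*5 + 3*6)
print(by_genexp)  # 32
```

In both forms the call to sum makes the reduction explicit, which is exactly the information the `+=` form leaves implicit.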
>
From what I can see, it would be very beneficial for you to look into numpy: http://www.numpy.org/ . It already provides inner product, sums of arrays and such. I myself am not very familiar with it, but I think it provides what you need.
>
For example, let's say we want to apply a function to all element in a list, we can do: f($a)
The $ and ; take precedence over anything except ().
Also, an important thing is that whenever, we don't have the respectively operator, such as for example in the statement above on the left hand side, we basically use the same variable or value or operator for each statement or you can equivalently think we have repeated that whole thing with ;;;;. Such as: s=0 s;s;s += a;b;c * d;e;f which result in s being a*d+b*e+c*f
Also, I didn't spot (at least for now any ambiguity). For example one might think what if we do this recursively, such as in: x;y;z + (a;b;c);(d;e;f);(g;h;i) using the formula above this is equivalent to (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i) if we apply print on the statement above, the result will be: x+a x+b x+c y+d y+e y+f z+g z+h z+i
Beware that in all of these ; or $ does not create a new list. Rather, they are like creating new lines in the program and executing those lines one by one( in the case of $, to be more accurate, we create for loops).
I'll appreciate your time and looking forward to hearing your thoughts.
Again, probably you should use numpy. I'm not really sure it warrants a change to the language, because it seems like it would really only be beneficial to those working with matrices. Numpy already supports it, and I'm suspecting that the use case for a;b = c;d + e;f can already be satisfied by a, b = c + e, d + f, and it already has clearly documented semantics and still works fine when one of the names on the left also appears on the right: First all the calculations on the right are performed, then they are assigned to the names on the left.
>
Cheers, Moj
Kind regards, Sjoerd Job
I think the component-wise operation is the biggest benefit and a more compact and understandable syntax. For example,
innerProduct = sum(map(operator.mul, a, b)) is much more complex than innerProduct += $a * $b
MATLAB has a built-in easy way of achieving component-wise operation and I think Python would benefit from that without use of libraries such as numpy.
Regarding your question about the difference between innerProduct += $a * $b and innerProduct = $innerProduct + $a * $b
The second statement returns an error. I mentioned in my initial email that $ applies to a list or a tuple. Here I explicitly set my innerProduct=0 initially, which you omitted in your example.
innerProduct += $a * $b is equivalent to for i in range(len(a)): ...innerProduct += a[i]*b[i]
On Wed, Jan 27, 2016 at 2:30 AM, Sjoerd Job Postmus sjoerdjob@sjec.nl wrote:
On Wed, Jan 27, 2016 at 01:19:56AM -0500, Mirmojtaba Gharibi wrote:
Yes, I'm aware of sequence unpacking. There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here.
For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy. For example:
$StudentFullName = $FirstName + " " + $LastName
So, in effect, I think one big part of it is component-wise operations.
Another thing that can't be achieved with sequence unpacking is: f($x) i.e. applying f for each component of x.
map(f, x)
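A minimal sketch of how both examples can be written today without numpy. The sample names and the function f are invented for illustration.

```python
first_names = ["Ada", "Alan"]
last_names = ["Lovelace", "Turing"]

# Component-wise string concatenation: the counterpart of
# $StudentFullName = $FirstName + " " + $LastName
student_full_names = [first + " " + last
                      for first, last in zip(first_names, last_names)]


def f(value):
    # Hypothetical function, standing in for any f applied to each component.
    return value.upper()


# Counterpart of f($x); list() is needed on Python 3, where map() is lazy.
shouted = list(map(f, student_full_names))

print(student_full_names)  # ['Ada Lovelace', 'Alan Turing']
print(shouted)             # ['ADA LOVELACE', 'ALAN TURING']
```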
>
About your question above, it's not ambiguous here either: a; b = x1;a + x2;5 is exactly "Equivalent" to a = x1+x2 b = a + 5
Now that's confusing, that it differs from sequence unpacking.
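The difference can be spelled out in current Python. Under the proposed reading the two assignments run one after the other, so b sees the new a; with sequence unpacking the whole right-hand side is computed first, so b sees the old a. The starting values are arbitrary.

```python
x1, x2 = 10, 20

# Proposed "respectively" reading of  a; b = x1;a + x2;5
a = 1
a = x1 + x2      # a becomes 30
b = a + 5        # uses the new a
sequential = (a, b)

# Sequence unpacking evaluates the right-hand side first.
a = 1
a, b = x1 + x2, a + 5   # a + 5 still uses the old a == 1
simultaneous = (a, b)

print(sequential)    # (30, 35)
print(simultaneous)  # (30, 6)
```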
>
Also, there is a difference in style in sequence unpacking, and here. In sequence unpacking, you have to pair up the right variables and repeat the operator, for example: x,y,z = x1+x2 , y1+y2, z1+z2 Here you don't have to repeat it and pair up the right variables, i.e. x;y;z = x1;y1;z1 + x2;y2;z2 It's I think good that you (kind of) don't break the encapsulation-ish thing we have for the three values here. Also, you don't risk, making a mistake in the operator for one of the values by centralizing the operator use. For example you could make the mistake: x,y,z = x1+x2, y1-y2, z1+z2
Also there are all sort of other things that are less of a motivation for me but that cannot be done with sequence unpacking. For instance: add ; prod = a +;* y (This one I'm not sure how can be achieved without ambiguity) x;y = f;g (a;b)
On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus sjoerdjob@sjec.nl wrote:
On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
[snip]
On 27 January 2016 at 17:12, Mirmojtaba Gharibi
mojtaba.gharibi@gmail.com wrote:
innerProduct = sum(map(operator.mul, a, b)) is much more complex than innerProduct += $a * $b
Certainly the second is shorter. But it's full of weird "magic" behaviour that I don't even begin to know how to explain in general terms (i.e., without having to appeal to specific examples):
Oh, and your "standard Python" implementation of inner product is not the most readable (which is a matter of opinion, certainly) approach, so you're asking a loaded question. An alternative way of writing it would be
innerProduct = sum(x*y for x, y in zip(a, b))
Variable names that aren't 1-character would probably help the "normal" version. I can't be sure if they'd help or harm the proposed version. Probably wouldn't make much difference.
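A runnable version of this spelling with longer names, along the lines suggested. The names and values are illustrative only.

```python
prices = [2.0, 3.0, 4.0]
quantities = [1, 2, 3]

# zip() pairs the components; the generator multiplies each pair lazily.
total_cost = sum(price * quantity
                 for price, quantity in zip(prices, quantities))

print(total_cost)  # 2.0*1 + 3.0*2 + 4.0*3 = 20.0
```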
Sorry, but I see no particular value in this proposal, and many issues with it. So -1 from me. Paul
I think a lot of your questions are answered in my very first email, such as the ones about initialization. I had initialized my variable, but Sjoerd dropped it when giving his example. Please refer to the very first email.
Regarding how to explain the behaviour in simple terms, I also refer you to my very first email. Basically it's a pair of (kind of) operators I called Respectively and unpacking. You can read about it more extensively there.
In a pairwise operation like this, you are supposed to provide lists of identical length. If a and b have different lengths, my idea is that we just go as far as the length of the first list in the operation, or alternatively the longest list, and then throw an exception, for instance.
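For comparison, here is roughly how the length-mismatch options look with today's tools. zip() silently stops at the shortest input, itertools.zip_longest() pads to the longest, and an explicit check can raise instead. The helper name strict_pairs is an assumption made for this sketch, not an existing function.

```python
from itertools import zip_longest

a = [1, 2, 3]
b = [10, 20]

# Option 1: stop at the shortest input (what zip() does).
shortest = [x * y for x, y in zip(a, b)]

# Option 2: run to the longest input, padding the short one with 0.
longest = [x * y for x, y in zip_longest(a, b, fillvalue=0)]


def strict_pairs(left, right):
    """Hypothetical helper: pair up components, refusing unequal lengths."""
    if len(left) != len(right):
        raise ValueError("operands must have the same length")
    return zip(left, right)


print(shortest)  # [10, 40]
print(longest)   # [10, 40, 0]

try:
    list(strict_pairs(a, b))
except ValueError as error:
    print("rejected:", error)
```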
On Wed, Jan 27, 2016 at 12:54 PM, Paul Moore p.f.moore@gmail.com wrote:
[snip]
On Jan 27, 2016, at 09:12, Mirmojtaba Gharibi mojtaba.gharibi@gmail.com wrote:
I think the component-wise operation is the biggest benefit and a more compact and understandable syntax. For example,
innerProduct = sum(map(operator.mul, a, b)) is much more complex than innerProduct += $a * $b
MATLAB has a built-in easy way of achieving component-wise operation and I think Python would benefit from that without use of libraries such as numpy.
Why? What's wrong with using numpy?
It seems like the only problem in your initial post was that you thought numpy can't do what you want, when in fact it can, and trivially so. Adding the same amount of complexity to the base language wouldn't make it any more discoverable--it would just mean that _all_ Python users now have the potential to be confused, rather than only Python+numpy users, which sounds like a step backward.
Also, this is going to sound like a rhetorical, or even baited, question, but it's not intended that way: what's wrong with APL, or J, or MATLAB, and what makes you want to use Python instead? I'll bet that, directly or indirectly, the reason is the simplicity, consistency, and readability of Python. If you make Python more cryptic and dense, there's a very good chance it'll end up less readable than J rather than more, which would defeat the entire purpose.
Also, while we're at it, if you want the same features as APL and MATLAB, why invent a very different syntax instead of just using their syntax? Most proposals for adding elementwise computation to the base language suggest adding array operators like .+ that work the same way on all types, not adding object-wrapping operators that turn a list or a bunch of separate objects into some hidden type that overloads the normal + operator to be elementwise. What's the rationale for doing it your way instead of the usual way? (I can see one pretty good answer--consistency with numpy--but I don't think it's what you have in mind.)
Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Jan 27, 2016, at 09:12, Mirmojtaba Gharibi mojtaba.gharibi@gmail.com wrote:
I think the component-wise operation is the biggest benefit and a more compact and understandable syntax. For example,
innerProduct = sum(map(operator.mul, a, b)) is much more complex than innerProduct += $a * $b
Frankly, I'd prefer simply innerProduct = sum($a * $b) - I'm not sure how you can reasonably define all the semantics of all operators in all combinations in a way that makes your "+=" work.
Furthermore, I think your expressions could also get hairy.
a = [1, 2]
b = [3, 4]
c = 5
a * $b = [[1, 2]*3, [1, 2]*4] = [[1, 2, 1, 2, 1, 2], [1, 2, 1, 2, 1, 2, 1, 2]]
$a * b = [1*[3, 4], 2*[3, 4]] = [[3, 4], [3, 4, 3, 4]]
$a * $b = [1*3, 2*4] = [3, 8]
($a * $b) * c = [3, 8] * 5 = [3, 8, 3, 8, 3, 8, 3, 8, 3, 8] # and let's ignore the associativity problems for the moment
$($a * $b) * c = $[3, 8] * 5 = [3*5, 8*5] = [15, 40] # oh, look, we have to put $ on an arbitrary expression, not just a name
Do you need multiple $ signs to operate on multiple dimensions? If not, why not?
(Arguably, sequence repeating should be a different operator than multiplication anyway, but that ship has long sailed)
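The repetition results above follow from how * already behaves on lists. The sketch below reproduces some of them in current Python, with comprehensions standing in for the hypothetical $ operator.

```python
a = [1, 2]
b = [3, 4]
c = 5

# $a * b: each element of a repeats the whole list b.
unpack_left = [x * b for x in a]

# $a * $b: pairwise products.
pairwise = [x * y for x, y in zip(a, b)]

# ($a * $b) * c: multiplying a plain list by an int repeats it.
repeated = pairwise * c

# $($a * $b) * c: scaling each product.
scaled = [p * c for p in pairwise]

print(unpack_left)  # [[3, 4], [3, 4, 3, 4]]
print(pairwise)     # [3, 8]
print(repeated)     # [3, 8, 3, 8, 3, 8, 3, 8, 3, 8]
print(scaled)       # [15, 40]
```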
MATLAB has a built-in easy way of achieving component-wise operation and I think Python would benefit from that without use of libraries such as numpy.
On Wed, Jan 27, 2016, at 13:13, Andrew Barnert via Python-ideas wrote:
Why? What's wrong with using numpy?
It seems like only problem in your initial post was that you thought numpy can't do what you want, when in fact it can, and trivially so. Adding the same amount of complexity to the base language wouldn't make it any more discoverable--it would just mean that _all_ Python users now have the potential to be confused, rather than only Python+numpy users, which sounds like a step backward.
My impression is that the ultimate idea is to allow/require/recommend a post-numpy library to use the same syntax for these semantics, so that the base semantics with the plain operators are not different between post-numpy and base python, in order to make post-numpy less confusing than numpy.
I.e. that the semantics when operating on sequences of numbers ought to be defined solely by the syntax (not confusing, even if it's more complex than what we have now), rather than by what library the sequence object comes from (confusing).
On Wed, Jan 27, 2016 at 9:12 AM, Mirmojtaba Gharibi mojtaba.gharibi@gmail.com wrote:
MATLAB has a built-in easy way of achieving component-wise operation and I think Python would benefit from that without use of libraries such as numpy.
I've always thought there should be component-wise operations in Python. The way to do it now is something like:
[i + j for i,j in zip(a,b)]
which is really pretty darn wordy, compared to:
a_numpy_array + another_numpy_array
(similar in MATLAB).
But maybe an operator is the way to do it. But it was long ago decided not to introduce a full set of extra operators, à la MATLAB:
.+ .* etc....
Rather, it was realized that for numpy, which does element-wise operations by default, matrix multiplication was really the only non-elementwise operation widely used, so the new @ operator was added.
And we're kind of stuck -- even if we added a full set, then in numpy, the regular operators would be element-wise, but for built-in Python sequences, the special ones would be element-wise -- really confusing!
If you really want this, I'd make your own sequences that re-define the operators.
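One possible shape for such a sequence, as a rough sketch: a list subclass whose + and * work element by element. The class name ElementwiseList is invented here, and only two operators are covered.

```python
class ElementwiseList(list):
    """Hypothetical list subclass with element-wise + and *."""

    def _combine(self, other, op):
        if isinstance(other, (list, tuple)):
            if len(other) != len(self):
                raise ValueError("operands must have the same length")
            return ElementwiseList(op(x, y) for x, y in zip(self, other))
        # Anything else is treated as a scalar and applied to every element.
        return ElementwiseList(op(x, other) for x in self)

    def __add__(self, other):
        return self._combine(other, lambda x, y: x + y)

    def __mul__(self, other):
        return self._combine(other, lambda x, y: x * y)


first = ElementwiseList(["this", "that"])
second = ElementwiseList(["some", "more"])

print(first + second)               # ['thissome', 'thatmore']
print(ElementwiseList([1, 2]) * 3)  # [3, 6]
```

The cost of this design is that + no longer concatenates and * no longer repeats, so such an object cannot be passed to code that expects ordinary list behaviour.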
Or just use Numpy... you can use object arrays if you want to handle non-numeric values:
In [4]: a1 = np.array(["this", "that"], dtype=object)
In [5]: a2 = np.array(["some", "more"], dtype=object)
In [6]: a1 + a2
Out[6]: array(['thissome', 'thatmore'], dtype=object)
-CHB
--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
On Wed, Jan 27, 2016 at 4:53 PM, Chris Barker chris.barker@noaa.gov wrote:
if you really want this, I'd make your own sequences that re-define the operators.
Problem is you always forego the hassle of subclassing at that exact moment that you need element-wise and just use for loops. So it's almost always not worth the hassle.
On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
Hello,
I'm thinking of this idea that we have a pseudo-operator called "Respectively" and shown maybe with ;
Some examples first:
a;b;c = x1;y1;z1 + x2;y2;z2
Regardless of the merits of this proposal, the suggested syntax cannot be used because that's already valid Python syntax equivalent to:
a
b
c = x1
y1
z1 + x2
y2
z2
So forget about using the ; as that would be ambiguous.
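This can be checked with the standard ast module: the parser already splits the line at each semicolon, giving seven separate statements rather than one.

```python
import ast

source = "a;b;c = x1;y1;z1 + x2;y2;z2"

# Parsing only checks syntax, so the names do not need to exist.
tree = ast.parse(source)
kinds = [type(node).__name__ for node in tree.body]

print(len(tree.body))  # 7
print(kinds)  # ['Expr', 'Expr', 'Assign', 'Expr', 'Expr', 'Expr', 'Expr']
```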
[...]
Then there is another unpacking operator which maybe we can show with $ sign and it operates on lists and tuples and creates the "Respectively" version of them. So for instance, vec=[]*10 $vec = $u + $v will add two 10-dimensional vectors to each other and put the result in vec.
[]*10 won't work, as that's just []. And it seems very unpythonic to need to pre-allocate a list just to do vectorized addition.
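A quick check of the first point, plus one way to get the sum without pre-allocating anything. The vectors u and v are sample data.

```python
# Repeating an empty list any number of times is still an empty list.
empty = [] * 10
print(empty)  # []

u = [1, 2, 3]
v = [10, 20, 30]

# A comprehension builds the result list directly.
vec = [x + y for x, y in zip(u, v)]
print(vec)  # [11, 22, 33]
```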
I think you would be better off trying to get better support for vectorized operations into Python:
vec = add(u, v)
is nearly as nice looking as u + v, and it need not even be a built-in. It could be a library.
In an earlier version of the statistics module, I experimented with vectorized functions for some of the operations. I wanted a way for the statistics functions to automatically generate either scalar or vector results without any extra programming effort.
E.g. writing mean([1, 2, 3]) would return the scalar 2, of course, while:
mean([(1, 10, 100), (2, 20, 200), (3, 30, 300)])
would operate column-wise and return (2, 20, 200). To do that, I needed vectorized versions of sum, division, sqrt etc. I didn't mind if they were written as function calls instead of operators:
divide(6, 3) # returns 2
divide((6, 60, 600), 3) # returns (2, 20, 200)
which I got with a function:
divide = vectorize(operator.truediv)
where vectorize() took a scalar operator and returned a function that looped over two vectors and applied the operator to each argument in an elementwise fashion. I eventually abandoned this approach because the complexity and performance hit of my initial implementation was far too great, but maybe that was just my poor implementation.
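A minimal sketch of what such a vectorize() might look like. This is a guess at the idea, not the code from the statistics module experiment: it handles only two-argument functions, treats lists and tuples as vectors, and always returns a tuple for vector input.

```python
import operator
from itertools import repeat


def vectorize(func):
    """Wrap a two-argument scalar function so it also accepts vectors."""
    def vectorized(left, right):
        left_is_vector = isinstance(left, (list, tuple))
        right_is_vector = isinstance(right, (list, tuple))
        if not left_is_vector and not right_is_vector:
            return func(left, right)
        if left_is_vector and right_is_vector and len(left) != len(right):
            raise ValueError("vectors must have the same length")
        # A scalar operand is repeated to line up with the vector operand.
        lefts = left if left_is_vector else repeat(left)
        rights = right if right_is_vector else repeat(right)
        return tuple(func(x, y) for x, y in zip(lefts, rights))
    return vectorized


divide = vectorize(operator.truediv)
add = vectorize(operator.add)

print(divide(6, 3))                  # 2.0
print(divide((6, 60, 600), 3))       # (2.0, 20.0, 200.0)
print(add([1, 2, 3], [10, 20, 30]))  # (11, 22, 33)
```

Because operator.truediv is used, the results are floats rather than the integers shown in the comments above.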
I think that vectorized functions would be very useful in Python. Performance need not be super-fast -- numpy, numba, and the other heavy-duty third-party tools would continue to dominate the high-performance scientific computing niche, but they should be at least no worse than the equivalent code using a loop.
If you had a vectorized add() function, your example:
a;b;c = x1;y1;z1 + x2;y2;z2
would become:
a, b, c = add([x1, y1, z1], [x2, y2, z2])
Okay, it's not as nice looking as the + operator, but it will do. Or you could subclass list to do this instead of concatenation.
I would support the addition of a vectorize() function which took an arbitrary scalar function, and returned a vectorized version:
func = vectorize(lambda x, y: 2*x + y**3 - x*y/3)
a, b, c = func(vector_x, vector_y)
being similar to:
f = lambda x, y: 2*x + y**3 - x*y/3
a, b, c = [f(x, y) for x, y in zip(vector_x, vector_y)]
[...]
For example, we can calculate the inner product between two vectors like follows (inner product is the sum of component wise multiplication of two vectors):
innerProduct =0 innerProduct += $a * $b
which is equivalent to innerProduct=0 for i in range(len(a)): ...innerProduct += a[i]*b[i]
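from functools import reduce
import operator

def mult(*vectors):
    # Yield the product of the i-th components of all the vectors.
    for t in zip(*vectors):
        yield reduce(operator.mul, t)

innerProduct = sum(mult(a, b))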
-- Steve
On 2016-01-28 00:12:51, "Steven D'Aprano" steve@pearwood.info wrote:
I think you would be better off trying to get better support for vectorized operations into Python:
vec = add(u, v)
is nearly as nice looking as u + v, and it need not even be a built-in. It could be a library.
[snip]
An alternative would be to add an element-wise class:
class Vector:
    def __init__(self, *args):
        self.args = args

    def __str__(self):
        return '<%s>' % ', '.join(repr(arg) for arg in self.args)

    def __add__(self, other):
        if isinstance(other, Vector):
            return Vector(*[left + right for left, right in
                            zip(self.args, other.args)])
        return Vector(*[left + other for left in self.args])

    def __iter__(self):
        return iter(self.args)
Then you could write:
a, b, c = Vector(x1, y1, z1) + Vector(x2, y2, z2)
I wonder whether there's a suitable pair of delimiters that could be used to create a 'literal' for it.
On Jan 27, 2016, at 16:12, Steven D'Aprano steve@pearwood.info wrote:
I think you would be better off trying to get better support for vectorized operations into Python:
I really think, at least 90% of the time, and probably a lot more, people are better off just using numpy than reinventing it. Obviously, building a "statistics-without-numpy" module to be added to the stdlib is an exception. But otherwise, the fact that numpy already exists, and has had a couple decades of heavy use and expert attention and two predecessor libraries to work out the kinks in the design, means that it's likely to be better, even for your limited purposes, than any limited-purpose thing you come up with.
There are a lot more edge cases than you think. For example, you thought far enough ahead that your sum works column-wise on 2D arrays. But what about when you need to work row-wise? What's the best interface: an axis parameter, or a transpose function (hey, you can even just use zip)? How do you then extend whichever choice you made to 3D? Or to when you want to get the total sum across both axes? For another example: should I be able to use vectorize to write a function of two arrays, and then apply it to a single N+1-D array, or is that going to cause more confusion than help? And so on. I wouldn't trust my own a priori intuition on those questions, so I'd go look at APL, J, MATLAB, R, and maybe Mathematica and see how their idioms best translate to Python in a variety of different kinds of problems. And I'd probably get some of it wrong, as numpy's ancestors did, and then have to agonize over compatibility-breaking changes.
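The zip remark can be illustrated with plain lists of tuples. This is only a sketch for the 2D case, using the sample data from the mean() example; it says nothing about how a 3D interface should look.

```python
rows = [(1, 10, 100), (2, 20, 200), (3, 30, 300)]

# Column-wise: zip(*rows) transposes, so each item is one column.
column_sums = [sum(column) for column in zip(*rows)]

# Row-wise: sum each row directly.
row_sums = [sum(row) for row in rows]

# Total across both axes.
grand_total = sum(row_sums)

print(column_sums)  # [6, 60, 600]
print(row_sums)     # [111, 222, 333]
print(grand_total)  # 666
```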
And after all that, what would be the benefit? I no longer have to install numpy--but now I have to install pyvec instead. Which is just a less-featureful, less-tested, less-optimized, and less-refined numpy.
If there's something actually _wrong_ with numpy's design for your purposes (and you can't trivially wrap it away), that's different. Maybe you could do a whole lot lazily by sticking to the iterator protocol? (There's a nifty Haskell package for vectorizing lazily that might be worth looking at, as long as you can stand reading about everything in terms of monadic lifting where you'd say vectorize, etc.) But "I want the same as numpy but less good" doesn't seem like a good place to start, because at best, that's what you'll end up with.
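As a rough idea of what "sticking to the iterator protocol" could look like, the sketch below lifts an ordinary function so that it works lazily on iterables. The name `vectorize_lazy` is hypothetical, and this is not how numpy's vectorize behaves; it relies only on the built-in `map`.

```python
import itertools
import operator


def vectorize_lazy(func):
    """Return a version of func that maps lazily over its iterable arguments."""
    def lifted(*iterables):
        # map() is lazy in Python 3: nothing is computed until it is consumed.
        return map(func, *iterables)
    return lifted


lazy_add = vectorize_lazy(operator.add)

# Works even on infinite inputs, because results are produced on demand.
evens = lazy_add(itertools.count(0), itertools.count(0))
print(list(itertools.islice(evens, 5)))  # [0, 2, 4, 6, 8]
```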
On Wed, Jan 27, 2016 at 05:51:46PM -0800, Andrew Barnert wrote:
On Jan 27, 2016, at 16:12, Steven D'Aprano steve@pearwood.info wrote:
I think you would be better off trying to get better support for vectorized operations into Python:
I really think, at least 90% of the time, and probably a lot more, people are better off just using numpy than reinventing it.
Oh I agree.
[...]
There are a lot more edge cases than you think. For example, you thought far enough ahead that your sum works column-wise on 2D arrays. But what about when you need to work row-wise?
I thought of all those questions, and honestly I'm not sure what the right answer is. But the nice thing about writing code for the simple use-cases is that you don't have to worry about the hard use-cases :-)
I'm mostly influenced by the UI of calculators like the HP-48GX and the TI CAS calculators, and typically they don't give you the option. If you want to do an average across the row, transpose your data :-)
And after all that, what would be the benefit? I no longer have to install numpy--but now I have to install pyvec instead. Which is just a less-featureful, less-tested, less-optimized, and less-refined numpy.
True, true, and for many people that's probably a deal-breaker. But for others, more features == more things you don't understand and don't know why you would ever need them.
Anyway, I'm not proposing that any of this should end up in the stdlib, so while I could waffle on for hours, I should bring this to a close before it goes completely off-topic.
-- Steve
I would really like
(a;b;c) in L
vs
a in L and b in L and c in L
or
all(i in L for i in (a,b,c))
because readability matters a lot.
But if I understand correctly, this is not what we could get from your proposal, because a;b;c is not an expression. Right?
So we have to write something like
vec=[None]*3
vec=(a;b;c) in L
all(vec) # which is now equivalent to (a in L and b in L and c in L)
vec=[]*10
$vec = $u + $v
The first line is a mistake (as somebody has already pointed out), but why not simply:
vec = list($u +$v)
Because $u+$v is not an expression. It is a construct for "unpacking" operations. It could be useful to have an "operator" (calling it an operator is misleading, because the result is not an object) to go back to a Python variable (but of which type? A tuple?). Probably $(a;b) could be transformed into ("return") the tuple (a, b).
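For comparison, the element-wise sum that `$vec = $u + $v` is meant to express can already be written with `zip` in current Python. The values of `u` and `v` below are made up for illustration.

```python
u = [1, 2, 3]
v = [10, 20, 30]

# One addition per position, collected into a new list.
vec = [x + y for x, y in zip(u, v)]
print(vec)  # [11, 22, 33]

# To overwrite an existing list in place instead, assign to a full slice.
result = [None] * len(u)
result[:] = (x + y for x, y in zip(u, v))
print(result)  # [11, 22, 33]
```

Note that `zip` stops at the shorter input, so vectors of unequal length are silently truncated rather than rejected.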
a=[1,2] print(a) print($a)
[1,2] 1 2 (1,2)
So I could write
a in L and b in L and c in L
as
all($((a;b;c) in L)) # which is much less nice than "(a;b;c) in L"
and
all(i in L for i in (a,b,c)) # which is similarly readable and doesn't need language changes
s=0 s;s;s += a;b;c; d;e;f which result in s being ad+b,ce+df
do you mean a*d+b*e+c*f ?
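Assuming the intended meaning of the statement quoted above is the inner product, its line-by-line expansion can be spelled out in current Python as follows. The numeric values are only an example.

```python
a, b, c = 1, 2, 3
d, e, f = 4, 5, 6

# Written out statement by statement, as the "respectively" expansion would do.
s = 0
s += a * d
s += b * e
s += c * f
print(s)  # 32

# The same computation with zip and sum.
s2 = sum(x * y for x, y in zip((a, b, c), (d, e, f)))
print(s2)  # 32
```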
Your idea is interesting. If you would like to test it, you could improve the implementation of the following function and play with it (you will probably find some caveats):
def respectively(statement):
    if statement != 'a;b=b;a':
        raise SyntaxError("only supported statement is 'a;b=b;a'")
    exec('global a\nglobal b\na=b\nb=a')
a, b = 1, 2
respectively('a;b=b;a')
print(a, b)
2 2
Unfortunately this works only in a global context, due to limitations of the 'exec' and 'locals' functions.
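One possible way to generalize the toy function a little is sketched below. It is an assumption-laden experiment, not a proposed implementation: it handles only a single plain `=` assignment, splits naively on `;`, and takes an explicit `namespace` dict (a parameter introduced here) so that it does not depend on `globals()` or `locals()`.

```python
def respectively(statement, namespace):
    """Expand 'a;b = b;a' into 'a = b' and 'b = a', then run them in order."""
    # Naive parsing: exactly one plain '=' is assumed (no '==', '+=', etc.).
    target_part, value_part = statement.split('=', 1)
    targets = [t.strip() for t in target_part.split(';')]
    values = [v.strip() for v in value_part.split(';')]
    if len(values) == 1:
        # A side without ';' is repeated for every position.
        values = values * len(targets)
    if len(targets) != len(values):
        raise SyntaxError("both sides need the same number of ';' parts")
    for target, value in zip(targets, values):
        # Each generated statement is executed one by one, not simultaneously.
        exec(f"{target} = {value}", namespace)


ns = {'a': 1, 'b': 2}
respectively('a;b = b;a', ns)
print(ns['a'], ns['b'])  # 2 2
```

As with the original toy, the result is `2 2` rather than a swap, because the second generated statement already sees the updated `a`. That is one of the caveats of defining the construct as "run the statements one by one".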
On Fri, Jan 29, 2016 at 11:04 AM, Pavol Lisy pavol.lisy@gmail.com wrote:
I would really like
(a;b;c) in L
vs
a in L and b in L and c in L
or
all(i in L for i in (a,b,c))
because readability matters a lot.
But if I understand correctly, this is not what we could get from your proposal, because a;b;c is not an expression. Right?
So we have to write something like
vec=[None]*3
vec=(a;b;c) in L
all(vec) # which is now equivalent to (a in L and b in L and c in L)
That's right. Instead, we can get it this way: a;b;c = $L which is equivalent to a=L[0] b=L[1] c=L[2]
but as others suggested for this particular example, we can already get it from unpacking syntax, i.e. a,b,c = L
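In current Python, a sequence unpacks directly on the right-hand side of an assignment, with no star needed there. A short illustration, with made-up values:

```python
L = [10, 20, 30]

# Plain sequence unpacking: the number of targets must match len(L).
a, b, c = L
print(a, b, c)  # 10 20 30

# Extended unpacking collects the remainder into a list.
first, *rest = L
print(first, rest)  # 10 [20, 30]
```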
vec=[]*10
$vec = $u + $v
The first line is a mistake (as somebody has already pointed out), but why not simply:
vec = list($u +$v)
Because $u+$v is not an expression. It is a construct for "unpacking" operations. It could be useful to have an "operator" (calling it an operator is misleading, because the result is not an object) to go back to a Python variable (but of which type? A tuple?). Probably $(a;b) could be transformed into ("return") the tuple (a, b).
a=[1,2] print(a) print($a)
[1,2] 1 2 (1,2)
So I could write
a in L and b in L and c in L
as
all($((a;b;c) in L)) # which is much less nice than "(a;b;c) in L"
and
all(i in L for i in (a,b,c)) # which is similarly readable and doesn't need language changes
s=0 s;s;s += a;b;c; d;e;f which result in s being ad+b,ce+df
do you mean a*d+b*e+c*f ?
Yes, Oops, it was a typo.
Your idea is interesting. If you would like to test it, you could improve the implementation of the following function and play with it (you will probably find some caveats):
def respectively(statement):
    if statement != 'a;b=b;a':
        raise SyntaxError("only supported statement is 'a;b=b;a'")
    exec('global a\nglobal b\na=b\nb=a')
a, b = 1, 2
respectively('a;b=b;a')
print(a, b)
2 2
Unfortunately this works only in a global context, due to limitations of the 'exec' and 'locals' functions.
Sounds good. I'd like to experiment with it actually.
Pavol Lisy writes:
I would really like
(a;b;c) in L
Not well-specified (does order matter? how about repeated values? is (a;b;c) an object? it sure looks like one, and if so, object in L already has a meaning). But for one obvious interpretation:
{a, b, c} <= set(L)
and in this interpretation you should probably optimize to
{a, b, c} <= L
by constructing L as a set in the first place. Really this thread probably belongs on python-list anyway.
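A small check of that interpretation, with made-up values, comparing the generator form and the subset form:

```python
a, b, c = 1, 2, 3
L = [3, 1, 4, 1, 5, 9, 2, 6]

by_all = all(i in L for i in (a, b, c))  # scans the list once per item
by_set = {a, b, c} <= set(L)             # subset test; items must be hashable
print(by_all, by_set)  # True True

# A value that is missing from L makes both forms False.
print(all(i in L for i in (a, 7)), {a, 7} <= set(L))  # False False
```

The set form ignores order and repeated values and requires hashable elements, which is one way of settling the questions raised above.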
Thanks everyone for your feedback. I think I have a clearer view of it as a result. It seems the most important feature is the vector operation aspect of it. Also, the magical behavior and the fact that $ or ;;; does not produce types are troublesome. Also, some of the other aspects are already covered by existing notation such as x,y = 1+2, 3+4, so we're not gaining anything there.
I have some ideas to address the concerns and will post them later.
Moj
On Fri, Jan 29, 2016 at 12:32 PM, Stephen J. Turnbull stephen@xemacs.org wrote:
Pavol Lisy writes:
I would really like
(a;b;c) in L
Not well-specified (does order matter? how about repeated values? is (a;b;c) an object? it sure looks like one, and if so, object in L already has a meaning). But for one obvious interpretation:
{a, b, c} <= set(L)
and in this interpretation you should probably optimize to
{a, b, c} <= L
by constructing L as a set in the first place. Really this thread probably belongs on python-list anyway.