Syntax idea for 2D lists\arrays

Once there was a discussion about alternative syntaxes for creating and assigning lists and arrays. I think this is an area with room for ideas, and one that is used frequently. I had some ideas for such a syntax long ago, and I remembered them because of the recent topic "Descouraging the implicit string concatenation".

The idea is a syntax for 2D arrays/lists, which should remove some editing boilerplate when working with arrays and improve readability for bigger arrays.

Let's start with a simple list example:

    L ===
        1   5   9   155
        53  44  44  34

returns a 2D list:

    [[1, 5, 9, 155], [53, 44, 44, 34]]

Syntax rules here:
- "===" indicates the start of parsing (may be some other decorator)
- the data block *always* starts on a new line and is always parsed like a "def:" block (by indentation, until the next dedent)
- each new line is the next element
- inner elements are separated by white space (tabs, spaces)

Line continuation works as usual:

    L ===
        2  1  5  \
        3  4  6

returns:

    [[2, 1, 5, 3, 4, 6]]

SPECIAL CASE: a 1D list could be explicitly marked with e.g. *:

    L ===*
        2  1  5  3  4  6

returns:

    [2, 1, 5, 3, 4, 6]

Also, in the 1D case one might omit the line continuation \:

    L ===*
        "msg1"
        "msg2"
        "error"

returns:

    ["msg1", "msg2", "error"]

IMO the latter looks nice and is very useful for pretty-formatting of string lists.

----

*Examples for working with 3D arrays:*

    import numpy as np
    A = np.zeros((2,3,2))

    A[0,0] ===
        8  8
    A[1] ===
        0    0
        155  1
        1    1

returns:

    [[[ 8. 8.]
      [ 0. 0.]
      [ 0. 0.]]

     [[ 0. 0.]
      [ 155. 1.]
      [ 1. 1.]]]

Compared to the same operation in current syntax:

    A[0, 0] = [8, 8]
    A[1] = [
        [0, 0],
        [155, 1],
        [1, 1] ]

(which is ugly, though it can be a bit more compact in the simplest case)

I think these examples show the general idea already.

Pros:
- clean, readable look
- copy-pasting tables from e.g. Excel with little to no re-typing
- no more editing annoyance with trailing commas/brackets

Cons:
- less compact for trivial cases (a newline is always added)

-----

Problems that I see with this syntax:

*Comma-separated vs space-separated*.
Space-separated is IMO more readable in general, though a single space can be too dense with some proportional fonts. In that case two or more spaces could be better. Also consider expressions inside arrays, e.g.:

    SS ===*
        11 + 3   5   6

    SS ===*
        11 + 3,  5,  6

I would not say the comma really improves things here, but opinions may vary. Still, all in all, I am on the side of space-separated. Also, numpy prints arrays space-separated by default.

------

How do you like such a syntax? Maybe someone can come up with counter-examples where it leads to an ugly look?

Mikhail
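For readers who want to try the layout without any new syntax, the rules listed in the proposal (one row per line, white-space-separated cells, an optional flat 1D mode) can be approximated today with a small helper. This is only a sketch under assumptions: the name `table` and its `flat` flag are hypothetical, cells are restricted to Python literals, and a string cell containing white space would not survive the split.

```python
import ast


def table(text, flat=False):
    """Parse a block of white-space-separated literals.

    Each non-empty line becomes one row; with flat=True the rows are
    merged into a single 1D list (the '===*' case of the proposal).
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip the blank lines around the data block
        # literal_eval only accepts literals, so names and calls are rejected
        rows.append([ast.literal_eval(cell) for cell in line.split()])
    if flat:
        return [cell for row in rows for cell in row]
    return rows


L = table("""
    1   5   9   155
    53  44  44  34
""")

msgs = table("""
    "msg1"
    "msg2"
    "error"
""", flat=True)

print(L)
print(msgs)
```

Because every cell goes through `ast.literal_eval`, expressions such as `11 + 3` are not handled here, which is exactly the open question discussed in the rest of the thread.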

On 3/14/2018 8:32 PM, Mikhail V wrote:
I would rather type this, with or without a return. Literal arrays are not this common. When the entries are constant, a tuple of tuples is much more efficient, as it is built at compile time.
No, this would return "msg1msg2error"
-- Terry Jan Reedy
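The objection about "msg1msg2error" refers to how Python already treats adjacent string literals: they are joined into one string, even across lines inside brackets. A short illustration (the variable names are only for this example):

```python
# Adjacent string literals are concatenated by the compiler,
# so this is ONE string, not three items.
joined = (
    "msg1"
    "msg2"
    "error"
)

# With commas, the same layout gives a list of three strings.
as_list = [
    "msg1",
    "msg2",
    "error",
]

print(repr(joined))
print(as_list)
```

So a white-space-separated list of strings would have to override a rule that existing code already relies on.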

On Thu, Mar 15, 2018 at 2:10 AM, Terry Reedy <tjreedy@udel.edu> wrote:
Ah well, if we are already into implementation details - sure, a complete solution would be a whole new layer in the parser - lists, tuples, lists of tuples, dicts. It would be what I suppose is called a [new to Python] micro-language, with its own preprocessing.

On Thu, Mar 15, 2018 at 01:32:35AM +0100, Mikhail V wrote:
I don't understand; we already have perfectly good syntax for working with 2D arrays.
We already have:

    L = [[1, 5, 9, 155], [53, 44, 44, 34]]

which is more compact (one line rather than two) and explicitly delimits the start and end of each list. Like everything else in Python, it uses commas to separate items, not whitespace. If you prefer:

    L = [[1, 5, 9, 155],
         [53, 44, 44, 34]]

Using spaces to separate items has the fatal flaw that it cannot distinguish

    x - y 0  # two items, the expression `x - y` and the integer 0

from:

    x - y 0  # three items, `x`, `-y`, and 0

making it ambiguous. I stopped reading your post once I realised that.

-- Steve
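One way to see the ambiguity concretely: Python's tokenizer discards the spacing between tokens, so differently spaced versions of the same text arrive at the parser as the same token stream. The sketch below uses the standard `tokenize` module; the helper name `token_strings` is made up for this illustration.

```python
import io
import tokenize


def token_strings(source):
    """Return the token strings of one line, ignoring line-structure tokens."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    readline = io.StringIO(source + "\n").readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in skip
    ]


spaced = token_strings("x - y 0")
glued = token_strings("x -y 0")

print(spaced)
print(glued)
print(spaced == glued)
```

Both spellings yield the same four tokens, so a parser that sees only tokens has nothing to tell "two items" from "three items".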

I'd just do this, which works today:

==================
import numpy
import io

ar = numpy.loadtxt(io.StringIO("""
1 5 9 155
53 44 44 34
"""))
==================

Of course, this is only worth the trouble if you are somehow loading a very large matrix. (And then, are you sure you want to embed it in your code?)

Stephan

2018-03-15 6:15 GMT+01:00 Steven D'Aprano <steve@pearwood.info>:

On Thu, Mar 15, 2018 at 6:15 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Mar 15, 2018 at 01:32:35AM +0100, Mikhail V wrote:
When you say "it cannot distinguish", what is "it"? Do you mean the current parser can't separate the items because of the allowed unary negation operator? Well, that is not how I would parse the table. For this case my parsing rule would be:

    x - y 0   ->  [x - y, 0]
    x-y 0     ->  [x - y, 0]
    x -y 0    ->  [x, -y, 0]

That's in case I use the same character for unary negation and the minus operator (which I find rather inconvenient for parsing).

Mikhail
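The rule described above depends on the white space around the minus sign: a "-" with white space before it and none after it is read as unary and starts a new item. A rough sketch of such a splitter follows. It is an interpretation of the three examples, not a specification; `split_items` is a hypothetical name, and only names, integers and the operators + - * / are recognised.

```python
import re

TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[-+*/]")
OPERATORS = set("+-*/")


def split_items(line):
    """Split one row into items using white-space-sensitive minus handling."""
    items = []  # each item is a list of token strings
    prev_is_operand = False
    for m in TOKEN.finditer(line):
        tok = m.group()
        is_operand = tok not in OPERATORS
        space_before = m.start() > 0 and line[m.start() - 1].isspace()
        glued_to_next = m.end() < len(line) and not line[m.end()].isspace()
        # " -y": a minus preceded by a space and attached to what follows
        unary_minus = tok == "-" and space_before and glued_to_next
        if not items or (prev_is_operand and (is_operand or unary_minus)):
            items.append([tok])  # start a new item
        else:
            items[-1].append(tok)  # continue the current expression
        prev_is_operand = is_operand
    # Render a leading minus without a space, everything else space-joined
    return [
        "-" + " ".join(it[1:]) if it[0] == "-" and len(it) > 1 else " ".join(it)
        for it in items
    ]


print(split_items("x - y 0"))
print(split_items("x-y 0"))
print(split_items("x -y 0"))
```

This makes the rule mechanical, but it also shows the cost: the meaning of a row now depends on spacing that Python otherwise ignores.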

On Thu, Mar 15, 2018 at 6:15 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Mar 15, 2018 at 01:32:35AM +0100, Mikhail V wrote:
Ah, sorry, my bad. Now I remember more precisely what my formatting idea originally was. This should make things clearer: the separator would be the TAB character only. (I wrote tabs or spaces in the proposal.) The root of the idea was finding a cleaner way of defining tables and data, plus some experiments with nesting syntax, etc. Back then I did not think much about supporting expressions inside elements, so this important issue slipped through.

So with the TAB separator, just think of the replacement TAB -> comma; this should support all Python expressions automatically. At least it seems so to me, but if I am delusional, please correct me.

Of course the reality is still that, sadly, most editors cannot handle tabulation adequately. But I am a believer and hope for a better future. (Heck, people are building space ships and what not, so maybe tabulation in code editors comes next?)
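The TAB -> comma replacement can be tried out directly with the standard `ast` module: join the tab-separated cells with commas, wrap them in brackets, and let the ordinary parser handle each cell. The sketch below only parses, it does not evaluate anything; the helper name `tab_row_elements` is hypothetical, and `ast.unparse` needs Python 3.9 or later.

```python
import ast
import re


def tab_row_elements(line):
    """Turn one TAB-separated row into list source and return its elements."""
    cells = [c for c in re.split(r"\t+", line.strip()) if c]
    source = "[" + ", ".join(cells) + "]"
    tree = ast.parse(source, mode="eval")  # tree.body is an ast.List
    return [ast.unparse(element) for element in tree.body.elts]


two_items = tab_row_elements("x - y\t0")
three_items = tab_row_elements("x\t-y\t0")

print(two_items)
print(three_items)
```

With tabs as the only separator the two readings from earlier in the thread are spelled differently, so the ambiguity goes away. A cell that itself contains a bare top-level comma (for example `1, 2`) would still be split into two elements.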

participants (4)
- Mikhail V
- Stephan Houben
- Steven D'Aprano
- Terry Reedy