[Python-ideas] Python multi-dimensional array constructor
Chris Barker
chris.barker at noaa.gov
Thu Oct 20 18:08:54 EDT 2016
On Wed, Oct 19, 2016 at 5:32 PM, Todd <toddrjen at gmail.com> wrote:
> If there is a problem with the current options (and I'm not convinced
>> there is) it's that it isn't a literal for a multidimensional array, but
>> rather a literal for a bunch of nested lists -- the lists themselves are
>> created, and so are all the "boxed" values in the array -- only to be
>> pulled out and unboxed to be put in the array.
>>
>>
> But as you said, that is not a multidimensional array. We aren't
> comparing "a = [| 0, 1, 2 || 3, 4, 5 |]" and "a = [[0, 1, 2],[3, 4, 5]]",
> we are comparing "a = [| 0, 1, 2 || 3, 4, 5 |]" and "a = np.array([[0, 1,
> 2],[3, 4, 5]])". That is a bigger difference.
>
Well then, you have mixed two proposals here:
1) a literal syntax for nd arrays -- that is not going to fly if there is
NO ndarray object builtin to python. I kinda think there should be, though
even then there need not be a literal for it (see Decimal). So I'd say --
get an nd array object into the standard library first, then we can talk
about the literal
2) what the syntax should be for such a literal. OK, in this case, I
suggested that the way to hash that out is to start out with passing a
string to a function that constructs the array -- then you could try things
out without any additions to the language or the libraries -- it could be a
stand-alone module that extends numpy:
from ndarray_literal import nda
my_array = nda('||| 3, 4, 5 || 6, 7, 8 |||')
(is that legal in your notation -- I honestly am not sure)
and yes, that still requires typing "nda('", which you are trying to avoid.
But honestly, I really have written a lot of numpy code, and writing:
np.array( ..... )
does not bother me at all. IF I did support a literal, it would be so that
the object could be constructed immediately rather than by creating other
python objects first (lists, usually), and then making an array from that.
If you do want to push the syntax idea further, I'd suggest going to the
numpy list and seeing what folks there think.
But I can't help myself. It's clear from the other posts on the list
here that others find your proposed syntax as confusing as I do, but maybe
it could be made more clear. Taking a page from MATLAB:
1 2 3; 4 5 6
is a 2x3 2-d array. Now, in MATLAB there only used to be matrices, so this
was pretty nice, but a bit hard to extend to multiple dimensions. But the
principle is handy: one delimiter for the first dimension, another one for
the second, etc.
We probably don't want to go with trying colons, and ! and who knows what
else, so I like your idea.
a 2-d array:
1 | 2 | 3 || 4 | 5 | 6
(Or better)
1 | 2 | 3 ||
4 | 5 | 6
a 3d array:
0 | 1 | 2 | 3 ||
4 | 5 | 6 | 7 ||
8 | 9 | 10 | 11 |||
12 | 13 | 14 | 15 ||
16 | 17 | 18 | 19 ||
20 | 21 | 22 | 23 ||
Notes:
1) guess how I wrote that? I did: np.arange(24).reshape((2,3,4)) and edited
the results -- making the point that maybe the current state of affairs is
not so bad...
2) These are delimiters, rather than brackets -- so they don't go at the
beginning and are optional at the end (like commas in python)
3) It doesn't use commas at all, as having a consistent system is clearer
4) Whitespace is insignificant as with the rest of Python -- though you
want to be able to use a line ending as insignificant whitespace, so this
may have to be wrapped in parentheses, or something to actually use it --
though a non-issue if it's a string
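A rough sketch of how the delimiter idea could be tried out as a plain string-parsing function, with no language changes. The name parse_delimited and its sep parameter are made up for illustration, it only handles integers, and it returns nested lists (which could then be handed to np.array) rather than building an array directly:

```python
import re


def parse_delimited(text, sep="|"):
    """Parse a delimiter-based array string into nested lists of ints.

    One `sep` separates items along the last dimension, two in a row
    separate rows, three separate 2-d blocks, and so on.
    Hypothetical helper: only a single-character `sep` is supported.
    """
    if len(sep) != 1:
        raise ValueError("sep must be a single character")
    # Whitespace (including line endings) is treated as insignificant.
    compact = "".join(text.split())
    # The longest run of delimiters gives the number of dimensions.
    runs = re.findall(re.escape(sep) + "+", compact)
    depth = max((len(run) for run in runs), default=0)
    return _split(compact, sep, depth)


def _split(text, sep, level):
    if level == 0:
        return int(text)
    s = re.escape(sep)
    # Match a run of exactly `level` delimiters, no more and no fewer.
    pattern = "(?<!{s}){s}{{{n}}}(?!{s})".format(s=s, n=level)
    # Dropping empty pieces makes a trailing delimiter optional.
    parts = [part for part in re.split(pattern, text) if part]
    return [_split(part, sep, level - 1) for part in parts]


print(parse_delimited("1 | 2 | 3 || 4 | 5 | 6"))  # [[1, 2, 3], [4, 5, 6]]
```

Because the delimiter is a parameter, other single-character separators can be tried with the same function.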
Hmm -- about point (3), maybe use only commas:
0, 1, 2, 3,,
4, 5, 6, 7,,
8, 9, 10, 11,,,
12, 13, 14, 15,,
16, 17, 18, 19,,
20, 21, 22, 23
That would be more consistent with the rest of python, and multiple commas
in a row are currently a syntax error.
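A quick way to check the claim that doubled commas are rejected today is to ask the parser directly. The helper name is_valid_expression is made up for this sketch; ast.parse is the standard library call:

```python
import ast


def is_valid_expression(source):
    """Return True if `source` parses as a Python expression."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


print(is_valid_expression("[1, 2, 3, 4, 5, 6]"))   # True
print(is_valid_expression("[1, 2, 3,]"))           # True: one trailing comma is fine
print(is_valid_expression("[1, 2, 3,, 4, 5, 6]"))  # False: doubled comma
```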
> Even if your original data is large, I often need smaller areas when
> processing, for example for broadcasting or as arguments to processing
> functions.
>
sure, I do hard-coded arrays all the time -- but not big ones, and I don't
think I've ever needed more than 2D and certainly not more than 3D. And not
large enough that performance matters.
It is:
>>
>
> r_[[0, 1, 2], [3, 4, 5]
>
no, that's a shorthand for "row stack" -- and really not much better than
the array() call, except a few less characters
I meant the np.matrix() function that Alexander pointed out -- which is
only really there to make folks coming from MATLAB happier...(and it makes
a Matrix object, which you likely don't want). The point was that it's easy
to make such a beast for your new syntax to try it out
> b = np.array([[ 0, 1, 2 ],
> [ 3, 4, 5 ]])
>
> The whole point of this is to avoid the "np.array" call.
>
again, trying to separate out the idea of a literal, from the syntax of the
literal.
but thinking now, Python already uses (), [], {}, and < and > -- so I don't
think there are any more brackets. but why not just use commas with square
brackets:
2Darr = [1, 2, 3,,
4, 5, 6]
maybe too subtle?
> Yes, I understand that. But some projects are already doing that on their
> own. I think having a way for them to do it without losing the list
> constructor (which is the approach currently being taken) would be a
> benefit.
>
huh? is anyone actually overriding the list constructor??
multiple dims apart (my [ and ,, example shows that you can do that with
the current syntax) this is kind of like adding Decimal -- there is another
type, but does it need a literal? I have maybe 90% of the code I write with
an:
import numpy as np
at the top -- so yes, I kinda would like a literal, but it's really a
pretty small deal -- at least once I got used to it after using MATLAB for
years.
I'd ask folks that have been using numpy for a long time -- would this
really help?
One more problem -- with the addition of the @ operator, there have not
been any use cases in the stdlib, but it is an operator, and Python already
has a mechanism for operator overloading.
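For reference, here is roughly what that overloading mechanism looks like for @: any class can opt in by defining __matmul__. The Matrix class below is a made-up toy for illustration, not anything from the stdlib or numpy:

```python
class Matrix:
    """Toy 2-d matrix backed by nested lists (hypothetical example class)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Python calls this method to evaluate `self @ other`.
        cols = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )


product = Matrix([[1, 2], [3, 4]]) @ Matrix([[5, 6], [7, 8]])
print(product.rows)  # [[19, 22], [43, 50]]
```

A literal has no equivalent hook: there is no special method a third-party type can define to claim a piece of literal syntax.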
As far as I know, every python literal maps to a SINGLE type -- so creating
a literal for a nonexistent type makes no sense at all.
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov