[Numpy-discussion] Creating small arrays from strings and concatenating with empty arrays
Christopher Barker
Chris.Barker at noaa.gov
Fri May 12 21:45:08 EDT 2006
Bill Baxter wrote:
> it doesn't support the
> full syntax of the matrix string constructor, which allows for things like
>
>>>> numpy.matrix("[1 2; 2 3;3 4]")
> matrix([[1, 2],
> [2, 3],
> [3, 4]])
>
> I think an array version of the matrix string constructor that returns the
> latter would be handy.
> But it's admittedly a pretty minor thing.
I agree it's pretty minor indeed, but as long as the code is in the
matrix object, why not in the array object as well?
As I think about it, I can see two reasons:
1) Arrays are n-d. After commas and semi-colons, how do you construct a
higher-than-rank-two array?
2) Is it really that much harder to type the parentheses?
I suppose there is a bit of inefficiency in creating all those tuples,
just to have them dumped, but I can't imagine that ever really matters.
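For reference, the parenthesized form in point 2 might look like the sketch below. The values are taken from the matrix example quoted above, and the rank-three array is only an illustration of point 1, not something from the original thread.

```python
import numpy

# Nested tuples (or lists) spell out the row structure explicitly.
a = numpy.array(((1, 2), (2, 3), (3, 4)))

# The same nesting extends past rank two, which a string syntax built
# on commas and semi-colons has no obvious way to express.
b = numpy.array((((1, 2), (2, 3)), ((3, 4), (4, 5))))

print(a.shape)
print(b.shape)
```

The temporary tuples are discarded once the array is built, which is the small inefficiency mentioned above.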
By the way, you can do:
>>> a = numpy.fromstring("1 2; 2 3; 3 4", sep=" ").reshape((-1,2))
>>> a
array([[1, 2],
[2, 3],
[3, 4]])
Which, admittedly, is kind of clunky, and, in fact, the ";" is being
ignored, but you can put it there to remind yourself what you meant.
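If you want the ";" to actually mean "end of row" instead of being ignored, a small pure-Python wrapper can do it. The following is a minimal sketch, assuming a hypothetical helper named array_from_rows (it is not part of numpy). Note also that newer NumPy releases may warn about, or stop parsing at, a stray ";" in fromstring, so the wrapper avoids fromstring entirely.

```python
import numpy

def array_from_rows(text, dtype=int):
    """Hypothetical helper, not part of numpy: parse "1 2; 2 3; 3 4"."""
    rows = []
    # Drop optional surrounding brackets, then split rows on ";".
    for row_text in text.strip().strip("[]").split(";"):
        # Treat commas as whitespace, then split on runs of whitespace.
        fields = row_text.replace(",", " ").split()
        if fields:
            rows.append([dtype(f) for f in fields])
    # Unlike the reshape trick, a ragged input is reported as an error.
    lengths = {len(r) for r in rows}
    if len(lengths) > 1:
        raise ValueError("rows have different lengths: %r" % sorted(lengths))
    return numpy.array(rows, dtype=dtype)

a = array_from_rows("[1 2; 2 3;3 4]")
print(a)
```

Because the row length comes from the string itself, there is no need for the reshape((-1,2)) step.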
A note about fromstring/fromfile:
I sometimes might have a mix of separators, like the above example. It
would be nice if I could pass in more than one and have them all used. The
above example will only work if there is a space in addition to the
semi-colon. It would be nice to be able to do:
a = numpy.fromstring("1 2;2 3;3 4", sep=" ;")
or
a = numpy.fromstring("1,2;2,3;3,4", sep=",;")
and have that work.
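Until something like that exists, the behavior can be approximated outside of fromstring. This is a rough sketch, assuming a hypothetical function fromstring_multi_sep in which every character of sep counts as a separator; it is not an existing numpy call and makes no attempt at speed.

```python
import re
import numpy

def fromstring_multi_sep(text, sep, dtype=float):
    """Hypothetical helper: every character in `sep` acts as a separator."""
    if not sep:
        raise ValueError("sep must contain at least one character")
    # Build a character class such as "[ ;]+" so that runs of
    # separators collapse into a single split point.
    pattern = "[" + re.escape(sep) + "]+"
    fields = [f for f in re.split(pattern, text) if f]
    return numpy.array([dtype(f) for f in fields], dtype=dtype)

a = fromstring_multi_sep("1 2;2 3;3 4", sep=" ;", dtype=int).reshape((-1, 2))
b = fromstring_multi_sep("1,2;2,3;3,4", sep=",;", dtype=int).reshape((-1, 2))
print(a)
print(b)
```

As with the single-separator version, the result is flat, so the reshape is still needed to get rows back.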
Travis, I believe you said that this code was inspired by my Scanfile
code I posted on this list a while back. In that code, I allowed any
character that ?scanf didn't interpret as part of a number to be used as a
separator: if you asked for the next ten numbers in the file, you'd get
the next ten numbers, regardless of what was in between them.
While that seems kind of ripe for masking errors, I find that I need to
know what the file format I'm working with looks like anyway, and while
this approach might mask an error when you read the data, it'll show up
soon enough later on, and it sure does make it easy to use and code.
Maybe a special string for sep could give us this behavior, like "*" or
something.
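To make the "anything that isn't a number is a separator" idea concrete, here is a minimal sketch using a regular expression in place of ?scanf. The name scan_numbers and its count argument are assumptions for illustration; this is neither the Scanfile code nor anything in numpy.

```python
import re
import numpy

# Matches integers, decimals, and exponent forms such as -1.5e3.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def scan_numbers(text, count=-1, dtype=float):
    """Hypothetical helper: return up to `count` numbers found in `text`."""
    # Everything between matches is skipped, whatever it is.
    tokens = _NUMBER.findall(text)
    if count >= 0:
        tokens = tokens[:count]
    return numpy.array([float(t) for t in tokens]).astype(dtype)

first_three = scan_numbers("x=1, y=2.5; z=-3e2 | 4", count=3)
a = scan_numbers("1 2; 2 3; 3 4", dtype=int).reshape((-1, 2))
print(first_three)
print(a)
```

This shows the error-masking risk as well: a string like "1-2" would scan as 1 and -2, so you do need to know what the format looks like.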
I'm also not sure it's the best idea to put this functionality into
fromstring, rather than a separate function, perhaps fromtext()? (or
scantext(), or ? ) That's not a big deal, but it just seems like it's a
bit hidden there, and scanning a string is a very different operation
than interpreting that string as binary data.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov