[Numpy-discussion] Creating small arrays from strings and concatenating with empty arrays

Christopher Barker Chris.Barker at noaa.gov
Fri May 12 21:45:08 EDT 2006


Bill Baxter wrote:
> it doesn't support the
> full syntax of the matrix string constructor,  which allows for things like
> 
>>>> numpy.matrix("[1 2; 2 3;3 4]")
> matrix([[1, 2],
>       [2, 3],
>       [3, 4]])
> 

> I think an array version of the matrix string constructor that returns the
> latter would be handy.
> But it's admittedly a pretty minor thing.

I agree it's pretty minor indeed, but as long as the code is in the 
matrix object, why not in the array object as well?

As I think about it, I can see two reasons:

1) Arrays are n-d. Once commas and semi-colons are used up, how do you 
construct a higher-than-rank-two array?

2) Is it really that much harder to type the parentheses?

I suppose there is a bit of inefficiency in creating all those tuples, 
just to have them dumped, but I can't imagine that ever really matters.
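
For reference, the parenthesized form from point 2 might look like the sketch below. The variable names are only illustrative. It also shows that nesting extends to rank three and beyond, which is where a string syntax built on commas and semi-colons runs out of separators.

```python
import numpy

# Rank 2: one inner tuple per row.
a = numpy.array(((1, 2), (2, 3), (3, 4)))

# Rank 3: just add another level of nesting.
b = numpy.array((((1, 2), (3, 4)),
                 ((5, 6), (7, 8))))

print(a.shape)  # (3, 2)
print(b.shape)  # (2, 2, 2)
```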

By the way, you can do:
 >>> a = numpy.fromstring("1 2; 2 3; 3 4", sep=" ").reshape((-1,2))
 >>> a
array([[1, 2],
        [2, 3],
        [3, 4]])

Which, admittedly, is kind of clunky, and, in fact, the ";" is being 
ignored, but you can put it there to remind yourself what you meant.

A note about fromstring/fromfile:

I sometimes have a mix of separators, like the above example. It 
would be nice if I could pass in more than one and have them all get 
used. The above example will only work if there is a space in addition 
to the semi-colon. It would be nice to be able to do:

a = numpy.fromstring("1 2;2 3;3 4", sep=" ;")
or
a = numpy.fromstring("1,2;2,3;3,4", sep=",;")

and have that work.
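
Until something like that exists, the behavior can be approximated in pure Python. The following is only a sketch: split_numbers() is a hypothetical helper, not part of numpy, and it assumes every character in seps should be treated as a separator and that runs of separators count as one. It returns a plain list, which can be handed to numpy.array(...).reshape((-1, 2)) as in the example above.

```python
import re

def split_numbers(text, seps):
    """Hypothetical helper: split text on any character in seps
    and convert the non-empty pieces to float."""
    # Build a character class such as "[\ ;]+" from the separator string.
    pattern = "[" + re.escape(seps) + "]+"
    return [float(tok) for tok in re.split(pattern, text) if tok.strip()]

values = split_numbers("1 2;2 3;3 4", " ;")
values2 = split_numbers("1,2;2,3;3,4", ",;")
print(values)   # [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]
print(values2)  # [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]
```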

Travis, I believe you said that this code was inspired by the Scanfile 
code I posted on this list a while back. In that code, I allowed any 
character that ?scanf didn't interpret as part of a number to be used 
as a separator: if you asked for the next ten numbers in the file, you'd 
get the next ten numbers, regardless of what was in between them.

While that seems kind of ripe for masking errors, I find that I need to 
know what the file format I'm working with looks like anyway, and while 
this approach might mask an error when you read the data, it'll show up 
soon enough later on, and it sure does make it easy to use and code. 
Maybe a special string for sep could give us this behavior, like "*" or 
something.
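
To make the "anything that isn't a number is a separator" idea concrete, here is a rough pure-Python sketch. It is not the Scanfile code: scan_numbers() and the NUMBER pattern are made up for illustration, and a regular expression stands in for what ?scanf does in C.

```python
import re

# Optional sign, then digits with an optional fraction (or a bare
# fraction like ".25"), then an optional exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def scan_numbers(text, count=None):
    """Hypothetical helper: return the numbers found in text as floats,
    ignoring whatever lies between them. If count is given, return at
    most that many."""
    found = [float(m) for m in NUMBER.findall(text)]
    return found if count is None else found[:count]

print(scan_numbers("1,2;2 3 | 3\t4"))           # [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]
print(scan_numbers("x=1.5e3, y=-2; z=.25", 2))  # [1500.0, -2.0]
```

Note that this shows the error-masking risk too: "1-2" would be read as 1 and -2 without complaint.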

I'm also not sure it's the best idea to put this functionality into 
fromstring, rather than a separate function, perhaps fromtext()? (or 
scantext(), or ? ) That's not a big deal, but it just seems like it's a 
bit hidden there, and scanning a string is a very different operation 
than interpreting that string as binary data.
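
A small sketch of how different the two operations are. numpy.frombuffer is used for the binary case here only to keep the two side by side; the byte string and the "<i2" dtype are arbitrary choices for the illustration.

```python
import numpy

# Binary interpretation: the bytes ARE the data. Four bytes read as
# two little-endian 16-bit integers.
binary = numpy.frombuffer(b"\x01\x00\x02\x00", dtype="<i2")

# Text scanning: the characters are parsed as decimal numbers.
text = numpy.fromstring("1 2 3 4", dtype=int, sep=" ")

print(binary.tolist())  # [1, 2]
print(text.tolist())    # [1, 2, 3, 4]
```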

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov



