Converting a list to a Numarray array?

Alex Martelli aleax at aleax.it
Thu Feb 27 05:58:49 EST 2003


S Timmins wrote:

> Alex:
>       Thanks for your answer but what if the list comes from a file
> via split and:
> 
>    somelist = ['1','2','3']
> 
> Then you have to long each element because the list does not contain a
> numeric type!

If you have a list of strings, and want a numarray array of numbers,
then most assuredly you do have to convert strings into numbers,
which I assume is what you mean by that peculiar verb "long".

So, "anarray = numarray.array(map(int,somelist))" or the like (you
may use other ways than int to convert, list comprehensions in
lieu of map, and the like).
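
A minimal sketch of that conversion step, in plain Python so it runs
without numarray; the final numarray.array call appears only as a
comment and assumes numarray is installed. The names via_map,
via_comprehension and as_floats are made up for illustration.

```python
# Strings as they might come out of str.split() on a line of the file.
somelist = ['1', '2', '3']

# Two equivalent ways to turn the strings into ints.
via_map = list(map(int, somelist))
via_comprehension = [int(s) for s in somelist]

# Use float (or another converter) instead of int if the data calls for it.
as_floats = [float(s) for s in somelist]

# With numarray installed, the numeric list can then be handed over:
#   import numarray
#   anarray = numarray.array(via_comprehension)
```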


>  I am also confused about how to ignore certain "records" from a
> similar list from reading this file:
> #Zip   Julian  mm dd year icd9  age m/f race   <--- NOT in file
> 03901 2451634 03 30 2000 V3000   0 1 1
> 03901 2451642 04 07 2000 5990    0 2 1
> 03901 2451648 04 13 2000 3004   16 2 1
> 
> I do this:
>  input = open(dir+file)
>  s = input.read()
>  buf = s.split()

Hmmm -- so basically you're throwing away the _structure_ of the
file.  I think you'd be much happier NOT doing that, i.e.:

buf2 = [ line.split() for line in input ]

KEEP the structure rather than throwing it away.
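
As a self-contained sketch of the same idea, the snippet below uses
io.StringIO as a stand-in for the real file (an assumption, so that it
runs anywhere) and the three sample rows quoted above. Skipping blank
lines is an extra precaution, not something the original data is known
to need.

```python
import io

# Stand-in for the real file; the rows are the sample data from the post.
sample = io.StringIO(
    "03901 2451634 03 30 2000 V3000   0 1 1\n"
    "03901 2451642 04 07 2000 5990    0 2 1\n"
    "03901 2451648 04 13 2000 3004   16 2 1\n"
)

# One inner list per line; blank lines are skipped so no empty records appear.
buf2 = [line.split() for line in sample if line.strip()]
```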

> but wish I could select the lines in buf (assuming it were 2-D) to be

The way I just built it, buf2 _IS_ "2-D" -- a list of lists.

> converted and say choose just the "rows" (lines) which contain say the
> "V3000" code in the icd9 "field".

chosen = [ record for record in buf2 if record[5]=='V3000' ]

This relies on the _numeric index_ of the relevant field rather
than on "field names", since we haven't defined any "names" for
the "columns" of each record.  That's not too hard to fix, by
defining a suitable wrapper object, but it would affect only the
readability of the resulting code, not its functionality.
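
One possible shape for such a wrapper, sketched with
collections.namedtuple (an assumption: it needs a Python version that
ships it). The field names are hypothetical, borrowed from the header
comment in the question, with "m/f" renamed to "sex" because it is not
a valid identifier.

```python
from collections import namedtuple

# Hypothetical field names, one per column of the sample data.
Record = namedtuple("Record", "zipcode julian mm dd year icd9 age sex race")

buf2 = [
    ["03901", "2451634", "03", "30", "2000", "V3000", "0", "1", "1"],
    ["03901", "2451642", "04", "07", "2000", "5990", "0", "2", "1"],
    ["03901", "2451648", "04", "13", "2000", "3004", "16", "2", "1"],
]

# Wrap each list of fields so columns can be addressed by name.
records = [Record(*fields) for fields in buf2]

# Same selection as before, but by field name instead of index 5.
chosen = [rec for rec in records if rec.icd9 == "V3000"]
```

Each Record is still a tuple, so rec[5] keeps working alongside
rec.icd9.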


> Since Python only seems to have 1-D lists it is difficult to "reshape"

Maybe "seems" is the operative word.  Why won't lists of lists work
just as well as the "2-D lists" you're after?

> buf to a Number of lines by 9 list and then "slice" the icd9 field to
> choose just the desired records.

If some evil witch gave you a flattened-out list of 9*N items
it wouldn't be TOO hard to rebuild a list of N lists of 9 items
each, actually:

buf2 = [ buf[i:i+9]  for i in range(0, len(buf), 9) ]

but it's easier to make buf2 at once rather than to make buf
and then have to work to make buf2 out of it, when feasible;-).
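
A small sketch of that regrouping wrapped in a helper; the name chunk
and the length check are additions for illustration, not part of the
one-liner above.

```python
def chunk(flat, width):
    """Split a flat list into consecutive sublists of `width` items."""
    if len(flat) % width != 0:
        raise ValueError(
            "length %d is not a multiple of %d" % (len(flat), width)
        )
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# Two records' worth of fields, flattened the way read().split() leaves them.
buf = (
    "03901 2451634 03 30 2000 V3000 0 1 1 "
    "03901 2451642 04 07 2000 5990 0 2 1"
).split()

buf2 = chunk(buf, 9)
```

The check guards against a short or corrupted record silently shifting
every later field into the wrong column.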

>    There must be some tricks I don't know to handle this...

I think the main "tricks" you may not know are: items of lists
can be anything, including other lists; list comprehensions are
handy, though of course you could easily write loops that do
just the same thing; and some tools such as map are also handy,
though the "could write it more explicitly" caveat applies even
more to such tools.
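
To make the "could write it more explicitly" point concrete, here is a
sketch of one selection written both ways; the variable names are
illustrative only.

```python
buf2 = [
    ["03901", "2451634", "03", "30", "2000", "V3000", "0", "1", "1"],
    ["03901", "2451642", "04", "07", "2000", "5990", "0", "2", "1"],
]

# List comprehension form.
chosen = [record for record in buf2 if record[5] == "V3000"]

# The same selection as an explicit loop.
chosen_loop = []
for record in buf2:
    if record[5] == "V3000":
        chosen_loop.append(record)
```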

With good old Numeric you could also build an array of generic
Python objects then reshape it at will -- all it took was to
use typecode PyObject -- but I don't know if numarray also has
similarly handy functionality (maybe some numarray expert can
help here...?).  But even if you have to work with native
Python stuff, you're not badly placed, anyway.


Alex




