[Numpy-discussion] Possible example application of the array interface

Michael Sorich mike_lists at yahoo.com.au
Wed Apr 6 10:12:39 EDT 2005

I think that this is a great idea! While I have a
strong preference for Python, I generally use R for
statistical analyses due to the large number of mature
libraries available. There are also some aspects of
the R data types (e.g. data-frames and column/row names
for 2D arrays) that are really nice for
spreadsheet-like data. I hope that scipy.base record
arrays will be as easily manipulated as data-frames are.

While RPy works well for small, simple problems, there
are data conversion limitations between R and Python.
If one could efficiently convert between the major R
data types and Python scipy.base data types without
loss of data, it would become possible to do most of
the data manipulation in Python and freely mix in R
functions when required. This may encourage the use of
Python for the development of statistical routines.

From my meager understanding of RPy:

R vectors are converted to python lists. It may make
more sense to convert them to an array (either stdlib
or scipy.base version) - without copying data if
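
To illustrate what a copy-free conversion could look like, here is a minimal sketch. It assumes modern NumPy (not the Numeric or scipy.base of this thread) and uses a stdlib array as a stand-in for an R numeric vector; nothing here is RPy's actual behaviour.

```python
import array

import numpy as np

# Stand-in for an R numeric vector: a stdlib array of C doubles.
values = array.array("d", [1.0, 2.0, 3.0, 4.0])

# np.frombuffer wraps the existing memory instead of copying it.
view = np.frombuffer(values, dtype=np.float64)

# A write through the original container is visible through the view,
# which shows that both objects share one block of memory.
values[0] = 10.0
shared = bool(view[0] == 10.0)
print(view, shared)
```

The sharing here goes through Python's buffer protocol, which is a close relative of the array interface discussed in this thread; an R binding would have to expose R's own memory in a comparable way.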

R arrays and matrices are converted to Numeric arrays.

In [8]: r.array([1,2,3,4,5,6],dim=[2,3])
array([[1, 3, 5],
       [2, 4, 6]])

However, column and row names (or dimnames for arrays
with >2 dimensions) are lost in R->Py conversion. I do
not know whether these conversions require copying of
the data.
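
As a rough sketch of the two points above, the following reproduces the column-by-column fill of the R call with modern NumPy (an assumption, since RPy at this time returns Numeric arrays) and keeps the names in a separate dictionary. The labels and the lookup helper are hypothetical and exist only for illustration.

```python
import numpy as np

# R fills arrays column by column, so order="F" (Fortran order)
# reproduces the layout of r.array([1,2,3,4,5,6], dim=[2,3]).
matrix = np.arange(1, 7).reshape((2, 3), order="F")

# Hypothetical labels; a plain array has no slot for them, so they
# are kept in a separate dict keyed by axis.
dimnames = {"rows": ["r1", "r2"], "cols": ["a", "b", "c"]}


def lookup(row_name, col_name):
    """Return the element addressed by a row name and a column name."""
    i = dimnames["rows"].index(row_name)
    j = dimnames["cols"].index(col_name)
    return int(matrix[i, j])


print(matrix)
print(lookup("r2", "b"))
```

Carrying the names next to the array like this is only a workaround; the names still have to be fetched from R separately and kept in sync by hand.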

R data-frames are currently converted to Python
dictionaries and I don’t think that there is any
simple way to convert a Python object to an R data
frame. This is the biggest limitation of RPy in my

In [16]:
Out[16]: {'col2': ['one', 'two', 'three', 'four'],
'col1': [1, 2, 3, 4]}

If it were possible to convert between an R data-frame
and a scipy.base record array without copying or
losing data, RPy would become more useful.
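
For a feel of the target type, here is a small sketch that turns the dictionary from Out[16] into a record array. It assumes modern NumPy's np.rec module as a stand-in for scipy.base record arrays, and it does copy the data, so it only shows the shape of the result, not the copy-free conversion wished for above.

```python
import numpy as np

# The dictionary shown in Out[16] above.
frame = {"col2": ["one", "two", "three", "four"], "col1": [1, 2, 3, 4]}

# Sort the keys so the column order is deterministic: col1, col2.
names = sorted(frame)

# Build one array per column, then combine them into a record array.
columns = [np.asarray(frame[name]) for name in names]
records = np.rec.fromarrays(columns, names=",".join(names))

print(records.dtype.names)  # field names, in column order
print(records.col1)         # a whole column, by attribute access
print(records[2])           # a whole row
```

Columns keep their own types (integers and strings here), which is the main property a data-frame needs and a plain 2D array lacks.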

I wish I understood C, scipy.base and R well enough to
give this a go. However, this is way over my head!


--- Magnus Lie Hetland <magnus at hetland.org> wrote:
> I was just thinking about some experimental designs, and whether I
> could, perhaps, do the statistics in Python. I remembered having used
> RPy [1] briefly at some time (there may be other similar bindings out
> there -- I don't remember) and started thinking about whether I could,
> perhaps, combine it with numpy in some way. My first thought was to
> reimplement the relevant statistical functions; then I thought about
> how to convert data back and forth -- but then it occurred to me that
> R also uses arrays extensively, and that it could, perhaps, be
> possible to expose those (through something like RPy) through the
> array interface/protocol!
>
> This would be (IMO) a good example of the benefits of the array
> protocol; it's not a matter of "getting yet another array module". RPy
> is an external library/language with *lots* of features that might be
> useful to numpy users, many of which aren't likely to be implemented
> in Python for quite a while, I'd guess (unless, perhaps, someone
> writes a translator from R, which I'm sure is doable).
>
> I don't know enough (at least yet ;) about the implementation of RPy
> and the R library to say for sure whether this would even be possible,
> but it does seem like it could be really useful...
>
> [1] rpy.sf.net
> -- 
> Magnus Lie Hetland                    Fall seven times, stand up eight
> http://hetland.org                    [Japanese proverb]
