[Numpy-discussion] when did column_stack become C-contiguous?

Mon Oct 19 08:55:05 EDT 2015

On Mon, Oct 19, 2015 at 2:14 AM, Nathaniel Smith <njs at pobox.com> wrote:

> On Sun, Oct 18, 2015 at 9:35 PM,  <josef.pktd at gmail.com> wrote:
> > >>> np.column_stack((np.ones(10), np.ones(10))).flags
> >   C_CONTIGUOUS : True
> >   F_CONTIGUOUS : False
> >
> > >>> np.__version__
> > '1.9.2rc1'
> >
> >
> > on my notebook which has numpy 1.6.1 it is f_contiguous
> >
> >
> > I was just trying to optimize a loop over variable adjustment in regression,
> > and found out that we lost fortran contiguity.
> >
> > I always thought column_stack is for fortran usage (linalg)
> >
> > What's the alternative?
> > column_stack was one of my favorite commands, and I always assumed we have
> > in statsmodels the right memory layout to call the linalg libraries.
> >
> > ("assumed" means we don't have timing nor unit tests for it.)
>
> In general practice no numpy functions make any guarantee about memory
> layout, unless that's explicitly a documented part of their contract
> (e.g. 'ascontiguous', or some functions that take an order= argument
> -- I say "some" b/c there are functions like 'reshape' that take an
> argument called order= that doesn't actually refer to memory layout).
> This isn't so much an official policy as just a fact of life -- if
> no-one has any idea that someone is depending on some memory
> layout detail then there's no way to realize that we've broken
> something. (But it is a good policy IMO.)
>

I understand that in general.

However, I always thought column_stack is an array creation function, and that
those have a guaranteed memory layout. Since it stacks by columns, I thought
the order is always Fortran.
I also thought that the missing order keyword was just an extension that
nobody had added yet.
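
For reference, a minimal sketch of how to check the layout and force Fortran
order explicitly instead of relying on column_stack. It uses only
np.column_stack, the .flags attribute and np.asfortranarray; the input arrays
are made up for illustration, and which flag column_stack itself sets depends
on the numpy version, as described above.

```python
import numpy as np

# Illustrative inputs: two 1-D columns of equal length.
cols = (np.ones(10), np.arange(10.0))

stacked = np.column_stack(cols)
# Which of the two flags is True depends on the numpy version.
print(stacked.flags.c_contiguous, stacked.flags.f_contiguous)

# asfortranarray returns an F-contiguous array, copying only if needed.
as_f = np.asfortranarray(stacked)
print(as_f.flags.f_contiguous)
```

The values are unchanged by the conversion; only the memory layout differs.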

>
> If this kind of problem gets caught during a pre-release cycle then we
> generally do try to fix it, because we try not to break code, but if
> it's been broken for 2 full releases then there's not much we can do --
> we can't go back in time to fix it so it sounds like you're stuck
> working around the problem no matter what (unless you want to refuse
> to support 1.9.0 through 1.10.1, which I assume you don't... worst
> case, you just have to do a global search replace of np.column_stack
> with statsmodels.utils.column_stack_f, right?).
>
> And the regression issue seems like the only real argument for
> changing it back -- we'd never guarantee f-contiguity here if starting
> from a blank slate, I think?
>

Once the cat is out of the bag, the downstream developer writes
compatibility code or helper functions.

I will do that for at least the parts I know are intentionally designed for
F memory order.
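
A rough sketch of what such a helper could look like. The name column_stack_f
is taken from the suggestion quoted above and is hypothetical, not an existing
statsmodels or numpy function. It simply wraps np.column_stack and converts
the result, so it may cost one extra copy on numpy versions that return a
C-contiguous array.

```python
import numpy as np

def column_stack_f(arrays):
    """Stack arrays as columns and return an F-contiguous result.

    Hypothetical compatibility helper: same stacking semantics as
    np.column_stack, with the memory order made explicit.
    """
    # asfortranarray is a no-op when the result is already F-contiguous.
    return np.asfortranarray(np.column_stack(arrays))

exog = column_stack_f((np.ones(5), np.arange(5.0)))
print(exog.flags.f_contiguous)
```

With this, the result no longer depends on which layout column_stack happens
to produce in a given numpy release.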

---

statsmodels doesn't really check or consistently optimize the memory order,
except in some cython functions.
But I thought we were doing quite well at getting Fortran-ordered arrays.
I only paid attention where we have more extensive loops internally.

Nathaniel, does patsy guarantee memory layout (F-contiguous) when creating
design matrices?

Josef

>
> -n
>
> --
> Nathaniel J. Smith -- http://vorpus.org
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>