[Numpy-discussion] when did column_stack become C-contiguous?
josef.pktd at gmail.com
Mon Oct 19 00:51:33 EDT 2015
On Mon, Oct 19, 2015 at 12:35 AM, <josef.pktd at gmail.com> wrote:
> >>> np.column_stack((np.ones(10), np.ones(10))).flags
> C_CONTIGUOUS : True
> F_CONTIGUOUS : False
>
> >>> np.__version__
> '1.9.2rc1'
>
>
> On my notebook, which has numpy 1.6.1, the result is F_CONTIGUOUS.
>
>
> I was just trying to optimize a loop over variable adjustment in
> regression, and found out that we lost Fortran contiguity.
>
> I always thought column_stack was meant for Fortran usage (linalg).
>
> What's the alternative?
> column_stack was one of my favorite commands, and I always assumed we have
> in statsmodels the right memory layout to call the linalg libraries.
>
> ("assumed" means we have neither timing nor unit tests for it.)
>
What's the difference between using array and column_stack, apart from a
transpose and the memory order?

My current use case is copying columns on top of each other:

    #exog0 = np.column_stack((np.ones(nobs), x0, x0s2))
    exog0 = np.array((np.ones(nobs), x0, x0s2)).T
    exog_opt = exog0.copy(order='F')

The following line runs inside a loop and is followed by some linear algebra
for OLS; res_optim is a scalar parameter.

    exog_opt[:, -1] = np.clip(exog0[:, k] + res_optim, 0, np.inf)

Are my assumptions about memory access correct, or is there a better way?
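
One way to check these layout assumptions is to inspect `.flags` directly. The following is only a minimal sketch: `nobs`, `x0`, `x0s2`, `k` and `res_optim` are filled with placeholder values for illustration and are not the real statsmodels data. It shows that the transposed `np.array(...)` result is Fortran-contiguous, that it holds the same values as `column_stack`, and that a column of an F-ordered array is a single contiguous block that can be updated in place.

```python
import numpy as np

nobs = 10
x0 = np.linspace(0.0, 1.0, nobs)   # placeholder regressor (assumption)
x0s2 = x0 ** 2                     # placeholder second regressor (assumption)

# np.array(...) builds a C-contiguous (3, nobs) array; .T is a view of it,
# so the (nobs, 3) result is Fortran-contiguous without an extra copy.
exog0 = np.array((np.ones(nobs), x0, x0s2)).T
exog_opt = exog0.copy(order='F')

# Same values as column_stack; only the memory layout may differ.
stacked = np.column_stack((np.ones(nobs), x0, x0s2))
same_values = np.array_equal(exog0, stacked)

# In an F-ordered 2-D array each column is one contiguous block.
last_col = exog_opt[:, -1]
col_is_contiguous = last_col.flags.c_contiguous

# Update the last column in place, as in the loop body above.
k, res_optim = 1, 0.5              # placeholder values (assumption)
np.clip(exog0[:, k] + res_optim, 0, np.inf, out=last_col)

print(exog0.flags.f_contiguous, same_values, col_is_contiguous)
```

Writing through `out=last_col` avoids one temporary assignment, but whether that matters in practice would need to be measured.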
(I have quite a bit of code in statsmodels that is optimized for a
Fortran-ordered memory layout, especially for sequential regression, under
the assumption that column_stack provides that Fortran order.)
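
Rather than relying on what `column_stack` returns, code that needs Fortran order could request it explicitly. A minimal sketch, assuming a hypothetical helper named `ensure_fortran` (not an existing statsmodels function), built on `np.asfortranarray`:

```python
import numpy as np

def ensure_fortran(arr):
    """Hypothetical helper: return arr in Fortran (column-major) order.

    np.asfortranarray converts only when the input is not already
    Fortran-contiguous.
    """
    return np.asfortranarray(arr)

# Force a C-ordered input for the demonstration.
c_arr = np.ascontiguousarray(np.column_stack((np.ones(5), np.arange(5.0))))
f_arr = ensure_fortran(c_arr)      # converted to F order
f_again = ensure_fortran(f_arr)    # already F-ordered, shares memory with f_arr

print(c_arr.flags.f_contiguous, f_arr.flags.f_contiguous)
```

This keeps the layout requirement visible at the point where the linalg routines are called, independent of the numpy version.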
Also, do I need to start timing and memory benchmarking, or is it obvious
that a loop

    for k in range(maxi):
        x = arr[:, :k]
        <calculate>

depends on memory order?
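
A small timing sketch is one way to answer this empirically. It makes some assumptions: the array shape is arbitrary, and `x.sum()` merely stands in for `<calculate>`, so the real computation may behave differently. What it does show with certainty is that `arr[:, :k]` is a contiguous block only when `arr` is Fortran-ordered.

```python
import timeit
import numpy as np

def loop_over_leading_columns(arr):
    """Touch arr[:, :k] for growing k; the sum stands in for <calculate>."""
    total = 0.0
    for k in range(1, arr.shape[1] + 1):
        x = arr[:, :k]
        total += x.sum()
    return total

rng = np.random.default_rng(0)
arr_c = rng.standard_normal((20000, 20))   # C order (numpy default)
arr_f = np.asfortranarray(arr_c)           # same values, Fortran order

# The leading-column slice is contiguous only for the F-ordered array.
print(arr_c[:, :3].flags.c_contiguous, arr_f[:, :3].flags.f_contiguous)

t_c = timeit.timeit(lambda: loop_over_leading_columns(arr_c), number=5)
t_f = timeit.timeit(lambda: loop_over_leading_columns(arr_f), number=5)
print(f"C order: {t_c:.4f}s   F order: {t_f:.4f}s")
```

The measured difference depends on array size, the operation performed, and the BLAS/LAPACK build, so the numbers are best collected on the actual workload.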
Josef
>
> Josef
>
>
>