I have a couple more changes to loadtxt() that I'd like to code up in time
for 1.3, but I thought I should run them by the list before doing too much
work. These are already implemented in some fashion in
matplotlib.mlab.csv2rec(), but the code bases are different enough that
pretty much only the idea can be lifted. All of these changes would be done
in a manner that is backwards compatible with the current API.
1) Support for setting the names of fields in the returned structured array
without using dtype. This can be a passed in list of names or reading the
names of fields from the first line of the file. Many files have a header
line that gives a name for each column. Adding this would obviously make
loadtxt much more general and allow for more generic code, IMO. My current
thinking is to add a *names* keyword parameter that defaults to None, meaning no
support for reading names. Setting it to True would tell loadtxt() to read
the names from the first line (after skiprows); the other option would be
to set *names* to a list of strings.
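For comparison, here is a minimal sketch of what getting named fields requires today: the full dtype spelled out by hand (the field names and data below are invented for illustration):

```python
import io
import numpy as np

# Today, a structured array with named fields requires an explicit dtype;
# the proposal would let loadtxt pick the names up from a header line or
# a passed-in list instead.
data = io.StringIO("1 2.5\n3 4.5\n")
arr = np.loadtxt(data, dtype=[('id', int), ('score', float)])
print(arr['id'])    # fields become accessible by name
```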
2) Support for automatic dtype inference. Instead of assuming all values
are floats, this would try a list of options until one worked. For strings,
this would keep track of the longest string within a given field before
setting the dtype. This would allow reading of files containing a mixture
of types much more easily, without having to go to the trouble of
constructing a full dtype by hand. This would work alongside any custom
converters one passes in. My current thinking on the API is simply to add
the option of passing the string 'auto' as the dtype parameter.
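To show what the inference would save, here is a sketch of the manual version today: a mixed string/float file needs a hand-built dtype, including a guessed string width (file contents and field names invented):

```python
import io
import numpy as np

# Manual dtype for a file mixing strings and floats; under the proposal,
# dtype='auto' would infer the types (and the longest string width) itself.
data = io.StringIO("Alice 92.5\nBob 88.0\n")
dt = np.dtype([('name', 'U10'), ('grade', float)])
arr = np.loadtxt(data, dtype=dt)
```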
3) Better support for missing values. The docstring mentions a way of
handling missing values by passing in a converter. The problem with this is
that you have to pass in a converter for *every column* that will contain
missing values. If you have a text file with 50 columns, writing this
dictionary of converters seems like ugly and needless boilerplate. I'm
unsure of how best to pass in both what values indicate missing values and
what values to fill in their place. I'd love suggestions here.
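To make the complaint concrete, here is the per-column converter dance as it stands; the 'NA' sentinel and -999 fill value are arbitrary choices for the sketch:

```python
import io
import numpy as np

def fill_missing(tok):
    # One converter, reusable for every column -- but loadtxt still needs
    # it registered per column index, which is the boilerplate complained
    # about above.
    tok = tok.decode() if isinstance(tok, bytes) else tok
    return -999.0 if tok.strip() == 'NA' else float(tok)

data = io.StringIO("1 2\nNA 4\n")
arr = np.loadtxt(data, converters={i: fill_missing for i in range(2)})
```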
Here's an example of my use case (without 50 columns):
ID,First Name,Last Name,Homework1,Homework2,Quiz1,Homework3,Final
Currently, reading this file in requires a bit of boilerplate (declaring
dtypes, converters). While it's nothing I can't write, it would still be
easier to write it once within loadtxt and have it available to everyone.
Any support for *any* of these ideas? Any suggestions on how the user
should pass in the information?
Graduate Research Assistant
School of Meteorology
University of Oklahoma
>>> import numpy as np
>>> x = np.ones((3,0))
>>> x
array([], shape=(3, 0), dtype=float64)
To preempt, I'm not really concerned with the answer to: Why would
anyone want to do this?
I just want to know what is happening, especially with
>>> x[0,:] = 5
(which works). It seems that nothing is really happening here... given
that, why is it allowed? I.e., are there reasons for not requiring the
shape dimensions to be greater than 0?
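For what it's worth, a sketch of why zero-length axes are consistent rather than special-cased: assignment broadcasts into zero elements (a no-op), and such shapes fall out of ordinary slicing anyway:

```python
import numpy as np

x = np.ones((3, 0))
x[0, :] = 5            # broadcasts into zero elements: legal, does nothing
assert x.size == 0

# Zero-length axes arise naturally from slicing, so forbidding them
# would break ordinary code paths:
y = np.arange(3)[3:]   # empty slice, shape (0,)
```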
I'm thinking of changing the names of fmax and fmin to fmaximum and fminimum
so that fmax and fmin can play the roles corresponding to max and min.
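To make the roles concrete (a quick sketch using the names as they stand): maximum is the elementwise ufunc, max the reduction, and the f-variants differ in how they treat NaN:

```python
import numpy as np

# maximum propagates NaN; fmax ignores it where possible.
a = np.array([1.0, np.nan])
b = np.array([3.0, 2.0])
np.maximum(a, b)   # NaN in -> NaN out for the second element
np.fmax(a, b)      # -> [3., 2.], NaN ignored
```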
Should I add the names atanh, asinh, and acosh as aliases for arctanh,
arcsinh, and arccosh? The vote looked pretty evenly split. If we add them, I
suggest we merely add a note to the documentation of the old functions
suggesting use of the new names to conform to general practice. A while ago
I added deg2rad and rad2deg as aliases for radians and degrees respectively,
so this can be seen as more of the same.
Pauli, can you change the name of code_generators/docstrings to something
more descriptive? I think ufunc_docstrings would be a bit clearer. I expect
this requires various fixups here and there, so I'm tossing the problem over to you.
Is there an effective way to remove a row with a given index from
a matrix?
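One way (a sketch using np.delete, which returns a copy with the row removed):

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
m2 = np.delete(m, 1, axis=0)   # drop the row at index 1
# m2 now holds rows 0 and 2 of m
```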
Dr. rer. nat. Uwe Schmitt
Science Park 2
Hopefully someone here will be interested in this, and it won't be
considered too spammy... please let me know if this isn't welcome, and
I'll desist in future.
I'm delighted to announce the release of Ironclad v0.7, which is now
available from http://code.google.com/p/ironclad/downloads/list . This
release is a major step forward:
* Runs transparently on vanilla IronPython 2.0RC2, without creating
extra PythonEngines or breaking .NET namespace imports
* Many numpy 1.2 tests (from the core, fft, lib, linalg and random
subpackages) now reliably pass (run "ipy numpytests.py" from the build directory)
* Significant performance improvements (by several orders of magnitude
in some places :D)
So... if you want to use numpy (or other C extension modules) with
IronPython on Win32, please download it and try it out; I'm very keen to
hear your experiences, and to know which neglected features will be most
useful to you.
I have a question about assigning to masked arrays. a is a len == 3
masked array with 2 unmasked elements. b is a len == 2 array. I
want to put the elements of b into the unmasked elements of a. How do
I do that?
In : a
masked_array(data = [1 -- 3],
             mask = [False True False])
In : b
Out: array([7, 8])
I'd like an operation that gives me:
masked_array(data = [7 -- 8],
             mask = [False True False])
Seems like it shouldn't be that hard, but I can't figure it out. Any suggestions?
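One approach that seems to work (a sketch): index by the inverse of the mask, so only the unmasked slots receive values:

```python
import numpy as np

a = np.ma.masked_array([1, 2, 3], mask=[False, True, False])
b = np.array([7, 8])
a[~a.mask] = b        # fill only the unmasked slots
# a is now masked_array [7 -- 8], with the mask unchanged
```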
Back in the beginning of the summer, I jumped through a lot of hoops to
build numpy+scipy on solaris, 64-bit with gcc. I received a lot of help from
David C., and ended up, by some very ugly hacking, building an acceptable
numpy+scipy+matplotlib trio for use at my company.
However, I'm back at it again trying to build the same tools in both a
32-bit abi and a 64-bit ABI. I'm starting with the 32-bit build, because I
suspect it'd be simpler (less trouble adding things like -m64 and other such
flags). However, I've run into a very basic problem right at the get-go.
This time instead of bothering David at the beginning of my build, I was
hoping that other people may have experience to contribute to resolving my problem.
Here is my build environment:
2) Solaris 10 update 3
3) sunperf libraries (for blas+lapack support)
I can provide more detail since that's not a very specific list.
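For anyone comparing notes, the site.cfg I mean looks roughly like this (the library path is a placeholder and will differ per Sun Studio install, so treat this as a sketch rather than my exact setup):

```ini
; hypothetical site.cfg sketch pointing numpy's build at sunperf;
; the directory below is a placeholder
[blas]
libraries = sunperf
library_dirs = /opt/SUNWspro/lib

[lapack]
libraries = sunperf
library_dirs = /opt/SUNWspro/lib
```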
Anyway, when I try building numpy-1.2.1 after setting up my site.cfg and
build-related environment this is what I get:
Setting the site.cfg
Running from numpy source directory.
F2PY Version 2_5972
non-existing path in 'numpy/core': 'code_generators/array_api_order.txt'
scons: Reading SConscript files ...
scons: warning: Ignoring missing SConscript
line 108, in DistutilsSConscript
scons: done reading SConscript files.
scons: Building targets ...
scons: *** [Errno 2] No such file or directory:
scons: building terminated because of errors.
error: Error while executing scons command. See above for more information.
If you think it is a problem in numscons, you can also try executing the
command with --log-level option for more detailed output of what numscons is
doing, for example --log-level=0; the lowest the level is, the more detailed
the output it.
Then similar errors repeat themselves over and over, including ignoring
missing SConscript, and no sconsign.dblite file, until the build bombs out.
I've got numscons installed from pypi:
>>> import numscons.version
Can anyone get me on the right track here?
I get numpy errors after I install Picalo (www.picalo.org) on Mac OS X
10.4.11 Tiger. I have tried to import numpy in Picalo using the instructions
in PicaloCookBook, p.101.
I get this error message which I don't understand.
Per the Picalo author (see below for his reply to my email on the Picalo
discussion forum), I'm trying it here.
I use numpy v. 1.0.4, distributed with the Scipy superpack.
Could anyone please help?
Thanks, and cheers
Traceback (most recent call last):
File "<input>", line 1, in <module>
line 93, in <module>
line 9, in <module>
line 4, in <module>
line 8, in <module>
line 5, in <module>
2): Symbol not found: _PyUnicodeUCS4_FromUnicode
Expected in: dynamic lookup
Conan C. Albrecht replied (Nov 23):
You're doing everything right from my perspective. It looks like a problem
with NumPy. The stack trace goes to multiarray.so in their core toolkit. I
think you should hit their forums and see if they can help.
One idea is that Picalo uses unicode for all data values. Perhaps numpy
can't handle unicode?
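A possible diagnosis (a sketch, not a confirmed fix): a missing _PyUnicodeUCS4_FromUnicode symbol usually means the extension was compiled against a wide (UCS4) Python but imported into a narrow (UCS2) build; sys.maxunicode shows which kind of build is running:

```python
import sys

# 65535   -> narrow (UCS2) build: UCS4-compiled extensions won't load
# 1114111 -> wide (UCS4) build
print(sys.maxunicode)
```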