find appropriate dtype based on a set of values
Hi, I'd like to find the smallest possible representation of an array, given a set of possible values. I've checked the function 'np.min_scalar_type': it works well for scalar input but, contrary to my assumption, when an array-like parameter is given the array's dtype is simply returned, instead of a more compact dtype being found for the array values.
>>> np.version.version
'1.7.0'
>>> np.min_scalar_type(0)  # ok
dtype('uint8')
>>> np.min_scalar_type(-1)  # ok
dtype('int8')
>>> np.min_scalar_type([0, -1])  # int8 expected, returns platform-default int
dtype('int32')
>>> np.min_scalar_type([0, 256])  # uint16 expected
dtype('int32')
>>> np.min_scalar_type([-1, 256])  # int16 expected
dtype('int32')
Am I missing something? Does anyone know how to achieve the desired operation?
Thanks a lot,
Gregorio
On Mon, Sep 2, 2013 at 4:21 PM, Gregorio Bastardo <gregorio.bastardo@gmail.com> wrote:
np.min_scalar_type([-1, 256])  # int16 expected dtype('int32')
Am I missing something? Does anyone know how to achieve the desired operation?
The docstring states explicitly that this use case is not supported. Here's one way of doing it: https://gist.github.com/stefanv/6413742
Stéfan
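The documented behaviour referred to here can be checked with a short sketch. It only calls np.min_scalar_type; the variable names are illustrative and the array dtype is fixed explicitly so that the result does not depend on the platform default.

```python
import numpy as np

# Scalars: the returned dtype depends on the value itself.
scalar_results = {v: np.min_scalar_type(v) for v in (0, -1, 256)}

# Non-scalar input: the array's own dtype comes back unchanged,
# whatever values the array holds.
arr = np.array([0, 1], dtype=np.int32)
array_result = np.min_scalar_type(arr)

print(scalar_results)
print(array_result)
```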
Thanks Stéfan, your script works well. There's a small typo on line 12. I also discovered the functions 'np.iinfo' and 'np.finfo' for machine limits on integer/float types (a note for myself; you might already be familiar with them).

After having read the docstring, I was only curious why this feature is not provided by the function itself, as returning the input array's dtype does not seem very useful (I can't imagine a use case for it).

Gregorio

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
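For readers who have not used them, a minimal sketch of the np.iinfo and np.finfo functions mentioned in this message follows. The choice of dtypes and the variable names are arbitrary and only serve as an illustration.

```python
import numpy as np

# Integer limits: np.iinfo exposes .min and .max for an integer dtype.
int8_limits = np.iinfo(np.int8)
uint16_limits = np.iinfo(np.uint16)
print("int8:", int8_limits.min, int8_limits.max)
print("uint16:", uint16_limits.min, uint16_limits.max)

# Floating-point limits: np.finfo also reports the machine epsilon.
float32_limits = np.finfo(np.float32)
float64_limits = np.finfo(np.float64)
print("float32 max:", float32_limits.max)
print("float64 eps:", float64_limits.eps)
```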
On Mon, Sep 2, 2013 at 3:55 PM, Stéfan van der Walt <stefan@sun.ac.za> wrote:
Here's one way of doing it: https://gist.github.com/stefanv/6413742
You can probably reduce the amount of work by comparing only a.min() and a.max() instead of the whole array.
Robert Kern
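The min/max idea can be sketched roughly as follows. This is not the code from the gist: smallest_int_dtype and CANDIDATES are hypothetical names, and the candidate order (unsigned before signed at each width) is an assumption of this sketch.

```python
import numpy as np

# Candidate dtypes, narrowest first (assumed ordering for this sketch).
CANDIDATES = (np.uint8, np.int8, np.uint16, np.int16,
              np.uint32, np.int32, np.uint64, np.int64)


def smallest_int_dtype(values):
    """Return the first candidate dtype whose range covers all values."""
    a = np.asarray(values)
    if a.dtype.kind not in "iu":
        raise ValueError("expected an array of integers")
    if a.size == 0:
        raise ValueError("cannot infer a dtype from an empty array")
    # Only the extremes matter; convert them to Python ints so the
    # range checks below are exact and independent of NumPy promotion.
    lo, hi = int(a.min()), int(a.max())
    for dt in CANDIDATES:
        info = np.iinfo(dt)
        if info.min <= lo and hi <= info.max:
            return np.dtype(dt)
    raise ValueError("no candidate dtype can hold these values")


print(smallest_int_dtype([0, -1]))
print(smallest_int_dtype([0, 256]))
print(smallest_int_dtype([-1, 256]))
```

Checking two scalars per candidate keeps the cost at one pass for the minimum and one for the maximum, rather than one full comparison per candidate dtype.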
On Tue, Sep 3, 2013 at 2:47 PM, Robert Kern <robert.kern@gmail.com> wrote:
You can probably reduce the amount of work by only comparing a.min() and a.max() instead of the whole array.
Thanks, fixed.
Stéfan
@Stéfan: the 'np.all' calls on line 26 are now unnecessary.

@Stéfan, Robert: Would it be worth bringing this solution into numpy? It's probably not a rare problem, and for now users have to copy this snippet into their own codebase.

Gregorio
Hi Stéfan, I ran into a problem:
>>> min_typecode((18446744073709551615L,))  # ok
<type 'numpy.uint64'>
>>> min_typecode((0, 18446744073709551615L,))  # ?
Traceback (most recent call last):
  ...
ValueError: Can only handle integer arrays.
It seems that np.asarray converts the input sequence into a float64 array in the second case (np.array behaves the same way). Does anyone know the reason behind this?

python 2.7.4 win32
numpy 1.7.1

Gregorio
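The thread does not answer this question. One plausible explanation, offered here as an assumption rather than a confirmed diagnosis, is NumPy's type promotion: no integer dtype covers both the signed and the unsigned 64-bit range, so combining a signed integer type with uint64 promotes to float64. The sketch below shows that promotion rule and one possible workaround, requesting uint64 explicitly; the variable names are illustrative.

```python
import numpy as np

big = 18446744073709551615  # 2**64 - 1, the largest uint64 value

# Promoting a signed integer type with uint64 has no integer result.
promoted = np.promote_types(np.int64, np.uint64)
print(promoted)

# Asking for uint64 up front bypasses dtype inference for the sequence.
explicit = np.asarray([0, big], dtype=np.uint64)
print(explicit.dtype)
```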
participants (3)

Gregorio Bastardo

Robert Kern

Stéfan van der Walt