1D ndarray to java double[]
How best to write a 1D ndarray as a block of doubles, for reading in java as double[] or a stream of double? Maybe the performance of simple looping over doubles in python.write() and java.read() is fine, but maybe there are representational diffs? Maybe there's a better solution for the use case?

Use case: I get the ndarray from keras, and it represents a 2D distance matrix. I want to find the top-50 matches for each item, per row and column. I'm looking at moving the top-50 task to java for its superior parallel threading. (Java doesn't fork processes with a copy of the array, which is ~5% of memory; rather one gets 1 process with e.g. 1475% CPU.)

Thanks,
Bill Ross -- Phobrain.com
On Sat, 31 Dec 2022 23:45:54 -0800 Bill Ross <bross_phobrain@sonic.net> wrote:
How best to write a 1D ndarray as a block of doubles, for reading in java as double[] or a stream of double?
Maybe the performance of simple looping over doubles in python.write() and java.read() is fine, but maybe there are representational diffs? Maybe there's a better solution for the use case?
Java is known to be big-endian ... but your CPU is probably little-endian. Numpy has the tools to represent an array of double BE.
Use case: I get the ndarray from keras, and it represents a 2D distance matrix. I want to find the top-50 matches for each item, per row and column. I'm looking at moving the top-50 task to java for its superior parallel threading. (Java doesn't fork processes with a copy of the array, which is ~5% of memory; rather one gets 1 process with e.g. 1475% CPU.)
What about numba or cython then?
Happy new year
Jerome
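For the top-50 use case itself, here is a minimal sketch that stays in plain NumPy, using numpy.argpartition rather than numba, cython, or Java. Nobody in the thread proposed this, and it assumes that "top matches" means the smallest distances. The helper name top_k_per_row and the small random matrix are made up for illustration.

```python
import numpy as np


def top_k_per_row(dist, k):
    """Indices of the k smallest entries in each row, ordered by distance.

    Hypothetical helper; assumes a smaller distance is a better match
    and that k does not exceed the number of columns.
    """
    # argpartition moves the k smallest entries of each row into the
    # first k positions, in no particular order, without a full sort.
    part = np.argpartition(dist, k - 1, axis=1)[:, :k]
    # Sort only those k candidates by their actual distance.
    rows = np.arange(dist.shape[0])[:, None]
    order = np.argsort(dist[rows, part], axis=1)
    return part[rows, order]


rng = np.random.default_rng(0)
dist = rng.random((6, 8))                # stand-in for the distance matrix

top3_rows = top_k_per_row(dist, 3)       # best 3 columns for each row
top3_cols = top_k_per_row(dist.T, 3)     # best 3 rows for each column
```

Whether this is fast enough for the real matrix would need measuring; it only shows that the selection step does not require a per-element Python loop.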
Thanks!
Java is known to be big-endian ... your CPU is probably little-endian.
$ lscpu | grep -i endian
Byte Order: Little Endian
Numpy has the tools to represent an array of double BE.
Is there a lower-level ndarray method that writes an array that could be used this way?
Bill -- Phobrain.com
On 2023-01-01 05:13, Jerome Kieffer wrote:
[...]
Jerome
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-leave@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: bross_phobrain@sonic.net
On Sun, 01 Jan 2023 05:31:55 -0800 Bill Ross <bross_phobrain@sonic.net> wrote:
Thanks!
Java is known to be big-endian ... your CPU is probably little-endian.
$ lscpu | grep -i endian
Byte Order: Little Endian
Numpy has the tools to represent an array of double BE.
Is there a lower-level ndarray method that writes an array that could be used this way?
One example:
    numpy.array([1,2,3], dtype=">d").tobytes()
    b'?\xf0\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x00@\x08\x00\x00\x00\x00\x00\x00'
    numpy.array([1,2,3], dtype="<d").tobytes()
    b'\x00\x00\x00\x00\x00\x00\xf0?\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x08@'
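Building on the tobytes() example, a minimal sketch of getting those big-endian bytes into a file: convert with astype(">f8") and write with ndarray.tofile(), which writes the raw array bytes only. The file name is made up, and the struct round-trip merely stands in for a big-endian reader such as Java's DataInputStream.readDouble().

```python
import os
import struct
import tempfile

import numpy as np

a = np.array([1.0, 2.0, 3.0])            # native-order float64

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "doubles_be.bin")

# Convert to big-endian float64, then dump the raw bytes.
a.astype(">f8").tofile(path)

with open(path, "rb") as f:
    raw = f.read()

# Decode as three big-endian doubles, as a big-endian reader would.
values = struct.unpack(">3d", raw)
print(len(raw), values)                  # 24 (1.0, 2.0, 3.0)
```

astype() makes a converted copy, so the original array keeps its native byte order for further work in Python.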
byteswap() looks like a general endian solution for ndarrays:
https://stackoverflow.com/questions/49578507/fast-way-to-reverse-float32-end...
    numpy.memmap(infile, dtype=numpy.int32).byteswap().tofile(outfile)
    numpy.memmap(infile, dtype=numpy.int32).byteswap(inplace=True).flush()
Bill -- Phobrain.com
On 2023-01-01 08:31, Jerome Kieffer wrote:
[...]
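A small sketch of what byteswap() does to the data, for anyone comparing it with the dtype=">d" route: it reverses the bytes of every element but leaves the dtype unchanged, so the swapped array only reads back correctly if it is viewed (or written out and read elsewhere) with the opposite byte order. The array values are arbitrary.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])            # native-order float64

raw = a.tobytes()
swapped = a.byteswap().tobytes()         # swapped copy; a itself is unchanged

# Reverse each 8-byte element by hand to compare with byteswap().
reversed_items = b"".join(raw[i:i + 8][::-1] for i in range(0, len(raw), 8))
same_bytes = swapped == reversed_items

# Reinterpret the swapped bytes with the opposite byte order to
# recover the original values.
restored = a.byteswap().view(a.dtype.newbyteorder())
values_match = np.array_equal(restored, a)
```

On a little-endian machine the swapped bytes are the same ones that astype(">f8").tobytes() produces; on a big-endian machine byteswap() would yield little-endian bytes instead, so the explicit dtype is the more portable choice.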
Getting back to this, byteswap() looks like a general endian solution for ndarrays:
https://stackoverflow.com/questions/49578507/fast-way-to-reverse-float32-end...
The examples there open the file as float32, but that seemed to scramble the header that was written. No header is the desired behavior for the BE file, and that is what happened when I called byteswap() from the same program that produced the mem-mapped data as a normal ndarray: no header was written (conveniently for java).
Little-endian access can be handled in java with ByteBuffer.order().
Bill -- Phobrain.com
On 2023-01-01 08:31, Jerome Kieffer wrote:
[...]
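On the header question, a minimal sketch of the difference between a raw dump and a .npy file. It assumes the "header" mentioned above is the one numpy.save() writes, which the message does not actually say. File names are made up.

```python
import os
import tempfile

import numpy as np

a = np.arange(5, dtype=">f8")            # five big-endian doubles

tmp = tempfile.mkdtemp()
raw_path = os.path.join(tmp, "raw.bin")          # hypothetical names
npy_path = os.path.join(tmp, "with_header.npy")

a.tofile(raw_path)                       # raw element bytes, no header
np.save(npy_path, a)                     # .npy format: header, then data

raw_size = os.path.getsize(raw_path)     # 5 elements * 8 bytes
npy_size = os.path.getsize(npy_path)     # larger, because of the header

with open(npy_path, "rb") as f:
    magic = f.read(6)                    # .npy files start with b"\x93NUMPY"

# A raw file carries no dtype or shape, so the reader must supply them.
back = np.fromfile(raw_path, dtype=">f8")
```

A headerless file like raw.bin is what a plain reader of doubles expects. If the file were written little-endian instead, the Java side would need ByteBuffer.order(ByteOrder.LITTLE_ENDIAN), as noted above.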
participants (2): Bill Ross, Jerome Kieffer