numpy.load truncates read from network file on XP

I'm having trouble loading a large remote .npy file on Windows XP. This is on numpy-1.3.0 on Windows XP SP3:

numpy.load(r'\\myserver\mydir\big.npy')

will fail with this sort of error being printed: "14328000 items requested but only 54 read" and then I get this with a backtrace: "ValueError: total size of new array must be unchanged" (due to the truncated array).

The file big.npy is a big 2d array, about 112MB.

The same file when stored locally gives no error when read. I can also read it into an editor, or copy it, and I get the whole thing.

More strangely, the same file when read from the same UNC path on Windows 7 64-bit (with the same 32-bit versions of all Python-related software) does not give an error either.

The fileserver in question is a NetApp box. Any clues? My websearching hasn't netted any leads.

Thanks a lot,
Dan

On Thu, Apr 28, 2011 at 4:22 PM, Dan Halbert <halbert@halwitz.org> wrote:
[snip: original message quoted in full]
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
I don't know what a 'NetApp box' is, what OS it runs, or what filesystem types are being used. Really we need more information, especially what the file is and how it was created, as well as the different OSes involved with the exact Python and numpy versions on each system. numpy 1.3 is getting a little old now, with 1.6 due soon.

A previous error has been the different line endings between OSes when transferring between Linux and Windows. That should show up with a smaller version of the file, so at least try finding the smallest file that gives you an error.

The other issue may be connection related, in that Python is not getting the complete file, so you might want to read the file directly from Python first or change some of XP's virtual memory settings. There have been significant changes between XP and Win 7 in that regard:

http://en.wikipedia.org/wiki/Comparison_of_Windows_Vista_and_Windows_XP

"File copy operations proved to be one area where Vista performs better than XP. A 1.25 GB file was copied from a network share to each desktop. For XP, it took 2 minutes and 54 seconds, for Vista with SP1 it took 2 minutes and 29 seconds. This test was done by CRN Test Center, but it omitted the fact that a machine running Vista takes circa one extra minute to boot, if compared to a similar one operating XP. However, the Vista implementation of the file copy is arguably more complete and correct as the file does not register as being transferred until it has completely transferred. In Windows XP, the file completed dialogue box is displayed prior to the file actually finishing its copy or transfer, with the file completing after the dialogue is displayed. This can cause an issue if the storage device is ejected prior to the file being successfully transferred or copied in Windows XP due to the dialogue box's premature prompt."

Bruce
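The suggestion above to read the file directly from Python first could be sketched roughly as follows: compare the size the filesystem reports with the number of bytes a single plain read actually returns. The function name `check_full_read` is hypothetical and not part of numpy or of this thread; this is only a diagnostic sketch, not a fix.

```python
import os


def check_full_read(path):
    """Return (size reported by the filesystem, bytes returned by one read).

    A mismatch between the two numbers would suggest the read itself is
    being truncated, independent of numpy.
    """
    expected = os.path.getsize(path)
    with open(path, "rb") as f:  # binary mode, so line endings are untouched
        data = f.read()
    return expected, len(data)
```

Calling it as `check_full_read(r'\\myserver\mydir\big.npy')` on the affected machine and on one that works would show whether the two numbers disagree only on the failing system.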

Several times I encountered problems in transferring large files between XP stations on a wireless network. This could be a result of the unsafe UDP protocol used by the Microsoft network protocol (I do not have a Vista/7 machine to test it).

Nadav

________________________________________
From: numpy-discussion-bounces@scipy.org [numpy-discussion-bounces@scipy.org] On Behalf Of Bruce Southey [bsouthey@gmail.com]
Sent: 29 April 2011 04:56
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] numpy.load truncates read from network file on XP

On Thu, Apr 28, 2011 at 4:22 PM, Dan Halbert <halbert@halwitz.org> wrote:
[snip: original message and Bruce's reply, both quoted in full]

On Thursday, April 28, 2011 5:22pm, "Dan Halbert" <halbert@halwitz.org> said:
I'm having trouble loading a large remote .npy file on Windows XP. This is on numpy-1.3.0 on Windows XP SP3:
numpy.load(r'\\myserver\mydir\big.npy')
will fail with this sort of error being printed: "14328000 items requested but only 54 read"
I have tracked this down: it's not specific to numpy, or even Python, though there's a Python bug report about it:

"size limit exceeded for read() from network drive"
http://bugs.python.org/issue1478529

The problem stems from a buffer size limit of not quite 64MB for the Windows API ReadFile() function. ReadFile() is used (maybe indirectly) by the Microsoft fread() implementation, which is in turn used by Python's file.read(). Apparently most of the time the underlying Windows code is cognizant of this limit, but under certain circumstances a read from a large networked file is not handled correctly (I think when there is also an entry in "My Network Places" for the path). It's quite obscure, and there's nothing that can be fixed at the numpy level.

Dan
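Given the roughly 64MB limit described above, one possible workaround at the user level is to pull the file into memory in smaller pieces and let numpy parse it from there. This is a minimal sketch under the assumption that reads well below the limit avoid the failing code path; the thread does not confirm that. The helper name `load_npy_in_chunks` and the 16 MB chunk size are illustrative choices, not anything numpy provides.

```python
import io

import numpy as np

# Assumed value: comfortably below the "not quite 64MB" limit mentioned above.
CHUNK_SIZE = 16 * 1024 * 1024


def load_npy_in_chunks(path, chunk_size=CHUNK_SIZE):
    """Read a .npy file in pieces of at most chunk_size bytes, then parse it."""
    buf = io.BytesIO()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # each request stays small
            if not chunk:               # empty bytes object means end of file
                break
            buf.write(chunk)
    buf.seek(0)                         # rewind so numpy sees the .npy header
    return np.load(buf)
```

The trade-off is that the whole file is held in memory once as raw bytes in addition to the resulting array, which may matter for files much larger than the 112MB one discussed here.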
participants (3): Bruce Southey, Dan Halbert, Nadav Horesh