joesox at gmail.com
Sat Feb 24 19:54:49 CET 2007
On 2/24/07, J. Merrill <jvm_cop at spamcop.net> wrote:
> The file's size in bytes must be (at least) the number of values being read times the size of each value. If you are giving the file size in bytes as the length of the array, that will not work.
One of the challenges here is that this exception is being thrown from
someone else's natural language processing tool written in Python. I
would like to use this tool with IronPython for development in a .NET
environment. I have been stepping through the CPython runs, and it
appears to me that the natural language processing tool is using
'260759' as the length value for .fromfile, and does not result in
> What information do you have about the format of the file?
The actual file size is 1044480 bytes.
If I open it in Notepad and click 'Save As', it wants to save it as ANSI.
A sample of the characters displays in Notepad as follows:
" * - 2 6 9 = @ C H L
O R U Z ^ a d g j m p s w z } „ ‡
Š ' – £ ¦ (c) ¬ ¯ ² ¶ º ½ ¿ Ã Ç
Ê Í Ñ × ß å ï ô ö û & *
3 @ G M R U Z a e i m o s w | ‚
‡ Œ Ž " ˜ ž £ § ª ¯ ¹ ¼ ¿ Â Ç Î Ô
Ö Ý é ô ö ú
# & - 4 8 =
E J M S \ b f l o u z … ‰ ' " ™
¡ ¥ ¨"
These are packed lexicon datafiles.
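
For what it's worth, here is a quick sanity check of the numbers above, written as a small sketch. The helper names are made up for illustration, and the 8-byte case is only a hypothetical "what if the item size were larger" scenario, not a claim about what IronPython actually does:

```python
FILE_SIZE_BYTES = 1044480  # actual file size reported above
ITEM_COUNT = 260759        # length value seen in the CPython run


def bytes_needed(count, itemsize):
    """Bytes that must be available to read `count` items of `itemsize` bytes."""
    return count * itemsize


def fits(count, itemsize, file_size):
    """True if the file is large enough for the requested read."""
    return bytes_needed(count, itemsize) <= file_size


# With 4-byte items the read fits inside the file.
print(bytes_needed(ITEM_COUNT, 4), fits(ITEM_COUNT, 4, FILE_SIZE_BYTES))
# With 8-byte items (hypothetical) the same count would run past the end.
print(bytes_needed(ITEM_COUNT, 8), fits(ITEM_COUNT, 8, FILE_SIZE_BYTES))
```

So 260759 four-byte values need 1043036 bytes, which is under the 1044480 bytes in the file, while the same count at 8 bytes per item would need 2086072 bytes.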
> Are the values supposed to be 4-byte integers?
Yes, the natural language processing tool's module for loading these
datafiles uses 'L', which is 4 bytes.
> If so, you should divide the file size in bytes by 4 and use the I type, and look at the data to see if it makes any sense.
I really don't wish to modify the natural language processing tool
module any more than I already have. Is it really necessary for
IronPython to change the "minimum" sizes of the array types? (and
hopefully this is my problem, I guess I will need to modify array.cs
to truly find out)
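
For reference, a minimal sketch of how the item size could be checked at runtime rather than assumed. The names pick_typecode and read_items are hypothetical helpers, not part of the natural language processing tool or of IronPython; the only library facts relied on are that array objects expose .itemsize and that .fromfile(f, n) reads n items:

```python
import array
import os


def pick_typecode(wanted_size=4, candidates=("L", "I")):
    """Return the first candidate typecode whose itemsize matches wanted_size."""
    for code in candidates:
        if array.array(code).itemsize == wanted_size:
            return code
    raise ValueError("no typecode with itemsize %d" % wanted_size)


def read_items(path, count, wanted_size=4):
    """Read `count` unsigned items of `wanted_size` bytes each from `path`."""
    needed = count * wanted_size
    available = os.path.getsize(path)
    if needed > available:
        # Fail early with the sizes spelled out instead of a bare EOFError.
        raise EOFError("need %d bytes, file has %d" % (needed, available))
    values = array.array(pick_typecode(wanted_size))
    with open(path, "rb") as f:
        values.fromfile(f, count)
    return values
```

Printing array.array('L').itemsize under both CPython and IronPython would show directly whether the two runtimes agree on the size of 'L'.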