[Python-bugs-list] [Bug #110829] REQ: array module should provide "swap to native byte order" functionality, similar to struct module (PR#166)
noreply@sourceforge.net
Mon, 14 Aug 2000 08:51:33 -0700
Bug #110829, was updated on 2000-Aug-01 14:12
Here is a current snapshot of the bug.
Project: Python
Category: Modules
Status: Closed
Resolution: Fixed
Bug Group: Feature Request
Priority: 5
Summary: REQ: array module should provide "swap to native byte order" functionality, similar to struct module (PR#166)
Details: Jitterbug-Id: 166
Submitted-By: aa8vb@ipass.net
Date: Wed, 22 Dec 1999 07:57:58 -0500 (EST)
Version: 1.5.2
OS: Irix
In implementing a Python reader for VPF GIS data, I noticed that the struct
module makes it very easy to read in data that may be in a different byte
order than the local machine:
    val = struct.unpack( "<l", buf )
This parses from a little-endian file/buffer, and swaps bytes "if needed" to
the endianness of the local machine.
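
A minimal sketch of the behaviour described above, written for a current Python 3 rather than the 1.5.2 of the report. The sample buffer is made up for illustration; note that struct.unpack() returns a tuple, so the single value is unpacked from it.

```python
import struct

# Four bytes holding the 32-bit integer 1, least significant byte first.
buf = b"\x01\x00\x00\x00"

# "<l" reads a 4-byte signed integer as little-endian, whatever the host is.
(little,) = struct.unpack("<l", buf)

# ">l" interprets the same bytes as big-endian instead.
(big,) = struct.unpack(">l", buf)

print(little)  # 1
print(big)     # 16777216
```

The result is the same on little-endian and big-endian hosts, which is the "swap if needed" convenience the report refers to.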
However, using the array module is not so convenient. The developer has to
write code to sense the byte order of the local machine, and tell the
array module whether or not to swap bytes. I.e. there is no
"swap to native byte order" functionality:
    def _IsByteSwapNeeded( file_byte_order ):
        """ The array module doesn't do byte swapping to the native byte order.
            So we have to get under the hood and check it ourselves.
        """
        def host_endian():
            if ord( array.array( "i", [1] ).tostring()[ 0 ] ): return 'L'
            else: return 'M'
        assert file_byte_order in 'LM'
        return ( host_endian() != file_byte_order )

    a = array.array( "l", buf )
    if _IsByteSwapNeeded( 'L' ):
        a.byteswap()
    val = a.tolist()
The reason for using the array module (versus the struct module with
repeat counts) is for more efficient memory storage and access of large
numbers of lists containing large numbers of coordinates. struct insists
on converting everything at once, and the length must be exactly right.
array provides direct access to any element or slice of a list, so it
is my preferred choice.
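
The contrast drawn above can be sketched as follows, assuming a current Python 3 (where the raw-data methods are tobytes() and frombytes() rather than tostring() and fromstring()). The coordinate values are invented for illustration.

```python
import array
import struct

# Raw machine representation of four doubles, as it might come from a file.
coords = array.array("d", [1.5, 2.5, 3.5, 4.5])
raw = coords.tobytes()

# array: load the raw bytes once, then index or slice the compact storage.
loaded = array.array("d")
loaded.frombytes(raw)
first = loaded[0]
middle = loaded[1:3]

# struct: every value is converted at once, and the repeat count in the
# format must match the buffer length exactly.
all_values = struct.unpack("4d", raw)
try:
    struct.unpack("3d", raw)
    length_error = False
except struct.error:
    length_error = True

print(first, middle.tolist(), length_error)
```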
It would be useful if the "<" and ">" byte order prefixes used
in the struct module were added to the array module.
Thanks,
Randall
====================================================================
Audit trail:
Wed Jan 12 18:16:54 2000 guido changed notes
Wed Jan 12 18:16:54 2000 guido moved from incoming to request
Follow-Ups:
Date: 2000-Aug-01 14:12
By: none
Comment:
From: Guido van Rossum <guido@CNRI.Reston.VA.US>
Subject: Re: [Python-bugs-list] REQ: array module should provide "swap to native byte order" functionality, similar to struct module (PR#166)
Date: Wed, 22 Dec 1999 09:30:21 -0500
> In implementing a Python reader for VPF GIS data, I noticed that the struct
> module makes it very easy to read in data that may be in a different byte
> order than the local machine:
>
> val = struct.unpack( "<l", buf )
>
> This parses from a little-endian file/buffer, and swaps bytes "if needed" to
> the endianness of the local machine.
>
> However, using the array module is not so convenient. The developer has to
> write code to sense the byte order of the local machine, and tell the
> array module whether or not to swap bytes. I.e. there is no
> "swap to native byte order" functionality:
>
>     def _IsByteSwapNeeded( file_byte_order ):
>         """ The array module doesn't do byte swapping to the native byte order.
>             So we have to get under the hood and check it ourselves.
>         """
>         def host_endian():
>             if ord( array.array( "i", [1] ).tostring()[ 0 ] ): return 'L'
>             else: return 'M'
(I presume 'L' is little-endian, but what does 'M' stand for?)
>         assert file_byte_order in 'LM'
>         return ( host_endian() != file_byte_order )
>
>     a = array.array( "l", buf )
>     if _IsByteSwapNeeded( 'L' ):
>         a.byteswap()
>     val = a.tolist()
>
>
> The reason for using the array module (versus the struct module with
> repeat counts) is for more efficient memory storage and access of large
> numbers of lists containing large numbers of coordinates. struct insists
> on converting everything at once, and the length must be exactly right.
> array provides direct access to any element or slice of a list, so it
> is my preferred choice.
>
> It would be useful if the "<" and ">" byte order prefixes used
> in the struct module were added to the array module.
You're right that this is more work than it should be.
However I don't think that adding '<' to the format string is the
right solution -- the format string declares the format of the data in
the array, not how it should be converted from elsewhere. (The array
module has other uses besides I/O of binary data.)
I can see several solutions:
- Add a byte order indicator to some standard module (e.g. the array
module, or the sys module) so you can write
      a = array.array("l", buf)
      if sys.byte_order == 'big':
          a.byteswap()
- Add a byte order flag to all the array methods that add raw data to
the array object (constructor, fromfile(), fromstring()); and for
symmetry also to the methods that write raw data out (tofile(),
tostring()). A problem with tofile() is that in order to do the
byteswap we'd either have to allocate temporary memory or byteswap in
place, and then byteswap back after writing.
I vote for the byte order indicator, giving total control to the user.
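
A minimal sketch of what the second option (a byte order flag on the methods that add raw data) might look like if written as a wrapper in user code. The helper name array_from_bytes and its byteorder parameter are hypothetical and are not part of the array module; the sketch assumes a current Python 3, where the indicator is spelled sys.byteorder and the raw-data method is frombytes().

```python
import array
import sys


def array_from_bytes(typecode, data, byteorder):
    """Hypothetical helper: build an array from raw bytes that were
    stored in the given byte order ('little' or 'big')."""
    if byteorder not in ("little", "big"):
        raise ValueError("byteorder must be 'little' or 'big'")
    a = array.array(typecode)
    a.frombytes(data)
    # Swap in place only when the data's order differs from the host's.
    if byteorder != sys.byteorder:
        a.byteswap()
    return a


# Two little-endian 32-bit integers; assumes typecode "i" is 4 bytes wide.
data = b"\x01\x00\x00\x00\x02\x00\x00\x00"
values = array_from_bytes("i", data, "little")
print(values.tolist())  # [1, 2]
```

Because byteswap() works in place, loading needs no temporary buffer; the concern raised above applies to the writing side.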
--Guido van Rossum (home page: http://www.python.org/~guido/)
-------------------------------------------------------
Date: 2000-Aug-01 14:12
By: none
Comment:
From: "Fred L. Drake, Jr." <fdrake@acm.org>
Subject: Re: [Python-bugs-list] REQ: array module should provide "swap to native byte order" functionality, similar to struct module (PR#166)
Date: Wed, 22 Dec 1999 10:43:32 -0500 (EST)
guido@cnri.reston.va.us writes:
> I vote for the byte order indicator, giving total control to the user.
I agree, but I'd also like to see an indicator for the native byte
order somewhere (like sys) as well. (How possible would this be for
JPython, Barry?)
-Fred
--
Fred L. Drake, Jr. <fdrake at acm.org>
Corporation for National Research Initiatives
-------------------------------------------------------
Date: 2000-Aug-01 14:12
By: none
Comment:
From: Guido van Rossum <guido@CNRI.Reston.VA.US>
Subject: Re: [Python-bugs-list] REQ: array module should provide "swap to native byte order" functionality, similar to struct module (PR#166)
Date: Wed, 22 Dec 1999 11:45:34 -0500
> guido@cnri.reston.va.us writes:
> > I vote for the byte order indicator, giving total control to the user.
>
> I agree, but I'd also like to see an indicator for the native byte
> order somewhere (like sys) as well. (How possible would this be for
> JPython, Barry?)
Ack! I was ambiguous! I meant a native byte order indicator somewhere!
--Guido van Rossum (home page: http://www.python.org/~guido/)
-------------------------------------------------------
Date: 2000-Aug-01 14:12
By: none
Comment:
From: "Barry A. Warsaw" <bwarsaw@cnri.reston.va.us>
Subject: Re: [Python-bugs-list] REQ: array module should provide "swap to native byte order" functionality, similar to struct module (PR#166)
Date: Wed, 22 Dec 1999 11:59:07 -0500 (EST)
>>>>> "Fred" == <fdrake@acm.org> writes:
Fred> I agree, but I'd also like to see an indicator for the
Fred> native byte order somewhere (like sys) as well. (How
Fred> possible would this be for JPython, Barry?)
Java's native format is big-endian, regardless of the system's
underlying endianness. I'm not aware of any way of figuring out what
the system's endianness is.
-Barry
-------------------------------------------------------
Date: 2000-Aug-01 14:12
By: none
Comment:
From: Randall Hopper <aa8vb@ipass.net>
Subject: Re: [Python-bugs-list] REQ: array module should provide "swap to native byte order" functionality, similar to struct module (PR#166)
Date: Thu, 30 Dec 1999 19:19:09 -0500
Guido van Rossum:
|Randall Hopper:
|> However, using the array module is not so convenient. The developer has to
|> write code to sense the byte order of the local machine, and tell the
|> array module whether or not to swap bytes. I.e. there is no
|> "swap to native byte order" functionality:
|>
|>     def _IsByteSwapNeeded( file_byte_order ):
|>         """ The array module doesn't do byte swapping to the native byte order.
|>             So we have to get under the hood and check it ourselves.
|>         """
|>         def host_endian():
|>             if ord( array.array( "i", [1] ).tostring()[ 0 ] ): return 'L'
|>             else: return 'M'
|
|(I presume 'L' is little-endian, but what does 'M' stand for?)
L/M - LSB/MSB - [L]east/[M]ost Significant Byte First
though:
L/B - [L]ittle/[B]ig Endian
might have been more intuitive.
|However I don't think that adding '<' to the format string is the
|right solution -- the format string declares the format of the data in
|the array, not how it should be converted from elsewhere. (The array
|module has other uses besides I/O of binary data.)
...
|I can see several solutions:
|- Add a byte order indicator to some standard module (e.g. the array
| module, or the sys module) so you can write
|
|     a = array.array("l", buf)
|     if sys.byte_order == 'big':
|         a.byteswap()
|
|- Add a byte order flag to all the array methods that add raw data to
| the array object (constructor, fromfile(), fromstring()); and for
| symmetry also to the methods that write raw data out (tofile(),
| tostring()). A problem with tofile() is that in order to do the
| byteswap we'd either have to allocate temporary memory or byteswap in
| place, and then byteswap back after writing.
Having considered this again from what you've said, I agree that a byte
order indicator would probably be better for the array case.
I see your point: temporary buffers for reads or writes would be needed
since array is a type used for storage whereas struct is not. A byte order
indicator leads to more user code in doing the byte swapping and is
inconsistent with struct, but it does avoid all (potentially large)
temporary buffer generation whether byte swapping is needed or not, which
is important.
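
The cost on the writing side can be sketched as below: with only an indicator available, user code swaps in place, writes, and swaps back, so no temporary copy of the data is made. The helper name write_in_byte_order is hypothetical, and the sketch assumes a current Python 3 with sys.byteorder.

```python
import array
import io
import sys


def write_in_byte_order(a, fileobj, byteorder):
    """Hypothetical helper: write array `a` to `fileobj` in the requested
    byte order ('little' or 'big') without copying the array."""
    if byteorder == sys.byteorder:
        a.tofile(fileobj)
        return
    a.byteswap()          # swap in place before writing
    try:
        a.tofile(fileobj)
    finally:
        a.byteswap()      # swap back so the caller's data is unchanged


# Assumes typecode "i" is 4 bytes wide on this platform.
a = array.array("i", [1, 2])
out = io.BytesIO()
write_in_byte_order(a, out, "big")
print(out.getvalue())
print(a.tolist())  # the array itself is left in native order
```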
Thanks,
Randall
-------------------------------------------------------
Date: 2000-Aug-01 14:12
By: none
Comment:
Actually, all we need is a simpler way to discover the native byte order.
A flag in the struct module would do just fine.
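
The struct module has no such flag, but the native byte order can be inferred from it by comparing packed values. A minimal sketch, assuming a current Python 3; the function name native_byte_order is made up for illustration.

```python
import struct


def native_byte_order():
    """Infer the host byte order by packing the same value twice."""
    # "=" packs in native byte order; "<" always packs little-endian.
    if struct.pack("=h", 1) == struct.pack("<h", 1):
        return "little"
    return "big"


print(native_byte_order())
```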
-------------------------------------------------------
Date: 2000-Aug-14 08:51
By: fdrake
Comment:
The sys.byte_order indicator is now in CVS, and will be available in Python 2.0. I think there's nothing left to fix.
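
With the indicator in place, the original request reduces to a few lines. This is a sketch for a current Python 3, where the attribute is spelled sys.byteorder (not sys.byte_order as written in this thread) and its values are 'little' and 'big'. The sample buffer is invented for illustration.

```python
import array
import sys

# Two 32-bit integers stored little-endian, as read from a file.
buf = b"\x01\x00\x00\x00\x02\x00\x00\x00"

a = array.array("i")      # assumes typecode "i" is 4 bytes wide
a.frombytes(buf)
if sys.byteorder == "big":
    a.byteswap()          # file order differs from host order
val = a.tolist()
print(val)  # [1, 2]
```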
-------------------------------------------------------
For detailed info, follow this link:
http://sourceforge.net/bugs/?func=detailbug&bug_id=110829&group_id=5470