[Numpy-discussion] vacuum expansion of strings in record array

Les Schaffer schaffer at optonline.net
Sun Jun 5 08:10:25 EDT 2005


Todd Miller wrote:

>This is a subtle quirk that it would be nice to be without but not an
>accident or bug.  It's an intentional feature since it conforms to the
>FITS file format which motivated the development of records.py to begin
>with.  I think there probably should be a less eclectic subclass of
>RawCharArray,  but don't have the time to write it myself.
>  
>
i take it this is what bit us:

    "When an element of a CharArray is fetched trailing whitespace is
    stripped off. The sole exception to this rule is that a single
    whitespace is never stripped down to the empty string."


so the strings are all the same length in storage, and the empty string 
'' expands to this fixed size WITH SPACES and then contracts down to a 
single space when retrieved.  true inflation, something from the vacuum.
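
A minimal pure-Python sketch of the rule quoted above. It only mimics the documented behavior and is not the numarray implementation; the helper names `store` and `fetch` are hypothetical.

```python
def store(value, width):
    """Pad a string with spaces to a fixed slot width (hypothetical helper)."""
    if len(value) > width:
        raise ValueError("value does not fit in the slot")
    return value.ljust(width)


def fetch(slot):
    """Strip trailing whitespace, but never all the way down to ''."""
    stripped = slot.rstrip()
    if stripped == "" and slot != "":
        # An all-whitespace slot comes back as a single space.
        return " "
    return stripped


slot = store("", 8)       # the empty string is inflated to 8 spaces
round_trip = fetch(slot)  # and comes back as one space, not ''
```

Under these assumptions the empty string does not survive a store/fetch round trip, while a value such as 'abc' does.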

i am not familiar enough with the FITS file format to know why you would 
want such creation-prone behavior: why not just fill the slots with \0's  
(empty string '' ---> n*\0)? in any case,  i will take a look at what's 
involved in basing a subclassed RecordArray on a variant of the raw char 
array. is this stuff in C or in Python? a quick hint where to look would 
help.
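
For comparison, a rough sketch of the \0-fill idea (empty string '' ---> n*\0). This is an illustration only, not an existing numarray class; `store_nul` and `fetch_nul` are hypothetical names.

```python
NUL = "\0"


def store_nul(value, width):
    """Pad a string with NUL characters to a fixed slot width (hypothetical)."""
    if len(value) > width:
        raise ValueError("value does not fit in the slot")
    return value + NUL * (width - len(value))


def fetch_nul(slot):
    """Drop only the trailing NUL padding; spaces are left alone."""
    return slot.rstrip(NUL)


empty_back = fetch_nul(store_nul("", 8))       # '' survives the round trip
spaces_back = fetch_nul(store_nul("ab  ", 8))  # trailing spaces survive too
```

The trade-off in this sketch is that a value which genuinely ends in \0 would lose those characters on retrieval.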

Record Arrays are nice for holding stuff from Excel tables where columns 
are of similar type, with a column name up top. However, grabbing Excel 
data w/ Python requires COM, which delivers everything as Unicode strings 
that need to be encode()d before RecordArray accepts them. is there a 
plan to support Unicode eventually?
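
A small sketch of the encode step mentioned above, written for Python 3, where text is `str` and `encode()` returns `bytes`. The helper `encode_row`, the sample rows, and the choice of latin-1 are assumptions for illustration; the COM retrieval itself is not shown.

```python
def encode_row(row, encoding="latin-1", errors="strict"):
    """Encode every text cell in a row to bytes; leave other cells untouched."""
    return [
        cell.encode(encoding, errors) if isinstance(cell, str) else cell
        for cell in row
    ]


# Hypothetical rows standing in for what a spreadsheet might deliver.
rows = [["name", "city"], ["Zo\u00eb", "K\u00f6ln"]]
encoded = [encode_row(row) for row in rows]
```

With errors="strict", a character that the chosen encoding cannot represent raises UnicodeEncodeError rather than being silently altered.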

Les Schaffer



