
Stéfan van der Walt <stefan@sun.ac.za> writes:
> We should certainly look at applying the non-API-changing parts, though. I'm not sure what the best way is to represent these structures on the Python side.
> Thouis, you've thought about this a lot: could you tell us the pros and cons of switching to the new representation?

The reason Ray and I changed some of the representations is that we wanted the mapping from Matlab to Python to be symmetric: anything read from a MAT-file should be represented in a way that allows the writer code to write it back in its original form. This requires that the original Matlab type be deducible from the Python representation.

* Struct arrays: Matlab struct arrays were previously represented as numpy arrays of dtype=object filled with instances of mat_struct. The problem is that Matlab cell arrays were also represented as numpy arrays of dtype=object. The writer code could in most cases have identified structs by looking at the contents (instances of mat_struct), but there was no way to distinguish a 0x0 cell array from a 0x0 struct array. We therefore opted to represent struct arrays as numpy record arrays.

  In order not to break existing code, we could introduce a keyword argument to loadmat that selects the old or new representation, similar to numpy.histogram's "new" argument. In 0.7, leaving the argument out would default to False (old behavior), but give a deprecation warning. Later versions can first change the default to True and then remove the old behavior entirely. The best name I can think of for this keyword argument is "struct_as_record".

* Char arrays/strings: Same story. At the lowest level, the code represented char arrays as numpy arrays of dtype='U1', which is fine. A very useful "processor function" (in miobase) turns them into arrays of strings, however. This processor function created an array of dtype=object. We changed this to 'U...' so the array could be distinguished from a cell array. I think this is unlikely to break any code, do you agree?

* Objects: This change in representation was purely for our convenience, and we should be able to fix our patch to keep the old representation.

Vebjorn
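
For illustration, the deprecation path proposed for "struct_as_record" could look roughly like the sketch below. It is not the scipy.io code: load_structs, the _UNSET sentinel and the dict-based "record" container are hypothetical stand-ins that need only the standard library. The point is only the keyword handling: omitting the argument keeps the old behavior but warns, and passing it explicitly is silent.

```python
import warnings

# Sentinel so that "argument omitted" can be told apart from an explicit False.
_UNSET = object()


def load_structs(raw_structs, struct_as_record=_UNSET):
    """Hypothetical stand-in for loadmat's struct handling.

    raw_structs is a list of dicts, one dict per struct-array element.
    """
    if struct_as_record is _UNSET:
        # Old behavior stays the default for now, but callers are warned.
        warnings.warn(
            "struct_as_record was not given; defaulting to False (old "
            "representation). A later version may default to True.",
            DeprecationWarning,
            stacklevel=2,
        )
        struct_as_record = False

    if struct_as_record:
        # New style: the container itself says "struct" and carries the
        # field names, so an empty struct array is still recognisable.
        fields = tuple(raw_structs[0]) if raw_structs else ()
        rows = [tuple(s[f] for f in fields) for s in raw_structs]
        return {"kind": "struct", "fields": fields, "rows": rows}

    # Old style: a bare list of objects. When it is empty, nothing says
    # whether it came from a struct array or a cell array.
    return list(raw_structs)
```

Calling load_structs([]) returns a plain empty list and emits a DeprecationWarning, whereas load_structs([], struct_as_record=True) returns a container that is still tagged as a struct, which is the information a writer would need to round-trip an empty struct array.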