Re: [Numpy-discussion] np.loadtxt : yet a new implementation...

Dec. 2, 2008


      On Dec 2, 2008, at 3:12 PM, Ryan May wrote:
...
Pierre GM wrote:
...
Well, looks like the attachment is too big, so here's the
implementation. The tests will come in another message.
A couple of quick nitpicks:
1) On line 186 (in the NameValidator class), you use
excludelist.append() to append a list to the end of a list.  I think  
you
meant to use excludelist.extend()
Good call.
...
2) When validating a list of names, why do you insist on lower casing
them? (I'm referring to the call to lower() on line 207).  On one  
hand,
this would seem nicer than all upper case, but on the other hand this
can cause confusion for someone who sees certain casing of names in  
the
file and expects that data to be laid out the same.
I recall a life where names were case-insensitives, so 'dates' and  
'Dates' and 'DATES' were the same field. It should be easy enough to  
get rid of that limitations, or add a parameter for case-sensitivity


On Dec 2, 2008, at 2:47 PM, Zachary Pincus wrote:
...
Specifically, on line 115 in LineSplitter, we have:
            self.delimiter = delimiter.strip() or None
so if I pass in, say, '\t' as the delimiter, self.delimiter gets set
to None, which then causes the default behavior of any-whitespace-is-
delimiter to be used. This makes lines like "Gene Name\tPubMed ID
\tStarting Position" get split wrong, even when I explicitly pass in
'\t' as the delimiter!
OK, I'll check that.
...
I think that treating an explicitly-passed-in ' ' delimiter as
identical to 'no delimiter' is a bad idea. If I say that ' ' is the
delimiter, or '\t' is the delimiter, this should be treated *just*
like ',' being the delimiter, where the expected output is:
['1', '2', '3', '4', '', '5']
Valid point.
Well, all, stay tuned for yet another "yet another implementation..."
...
Other than those, it's working fine for me here.
Ryan