Python evolution: Unease

Bulba! bulba at bulba.com
Wed Jan 5 14:08:42 EST 2005


On Wed, 5 Jan 2005 07:37:25 -0600, Skip Montanaro <skip at pobox.com>
wrote:

>
>    Terry> Numarray has a record array type.  If there is not one publicly
>    Terry> available, perhaps you could write a CSV file to record-array
>    Terry> slurper and contribute it to the Recipes site or maybe even the
>    Terry> CSV module.
>    >> 
>    >> -1 on putting such a beast into the CSV module, especially if, as it
>    >> seems, it would rely on something outside the core.
>
>    Carlos> Although I see your point, in the long term it will be required.
>    Carlos> Assuming that Numarray will, at some point in the future, be
>    Carlos> included in the stdlib... why not give these people some help,
>    Carlos> easing the integration? 
>
>I'm not sure they really need my help.  I've never needed Numarray (or
>Numeric) in my own work.  

I've never needed numeric stuff either. I just need to do things like:

.>>> table.sort(column_name)   # that obviously would sort the rows of
                               # table by the values in column column_name
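Something close to this is already possible with plain lists of dicts. A minimal sketch, assuming each row is a dict keyed by column name (the sample data and the name sort_rows are made up for illustration, not an existing table API):

```python
from operator import itemgetter

# Hypothetical table: a list of dicts, one dict per row.
table = [
    {"name": "Smith", "dept": "QA"},
    {"name": "Adams", "dept": "Dev"},
    {"name": "Jones", "dept": "Ops"},
]

def sort_rows(rows, column_name):
    """Return a new list of rows ordered by the values in column_name."""
    return sorted(rows, key=itemgetter(column_name))

sorted_table = sort_rows(table, "name")
```

This returns a new list rather than sorting in place, so the original table stays untouched.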

or

.>>> unique = table.unique(column_name)
# that would yield the subset of rows that contain unique values
# in column column_name
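One possible reading of "unique" is "keep the first row seen for each distinct value". A rough sketch under that assumption, again with rows as dicts (unique_rows is a hypothetical helper, and the column values must be hashable):

```python
def unique_rows(rows, column_name):
    """Keep the first row seen for each distinct value in column_name."""
    seen = set()
    result = []
    for row in rows:
        value = row[column_name]
        if value not in seen:
            seen.add(value)
            result.append(row)
    return result

# Hypothetical sample data.
table = [
    {"name": "Smith", "dept": "QA"},
    {"name": "Adams", "dept": "Dev"},
    {"name": "Jones", "dept": "QA"},
]
unique = unique_rows(table, "dept")
```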

or

.>>> table1.union(table2 [,drop_missing_columns |
,fill_missing_with_None])
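A sketch of how the two options might behave, assuming rows are dicts and that "drop missing columns" means "keep only the columns every row has". The function name and the single boolean flag are my guesses at an interface, nothing more:

```python
def union(rows1, rows2, fill_missing_with_none=True):
    """Concatenate two tables whose columns may differ.

    fill_missing_with_none=True keeps every column and pads gaps with None;
    False keeps only the columns present in every row.
    """
    all_rows = list(rows1) + list(rows2)
    if not all_rows:
        return []
    if fill_missing_with_none:
        # Collect every column name that appears anywhere.
        columns = set()
        for row in all_rows:
            columns.update(row)
    else:
        # Keep only the column names shared by all rows.
        columns = set(all_rows[0])
        for row in all_rows[1:]:
            columns &= set(row)
    return [{col: row.get(col) for col in columns} for row in all_rows]

# Hypothetical sample data with partly different columns.
table1 = [{"name": "Smith", "dept": "QA"}]
table2 = [{"name": "Adams", "phone": "555"}]
padded = union(table1, table2)
trimmed = union(table1, table2, fill_missing_with_none=False)
```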

or 

.>>> common = table1.intersection(table2, column_name [, unique |
, redundant]) 

# that would yield all the rows that have the same values in
column_name in both table1 and table2; if the optional keyword
"unique" were given, those could e.g. be only the rows
from table1; if the "redundant" keyword were specified,
that could be a union of the common rows from table1 and table2
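Roughly, the behaviour described above could be sketched like this. It assumes rows are dicts with hashable values in column_name, and it folds "unique" and "redundant" into one hypothetical boolean flag:

```python
def intersection(rows1, rows2, column_name, redundant=False):
    """Rows whose column_name value occurs in both tables.

    redundant=False ("unique"): only the matching rows from rows1.
    redundant=True: matching rows from rows1 followed by those from rows2.
    """
    keys1 = {row[column_name] for row in rows1}
    keys2 = {row[column_name] for row in rows2}
    shared = keys1 & keys2
    common = [row for row in rows1 if row[column_name] in shared]
    if redundant:
        common += [row for row in rows2 if row[column_name] in shared]
    return common

# Hypothetical sample data.
table1 = [{"id": 1, "name": "Smith"}, {"id": 2, "name": "Adams"}]
table2 = [{"id": 2, "name": "Adams, J."}, {"id": 3, "name": "Jones"}]
common = intersection(table1, table2, "id")
both = intersection(table1, table2, "id", redundant=True)
```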

or

.>>> complement = table.complement(complement_function)
# where complement_function could be anything, like (not cellvalue) or
# string.upper; that obviously would run complement_function
# on every cell in the table

(obviously, this could also be implemented as 
map(complement_function, table) )
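A sketch of the per-cell version, assuming rows are dicts (map_cells is a made-up name). Note that the built-in map takes the function first, and that mapping over the table directly would hand the function whole rows rather than cells, so a small wrapper is needed either way:

```python
def map_cells(function, rows):
    """Apply function to every cell and return a new table of the results."""
    return [
        {col: function(value) for col, value in row.items()}
        for row in rows
    ]

# Hypothetical sample data.
table = [{"name": "smith", "dept": "qa"}]
shouted = map_cells(str.upper, table)
```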

Now suppose a programmer could write a custom complement function
that detects all the irregularly distributed "anomalous" data points
(be it whatever, missing surnames from personnel records or values
from a physical experiment that are below some threshold) in this
table and returns, say, a list of tuples that are coordinates of those
data points. Getting it from a specific table would be a matter of one
instruction!
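To make that concrete, here is a minimal sketch that assumes rows are dicts and that "coordinates" means (row index, column name) pairs. Both find_anomalies and the is_anomalous predicate are hypothetical names:

```python
def find_anomalies(rows, is_anomalous):
    """Return (row_index, column_name) for every cell the predicate flags."""
    return [
        (index, col)
        for index, row in enumerate(rows)
        for col, value in row.items()
        if is_anomalous(value)
    ]

# Hypothetical personnel records with missing surnames.
table = [
    {"name": "Jan", "surname": "Kowalski"},
    {"name": "Anna", "surname": ""},
    {"name": "Piotr", "surname": None},
]
missing = find_anomalies(table, lambda value: not value)
```

Swapping in a different predicate (say, lambda value: value < threshold for experimental readings) would need no change to the function itself.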

Yes, I know, it can be written by hand. But by this line of logic, why
bother learning a VHLL and not just stay with C? 

>If it's deemed useful I'm sure someone from that
>community could whip something out in a few minutes.  The concepts
>represented by the csv module are a lot shallower than those represented by
>Numarray.

True, and I may scratch enough time together to learn all the
necessary stuff (I'm not even half done in learning Python)
to write it myself. 

That is not the point, however. The biggest boost, and one of the 
main reasons for getting into Python, at least for me (and I'm sure
this motivates quite a lot of other people too), is precisely 
the ease of exploiting the capabilities of data structures like
dictionaries and lists, which, coupled with their object-style
.methods, are simply very convenient and fast. This is 
where IMHO Python excels among the VHLLs.

I'm about to post a reworked version of my program that doesn't
use a _single_ traditional loop to do all the data transformations
I need (I just still need to solve some problems there / polish
it).  

This is not just about that damn CSV file, which I already have 
the way I wanted it and have sent to the customer; this is about _terse
and clear_ manipulations of rich data structures in Python. Why not
extend them with flexible tables / matrices / arrays that would work
in as "Pythonic" ways as dictionaries and lists already do?

If Pythoners say a=['A'], it's only logical to say a.append('B'). :-)




--
It's a man's life in a Python Programming Association.