<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Sorry Raymond if this offends you.<br>
But after some extensive use of namedtuple I think it needs <br>
a re-design.<br>
<br>
The pros:<br>
_________<br>
<br>
- a namedtuple can be easily used in place of a tuple.<br>
- it provides names for its fields and behaves like a tuple,
otherwise.<br>
Furthermore:<br>
- a named tuple is the natural choice for modelling the records of a
database.<br>
<br>
The cons:<br>
<br>
- All of the above becomes wrong if you use namedtuple as a real
replacement for tuple.<br>
- especially for databases it makes little sense as it is.<br>
<br>
Reason:<br>
<br>
Pickling<br>
________<br>
<br>
- pickling of tuples:<br>
always possible if they contain only built-in types.<br>
tuples are simply tuples. There is just _one_ type.<br>
<br>
- pickling of namedtuple:<br>
sometimes possible, if your definition is static enough.<br>
namedtuple has a subtype per tuple layout, and you need to cope
with that.<br>
<br>
Just to be clear about that: Sure, it is possible to pickle named
tuples,<br>
but you have to think about it! And having to think about it
trashes<br>
a lot of the fun of having those names for free.<br>
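<br>
A minimal sketch of the difference, assuming a record class that is created at runtime and never bound to a module-level name. The helper make_record_type and the type name DynamicRow are made up for illustration:<br>
<br>

```python
import pickle
from collections import namedtuple


def make_record_type(field_names):
    # Hypothetical helper: builds the record class at runtime, the way a
    # generic loader would for a layout it only learns from the data.
    return namedtuple("DynamicRow", field_names)


# A plain tuple of built-in values round-trips without any preparation.
plain = ("alice", 42)
plain_copy = pickle.loads(pickle.dumps(plain))

# The namedtuple instance is pickled by reference to its class. No module
# attribute called "DynamicRow" exists, so that lookup fails.
record_type = make_record_type(["name", "age"])
record = record_type("alice", 42)
try:
    pickle.dumps(record)
    pickled_ok = True
except (pickle.PicklingError, AttributeError):
    # PicklingError is the expected error; AttributeError is caught as well
    # as a precaution.
    pickled_ok = False

print(plain_copy == plain, pickled_ok)
```

Binding the class to a module-level name that matches its type name makes the pickle work, which is exactly the "thinking about it" part.<br>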
<br>
And typically this happens after you did your analysis of 20 GB of
data:<br>
You cannot pickle your nicely formatted namedtuple instances after
the fact.<br>
So, to save all that computation, you resort to a hack that turns all
your<br>
live namedtuple instances back into ordinary tuples.<br>
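<br>
One way such a hack might look, as a sketch only: strip the class, keep the field names as plain data next to the values, and pickle that. The function name strip_names and the sample data are assumptions, not an existing API:<br>
<br>

```python
import pickle
from collections import namedtuple


def strip_names(records):
    """Return (field_names, plain_tuples) for a list of namedtuple instances.

    Assumes all records share one layout; the names come from the first one.
    """
    records = list(records)
    field_names = records[0]._fields if records else ()
    return field_names, [tuple(rec) for rec in records]


# A record class that was defined ad hoc during the session.
Result = namedtuple("AnalysisResult", ["sample", "score"])
results = [Result("a", 0.25), Result("b", 0.75)]

# Only built-in types remain, so pickling no longer depends on the class.
payload = pickle.dumps(strip_names(results))
field_names, rows = pickle.loads(payload)

# The names can be re-attached later with any compatible namedtuple.
restored = [Result._make(row) for row in rows]
```

The field names survive as data, but the convenience of attribute access is gone until they are re-attached.<br>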
<br>
<br>
Silent implications introduced by namedtuple:<br>
_____________________________________________<br>
<br>
Without being very explicit, namedtuple makes you use it happily
instead of tuples.<br>
But instead of using a native, simple type, you now use a
not-so-simple,<br>
user-defined type.<br>
This type<br>
<br>
- is not built in<br>
- has a class definition<br>
- needs a global, constant definition _somewhere_<br>
<br>
Typically, you run a script interactively, and typically you need to
pickle<br>
some __main__.somename namedtuple class instances.<br>
This is exactly what you don't want! You want to have some anonymous<br>
data in the pickle and don't want to make anything fixed in stone.<br>
<br>
<br>
namedtuple() Factory Function for Tuples with Named Fields<br>
__________________________________________________________<br>
<br>
It would be great if namedtuple were just this, as the documentation says.<br>
But instead, it <br>
<br>
- creates a named class, i.e. forces me to name it<br>
- makes me create instances of that specific class and not just of tuple.<br>
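<br>
A short illustration of these two points, using only collections.namedtuple; the class names Point and Pair are arbitrary:<br>
<br>

```python
from collections import namedtuple

# The type name is a required argument; each call creates a new class.
Point = namedtuple("Point", ["x", "y"])
Pair = namedtuple("Pair", ["x", "y"])

p = Point(1, 2)
q = Pair(1, 2)

same_value = (p == q == (1, 2))          # compared as tuples
same_type = type(p) is type(q)           # two distinct classes
is_plain_tuple = type(p) is tuple        # a subclass, not tuple itself
is_tuple_subclass = isinstance(p, tuple)
```

The values compare equal as tuples, yet each layout carries its own class identity.<br>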
<br>
<br>
Usability for databases<br>
______________________<br>
<br>
For simple databases which enumerate (employee, salary, ...) or<br>
(shoesize, height, married) as example "database"s, namedtuple is
ok.<br>
<br>
As soon as you write a real database implementation with no fixed<br>
layout, you get into trouble.<br>
<br>
Easy database approach:<br>
<br>
You define a dbtable as a collection of tuples, maybe as a dict with
fast<br>
index keys. Not a problem with tuples, which are of type tuple.<br>
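<br>
As a rough sketch of this approach, with made-up column names and data: the layout is kept as plain data beside the rows, and the whole table pickles without any class definition.<br>
<br>

```python
import pickle

# Layout and rows are ordinary tuples; the index is an ordinary dict.
columns = ("id", "name", "salary")
rows = [(1, "alice", 5000), (2, "bob", 4200)]
by_id = {row[0]: row for row in rows}

# Column access by name goes through a position lookup.
position = {name: i for i, name in enumerate(columns)}
bob_salary = by_id[2][position["salary"]]

# Nothing but built-in types, so the table pickles as it is.
table = pickle.loads(pickle.dumps((columns, by_id)))
```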
<br>
With namedtuple, you suddenly find yourself creating namedtuple
records instead.<br>
But those namedtuple records cannot be pickled when used as a
replacement for<br>
regular tuples, because they now have a dynamically created type,
and extra steps<br>
are necessary to make it possible to pickle them.<br>
<br>
From the documentation:<br>
<br>
<pre>EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')</pre>
<br>
This implicitly suggests that namedtuple is the tool of choice to
support<br>
databases in Python.<br>
Wrong!<br>
It supports simple, fixed data structures in Python.<br>
<br>
For instance, you can use it to define a fixed structural record to
define the<br>
general description of a database field and its attributes.<br>
<br>
For a database itself, however, this is very wrong.<br>
Nobody uses a fixed class definition to define the structure of some
database<br>
tables. Instead, you use a generic approach with a row class that
describes<br>
what a row is, but dynamically.<br>
<br>
<br>
So why all this rant?<br>
___________________<br>
<br>
What I'm trying to explain is that namedtuple should be closer to
tuple!<br>
A namedtuple should not be an explicit class with instances, but a
generic<br>
subclass of tuple, for all namedtuples.<br>
<br>
Then, if the user decides to use a namedtuple to build his own class
upon that,<br>
fine. He then might want to do everything needed to support and
pickle<br>
his class.<br>
<br>
But the namedtuple should be a single (maybe builtin) class that is
just<br>
a tuple with field names, nothing more.<br>
<br>
<br>
Implementation idea (roughly)<br>
_____________________________<br>
<br>
Whatever a namedtuple does, it should behave as closely as possible<br>
to a tuple, just providing attribute names.<br>
Pickling support should be so that the user does not need to know
that<br>
a namedtuple has a special class. Actually, there should be only a
generic<br>
class, and the namedtuple "class" is a template instance that just
holds<br>
the names. Those names could go into some registry or whatever.<br>
<br>
The only interesting thing about a namedtuple is the set of names
used.<br>
This set of names is no reason to drag in the whole import
machinery,<br>
with its associated problems etc. The set of attribute names defines the
namedtuple,<br>
and that's it.<br>
<br>
If it is necessary to have class instances like today, ok. But there
is no<br>
need to search that class in a pickle! Instead, the defining set of
attributes<br>
could be pickled (uniquely stored by tuple comparison), and the
class could<br>
be re-created on-the-fly at unpickling time.<br>
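<br>
A rough sketch of how that could look in pure Python today. This is not a proposed implementation: anon_namedtuple, _rebuild and _layouts are hypothetical names, and the registry is assumed to be a plain dict keyed by the tuple of field names:<br>
<br>

```python
import pickle
from collections import namedtuple

_layouts = {}  # tuple of field names -> generated class


def _rebuild(field_names, values):
    # Called at unpickling time: re-create (or look up) the class on the fly.
    return anon_namedtuple(field_names)(*values)


def anon_namedtuple(field_names):
    """Return the one class that belongs to this tuple of field names."""
    key = tuple(field_names)
    if key in _layouts:
        return _layouts[key]

    class Record(namedtuple("Record", key)):
        __slots__ = ()

        def __reduce__(self):
            # Pickle the defining names and the values, not the class.
            return _rebuild, (key, tuple(self))

    _layouts[key] = Record
    return Record


Row = anon_namedtuple(["name", "age"])
row = Row("alice", 42)
clone = pickle.loads(pickle.dumps(row))
```

The pickle still refers to one global, the function _rebuild, but to none of the generated classes, so new layouts need no definition of their own at unpickling time.<br>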
<br>
<br>
Conclusion<br>
___________<br>
<br>
I love namedtuple, and I hate it. I want to get rid of the second
half of this sentence.<br>
Let us invent one that does not enforce class behavior.<br>
<br>
I am thinking of a prototype...<br>
<br>
cheers - chris<br>
<br>
<br>
p.s.: there is a lot about database design not mentioned here.<br>
<pre class="moz-signature" cols="72">--
Christian Tismer :^) <a class="moz-txt-link-rfc2396E" href="mailto:tismer@stackless.com">&lt;mailto:tismer@stackless.com&gt;</a>
Software Consulting : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 : *Starship* <a class="moz-txt-link-freetext" href="http://starship.python.net/">http://starship.python.net/</a>
14482 Potsdam : PGP key -> <a class="moz-txt-link-freetext" href="http://pgp.uni-mainz.de">http://pgp.uni-mainz.de</a>
phone +49 173 24 18 776 fax +49 (30) 700143-0023
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? <a class="moz-txt-link-freetext" href="http://www.stackless.com/">http://www.stackless.com/</a></pre>
</body>
</html>