[Python-ideas] Making PyStructSequence expose _fields (was Re: namedtuple base class)

Andrew Barnert abarnert at yahoo.com
Mon Jan 13 01:32:21 CET 2014


Here's a quick patch:

diff -r bc5f257f5cc1 Lib/test/test_structseq.py
--- a/Lib/test/test_structseq.py	Sun Jan 12 14:12:59 2014 -0800
+++ b/Lib/test/test_structseq.py	Sun Jan 12 16:31:15 2014 -0800
@@ -28,6 +28,16 @@
         for i in range(-len(t), len(t)-1):
             self.assertEqual(t[i], astuple[i])
 
+    def test_fields(self):
+        t = time.gmtime()
+        self.assertEqual(t._fields,
+                         ('tm_year', 'tm_mon', 'tm_mday', 'tm_hour', 'tm_min', 
+                          'tm_sec', 'tm_wday', 'tm_yday', 'tm_isdst'))
+        st = os.stat(__file__)
+        self.assertIn("st_mode", st._fields)
+        self.assertIn("st_ino", st._fields)
+        self.assertIn("st_dev", st._fields)
+
     def test_repr(self):
         t = time.gmtime()
         self.assertTrue(repr(t))
diff -r bc5f257f5cc1 Objects/structseq.c
--- a/Objects/structseq.c	Sun Jan 12 14:12:59 2014 -0800
+++ b/Objects/structseq.c	Sun Jan 12 16:31:15 2014 -0800
@@ -7,6 +7,7 @@
 static char visible_length_key[] = "n_sequence_fields";
 static char real_length_key[] = "n_fields";
 static char unnamed_fields_key[] = "n_unnamed_fields";
+static char _fields_key[] = "_fields";
 
 /* Fields with this name have only a field index, not a field name.
    They are only allowed for indices < n_visible_fields. */
@@ -14,6 +15,7 @@
 _Py_IDENTIFIER(n_sequence_fields);
 _Py_IDENTIFIER(n_fields);
 _Py_IDENTIFIER(n_unnamed_fields);
+_Py_IDENTIFIER(_fields);
 
 #define VISIBLE_SIZE(op) Py_SIZE(op)
 #define VISIBLE_SIZE_TP(tp) PyLong_AsLong( \
@@ -327,6 +329,7 @@
     PyMemberDef* members;
     int n_members, n_unnamed_members, i, k;
     PyObject *v;
+    PyObject *_fields;
 
 #ifdef Py_TRACE_REFS
     /* if the type object was chained, unchain it first
@@ -389,6 +392,23 @@
     SET_DICT_FROM_INT(real_length_key, n_members);
     SET_DICT_FROM_INT(unnamed_fields_key, n_unnamed_members);
 
+    _fields = PyTuple_New(desc->n_in_sequence);
+    if (!_fields)
+        return -1;
+    for (i = 0; i != desc->n_in_sequence; ++i) {
+        PyObject *field = PyUnicode_FromString(members[i].name);
+        if (!field) {
+            Py_DECREF(_fields);
+            return -1;
+        }
+        PyTuple_SET_ITEM(_fields, i, field);
+    }
+    if (PyDict_SetItemString(dict, _fields_key, _fields) < 0) {
+        Py_DECREF(_fields);
+        return -1;
+    }
+    Py_DECREF(_fields);
+
     return 0;
 }
 
@@ -417,7 +437,8 @@
 {
     if (_PyUnicode_FromId(&PyId_n_sequence_fields) == NULL
         || _PyUnicode_FromId(&PyId_n_fields) == NULL
-        || _PyUnicode_FromId(&PyId_n_unnamed_fields) == NULL)
+        || _PyUnicode_FromId(&PyId_n_unnamed_fields) == NULL
+        || _PyUnicode_FromId(&PyId__fields) == NULL)
         return -1;
 
     return 0;
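
A rough Python sketch of what the new test checks, not part of the patch itself. It assumes nothing about whether the interpreter has the patch: the getattr falls back to None when _fields is missing, and the EXPECTED tuple is copied from test_fields above.

```python
import time

# Field names asserted by test_fields in the patch above.
EXPECTED = ('tm_year', 'tm_mon', 'tm_mday', 'tm_hour', 'tm_min',
            'tm_sec', 'tm_wday', 'tm_yday', 'tm_isdst')

t = time.gmtime(0)

# None on an interpreter that does not expose _fields on structseq types.
fields = getattr(t, '_fields', None)

# Looking up each expected name as an attribute reproduces the tuple
# contents, which is the correspondence _fields is meant to document.
by_name = tuple(getattr(t, name) for name in EXPECTED)

print(fields)
print(by_name == tuple(t))
```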




----- Original Message -----
> From: Andrew Barnert <abarnert at yahoo.com>
> To: "python-ideas at python.org" <python-ideas at python.org>
> Cc: 
> Sent: Sunday, January 12, 2014 4:17 PM
> Subject: [Python-ideas] Making PyStructSequence expose _fields (was Re: namedtuple base class)
> 
> I don't think the proposed NamedTuple ABC adds anything on top of duck 
> typing on _fields (or on whichever other method you need, and possibly checking 
> for Sequence). As Raymond Hettinger nicely summarized it, namedtuple is a 
> protocol, not a type.
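
For illustration, the duck-typing check described above might look like the following sketch. The helper name looks_like_named_tuple is hypothetical, and the exact conditions (a Sequence whose _fields is a tuple of strings of matching length) are one possible reading of "duck typing on _fields", not an agreed definition.

```python
from collections import namedtuple
from collections.abc import Sequence


def looks_like_named_tuple(obj):
    """Hypothetical helper: duck-type on _fields instead of using an ABC."""
    fields = getattr(obj, '_fields', None)
    return (isinstance(obj, Sequence)
            and isinstance(fields, tuple)
            and all(isinstance(name, str) for name in fields)
            and len(fields) == len(obj))


Point = namedtuple('Point', ['x', 'y'])

print(looks_like_named_tuple(Point(1, 2)))  # a collections.namedtuple instance
print(looks_like_named_tuple((1, 2)))       # a plain tuple has no _fields
```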
> 
> But I think one of the ideas that came out of that discussion is worth pursuing 
> on its own: giving a _fields member to every structseq type.
> 
> Most of the namedtuple-like classes in the builtins/stdlib, like os.stat_result, 
> are implemented with PyStructSequence. Since 3.3, that's been a public, 
> documented API. A structseq type is already a tuple subclass. And it stores all 
> the information needed to expose the fields to Python; it just doesn't expose 
> them in any way. And making it do so is easy. (Either add _fields to the type 
> __dict__ at type creation, or add a getter that generates it on the fly from 
> tp_members.)
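
To show that the information is already there, here is a sketch that digs the names out of the type today. It relies on a CPython implementation detail (structseq fields showing up as member descriptors in the type's __dict__), and the helper name member_names is hypothetical. It returns every named member, including the non-sequence ones, and sorts them because the ordering in the class dict is not something to depend on.

```python
import time
import types


def member_names(cls):
    """Hypothetical helper: names of the member descriptors on cls (CPython detail)."""
    return sorted(name for name, value in vars(cls).items()
                  if isinstance(value, types.MemberDescriptorType))


names = member_names(time.struct_time)
print(names)
```

Having to do this kind of digging is the argument for exposing _fields directly.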
> 
> Of course a structseq can do more than a namedtuple. In particular, using a 
> structseq via its _fields would mean that you miss its "non-sequence" 
> fields, like st_mtime_ns. But then that's already true for using a structseq 
> as a sequence, or just looking at its repr, so I don't think that's a 
> problem. (The "visible fields" are visible for a reason…)
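
The visible/non-sequence split can be seen from Python with the attributes structseq types already expose. This is only a sketch; the exact value of n_fields for os.stat_result varies by platform.

```python
import os

st = os.stat(os.curdir)
cls = type(st)

visible = cls.n_sequence_fields   # fields reachable by index or iteration
total = cls.n_fields              # all fields, including attribute-only ones

# st_mtime_ns is reachable as an attribute but is not one of the
# indexable tuple items.
has_ns = hasattr(st, 'st_mtime_ns')

print(len(st), visible, total, has_ns)
```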
> 
> And this still wouldn't mean that _fields is part of the "named tuple 
> protocol" described in the glossary, just that it's part of structseq 
> types as well as collections.namedtuple types.
> 
> And this wouldn't give structseq an on-demand __dict__ so you can just call 
> vars(s) instead of OrderedDict(zip(s._fields, s)).
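
The OrderedDict idiom in question, sketched with a collections.namedtuple since that is the kind of type that has _fields today. The helper name fields_dict is hypothetical; with _fields on structseq types the same function would presumably work on them unchanged.

```python
from collections import OrderedDict, namedtuple


def fields_dict(s):
    """Hypothetical helper: map field names to values for anything with _fields."""
    return OrderedDict(zip(s._fields, s))


Point = namedtuple('Point', ['x', 'y'])
d = fields_dict(Point(1, 2))
print(d)
```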
> 
> Still, it seems like a clear win. A small patch, a bit of extra storage on each 
> structseq type object (not on the instances), and now you can reflect on the 
> most common kind of C named tuple types the same way you do on the most common 
> kind of Python named tuple types.