[lxml-dev] [lxml][objectify] optimization of recursive object dumping
Hi, I've posted this before but messed with the threads, so here it is again: (Note: patch line numbers might differ, this was based on 1.1 branch of 2 weeks ago, but I could of course update this and send a new patch) First some background: I'm experimenting with a custom objectified datetime class based on Python's datetime that employs the dateutil.parser module to detect if some element value is in a valid datetime format, i.e. the parse function from dateutil.parser is used to implement the type_check for the PyType type registry. 1) Invoking this parse method is quite expensive, so I want this to happen rarely. As I am using "recursive element dumping" as default I found that for every __str__ call .pyval of the ObjectifiedDataElements in a tree is accessed, which in turn triggers parsing for my custom datetime class. As I don't really see a way to avoid this I propose the introduction of an additional property "_pyval_repr" that can be overridden in subclasses, which makes it possible to simply return element.text, if getting .pyval is expensive. S.th. like: *** ORIG/lxml-1.1/src/lxml/objectify.pyx Wed Sep 27 09:18:30 2006 --- src/lxml/objectify.pyx Wed Oct 4 11:00:09 2006 *************** *** 484,489 **** --- 484,493 ---- def __get__(self): return textOf(self._c_node) + property _pyval_repr: + def __get__(self): + return self.pyval + def __str__(self): return textOf(self._c_node) or '' *************** *** 931,938 **** cdef object _dump(_Element element, int indent): indentstr = " " * indent ! if hasattr(element, "pyval"): ! value = element.pyval else: value = textOf(element._c_node) if value and not value.strip(): --- 935,942 ---- cdef object _dump(_Element element, int indent): indentstr = " " * indent ! if hasattr(element, "_pyval_repr"): ! value = element._pyval_repr else: value = textOf(element._c_node) if value and not value.strip(): This can substantially speed up things for complicated type_check routines (in my usecase :) Holger Der Inhalt dieser E-Mail ist vertraulich. Falls Sie nicht der angegebene Empfänger sind oder falls diese E-Mail irrtümlich an Sie adressiert wurde, verständigen Sie bitte den Absender sofort und löschen Sie die E-Mail sodann. Das unerlaubte Kopieren sowie die unbefugte Übermittlung sind nicht gestattet. Die Sicherheit von Übermittlungen per E-Mail kann nicht garantiert werden. Falls Sie eine Bestätigung wünschen, fordern Sie bitte den Inhalt der E-Mail als Hardcopy an. The contents of this e-mail are confidential. If you are not the named addressee or if this transmission has been addressed to you in error, please notify the sender immediately and then delete this e-mail. Any unauthorized copying and transmission is forbidden. E-Mail transmission cannot be guaranteed to be secure. If verification is required, please request a hard copy version.
participants (2)
-
Holger Joukl
-
Stefan Behnel