How to convert markup text to plain text in python?

Stefan Behnel stefan_ml at
Sun Feb 3 18:34:31 CET 2008

geoffbache wrote:
> I have some marked up text and would like to convert it to plain text,
> by simply removing all the tags. Of course I can do it from first
> principles but I felt that among all Python's markup tools there must
> be something that would do this simply, without having to create an
> XML parser etc.
> I've looked around a bit but failed to find anything, any tips?
> (e.g. convert "<B>Today</B> is <U>Friday</U>" to "Today is Friday")

   >>> import lxml.etree as et
   >>> doc = et.HTML("<b>Today</b> is <u>Friday</u>")
   >>> et.tostring(doc, method='text', encoding=unicode)
   u'Today is Friday'
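
For readers on Python 3: the `unicode` builtin no longer exists there, so the call above would need `encoding="unicode"` (or `encoding=str`) instead. If installing lxml is not an option, roughly the same result can be had from the standard library alone. The following is only a sketch of that alternative, not the lxml approach shown above; the names `TagStripper` and `strip_tags` are made up for illustration, and it assumes that simply concatenating the text nodes is what is wanted (it will, for example, also keep the contents of `<script>` or `<style>` elements).

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps the text nodes and drops the tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags
        self.chunks.append(data)


def strip_tags(markup):
    """Return the text content of `markup` with all tags removed."""
    parser = TagStripper()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


print(strip_tags("<B>Today</B> is <U>Friday</U>"))  # prints: Today is Friday
```

Unlike lxml, `html.parser` does not build a tree or repair broken markup; it just reports tags and text as it encounters them, which is enough for plain tag removal.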


