[Python-bugs-list] [ python-Feature Requests-513840 ] entity unescape for sgml/htmllib

noreply@sourceforge.net noreply@sourceforge.net
Wed, 06 Feb 2002 09:55:07 -0800


Feature Requests item #513840, was opened at 2002-02-06 09:55
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=355470&aid=513840&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Greg Chapman (glchapman)
Assigned to: Nobody/Anonymous (nobody)
Summary: entity unescape for sgml/htmllib

Initial Comment:
The parsers defined in htmllib and sgmllib do not 
provide any facilities for unescaping a tag attribute 
which has an embedded html entityref (i.e., they do 
not provide a way to convert "a&amp;b" to "a&b").  The 
parser in HTMLParser unescapes all tag attributes 
automatically.  I'm not sure that's the right approach 
for sgmllib and htmllib (since it might break existing 
code), but it seems to me that one of the modules 
ought to provide a function or method which can do the 
unescaping if needed.  (I'm not familiar with either 
the SGML or the HTML specification, but I assume one 
of them mandates the escaping of '&' (e.g.) in tag 
attributes.  If so, then it seems appropriate for one 
of the modules to provide a function which undoes the 
mandated transformation.)
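
As a rough illustration of the kind of helper being requested, here is a minimal sketch in modern Python 3, which postdates this request. The function name unescape_attr is hypothetical and is not part of sgmllib, htmllib, or HTMLParser. The sketch assumes that only well-formed references ending in ';' should be replaced and that unknown names are left untouched. It relies on the standard html.entities.name2codepoint table.

```python
import re
from html.entities import name2codepoint

# Matches decimal (&#65;), hexadecimal (&#x41;) and named (&amp;) references.
_ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def unescape_attr(value):
    """Replace entity and character references in an attribute value.

    Hypothetical helper: unknown or out-of-range references are returned
    unchanged rather than raising an error.
    """
    def replace(match):
        ref = match.group(1)
        try:
            if ref.startswith(("#x", "#X")):
                return chr(int(ref[2:], 16))
            if ref.startswith("#"):
                return chr(int(ref[1:]))
            return chr(name2codepoint[ref])
        except (KeyError, ValueError, OverflowError):
            # Unknown entity name or invalid code point: keep the text as-is.
            return match.group(0)

    return _ENTITY_RE.sub(replace, value)


if __name__ == "__main__":
    print(unescape_attr("a&amp;b"))
```

A parser subclass could call such a function on each attribute value it receives in its start-tag handlers, which would leave the default behaviour of existing code unchanged.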


----------------------------------------------------------------------
