[Patches] [ python-Patches-545300 ] sgmllib support for additional tag forms

noreply@sourceforge.net
Fri, 22 Nov 2002 01:23:57 -0800


Patches item #545300, was opened at 2002-04-17 20:16
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Steven F. Lott (slott56)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: sgmllib support for additional tag forms

Initial Comment:
MS-word generated HTML includes declaration 
tags of the form: 
<![if !supportEmptyParas]>&nbsp;<![endif]>
scattered throughout the body of an HTML 
document.

The current sgmllib parse_declaration routine 
rejects these as invalid syntax, where browsers 
tolerate these embedded declarations.

This patch accepts these declaration forms.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-11-22 10:23

Message:
Logged In: YES 
user_id=21627

I now recommend approving this patch. It improves SGML
correctness and, while supporting an MS extension,
explicitly points out that it is doing so.

----------------------------------------------------------------------

Comment By: Steven F. Lott (slott56)
Date: 2002-04-22 20:50

Message:
Logged In: YES 
user_id=328067

My suggestion for handling this MS extension syntax is 
to (1) tolerate the extension without an error, and (2) treat 
it as an SGML marked section, using the 
unknown_decl() callback.  Since this is a separate 
method, subclasses can override it to alter this behavior.
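
A minimal sketch of this suggestion follows. It assumes a toy
scanner rather than the real sgmllib or markupbase code: the
names MS_DECL, TolerantScanner and RecordingScanner are
hypothetical and are not taken from the attached patch. Only
the unknown_decl() callback name comes from the discussion.

```python
import re

# Hypothetical pattern for MS Word declarations such as
# <![if !supportEmptyParas]> and <![endif]>. This is an
# illustration, not the expression used in the patch.
MS_DECL = re.compile(r"<!\[([^\]>]*)\]>")


class TolerantScanner:
    """Toy scanner that splits input into text and MS-style declarations."""

    def feed(self, data):
        pos = 0
        for match in MS_DECL.finditer(data):
            if match.start() > pos:
                self.handle_data(data[pos:match.start()])
            # (1) tolerate the declaration instead of raising an error,
            # (2) report its body through the unknown_decl() callback.
            self.unknown_decl(match.group(1))
            pos = match.end()
        if pos < len(data):
            self.handle_data(data[pos:])

    def handle_data(self, text):
        # Default: discard character data.
        pass

    def unknown_decl(self, data):
        # Default: accept the declaration silently.
        pass


class RecordingScanner(TolerantScanner):
    """Subclass that overrides the callbacks to record what was seen."""

    def __init__(self):
        self.events = []

    def handle_data(self, text):
        self.events.append(("data", text))

    def unknown_decl(self, data):
        self.events.append(("decl", data))


scanner = RecordingScanner()
scanner.feed("<![if !supportEmptyParas]>&nbsp;<![endif]>")
print(scanner.events)
```

Because the declaration handling lives in its own method, a
subclass can drop, log, or reinterpret these sections without
touching the scanning loop.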

The content hidden in these MS-specific marked 
sections appears to always be a &nbsp;.  While it might 
be expedient to completely skip over this junk, doing so 
would make it difficult to handle marked sections in a 
future version of markupbase.

Attached is a revised patch against V1.39 of sgmllib.py 
and 1.4 of markupbase.py

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-04-21 17:11

Message:
Logged In: YES 
user_id=3066

This is the same as bug #505747.

These "tags" are not legal HTML in any form, but are some
Microsoft invention.  It's not entirely clear what the right
thing to do is, but it is clear that we need to deal with
these in some different way.

Changed group to indicate that such changes can only go into
the trunk; feature changes in maintenance versions are not
allowed.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-18 19:23

Message:
Logged In: YES 
user_id=21627

That patch looks wrong: you are changing what a tag is by
removing the underscore, but underscores are allowed in
tag names.

Also, could you please generate the patch against the CVS
version of the code? Your patch doesn't apply to the
current code, which has changed significantly compared to
the version you appear to be using.

There is no way that this can go into 2.1 IMO.

----------------------------------------------------------------------
