[Patches] [ python-Patches-1022173 ] Improve Template error detection and reporting

SourceForge.net noreply at sourceforge.net
Mon Sep 6 01:22:20 CEST 2004


Patches item #1022173, was opened at 2004-09-04 02:54
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1022173&group_id=5470

Category: Library (Lib)
Group: Python 2.4
Status: Open
Resolution: None
Priority: 6
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Martin v. Löwis (loewis)
Summary: Improve Template error detection and reporting

Initial Comment:
Report line number and token rather than just character
position.

Detect and report situations where non-ASCII alphabet
characters are used in a placeholder name.
Currently, this situation results in a silent error for
SafeTemplates and either a KeyError or mis-substitution
for Templates.

Does not change the API or existing tests.
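
The first point, reporting a line number instead of a raw character
position, can be sketched as below. This is a minimal illustration, not the
code from the patch; the helper name `locate` and the sample text are
hypothetical.

```python
def locate(template, pos):
    """Return (line, column), both 1-based, for character offset pos."""
    # Every newline before pos pushes the offset one line further down.
    line = template.count("\n", 0, pos) + 1
    # rfind returns -1 on the first line, which makes the column pos + 1.
    last_newline = template.rfind("\n", 0, pos)
    column = pos - last_newline
    return line, column


text = "Hello $name\nBad token: $ here\n"
pos = text.index("$ ")          # offset of the stray "$"
line, column = locate(text, pos)
print("Invalid placeholder at line %d, column %d" % (line, column))
```

In a long template, a (line, column) pair lets the reader jump to the
problem directly, which a bare offset does not.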


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2004-09-06 01:22

Message:
Logged In: YES 
user_id=21627

If the string is a Unicode string, you can use .isalpha(). If
it is a byte string, then it is impossible to determine
letters (strictly speaking, it is even impossible to
determine ASCII letters then).
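
A small sketch of this distinction, assuming Python 3 semantics (where `str`
is the Unicode type and `bytes` is the byte string). The variable names are
illustrative only.

```python
word = "ma\u00f1ana"  # "mañana" as a Unicode string

# On a Unicode string, isalpha() classifies letters without any locale.
non_ascii_letters = [ch for ch in word if ch.isalpha() and ord(ch) > 127]

# The same text as bytes: which letter 0xF1 stands for depends entirely
# on the encoding the reader assumes.
raw = word.encode("latin-1")
decoded_latin1 = raw.decode("latin-1")
decoded_cyrillic = raw.decode("iso8859-5")

print(non_ascii_letters)
print(decoded_latin1 == word, decoded_cyrillic == word)
```

Without knowing the encoding, the byte string alone cannot say whether 0xF1
is a letter at all, which is the point being made above.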

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-06 00:41

Message:
Logged In: YES 
user_id=80475

Can you recommend a locale-independent approach to
detecting alphabetic characters outside of A-Za-z?

For SafeTemplates especially, capturing only $ma is a small
disaster.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2004-09-05 23:37

Message:
Logged In: YES 
user_id=21627

I think the locale should have no effect whatsoever on
templates. The template most likely uses the encoding of the
source code, which may or may not be the encoding of the locale
at run-time. In many cases, it won't, as the run-time locale
will be "C", as locale.setlocale has not been called.

Of course, it might be possible to state an explicit set of
termination characters (e.g. all ASCII punctuation and
whitespace) and mandate that the template either terminates
with one of these, or uses explicit parentheses. That would
mean that the only requirement is that the source encoding
is an ASCII superset, which is a requirement, anyway.

Whether such a change to the PEP is still possible at this
point in time, I don't know.
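
The explicit-terminator idea could look roughly like the sketch below. It is
an assumption-laden illustration, not PEP 292 behaviour: the names
`TERMINATORS`, `NAME` and `substitute` are hypothetical, the underscore is
deliberately left out of the terminator set, and the bracketed form and `$$`
escapes are not handled.

```python
import re
import string

# Terminators: ASCII punctuation (minus "_") and ASCII whitespace.
TERMINATORS = string.punctuation.replace("_", "") + string.whitespace

# A placeholder is "$" followed by everything up to the next terminator,
# so non-ASCII letters stay part of the name.
NAME = re.compile(r"\$([^%s]+)" % re.escape(TERMINATORS))


def substitute(template, mapping):
    """Replace each $name with mapping[name]; raises KeyError if missing."""
    return NAME.sub(lambda m: str(mapping[m.group(1)]), template)


print(substitute("vamos $mañana, o hoy", {"mañana": "ahora"}))
```

Because only ASCII characters act as terminators, this works for any source
encoding that is an ASCII superset, at the cost of accepting names that are
not Python identifiers.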

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-05 21:30

Message:
Logged In: YES 
user_id=80475

Martin, I'm failing to articulate something that seems
obvious to me.  Can you add your thoughts on the most
user-friendly way to treat a placeholder like $mañana in a
Latin locale?

Currently, it captures $ma and proceeds.  My thought is to
raise a ValueError noting that $mañana contains characters
other than _A-Za-z0-9.
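
One way the proposed check might be written, as a sketch only: the function
name `check_placeholders` and the message wording are hypothetical, Python 3
string semantics are assumed, and `$$` escapes and braced placeholders are
ignored for brevity.

```python
import re

# The ASCII-only placeholder form discussed in this thread.
ASCII_NAME = re.compile(r"\$([_a-zA-Z][_a-zA-Z0-9]*)")


def check_placeholders(template):
    """Raise ValueError if a placeholder runs straight into a non-ASCII letter."""
    for m in ASCII_NAME.finditer(template):
        end = m.end()
        # The match is greedy, so a letter right after it cannot be ASCII.
        if template[end:end + 1].isalpha():
            # Extend to the end of the word the user most likely meant.
            while end < len(template) and (
                template[end].isalnum() or template[end] == "_"
            ):
                end += 1
            token = template[m.start():end]
            raise ValueError(
                "placeholder %r contains characters other than _A-Za-z0-9"
                % token
            )


try:
    check_placeholders("vamos $mañana o esté dia")
except ValueError as exc:
    print(exc)
```

The check reports the whole token the user wrote rather than silently
capturing $ma.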

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-05 21:19

Message:
Logged In: YES 
user_id=80475

Sure, it is documented that way.  That doesn't mean that we
can't give a useful error message when a potentially common
end-user mistake is made.

The locale has no effect on valid Python identifiers;
however, it is a strong indicator of what the user expects
to be valid alphabetical characters.  The idea is to avoid a
silent failure for non-programmer end users who may
understandably not know that some of their everyday
characters will be viewed as delimiters by the template
logic.  As it stands, it is a usability bug (documented, but
a problem nevertheless).

MvL concurred when I discussed with him two weeks ago.



----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2004-09-05 17:34

Message:
Logged In: YES 
user_id=12800

Why would the locale have any effect on what Python defines
an identifier as?  The PEP and documentation clearly state
that the substitution variables are Python identifiers and
that's a well-defined, locale-neutral concept.

The resolution of your hypothetical bug report is:

Won't Fix -- "mañana" is not a Python identifier.  You can't
use it as a variable in regular Python code, and you can't
use it as a placeholder in a Template string.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-04 21:44

Message:
Logged In: YES 
user_id=80475

When a user has their locale set so that a different alphabet 
is enabled, then they see $mañana as a single token.  To 
them, the third character is not out of place with the rest -- 
any more than we think of the letter "d" as being special.  
In such a case, SafeTemplate will pass the error silently.

Bug Report:
"""SafeTemplate broke.  I see placeholder in dictionary but it 
no substitute.  Please fix.

>>> SafeTemplate("vamanos $mañana o esté dia") % {'mañana':'ahora'}
u'vamanos $mañana o esté dia'

"""

The templates are likely to be exposed to end users (non-
programmers).  The above is not an unlikely scenario.  We 
should give the users as much help as possible.  

Yes, tests and docs will be updated if accepted.  It's a waste 
of time to do so now if you think that $ma was the intended 
placeholder and want the silent error to pass.

Also, the line number / token is an important part of the 
error message.  In a long template, it is useless to say that 
there is an error at position 23019 for example.
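
The scenario can be reproduced with the `string.Template` class as it later
shipped in the standard library. The sketch assumes that released API
(`substitute` and `safe_substitute`) rather than the `SafeTemplate` class and
`%` operator used in this thread.

```python
from string import Template

t = Template("vamanos $mañana o esté dia")

# The placeholder is parsed as $ma, so the 'mañana' key is never looked up
# and safe_substitute leaves the text untouched without any warning.
unchanged = t.safe_substitute({"mañana": "ahora"})

# If the mapping happens to contain 'ma', the result is a mis-substitution.
partial = t.safe_substitute({"ma": "X"})

# The strict variant fails, but names 'ma' rather than the intended token.
try:
    t.substitute({"mañana": "ahora"})
    missing = None
except KeyError as exc:
    missing = exc.args[0]

print(unchanged)
print(partial)
print(missing)
```

All three outcomes match the cases listed in the initial comment: a silent
pass, a mis-substitution, and a KeyError.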


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2004-09-04 19:15

Message:
Logged In: YES 
user_id=12800

I wonder about this patch.  PEP 292 clearly says that the
first non-identifier character terminates the placeholder. 
So why would you expect that the eñe would cause an
exception to be raised instead of a valid substitution for $ma?

Will discuss on python-dev, but in any event, if we accept
this patch we would need a unittest update, as well as
documentation and PEP 292 updates.

----------------------------------------------------------------------
