[Python-Dev] s1 == (sf % (s1 / sf))? A bad idea?

Peter Funk pf@artcom-gmbh.de
Tue, 3 Apr 2001 01:19:33 +0200 (MEST)

At the moment it is very quiet on Python-dev.  I guess you guys are
all out hunting dead parrots, which escaped from the cages on April 1st. ;-)

So this might be the right moment to present a possibly bad idea (TM).
See below.
Regards, Peter
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany)

Title: String Scanning
Version: $Revision$
Author: pf@artcom-gmbh.de (Peter Funk)
Status: Not yet Draft
Type: Standards Track
Python-Version: 2.2 
Created: 02-Apr-2001


    This document proposes a string scanning feature for Python to
    allow easier string parsing.  The suggested syntax change is to
    allow the use of the division '/' operator for string operands
    as counterpart to the already existing '%' string interpolation
    operator.  In current Python this raises an exception: 'TypeError:
    bad operand type(s) for /'.  With the proposed enhancement the
    expression string1 / format2 should return a simple value, a
    tuple of values, or a dictionary, depending on the content of
    the right operand (the format string).


    This document is in the public domain.


    The feature should mimic the behaviour of the scanf function
    well known to C programmers.  For any format string sf and any
    matching input string si the following pseudo condition should 
    be true:

        string.split( sf % (si / sf) ) == string.split( si )

    That is, modulo any differences in white space, the result of the
    string interpolation using the intermediate result from the string
    scanning operation should look similar to the original input string.
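
    The condition can be tried out in today's Python with the
    existing '%' operator.  This is only a sketch: since '/' on
    strings is not implemented, the scan result is written out by
    hand, and the helper name roundtrips is made up for this
    illustration.

```python
def roundtrips(si, sf, scanned):
    # 'scanned' stands in for the value that "si / sf" is expected to
    # return; it is supplied by hand (assumption), not computed.
    return (sf % scanned).split() == si.split()


# Extra blanks in the input do not matter: prints True
print(roundtrips("12345   John Doe", "%5d %8s", (12345, "John Doe")))

# '%f' prints six decimals ('7.890000'), so this prints False
print(roundtrips("12 34 56 7.890", "%d %d %d %f", (12, 34, 56, 7.89)))
```

    The second call shows why the condition is only a pseudo
    condition: the interpolated string looks similar to the input,
    but numeric conversions need not reproduce it character for
    character.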

    All conversions are introduced by the % (percent sign) character.
    The format string may also contain other characters.  White space
    (such as blanks, tabs, or newlines) in the format string matches
    any amount of white space, including none, in the input.
    Everything else matches only itself.  Scanning stops when an input
    character does not match such a format character.  Scanning also
    stops when an input conversion cannot be made (see below).
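
    The matching rules above can be approximated with the re module.
    The following is a minimal sketch for a modern Python 3, not a
    reference implementation.  The function name scan and the helpers
    _CONV, _FLOAT and _literal are made up for this illustration, and
    several details are assumptions, not part of the proposal: only
    the d, f and s conversions are handled, the whole input has to
    match, a width on %f is ignored, "%8s" takes up to 8 characters
    including blanks, and a bare "%s" takes as few blank-separated
    words as possible.  Because regular expressions backtrack,
    adjacent conversions may split a run of digits differently than
    C's scanf would.

```python
import re

# Hypothetical helper: '/' on strings does not exist, so a plain
# function named scan() stands in for "text / fmt".
_CONV = re.compile(r"%(?:\((\w+)\))?(\d+)?([dfs])")
_FLOAT = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"


def _literal(chunk):
    # White space matches any amount of white space (including none);
    # everything else matches only itself.
    parts = re.split(r"(\s+)", chunk)
    return "".join(r"\s*" if p.isspace() else re.escape(p) for p in parts)


def scan(text, fmt):
    pattern, casts, names = [], [], []
    pos = 0
    for m in _CONV.finditer(fmt):
        pattern.append(_literal(fmt[pos:m.start()]))
        pos = m.end()
        name, width, kind = m.groups()
        if kind == "d":
            digits = r"\d{1,%s}" % width if width else r"\d+"
            pattern.append("([-+]?" + digits + ")")
            casts.append(int)
        elif kind == "f":
            pattern.append("(" + _FLOAT + ")")  # width ignored (assumption)
            casts.append(float)
        else:
            # Assumption: "%8s" takes up to 8 characters, blanks included;
            # a bare "%s" takes as few blank-separated words as possible.
            body = ".{1,%s}" % width if width else r"\S+(?:\s+\S+)*?"
            pattern.append("(" + body + ")")
            casts.append(str)
        names.append(name)
    pattern.append(_literal(fmt[pos:]))
    match = re.fullmatch("".join(pattern), text.strip())
    if match is None:
        raise TypeError("not all arguments filled")
    values = [cast(g) for cast, g in zip(casts, match.groups())]
    if names and all(names):
        return dict(zip(names, values))  # all conversions were named
    return values[0] if len(values) == 1 else tuple(values)
```

    Under these assumptions scan("12345 John Doe", "%5d %8s") returns
    (12345, 'John Doe'), and a format with more conversions than the
    input can fill raises TypeError.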


    Here is an example of an interactive session exhibiting the
    expected behaviour of this feature.

        >>> "12345 John Doe" / "%5d %8s"
        (12345, 'John Doe')
        >>> "12 34 56 7.890" / "%d %d %d %f"
        (12, 34, 56, 7.8899999999999997)
        >>> "12345 John Doe, Foo Bar" / "%(num)d %(n)s, %(f)s %(b)s"
        {'n': 'John Doe', 'f': 'Foo', 'b': 'Bar', 'num': 12345}
        >>> "1 2" / "%d %d %d"
        Traceback (innermost last):
          File "<stdin>", line 1, in ?
        TypeError: not all arguments filled


    This should fix the asymmetry between arithmetic types and strings.
    It should also make life easier for C programmers migrating to
    Python (see FAQ 4.33).  Those poor souls are accustomed to scanf
    as the counterpart of printf and usually feel uneasy about
    converting to string splitting, slicing, or the syntax of regular
    expressions.

Security Issues

    There should be no security issues.


    There is no implementation yet.  This is just an idea.

Local Variables:
mode: indented-text
indent-tabs-mode: nil