[Python-checkins] r54108 - sandbox/trunk/pep3101/README.txt sandbox/trunk/pep3101/setup.py sandbox/trunk/pep3101/unicodeformat.c

patrick.maupin python-checkins at python.org
Sat Mar 3 21:21:36 CET 2007


Author: patrick.maupin
Date: Sat Mar  3 21:21:29 2007
New Revision: 54108

Modified:
   sandbox/trunk/pep3101/README.txt
   sandbox/trunk/pep3101/setup.py
   sandbox/trunk/pep3101/unicodeformat.c
Log:
Added pep_differences.txt to document initial implementation target.
Updated README.txt to move info into pep_differences.
Cleaned up escape-to-markup processing to fix bug and enable
easy alternate syntax testing.
Changed version number in setup.py to reflect the fact we're not at 1.0 yet.
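
For readers unfamiliar with the escape-to-markup rules this log entry
mentions, the following is a small illustration. It assumes the str.format
method that later shipped with Python (2.6 and 3.x), not the sandbox
pep3101 module at this revision, whose behavior may differ in detail.

```python
# Illustration only: this uses the built-in str.format, NOT the sandbox
# pep3101 extension module described in this check-in.

# A single "{" escapes from literal text into markup (a replacement field).
greeting = "Hello, {0}!".format("world")

# A doubled brace is copied to the output as one literal brace.
literal = "{{0}} stays literal, {0} is replaced".format("this")


def lone_close_brace_is_error():
    """Return True if a single, unmatched "}" is rejected."""
    try:
        "oops }".format()
    except ValueError:
        return True
    return False
```

The alternate syntaxes this revision prepares for (syntaxmode values other
than 0) have no counterpart in the built-in method, so they are not shown.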


Modified: sandbox/trunk/pep3101/README.txt
==============================================================================
--- sandbox/trunk/pep3101/README.txt	(original)
+++ sandbox/trunk/pep3101/README.txt	Sat Mar  3 21:21:29 2007
@@ -10,12 +10,10 @@
     Eric V. Smith (eric at trueblade.com)
     Pete Shinner
 
-The code is only half-baked at present
-(development was started at PyCon 2007 and is in progress).
-
-Although the PEP3101 goal is a (unicode) string format method, since
-this is a sandbox, we might do a few more ambitious things as well,
-to see if people like them.
+The code is only half-baked at present.  Development was started at
+PyCon 2007 and is steadily progressing.  The feature set targeted
+for the initial release is documented in pep_differences.txt in
+this directory.
 
 The current plan of record is to make a pep3101 extension module.
 It will have at least the following features:
@@ -25,53 +23,43 @@
     - can be compiled against 2.4, 2.5, and Py3K
     - Works with the string object as well as unicode
 
-The current code has a module which is progressing nicely, and some
-unittests for the current version.
-
 Files:
 
-    - unicodeformat.c is designed to be easily added to Python
-      as a method of the unicode object.
-    - stringformat.c is a wrapper around unicodeformat.c, which
-      "templatizes" the entire file to make it easy to add to Python
-      as a method of the string object.
+    - loadpep.py -- Attempts to add the appropriate build directory
+      to the Python path, then runs the tests.
+    - makefile -- At least one of the developers is a Luddite who
+      can barely make setup.py function.
     - pep3101.h contains definitions for the functions in stringformat
       and unicodeformat
     - pep3101.c contains a module implementation which can be linked
       with these method files for testing and/or use with earlier
       Python versions.
+    - pep_differences.txt documents differences between what is
+      built in this directory, and the original PEP
+    - README.txt -- this file.
     - setup.py  -- Use "build" option to make the extension module
     - test_simpleformat.py  -- initial unittests
     - StringFormat.py -- Talin's original implementation in Python.
       This is only for historical interest: it doesn't exactly match
       the PEP or C implementation.
+    - stringformat.c is a wrapper around unicodeformat.c, which
+      "templatizes" the entire file to make it easy to add to Python
+      as a method of the string object.
+    - unicodeformat.c is designed to be easily added to Python
+      as a method of the unicode object.
 
 Todo:
 
     - finish up format specifier handling
     - document differences between PEP and implementation
+        (in pep_differences.txt)
     - Add docstrings to module
-    - print string offset information on certain errors
-    - Add _flags options
-    - Play with possible implementations for formatting
-      strings against dictionaries as well as the format
-      (dangerous)
+    - Add keyword options and string metadata options
+      as described in pep_differences.
     - Play with possible implementations for exposing
       lowest level format specifier handler for use in
       compatible template systems.
-    - Play with possible options for specifying additional
-      escape syntaxes
     - Should we have stricter checking on format strings?  For example
       type "s" doesn't allow a sign character.  Should specifying one
       be an error?
     - Test suite needs to check for specific exceptions.
-
-_flags options to consider adding:
-
-    - useall=1   means all arguments should be used
-    - allow_leading_under  means leading underbars allowed
-    - syntax=0,1,2,3 -- different syntaxes
-    - hook=object -- callback hook as described in PEP
-    - informational mode to dump exceptions into string
-      (as described in pep)
-    - max_recursion=xxx (default 4)
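
The Todo list in the README hunk above asks whether a sign character on
type "s" should be an error. As a point of comparison only, the format()
built-in that later shipped with Python does reject it. The sketch below
assumes that later built-in, not the sandbox module at this revision.

```python
# Comparison point only: behavior of the later built-in format(),
# which is an assumption relative to the sandbox code in this check-in.

def sign_on_string_is_rejected():
    """Return True if a sign in an "s" format specifier raises ValueError."""
    try:
        format("abc", "+s")
    except ValueError:
        return True
    return False


# A sign is accepted for numeric types.
signed_number = format(42, "+d")
```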

Modified: sandbox/trunk/pep3101/setup.py
==============================================================================
--- sandbox/trunk/pep3101/setup.py	(original)
+++ sandbox/trunk/pep3101/setup.py	Sat Mar  3 21:21:29 2007
@@ -6,6 +6,6 @@
                     )
 
 setup (name = 'pep3101',
-       version = '1.0',
+       version = '0.01',
        description = 'Extension module to implement features of PEP 3101',
        ext_modules = [module1])

Modified: sandbox/trunk/pep3101/unicodeformat.c
==============================================================================
--- sandbox/trunk/pep3101/unicodeformat.c	(original)
+++ sandbox/trunk/pep3101/unicodeformat.c	Sat Mar  3 21:21:29 2007
@@ -161,6 +161,8 @@
     /* For some interface functions, we could have a list or tuple of
        dictionaries to search, e.g. locals()/globals(). */
     int keywords_is_tuple;
+    /* Support for different escape-to-markup syntaxes */
+    int syntaxmode;
 } FmtState;
 
 /* Some forward declarations for recursion */
@@ -1220,6 +1222,25 @@
     return result;
 }
 
+/*
+    get_field_and_render calls get_field_and_spec to get
+    the field object and specification, then calls
+    render_field to output it.
+*/
+static int
+get_field_and_render(FmtState *fs)
+{
+    PyObject *myobj;
+    int ok;
+
+    fs->fieldstart = fs->fmtstr.ptr;
+    myobj = get_field_and_spec(fs);
+    ok = (myobj != NULL) && render_field(fs, myobj);
+    Py_XDECREF(myobj);
+    Py_XDECREF(fs->fieldspec.obj);
+    return ok;
+}
+
 /************************************************************************/
 /******* Output string allocation and escape-to-markup processing  ******/
 /************************************************************************/
@@ -1233,52 +1254,84 @@
 static int
 do_markup(FmtState *fs)
 {
-    PyObject *myobj;
-    CH_TYPE c, *start;
-    Py_ssize_t count, total;
     SubString fmtstr;
-    int doubled, ok;
-
-    fmtstr = fs->fmtstr;
-    ok = 1;
-    c = '\0';  /* Avoid compiler warning */
-    while (fmtstr.ptr < fmtstr.end) {
-        start = fmtstr.ptr;
-        count = total = fmtstr.end - start;
-        while (count && ((c = *fmtstr.ptr) != '{') && (c != '}')) {
-            fmtstr.ptr++;
-            count--;
-        }
-        fs->fieldstart = fmtstr.ptr++;
-        count = total - count;
-        total -= count;
-        doubled = (total > 1) && (*fmtstr.ptr == c);
-        if (doubled) {
-            output_data(fs, start, count+1);
-            fmtstr.ptr++;
-            continue;
-        } else if (count)
-            output_data(fs, start, count);
-        fs->fmtstr.ptr = fmtstr.ptr;
-        if (c == '}') {
-            SetError(fs, "Single } encountered");
-            ok = 0;
+    CH_TYPE c, *start, *ptr, *end;
+    Py_ssize_t count;
+    int syntaxmode, escape;
+
+    end = fs->fmtstr.end;
+    syntaxmode = fs->syntaxmode;
+
+    while (((start = ptr = fs->fmtstr.ptr) != NULL) && (ptr < end)) {
+        escape = 0;
+        while (ptr < end) {
+            switch (c = *ptr++) {
+                case '{':
+                    if ((syntaxmode == 2) &&
+                        ((ptr - start < 2) || (ptr[-2] != '$')))
+                        continue;
+                    break;
+                case '}':
+                    if (syntaxmode != 0)
+                        continue;
+                    break;
+                default:
+                    continue;
+            }
+            escape = 1;
             break;
         }
-        if (total < 2) {
-            ok = !total ||
-                   (int)SetError(fs, "Single { encountered");
-            break;
+        count = ptr - start;
+        if (ptr < end) {
+            switch (syntaxmode) {
+                case 0:
+                    if ((c == '}') && (c != *ptr)) {
+                        fs->fmtstr.ptr = ptr;
+                        return (int)SetError(fs, "Single } encountered");
+                    }
+                case 1:
+                    if (c == *ptr) {
+                        ptr++;
+                        escape = 0;
+                    }
+                    else
+                        count--;
+                    break;
+                case 2:
+                    count -= 2;
+                    escape = !count || (ptr[-3] != '$');
+                    if (!escape)
+                        ptr--;
+                    break;
+                case 3:
+                    switch (*ptr) {
+                        case ' ':
+                            ptr++;
+                        case '\n': case '\r':
+                            escape = 0;
+                            break;
+                        default:
+                            count--;
+                            break;
+                    }
+                    break;
+                default:
+                    fs->fmtstr.ptr = ptr;
+                    return (int)SetError(fs, "Unsupported syntax mode");
+            }
+        }
+        else if (escape) {
+            fs->fmtstr.ptr = ptr;
+            return (int)SetError(fs, "Unexpected escape to markup");
         }
-        myobj = get_field_and_spec(fs);
-        ok = (myobj != NULL) && render_field(fs, myobj);
-        Py_XDECREF(fs->fieldspec.obj);
-        Py_XDECREF(myobj);
-        if (!ok)
-             break;
-        fmtstr.ptr = fs->fmtstr.ptr;
+
+        fs->fmtstr.ptr = ptr;
+        if (count && !output_data(fs, start, count))
+            return 0;
+        if (escape && !get_field_and_render(fs))
+            return 0;
     }
-    return ok;
+    return 1;
 }
 
 /*
@@ -1350,6 +1403,7 @@
     fs->positional_arg_set = 0;
     fs->keyword_arg_set = NULL;
     fs->keywords_is_tuple = 0;
+    fs->syntaxmode = 0;
     fs->do_markup = do_markup;
     fs->keywords = keywords;
 
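
The control flow of the rewritten do_markup for the default syntax
(syntaxmode 0) can be hard to follow in diff form, so here is a rough
Python transcription of that one mode. The function name scan_markup and
the field handling (taking everything up to the next "}") are hypothetical
stand-ins: the C code hands fields to get_field_and_render instead, and it
reports errors through SetError rather than raising ValueError.

```python
def scan_markup(fmt):
    """Split fmt into ('literal', text) and ('field', text) pieces.

    Hypothetical sketch of syntaxmode 0 only; not the module's API.
    """
    pieces = []
    pos, end = 0, len(fmt)
    while pos < end:
        start = pos
        escape = False
        c = ""
        # Scan forward to the next brace, or to the end of the string.
        while pos < end:
            c = fmt[pos]
            pos += 1
            if c in "{}":
                escape = True
                break
        count = pos - start  # includes the brace, if one was found
        if pos < end:
            if c == "}" and fmt[pos] != c:
                raise ValueError("Single } encountered")
            if fmt[pos] == c:
                # Doubled brace: emit one brace as literal text.
                pos += 1
                escape = False
            else:
                # Single "{": drop it from the literal run.
                count -= 1
        elif escape:
            # A brace was the last character of the format string.
            raise ValueError("Unexpected escape to markup")
        if count:
            pieces.append(("literal", fmt[start:start + count]))
        if escape:
            # Stand-in for get_field_and_render: no nesting, no specifiers.
            close = fmt.find("}", pos)
            if close < 0:
                raise ValueError("Unterminated field")
            pieces.append(("field", fmt[pos:close]))
            pos = close + 1
    return pieces
```

For example, scan_markup("a{0}b") yields a literal "a", a field "0", and a
literal "b", while "{{x}}" yields only literal pieces that join to "{x}".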

