[Patches] string.translate behaviour

Peter Schneider-Kamp peter@schneider-kamp.de
Sun, 28 May 2000 11:48:11 +0200



Problem:
The string.translate() function has an unpleasant interface: it requires
a 256-character string containing a translation table.
string.maketrans() was later added as a helper function to make the job
easier, but string.translate() should really have an interface like
maketrans: (fromchars, tochars [, deletechars]).
AMK says: I later proposed a solution that would let us keep
string.translate and would behave naturally, yet seems unlikely to
break any code. string.translate would have the signature (fromchars,
tochars [, deletechars]). If len(fromchars) != 256, it must be a
new-style call; otherwise it is treated as an old-style call. The only
ambiguous case is len(fromchars) == len(tochars) == 256: is it (I) a
new-style call that's doing a complete permutation, or (II) an
old-style call with a 256-char list of characters to delete? Case II
seems unlikely to occur in practice, so we'll assume case I and resolve
the ambiguity that way.
AMK will implement this one spare weekend.
Last changed on Wed Jul 22 09:26:56 1998 by A.M. Kuchling
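
The dispatch rule described above can be sketched in Python roughly as
follows. This is only an illustration, not code from the patch: the
helper name call_style and its return values "old"/"new" are made up
here, and bytes objects stand in for 8-bit strings.

```python
def call_style(first, second=None):
    """Classify a translate() call under the proposed rule.

    `first` and `second` are the first two arguments after the string
    itself; `second` is None when it was omitted. (Hypothetical helper,
    used only to illustrate the rule.)
    """
    if second is None:
        # One argument: only an old-style 256-character table makes sense.
        if len(first) != 256:
            raise ValueError("translation table must be 256 characters long")
        return "old"
    if len(first) != 256:
        # Anything that is not 256 characters long cannot be an old-style table.
        return "new"
    if len(second) == 256:
        # Ambiguous case: resolved as a new-style complete permutation.
        return "new"
    # 256-character table plus a shorter or longer list of chars to delete.
    return "old"
```

Under this rule, call_style(table, "xyz") with a 256-character table
stays old-style, while two 256-character arguments are taken as
fromchars and tochars.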

Solution:
I still think this is a good idea, so I have implemented this
behaviour. It does not make the code cleaner, but it is certainly
a nice feature to have.

patch attached as plaintext context diff
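
For readers who prefer not to read the C, the new-style branch of the
patch works roughly like the Python sketch below: start from an
identity table, overwrite the entries named by fromchars, then
translate as before. The function names build_table and
translate_new_style are invented for this illustration, and bytes
objects are assumed in place of 8-bit strings.

```python
def build_table(fromchars, tochars):
    """Build a 256-entry translation table from two equal-length strings."""
    if len(fromchars) != len(tochars):
        raise ValueError("arguments fromchar and tochar must match in size")
    # Start with the identity mapping, so unlisted characters are unchanged.
    table = bytearray(range(256))
    # Overwrite only the entries listed in fromchars.
    for src, dst in zip(fromchars, tochars):
        table[src] = dst
    return bytes(table)


def translate_new_style(s, fromchars, tochars, deletechars=b""):
    """Translate s new-style, reusing the existing table-based translate."""
    table = build_table(fromchars, tochars)
    # Characters in deletechars are dropped; the rest go through the table.
    return s.translate(table, deletechars)
```

For example, translate_new_style(b"hello world", b"lo", b"01") maps
every "l" to "0" and every "o" to "1" and leaves all other characters
alone.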
--
I confirm that, to the best of my knowledge and belief, this
contribution is free of any claims of third parties under
copyright, patent or other rights or interests ("claims").  To
the extent that I have any such claims, I hereby grant to CNRI a
nonexclusive, irrevocable, royalty-free, worldwide license to
reproduce, distribute, perform and/or display publicly, prepare
derivative versions, and otherwise use this contribution as part
of the Python software and its related documentation, or any
derivative versions thereof, at no cost to CNRI or its licensed
users, and to authorize others to do so.

I acknowledge that CNRI may, at its sole discretion, decide
whether or not to incorporate this contribution in the Python
software and its related documentation.  I further grant CNRI
permission to use my name and other identifying information
provided to CNRI by me for use in connection with the Python
software and its related documentation.
--
Peter Schneider-Kamp          ++47-7388-7331
Herman Krags veg 51-11        mailto:peter@schneider-kamp.de
N-7050 Trondheim              http://schneider-kamp.de

diff -c --recursive python/dist/src/Objects/stringobject.c python-mod/dist/src/Objects/stringobject.c
*** python/dist/src/Objects/stringobject.c	Mon May  8 16:08:05 2000
--- python-mod/dist/src/Objects/stringobject.c	Mon May 22 10:33:26 2000
***************
*** 1293,1308 ****
  	int inlen, tablen, dellen = 0;
  	PyObject *result;
  	int trans_table[256];
! 	PyObject *tableobj, *delobj = NULL;
  
! 	if (!PyArg_ParseTuple(args, "O|O:translate",
! 			      &tableobj, &delobj))
  		return NULL;
  
  	if (PyString_Check(tableobj)) {
  		table1 = PyString_AS_STRING(tableobj);
! 		tablen = PyString_GET_SIZE(tableobj);
! 	}
  	else if (PyUnicode_Check(tableobj)) {
  		/* Unicode .translate() does not support the deletechars 
  		   parameter; instead a mapping to None will cause characters
--- 1293,1309 ----
  	int inlen, tablen, dellen = 0;
  	PyObject *result;
  	int trans_table[256];
!         char new_table[256];
! 	PyObject *tableobj, *delobj = NULL, *delchars = NULL;
  
! 	if (!PyArg_ParseTuple(args, "O|OO:translate",
! 			      &tableobj, &delobj, &delchars))
  		return NULL;
  
  	if (PyString_Check(tableobj)) {
  		table1 = PyString_AS_STRING(tableobj);
!                 tablen = PyString_GET_SIZE(tableobj);
!         }
  	else if (PyUnicode_Check(tableobj)) {
  		/* Unicode .translate() does not support the deletechars 
  		   parameter; instead a mapping to None will cause characters
***************
*** 1329,1344 ****
  		}
  		else if (PyObject_AsCharBuffer(delobj, &del_table, &dellen))
  			return NULL;
- 
- 		if (tablen != 256) {
- 			PyErr_SetString(PyExc_ValueError,
- 			  "translation table must be 256 characters long");
- 			return NULL;
- 		}
  	}
  	else {
  		del_table = NULL;
  		dellen = 0;
  	}
  
  	table = table1;
--- 1330,1374 ----
  		}
  		else if (PyObject_AsCharBuffer(delobj, &del_table, &dellen))
  			return NULL;
  	}
  	else {
  		del_table = NULL;
  		dellen = 0;
+                 if (tablen != 256) {
+                 	PyErr_SetString(PyExc_ValueError,
+                         "translation table must be a 256 char string.");
+                         return NULL;
+                 }
+ 	}
+ 
+        	if ((delobj != NULL) && ((tablen != 256) || (dellen == 256))) {
+         	if (tablen != dellen) {
+         		PyErr_SetString(PyExc_ValueError,
+                         "arguments fromchar and tochar must match in size");
+                         return NULL;
+                 }
+                 for (i = 0; i < 256; i++)
+                 	new_table[i] = (unsigned char)i;
+         	for (i = 0; i < tablen; i++)
+ 			new_table[(unsigned char)table1[i]] = del_table[i];
+                 table1 = new_table;
+ 		if (delchars != NULL) {
+ 			if (PyString_Check(delchars)) {
+         			del_table = PyString_AS_STRING(delchars);
+         			dellen = PyString_GET_SIZE(delchars);
+ 			}
+ 			else if (PyUnicode_Check(delchars)) {
+ 				PyErr_SetString(PyExc_TypeError,
+ 				"deletions are implemented differently for unicode");
+         			return NULL;
+ 			}
+                         else if (PyObject_AsCharBuffer(delchars, &del_table, &dellen))
+                                 return NULL;
+                 }
+                 else {
+                         del_table = NULL;
+                         dellen = 0;
+                 }
  	}
  
  	table = table1;
