[Patches] [ python-Patches-943953 ] Add maketrans to string object

SourceForge.net noreply at sourceforge.net
Sat Jul 17 08:04:46 CEST 2004


Patches item #943953, was opened at 2004-04-29 04:43
Message generated for change (Comment added) made by perky
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=943953&group_id=5470

Category: Core (C code)
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: Gyro Funch (siva1311)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add maketrans to string object

Initial Comment:
Added maketrans method to the string object. 
This functionality is currently only in the string module.

string module -> str object
string.maketrans(from,to) -> from.maketrans(to)

Attached is the diff for stringobject.c and string_test.py

I am not a proficient C coder, but things look okay to
me and the tests pass.
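
For readers unfamiliar with what maketrans produces, here is a rough
pure-Python model of the 256-entry table. It is a sketch only, not the
C code from the attached diff; the function names are illustrative and
it assumes every character has a code point below 256.

```python
# Hypothetical pure-Python model of the table that string.maketrans
# builds; this is NOT the code from the attached patch.
def maketrans(frm, to):
    """Return a 256-character table mapping frm[i] to to[i]."""
    if len(frm) != len(to):
        raise ValueError("maketrans arguments must have same length")
    table = [chr(i) for i in range(256)]  # identity mapping to start
    for src, dst in zip(frm, to):
        table[ord(src)] = dst             # assumes ord(src) < 256
    return "".join(table)


def translate(s, table):
    """Apply a 256-character table to s, one character at a time."""
    return "".join(table[ord(ch)] for ch in s)


table = maketrans("abc", "xyz")
print(translate("aabbcc cab", table))  # xxyyzz zxy
```

Under the proposal, the first call would presumably be spelled
"abc".maketrans("xyz") instead.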

----------------------------------------------------------------------

>Comment By: Hye-Shik Chang (perky)
Date: 2004-07-17 15:04

Message:
Logged In: YES 
user_id=55188

Unicode objects now have a .decode() method.
It seems the originator may be happy to write his own codec package 
now. :)

----------------------------------------------------------------------

Comment By: Denis S. Otkidach (ods)
Date: 2004-05-15 00:07

Message:
Logged In: YES 
user_id=63454

> About your note about the Unicode .decode() method:
> I completely agree. The codec was never designed to
> be Unicode vs. the rest of the world. It was designed
> as general purpose encoding and decoding system.

If so, we need to relax the mentioned restriction and allow a 
Codec instance as the argument of the encode/decode methods. 
Without this, codecs will never become a general-purpose 
encoding and decoding system.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2004-05-14 23:45

Message:
Logged In: YES 
user_id=38388

>>It is easy enough to write a charset based
>> codec that fulfills the same need.
>
> This approach won't work for both str and unicode due to 
> over-restricted implementation of codecs: 
> http://groups.google.com/groups?th=a68a7b5a2e1f294 . 
> Moreover, AFAIK this requires registering the encoding 
> (making it global), but this is often a bad idea.

Indeed. I've always argued for putting codecs into
packages for this reason.

About your note about the Unicode .decode() method:
I completely agree. The codec was never designed to
be Unicode vs. the rest of the world. It was designed
as a general-purpose encoding and decoding system.
However, a few python-dev'ers seem to have misunderstood
this intention and still believe that codecs are only about
Unicode.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2004-05-14 23:40

Message:
Logged In: YES 
user_id=38388

Probably... patches are welcome :-)

Writing these codecs is really easy. Just have a look
at e.g. rot13.py ... you basically copy the template
to a new module say rot14, edit the mapping dictionary,
save it and then use the module name in the string .encode()
method:

'abc'.encode('rot14')

The codec registry will first look in the encodings package
for the codec and then continue the search on the 
PYTHONPATH, so you may have to provide the complete
package name if you place the codec into a package, e.g.

'abc'.encode('my.new.app.rot14')

That's it.

----------------------------------------------------------------------

Comment By: Denis S. Otkidach (ods)
Date: 2004-05-14 23:40

Message:
Logged In: YES 
user_id=63454

> I'm -1 on adding such a method or trying to tweak .translate():
> We should not make use of .translate() more wide-spread
> than it already is.

Then why isn't it deprecated yet? As it stands, translate has 
several disadvantages: it is difficult to use for str (it needs 
maketrans), and its interface differs between str and unicode. I 
suggested adding a unified interface that avoids maketrans. 
I'm not the first who doesn't like maketrans: 
http://mail.python.org/pipermail/patches/2000-May/000781.html

> It is easy enough to write a charset based
> codec that fulfills the same need.

This approach won't work for both str and unicode due to 
over-restricted implementation of codecs: 
http://groups.google.com/groups?th=a68a7b5a2e1f294 . 
Moreover, AFAIK this requires registering the encoding 
(making it global), but this is often a bad idea.

----------------------------------------------------------------------

Comment By: Gyro Funch (siva1311)
Date: 2004-05-14 23:30

Message:
Logged In: YES 
user_id=679947

Okay. I wasn't even aware of the encodings package (blush).

Should the docs be updated to reflect the fact that users
should consider using codecs instead of translate/maketrans?
Perhaps an example of how maketrans/translate is subsumed by
codecs would be helpful.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2004-05-14 23:13

Message:
Logged In: YES 
user_id=38388

I'm -1 on adding such a method or trying to tweak .translate():
We should not make use of .translate() more wide-spread
than it already is. 

It is easy enough to write a charset based
codec that fulfills the same need. See the codecs in the
encodings package on how to use this codec for purposes very
similar to those of .translate(). 

The advantage of this approach is not only to make
the resulting translation easily available to the whole
application; it also works for both Unicode and plain strings
by virtue of the charset codec.

----------------------------------------------------------------------

Comment By: Denis S. Otkidach (ods)
Date: 2004-05-14 23:05

Message:
Logged In: YES 
user_id=63454

No, my suggestion doesn't interfere with the current interface, 
so it can be added without breaking compatibility. 
Something like the following.
For str:
if isinstance(table, str): ...old behavior...
else: t_from, t_to = table; ...
For unicode:
if isinstance(table, dict): ...old behavior...
else: t_from, t_to = table; ...
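
The dispatch above might look roughly like this as a free function.
This is a sketch under assumptions, not the suggested patch: it targets
Python 3, where str.translate() already takes a mapping, so only the
dict branch stands in for the "old behavior"; the function name and the
delete parameter are illustrative.

```python
def translate(s, table, delete=""):
    """Hypothetical unified translate accepting a dict or a (from, to) pair."""
    if isinstance(table, dict):
        # Old behavior: a mapping of code points, copied so we can add to it.
        mapping = dict(table)
    else:
        # New behavior: unpack the pair and build the mapping ourselves,
        # so the caller never has to touch maketrans.
        t_from, t_to = table
        if len(t_from) != len(t_to):
            raise ValueError("from and to must have the same length")
        mapping = {ord(f): ord(t) for f, t in zip(t_from, t_to)}
    for ch in delete:
        mapping[ord(ch)] = None  # None means "drop this character"
    return s.translate(mapping)


print(translate("hello world", ("lo", "01")))              # he001 w1r0d
print(translate("hello world", {ord("h"): ord("j")}))      # jello world
print(translate("hello world", ("lo", "01"), delete=" "))  # he001w1r0d
```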

----------------------------------------------------------------------

Comment By: Gyro Funch (siva1311)
Date: 2004-05-14 22:51

Message:
Logged In: YES 
user_id=679947

Although I think your suggestion would have merit if this
method were new, in the current situation this change would
break all code currently using the 'translate' method. I
can't imagine that this would be acceptable.

My suggested change would be backward compatible and would
be consistent with bringing methods out of the string module
and into str object methods. Since 'translate' has already
been made a str method, why not make 'maketrans' a str
method too?

----------------------------------------------------------------------

Comment By: Denis S. Otkidach (ods)
Date: 2004-05-14 21:59

Message:
Logged In: YES 
user_id=63454

I think maketrans is a low-level function and should be 
hidden from the user. The traditional tr interface is much more 
intuitive, and it can be the same for both str and unicode: 
s.translate((from, to) [, delete]).

----------------------------------------------------------------------
