Mailman 3 Re: string.find() again (was Re: timsort for jython) - Python-Dev

Re: string.find() again (was Re: timsort for jython)

Inyeol Lee

9 Aug 2002

6:08 a.m.

Guido> I think we've argued about '' in 'abc' long enough. Tim has failed to
Guido> convince me, so '' in 'abc' returns True. Barry has checked it all
Guido> in.

I'm Inyeol Lee, happy python user. I checked other string methods and re functions.

1. Most of them assume an empty string between normal characters and at the start/end of the string:

'abc'.count('') -> 4
'abc'.endswith('') -> 1
'abc'.find('') -> 0
'abc'.index('') -> 0
'abc'.rfind('') -> 3
'abc'.rindex('') -> 3
'abc'.startswith('') -> 1
re.search('', 'abc').span() -> (0, 0)
re.match('', 'abc').span() -> (0, 0)
re.findall('', 'abc') -> ['', '', '', '']
re.sub('', '_', 'abc') -> '_a_b_c_'
re.subn('', '_', 'abc') -> ('_a_b_c_', 4)

2. Some of them raise an exception:

'' in 'abc'
'abc'.replace('', '_')
'abc'.split('')

3. One of them ignores the empty match:

re.split('', 'abc') -> ['abc']

(I couldn't test re.finditer, but it seems to behave the same as re.findall.)

Since '' in 'abc' now returns True, how about changing 'abc'.replace('', '_') to generate '_a_b_c_', too? It is consistent with re.sub()/subn(), and the cost of the change is similar to the '' in 'abc' case.

Inyeol Lee
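The survey above can be re-run on a current interpreter. The following is only a sketch, assuming a recent CPython 3; the dictionary name `empty_matches` and the variable `split_error` are illustrative and not from the thread. Results on the 2002 interpreter differed in places (for example, `startswith` and `endswith` returned 1 rather than True, and `'' in 'abc'` raised before the change discussed here).

```python
import re

s = "abc"

# Calls that treat the empty string as matching before each
# character and once more at the end of the string.
empty_matches = {
    "count": s.count(""),
    "find": s.find(""),
    "rfind": s.rfind(""),
    "startswith": s.startswith(""),
    "endswith": s.endswith(""),
    "contains": "" in s,
    "re.sub": re.sub("", "_", s),
    "re.findall": re.findall("", s),
}

for name, value in empty_matches.items():
    print(f"{name:12} -> {value!r}")

# str.split still rejects an empty separator on current interpreters.
split_error = None
try:
    s.split("")
except ValueError as exc:
    split_error = str(exc)
print("split        -> ValueError:", split_error)
```

The four matches reported by `count` correspond to the four positions in `'abc'` where an empty string fits: before `a`, before `b`, before `c`, and at the end.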


Guido van Rossum

9 Aug

2:13 p.m.

New subject: string.find() again (was Re: timsort for jython)

> Since '' in 'abc' now returns True, how about changing 'abc'.replace('', '_') to generate '_a_b_c_', too? It is consistent with re.sub()/subn(), and the cost of the change is similar to the '' in 'abc' case.

Do you have a use case? Or are you just striving for consistency? It would be more consistent but I'm not sure what the point is. I can think of situations where '' in 'abc' would be needed, but not so for 'abc'.replace('', '_').

--Guido van Rossum (home page: http://www.python.org/~guido/)

Andrew Koenig

2:52 p.m.


Guido> Do you have a use case? Or are you just striving for consistency? It
Guido> would be more consistent but I'm not sure what the point is. I can
Guido> think of situations where '' in 'abc' would be needed, but not so for
Guido> 'abc'.replace('', '_').

It's the first way that comes to mind of s p r e a d i n g o u t the characters in a string for use in, say, the title of a report.

--
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark

Duncan Booth

3:26 p.m.


On 09 Aug 2002, Andrew Koenig <ark@research.att.com> wrote:

>> Do you have a use case? Or are you just striving for consistency? It would be more consistent but I'm not sure what the point is. I can think of situations where '' in 'abc' would be needed, but not so for 'abc'.replace('', '_').
>
> It's the first way that comes to mind of s p r e a d i n g o u t the characters in a string for use in, say, the title of a report.

The first way that comes to my mind is:

>>> ' '.join("spreading out")
's p r e a d i n g   o u t'

--
Duncan Booth duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?

Guido van Rossum

3:39 p.m.


If someone really wants 'abc'.replace('', '-') to return '-a-b-c-', please submit patches for both 8-bit and Unicode strings to SourceForge and assign to me. I looked into this and it's non-trivial: the implementation used for 8-bit strings goes into an infinite loop when the pattern is empty, and the Unicode implementation tacks '----' onto the end. Please supply doc and unittest patches too. At least re does the right thing already:

>>> import re
>>> re.sub('', '-', 'abc')
'-a-b-c-'

--Guido van Rossum (home page: http://www.python.org/~guido/)
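To illustrate why an empty pattern needs special handling, here is a hedged pure-Python sketch. It is not the CPython implementation for either 8-bit or Unicode strings; the function name `replace_all` is hypothetical, and the stalled-search explanation in the comments is one plausible way a find-based loop can fail to terminate, not a description of the actual 2002 bug.

```python
import re


def replace_all(text: str, old: str, new: str) -> str:
    """Hypothetical find-based replace with an explicit empty-pattern case."""
    if old == "":
        # An empty pattern matches before every character and at the end,
        # so interleave `new` directly instead of searching for it.
        return new + "".join(ch + new for ch in text)

    pieces = []
    start = 0
    while True:
        hit = text.find(old, start)
        if hit == -1:
            break
        pieces.append(text[start:hit])
        pieces.append(new)
        # Advancing by len(old) is what lets the loop terminate. With an
        # empty `old` this step would be zero and find() would keep
        # returning the same position, hence the guard above.
        start = hit + len(old)
    pieces.append(text[start:])
    return "".join(pieces)


print(replace_all("abc", "", "-"))
print(replace_all("a,b,c", ",", "; "))
```

With the guard in place, the empty-pattern result agrees with what `re.sub('', '-', 'abc')` produces.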

Inyeol Lee

8:51 p.m.


To underline strings for viewers like less.

underlined = normal.replace('', '_\b')

This can also be done with re.sub(), but I think it is natural to use string methods to handle non-RE strings. This cannot be done with '_\b'.join(), since it doesn't prepend '_\b'.

- Inyeol Lee

On Fri, Aug 09, 2002 at 10:13:12AM -0400, Guido van Rossum wrote:

>> Since '' in 'abc' now returns True, how about changing 'abc'.replace('', '_') to generate '_a_b_c_', too? It is consistent with re.sub()/subn(), and the cost of the change is similar to the '' in 'abc' case.
>
> Do you have a use case? Or are you just striving for consistency? It would be more consistent but I'm not sure what the point is. I can think of situations where '' in 'abc' would be needed, but not so for 'abc'.replace('', '_').
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)

Ka-Ping Yee

9:31 p.m.


On Fri, 9 Aug 2002, Inyeol Lee wrote:

> To underline strings for viewers like less.
>
> underlined = normal.replace('', '_\b')

That doesn't quite work, since it puts an extra underbar at the end. But it can be done fairly easily without using replace():

underlined = ''.join(['_\b' + c for c in normal])

-- ?!ng

Andrew Koenig

9:47 p.m.


Ping> On Fri, 9 Aug 2002, Inyeol Lee wrote:

>> To underline strings for viewers like less.
>>
>> underlined = normal.replace('', '_\b')

Ping> That doesn't quite work, since it puts an extra underbar at the end.
Ping> But it can be done fairly easily without using replace():
Ping>
Ping> underlined = ''.join(['_\b' + c for c in normal])

With a sufficiently rich family of functions, you can avoid any one of them if you want to do so badly enough. Even so, that doesn't make proposed uses of that function illegitimate.

--
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark

Inyeol Lee

10:20 p.m.


On Fri, Aug 09, 2002 at 02:31:30PM -0700, Ka-Ping Yee wrote:

> On Fri, 9 Aug 2002, Inyeol Lee wrote:
>
>> To underline strings for viewers like less.
>>
>> underlined = normal.replace('', '_\b')
>
> That doesn't quite work, since it puts an extra underbar at the end.

underlined = normal.replace('', '_\b', len(normal))

Hmm... my position is getting weaker... When I first posted this, I just thought about consistency, not about use cases. These underline samples were created in a hurry :-)

-- Inyeol Lee
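The underline variants discussed in this thread can be compared side by side. This is a sketch that assumes a recent CPython 3, where str.replace() accepts an empty pattern; at the time of the thread that call raised an exception, so only the join() form would have run. The variable names `joined`, `limited`, and `unlimited` are illustrative.

```python
normal = "abc"

# Ka-Ping Yee's form: prefix every character with underscore + backspace.
joined = "".join(["_\b" + c for c in normal])

# Inyeol Lee's follow-up: cap the number of replacements at len(normal),
# so nothing is inserted after the last character.
limited = normal.replace("", "_\b", len(normal))

# The uncapped call also inserts at the end of the string, which is the
# "extra underbar" Ka-Ping Yee pointed out.
unlimited = normal.replace("", "_\b")

print(repr(joined))
print(repr(limited))
print(repr(unlimited))
```

The capped replace() and the join() form produce the same string; the uncapped call differs only by the trailing `'_\b'`.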
