Re: string.find() again (was Re: timsort for jython)
Guido> I think we've argued about '' in 'abc' long enough. Tim has failed to
Guido> convince me, so '' in 'abc' returns True. Barry has checked it all
Guido> in.

I'm Inyeol Lee, happy python user. I checked other string methods and re functions.

1. most of them assume null character between normal characters and at the start/end of string;

   'abc'.count('') -> 4
   'abc'.endswith('') -> 1
   'abc'.find('') -> 0
   'abc'.index('') -> 0
   'abc'.rfind('') -> 3
   'abc'.rindex('') -> 3
   'abc'.startswith('') -> 1
   re.search('', 'abc').span() -> (0, 0)
   re.match('', 'abc').span() -> (0, 0)
   re.findall('', 'abc') -> ['', '', '', '']
   re.sub('', '_', 'abc') -> '_a_b_c_'
   re.subn('', '_', 'abc') -> ('_a_b_c_', 4)

2. some of them generate exception;

   '' in 'abc'
   'abc'.replace('', '_')
   'abc'.split('')

3. one of them ignores empty match;

   re.split('', 'abc') -> ['abc']

(couldn't test re.finditer but it seems to be the same as re.findall.)

Since '' in 'abc' now returns True, How about changing 'abc'.replace('') to generate '_a_b_c_', too? It is consistent with re.sub()/subn() and the cost for change is similar to '' in 'abc' case.

Inyeol Lee
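For anyone re-running the survey above today, here is a minimal sketch. It assumes a modern Python 3 interpreter rather than the 2002-era one used in the thread, so booleans print as True instead of 1 and some of the "exception" cases no longer raise. The names `s` and `results` are illustrative only.

```python
import re

s = 'abc'

# Group 1: the empty string matches at every gap, including both ends.
results = {
    "count": s.count(''),             # one match per gap: |a|b|c|
    "find": s.find(''),
    "rfind": s.rfind(''),
    "startswith": s.startswith(''),
    "endswith": s.endswith(''),
    "re.search": re.search('', s).span(),
    "re.findall": re.findall('', s),
    "re.sub": re.sub('', '_', s),
    "re.subn": re.subn('', '_', s),
}

for name, value in results.items():
    print(f"{name:12} -> {value!r}")

# Group 2: an empty separator is still rejected by str.split().
try:
    s.split('')
except ValueError as exc:
    print("split('') raised:", exc)
```

The dictionary only makes the results easy to print and compare; nothing in it depends on the order of the calls.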
> Since '' in 'abc' now returns True, How about changing 'abc'.replace('') to generate '_a_b_c_', too? It is consistent with re.sub()/subn() and the cost for change is similar to '' in 'abc' case.
Do you have a use case? Or are you just striving for consistency? It would be more consistent but I'm not sure what the point is. I can think of situations where '' in 'abc' would be needed, but not so for 'abc'.replace('', '_').

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido> Do you have a use case? Or are you just striving for consistency? It
Guido> would be more consistent but I'm not sure what the point is. I can
Guido> think of situations where '' in 'abc' would be needed, but not so for
Guido> 'abc'.replace('', '_').

It's the first way that comes to mind of s p r e a d i n g o u t the characters in a string for use in, say, the title of a report.

--
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark
On 09 Aug 2002, Andrew Koenig <ark@research.att.com> wrote:
>> Do you have a use case? Or are you just striving for consistency? It would be more consistent but I'm not sure what the point is. I can think of situations where '' in 'abc' would be needed, but not so for 'abc'.replace('', '_').
>
> It's the first way that comes to mind of s p r e a d i n g o u t the characters in a string for use in, say, the title of a report.
The first way that comes to my mind is:
>>> ' '.join("spreading out")
's p r e a d i n g   o u t'
--
Duncan Booth duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
If someone really wants 'abc'.replace('', '-') to return '-a-b-c-', please submit patches for both 8-bit and Unicode strings to SourceForge and assign to me. I looked into this and it's non-trivial: the implementation used for 8-bit strings goes into an infinite loop when the pattern is empty, and the Unicode implementation tacks '----' onto the end. Please supply doc and unittest patches too. At least re does the right thing already:
>>> import re
>>> re.sub('', '-', 'abc')
'-a-b-c-'
--Guido van Rossum (home page: http://www.python.org/~guido/)
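As a rough illustration of the behaviour being requested, here is a pure-Python sketch. It is not the patch Guido asks for and not CPython's implementation; `replace_empty` is a hypothetical helper name, and the `count` handling is an assumption modelled on how str.replace() treats its optional third argument.

```python
import re


def replace_empty(s, new, count=-1):
    """Insert `new` at the gaps of `s`, as if '' matched at each gap.

    Hypothetical helper for illustration only; not CPython's code.
    A negative `count` means "every gap", as in str.replace().
    """
    gaps = len(s) + 1               # before each character, plus the end
    if count < 0 or count > gaps:
        count = gaps

    pieces = []
    for i in range(count):          # bounded loop, so it always terminates
        pieces.append(new)
        pieces.append(s[i:i + 1])   # '' once i reaches len(s)
    pieces.append(s[count:])        # untouched tail when count < gaps
    return ''.join(pieces)


print(replace_empty('abc', '-'))
print(replace_empty('abc', '-', 2))
print(replace_empty('abc', '-') == re.sub('', '-', 'abc'))
```

Iterating over a precomputed number of gaps, rather than searching repeatedly for the next match, is one way to sidestep the infinite loop described above: an empty pattern never advances a search position on its own.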
To underline strings for viewers like less.
underlined = normal.replace('', '_\b')
This can also be done with re.sub(), but I think it is natural to use string methods to handle non-RE strings. This cannot be done with '_\b'.join(), since it doesn't prepend '_\b'.

- Inyeol Lee

On Fri, Aug 09, 2002 at 10:13:12AM -0400, Guido van Rossum wrote:
>> Since '' in 'abc' now returns True, How about changing 'abc'.replace('') to generate '_a_b_c_', too? It is consistent with re.sub()/subn() and the cost for change is similar to '' in 'abc' case.
>
> Do you have a use case? Or are you just striving for consistency? It would be more consistent but I'm not sure what the point is. I can think of situations where '' in 'abc' would be needed, but not so for 'abc'.replace('', '_').
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
On Fri, 9 Aug 2002, Inyeol Lee wrote:
> To underline strings for viewers like less.
>
> underlined = normal.replace('', '_\b')
That doesn't quite work, since it puts an extra underbar at the end. But it can be done fairly easily without using replace():

underlined = ''.join(['_\b' + c for c in normal])

-- ?!ng
Ping> On Fri, 9 Aug 2002, Inyeol Lee wrote:
Ping> > To underline strings for viewers like less.
Ping> >
Ping> > underlined = normal.replace('', '_\b')
Ping> That doesn't quite work, since it puts an extra underbar at the end.
Ping> But it can be done fairly easily without using replace():

Ping> underlined = ''.join(['_\b' + c for c in normal])

With a sufficiently rich family of functions, you can avoid any one of them if you want to do so badly enough. Even so, that doesn't make proposed uses of that function illegitimate.

--
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark
On Fri, Aug 09, 2002 at 02:31:30PM -0700, Ka-Ping Yee wrote:
> On Fri, 9 Aug 2002, Inyeol Lee wrote:
>> To underline strings for viewers like less.
>>
>> underlined = normal.replace('', '_\b')
>
> That doesn't quite work, since it puts an extra underbar at the end.
underlined = normal.replace('', '_\b', len(normal))

Hmm... my position is getting weaker... When I first posted this, I just thought about consistency, not about use cases. These underline samples were created in a hurry :-)

-- Inyeol Lee
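The three underlining variants discussed in this thread can be compared side by side. This is only a sketch: it assumes a modern Python 3 interpreter, where str.replace() accepts an empty pattern, and the variable names `with_replace`, `with_join` and `with_count` are illustrative.

```python
normal = 'abc'

# Plain replace(): also inserts '_\b' after the last character.
with_replace = normal.replace('', '_\b')

# join() over a list comprehension: one '_\b' in front of each character.
with_join = ''.join(['_\b' + c for c in normal])

# replace() limited to len(normal) insertions: skips the final gap.
with_count = normal.replace('', '_\b', len(normal))

print(repr(with_replace))
print(repr(with_join))
print(repr(with_count))
print(with_join == with_count)
```

The count-limited replace() and the join() form should produce the same string, which is why either can stand in for the other here.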
participants (5)
- Andrew Koenig
- Duncan Booth
- Guido van Rossum
- Inyeol Lee
- Ka-Ping Yee