Regular Expression
Laura Creighton
lac at openend.se
Thu Jun 4 10:21:15 EDT 2015
In a message of Thu, 04 Jun 2015 06:36:29 -0700, Palpandi writes:
>Hi All,
>
>This is the case. To split "string2" from "string1_string2" I am using
>re.split('_', "string1_string2", 1)
And you shouldn't be. The 3rd argument, 1, is maxsplit: it says stop after one split.
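Here is a small sketch of what that argument does, in case it helps. The variable names are made up for illustration. With leading underscores, the one split you allowed is used up at the very first underscore.

```python
import re

# maxsplit=1 stops after the first split.
one_split_plain = re.split('_', "string1_string2", maxsplit=1)

# With leading underscores, the first underscore consumes the only split.
one_split_leading = re.split('_', "__string1_string2", maxsplit=1)

print(one_split_plain)    # ['string1', 'string2']
print(one_split_leading)  # ['', '_string1_string2']
```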
>It is working fine for the string "string1_string2" and outputs "string2". But the problem is that if the string is "__string1_string2", the output is "_string1_string2". That is wrong.
>
>How to fix this issue?
Depends on what you want.
Approach #1 - just use the string method, forget re, because you do not
need it.
>>> "__string1_string2".split("_")
['', '', 'string1', 'string2']
>>> "_string1_string2__".split("_")
['', 'string1', 'string2', '', '']
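If the empty strings are not wanted, or if all you are after is the last piece, something like the following might do. This is only a sketch: I am guessing at what you want, and the names text, parts and last are just for illustration.

```python
text = "__string1_string2"

# Drop the empty strings produced by leading, trailing or repeated underscores.
parts = [piece for piece in text.split("_") if piece]

# If only the text after the last underscore is wanted, split from the right.
last = text.rsplit("_", 1)[-1]

print(parts)  # ['string1', 'string2']
print(last)   # string2
```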
Approach #2 -- use re but with a fixed string (probably a bad idea,
you should be using approach 1 instead if you have a fixed string)
>>> re.split('_', "__string1_string2")
['', '', 'string1', 'string2']
>>> re.split('_', "__string1_string2__")
['', '', 'string1', 'string2', '', '']
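One reason re is a bad fit for a fixed string: some separators mean something special in a pattern. The sketch below uses '.' as a hypothetical separator (it is not from your example) to show the difference. If you must use re with a fixed string, re.escape makes it literal.

```python
import re

text = "a.b"

naive = re.split('.', text)               # '.' matches any character here
escaped = re.split(re.escape('.'), text)  # treat '.' as a literal dot
plain = text.split('.')                   # no regex involved at all

print(naive)    # ['', '', '', '']
print(escaped)  # ['a', 'b']
print(plain)    # ['a', 'b']
```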
Approach #3 - there is a real pattern here I want to use, the example
I posted to the list is a lot simpler than what I really want to do.
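Ok, in this case we will match 'one or more underscores' for an
example. (Use '_+' rather than '_*': '_*' also matches the empty
string, and since Python 3.7 re.split splits on empty matches too.)
>>> p = re.compile('_+')
>>> p.split("__string1_string2")
['', 'string1', 'string2']
>>> p.split("__string1__string2__")
['', 'string1', 'string2', '']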
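If the empty strings at the ends get in the way, one option is to strip the underscores off both ends before splitting. A minimal sketch, assuming that is what you want. It uses '_+' (one or more underscores), which cannot match the empty string, and fields is a helper name I made up here.

```python
import re

# One or more underscores; unlike '_*' this never matches the empty string.
p = re.compile('_+')

def fields(text):
    """Hypothetical helper: split on runs of underscores, ignoring the ends."""
    # Strip underscores from both ends first so no empty strings appear there.
    return p.split(text.strip('_'))

print(fields("__string1_string2"))     # ['string1', 'string2']
print(fields("__string1__string2__"))  # ['string1', 'string2']
```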
Laura