[Tutor] how to match regular expression from right to left

Kent Johnson kent37 at tds.net
Sun Sep 16 16:50:44 CEST 2007


王超 wrote:
> The number of items - (us::.*?) - varies.
> 
> When I use re.findall with (us::*?), only several 'us::' are extracted.

I don't understand what is going wrong now. Please show the code, the
data, and tell us what you get and what you want to get.

Here is an example:

Without a group you get the whole match:

In [3]: import re
In [4]: line = """38166 us::Video_Cat::Other; us::Video_Cat::Today Show;
us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf;
1003::ms://bc.wd.net/a275/video/tdy_is_.fl;"""
In [5]: re.findall('us::.*?;', line)
Out[5]: ['us::Video_Cat::Other;', 'us::Video_Cat::Today Show;',
'us::VC_Supplier::bc;']

With a group you get just the group:

In [6]: re.findall('(us::.*?);', line)
Out[6]: ['us::Video_Cat::Other', 'us::Video_Cat::Today Show',
'us::VC_Supplier::bc']
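
For anyone who wants to run this outside IPython, here is the same thing as a small script. It is only a sketch: the variable names are made up for illustration, and the final join assumes that what is wanted is a single string holding just the us:: items, which is a guess from the earlier message.

```python
import re

line = """38166 us::Video_Cat::Other; us::Video_Cat::Today Show;
us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf;
1003::ms://bc.wd.net/a275/video/tdy_is_.fl;"""

# No group in the pattern: findall returns each whole match,
# including the trailing ';'.
whole = re.findall(r'us::.*?;', line)

# One group in the pattern: findall returns only the text
# captured by the group, so the ';' is dropped.
items = re.findall(r'(us::.*?);', line)

# Assumption: the goal is one string with just the us:: items.
wanted = ' '.join(whole)

print(whole)
print(items)
print(wanted)
```

This works however many us:: items the line holds, because findall keeps scanning for further matches until it reaches the end of the string.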

Kent

> 
> Daniel
> 
> On 9/16/07, Kent Johnson <kent37 at tds.net> wrote:
> 
>     王超 wrote:
>      > yes, but I mean if I have the line like this:
>      >
>      > line = """38166 us::Video_Cat::Other; us::Video_Cat::Today Show;
>      > us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf;
>      > 1003::ms://bc.wd.net/a275/video/tdy_is_.fl;"""
>      >
>      > I want to get the part "us::MSNVideo_Cat::Other;
>      > us::MSNVideo_Cat::Today Show; us::VC_Supplier::Msnbc;"
>      >
>      > but re.compile(r"(us::.*) .*(1002|1003).*$") will get the
>      > "1002::ms://bc.wd.net/a275/video/tdy_is.asf;" included in a lazy
>      > mode.
> 
>     Of course, you have asked for all the text up to the end of the string.
> 
>     Not sure what you mean by lazy mode...
> 
>     If there will always be three items you could just repeat the relevant
>     sections of the re, something like
> 
>     r'(us::.*?); (us::.*?); (us::.*?);'
> 
>     or even
> 
>     r'(us::Video_Cat::.*?); (us::Video_Cat::.*?); (us::VC_Supplier::.*?);'
> 
>     If the number of items varies then use re.findall() with (us::.*?);
> 
>     The non-greedy match is not strictly needed in the first case but it is
>     in the second.
> 
>     Kent
> 
> 
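
The suggestions quoted above can be tried with a short script. This is a sketch under one assumption: the data is on a single line with items separated by '; ', which is how the fixed-count pattern expects it (the sample as pasted is wrapped over three lines, and '.' does not match a newline by default). The variable names are only for illustration.

```python
import re

# Assumption: the record sits on one line, items separated by '; '.
line = ("38166 us::Video_Cat::Other; us::Video_Cat::Today Show; "
        "us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf; "
        "1003::ms://bc.wd.net/a275/video/tdy_is_.fl;")

# Fixed number of items: repeat the group once per item.
m = re.search(r'(us::.*?); (us::.*?); (us::.*?);', line)
fixed = m.groups()

# Varying number of items: findall with the non-greedy group.
lazy = re.findall(r'(us::.*?);', line)

# Same findall with a greedy '.*': it runs on to the last ';' in the
# line, so everything comes back as a single oversized match.
greedy = re.findall(r'(us::.*);', line)

print(fixed)
print(lazy)
print(len(greedy))
```

With findall the non-greedy '.*?' is what makes each match stop at the first ';' that follows it.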


