What is the best way to delete strings in a string list that match certain pattern?

Dave Angel davea at ieee.org
Fri Nov 6 18:57:28 EST 2009



Peng Yu wrote:
> On Fri, Nov 6, 2009 at 10:42 AM, Robert P. J. Day <rpjday at crashcourse.ca> wrote:
>   
>> On Fri, 6 Nov 2009, Peng Yu wrote:
>>
>>     
>>> On Fri, Nov 6, 2009 at 3:05 AM, Diez B. Roggisch <deets at nospam.web.de> wrote:
>>>       
>>>> Peng Yu schrieb:
>>>>         
>>>>> Suppose I have a list of strings, A. I want to compute the list (call
>>>>> it B) of strings that are elements of A but doesn't match a regex. I
>>>>> could use a for loop to do so. In a functional language, there is way
>>>>> to do so without using the for loop.
>>>>>           
>>>> Nonsense. For processing over each element, you have to loop over them,
>>>> either with or without growing a call-stack at the same time.
>>>>
>>>> FP languages can optimize away the stack-frame-growth (tail recursion) - but
>>>> this isn't reducing complexity in any way.
>>>>
>>>> So use a loop, either directly, or using a list-comprehension.
>>>>         
>>> What is a list-comprehension?
>>>
>>> I tried the following code. The list 'l' will be ['a','b','c'] rather
>>> than ['b','c'], which is what I want. It seems 'remove' will disrupt
>>> the iterator, right? I am wondering how to make the code correct.
>>>
>>> l = ['a', 'a', 'b', 'c']
>>> for x in l:
>>>   if x == 'a':
>>>     l.remove(x)
>>>
>>> print l
>>>       
>>  list comprehension seems to be what you want:
>>
>>  l = [i for i in l if i != 'a']
>>     
>
> My problem comes from the context of using os.walk(). Please see the
> description of the following webpage. Somehow I have to modify the
> list inplace. I have already tried 'dirs = [i for i in l if dirs != 'a']'. But it seems that it doesn't "prune the search". So I need the
> inplace modification of list.
>
> http://docs.python.org/library/os.html
>
> When topdown is True, the caller can modify the dirnames list in-place
> (perhaps using del or slice assignment), and walk() will only recurse
> into the subdirectories whose names remain in dirnames; this can be
> used to prune the search, impose a specific order of visiting, or even
> to inform walk() about directories the caller creates or renames
> before it resumes walk() again. Modifying dirnames when topdown is
> False is ineffective, because in bottom-up mode the directories in
> dirnames are generated before dirpath itself is generated.
>
>   
The context is quite important in this case.  The os.walk() iterator 
gives you a tuple of three values, and one of them is the list of 
subdirectory names.  You do indeed want to modify that list object 
itself, but you usually don't want to do it one item at a time.  I'll 
show you the item-by-item version first, then show you the slice approach.
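
Why the list object itself matters: os.walk() keeps using the same list 
it handed you, so assigning a new list to the name 'dirs' changes nothing 
that walk() can see.  Here is a minimal sketch of the difference, in 
Python 3 syntax; the function names are made up for illustration and are 
not part of os.walk():

```python
def rebind(names):
    # Assignment to the bare name creates a new list; the caller's list
    # object is left exactly as it was.
    names = [n for n in names if n != 'a']
    return names


def replace_contents(names):
    # Slice assignment keeps the same list object and swaps its contents.
    names[:] = [n for n in names if n != 'a']
    return names


shared = ['a', 'b', 'c']
rebind(shared)
after_rebind = list(shared)      # snapshot: still contains 'a'
replace_contents(shared)
after_slice = list(shared)       # snapshot: 'a' is gone
print(after_rebind, after_slice)
```

That is presumably why the plain 'dirs = [...]' attempt did not prune the 
search.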

If all you wanted to do was to remove one or two specific items from the 
list, then the remove method would be good.  So in your example, you 
don't need a loop.  Just say:
    if 'a' in dirs:
        dirs.remove('a')
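
For what it's worth, here is a small sketch of why the loop in your 
example leaves one 'a' behind, next to the membership-test version.  It 
is written in Python 3 syntax, and both helper names are hypothetical:

```python
def remove_while_iterating(items, target):
    """Buggy pattern from the quoted question: mutate the list being iterated."""
    items = list(items)          # work on a copy so the caller's list is untouched
    for x in items:
        if x == target:
            items.remove(x)      # later elements shift left, so the loop skips one
    return items


def remove_one(items, target):
    """Remove a single occurrence of target, if present, without a loop."""
    items = list(items)
    if target in items:
        items.remove(target)
    return items


print(remove_while_iterating(['a', 'a', 'b', 'c'], 'a'))  # ['a', 'b', 'c']
print(remove_one(['a', 'b', 'c'], 'a'))                   # ['b', 'c']
```

Note that remove() only takes out the first matching item, so this is 
only suitable when you know each name appears at most once, which is the 
case for directory names in a single folder.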

But if you have an expression you want to match each dir against, the 
list comprehension is the best answer.  And the trick to stuffing that 
new list into the original list object is to use slicing on the left 
side.  The [:] notation is a default slice that means the whole list.

    dirs[:] = [item for item in dirs if bool_expression_on_item]
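
Putting that together with os.walk() and a regex, a sketch might look 
like the following.  It assumes Python 3 and the default topdown=True; 
the function name walk_pruned, the pattern, and the throwaway directory 
names are all invented for the example:

```python
import os
import re
import tempfile


def walk_pruned(top, pattern):
    """Return the directories visited under top, skipping any subdirectory
    whose name matches the regex pattern (hypothetical helper)."""
    skip = re.compile(pattern)
    visited = []
    for dirpath, dirs, files in os.walk(top):   # topdown=True is the default
        visited.append(os.path.relpath(dirpath, top))
        # Slice assignment replaces the contents of the very list object
        # that os.walk() holds, so the pruning takes effect.
        dirs[:] = [d for d in dirs if not skip.match(d)]
    return sorted(visited)


# Build a tiny throwaway tree: keep/, skip_me/ and skip_me/inner/
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'keep'))
    os.makedirs(os.path.join(top, 'skip_me', 'inner'))
    result = walk_pruned(top, r'skip')
    print(result)
```

Neither skip_me nor skip_me/inner is visited, because walk() never 
descends into a name that has been dropped from dirs.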


HTH
DaveA


