[Tutor] regex help for a noob
Thomas A. Anderson
thomas.anderson at little-beak.com
Mon Feb 15 18:26:55 EST 2021
Thanks David and Alan!
Now all working as desired.
On 15.02.21 23:48, David Rock wrote:
> * Thomas A. Anderson via Tutor <tutor at python.org> [2021-02-15 21:39]:
>> import re
>>
>> def getlist():
>>     """ creates a list from file """
>>     list = []
>>     dataload = open("/Users/drexl/Lyntin/sample.txt", "r")
>>     regExp = '\".*?\"'
>>     for line in dataload.readlines():
>>         x = re.findall(regExp, line)
>>         if x:
>>             list.append(x)
>>
>>     print list
>>
>>
>> getlist()
>>
>> I get the desired result, more or less, slightly more on the less side =(
>>
>> I am getting this as a list output:
>> [['"n"'], ['"n"'], ['"e"'], ['"w"'], ['"n"']]
>>
>> where I would like a more basic list:
>> list = ['n', 'n', 'e', 'w', 'n']
> It sounds like you need to use a group in your regex:
> instead of: '\".*?\"'
> use: '\"(.*?)\"'
>
> Basically, if you put () around the part you want, it gets "grouped" and can be referenced later by index.
> re.findall will use groups if they are set:
>
>
> re.findall(pattern, string, flags=0)
> Return all non-overlapping matches of
> pattern in string, as a list of strings. The string is scanned left-to-right,
> and matches are returned in the order found. If one or more groups are present
> in the pattern, return a list of groups; this will be a list of tuples if the
> pattern has more than one group. Empty matches are included in the result.
>
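For anyone reading this later, here is a small sketch of what that documentation passage means in practice. The sample strings and the two-group pattern are made up for illustration and are not from my file.

```python
import re

# Hypothetical sample text, not taken from sample.txt.
line = 'go "n" then "e"'

# No group: each match is the whole quoted string, quotes included.
whole = re.findall(r'".*?"', line)

# One group: each match is only the text captured inside the parentheses.
inner = re.findall(r'"(.*?)"', line)

# Two groups: each match becomes a tuple with one entry per group.
pairs = re.findall(r'"(.)(.*?)"', 'say "north" or "east"')

print(whole)  # ['"n"', '"e"']
print(inner)  # ['n', 'e']
print(pairs)  # [('n', 'orth'), ('e', 'ast')]
```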
> You may still end up with a list of lists, but I think that will get you closer to what you want.
>
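One way to get rid of the remaining list of lists is to use extend() instead of append(), since findall already returns a list for each line. This is only a sketch of that idea, not necessarily the best approach: get_list and the sample lines are hypothetical names and data, and it takes any iterable of lines rather than opening my file.

```python
import re

def get_list(lines):
    """Collect the text between double quotes from each line into one flat list."""
    found = []
    for line in lines:
        # extend() adds each captured string on its own;
        # append() would add one nested list per line instead.
        found.extend(re.findall(r'"(.*?)"', line))
    return found

# Hypothetical input standing in for the lines of the file.
sample = ['move "n"', 'move "n"', 'no quotes here', 'move "e"', 'move "w"', 'move "n"']
print(get_list(sample))  # ['n', 'n', 'e', 'w', 'n']
```

Lines without a quoted string contribute nothing, because findall returns an empty list for them, so the "if x:" check is no longer needed.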