[Tutor] regex help for a noob
Alan Gauld
alan.gauld at yahoo.co.uk
Mon Feb 15 17:47:55 EST 2021
There are several things to comment on here...
On 15/02/2021 20:39, Thomas A. Anderson via Tutor wrote:
> The single characters I am looking for are nestled within a ("_"), i.e.
> parenthesis and double quote.
>
> I have tried the following code:
>
>
> import re
>
> def getlist():
> """ creates a list from file """
> list = []
> dataload = open("/Users/drexl/Lyntin/sample.txt", "r")
Best Python practice says use a with statement for this:
with open("/Users/drexl/Lyntin/sample.txt", "r") as dataload:
That will ensure the file gets closed again, even if you hit an exception.
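As a rough sketch of what the with statement buys you (the temporary file and the deliberate ValueError are made up for the demo, they are not from your code):

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained (hypothetical data).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write('first ("a") line\n')

try:
    with open(path, "r") as dataload:
        first_line = dataload.readline()
        # Simulate something going wrong while the file is still open.
        raise ValueError("simulated failure while processing")
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception.
closed_after_error = dataload.closed
os.remove(path)
print(closed_after_error)
```

With a bare open() and no try/finally, the file would have stayed open until the garbage collector got round to it.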
> regExp = '\".*?\"'
This regex does not correspond to your specification. Where are the ()?
I'd expect something like:
regExp = r'\("(.)"\)'  # match any single char between (" and ")
You want to extract the bit inside the quotes, so that's
what the group (i.e. the (.) bit) will do.
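Here is a small sketch of that kind of pattern in action. The sample string is invented, and I'm assuming the delimiters really are (" and ") with exactly one character between them:

```python
import re

# Raw string so the backslashes reach the regex engine unchanged.
# \(" and "\) match the literal delimiters; (.) captures exactly one character.
pattern = re.compile(r'\("(.)"\)')

# Invented test data: two valid hits, one too long, one without parentheses.
sample = 'go ("n") then ("e"), but not ("too long") or "x"'

# With a single group in the pattern, findall() returns just the group contents.
matches = pattern.findall(sample)
print(matches)  # ['n', 'e']
```

Note that findall() hands back only the captured character, not the surrounding (" and "), because the pattern contains one group.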
> for line in dataload.readlines():
You don't need readlines(); it's better to use the file
object as an iterator:
for line in dataload:
However, I'm not sure you even need to scan line by line; you
could just read() the whole file and do it as a single search
with findall()... But there may be data complications that
preclude that...
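The whole-file approach might look something like this. It is only a sketch: the file contents are invented, a temporary file stands in for your sample.txt, and I'm assuming a pattern for one character between (" and "):

```python
import os
import re
import tempfile

# Assumed pattern: one character between (" and ").
pattern = re.compile(r'\("(.)"\)')

# Write a small stand-in file so the example runs on its own (hypothetical data).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write('exits: ("n") ("s")\nnothing here\nalso ("w")\n')

# Read everything in one go and search the whole string once.
with open(path, "r") as dataload:
    found = pattern.findall(dataload.read())

os.remove(path)
print(found)  # ['n', 's', 'w']
```

The result is already one flat list, so there is no per-line list to merge. For a very large file, reading it all into memory may not be what you want, in which case the line-by-line loop is the safer choice.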
> x = re.findall(regExp, line)
> if x:
> list.append(x)
findall() returns a list of found items. You are appending the whole
list to your list. You probably want to add the lists together:
list += x
Also, it's very bad practice to use a type name for a variable. You
have hidden the built-in list() so you can't now convert strings,
say, to lists:
Ls = list("abc") -> TypeError, because list is now an actual list, not the type.
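To see the shadowing problem for yourself, something like the following should do. The function names and the replacement variable name chars are just ones I picked for illustration:

```python
def shadowed():
    list = []  # hides the built-in list inside this function
    try:
        return list("abc")  # calls the empty list object, not the type
    except TypeError as err:
        return err


def not_shadowed():
    chars = []            # a different name leaves the built-in alone
    chars += list("abc")  # list() still works as usual
    return chars


print(repr(shadowed()))
print(not_shadowed())  # ['a', 'b', 'c']
```

The first call returns a TypeError because a list object is not callable; the second works because the built-in name was never reused.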
> I have tried various other regex expressions, but they only give me worse or the same results.
> So, I don't think it is regex related? But somewhere else, I am missing something?
You are mostly missing the fact that appending a list to a
list puts the whole list into the containing list:
a = [1]
b = [2]
c = []
c.append(a) -> [[1]]
c.append(b) -> [[1],[2]]
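Applied to findall() results, the difference looks like this. It is a sketch with invented input lines, and it assumes a pattern for one character between (" and "):

```python
import re

# Assumed pattern: one character between (" and ").
pattern = re.compile(r'\("(.)"\)')

# Invented input standing in for the lines of the file.
lines = ['("a") and ("b")', 'no match', '("c")']

nested = []  # built with append(): one sub-list per matching line
flat = []    # built with +=: one item per match

for line in lines:
    x = pattern.findall(line)
    if x:
        nested.append(x)
    flat += x  # adding an empty list is harmless, so no test is needed

print(nested)  # [['a', 'b'], ['c']]
print(flat)    # ['a', 'b', 'c']
```

flat.extend(x) would do the same job as flat += x if you find that easier to read.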
But there are quite a few other things to tidy up too.
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos