[Tutor] Testing a string to see if it contains a substring (Steve and Mark)
dw
bw_dw at fastmail.fm
Thu Jan 22 17:15:25 CET 2015
Thanks so much Steve and Mark!
You've given me a lot to chew on. :-D
I'll pursue!
More Python FUN!!
================================================================
Based on your description, I think the best way to do this is:
# remove blank lines
line_array = [line for line in line_array if line != '\n']
Possibly this is even nicer:
# get rid of unnecessary leading and trailing whitespace on each line
# and then remove blanks
line_array = [line.strip() for line in line_array]
line_array = [line for line in line_array if line]
This is an alternative, but perhaps a little cryptic for those not
familiar with functional programming styles:
line_array = filter(None, map(str.strip, line_array))
No regexes required!
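To see how the approaches above differ, here is a minimal sketch run on made-up input. The sample lines and the variable names (raw_lines, only_newlines_removed, and so on) are hypothetical and not from the original thread. Note that the first approach only drops lines that are exactly '\n', so a line containing only spaces survives it.

```python
# Hypothetical sample input, invented for illustration.
raw_lines = ["  first line \n", "\n", "second line\n", "   \n"]

# Approach 1: drop lines that are exactly a newline (whitespace-only lines stay).
only_newlines_removed = [line for line in raw_lines if line != '\n']

# Approach 2: strip each line, then drop the empty strings.
stripped = [line.strip() for line in raw_lines]
by_comprehension = [line for line in stripped if line]

# Approach 3: the functional version; list() makes it a real list on Python 3.
by_filter = list(filter(None, map(str.strip, raw_lines)))

print(only_newlines_removed)
print(by_comprehension)
print(by_filter)
```

The second and third approaches give the same result here; the first keeps the whitespace-only line and the trailing newlines.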
However, it isn't clear from your example whether non-blank lines
*always* include a date. Suppose you have to filter date lines from
non-date lines?
Start with a regex and a tiny helper function, which we can write as a
lambda and embed directly in the call to filter:
import re
DATE = r'\d{2}/\d{2}/\d{4}'
line_array = filter(lambda line: re.search(DATE, line), line_array)
In Python 3, filter returns a lazy iterator rather than a list, so wrap
it in a call to list if you need a list:
line_array = list(filter(lambda line: re.search(DATE, line),
                         line_array))
That isn't needed in Python 2, where filter already returns a list.
If that's a bit cryptic, here it is again as a list comp:
DATE = r'\d{2}/\d{2}/\d{4}'
line_array = [line for line in line_array if re.search(DATE, line)]
Let's get rid of the whitespace at the same time!
line_array = [line.strip() for line in line_array
              if re.search(DATE, line)]
And if that's still too cryptic ("what's a list comp?") here it is again
expanded out in full:
temp = []
for line in line_array:
    if re.search(DATE, line):
        temp.append(line.strip())
line_array = temp
How does this work? It works because the two main re functions,
re.match and re.search, return None when the regex isn't found, and a
MatchObject when it is found. None has the property that it is
considered "false" in a boolean context, while MatchObjects are always
considered "true".
We don't care *where* the date is found in the string, only whether or
not it is found, so there is no need to check the starting position.
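The truthiness rule described above can be checked directly. This is a small sketch on made-up lines (the samples list and the results name are hypothetical); it uses the same DATE pattern as above.

```python
import re

DATE = r'\d{2}/\d{2}/\d{4}'

# Hypothetical sample lines, invented for illustration.
samples = ["01/22/2015 first entry\n", "no date here\n", "\n"]

for line in samples:
    match = re.search(DATE, line)
    # match is None when no date is present, otherwise a match object,
    # so bool(match) tells us whether the line contains a date.
    print(repr(line), bool(match))

# The same test, collected into a list of booleans.
results = [bool(re.search(DATE, line)) for line in samples]
```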
--
Steven
=============================
I'd use
https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime
to test the first ten characters of the string. I'll leave that and
handling IndexError or ValueError to you :)
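One possible way to fill in that suggestion is sketched below. The helper name starts_with_date and the day/month/year format string are assumptions, since the thread doesn't say which order the dates use. Slicing with line[:10] never raises IndexError, even on short strings, so only ValueError needs catching in this version.

```python
from datetime import datetime

def starts_with_date(line, fmt="%d/%m/%Y"):
    """Return True if the first ten characters of line parse as a date.

    The default format (day/month/year) is an assumption; change fmt
    to "%m/%d/%Y" if the data uses month/day/year instead.
    """
    try:
        # strptime raises ValueError if the text doesn't match fmt
        # or describes an impossible date such as 99/99/2015.
        datetime.strptime(line[:10], fmt)
    except ValueError:
        return False
    return True

print(starts_with_date("22/01/2015 something happened"))
print(starts_with_date("no date here"))
```

Unlike the regex, this also rejects strings that look like dates but aren't valid ones.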
--
Bw_dw at fastmail.net