[Tutor] How do I make pattern to find only '.html' file using Python Regular Expression?

Ben Finney ben+python at benfinney.id.au
Thu Apr 2 01:47:21 CEST 2015


Abdullah Al Imran <abdalimran at live.com> writes:

> How to do it using Python Regular Expression?

Don't assume which tool you must use; instead, ask how best the problem
can be solved.

In the case of parsing HTML, regular expressions are a poor fit
<URL:http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454>
because they only match patterns in flat text; they don't keep track of
the nested tags, attributes and comments that a parser understands.

Use a parser which better understands HTML, like Beautiful Soup
<URL:https://pypi.python.org/pypi/BeautifulSoup>.
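
As a rough sketch of the parser approach: I'm assuming here that the goal
is to collect the links to '.html' files from a page, which is a guess,
since the full question isn't quoted above. The sketch uses the standard
library's 'html.parser' module instead of Beautiful Soup, only so that it
runs without installing anything. The names 'HtmlLinkCollector' and
'find_html_links', and the sample markup, are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HtmlLinkCollector(HTMLParser):
    """Collect href values of <a> tags whose URL path ends in '.html'.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser calls this for every real start tag; markup inside
        # comments is routed to handle_comment() and never reaches here.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Test only the path, so '#fragment' and '?query' parts
                # don't hide the '.html' suffix.
                if urlparse(value).path.lower().endswith(".html"):
                    self.links.append(value)


def find_html_links(markup):
    """Return the hrefs in `markup` that point at '.html' files."""
    collector = HtmlLinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links


# Made-up sample input, assumed for this sketch.
sample = """
<p>
  <a href="index.html">Home</a>
  <a href="logo.png">Logo</a>
  <a href="docs/guide.html#intro">Guide</a>
  <!-- <a href="hidden.html">commented out</a> -->
</p>
"""

print(find_html_links(sample))
# → ['index.html', 'docs/guide.html#intro']
```

Note that the commented-out link is skipped without any extra effort,
which is exactly the kind of case a plain regular expression tends to get
wrong. With Beautiful Soup 4 installed (the 'bs4' package), the same idea
is usually written as a loop over 'soup.find_all("a", href=True)'.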

-- 
 \      “An expert is a man who has made all the mistakes which can be |
  `\                         made in a very narrow field.” —Niels Bohr |
_o__)                                                                  |
Ben Finney


