How to grab a number from inside a .html file using regex
nikos.the.gr33k at gmail.com
Sat Aug 7 21:37:54 CEST 2010
On 7 Aug, 22:17, MRAB <pyt... at mrabarnett.plus.com> wrote:
> Νίκος wrote:
> > On 7 Aug, 21:24, MRAB <pyt... at mrabarnett.plus.com> wrote:
> >> Use group capture:
> >> page_id = re.match(r'<!-- (\d+) -->', firstline).group(1)
> >> print(page_id)
> > Worked like a charm! Thanks a lot!
> > So the match method here not only searched for the string representation
> > of the number but also converted it to an integer as well?
> > Does the r stand for "retrieve the string" here?
> > And what does group do?
> > When a regex searches a .txt file and retrieves something from it, does
> > it always retrieve it as a string, or can it get it as a number as well?
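On the integer question: as far as I know, re never converts anything. group(1) returns the text captured by the first pair of parentheses, as a str, and turning it into a number takes an explicit int() call. A minimal sketch, assuming a made-up first line of the same form as above (the value 1234 is only an illustration):

```python
import re

# Hypothetical first line of the .html file; the number is made up.
firstline = "<!-- 1234 -->"

match = re.match(r"<!-- (\d+) -->", firstline)
if match is None:
    # re.match returns None when the pattern does not match at the start.
    raise ValueError("first line does not contain a page id comment")

captured = match.group(1)  # text matched by the first (...) group, a str
page_id = int(captured)    # explicit conversion to an integer

print(type(captured).__name__, repr(captured))
print(type(page_id).__name__, page_id)
```

Checking for None first avoids an AttributeError when the line does not have the expected shape.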
> The 'r' prefix makes it a 'raw string literal'. That means that the
> string literal won't treat backslashes as special. Before raw string
> literals were added to the Python language I would have needed to write:
> '<!-- (\\d+) -->'
> (Actually, that's not strictly true in this case, because \d doesn't
> have a special meaning in Python strings, but it's a good idea to use raw
> string literals habitually when writing regexes in order to reduce the
> chance of forgetting them when they _are_ necessary. Well, that's what I
> think, anyway. :-))
Couldn't agree more!
As the saying goes, better safe than sorry! :-)
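For the record, a small sketch of where the 'r' prefix does and does not change anything. The \bcat\b pattern and the sample text "a cat sat" are my own illustration, not from the thread: \d is not a string escape, so both spellings produce the same pattern, while \b is a string escape (backspace), so there the prefix matters.

```python
import re

# \d is not a string escape, so the raw and the doubled-backslash
# spellings produce exactly the same pattern text.
assert r"<!-- (\d+) -->" == "<!-- (\\d+) -->"

# \b IS a string escape (backspace), so here the 'r' prefix matters.
plain = "\bcat\b"   # backspace, c, a, t, backspace
raw = r"\bcat\b"    # backslash, b, c, a, t, backslash, b

print(len(plain), len(raw))

# The non-raw pattern looks for literal backspace characters and finds none.
print(re.search(plain, "a cat sat"))

# The raw pattern reaches the regex engine as \b (word boundary) and matches.
print(re.search(raw, "a cat sat").group())
```

Newer Python versions also warn about unrecognized escapes such as \d in non-raw string literals, which is one more reason to stay with the raw-string habit.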