How to grab a number from inside a .html file using regex
MRAB
python at mrabarnett.plus.com
Sat Aug 7 15:17:59 EDT 2010
Νίκος wrote:
> On 7 Aug, 21:24, MRAB <pyt... at mrabarnett.plus.com> wrote:
>
>> Use group capture:
>>
>> page_id = re.match(r'<!-- (\d+) -->', firstline).group(1)
>> print(page_id)
>
> Worked like a charm! Thanks a lot!
>
> So the match method here not only searched for the string representation
> of the number but also converted it to an integer as well?
>
> Does r stand for "retrieve the string" here?
>
> And group?
>
> When a regex searches a .txt file and retrieves something from it, does it
> always retrieve it as a string? Or can it get it as a number as well?
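On the group and integer questions, here is a minimal sketch. The sample
value of firstline and the names match and page_id_text are made up for
illustration. group(1) returns the text captured by the first parenthesised
group in the pattern, and it is always a string; the re module does no
numeric conversion, so you call int() yourself if you want a number.

```python
import re

# Hypothetical first line of the .html file, made up for illustration.
firstline = '<!-- 12345 -->'

match = re.match(r'<!-- (\d+) -->', firstline)
if match is None:
    # re.match returns None when the pattern does not match at the start.
    raise ValueError('no page id found in the first line')

page_id_text = match.group(1)  # text captured by the first (...) group
page_id = int(page_id_text)    # explicit conversion from str to int

print(type(page_id_text).__name__, page_id_text)
print(type(page_id).__name__, page_id)
```

group(0), or group() with no argument, gives the whole matched text rather
than just the captured digits.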
The 'r' prefix makes it a 'raw string literal'. That means that the
string literal won't treat backslashes as special. Before raw string
literals were added to the Python language I would have needed to write:
'<!-- (\\d+) -->'
instead.
(Actually, that's not strictly true in this case, because \d doesn't
have a special meaning in Python strings, but it's a good idea to use raw
string literals habitually when writing regexes in order to reduce the
chance of forgetting them when they _are_ necessary. Well, that's what I
think, anyway. :-))
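A small sketch of the difference, for illustration only. The variable names
and the \b example are assumptions added here, not part of the earlier code.
\b is a case where the raw string really is necessary: in a normal string
literal it is a backspace character, while in a regex it means a word
boundary.

```python
import re

raw = r'<!-- (\d+) -->'       # raw string literal
escaped = '<!-- (\\d+) -->'   # normal literal with the backslash doubled

# Both spellings produce exactly the same string.
print(raw == escaped)

# '\b' in a normal literal is one backspace character;
# r'\b' is two characters, a backslash followed by 'b'.
print(len('\b'), len(r'\b'))

# Only the raw version reaches the regex engine as a word boundary.
found_raw = re.search(r'\bcat\b', 'a cat sat')
found_plain = re.search('\bcat\b', 'a cat sat')
print(found_raw is not None, found_plain is not None)
```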