How to grab a number from inside a .html file using regex
MRAB
python at mrabarnett.plus.com
Sat Aug 7 14:24:52 EDT 2010
Νίκος wrote:
> Hello guys! Need your precious help again!
>
> In every HTML file I have, on the very first line, a page_id for counter
> counting purposes, in the format of a comment like this:
>
> <!-- 1 -->
> <!-- 2 -->
> <!-- 3 -->
>
> and so on. Every HTML file has its own page_id.
>
> How can I grab that string representation of a number from inside
> the .html file using regex and convert it to an integer value?
>
> # ==============================
> # open current html template and get the page ID number
> # ==============================
>
> f = open( '/home/webville/public_html/' + page )
>
> #read first line of the file
> firstline = f.readline()
>
> page_id = re.match( '<!-- \d -->', firstline )
> print ( page_id )
Use a capture group, then convert the captured digits with int():
page_id = int(re.match(r'<!-- (\d+) -->', firstline).group(1))
print(page_id)
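
Note that re.match returns None when the first line does not contain the comment, so calling .group(1) directly on the result would raise AttributeError for such a file. A minimal sketch of a more defensive version follows. The helper name parse_page_id and the whitespace-tolerant pattern are assumptions made for illustration, not part of the original code:

```python
import re

# Assumed pattern: tolerates extra or missing spaces around the number,
# which is slightly looser than the exact '<!-- N -->' format in the question.
PAGE_ID_RE = re.compile(r'<!--\s*(\d+)\s*-->')

def parse_page_id(firstline):
    """Return the page ID as an int, or None if the line has no ID comment.

    Hypothetical helper, named here only for the example.
    """
    m = PAGE_ID_RE.match(firstline)
    if m is None:
        # No ID comment at the start of the line.
        return None
    # group(1) is the text captured by (\d+); int() converts it to a number.
    return int(m.group(1))

print(parse_page_id('<!-- 42 -->\n'))  # prints 42
print(parse_page_id('<html>\n'))       # prints None
```

The caller can then test the result against None and decide what to do with a template that has no page ID, instead of crashing on it.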