Getting a value that follows string.find()
Dave Angel
davea at davea.name
Tue Aug 13 21:31:58 EDT 2013
englishkevin110 at gmail.com wrote:
> I know the title doesn't make much sense, but I didn't know how to explain my problem.
>
> Anywho, I've opened a page's source in URLLIB
> starturlsource = starturlopen.read()
> string.find(starturlsource, '<a href="/profile.php?id=')
> And I used string.find to find a specific area in the page's source.
> I want to store what comes after ?id= in a variable.
> Can someone help me with this?
Python 3.3.0 (default, Mar 7 2013, 00:24:38)
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import string
>>> help(string.find)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'find'
There is no find function in the string module [1]. But assuming
starturlsource is a str, you could do:
pattern = '<a href="/profile.php?id='
index = starturlsource.find(pattern)
index will then be -1 if there's no match, or have a non-negative value
if a match is found.
In the latter case, you can extract the next 17 characters with
newstr = starturlsource[index+len(pattern):index+len(pattern)+17]
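If the id is not always exactly 17 characters long, one alternative is to slice up to the closing double quote of the href instead of using a fixed width. The following is only a minimal sketch: the helper name extract_after and the sample string are made up for illustration, and it assumes the id is immediately followed by a double quote.

```python
def extract_after(source, pattern, terminator='"'):
    """Return the text between pattern and the next terminator, or None."""
    index = source.find(pattern)
    if index == -1:
        return None  # pattern not present in the source
    start = index + len(pattern)
    end = source.find(terminator, start)
    if end == -1:
        return None  # no terminator after the pattern
    return source[start:end]


# Hypothetical page fragment, standing in for starturlsource
sample = '<p><a href="/profile.php?id=12345678901234567">Kevin</a></p>'
pattern = '<a href="/profile.php?id='
profile_id = extract_after(sample, pattern)
print(profile_id)
```

Checking for None before using the result covers both the "no match" case and a truncated page.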
You are of course making several assumptions about the web page, which
are perfectly reasonable since it's a page under your control. Or is
it?
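If the page is not under your control, the exact text '<a href="/profile.php?id=' may not appear as expected (extra attributes before href, single quotes, additional query parameters). A less fragile approach is to let the standard library parse the HTML and the URL. This is a rough sketch, not a drop-in solution: the class name ProfileIdParser, the function find_profile_ids and the sample markup are hypothetical, and it assumes the links point at the path /profile.php.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class ProfileIdParser(HTMLParser):
    """Collect the id query parameter of every <a> linking to /profile.php."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return  # anchor without a usable href
        parts = urlparse(href)
        if parts.path == "/profile.php":
            # parse_qs maps each parameter name to a list of values
            self.ids.extend(parse_qs(parts.query).get("id", []))


def find_profile_ids(source):
    parser = ProfileIdParser()
    parser.feed(source)
    parser.close()
    return parser.ids


# Hypothetical markup that a plain substring search would miss
sample_page = (
    '<a class="x" href="/profile.php?id=42&amp;ref=home">A</a> '
    "<a href='/profile.php?id=7'>B</a> "
    '<a href="/other.php?id=9">C</a>'
)
print(find_profile_ids(sample_page))
```

This returns the ids as strings, in document order, and ignores links to other pages.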
[1] Assuming Python 3.3 since you omitted stating the version you're
using. But even in Python 2.7, using the string.find function is
deprecated in favor of the str method.
--
DaveA