get a field
MRAB
python at mrabarnett.plus.com
Mon Feb 15 13:56:10 EST 2010
Tim Chase wrote:
> Holden wrote:
>> mierdatutis mi wrote:
>>> I have this:
>>>
>>> pe="http://www.rtve.es/mediateca/videos/20100211/saber-comer---patatas-castellanas-costillas-11-02-10/691046.shtml"
>>>
>>>
>>> I would like to extract this: 691046.shtml
>>>
>>> But it is dynamic. The string does not always have the same length.
>>
>> >>> s = "http://server/path/to/file/file.shtml"
>> >>> s.rfind("/") # finds rightmost "/"
>> 26
>> >>> s[s.rfind("/")+1:] # substring starting after "/"
>> 'file.shtml'
>
> If I didn't use os.path.basename(s) then I'd write this as
> "s.rsplit('/', 1)[-1]"
>
> >>> "http://server/path/to/file/file.shtml".rsplit('/', 1)[-1]
> 'file.shtml'
> >>> "".rsplit('/', 1)[-1]
> ''
> >>> "file.html".rsplit('/', 1)[-1]
> 'file.html'
>
> I don't know how much of a difference it makes, but I always appreciate
> seeing how various people solve the same problem. I tend to lean away
> from the find()/index() methods on strings because I have to stop and
> think which one raises the exception and which one returns -1 (and that
> it's -1 instead of 0), usually dropping to a Python shell and doing
>
> >>> help("".find)
> >>> help("".index)
>
> to refresh my memory.
>
> FWIW, Steve's solution and os.path.basename() both produce the same
> results for all 3 input values, so it's more a matter of personal style
> preference. The basename() version gives slightly different results if
> you have Windows paths with backslashes:
>
> s = r'c:\path\to\file.txt'
>
> but since you (OP) know that they should be URLs with forward-slashes,
> it should be a non-issue.
>
The MacOS separator was (is?) ":". Since MacOS X it has been based on
Unix, so I'm not sure what Python sees nowadays, "/" or still ":".
Another platform, RISC OS, uses ".". For that reason I wouldn't use
os.path.basename().
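
A rough sketch of what that platform dependence looks like, assuming only
the standard library: os.path is whichever path module matches the host,
whereas posixpath and ntpath can be imported by name on any platform and
always apply their own separator rules. The sample strings are the ones
quoted above; the variable names are only for illustration.

```python
import ntpath
import posixpath

url = "http://server/path/to/file/file.shtml"
win = r"c:\path\to\file.txt"

# posixpath always treats "/" as the separator, whatever the host OS is.
url_name = posixpath.basename(url)

# ntpath accepts both "\" and "/" as separators.
win_name = ntpath.basename(win)

# posixpath finds no "/" in the Windows path, so it returns the whole string.
win_as_posix = posixpath.basename(win)

print(url_name)
print(win_name)
print(win_as_posix)
```

So for URLs, naming posixpath explicitly (or sticking to plain string
methods) avoids depending on what os.path happens to be on a given machine.
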
An alternative to .rsplit() is .rpartition().
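
For example, a minimal sketch (last_field is just a name made up here, and
it assumes the field wanted is whatever follows the last "/"):

```python
pe = ("http://www.rtve.es/mediateca/videos/20100211/"
      "saber-comer---patatas-castellanas-costillas-11-02-10/691046.shtml")

def last_field(s, sep="/"):
    # rpartition() splits at the LAST occurrence of sep and always returns
    # a 3-tuple (head, sep, tail). If sep is absent, the result is
    # ('', '', s), so the tail is the whole string.
    head, found_sep, tail = s.rpartition(sep)
    return tail

print(last_field(pe))
print(repr(last_field("")))
print(last_field("file.html"))
```

Like rsplit('/', 1)[-1], it never raises and there is no -1 to remember.
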