# get a field

MRAB python at mrabarnett.plus.com
Mon Feb 15 19:56:10 CET 2010

Tim Chase wrote:
> Steve Holden wrote:
>> mierdatutis mi wrote:
>>> I have this:
>>>
>>> pe="http://www.rtve.es/mediateca/videos/20100211/saber-comer---patatas-castellanas-costillas-11-02-10/691046.shtml"
>>>
>>>
>>> I would like to extract this: 691046.shtml
>>>
>>> But it is dynamic. The string does not always have the same length.
>>
>>>>> s = "http://server/path/to/file/file.shtml"
>>>>> s.rfind("/")         # finds rightmost "/"
>> 26
>>>>> s[s.rfind("/")+1:]   # substring starting after "/"
>> 'file.shtml'
>
> If I didn't use os.path.basename(s) then I'd write this as
> "s.rsplit('/', 1)[-1]"
>
>   >>> "http://server/path/to/file/file.shtml".rsplit('/', 1)[-1]
>   'file.shtml'
>   >>> "".rsplit('/', 1)[-1]
>   ''
>   >>> "file.html".rsplit('/', 1)[-1]
>   'file.html'
>
> I don't know how much of a difference it makes, but I always appreciate
> seeing how various people solve the same problem.  I tend to lean away
> from the find()/index() methods on strings because I have to stop and
> think about which one raises the exception and which one returns -1 (and
> that it's -1 instead of 0), usually dropping to a Python shell and doing
>
>   >>> help("".find)
>   >>> help("".index)
>
> to refresh my memory.
>
> FWIW, Steve's solution and the os.path.basename() both produce the same
> results with all 3 input values, so it's more a matter of personal style
> preference.  The basename() version has slightly different results if
> you have Windows paths with backslashes:
>
>   s = r'c:\path\to\file.txt'
>
> but since you (OP) know that they should be URLs with forward-slashes,
> it should be a non-issue.
>
The classic Mac OS separator was ":". Mac OS X is based on Unix, so
Python there follows the POSIX rules and sees "/". Another platform,
RISC OS, uses ".". Because os.path.basename() follows the conventions
of whatever platform the code runs on, not those of URLs, I wouldn't
use it here.
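
If the basename() spelling is still wanted, one possible workaround (a
suggestion of mine, not something proposed earlier in the thread) is to
call the posixpath module directly, which always treats "/" as the
separator regardless of the host platform. A minimal sketch, reusing
the sample strings from the quoted messages:

```python
import posixpath  # always splits on "/", whatever platform Python runs on

s = "http://server/path/to/file/file.shtml"
print(posixpath.basename(s))   # file.shtml

# Backslashes are not separators to posixpath, so a Windows-style
# path comes back unchanged:
w = r"c:\path\to\file.txt"
print(posixpath.basename(w) == w)   # True
```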

An alternative to .rsplit() is .rpartition().
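
A rough sketch of how that might look with the URL from the original
post. The helper name last_segment is made up here purely for
illustration; it is not a standard-library function:

```python
def last_segment(url):
    # str.rpartition splits on the LAST "/" and always returns a
    # 3-tuple (head, separator, tail), so no indexing is needed.
    head, sep, tail = url.rpartition("/")
    return tail

pe = "http://www.rtve.es/mediateca/videos/20100211/saber-comer---patatas-castellanas-costillas-11-02-10/691046.shtml"
print(last_segment(pe))            # 691046.shtml
print(last_segment("file.html"))   # file.html (no "/": tail is the whole string)
print(repr(last_segment("")))      # ''
```

Like .rsplit('/', 1)[-1], this gives the same answers for all three of
Tim's test inputs, and there is no -1 return value or exception to
remember.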