get a field

Mon Feb 15 13:56:10 EST 2010

Tim Chase wrote:
> Steve Holden wrote:
>> mierdatutis mi wrote:
>>> I have this:
>>>
>>> pe="http://www.rtve.es/mediateca/videos/20100211/saber-comer---patatas-castellanas-costillas-11-02-10/691046.shtml" 
>>>
>>>
>>> I would like to extract this: 691046.shtml
>>>
>>> But it is dynamic. The string does not always have the same length.
>>
>>>>> s = "http://server/path/to/file/file.shtml"
>>>>> s.rfind("/")         # finds rightmost "/"
>> 26
>>>>> s[s.rfind("/")+1:]   # substring starting after "/"
>> 'file.shtml'
> 
> If I didn't use os.path.basename(s) then I'd write this as 
> "s.rsplit('/', 1)[-1]"
> 
>   >>> "http://server/path/to/file/file.shtml".rsplit('/', 1)[-1]
>   'file.shtml'
>   >>> "".rsplit('/', 1)[-1]
>   ''
>   >>> "file.html".rsplit('/', 1)[-1]
>   'file.html'
> 
> I don't know how much of a difference it makes, but I always appreciate 
> seeing how various people solve the same problem.  I tend to lean away 
> from the find()/index() methods on strings because I have to stop and 
> think which one raises the exception and which one returns -1 (and that 
> it's -1 instead of 0), usually dropping to a Python shell and doing
> 
>   >>> help("".find)
>   >>> help("".index)
> 
> to refresh my memory.
> 
> FWIW, Steve's solution and the os.path.basename() both produce the same 
> results with all 3 input values, so it's more a matter of personal style 
> preference.  The basename() version has slightly different results if 
> you have Windows paths with backslashes:
> 
>   s = r'c:\path\to\file.txt'
> 
> but since you (OP) know that they should be URLs with forward-slashes, 
> it should be a non-issue.
> 
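
On the find()/index() question, here is a rough sketch of the difference as I understand it: find()/rfind() return -1 when the substring is missing, while index()/rindex() raise ValueError. The helper name after_last_slash is made up for illustration only.

```python
def after_last_slash(s):
    # Hypothetical helper: everything after the rightmost "/".
    # rfind() returns -1 when there is no "/", and -1 + 1 == 0,
    # so the slice then starts at 0 and returns the whole string.
    return s[s.rfind("/") + 1:]


s = "http://server/path/to/file/file.shtml"
found_at = s.rfind("/")                # index of the rightmost "/"
missing_at = "file.html".rfind("/")    # -1, no exception

# rindex() signals "not found" with an exception instead of -1.
try:
    "file.html".rindex("/")
    rindex_raised = False
except ValueError:
    rindex_raised = True
```

So the rfind() slice happens to cope with a string that has no "/" at all, but only because -1 + 1 is 0.
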
The MacOS separator was (is?) ":". Since MacOS X it has been based on
Unix, so I'm not sure what Python sees nowadays, "/" or still ":".
Another platform, RISC OS, uses ".". For that reason I wouldn't use
os.path.basename().
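
If the basename() spelling is still wanted, one possible way around the platform question (a sketch, not something suggested earlier in the thread) is to call posixpath.basename() directly. posixpath is the "/"-separated flavour of os.path and can be imported on any platform, and ntpath is the Windows flavour. The variable names below are only for illustration.

```python
import ntpath
import posixpath

url = ("http://www.rtve.es/mediateca/videos/20100211/"
       "saber-comer---patatas-castellanas-costillas-11-02-10/691046.shtml")
win = r"c:\path\to\file.txt"

# posixpath always splits on "/", whatever os.sep is on this machine.
url_name = posixpath.basename(url)

# A backslash is not a separator for posixpath, so the Windows path
# comes back unchanged.
win_as_posix = posixpath.basename(win)

# ntpath understands the backslash separator and the drive letter.
win_name = ntpath.basename(win)
```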

An alternative to .rsplit() is .rpartition().
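
A short sketch of what that looks like, reusing the three test strings from above. rpartition() always returns a 3-tuple (head, separator, tail), and when the separator is missing the whole string ends up in the last element.

```python
s = "http://server/path/to/file/file.shtml"

# Split once at the rightmost "/".
head, sep, tail = s.rpartition("/")

# With no "/" present, the first two elements are empty strings and
# the original string is the last element.
empty_parts = "".rpartition("/")
bare_parts = "file.html".rpartition("/")

# Taking [-1] therefore gives the same answers as rsplit('/', 1)[-1].
names = [x.rpartition("/")[-1] for x in (s, "", "file.html")]
```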