[Python-ideas] Small enhancement to os.path.splitext

M.-A. Lemburg mal at egenix.com
Tue Apr 20 17:44:03 CEST 2010


Tarek Ziadé wrote:
> Hello
> 
> Currently, os.path.splitext will split a string giving you the piece
> behind the last dot:
> 
>>>> os.path.splitext('file.tar.gz')
> ('file.tar', '.gz')
> 
> In some cases, what we really want is the two last parts when
> splitting on the dots (like in my example).
> 
> What about providing an extra argument to be able to grab more than one dot ?
> 
>>>> os.path.splitext('file.tar.gz', numext=2)
> ('file', '.tar.gz')
> 
> If numext > number of dots, it will just split after the first dot:
> 
>>>> os.path.splitext('file.tar', numext=2)
> ('file', '.tar')
> 
> 
> What do you think ?

I'm not sure whether that would really solve anything.

The general problem with extensions is that they can span multiple
"dotted" parts in a filename, but whether they do or not depends
on the extensions.

E.g. you can have 'file.tar', 'file.tar.gz', 'file.tgz', 'file.tar.gz.uu',
'file.tar.gz.asc', 'file.tar.gz.gpg', etc.

OTOH, it's possible to have files using extra dotted
parts to signal certain properties to the user, which don't
really mean anything in terms of encoding, file format or
compression, e.g. 'file.i686.linux.64bit.bin'.
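
To make the problem concrete, here is a minimal sketch of a fixed-count
split. The helper name splitext_n is hypothetical; it only emulates the
proposed numext argument on top of the existing os.path.splitext and is
not part of the standard library.

```python
import os.path

def splitext_n(path, numext=1):
    """Hypothetical helper: split off up to `numext` trailing dotted parts."""
    root, ext = path, ""
    for _ in range(numext):
        root, part = os.path.splitext(root)
        if not part:
            # No more dotted parts left to strip.
            break
        ext = part + ext
    return root, ext

print(splitext_n('file.tar.gz', numext=2))
# ('file', '.tar.gz')
print(splitext_n('file.tar', numext=2))
# ('file', '.tar')
print(splitext_n('file.i686.linux.64bit.bin', numext=2))
# ('file.i686.linux', '.64bit.bin')
```

The last call shows the issue: a fixed count cannot tell a real compound
extension from dotted parts that are merely descriptive.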

Most systems I know that have to deal with file extensions
come with a list of possible extensions and then register
a handler or property with each.

They typically use the 'longest match wins' strategy and then
use the matched extension as the file extension.
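
A minimal sketch of such a 'longest match wins' lookup might look like
the following. The registry contents, the descriptions used as the
registered "property", and the function name split_known_ext are all
assumptions made for illustration; a real system would have its own list
and would probably also deal with case-insensitive matching.

```python
# Hypothetical registry: known extension -> registered property.
KNOWN_EXTENSIONS = {
    '.tar': 'tar archive',
    '.gz': 'gzip-compressed file',
    '.tgz': 'gzip-compressed tar archive',
    '.tar.gz': 'gzip-compressed tar archive',
    '.bin': 'binary file',
}

def split_known_ext(filename, known=KNOWN_EXTENSIONS):
    """Split off the longest registered extension, or return an empty one."""
    # Try the longest candidates first so '.tar.gz' wins over '.gz'.
    for ext in sorted(known, key=len, reverse=True):
        # Require a non-empty root so a file named '.gz' is left alone.
        if filename.endswith(ext) and len(filename) > len(ext):
            return filename[:-len(ext)], ext
    return filename, ''

print(split_known_ext('file.tar.gz'))
# ('file', '.tar.gz')
print(split_known_ext('file.i686.linux.64bit.bin'))
# ('file.i686.linux.64bit', '.bin')
print(split_known_ext('README'))
# ('README', '')
```

With a registry like this the caller does not have to say how many dots
to consume; the list of known extensions decides.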

-- 
Marc-Andre Lemburg
eGenix.com
