[Python-ideas] str.split with padding

And Clover and-dev at doxdesk.com
Sat Mar 14 04:46:03 CET 2009


Lie Ryan wrote:

> Can you find a better use case?

Well here are some random uses from projects that a search on splitpad 
(one of the names I used for it) is turning up:

     command, parameters= splitpad(line, ' ', 1) # get SMTP command
     y, m, d= splitpad(t, '-', 2) # split date, month and day optional
     headers, body= splitpad(request, '\n\n', 1) # there might be no body
     table, column= rsplitpad(colname, '.', 1) # extract SQL 
[table.]column name
     id, cat, name, price= splitpad(line, ',', 3) # should be four 
columns, but editor might have lost trailing commas
     user, pwd= splitpad(base64.decodestring(authtoken), ':', 1) # will 
always contain ':' unless malformed
     pars= dict(splitpad(p, '=', 1) for p in input.split(';')) # no 
'=value' part is allowable
     server, version= splitpad(environ.get('SERVER_SOFTWARE', ''), '/', 
1) # might not have a version

And so on. (Obviously these have an internetty bias, where “be liberal 
in what you accept” is desirable.)

> For splitting email address, I think I would want to know if the address turned
> out to be invalid (e.g. it does  not contain exactly 1 @s)

Maybe, maybe not. In this case I wanted to accept the case of a bare 
username, with or without ‘@’, as a local user. An empty string instead 
of an exception for a missing part is something I find very common; it 
kind of fits with Python's “string processing does what you usually 
want” behaviour (as compared to other languages that are still tediously 
throwing exceptions when you try to slice outside the string length range).

For example with an HTTP command (eg. “GET / HTTP/1.0”):

     method, path, version= splitpad(command, ' ', 2)

‘version’ might be missing, on ancient HTTP/0.9 clients. ‘path’ could be 
missing, on malformed requests. In either of those cases I don't want an 
exception, and I don't particularly want to burden my split code with 
extra checking; I'll probably have to do further checking on ‘path’ 
anyway so setting it to an empty string is the best I can do here.

The alternative I use if I can't be bothered to define splitpad() again 
is something like:

     parts= command.split(' ', 2)
     method= parts[0]
     path= parts[1] if len(parts)>=2 else ''
     ....

which is pretty ugly.

-- 
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/




More information about the Python-ideas mailing list