[Python-ideas] str.split with padding
And Clover
and-dev at doxdesk.com
Sat Mar 14 04:46:03 CET 2009
Lie Ryan wrote:
> Can you find a better use case?
Well here are some random uses from projects that a search on splitpad
(one of the names I used for it) is turning up:
command, parameters= splitpad(line, ' ', 1) # get SMTP command
y, m, d= splitpad(t, '-', 2) # split date, month and day optional
headers, body= splitpad(request, '\n\n', 1) # there might be no body
table, column= rsplitpad(colname, '.', 1) # extract SQL
[table.]column name
id, cat, name, price= splitpad(line, ',', 3) # should be four
columns, but editor might have lost trailing commas
user, pwd= splitpad(base64.decodestring(authtoken), ':', 1) # will
always contain ':' unless malformed
pars= dict(splitpad(p, '=', 1) for p in input.split(';')) # no
'=value' part is allowable
server, version= splitpad(environ.get('SERVER_SOFTWARE', ''), '/',
1) # might not have a version
And so on. (Obviously these have an internetty bias, where “be liberal
in what you accept” is desirable.)
> For splitting email address, I think I would want to know if the address turned
> out to be invalid (e.g. it does not contain exactly 1 @s)
Maybe, maybe not. In this case I wanted to accept the case of a bare
username, with or without ‘@’, as a local user. An empty string instead
of an exception for a missing part is something I find very common; it
kind of fits with Python's “string processing does what you usually
want” behaviour (as compared to other languages that are still tediously
throwing exceptions when you try to slice outside the string length range).
For example with an HTTP command (eg. “GET / HTTP/1.0”):
method, path, version= splitpad(command, ' ', 2)
‘version’ might be missing, on ancient HTTP/0.9 clients. ‘path’ could be
missing, on malformed requests. In either of those cases I don't want an
exception, and I don't particularly want to burden my split code with
extra checking; I'll probably have to do further checking on ‘path’
anyway so setting it to an empty string is the best I can do here.
The alternative I use if I can't be bothered to define splitpad() again
is something like:
parts= command.split(' ', 2)
method= parts[0]
path= parts[1] if len(parts)>=2 else ''
....
which is pretty ugly.
--
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/
More information about the Python-ideas
mailing list