[Python-ideas] str.split with padding

Lie Ryan lie.1296 at gmail.com
Sat Mar 14 06:04:18 CET 2009


And Clover wrote:
> Lie Ryan wrote:
> 
>> Can you find a better use case?
> 
> Well here are some random uses from projects that a search on splitpad 
> (one of the names I used for it) is turning up:
> 
>     command, parameters= splitpad(line, ' ', 1) # get SMTP command
>     y, m, d= splitpad(t, '-', 2) # split date, month and day optional
>     headers, body= splitpad(request, '\n\n', 1) # there might be no body
>     table, column= rsplitpad(colname, '.', 1) # extract SQL [table.]column name
>     id, cat, name, price= splitpad(line, ',', 3) # should be four columns, but editor might have lost trailing commas
>     user, pwd= splitpad(base64.decodestring(authtoken), ':', 1) # will always contain ':' unless malformed
>     pars= dict(splitpad(p, '=', 1) for p in input.split(';')) # no '=value' part is allowable
>     server, version= splitpad(environ.get('SERVER_SOFTWARE', ''), '/', 1) # might not have a version
> 
> And so on. (Obviously these have an internetty bias, where “be liberal 
> in what you accept” is desirable.)
> 
>> For splitting an email address, I think I would want to know if the
>> address turned out to be invalid (e.g. it does not contain exactly
>> one @).
> 
> Maybe, maybe not. In this case I wanted to accept the case of a bare 
> username, with or without ‘@’, as a local user. An empty string instead 
> of an exception for a missing part is something I find very common; it 
> kind of fits with Python's “string processing does what you usually 
> want” behaviour (as compared to other languages that are still tediously 
> throwing exceptions when you try to slice outside the string length range).
> 
> For example, with an HTTP command (e.g. “GET / HTTP/1.0”):
> 
>     method, path, version= splitpad(command, ' ', 2)
> 
> ‘version’ might be missing, on ancient HTTP/0.9 clients. ‘path’ could be 
> missing, on malformed requests. In either of those cases I don't want an 
> exception, and I don't particularly want to burden my split code with 
> extra checking; I'll probably have to do further checking on ‘path’ 
> anyway so setting it to an empty string is the best I can do here.
> 
> The alternative I use if I can't be bothered to define splitpad() again 
> is something like:
> 
>     parts= command.split(' ', 2)
>     method= parts[0]
>     path= parts[1] if len(parts)>=2 else ''
>     ....
> 
> which is pretty ugly.
> 
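
For reference, the quoted message never shows splitpad() itself, so here is a minimal sketch of what such a helper might look like. The signature (string, separator, maxsplit) is inferred from the calls quoted above. The `pad` parameter and the choice to pad rsplitpad() on the left are my assumptions, not something the original states:

```python
def splitpad(s, sep, maxsplit, pad=''):
    """Split s on sep at most maxsplit times and always return
    maxsplit + 1 items, filling missing trailing items with pad.

    Hypothetical reconstruction; `pad` is an assumed extra parameter.
    """
    parts = s.split(sep, maxsplit)
    return parts + [pad] * (maxsplit + 1 - len(parts))


def rsplitpad(s, sep, maxsplit, pad=''):
    """Like splitpad, but splits from the right and (by assumption)
    pads on the left, so the last item is always the rightmost field."""
    parts = s.rsplit(sep, maxsplit)
    return [pad] * (maxsplit + 1 - len(parts)) + parts


# The HTTP example from the quoted message:
method, path, version = splitpad('GET /', ' ', 2)   # version is ''

# The SQL [table.]column example:
table, column = rsplitpad('price', '.', 1)          # table is ''
```

With this, the unpacking never raises ValueError for a short input, which also means a malformed value (such as an address without '@') passes through silently unless it is checked afterwards.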

I am honestly quite surprised: 
http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html



