[Python-Dev] New string method - splitquoted
Heiko Wundram
me+python-dev at modelnine.org
Thu May 18 09:00:00 CEST 2006
On Thursday 18 May 2006 06:06, Dave Cinege wrote:
> This is useful, but possibly better put into practice as a separate
> method??
I personally don't think it's particularly useful, at least not in the
special case that your patch tries to address.
1) Generally, you won't only have one character that does quoting, but
several. Think of the Python syntax, where you have ", ', """ and ''', which
all behave slightly differently. The logic for " and ' is simple enough to
implement (basically that's what your patch does, and I'm sure it's easy
enough to extend it to accept a range of characters as splitters), but if you
have more complicated quoting operators (such as """), are you sure it's
sensible to implement the logic in split()?
2) What should the result of "this is a \"test string".split(None,-1,'"') be?
An exception (ParseError)? Silently ignoring the missing delimiter, and
returning ['this','is','a','test string']? Ignoring the delimiter altogether,
returning ['this','is','a','"test','string']? I don't think there's one case
to satisfy all here...
3) What about escapes of the delimiter? Your current patch doesn't address
them at all (AFAICT), but what should the escaping character be? Should
"escape processing" take place, i.e., what should the result
of "this is a \\\"delimiter \\test".split(None,-1,'"') be?
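To make questions 2 and 3 concrete, here is a rough sketch. The split_quoted() helper and its `unterminated` parameter are hypothetical names made up for illustration; this is not what the patch does. It handles a single quote character and no escapes, and uses ValueError where a real design might want a dedicated ParseError. For question 3, shlex.split() from the standard library shows one existing set of choices (backslash as the escape character in POSIX mode), which is only one possible answer.

```python
import shlex


def split_quoted(text, quote='"', unterminated="error"):
    """Split on whitespace, keeping quoted runs together.

    Hypothetical helper for illustration only. `unterminated` selects one
    of the three behaviours from question 2: "error", "close" or "literal".
    """
    if unterminated not in ("error", "close", "literal"):
        raise ValueError("unknown policy: %r" % (unterminated,))
    tokens, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == quote:
            end = text.find(quote, i + 1)
            if end != -1:
                # Well-formed quoted run: keep its contents in one token.
                current.append(text[i + 1:end])
                i = end + 1
            elif unterminated == "error":
                raise ValueError("unterminated quote at index %d" % i)
            elif unterminated == "close":
                # Act as if the closing quote sat at the end of the string.
                current.append(text[i + 1:])
                i = len(text)
            else:
                # "literal": the unmatched quote is just another character.
                current.append(ch)
                i += 1
        elif ch.isspace():
            # Whitespace outside quotes ends the token in progress, if any.
            if current:
                tokens.append("".join(current))
                current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:
        tokens.append("".join(current))
    return tokens


sample = 'this is a "test string'
closed = split_quoted(sample, unterminated="close")
literal = split_quoted(sample, unterminated="literal")

# Question 3: in POSIX mode (the default), shlex treats a backslash outside
# quotes as an escape for the following character.
escaped = shlex.split("this is a \\\"delimiter \\test")
```

With these definitions, `closed` is ['this', 'is', 'a', 'test string'], `literal` is ['this', 'is', 'a', '"test', 'string'], and the default policy raises ValueError. `escaped` comes out as ['this', 'is', 'a', '"delimiter', 'test'], i.e. shlex drops the backslashes and keeps the quote as an ordinary character.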
Don't get me wrong, I personally find this functionality very, very
interesting (I'm +0.5 on adding it in some way or another), especially as a
part of the standard library (not necessarily as an extension to .split()).
But there's quite a lot of semantic stuff to get right before you can
implement it properly; see the complexity of the csv module, where you have
to define pretty much all of this in the dialect you use to parse the csv
file...
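For comparison, this is roughly how many of those decisions have to be spelled out in a csv dialect. The SpaceSeparated class name and the sample line are made up for illustration; the attributes themselves are the ones the csv module defines.

```python
import csv


class SpaceSeparated(csv.Dialect):
    """Illustrative dialect: every quoting decision is stated explicitly."""
    delimiter = " "            # what separates fields
    quotechar = '"'            # the single quote character that is allowed
    escapechar = "\\"          # how a quote is escaped inside a quoted field
    doublequote = False        # do not treat "" as an escaped quote
    skipinitialspace = False   # keep whitespace that follows a delimiter
    lineterminator = "\n"      # used by writers; required by the dialect
    quoting = csv.QUOTE_MINIMAL


# The line as the parser sees it: say "a \"quoted\" word"
line = 'say "a \\"quoted\\" word"'
rows = list(csv.reader([line], dialect=SpaceSeparated))
```

Here `rows` is [['say', 'a "quoted" word']]. Note that even this dialect only allows one quote character, so it doesn't cover the case from point 1.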
Why not write up a PEP?
--- Heiko.