"in" operator for strings
Magnus Lie Hetland
mlh at idi.ntnu.no
Sat Feb 3 04:21:20 CET 2001
"Alex Martelli" <aleaxit at yahoo.com> wrote in message
news:95ep620rbt at news1.newsguy.com...
> "Magnus Lie Hetland" <mlh at idi.ntnu.no> wrote in message
> news:95ebcv$bn$1 at tyfon.itea.ntnu.no...
> > only allow contiguous subsequences, or any subsequence?
> > If we ever get built-in sets contiguity would be meningless,
> > as would it be if we get "in" tests for dictionaries.)
> Dictionaries aren't sequences, nor will sets be, so the
> issue doesn't apply to either.
Sure it does. It will then become the question of whether to
allow the same operator do check membership and subset-ness.
> Implementing __getitem__, to enumerate all
> subsequences (contiguous and non) is also
> of some interest -- a typical combinatorics
> but, that's lazy -- surely some closed form
> exists (but there is no space in the margins
> of this post for me to write it down...).
> finding a closed-form expression
> of "N-th contiguous subsequence" sounds much more fun,
> and it might in fact have some use (e.g., "get a random
> contiguous subsequence with all contiguous subsequences
> having equal probability" -- if you can enumerate such
> subsequences in closed form then that's a solved problem
> I suspect a modified Bayer-Moore
> could be very fast and
> effective for contiguous-subsequence checks -- sublinear
> just like "real" Bayer-Moore!-)
I'm sure :-)
What's the built-in string-matching algorithm in Python?
(Not very important to me, but...)
> with bisection (binary search) over the
> 'places' sequences, O(M log N) appears like an easy
> upperbound, and maybe it can be cut down further.
Well... If you could store an array of size N for each
letter you could easily do it in O(1) time... But then
you'd need O(M*N) time to initialize the stuff :-)
If you add a requirement that the matched letters must
be relatively close to each other (within a constant
distance limit) you can get a speedup. (A reasonable
requirement in some cases.) Actually, I know of some
harwdare that does this very quickly.
> > I think it's quite OK to use
> > from string import split
> In a small module, maybe. Remember the status of local
> import statements is in doubt... they may be disallowed
> in Python 2.1 (hopefully they won't be, but, who knows),
> so it's not clear that you could do this inside a function.
> "doubtful" seems a fair assessment to me -- there ARE
> doubts, depending on context; it's not necessarily "quite
> OK" -- it may or may not be.
OK... I guess that's one way of interpreting the statement...
A rather mild one, IMO :-)
To me the module doesn't even have to be small. Just think
about all the other function and class names you define in
a module... Why should "split" specifically have to have
a module prefix? If I choose to import it that may be as
thought through as defining a function myself. Only if I
have another function called "split" would I feel the
need to add its module name. I do the same thing when
importing classes in Java, for instance. When I instantiate
a JFrame I find it more convenient to write
mainFrame = new JFrame();
mainFrame = new javax.swing.JFrame();
(OK... So I know you dislike the word "convenient". Well...
I find it prettier too ;-)
> Why not just use string.split and play it safe...?\
> Or, just as safe, the string-method...?
If in some project the other method seems unsafe, I certainly
> > I mean -- I'm not against string methods. They're nice. I
> > just get a bit woozy seeing them called on literals. It
> I think we'll all get used to that in time.
I guess I'll have to. But I don't think they are all
That means 'join the elements of somesequence by " "'
On the other hand,
means 'extend  by somesequence'
Why not write somesequence.extend() or
somesequence.join(" ") (not both, of course) so that they would
be consistent with each other? Oh, well. I'm sure it's just
that I don't see the Greatness of it all.
> Why would mutators be any more worthy of syntax-sugar
> support than accessors? You've lost me here...
OK. Accessors are fine, but we hardly need those in Python
The " ".join([1,2,3]) doesn't seem like an accessor to me.
It seems like a plain function with a weird syntax.
But, hey, since this is the way it *is*, I'm the one with
a problem here :-)
Magnus Lie Hetland (magnus at hetland dot org)
"Reality is what refuses to disappear when you stop
believing in it" -- Philip K. Dick
More information about the Python-list