[Tutor] Finding the shortest word in a list of words

Marc Tompkins marc.tompkins at gmail.com
Tue Jan 20 21:14:36 CET 2009

On Tue, Jan 20, 2009 at 11:23 AM, Lie Ryan <lie.1296 at gmail.com> wrote:

> what I meant as wrong is that it is possible that the code would be used
> for a string that doesn't represent human language, but arbitrary array
> of bytes. Also, it is a potential security issue.

This is something I need to know, then - sys.maxint is a potential security
issue?  How?   Should it be avoided?  (Guess I'd better get Googling...)

> > You could just simply use the len of the first word.
> >
> > True dat.  Requires an extra step or two, though - initializing with
> > some impossibly huge number is quick.

> len() is fast, and it also removes the need to import sys, which actually
> removes an extra step or two

Unintended consequence - initializing minLen with the length of the first
word results in trying to append to a list that doesn't exist yet - so must
create minWord and maxWord ahead of time.  (Could use try/except... no.)
Also - it occurred to me that the input might be sentences, and sentences
contain punctuation... I already took the liberty of adding "is" to the end
of the OP's signature quotation; now I add a period:

> corpus = "No victim has ever been more repressed and alienated than the
> truth is."

Now "is." has length 3, not 2; probably not what we had in mind.
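
A quick check of that claim, as a small sketch (the variable names last and
stripped are mine, for illustration only):

```python
import string

corpus = "No victim has ever been more repressed and alienated than the truth is."

# split() only breaks on whitespace, so the period stays glued to the last word
last = corpus.split()[-1]
print(last, len(last))

# Dropping every character found in string.punctuation gives the bare word
stripped = "".join(ch for ch in last if ch not in string.punctuation)
print(stripped, len(stripped))
```
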
So, new version:

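> def MinMax(corpus=""):
>     import string
>     # Strip punctuation so that "is." is counted as "is"
>     corpus = "".join( [x for x in corpus if x not in string.punctuation] )
>     words = corpus.split()
>     if not words:
>         # Empty input: words[0] below would raise IndexError
>         return 0, [], 0, []
>     minLen = len(words[0])
>     maxLen = 0
>     minWord, maxWord = [],[]
>     for word in words:
>         curLen = len(word)
>         # Ties with the current extreme are appended to the list
>         if curLen == minLen:
>             minWord.append(word)
>         if curLen == maxLen:
>             maxWord.append(word)
>         # A new extreme starts a fresh one-word list
>         if curLen > maxLen:
>             maxWord = [word]
>             maxLen = curLen
>         if curLen < minLen:
>             minWord = [word]
>             minLen = curLen
>     return minLen, minWord, maxLen, maxWord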
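
For comparison, here is a rough sketch of the same idea using the built-in
min() and max() instead of tracking the lengths by hand. The name
min_max_builtin is hypothetical, and returning zeros and empty lists for
empty input is my own assumption, not something the OP asked for. It walks
the word list several times, so it trades a little speed for brevity.

```python
import string

def min_max_builtin(corpus=""):
    # Remove punctuation first, as in the loop version above
    cleaned = "".join(ch for ch in corpus if ch not in string.punctuation)
    words = cleaned.split()
    if not words:
        # Assumption: nothing to report for empty input
        return 0, [], 0, []
    lengths = [len(w) for w in words]
    min_len, max_len = min(lengths), max(lengths)
    # Collect every word that ties for shortest / longest, in input order
    min_words = [w for w in words if len(w) == min_len]
    max_words = [w for w in words if len(w) == max_len]
    return min_len, min_words, max_len, max_words

corpus = "No victim has ever been more repressed and alienated than the truth is."
print(min_max_builtin(corpus))
# -> (2, ['No', 'is'], 9, ['repressed', 'alienated'])
```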

Is there a good/efficient way to do this without importing string?
Obviously best to move the import outside the function to minimize
redundancy, but any way to avoid it at all?
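
Two possibilities, offered only as sketches: filter on the string methods
isalnum() and isspace(), or strip a hand-written set of characters from the
ends of each word. The names strip_punctuation, strip_edges and EDGE_CHARS
are made up for illustration, and EDGE_CHARS is an assumed, incomplete set.
Neither is an exact replacement for string.punctuation: the first also
drops symbols that string.punctuation does not list and keeps non-ASCII
letters, while the second leaves punctuation inside a word alone.

```python
def strip_punctuation(text):
    # Keep letters, digits and whitespace; drop every other character
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace())

# Assumed set of characters to trim from word edges (not exhaustive)
EDGE_CHARS = ".,;:!?\"'()[]"

def strip_edges(word):
    # str.strip with an argument removes any of these characters
    # from both ends of the word, but not from the middle
    return word.strip(EDGE_CHARS)

print(strip_punctuation("than the truth is.").split())
print([strip_edges(w) for w in "don't stop (now).".split()])
```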

which, at the time of writing, was my impression on the OP's request.

Quote: "I need to find the shortest / longest word(s) in a sequence of

I'm sure the OP has moved on by now... time I did likewise.