[Tutor] How to list/process files with identical character strings

Mark Lawrence breamoreboy at yahoo.co.uk
Wed Jun 25 00:27:45 CEST 2014


On 24/06/2014 22:01, mark murphy wrote:
> Hi Danny, Marc, Peter and Alex,
>
> Thanks for the responses!  Very much appreciated.
>
> I will take these pointers and see what I can pull together.
>
> Thanks again to all of you for taking the time to help!
>
> Cheers,
> Mark
>
>
> On Tue, Jun 24, 2014 at 4:39 PM, Danny Yoo <dyoo at hashcollision.org> wrote:
>
>     The sorting approach sounds reasonable.  We might even couple it with
>     itertools.groupby() to get the consecutive grouping done for us.
>
>     https://docs.python.org/2/library/itertools.html#itertools.groupby
>
>
>     For example, the following demonstrates that there's a lot that the
>     library will do for us that should apply directly to Mark's problem:
>
>     #########################################
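>     from __future__ import print_function  # print() on Python 2 and 3
>     import itertools
>
>     def firstTwoLetters(s):
>         # Grouping key: the first two characters of each word.
>         return s[:2]
>
>     # groupby() only merges consecutive items, so sort the words first.
>     with open('/usr/share/dict/words') as f:
>         words = sorted(f)
>
>     grouped = itertools.groupby(words, key=firstTwoLetters)
>
>     # Show each two-letter key with up to five of its words.
>     for k, g in grouped:
>         print(k, list(g)[:5])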
>     #########################################

To really overwhelm you, see also more_itertools.pairwise, which yields
overlapping pairs of consecutive items from an iterable. It is documented at
http://pythonhosted.org//more-itertools/api.html and I've found it useful
on several occasions.
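
A minimal sketch of how pairwise could apply to sorted file names. To keep it
free of third-party installs, it defines pairwise from the standard itertools
recipe, which as far as I know behaves the same way as
more_itertools.pairwise. The file names and the two-character prefix are made
up for illustration and are not from Mark's actual data.

```python
from itertools import tee

def pairwise(iterable):
    # Standard itertools recipe: s -> (s0, s1), (s1, s2), (s2, s3), ...
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

# Hypothetical file names; sorting puts equal prefixes next to each other.
names = sorted(["cd_01.txt", "ab_02.txt", "ef_01.txt", "ab_01.txt", "cd_02.txt"])

# Compare each name with its neighbour and keep the pairs where the
# two-character prefix changes, i.e. where one group ends and the next begins.
boundaries = [(prev, cur) for prev, cur in pairwise(names)
              if prev[:2] != cur[:2]]

print(boundaries)
```

This only marks where the groups change. For collecting the groups themselves,
the groupby approach quoted above is probably the more direct tool.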

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

