[Tutor] How to list/process files with identical character strings
Mark Lawrence
breamoreboy at yahoo.co.uk
Wed Jun 25 00:27:45 CEST 2014
On 24/06/2014 22:01, mark murphy wrote:
> Hi Danny, Marc, Peter and Alex,
>
> Thanks for the responses! Very much appreciated.
>
> I will take these pointers and see what I can pull together.
>
> Thanks again to all of you for taking the time to help!
>
> Cheers,
> Mark
>
>
> On Tue, Jun 24, 2014 at 4:39 PM, Danny Yoo <dyoo at hashcollision.org>
> wrote:
>
> The sorting approach sounds reasonable. We might even couple it with
> itertools.groupby() to get the consecutive grouping done for us.
>
> https://docs.python.org/2/library/itertools.html#itertools.groupby
>
>
> For example, the following demonstrates that there's a lot that the
> library will do for us that should apply directly to Mark's problem:
>
> #########################################
> import itertools
> import random
>
> def firstTwoLetters(s): return s[:2]
>
> grouped = itertools.groupby(
>     sorted(open('/usr/share/dict/words')),
>     key=firstTwoLetters)
>
> for k, g in grouped:
>     print k, list(g)[:5]
> #########################################
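
For what it's worth, here is a rough sketch of how the same sort-then-groupby idea might look when applied to file names instead of dictionary words. The names in the list are made up for illustration (in practice they would probably come from something like os.listdir()), the two-character prefix is only an assumption about what "identical character strings" means for the files in question, and the print call is written in Python 3 style.

```python
import itertools


def first_two_letters(s):
    # Grouping key: the leading two characters of each name.
    return s[:2]


# Hypothetical file names, purely for illustration.
names = ["ab_02.txt", "cd_01.txt", "ab_01.txt", "cd_02.txt", "ef_01.txt"]

# groupby() only merges *adjacent* items with equal keys, hence the sort.
groups = {
    key: list(group)
    for key, group in itertools.groupby(sorted(names), key=first_two_letters)
}

for key, members in groups.items():
    print(key, members)
```

Each group is turned into a list straight away because the group iterators share the underlying iterator and stop being usable once groupby() moves on to the next key.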
In order to really overwhelm you, see more_itertools.pairwise here:
http://pythonhosted.org//more-itertools/api.html
I've found it useful on several occasions.
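
As a small sketch of what pairwise gives you: the function below is the well-known recipe from the itertools documentation, written out by hand so that nothing beyond the standard library is needed, rather than an import from more_itertools itself. The file names and the two-character prefix test are invented for illustration only.

```python
import itertools


def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = itertools.tee(iterable)
    next(b, None)  # advance the second copy by one item
    return zip(a, b)


# Hypothetical file names, sorted so that equal prefixes sit side by side.
names = sorted(["ab_02.txt", "cd_01.txt", "ab_01.txt", "ef_01.txt"])

# A boundary is any neighbouring pair whose two-character prefixes differ.
boundaries = [
    (prev, cur) for prev, cur in pairwise(names) if prev[:2] != cur[:2]
]

for prev, cur in boundaries:
    print("group changes between", prev, "and", cur)
```

Comparing each item with its neighbour like this is handy when you only care about where one run of matching names ends and the next begins, rather than about collecting the whole groups.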
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence