[Tutor] Concatenating multiple lines into one

Hugo Arts hugo.yoshi at gmail.com
Fri Feb 10 18:00:32 CET 2012


On Fri, Feb 10, 2012 at 5:38 PM, Spyros Charonis <s.charonis at gmail.com> wrote:
> Dear python community,
>
> I have a file where I store sequences that each have a header. The structure
> of the file is as such:
>
> >sp|(some code) => 1st header
> ATTTTGGCGG
> MNKPLOI
> .....
> .....
>
> >sp|(some code) => 2nd header
> AAAAAA
> GGGG ...
> .........
>
> ......
>
> I am looking to implement a logical structure that would allow me to group
> each of the sequences (spread on multiple lines) into a single string. So
> instead of having the letters spread on multiple lines I would be able to
> have 'ATTTTGGCGGMNKP....' as a single string that could be indexed.
>
> This snippet is good for isolating the sequences (=stripping headers and
> skipping blank lines) but how could I concatenate each sequence in order to
> get one string per sequence?
>
> >>> for line in align_file:
> ...     if line.startswith('>sp'):
> ...             continue
> ...     elif not line.strip():
> ...             continue
> ...     else:
> ...             print line
>
> (... is just OS X terminal notation, nothing programmatic)
>
> Many thanks in advance.
>
> S.
>

Python has a simple method for that, str.join. Let me demonstrate:

>>> a = ['a', 'b', 'c', 'd', 'e']
>>> ''.join(a)
'abcde'
>>> ' '.join(a) # with a space
'a b c d e'
>>> ' hello '.join(a) # go crazy if you want
'a hello b hello c hello d hello e'

So it takes a list (or any other iterable of strings) as its argument
and joins the elements together into a single string. The string you
call join on is used as the separator between the elements. Pretty
simple.
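
Applied to your file, one way to do it might look like the sketch
below. The function name join_sequences and the sample lines are made
up for illustration, and it assumes every header starts with '>sp' as
in your snippet. The idea is to collect the lines of the current
sequence in a list and join them whenever a new header (or the end of
the input) is reached:

```python
def join_sequences(lines):
    """Return one concatenated string per sequence (hypothetical helper)."""
    sequences = []  # finished sequences, one string each
    parts = []      # lines of the sequence currently being read
    for line in lines:
        line = line.strip()
        if line.startswith('>sp'):
            # a new header ends the previous sequence, if there was one
            if parts:
                sequences.append(''.join(parts))
                parts = []
        elif line:
            # non-blank, non-header line: part of the current sequence
            parts.append(line)
    # the last sequence has no header after it, so flush it here
    if parts:
        sequences.append(''.join(parts))
    return sequences


# made-up sample data in the same layout as the file described above
sample = [
    '>sp|XXXX first header\n',
    'ATTTTGGCGG\n',
    'MNKPLOI\n',
    '\n',
    '>sp|YYYY second header\n',
    'AAAAAA\n',
    'GGGG\n',
]
result = join_sequences(sample)
```

Here result is a list with one string per sequence, so result[0][3]
gives you the fourth letter of the first sequence. Since a file object
yields its lines when iterated, you should be able to pass align_file
straight to the function instead of the sample list.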

HTH,
Hugo

