how to split this kind of text into sections
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Fri Apr 25 11:18:33 EDT 2014
On Fri, 25 Apr 2014 21:07:53 +0800, oyster wrote:
> I have a long text, which should be splitted into some sections, where
> all sections have a pattern like following with different KEY. And the
> /n/r can not be used to split
>
> I don't know whether this can be done easily, for example by using RE
> module
[... snip example ...]
> I hope I have state myself clear.
Clear as mud.
I'm afraid I have no idea what you mean. Can you explain the rule you
use to decide whether a line is included, excluded, or part of a
section?
> [demo text starts]
> a line we do not need
How do we decide whether the line is ignored? Is it the literal text "a
line we do not need"?
for line in lines:
    if line == "a line we do not need\n":
        # ignore this line
        continue
> I am section axax
> I am section bbb, we can find that the first 2 lines of this section all
> startswith 'I am section'
Again, is this the *literal* text that you expect?
> .....(and here goes many other text)... let's continue to
> let's continue, yeah
> .....(and here goes many other text)...
> I am using python
> I am using perl
> .....(and here goes many other text)...
> [demo text ends]
>
> the above text should be splitted as a LIST with 3 items, and I also
> need to know the KEY for LIST is ['I am section', 'let's continue', 'I
> am using']:
How do you decide that they are the keys?
> lst=[
> '''I am section axax
> I am section bbb, we can find that the first 2 lines of this section all
> startswith 'I am section'
> .....(and here goes many other text)...''',
>
> '''let's continue to
> let's continue, yeah
> .....(and here goes many other text)...''',
>
>
> '''I am using python
> I am using perl
> .....(and here goes many other text)...'''
> ]
Perhaps it would be better if you show a more realistic example.
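In the meantime, here is a rough sketch based on my best guess at what you want. It assumes the keys are known in advance, that each key is a literal prefix at the very start of a line, and that a new section begins whenever a line starts with a different key from the current one. None of that is stated in your post, and the name split_sections is just something I made up for illustration.

```python
def split_sections(lines, keys):
    """Group lines into (key, lines) sections.

    Assumption: a section starts at the first line that begins with a
    key other than the current section's key. Anything before the
    first key is dropped.
    """
    sections = []       # list of (key, [lines]) pairs
    current_key = None
    for line in lines:
        # Which key, if any, does this line start with?
        matched = next((k for k in keys if line.startswith(k)), None)
        if matched is not None and matched != current_key:
            # A different key means a new section begins here.
            current_key = matched
            sections.append((matched, []))
        if current_key is None:
            continue    # still before the first section, ignore
        sections[-1][1].append(line)
    return sections


text = """a line we do not need
I am section axax
I am section bbb
.....(other text)...
let's continue to
let's continue, yeah
.....(other text)...
I am using python
I am using perl
.....(other text)...
"""

keys = ["I am section", "let's continue", "I am using"]
sections = split_sections(text.splitlines(), keys)
for key, body in sections:
    print(repr(key), len(body))
```

Be warned that if a line in the "other text" happens to start with one of the other keys, this will wrongly start a new section there. Without knowing how you really decide on the keys, I can't do any better than that.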
--
Steven D'Aprano
http://import-that.dreamwidth.org/