Need help in extracting lines from word using python
Dave Angel
davea at davea.name
Tue Mar 19 10:54:24 EDT 2013
On 03/19/2013 10:20 AM, razinzamada at gmail.com wrote:
> I'm currently trying to extract some data between 2 lines of an input file
Your subject line says "from word". I'm only guessing that you might
mean Microsoft Word, a proprietary program that does not, by default,
save text files. The following code and description assume a text
file, so there's a contradiction.
> using Python. the infile is set up such that there is a line -START- where I need the next 10 lines of code if and only if the -END- condition occurs before the next -START-. The -START- line occurs many times before the -END-. Heres a general example of what I mean:
>
In other words, you want to scan for -END-, then go backwards to -START-
and use the first ten of the lines between? Try coding it that way, and
perhaps it'll be easier.
You also need to consider (and specify behavior for) the possibility
that start and end are less than 10 lines apart.
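A minimal sketch of that idea, written as a single forward pass that remembers the most recent -START- rather than literally scanning backwards (the result is the same). The function name extract_blocks and its parameters are made up for illustration. It assumes the -START- line itself is not wanted, and that when fewer than 10 lines separate the markers you simply get the lines that are there.

```python
def extract_blocks(lines, start="-START-", end="-END-", count=10):
    """Return, for each -END-, up to `count` lines following the nearest preceding -START-."""
    blocks = []
    last_start = None  # index of the most recent start marker not yet closed by an end
    for i, line in enumerate(lines):
        if start in line:
            # A newer start replaces any earlier one that never reached an end.
            last_start = i
        elif end in line and last_start is not None:
            first = last_start + 1
            # Stop at the end marker if it comes sooner than `count` lines.
            blocks.append(lines[first:min(first + count, i)])
            last_start = None
    return blocks


sample = ["blah", "-START-", "skip1", "-START-", "keep1", "keep2",
          "-END-", "blah", "-START-", "skip2"]
blocks = extract_blocks(sample)
```

For a real file you would pass in the result of readlines() and write each block with out.writelines(block).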
> blah
> blah
> -START-
> 10 lines I DONT need
> blah
> -START-
> 10 lines I need
> blah
> blah
> -END-
> blah
> blah
> -START-
> 10 lines I dont need
> blah
> -START-
>
> .... and so on and so forth
>
> so far I have only been able to get the -START- + 10 lines for every iteration, but am at a total loss when it comes to specifying the condition to only write if the -END- condition comes before another -START- condition. I'm a bit of a newb, so any help will be greatly appreciated.
>
>
> heres the code I have for printing the -START- + 10 lines:
>
> infile = open('input.log')   # "in" is a reserved word, so it can't be a variable name
> out = open('output.txt', 'a')
>
> lines = infile.readlines()
> for i, line in enumerate(lines):
>     if (line.find('START')) > -1:
>         out.write(line)
>         out.write(lines[i + 1])
>         out.write(lines[i + 2])
>         out.write(lines[i + 3])
>         out.write(lines[i + 4])
>         out.write(lines[i + 5])
>         out.write(lines[i + 6])
>         out.write(lines[i + 7])
>         out.write(lines[i + 8])
>         out.write(lines[i + 9])
>         out.write(lines[i + 10])
or just out.writelines(lines[i:i+11]) to write out all 11 of them.
(write() takes a single string, so a slice needs writelines().)
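A small sketch of the slice form, using io.StringIO as a stand-in for the output file so it runs without any files on disk (the sample lines are made up). One side benefit worth noting: a slice that runs past the end of the list is simply shortened, whereas lines[i + 10] raises IndexError when -START- is near the end of the file.

```python
import io

lines = ["a\n", "-START-\n", "b\n", "c\n"]
out = io.StringIO()  # stand-in for open('output.txt', 'a')

for i, line in enumerate(lines):
    if "START" in line:
        # Writes the START line plus up to 10 following lines.
        out.writelines(lines[i:i + 11])

result = out.getvalue()
```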
>
--
DaveA