beginner's questions - manipulating text files

Ben Finney bignose+hates-spam at benfinney.id.au
Wed Jul 2 12:57:00 CEST 2008


Cédric Lucantis <omer at no-log.org> writes:

> On Wednesday 02 July 2008 01:16:30, Ben Keshet wrote:
> > I am trying to write a script that will search for the second and
> > third appearance of the symbol '@' in a file, will read a random
> > line between them, and write the line into a new file.
> 
> If the file you're reading is not too big, you can use
> file.readlines(), which reads the whole file and returns its contents
> as a list of lines.

Better is to iterate over the file object, getting a line each time.
This works regardless of the size of the file, because it doesn't
attempt to read the entire file into memory at once.
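
A minimal sketch of that iteration pattern, for illustration only. The
function name 'count_separator_lines' is made up here, and io.StringIO
stands in for a real file so the example runs on its own; an object
returned by open("foo.txt") can be iterated the same way.

```python
import io


def count_separator_lines(lines, separator="@"):
    """Count the lines that contain separator, reading one line at a time."""
    count = 0
    for line in lines:
        # Only the current line is held in memory on each pass.
        if separator in line:
            count += 1
    return count


# Stand-in for a real file object (assumption: any iterable of lines works).
sample = io.StringIO("header\n@ first\nbody\n@ second\n")
print(count_separator_lines(sample))
```

Because the loop only ever looks at the current line, the same code
should behave the same way on a file of a few lines or a few gigabytes.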

> text.find('@') will return the position of the first occurrence of
> '@', or -1 if not found.

If one is interested only in *whether* text is contained within a
string (and is uninterested in its position), the 'in' operator
returns a boolean value.
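
A short comparison of the two, as a sketch; the sample string is
invented for the example.

```python
text = "user@example.org"

# str.find gives the index of the first match, or -1 when there is none.
position = text.find("@")
missing = text.find("#")

# The 'in' operator answers only the yes/no question.
has_at = "@" in text
has_hash = "#" in text

print(position, missing, has_at, has_hash)
```

One reason to prefer 'in' for a plain membership test: the result of
find() is easy to misuse as a condition, since a match at index 0 is
false in a boolean context while the "not found" value -1 is true.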

Untried code:

    import random

    separator = "@"
    interesting_lines = []

    seen_separator_count = 0
    with open("foo.txt") as input_file:
        for line in input_file:
            if separator in line:
                # Count this separator, but don't keep the line.
                seen_separator_count += 1
                if seen_separator_count >= 3:
                    # The third separator ends the region of interest,
                    # so stop reading.
                    break
                continue
            if seen_separator_count == 2:
                # We have seen exactly two lines with separators,
                # so we're interested in the current line.
                interesting_lines.append(line)

    # random.choice raises IndexError on an empty list, so check first.
    if interesting_lines:
        chosen_line = random.choice(interesting_lines)
        with open("bar.txt", 'w') as output_file:
            output_file.write(chosen_line)
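
The same idea can be wrapped in a function so it is easy to try without
touching the disk. This is only a sketch: the name
'lines_between_separators' and its 'start' and 'stop' parameters are
invented for illustration, and io.StringIO again stands in for the
input file.

```python
import io
import random


def lines_between_separators(lines, separator="@", start=2, stop=3):
    """Collect lines after the start-th and before the stop-th separator line."""
    seen = 0
    collected = []
    for line in lines:
        if separator in line:
            # Separator lines are counted but never collected.
            seen += 1
            if seen >= stop:
                break
            continue
        if seen >= start:
            collected.append(line)
    return collected


sample = io.StringIO(
    "@ first\nskip me\n@ second\nkeep 1\nkeep 2\n@ third\nignored\n"
)
candidates = lines_between_separators(sample)
# Guard against an empty list, which random.choice would reject.
chosen_line = random.choice(candidates) if candidates else None
print(repr(chosen_line))
```

Passing any iterable of lines, rather than a file name, keeps the
function independent of where the text comes from.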

-- 
 \         “Smoking cures weight problems. Eventually.” —Steven Wright |
  `\                                                                   |
_o__)                                                                  |
Ben Finney


