[Tutor] Find (list) strings in large textfile

Sylwester Graczyk sylwester.graczyk at gmail.com
Sat Feb 11 15:30:53 EST 2017


>
> It is probably better to store your key file in memory
> then loop over the large data file and check the
> line against each key. Better to check 2000 data
> keys in memory for one loop of the data file.
> That way you only read the key file and data file
> once each - 502,000 reads instead of a billion.
>
I replaced the loop over the key file with a membership test against a
*list* held in memory, and it's much faster :)

    # Load the search keys into memory, one per line.
    my_list = open("list_file.txt")
    file_list = [i.rstrip("\n") for i in my_list]
    my_list.close()

    file_large = open("large_file.txt")
    save_file = open("output.txt", "w")

    # Keep each row whose first whitespace-separated field is a key.
    for row in file_large:
        split_row = row.split()
        if split_row and split_row[0] in file_list:
            save_file.write(row)

    file_large.close()
    save_file.close()
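
The same filter can also be sketched with a set instead of a list. In
Python, `x in some_list` scans the list item by item, while `x in some_set`
is a hash lookup, so with about 2000 keys the set version should do less
work per row. This is only a sketch: the function name `filter_rows` and its
parameters are hypothetical, and it assumes each key sits on its own line
and is compared against the first whitespace-separated field of a row.

```python
def filter_rows(keys_path, data_path, out_path):
    """Copy rows of data_path whose first field is a key from keys_path.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the number of rows written.
    """
    # Read the keys once into a set; blank lines are skipped.
    with open(keys_path) as keys_file:
        keys = {line.strip() for line in keys_file if line.strip()}

    matched = 0
    # The with statement closes both files even if an error occurs.
    with open(data_path) as data_file, open(out_path, "w") as out_file:
        for row in data_file:
            fields = row.split()
            # Guard against empty rows before looking at the first field.
            if fields and fields[0] in keys:
                out_file.write(row)
                matched += 1
    return matched
```

With the file names used above, it would be called as
`filter_rows("list_file.txt", "large_file.txt", "output.txt")`.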


Thanks to all

