Python(2.5) reads an input file FASTER than pure C(Mingw)

hdante hdante at gmail.com
Sat Apr 26 14:48:04 EDT 2008


On Apr 26, 12:10 pm, n00m <n... at narod.ru> wrote:
> Both codes below read the same huge(~35MB) text file.
> In the file > 1000000 lines, the length of each line < 99 chars.
>
> Stable result:
> Python runs ~0.65s
> C : ~0.70s
>
> Any thoughts?
>
> import time
> t=time.time()
> f=open('D:\\some.txt','r')
> z=f.readlines()
> f.close()
> print len(z)
> print time.time()-t
> m=input()
> print z[m]
>
> #include <cstdio>
> #include <cstdlib>
> #include <iostream>
> #include <ctime>
>
> using namespace std;
> char vs[1002000][99];
> FILE *fp=fopen("D:\\some.txt","r");
>
> int main() {
>     int i=0;
>     while (true) {
>         if (!fgets(vs[i],999,fp)) break;
>         ++i;
>     }
>     fclose(fp);
>     cout << i << endl;
>     cout << clock()/CLOCKS_PER_SEC << endl;
>
>     int m;
>     cin >> m;
>     cout << vs[m];
>     system("pause");
> return 0;
>
> }
>
>

 First, try again with pure C code compiled with a C compiler, not
C++ code compiled with a C++ compiler.
 Then tweak the code to use more buffering, so that it is closer to
what readlines() does, like this (not tested):

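#include <stdio.h>
#include <time.h>

#define MAX_LINES 1002000
#define LINE_LEN  100          /* up to 98 chars + '\n' + '\0' */

char vs[MAX_LINES][LINE_LEN];
char buffer[65536];            /* stdio buffer handed to setvbuf() */

int main(void) {
    FILE *fp;
    int i, m;
    clock_t begin, end;
    double t;

    begin = clock();
    fp = fopen("cvspython.txt", "r");
    if (!fp) {
        perror("cvspython.txt");
        return 1;
    }
    /* setvbuf() must come before any other operation on the stream */
    setvbuf(fp, buffer, _IOFBF, sizeof(buffer));
    i = 0;
    while (i < MAX_LINES) {    /* never write past the end of vs[] */
        if (!fgets(vs[i], LINE_LEN, fp)) break;
        ++i;
    }
    fclose(fp);
    printf("%d\n", i);
    end = clock();
    t = (double)(end - begin)/CLOCKS_PER_SEC;
    printf("%g\n", t);

    /* only index vs[] with a line number that was actually read */
    if (scanf("%d", &m) == 1 && m >= 0 && m < i)
        printf("%s\n", vs[m]);
    getchar();
    return 0;
}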
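
 If the buffered fgets() loop is still slower, another thing to try is
reading the whole file with a single fread() and splitting it in memory.
The sketch below is only an illustration and has not been run against
your file. slurp() and split_lines() are names made up for this example,
and it makes no claim about how Python implements readlines() internally.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read a whole file into one malloc'd buffer.
   Opens in binary mode so the size from ftell() matches what fread()
   returns; on Windows this means lines keep their trailing '\r'.
   Returns NULL on failure; on success *len holds the byte count and
   the buffer is NUL-terminated. */
static char *slurp(const char *path, size_t *len)
{
    FILE *fp = fopen(path, "rb");
    char *buf = NULL;
    long size;

    if (!fp) return NULL;
    if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0 &&
        fseek(fp, 0, SEEK_SET) == 0 &&
        (buf = malloc((size_t)size + 1)) != NULL) {
        *len = fread(buf, 1, (size_t)size, fp);
        buf[*len] = '\0';
    }
    fclose(fp);
    return buf;
}

/* Hypothetical helper: split buf in place at each '\n' and return a
   malloc'd array of pointers to the line starts. Newlines are replaced
   by '\0', so every entry is an ordinary C string without its '\n'.
   Returns NULL if memory runs out; *count receives the line count. */
static char **split_lines(char *buf, size_t len, size_t *count)
{
    size_t n = 0, cap = 1024;
    char **lines;
    char *p = buf, *end = buf + len;

    assert(buf != NULL && count != NULL);
    lines = malloc(cap * sizeof *lines);
    if (!lines) return NULL;
    while (p < end) {
        char *nl = memchr(p, '\n', (size_t)(end - p));
        if (n == cap) {                  /* grow the pointer array */
            char **tmp = realloc(lines, (cap *= 2) * sizeof *lines);
            if (!tmp) { free(lines); return NULL; }
            lines = tmp;
        }
        lines[n++] = p;
        if (!nl) break;                  /* last line had no '\n' */
        *nl = '\0';
        p = nl + 1;
    }
    *count = n;
    return lines;
}
```

 To use it, call slurp() on the file, pass the result to split_lines(),
and index the returned array the same way as vs[] above. Both the buffer
and the array must be released with free() when done.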

 Finally, repeat your statement, if it still holds.


