Python(2.5) reads an input file FASTER than pure C(Mingw)

hdante hdante at gmail.com
Mon Apr 28 03:17:51 CEST 2008


On Apr 27, 4:54 pm, n00m <n... at narod.ru> wrote:
> Another PC, another OS (Linux) and another compiler C++ (g++ 4.0.0-8)
>
> Compare 2 my latest submissions:http://www.spoj.pl/status/SBANK,zzz/
>
> times: 1.32s and 0.60s
>
> Submitted codes:
>
> import sys
> z=sys.stdin.readlines()
> print z[5]
>
> #include <cstdio>
> #include <cstdlib>
> #include <vector>
> #include <string>
>
> using namespace std;
>
> vector<string> vs;
>
> int main() {
>     while (true) {
>         char line[50];
>         if (!fgets(line,50,stdin)) break;
>         vs.push_back(line);
>     }
> return 0;
>
> }
>
> If it proves nothing then white is black and good is evil

 It seems that the "push_back" line accounts for most of the running
time. Remove it and the execution time drops to 0.25s.
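
 For a sense of why that one line can be expensive, here is a rough C
analogue of the work a growable container of owned strings has to do on
every append: grow the array when it is full, allocate a buffer for the
line, and copy the line into it. This is only a sketch, not how
libstdc++ implements std::vector or std::string; the names line_list
and line_list_push are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of heap-allocated strings. */
struct line_list {
    char **items;
    size_t len, cap;
};

/* Append a copy of `line`; returns 0 on success, -1 if allocation fails. */
int line_list_push(struct line_list *l, const char *line) {
    if (l->len == l->cap) {
        /* Grow geometrically so that appends are amortized O(1). */
        size_t cap = l->cap ? l->cap * 2 : 16;
        char **items = realloc(l->items, cap * sizeof *items);
        if (items == NULL) return -1;
        l->items = items;
        l->cap = cap;
    }
    size_t n = strlen(line) + 1;     /* include the terminating '\0' */
    char *copy = malloc(n);          /* one allocation per line */
    if (copy == NULL) return -1;
    memcpy(copy, line, n);
    l->items[l->len++] = copy;
    return 0;
}
```

 Calling line_list_push once per fgets line means one malloc and one
copy per input line, on top of the occasional realloc.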

 Python's readlines uses fread instead of fgets:
 http://svn.python.org/view/python/tags/r251/Objects/fileobject.c?rev=54864&view=markup
 (see the file_readlines function)

 If you write code that does an fread loop, the execution time drops
to 0.01s.
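
 A bare fread loop could look something like the sketch below. It only
counts bytes and does no line splitting at all; count_bytes and the
CHUNK size of 8192 are assumptions for illustration, not code taken
from the Python source.

```c
#include <stdio.h>

#define CHUNK 8192  /* hypothetical chunk size */

/* Read the whole stream in fixed-size chunks; return the bytes seen. */
size_t count_bytes(FILE *in) {
    static char buf[CHUNK];
    size_t total = 0, n;

    /* fread returns 0 at end of file or on error, which ends the loop. */
    while ((n = fread(buf, 1, CHUNK, in)) > 0)
        total += n;
    return total;
}
```

 Calling count_bytes(stdin) from main gives a baseline for the raw
input cost, before any per-line work is added.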

 This C code takes 0.25s. Almost all of the time is spent on string
manipulation.

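#include <stdio.h>
#include <string.h>

#define B 8192
#define MAXLINES 100000
#define MAXLEN 40

char vs[MAXLINES][MAXLEN];
char buffer[B];

int main(void) {
	size_t count, left = 0, len;
	char *begin, *end, *limit;
	int i = 0;

	while (1) {
		/* Append new data after any partial line carried over. */
		count = fread(buffer + left, 1, B - left, stdin);
		if (count == 0) break;
		begin = buffer;
		limit = buffer + left + count;
		while (1) {
			/* Search only the bytes that were actually read. */
			end = memchr(begin, '\n', limit - begin);
			if (end == NULL) {
				/* Keep the partial line for the next fread. */
				left = limit - begin;
				memmove(buffer, begin, left);
				break;
			}
			/* Copy the line without its newline, truncating if needed. */
			len = end - begin;
			if (len >= MAXLEN) len = MAXLEN - 1;
			memcpy(vs[i], begin, len);
			vs[i][len] = '\0';
			i = (i + 1) % MAXLINES;
			begin = end + 1;
		}
	}
	return 0;
}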

 The difference, 0.60s - 0.25s = 0.35s, is probably mostly Python's
memory management (which seems to be much more efficient than the
std::vector default).

 Very interesting post. :-) I had no idea how well optimized the
built-in library was.




