[New-bugs-announce] [issue1142] code sample showing errors reading large files with py 2.5

christen report at bugs.python.org
Mon Sep 10 17:52:42 CEST 2007

New submission from christen:

Error reading files larger than 4 GB under Windows

Try this:

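import sys
import time
# NOTE: the report does not show how fichout/fichin were opened;
# the file name and the text modes used below are assumptions.
FILENAME = 'bigfile.txt'
print (time.strftime('%Y-%m-%d %H:%M:%S'))
start = time.time()
# Write 85014961 padded lines, enough to pass the 4 GB mark.
fichout = open(FILENAME, 'w')
for i in xrange(85014961):
    if i%5000000==0 and i>0:
        print (i,time.time()-start)
    fichout.write(str(i)+' '*59+'\n')
fichout.close()
print ('total lines written ',i)
print (i,time.time()-start)
print ('*'*50)
start3 = time.time()
# Read the file back line by line; i ends up as the last index seen.
fichin = open(FILENAME)
for i,li in enumerate(fichin):
    if i%5000000==0 and i>0:
        print (i,time.time()-start3)
fichin.close()
print ('total lines read ',i)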

It generates a file larger than 4 GB, and not all lines are read back!
('total lines written ', 85014960)
('total lines read ', 85014950)
10 lines are missing.
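
A rough size estimate shows that this file is well past the 4 GB mark. This is only a sketch and not part of the original script: the helper names digits_total and expected_size are made up for illustration, and the default of two bytes per line ending assumes the file was written in text mode on Windows (CR LF).

```python
def digits_total(n):
    """Total number of characters in str(i) for all i in range(n)."""
    total = 0
    width = 1
    low = 0  # smallest value not yet counted
    while low < n:
        # Values in [low, high) all have exactly `width` digits.
        high = min(n, 10 ** width)
        total += (high - low) * width
        low = high
        width += 1
    return total


def expected_size(n_lines, pad, newline_len=2):
    """Estimated file size in bytes for lines of str(i) + pad spaces + newline.

    newline_len=2 assumes CR LF line endings (an assumption, see above);
    pass 1 for a bare LF.
    """
    return digits_total(n_lines) + n_lines * (pad + newline_len)


n = 85014961
print(expected_size(n, 59))            # with CR LF line endings
print(expected_size(n, 59, 1))         # with LF only
print(expected_size(n, 59) > 2 ** 32)  # past the 4 GB (2**32 bytes) mark?
```

Either way the total is above 2**32 bytes, so the line-ending assumption does not change the conclusion.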

If you replace the write line with:
fichout.write(str(i)+' '*59+'\n')

the file is now under 4 GB and is read properly.
I used both 32-bit and 64-bit Windows XP machines.

It seems to work with Linux and BSD (I did not try this example there,
but had no problems with my home-made big files).
Problem: there are many examples of files larger than 4 GB for the human
genome and other biological applications. I am almost sure that people
are making mistakes, because it took me a while to discover this myself.
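
Since the short read is silent, an independent line count can help spot it. The sketch below counts newline bytes by reading the file in binary chunks; the function name count_lines_binary and the chunk size are made up for illustration, and whether binary reads avoid the problem on the affected Windows builds has not been checked here, so treat it as a cross-check rather than a confirmed workaround.

```python
# Bytes on Python 3, a plain str on Python 2 (which has no b'' literal in 2.5).
NEWLINE = '\n'.encode('ascii')


def count_lines_binary(path, chunk_size=1 << 20):
    """Count newline bytes in a file, reading fixed-size binary chunks."""
    count = 0
    f = open(path, 'rb')
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file
            count += chunk.count(NEWLINE)
    finally:
        f.close()
    return count
```

If this count disagrees with the number of lines seen by a plain "for line in f" loop over the same file, the line-by-line read has dropped data.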
Note: this does not happen with py3k :-)

components: Windows
messages: 55785
nosy: Richard.Christen at unice.fr
severity: urgent
status: open
title: code sample showing errors reading large files with py 2.5
type: behavior
versions: Python 2.5

Tracker <report at bugs.python.org>
