Break large file down into multiple files

Chris cwitts at gmail.com
Fri Feb 13 06:19:52 EST 2009


On Feb 13, 10:02 am, redbaron <ivanov.ma... at gmail.com> wrote:
> > New to python.... I have a large file that I need to break up into
> > multiple smaller files. I need to break the large file into sections
> > where there are 65535 lines and then write those sections to seperate
> > files.
>
> If your lines are variable-length, then look at itertools recipes.
>
> from itertools import izip_longest
>
> def grouper(n, iterable, fillvalue=None):
>     "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
>     args = [iter(iterable)] * n
>     return izip_longest(fillvalue=fillvalue, *args)
>
> with open("/file","r") as f:
>     for lines in grouper(65535,f,""):
>         data_to_write = '\n'.join(lines).rstrip("\n")
>         ...
>         <write data where you need it here>
>         ...

I really would not recommend joining a large number of lines; that will
take some time.

fIn = open(input_filename, 'rb')
chunk_size = 65535

for i,line in enumerate(fIn):
    if not i:   # First line in the file, create a file to start writing to
        filenum = '%04d' % (i // chunk_size + 1)
        fOut = open('%s.txt' % filenum, 'wb')
    if i and not i % chunk_size:   # Once at the chunk_size, close the old file object and create a new one
        fOut.close()
        filenum = '%04d' % (i // chunk_size + 1)
        fOut = open('%s.txt' % filenum, 'wb')
    if not i % 1000:
        fOut.flush()
    fOut.write(line)

fOut.close()
fIn.close()
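
For what it's worth, itertools.islice can do the same chunking without the
per-line modulo bookkeeping. This is just a sketch along the same lines
(assuming the same input_filename, 65535-line chunks and zero-padded numbered
output files), not a drop-in replacement:

from itertools import islice

chunk_size = 65535

with open(input_filename, 'rb') as fIn:
    filenum = 0
    while True:
        # Pull up to chunk_size lines at a time; an empty list means EOF
        lines = list(islice(fIn, chunk_size))
        if not lines:
            break
        filenum += 1
        # Write the chunk straight out, no joining needed
        with open('%04d.txt' % filenum, 'wb') as fOut:
            fOut.writelines(lines)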


