Break large file down into multiple files
Chris
cwitts at gmail.com
Fri Feb 13 06:36:11 EST 2009
On Feb 13, 1:19 pm, Chris <cwi... at gmail.com> wrote:
> On Feb 13, 10:02 am, redbaron <ivanov.ma... at gmail.com> wrote:
>
>
>
> > > New to python.... I have a large file that I need to break up into
> > > multiple smaller files. I need to break the large file into sections
> > > where there are 65535 lines and then write those sections to separate
> > > files.
>
> > If your lines are variable-length, then look at itertools recipes.
>
> > from itertools import izip_longest
>
> > def grouper(n, iterable, fillvalue=None):
> >     "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
> >     args = [iter(iterable)] * n
> >     return izip_longest(fillvalue=fillvalue, *args)
>
> > with open("/file","r") as f:
> >     for lines in grouper(65535,f,""):
> >         # lines read from a file already end in "\n", so join on ""
> >         data_to_write = ''.join(lines)
> >         ...
> >         <write data where you need it here>
> >         ...
>
> I really would not recommend joining a large number of lines; that will
> take some time.
>
> fIn = open(input_filename, 'rb')
> chunk_size = 65535
>
> for i,line in enumerate(fIn):
>     # First Line in the File, create a file to start writing to
>     if not i:
>         filenum = '%04d'%(i%chunk_size)+1
>         fOut = open('%s.txt'%filenum, 'wb')
>     # Once at the chunk_size close the old file object and create a new one
>     if i and not i % chunk_size:
>         fOut.close()
>         filenum = '%04d'%(i%chunk_size)+1
>         fOut = open('%s.txt'%filenum, 'wb')
>     if not i % 1000:
>         fOut.flush()
>     fOut.write(line)
>
> fOut.close()
> fIn.close()
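Whoops, day-dreaming mistake. Use "filenum = '%04d' % (i // chunk_size + 1)"
and not i%chunk_size. The +1 also has to sit inside the parentheses; as
posted it is added to the already formatted string, which raises a
TypeError.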
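
For reference, here is roughly what the loop looks like with that fix
folded in. Treat it as a sketch rather than a drop-in: the function name
split_file and the out_dir parameter are made up for illustration, and the
output files are assumed to be named 0001.txt, 0002.txt, ... as in the
code quoted above.

```python
import os

def split_file(input_filename, chunk_size=65535, out_dir='.'):
    """Copy input_filename into numbered files of at most chunk_size lines.

    Hypothetical helper: returns the list of paths it wrote.
    """
    written = []
    f_out = None
    f_in = open(input_filename, 'rb')
    try:
        for i, line in enumerate(f_in):
            # Start a new output file on line 0 and on every multiple
            # of chunk_size after that.
            if i % chunk_size == 0:
                if f_out is not None:
                    f_out.close()
                # Integer division gives the chunk index; +1 makes the
                # names start at 0001 and must be inside the parentheses.
                filenum = '%04d' % (i // chunk_size + 1)
                path = os.path.join(out_dir, '%s.txt' % filenum)
                f_out = open(path, 'wb')
                written.append(path)
            f_out.write(line)
    finally:
        # Close whatever is still open, even if a write failed.
        if f_out is not None:
            f_out.close()
        f_in.close()
    return written
```

Calling split_file('big.log') would then leave 0001.txt, 0002.txt and so
on in the current directory. The periodic flush() from the original is
left out here because closing each file flushes it anyway.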