Splitting Text files
William Park
opengeometry at NOSPAM.yahoo.ca
Tue Jul 2 16:51:07 EDT 2002
Ken Seergobin <kseergobin at sympatico.ca> wrote:
> "William Park" <opengeometry at NOSPAM.yahoo.ca> wrote in message
> news:aft0r5$go61b$3 at ID-99293.news.dfncis.de...
>
>> Perhaps, you should remove 'X-No-Archive'. Most people won't give
>> answers, let alone reply, to such posts.
>
> Personally, I'm not thrilled by every scrap of information being
> recorded. However, if the no-archive option makes getting information
> easier, the content of the original post will be repeated in this message.
> (That said, I do understand why posts like these should be archived.)
>
> Original Post:
>
> I've looked around, but have been unable to locate a good example of how
> to split a text file. Specifically, I have datafiles with an
> identification line marked with the name of a BMP file followed by many
> lines of data. This repeats a number of times for each datafile. Within
> the data lines I'm only interested in extracting those with a
> specific keyword. Ultimately, I'd like to have a datafile for each BMP
> listed in the original file.
>
> Suggestions would be appreciated. I really couldn't make sense of the
> regular expression notes I found.
>
> Thanks,
> Ken
I'm guessing that your data file looks something like
file1.bmp
...<data lines>...
...
file2.bmp
...<data lines>...
...
file3.bmp
...
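1. In the shell (GNU csplit), you could do something like
csplit -z file '/\w*\.bmp/' '{*}'    # writes xx00, xx01, ...
(-z drops the empty first piece that csplit otherwise writes when the
file starts with a matching line.)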
mv xx00 file1.bmp
mv xx01 file2.bmp
...
2. However, since you're only interested in those data lines with certain
keywords, simply do
egrep -e '^file[0-9]\.bmp$' -e 'your_search_pattern' file
or
for x in xx[0-9][0-9]; do
    egrep 'your_search_pattern' "$x"
done
Translating these to Python is left as an exercise for the reader. ;-)
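For anyone who wants a starting point, here is a rough Python sketch of the
same idea: treat each line that is just a BMP file name as a section header,
and keep only the data lines that contain the keyword. It assumes the data
file layout guessed above. The names split_by_bmp, write_sections and HEADER,
and the '.txt' suffix, are made up for illustration, and the keyword is
matched as a plain substring rather than a regular expression.

```python
import re

# Assumption: a header line consists only of a BMP file name, e.g. "file1.bmp".
HEADER = re.compile(r'^\S+\.bmp$', re.IGNORECASE)


def split_by_bmp(lines, keyword):
    """Group the lines containing `keyword` under the most recent header line."""
    sections = {}
    current = None
    for line in lines:
        text = line.rstrip('\n')
        if HEADER.match(text):
            # Start a new section named after the BMP file.
            current = text
            sections.setdefault(current, [])
        elif current is not None and keyword in text:
            # Keep only the data lines that mention the keyword.
            sections[current].append(text)
    return sections


def write_sections(sections, suffix='.txt'):
    """Write one output file per BMP name (hypothetical naming: name + suffix)."""
    for name, rows in sections.items():
        with open(name + suffix, 'w') as out:
            out.writelines(row + '\n' for row in rows)
```

Usage would be something like

    with open('datafile') as f:
        write_sections(split_by_bmp(f, 'your_keyword'))

which should leave file1.bmp.txt, file2.bmp.txt, ... in the current directory.
Data lines that appear before the first header are ignored.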
--
William Park, Open Geometry Consulting, <opengeometry at yahoo.ca>
8-CPU Cluster, Hosting, NAS, Linux, LaTeX, python, vim, mutt, tin