splitting by double newline
Peter Otten
__peter__ at web.de
Mon Feb 7 13:20:38 EST 2011
Nikola Skoric wrote:
> Hello everybody,
>
> I'd like to split a file by double newlines, but portably. Now,
> splitting by one or more newlines is relatively easy:
>
> self.tables = re.split("[\r\n]+", bulk)
>
> But, how can I split on double newlines? I tried several approaches,
> but none worked...
If you open the file in universal newline mode with
    with open(filename, "U") as f:
        bulk = f.read()
your data will only contain "\n". You can then split with
    blocks = bulk.split("\n\n") # exactly one empty line
or
    blocks = re.compile(r"\n{2,}").split(bulk) # one or more empty lines
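To make the difference between the two splits concrete, here is a minimal sketch. It assumes Python 3, where text mode already translates "\r\n" and "\r" to "\n" on reading by default (the "U" flag was removed in Python 3.11). The sample data and the temporary file are made up for illustration.

```python
import os
import re
import tempfile

# Write a small sample file with mixed line endings (made-up data).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"a\r\nb\r\n\r\nc\r\r\rd\n")
    filename = tmp.name

# Python 3 text mode with the default newline=None performs
# universal newline translation, so bulk contains only "\n".
with open(filename) as f:
    bulk = f.read()
os.remove(filename)

# Split on exactly two newlines: a run of three leaves a stray "\n".
exact = bulk.split("\n\n")

# Split on two or more newlines: the whole run is one separator.
loose = re.split(r"\n{2,}", bulk)

print(repr(bulk))
print(exact)
print(loose)
```

With two empty lines between "c" and "d", the plain split leaves a leading "\n" on the last block, while the regex variant swallows the whole run.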
One last variant that doesn't read in the whole file and accepts lines with
only whitespace as empty:
    import itertools
    with open(filename, "U") as f:
        blocks = ("".join(group) for empty, group in
                  itertools.groupby(f, key=str.isspace) if not empty)
        # blocks is lazy: iterate over it here, while f is still open
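The same groupby idea can be wrapped in a generator function, which makes it easier to see what comes out. This is only a sketch: iter_blocks is a hypothetical name, and the list of lines stands in for an open file object.

```python
import itertools

def iter_blocks(lines):
    """Yield each run of consecutive non-blank lines as one string.

    Hypothetical helper; `lines` is any iterable of lines that keep
    their trailing newline, such as an open text file.
    """
    # groupby splits the stream into alternating runs of
    # whitespace-only lines (key True) and all other lines (key False).
    for empty, group in itertools.groupby(lines, key=str.isspace):
        if not empty:
            yield "".join(group)

# Made-up sample input; "  \t\n" counts as empty because it is
# whitespace only.
sample = ["a\n", "b\n", "\n", "  \t\n", "c\n", "\n", "d\n"]
blocks = list(iter_blocks(sample))
print(blocks)
```

With a real file you would loop over iter_blocks(f) inside the with block, so that only one block at a time is held in memory.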