<div><div><div>I have about 50 text files with roughly 250,000 rows each. I am reading them in with the function below, which gets me what I want, but it is not fast: it currently takes about 4 minutes. Is there something I am missing that would help? This is mostly a question to help me learn more about Python.</div>
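<div><br></div><div>import csv</div><div>from itertools import dropwhile, takewhile</div><div><br></div><div>def read_data_file(filename):</div>
<div>    # Read the whole tab-delimited file into memory as a list of rows.</div><div>    reader = csv.reader(open(filename, "U"), delimiter='\t')</div><div>    read = list(reader)</div>
<div>    # Data section: every row before the '[MASKS]' marker, minus the header row.</div><div>    data_rows = takewhile(lambda trow: '[MASKS]' not in trow, [x for x in read])</div><div>    data = [x for x in data_rows][1:]</div><div> </div>
<div>    # Mask section: non-empty rows from '[MASKS]' up to '[OUTLIERS]', minus the first three.</div><div>    mask_rows = takewhile(lambda trow: '[OUTLIERS]' not in trow, list(dropwhile(lambda drow: '[MASKS]' not in drow, read)))</div><div>    mask = [row for row in mask_rows if row][3:]</div><div> </div>
<div>    # Outlier section: non-empty rows from '[OUTLIERS]' to the end, minus the first three.</div><div>    outlier_rows = dropwhile(lambda drows: '[OUTLIERS]' not in drows, read)</div><div>    outlier = [row for row in outlier_rows if row][3:]</div><div>    return data, mask, outlier</div><div><br></div><div><br></div><div name="mailplane_signature">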
<p><strong>Vincent Davis</strong></p></div><br></div></div>
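<div>For comparison, here is a minimal sketch of a single-pass alternative. It is not a measured fix, only an illustration of reading each row once instead of scanning the full list three times. It assumes the '[MASKS]' marker always appears before '[OUTLIERS]', and it keeps the same skip counts as the function above (one row for the data section, three for the other two). The name <code>read_sections</code> and the sample text are hypothetical.</div>

```python
import csv
import io


def read_sections(lines):
    """Split tab-delimited rows into (data, mask, outlier) in one pass.

    `lines` is any iterable of text lines, such as an open file.
    Assumes the '[MASKS]' marker row comes before the '[OUTLIERS]' row.
    """
    sections = {"data": [], "mask": [], "outlier": []}
    current = "data"
    for row in csv.reader(lines, delimiter="\t"):
        # Switch sections when a marker cell is seen; the marker row
        # itself is stored in the section it starts, as in the original.
        if current == "data" and "[MASKS]" in row:
            current = "mask"
        elif "[OUTLIERS]" in row:
            current = "outlier"
        sections[current].append(row)

    data = sections["data"][1:]                           # drop the header row
    mask = [r for r in sections["mask"] if r][3:]         # drop marker + 2 rows
    outlier = [r for r in sections["outlier"] if r][3:]   # drop marker + 2 rows
    return data, mask, outlier


# Hypothetical sample in the assumed layout, used only to exercise the function.
sample = (
    "h1\th2\n1\t2\n3\t4\n"
    "[MASKS]\nmh1\nmh2\n5\t6\n\n"
    "[OUTLIERS]\noh1\noh2\n7\t8\n"
)
data, mask, outlier = read_sections(io.StringIO(sample))
```

<div>With a real file, the same function could be called as <code>with open(filename, newline="") as f: data, mask, outlier = read_sections(f)</code>, which also closes the file when done. Whether this is meaningfully faster would need to be timed on the actual files.</div>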