[Tutor] Problem When Iterating Over Large Test Files
Steven D'Aprano
steve at pearwood.info
Thu Jul 19 01:54:26 CEST 2012
On Wed, Jul 18, 2012 at 04:33:20PM -0700, Ryan Waples wrote:
> I'm seeing some unexpected output when I use a script (included at
> end) to iterate over large text files. I am unsure of the source of
> the unexpected output and any help would be much appreciated.
It may help if you can simplify your script to the smallest amount of
code which demonstrates the problem. See here for more details:
http://sscce.org/
More suggestions follow below.
> In my output I am seeing lines that don't occur in the original file,
> and that don't match any lines in the original file.
How do you know? What are you doing to test that they don't match the
original?
I'm not suggesting that you are wrong; I'm just trying to see what steps
you have already taken.
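One way to make that check concrete is to compare the output against the input line by line. This is only a rough sketch: the function name and the two paths are made up for illustration, and it assumes that "matching" means an output line is identical to some input line once the trailing newline is removed.

```python
def find_unmatched_lines(input_path, output_path):
    """Return (line_number, line) pairs from the output file whose text
    never occurs as a complete line in the input file."""
    # Collect every input line, minus its trailing newline, in a set
    # so that each membership test is fast.
    with open(input_path) as infile:
        input_lines = {line.rstrip("\n") for line in infile}

    unmatched = []
    with open(output_path) as outfile:
        for number, line in enumerate(outfile, start=1):
            line = line.rstrip("\n")
            if line not in input_lines:
                unmatched.append((number, line))
    return unmatched


# Hypothetical usage; substitute your own file names:
#     for number, line in find_unmatched_lines("input.txt", "output.txt"):
#         print(number, repr(line))
```

Printing with repr() makes stray whitespace or control characters visible. Note that the set holds every distinct input line in memory, so for very large files you may need to test on a smaller slice first.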
> The incidences
> of badly formatted lines don't seem to match up with any patterns in
> the data file, and occur across multiple different data files.
Do they occur at random, or is this repeatable?
That is, if you get this mysterious output for files A, B, H and Q
(say), do you *always* get them for A, B, H and Q?
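A simple way to test repeatability is to run the script twice on the same input, writing to two different output files, and then compare the results. Here is a minimal sketch using checksums. The function names and file names are hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it
        # returns b"", which signals end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """Return True if the two files have identical contents."""
    return file_digest(path_a) == file_digest(path_b)


# Hypothetical usage, after two runs of the script on the same input:
#     print(same_contents("run1_output.txt", "run2_output.txt"))
```

If the two runs differ, the problem is not deterministic, which points away from the data itself and towards something in the environment.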
> I've included 20 consecutive lines of input and output. Each of these
> 5 'records' should have been selected and printed to the output file.
Earlier, you stated that each record should be four lines. But your
sample data starts with a record of three lines.
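If every record really is four lines, it is worth reading the file in explicit groups of four and complaining loudly when a group comes up short. The following is only a sketch under that assumption. The function name is invented, and it knows nothing about what a valid record looks like, so it can only detect a file whose line count is not a multiple of four.

```python
from itertools import islice


def read_records(path, lines_per_record=4):
    """Yield each record as a list of lines (newlines stripped).

    Raise ValueError if the file ends partway through a record.
    """
    with open(path) as f:
        while True:
            # islice takes at most lines_per_record lines from the file.
            record = [line.rstrip("\n") for line in islice(f, lines_per_record)]
            if not record:
                return  # clean end of file
            if len(record) != lines_per_record:
                raise ValueError(
                    "incomplete record of %d line(s) at end of %s"
                    % (len(record), path)
                )
            yield record


# Hypothetical usage:
#     for record in read_records("input.txt"):
#         print(record[0])
```

A stricter check would also test that the first line of each record looks the way a record header should, which would catch a file that has drifted out of step partway through.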
More to follow later (time permitting).
--
Steven