[Tutor] Logfile multiplexing

Kent Johnson kent37 at tds.net
Wed Nov 11 13:51:00 CET 2009


On Wed, Nov 11, 2009 at 4:46 AM, Stephen Nelson-Smith
<sanelson at gmail.com> wrote:
> Hi Kent,
>
>> See the Python Cookbook recipes I referenced earlier.
>> http://code.activestate.com/recipes/491285/
>> http://code.activestate.com/recipes/535160/
>>
>> Note they won't fix up the jumbled ordering of your files but I don't
>> think they will break from it either...
>
> That's exactly the problem.  I do need the end product to be in order.

You could read many items from each log into your priority queue. If
you can confidently say that, for example, the 100th entry in the log
always occurs after the first, then you could initialize the queue
with 100 items from each log. Or if you are sure that the jitter is
never more than one second, each time you read a log you could read
until the time is two seconds after the initial time. Either of these
could probably be done as a modification of the heapq merge sort
recipe.
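
Here is a rough sketch of both ideas, not a tested solution for your
logs. It assumes each log has already been parsed into an iterable of
(timestamp, line) tuples with comparable timestamps. The function names
locally_sorted, sorted_within_jitter and merge_logs are made up for
illustration, and so are the default window and jitter values.

```python
import heapq

def locally_sorted(entries, window=100):
    """Yield entries in order, assuming no entry appears more than
    `window` positions later than its sorted position (an assumption)."""
    heap = []
    for entry in entries:
        heapq.heappush(heap, entry)
        # Hold back `window` entries so late arrivals can overtake them.
        if len(heap) > window:
            yield heapq.heappop(heap)
    while heap:
        yield heapq.heappop(heap)

def sorted_within_jitter(entries, max_jitter=1.0):
    """Yield entries in order, assuming no entry is logged after one
    whose timestamp is more than `max_jitter` later (an assumption)."""
    heap = []
    for entry in entries:
        heapq.heappush(heap, entry)
        # Anything older than this cutoff can no longer be overtaken.
        cutoff = entry[0] - max_jitter
        while heap and heap[0][0] < cutoff:
            yield heapq.heappop(heap)
    while heap:
        yield heapq.heappop(heap)

def merge_logs(logs, window=100):
    """Merge several jittery logs into one ordered stream."""
    return heapq.merge(*(locally_sorted(log, window) for log in logs))
```

With parsed logs you would call something like
merge_logs([log_a, log_b], window=100) and iterate over the result.
If the assumption about the window or the jitter is wrong, the output
will silently come out misordered, so pick the bound conservatively.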

If you can't confidently make any claims about the locality of the
jitter, then you probably have no choice but to sort the logs first.
Alternatively, sort the result when you are done; if you are filtering
out a lot of items, that might be faster, since there is less to sort.
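
A minimal sketch of the filter-then-sort fallback, again assuming each
log is an iterable of (timestamp, line) tuples. The name
filter_then_sort and the keep predicate are hypothetical, and this
version holds every surviving entry in memory at once.

```python
from itertools import chain

def filter_then_sort(logs, keep):
    """Drop unwanted entries from all logs, then sort what remains."""
    # Filtering first means the sort only sees the surviving entries.
    survivors = (entry for entry in chain.from_iterable(logs) if keep(entry))
    return sorted(survivors)
```

For example, filter_then_sort([log_a, log_b], lambda e: 'ERROR' in e[1])
would keep only the lines containing ERROR and return them in
timestamp order.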

Kent

