
It looks like your "load average" is computing something very different than the traditional Unix "load average". If I'm reading right, yours is a measure of what percentage of the time the loop spent sleeping waiting for I/O, taken over the last 60 ticks of a 1 second timer (so generally slightly longer than 60 seconds). The traditional Unix load average is an exponentially weighted moving average of the length of the run queue.
The proposed implementation aims to expose the load of the loop itself: a direct metric that comes from the loop, instead of an external one such as CPU usage, load average, or others. Yes, the load average applies a decay function to the length of the run queue, counting the processes that are using or waiting for a CPU. Compared with CPU load, this gives extra information about how overloaded the system is. The loop load as presented is roughly equivalent to CPU load: once it reaches 100%, it cannot tell you how far beyond that the loop is overloaded.
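To make the difference concrete, here is a rough sketch of both kinds of metric. The class names, the 60-tick window, and the 5-second sampling interval are assumptions for illustration only; this is not the code from the proposal nor the kernel's implementation.

```python
import math
from collections import deque


class BusyFraction:
    """Loop-load style metric: share of each tick NOT spent sleeping,
    averaged over the last `window` ticks. Always stays within [0, 1]."""

    def __init__(self, window=60):  # window size is an assumption
        self.samples = deque(maxlen=window)

    def tick(self, sleep_time, tick_length):
        # Clamp so a tick that overran cannot push the sample below 0.
        self.samples.append(1.0 - min(sleep_time / tick_length, 1.0))

    def value(self):
        return sum(self.samples) / len(self.samples) if self.samples else 0.0


class DecayedQueueLength:
    """Unix-style metric: exponentially weighted moving average of the
    number of runnable tasks. Can grow above 1 when work queues up."""

    def __init__(self, period=60.0, interval=5.0):  # values are assumptions
        self.decay = math.exp(-interval / period)
        self.load = 0.0

    def sample(self, runnable):
        self.load = self.load * self.decay + runnable * (1.0 - self.decay)
        return self.load
```

With a loop that never sleeps, `BusyFraction` saturates at 1.0 whether there is one pending task or a hundred, while `DecayedQueueLength` fed with a constant queue of 2 runnable tasks converges towards 2.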
Is one of those definitions better for your goal of detecting when to shed load? I don't know. But calling them the same thing is pretty confusing :-). The Unix version also has the nice property that it can actually go above 1; yours doesn't distinguish between a service whose load is at exactly 100% of capacity and barely keeping up, versus one that's at 200% of capacity and melting down. But for load shedding maybe you always want your tripwire to be below that anyway.
Well, I partially disagree with this. This definition of load has equivalents among other computing metrics with a closed range, such as CPU usage. I never intended to align the load of the loop with the load average; I only used that concept as an example of a metric that can be used to check how loaded a system is.
More broadly we might ask what's the best possible metric for this purpose – how do we judge? A nice thing about the JavaScript library you mention is that scheduling delay is a real thing that directly impacts the quality of service – it's more of an "end to end" measure in a sense. Of course, if you really want an end to end measure you can do things like instrument your actual logic, see how fast you're replying to HTTP requests or whatever, which is even more valid but creates complications because some requests are supposed to take longer than others, etc. I don't know which design goals are important for real operations.
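For reference, a scheduling-delay measurement along those lines could be sketched with asyncio as follows. The function names `measure_lag` and `demo`, the 50 ms interval, and the blocking `hog` task are hypothetical and chosen only for the example; this is not how the JavaScript library is implemented.

```python
import asyncio
import time


async def measure_lag(interval=0.05, samples=5):
    """Sleep for `interval` repeatedly and record how late each wake-up is.

    The extra delay is time the loop spent running other callbacks
    before it could get back to this coroutine.
    """
    loop = asyncio.get_running_loop()
    lags = []
    for _ in range(samples):
        start = loop.time()
        await asyncio.sleep(interval)
        lags.append(max(0.0, loop.time() - start - interval))
    return lags


async def demo():
    """Run the monitor next to a task that blocks the loop for 200 ms."""

    async def hog():
        await asyncio.sleep(0.01)
        time.sleep(0.2)  # deliberately blocking call, stalls the whole loop

    lags, _ = await asyncio.gather(measure_lag(), hog())
    return lags
```

Calling `asyncio.run(demo())` returns one lag value per sample. The sample that overlaps the blocking call is much larger than the others, and unlike a busy fraction, the lag is not capped: it keeps growing as the loop falls further behind.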
This is the key for me, and something I should have based my rationale on. How good is the presented way of measuring the load of an asynchronous system compared with the toobusy one? What can we achieve with this metric? I will work on that as the basis of my rationale for the proposed change. Then, once the rationale is accepted, the implementation is peanuts :)
--pau