[Tfug] Unusually high loadavg in new kernel?

Thu Dec 20 17:58:49 MST 2007

On Thursday 20 December 2007 17:32, Bowie J. Poag wrote:
> We spotted something kind of unusual at work today -- One of our boxes
> (a CentOS/RHEL box that's kept very up to date) has begun showing that it
> has a load average of about 50. This, mind you, is with nothing running.
> Nothing gated, nothing thrashing. Literally, nothing. Totally idle.
>
> Has anybody seen this sort of phenomenon before? It's not hurting
> anything, but it still strikes me as very curious -- I'm scratching my
> chin thinking it might be due to the recent change in the process
> scheduler in the Linux kernel.
>
> Anyone have any ideas?

I have seen 2 different things that can cause this:
1) I have seen what appears to be a bug in the handling of CPU speed/power 
management with ACPI/cpuspeed. Not sure if it is in cpuspeed itself, or some 
timing agent that needs to be updated about changes in jiffies. It manifests 
itself as an occasional jump in CPU load and momentary hang with a load 
average that jumps up to around 50. It looks as though something goes into a 
spawn or fork loop and freaks out for a second when a CPU speed update event 
occurs. But it is so short-lived that I have never been able to catch it or 
see explicitly what causes it.

2) Multiple pending processes in I/O-wait states. If you have lots of processes 
all trying to do work, but waiting on disk (something like poorly written 
scripts using tempfiles for IO), or even a disk or IO resource that has 
died/disappeared unexpectedly, this can happen. The processes sit in 
uninterruptible sleep (state D in ps), which Linux counts in the load average 
along with runnable tasks, so the number climbs even though they are getting 
little or no throughput and the CPU looks idle.
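
One way to check for case 2 is to count processes by state and compare that 
with the load average. The following is only a minimal sketch, assuming a 
Linux /proc filesystem; the function names (state_from_stat, count_states, 
report) are made up for illustration and are not part of any standard tool.

```python
import glob
from collections import Counter


def state_from_stat(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')' characters, so split on the LAST ')' and take the first
    # whitespace-separated field after it.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(stat_lines):
    """Count how many processes are in each state (R, S, D, Z, ...)."""
    return Counter(state_from_stat(line) for line in stat_lines)


def read_stat_lines():
    """Yield the contents of every /proc/<pid>/stat that can be read."""
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path, errors="replace") as f:
                yield f.read()
        except OSError:
            # The process exited between listing and reading; skip it.
            continue


def report():
    """Print the load average next to the R and D process counts."""
    with open("/proc/loadavg") as f:
        print("loadavg:", f.read().strip())
    counts = count_states(read_stat_lines())
    print("runnable (R):", counts.get("R", 0))
    print("uninterruptible sleep (D):", counts.get("D", 0))
```

Calling report() on the affected box while the load is high should show 
whether the D count is in the same range as the load average. If it is, 
stuck I/O is the likely culprit rather than the scheduler.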

Adrian