Had an issue yesterday where a customer has an SQL instance on a cluster where the Operations Manager datawarehouse was hosted. Right after starting the SQL services all processors in the machine quickly rose to 100% CPU usage. It was impossible to even stop one of the databases in the instance in order to check which one was the culprit. So I talked to a SQL DBA friend of mine and we came to the following:
Started by checking what processes are running. In SQL go to Management – Activity Monitor and right click to select the View Processes list.
While checking the processes list I found two process-IDs that were blocking. Also looked like they were waiting for eachother in some way and they were both Suspended. That is strange… 100% CPU (on 8 cores) and only suspended processes (and sleeping ones). However one of the two processes listed a Wait Type of CXPACKET.
The CXPACKET indicates that it could be related to parallelism. Looks like it is waiting for the split query (across the cores) to come back with the results (I am not that deep into the nitty gritty of this). Why it would continue to use the amoun of CPU that it did I do not know. Other programs were still able to run (a bit slowed down of course), so it only seemed to affect everything running within that sqlsrv.exe for the instance.
In any case I went into the properties of the Server (in this case the cluster instance) in the SQL manager and went to Advanced and found the Maximum degree of parallelism there. By default it is set to 0. That value allows queries to be executed across all processors. So I set it to 1 in order to force each query to execute against one core at a time. This instantly solved the issue. Unfortunately I will not be able to investigate further into this, but I learned about parallelism in the proces 🙂 We are guessing a stored procedure caused it.
Some info about parallelism options in SQL 2005 can be found here.