These are my notes on troubleshooting with wait stats:
It's not my favorite approach to correcting issues, as this is more reactive than proactive. But sometimes its necessary to be able to dig through the wait stats and discern why it is the query is taking forever.
Things to keep in mind:
When waiting SQL Server will collect what you are waiting on
PAGEIOIOLATCH_xx – Waiting to ftech pages from storage into memory
SOS_SCHEDULER_YIELD – Task has 'voluntarily' yielded to another task so it can complete. Basically the query is waiting for SQL Server to say 'Go ahead'
PAGELATCH_xx – Query might be waiting for the public toilet to have enough bandwidth for it. Protecting data in memory from inconsistency.
ASYNC_NETWORK_IO – Processing of the query is done.. This usually means that SQL Server is trying to send the results of the query to the client and the network limitations or client app limitations are restricting it.
LCK_xx – If updating data these waits will be created in order to maintain data integrity – isolating transactions
LATCH_xx – Protecting memory from other transactions.
WRITELOG – Flushing the transaction to storage as it commits
ASYNC_IO_COMPLETION – All committed transactions from the dirty pages in memory are asynchronously committed to the database
CX_PACKET – When a query goes parallel, some of the threads might take longer than others. This means its waiting for everyone to finish
- When SQL Server needs to update data, it will need to pull the clustered index and ALL nonclustered indexes into data. Even triggers and others items will be brought in.
- NOT ALL WAITS ARE BAD – They need to be seen as excessive or impactful
- Running
- Suspended
- Waiting for something
- "Yielding" so others can run
- Runnable
- Ready to run, but waiting to do so
- Needs someone else to yield
- Sleeping
- No work to do
- Waiting for instruction from the app
When waiting SQL Server will collect what you are waiting on
PAGEIOIOLATCH_xx – Waiting to ftech pages from storage into memory
SOS_SCHEDULER_YIELD – Task has 'voluntarily' yielded to another task so it can complete. Basically the query is waiting for SQL Server to say 'Go ahead'
PAGELATCH_xx – Query might be waiting for the public toilet to have enough bandwidth for it. Protecting data in memory from inconsistency.
ASYNC_NETWORK_IO – Processing of the query is done.. This usually means that SQL Server is trying to send the results of the query to the client and the network limitations or client app limitations are restricting it.
LCK_xx – If updating data these waits will be created in order to maintain data integrity – isolating transactions
LATCH_xx – Protecting memory from other transactions.
WRITELOG – Flushing the transaction to storage as it commits
ASYNC_IO_COMPLETION – All committed transactions from the dirty pages in memory are asynchronously committed to the database
CX_PACKET – When a query goes parallel, some of the threads might take longer than others. This means its waiting for everyone to finish
Examples of problems:
Example
|
Issues
Caused
|
Type of
waits you might see
|
| Implicit Conversions | Big Index Scans (more IO and CPU) | PAGEIOLATCH_xx, SOS_SCEDULER_YIELD |
| Missing Critical Indexes | Big Index Scans (more IO and CPU) | PAGEIOLATCH_xx, SOS_SCEDULER_YIELD |
| Cursors (the devil) | Lots of tiny writes – ants mobbing the tlog | WRITELOG |
| Duplicate Indexes | Clog up your memory and cause extra IO | PAGEIOLATCH_xx, ASYNC_IO_COMPLETION |
| "After" triggers (Satan) causing cascading deletes and updates on foreign keys | Make transactions take longer, Secretly uses escalated isolation level to cascade deletes to child tables | LCK_xx (which can lead to a large backlog of queries which often lead to CPU and Storage Waits |
| App Running giant reports | More data sent to app server than it can handle | ASYNC_NETWORK_IO |
Refer to waits and stats by Paul Randal for an excellent method of collecting these (Create a table with a Date column (default getdate()) and Insert to exec on a schedule (every five minutes or hour whatever you prefer.)
Or you can also use the handy dandy sp_whoisactive and a table with a loop around it:
Brent Ozar: Who is Active to a Table
I actually do this.