Today I will talk about the performance of SQL queries. Query performance is a problem that always has multiple causes, and seen from different angles it yields different solutions. Personally, I follow a sort of rule when confronted with bad queries: before coming up with a solution to improve the execution of a specific query, I look at the problem from every possible angle and make sure everything is covered according to performance best practices.
First, when does a performance problem occur? There are a few common signs:
- Application users complain about slow-running processes.
- Developers report bad performance for some queries and need a DBA analysis before the code goes to Production.
- One of the following resources frequently shows high usage: CPU, Disk, Memory.
- Frequent blocking caused by long-running or high-I/O queries.
Next I will go through the most important steps to take when handling a bad query.
Checking the execution plan of a query is the first thing to do, because it gives you a lot of information about how the statements are executed, along with estimates of the resource requirements for each statement and the order in which they run. It outlines the query plan chosen for a specific query. You can do this in two ways:
- From SQL Server Management Studio: at runtime click on “Include Actual Execution Plan”, or, without executing the query, click on “Display Estimated Execution Plan”.
- Issue SET SHOWPLAN_ALL ON at runtime. When SET SHOWPLAN_ALL is ON, SQL Server returns execution information for each statement without executing it, detailing the exact order of each statement.
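As a minimal sketch of the SHOWPLAN_ALL session described above (using the AdventureWorks table from the later examples for illustration):

```sql
-- Return plan information instead of executing the statements
SET SHOWPLAN_ALL ON;
GO

-- The statement below is NOT executed; SQL Server returns one row
-- per plan operator (PhysicalOp, EstimateRows, TotalSubtreeCost, ...)
SELECT name, ProductSubcategoryID
FROM AdventureWorks.Production.Product
WHERE ProductSubcategoryID = 2;
GO

-- Remember to turn it off, or every subsequent statement
-- in the session will only be explained, never executed
SET SHOWPLAN_ALL OFF;
GO
```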
Next, look inside the query plan for statements whose Query Cost relative to the batch is higher than that of the rest of the statements.
Pay attention to the following events:
- Index / Table Scan – shows the need for an index update or index creation. Check that every join or filtering (=, >, <) operation has an index on the column(s) used to filter the data. For better performance, add the columns from the SELECT statement to the index structure as included columns; this avoids Key Lookups. Sometimes the correct index exists but an index scan is still performed: this can happen when the number of rows returned is greater than about half the total rows in the table, in which case SQL Server chooses a scan over an index seek.
- Missing index hints – double-check the index hint syntax and make sure you don’t already have an index that could simply be extended with additional columns.
- Use of the CONVERT_IMPLICIT function in Index Scan operations – in comparison, equality, or inequality operations, use the same datatype on both sides. Implicit conversions are conversions that occur without specifying either the CAST or CONVERT function. SQL Server uses this function to transform a value to another datatype before performing the actual operation, which can cause a lot of unnecessary I/O. I have seen many stored procedures where variables declared as varchar were compared against nvarchar table columns, and the same with int / bigint. This is not needed and is a frequent programming error that has to be fixed.
- Index Update – a Clustered Index Update when an index key is modified is normal, but Non-Clustered Index updates should be kept to a minimum, since every index update puts extra load on SQL Server. Having too many indexes leads to many index updates. Check the current indexes on the table in question and decide whether you really need all of them. Remember that indexes are good for reading data and bad for data modifications.
- Nested Loops and Key Lookups – show the need for covering indexes.
- Sorts – try to use indexes instead, which can avoid the sort operation.
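The covering-index idea from the list above can be sketched like this (the index name is hypothetical; the table is the AdventureWorks one used later in this post):

```sql
-- Hypothetical covering index: the filtering column is the index key,
-- and the SELECT column is added as an included column, so the query
-- is answered entirely from the index without Key Lookups.
USE AdventureWorks;
GO
CREATE NONCLUSTERED INDEX IX_Product_Subcategory_Covering
ON Production.Product (ProductSubcategoryID)
INCLUDE (Name);
GO
```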
This is an effortless double-check measure for a query after the execution plan has been analyzed. Create an .sql file with the query code and load it into the Tuning Advisor. SQL Server will create a report if it considers that there are missing indexes or statistics that could improve performance. The Tuning Advisor will report hints along with the syntax to create the indexes, and show an estimated improvement percentage if the suggested modifications are applied.
Use Profiler to see whether your query has many compilations or re-compilations that can delay its execution. I wrote a special post about this particular step that can be read at this link.
This applies to stored procedures with input parameters only. Parameter sniffing is the process whereby SQL Server creates an optimal plan for a stored procedure by using the calling parameters that are passed the first time the stored procedure is executed. When a stored procedure is executed for the very first time, the Query Optimizer compiles the query and creates an optimal execution plan that best suits the passed parameters. In order to save re-compilation time, the execution plan is reused at subsequent executions of the procedure, as long as it is still in the cache.
In theory, there is nothing to worry about if each subsequent procedure call is made with the same parameter values. The difference appears when subsequent calls differ from the compiled one in terms of the parameter values passed and the number of rows returned.
So, if you have a stored procedure that runs slowly, the first thing to do is to check whether the execution plan is optimal for the result set returned.
Tune the procedure’s body script in SSMS (SQL Server Management Studio)
Execute the procedure’s body script inside SSMS with a set of predefined variable values – the same ones encountered at the slow execution of the stored procedure with parameters. At runtime click on “Include Actual Execution Plan” so that you can see the plan generated and have a starting point for comparison. If the duration is considerably smaller than in previous executions, then you should spend some time analyzing this behavior.
Tune the long-running procedure execution with Profiler.
Inside a Profiler trace add these events:
Filter by Database Name to find the procedure call more easily, and click on Run. After you have captured some results, check the execution plan in the Showplan All event for the SPID displaying your procedure call. Compare this execution plan to the one from SSMS.
Avoid Parameter Sniffing
To avoid parameter sniffing, the stored procedure can be written in a different way with identical output. Instead of using the calling parameters passed at execution directly, assign them to local variables and use those in the query. This little change leads the Query Optimizer to rely on the column statistics and create a query plan based on the average distribution of the real values in the column, rather than on the specific values passed as parameters.
A second option is to use WITH RECOMPILE in the procedure definition. SQL Server will not cache the query plan for this procedure, forcing it to be compiled each time it is executed.
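Applied to the example procedure used later in this post, the WITH RECOMPILE option goes in the procedure header:

```sql
CREATE PROCEDURE [PROCEDURE_PARAM_SNIFFING] (@FIRST int)
WITH RECOMPILE  -- a fresh plan is compiled at every execution and never cached
AS
SELECT name, [ProductSubcategoryID]
FROM AdventureWorks.production.product
WHERE [ProductSubcategoryID] = @FIRST;
GO
```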
Or you could execute the procedure with this hint:
EXEC Procedure 'value' WITH RECOMPILE
The compilation will be done only for this one execution. The new execution plan created applies only to this single call of the procedure.
Check this example of handling Parameter Sniffing on AdventureWorks database:
Procedure containing parameter sniffing:
CREATE PROCEDURE [DBO].[PROCEDURE_PARAM_SNIFFING] (@FIRST int)
AS
SELECT name, [ProductSubcategoryID]
FROM AdventureWorks.production.product
WHERE [ProductSubcategoryID] = @FIRST
GO
Execute from 2 different sessions:
EXECUTE [PROCEDURE_PARAM_SNIFFING] 2
EXECUTE [PROCEDURE_PARAM_SNIFFING] 7
The call of [PROCEDURE_PARAM_SNIFFING] with different parameters will have the same Execution plan and scan a range of 43 rows even if second execution returns only one row.
Procedure re-written to avoid parameter sniffing:
CREATE PROCEDURE [PROCEDURE_WITHOUT_PARAM_SNIFF] (@FIRST int)
AS
DECLARE @LOC_FIRST int = @FIRST
SELECT name, [ProductSubcategoryID]
FROM AdventureWorks.production.product
WHERE [ProductSubcategoryID] = @LOC_FIRST
GO
Execute from 2 different sessions:
EXECUTE [PROCEDURE_WITHOUT_PARAM_SNIFF] 2
EXECUTE [PROCEDURE_WITHOUT_PARAM_SNIFF] 7
The call of [PROCEDURE_WITHOUT_PARAM_SNIFF] with different parameters will still produce the same execution plan, but the estimated scan range will represent an average (7.97 rows) across all values grouped by ProductSubcategoryID. The estimated number of rows is not related to the first compilation of this procedure; rather, it is calculated from the table’s statistics.
This applies to multi-processor servers. Quoting MSDN: SQL Server provides parallel queries to optimize query execution and index operations for computers that have more than one microprocessor (CPU). Because SQL Server can perform a query or index operation in parallel by using several operating system threads, the operation can be completed quickly and efficiently. During query optimization, SQL Server looks for queries or index operations that might benefit from parallel execution.
The SQL Server configuration option that manages the number of CPUs available to the sqlserver process is called max degree of parallelism and is set at server level. With this option enabled (via sp_configure), the SQL engine will use multiple threads to manage the so-called parallel query execution plan. It can bring big performance improvements for complex transactions that need to be processed all at once across multiple threads, taking advantage of the CPU power in place. But sometimes a simple query that does not need to be divided into threads takes longer with parallelism enabled than without it. It is all about testing and figuring out which option suits you best.
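The server-level setting can be inspected and changed with sp_configure; for instance, capping parallel plans at 4 CPUs (the value 4 is only an example and should be tested against your workload):

```sql
-- 'max degree of parallelism' is an advanced option, so expose it first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- 0 = use all available CPUs (the default); here we cap parallel plans at 4 CPUs
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;
```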
The degree of parallelism can be overridden for individual query executions with the MAXDOP hint. MAXDOP can be used to reduce the number of CPUs used in multi-processing. I found a nice explanation by Pinal Dave about MAXDOP that can help in starting the analysis: SQL Server Maxdop settings.
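For a single statement, the hint overrides the server-level setting; for example, forcing a serial plan on the query from the earlier examples:

```sql
SELECT name, ProductSubcategoryID
FROM AdventureWorks.Production.Product
WHERE ProductSubcategoryID = 2
OPTION (MAXDOP 1);  -- run this query on a single CPU regardless of the server setting
```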
To conclude this post, the top 5 steps in analyzing a query’s bad performance before jumping to quick fixes are:
1. Analyze the query’s Execution Plan
2. Analyze the query with the Tuning Advisor
3. Use Profiler to track query compilations/re-compilations frequency.
4. Avoid Parameter Sniffing
5. Check Query Parallelism