Category Archives: Performance & Tuning

RangeS-S – Key-Range Locks in SERIALIZABLE isolation level (part 1)

Overview
 

The locking mechanism in SQL Server is by far one of its most complex mechanisms and one of the hardest to explain. Blocking and locks are a DBA's daily problem when dealing with concurrent processes that access the same records of the same table. In order to have a high-performance system, a very important aspect to consider is the isolation of concurrent operations. In high-activity systems such as web sites, ERPs or online shops there is a huge amount of concurrent transactions that select data from the same tables, modify data and insert new data. In order to isolate some processes, the isolation level of the transactions must be changed. The transaction isolation level controls the default transaction locking behavior for all SQL statements issued by a connection.

READ COMMITTED is the default isolation level and also the recommended one for most processes. But this level is not always enough to make a transaction ACID, because the data can be changed before the end of the transaction, resulting in non-repeatable reads or phantom reads. In order to completely isolate a transaction that starts with BEGIN TRANSACTION, you can SET TRANSACTION ISOLATION LEVEL to SERIALIZABLE inside a stored procedure. This does not affect other transactions and their isolation level.
Also, if you issue SET TRANSACTION ISOLATION LEVEL in a stored procedure or trigger, when the object returns control the isolation level is reset to the level in effect when the object was invoked. For example, if you set REPEATABLE READ in a batch, and the batch then calls a stored procedure that sets the isolation level to SERIALIZABLE, the isolation level setting reverts to REPEATABLE READ when the stored procedure returns control to the batch.
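
A minimal sketch of that reset behavior (the procedure name is my own; DBCC USEROPTIONS lists the options, including the isolation level, in effect for the current session):

CREATE PROCEDURE dbo.usp_isolation_demo
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
-- statements here run under SERIALIZABLE
DBCC USEROPTIONS -- reports isolation level = serializable
END
GO

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
EXEC dbo.usp_isolation_demo
DBCC USEROPTIONS -- back in the calling batch: isolation level = repeatable read
GO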

The SERIALIZABLE isolation level is equivalent to the HOLDLOCK table hint and has the same behavior when dealing with locks.
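
For example, the two selects below should acquire equivalent key-range locks (a sketch using the range_lock table created later in this post):

-- under SERIALIZABLE, a plain select takes key-range locks:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
go
begin transaction
select rname from range_lock where rname = 'anna'
rollback

-- under the default READ COMMITTED level, the HOLDLOCK table hint
-- gives the same range-locking behavior for that table:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
go
begin transaction
select rname from range_lock with (HOLDLOCK) where rname = 'anna'
rollback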

When the SERIALIZABLE isolation level is set, a range lock is placed on the data set, preventing other users from updating or inserting rows into that data set until the transaction is complete. This is the most restrictive of the four isolation levels. It is also the least explained on the web, which is the main reason I want to write about the locking types inside a SERIALIZABLE transaction.

Key-Range Lock Modes
 

In the SERIALIZABLE isolation level, key-range locks are used. Serializability implies that any query executed inside a transaction must obtain the same set of data from the beginning until the end of the transaction. The set of data locked inside the serializable transaction is based on a range of key values of the index that covers the rows read. It is guaranteed that the values in the locked range will not be modified, and that no new values will be inserted into that range, by other concurrent transactions. Any attempt of a second transaction to insert, update, or delete any row within the same range as the first one requires a modification to the index. Thus, the second transaction is blocked until the first transaction completes, because key-range locks cover the index entries.

There are 4 types of locks:

  1. RangeS-S – Shared range, shared resource lock; serializable range scan.
  2. RangeS-U – Shared range, update resource lock; serializable update scan.
  3. RangeX-X – Exclusive range, exclusive resource lock; used when updating a key in a range.
  4. RangeI-N – Insert range, null resource lock; used to test ranges before inserting a new key into an index.

There are a couple of basic things to know about key-range locking which are not mentioned in the MSDN documentation:

  1. In equality operations ("=", "IN") – Key-Range locks are not held on key values that are found inside a unique index, even if the transaction runs in the SERIALIZABLE isolation level.

    Example:
    If the ix_rname index created on the range_lock table were unique, the following statement would not require Key-Range locks.

    Shared locks would be enough.

    select rname from range_lock where rname = 'anna'
    
  2. In equality operations ("=", "IN") – Key-Range locks are held on the "next" key value when the searched key value is not found inside the index, whether the index is unique or non-unique.

    Example:
    Whether the ix_rname index created on the range_lock table is unique or non-unique, the following statement will hold a Key-Range lock on the "next" key value after the value from the equality operation that was not found inside the index.

    In this example the "annabella" value does not exist, so the Key-Range lock will be placed on the next value in the list: "antony".

    select rname from range_lock where rname = 'annabella'
    
  3. In inequality operations (">", "<", BETWEEN, LIKE, "<>") – Key-Range locks are held on all key values found in the specified range, plus the "next" key value outside the range. This is true for both unique and non-unique indexes.

    Example:
    Whether the ix_rname index created on the range_lock table is unique or non-unique, the following statement will hold Key-Range locks on all key values in the range when using BETWEEN, and on the "next" key value outside the range.

    In the example below, RangeS-S locks will be acquired on all values between "antony" and "barry". "annabella" is missing, so the starting value of the range is the next value in alphabetical order after "annabella", which is "antony". An additional RangeS-S lock will also be placed on the value after "barry": "benedict".

    select rname from range_lock where rname between 'annabella' and 'barry'
    

RangeS-S Locks
 

I will explain each type of locking through examples. In this part I will go through key-range shared-shared (RangeS-S) locks. In order not to repeat them in the following parts, I will describe the prerequisites for my tests only here.
I need to create a table and insert a few rows into it. The table will have a clustered primary key and a non-clustered index on one of its three columns.

Here you go:

create table range_lock 
(rid int NOT NULL IDENTITY (1, 1),
rname nvarchar (20),
sname nvarchar (20))

ALTER TABLE range_lock ADD PRIMARY KEY (rid);

CREATE NONCLUSTERED INDEX [ix_rname] ON [dbo].[range_lock] 
(
[rname] ASC
)

Populate the table with names. When read through the ix_rname index, the rows will come back in alphabetical order.

INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('anna')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('antony')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('angel')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('ARLEN')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('BARRY')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('BENEDICT')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('BILL')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('BRYCE')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('CAROL')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('CEDRIC')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('CLINT')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('DARELL')
INSERT INTO [recovery_test_2].[dbo].[range_lock]
           ([rname]) VALUES ('DAVID')

A simple select from the range_lock table will show the inserted values ordered by the rname column, ascending.
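
For example (a sketch – selecting only the rname column lets the Query Optimizer read through ix_rname, so the rows come back in index order):

select rname from range_lock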

Rows ordered in ascending order by the rname column.

Start Serializable Transaction

Open a new SQL session and issue the query below without committing the transaction.
Until the end of this post this session will be known to readers as session 55.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE 
go
Begin Transaction
go
select rname from range_lock where rname = 'anna'

Analyze the Query Execution Plan

RangeS-S locks - Execution Plan

You will see an Index Seek on the ix_rname index, because ix_rname contains the rname column needed by the SELECT. An Index Seek touches only the rows that qualify.

IMPORTANT!!!

If I issued a SELECT on all table columns, the Query Optimizer would perform a Clustered Index Scan on the primary key in order to retrieve the rid and sname values. This is not optimal and must be avoided. To avoid it, the rid and sname columns should be added as included columns in the ix_rname index.
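
A minimal sketch of such a covering index (the index name is my own; rid is the clustering key, so it is carried into the non-clustered index automatically, but sname must be included explicitly):

CREATE NONCLUSTERED INDEX [ix_rname_covering] ON [dbo].[range_lock]
(
[rname] ASC
)
INCLUDE ([sname])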

Check Held Locks

Execute:

sp_lock 55

Here is what you would get:

RangeS-S for equality operation - Non-Unique Index

It is obvious that RangeS-S locks are acquired on IndId = 2 (ix_rname) for two key values.

Remember what I wrote earlier about equality operations and unique indexes – that range locks are not held? Well, here we are dealing with a non-unique index, so it is quite normal that RangeS-S locks are held. The question that pops up is why two RangeS-S locks are acquired when the statement only retrieves data about "anna".
That is because range locks are acquired on the key value from the "=" operator and on the next key value. This ensures that no row can be inserted, updated or deleted between the requested key and the one after it. It means that the "anna" and "antony" rows are locked if other transactions try to modify them. Let's see that.

If you look above at the range_lock table content, you will see that after "anna" comes "antony". Next I will try to delete the "antony" row from another transaction running under the default isolation level and see what happens.

Create Blockage

Open a second SQL session (from here on known as session 58) and execute:

begin transaction
go
delete from range_lock where rname ='antony'

The second transaction will wait for the first one to be committed.
 
Execute sp_who2 active to make sure session 58 is blocked by session 55.

58 blocked by 55

Check Locked resources on both transactions

To figure out why the delete of the "antony" row is blocked by the first select of the "anna" row, check this out:

Execute:

sp_lock 55 
go
sp_lock 58
go
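
Alternatively (SQL 2005 and later), the same information can be read from the sys.dm_tran_locks DMV – a sketch assuming the two sessions are 55 and 58:

select request_session_id, resource_type, resource_description, request_mode, request_status
from sys.dm_tran_locks
where request_session_id in (55, 58)
order by request_session_id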

RangeS-S and CNVT locks

I see one CNVT lock for the second transaction on the same resource on which the first transaction holds a RangeS-S lock. It is clear that we are speaking about the same row: "antony".

Microsoft explains CNVT locks as:

CNVT – Lock is being converted to another lock. A lock being converted to another lock is held in one mode but is waiting to acquire a stronger lock mode (for example, update to exclusive). When diagnosing blocking issues, a CNVT can be considered similar to a WAIT.

Execute a rollback in both sessions (55 and 58) to preserve the initial image of the table.
This is all about RangeS-S locks; in parts 2 and 3 I will detail more complex locks such as RangeS-U and RangeX-X.

Top 5 Steps To Determine Bad Performance of Queries

I will talk today about the performance of SQL queries. Query performance is a problem which always has multiple causes, and seen from different angles it gives different solutions. Personally, I have a sort of rule when confronted with bad queries: before coming up with a solution to improve the execution of a specific query, I look at the problem from every possible angle and make sure everything is covered according to performance best practices.

First, when does a performance problem occur? Basically there are a few hints:

  • Application users complain about slow running processes.
  • Developers report bad performance for some queries and need a DBA analysis before going to Production with the code.
  • Any of the following resources frequently shows high usage: CPU, Disk, Memory.
  • Frequent blocking caused by long-running or high-I/O queries.

Next I will go through the most important steps to take when handling a bad query.

1. Analyze the Query Execution Plan

Checking the execution plan of a query is the first thing to do, because it gives you a lot of information about how the statements are executed and provides estimates of the resource requirements of the statements as well as their order. It outlines the query plan chosen for a specific query. You can do it in two ways:

  • From SQL Server Management Studio: at runtime click on "Include Actual Execution Plan", or, without executing the query, click on "Display Estimated Execution Plan".
  • Issue SET SHOWPLAN_ALL ON at runtime. When SET SHOWPLAN_ALL is ON, SQL Server returns execution information for each statement without executing it, detailing the exact order of each statement (see the sketch below).
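
A minimal sketch of that option (the SELECT is only an example statement; while SHOWPLAN_ALL is ON the statements are not actually executed):

set showplan_all on
go
select rname from range_lock where rname = 'anna'
go
set showplan_all off
go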

Next, look inside the query plan for the statements whose query cost relative to the batch is higher than that of the rest of the statements.

Pay attention to the following events:

  • Index / Table Scan – shows the need for an index update or a new index. Check that every join or filtering (=, >, <) operation has an index on the column(s) used to filter the data. For better performance, include the columns from the select statement in the index structure as included columns; this avoids Key Lookups against the primary key. Sometimes the correct index is used but an Index Scan is still performed: this can happen when the number of rows returned is larger than about half of the total rows in the table, in which case SQL will choose a scan over an Index Seek.
  • Index creation hints – double check the suggested index syntax and make sure you don't already have an index that could simply be updated with additional columns.
  • Use of the CONVERT_IMPLICIT function in Index Scan operations – in comparison, equality or inequality operations use the same datatype on both sides. Implicit conversions are those conversions that occur without specifying either the CAST or CONVERT function. SQL uses this function to transform a value to another datatype before performing the actual operation, which can cost a lot of unnecessary I/O. I saw many stored procedures with variables declared as varchar that were compared against nvarchar table columns, and the same with int / bigint. This is not needed and is a frequent programming error that has to be fixed (see the sketch after this list).
  • Index Update – a Clustered Index Update when an index key is modified is normal, but Non-Clustered Index updates should be kept to a minimum, since every index update puts an extra load on SQL. Having too many indexes leads to many index updates. Check the current indexes on the specific table and decide whether you really need all of them. Remember that indexes are good for reading data and bad for data modifications.
  • Nested Loops and Key Lookups – show the need for covering indexes.
  • Sorts – try to use indexes instead, which can avoid the sort operation.
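
A minimal sketch of the implicit-conversion issue and the fix, using the range_lock table created earlier on this page (the variable names are my own):

-- rname is nvarchar(20); comparing it against a varchar variable forces
-- an implicit conversion of the value before the comparison:
declare @name_varchar varchar(20) = 'anna'
select rname from range_lock where rname = @name_varchar

-- declaring the variable with the same datatype as the column avoids CONVERT_IMPLICIT:
declare @name_nvarchar nvarchar(20) = N'anna'
select rname from range_lock where rname = @name_nvarchar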

2. Analyze the query with the Tuning Advisor

This is an effortless double-check after the execution plan has been analyzed. Create an .sql file with the query code and load it into the Database Engine Tuning Advisor. SQL will create a report if it considers that there are missing indexes or statistics that could improve performance. The Tuning Advisor will report hints along with the syntax to create the indexes, and show an improvement percentage if the suggested modifications are made.

3. Use Profiler to track query compilations/re-compilations frequency.

Use Profiler to see if your query has many compilations or re-compilations that can delay query execution. I wrote a separate post about this particular step, which can be read at this link.

4. Avoid Parameter Sniffing

This applies to stored procedures with input parameters only. Parameter sniffing is the process whereby SQL Server creates an optimal plan for a stored procedure by using the calling parameters that are passed the first time the stored procedure is executed. When a stored procedure is executed for the very first time, the Query Optimizer compiles the query and creates an optimal execution plan that best suits the passed parameters. In order to save re-compilation time, that execution plan is reused at subsequent executions of the procedure if it is still in the cache.

In theory, there is nothing to worry about if each subsequent procedure call is similar to the sniffed one. The problem appears when a call of the procedure is not the same as the compiled one in terms of the parameter values passed and the number of rows returned.

So, if you have a stored procedure that runs slowly, the first thing to do is to check whether the execution plan is optimal for the result set returned.

  • Tune the procedure’s body script in SSMS (SQL Server Management Studio)

Execute the procedure body script inside SSMS with a set of predefined variable values – the same ones encountered at the slow execution of the stored procedure with parameters. At runtime click on "Include Actual Execution Plan" so that you can see the plan generated and have a starting point for comparison. If the duration is considerably smaller than in previous executions, then you should spend some time analyzing this behavior.

  • Tune the long running procedure execution with Profiler.

Inside a Profiler trace add these events:

Profiler Events to display Execution Plan

Filter by Database Name to find the procedure call more easily and click on Run. After you have caught some results, check the execution plan in the Showplan All event for the SPID displaying your procedure call. Compare this execution plan to the one from SSMS.

  • Avoid Parameter Sniffing

To avoid parameter sniffing, the stored procedure can be written in a different way with identical output. Instead of using the calling parameters directly, replace them with local variables which receive the values of the calling parameters. This little correction leads the Query Optimizer to use the table statistics and create an optimal query plan based on an average of the real values passed as parameters.

A second option is to use WITH RECOMPILE in the procedure definition. SQL will not cache the query plan for this procedure, forcing it to be compiled each time it is executed.
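
A minimal sketch of that option, reusing the query from the example further below (the procedure name is my own):

CREATE PROCEDURE [DBO].[PROCEDURE_ALWAYS_RECOMPILE] (@FIRST int)
WITH RECOMPILE
AS
select name, [ProductSubcategoryID] from AdventureWorks.production.product where [ProductSubcategoryID] = @FIRST
GO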

Or you could execute the procedure with this hint:

EXEC Procedure 'value' WITH RECOMPILE

The compilation will be done this time only, and only for this procedure. The new execution plan created applies only to this call of the procedure.

  • Check this example of handling Parameter Sniffing on AdventureWorks database:

Procedure containing parameter sniffing:

CREATE PROCEDURE [DBO].[PROCEDURE_PARAM_SNIFFING] (@FIRST int)
AS
select name, [ProductSubcategoryID] from AdventureWorks.production.product where [ProductSubcategoryID] = @FIRST
GO

Execute from 2 different sessions:

EXECUTE [PROCEDURE_PARAM_SNIFFING] 2
EXECUTE [PROCEDURE_PARAM_SNIFFING] 7

The call of [PROCEDURE_PARAM_SNIFFING] with different parameters will use the same execution plan and scan a range of 43 rows, even though the second execution returns only one row.

 

Execution Plan – Parameter Sniffing

Procedure re-written to avoid parameter sniffing:

CREATE PROCEDURE [PROCEDURE_WITHOUT_PARAM_SNIFF] (@FIRST int)
AS
DECLARE @LOC_FIRST int = @FIRST
select name, [ProductSubcategoryID] from AdventureWorks.production.product where [ProductSubcategoryID] = @LOC_FIRST
go

Execute from 2 different sessions:

EXECUTE [PROCEDURE_WITHOUT_PARAM_SNIFF] 2
EXECUTE [PROCEDURE_WITHOUT_PARAM_SNIFF] 7

The call of [PROCEDURE_WITHOUT_PARAM_SNIFF] with different parameters will still use the same execution plan, but the estimated scan range will represent an average (7.97) over all values grouped by ProductSubcategoryID. The estimated number of rows is no longer related to the first compilation of this procedure; rather, it is calculated from the table statistics.

 

Execution Plan – Avoid Parameter Sniffing

5. Check Query Parallelism

This applies to multi processor servers. Quoting msdn: SQL Server provides parallel queries to optimize query execution and index operations for computers that have more than one microprocessor (CPU). Because SQL Server can perform a query or index operation in parallel by using several operating system threads, the operation can be completed quickly and efficiently. During query optimization, SQL Server looks for queries or index operations that might benefit from parallel execution.

The SQL Server configuration option that manages the number of CPUs used by the sqlservr process is called max degree of parallelism and is set at the server level. With this option enabled (sp_configure), the SQL engine will use multiple threads to manage the so-called parallel query execution plans. It can bring a big performance improvement for complex transactions which need to be processed all at once in multiple threads, taking advantage of the CPU power in place. But sometimes a simple query, which does not need to be divided into threads, takes longer with parallelism enabled than without it. This is all about testing and figuring out which option is best.

The degree of parallelism can be overridden for individual query executions if the MAXDOP hint is used. MAXDOP can be used to reduce the number of CPUs used in multi-processing. I found a nice explanation from Pinal Dave about MAXDOP that can help in starting the analysis: SQL Server Maxdop settings.
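
A minimal sketch of both levels of control (the value 4 and the query are only examples):

-- server-wide setting:
sp_configure 'show advanced options', 1
reconfigure
go
sp_configure 'max degree of parallelism', 4
reconfigure
go

-- per-query override:
select name, [ProductSubcategoryID]
from AdventureWorks.production.product
order by name
option (maxdop 1)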

To conclude this post, the top 5 steps in analyzing a query's bad performance before rushing to quick fixes are:

1. Analyze the Query Execution Plan
2. Analyze the query with the Tuning Advisor
3. Use Profiler to track query compilations/re-compilations frequency.
4. Avoid Parameter Sniffing
5. Check Query Parallelism

How to find out which Queries have non-reusable Query Plans?

Following a post on Stack Exchange asking how to check whether the query plan of an SQL statement was compiled/re-compiled or cached for the first time, I was driven to write a more detailed explanation on this.

This query will update 1000 rows.

Update t1 set c1 = 10

If I execute it for the first time, SQL first needs to create a query plan for it, compile it and store it in the plan cache. This plan will be used for subsequent executions. Plan re-use is desirable for OLTP workloads because re-creating the same plan (for similar or identical transactions) is a waste of CPU resources. What if the plan is not cached long enough, and each time we execute the update SQL creates the same query plan and compiles it over and over again? This is a performance problem that must be analyzed. How do we find these queries with many query plan compilations?

Profiler is my best friend, along with Performance Monitor when analyzing Poor Queries.

To find out how many compilations SQL Server is doing, you can monitor the SQLServer: SQL Statistics: SQL Compilations/sec counter. This counter measures how many compilations are performed by SQL Server per second. To get a better understanding, also monitor the SQLServer: SQL Statistics: Batch Requests/sec and SQLServer: SQL Statistics: SQL Re-Compilations/sec counters.
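
If you prefer to read the same counters from inside SQL Server, here is a sketch using the sys.dm_os_performance_counters DMV (the values are cumulative, so sample twice and compute the difference to get a per-second rate):

select counter_name, cntr_value
from sys.dm_os_performance_counters
where object_name like '%SQL Statistics%'
and counter_name in ('SQL Compilations/sec', 'SQL Re-Compilations/sec', 'Batch Requests/sec')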

Performance Monitor – SQL Compilations counters

A generally accepted rule is that if the SQL Compilations/sec counter measures over 100 compilations per second, you may be experiencing unnecessary compilation overhead.

Some of the reasons why query compilation/re-compilation occurs:

– Changes in Object Schema

– Index rebuild, drop, create

– Update Statistics

– Stored Procedure alter – can’t use the old Query Plan.

Batch Requests/sec should also be compared with SQL Compilations/sec. Say I have 500 batches per second and my compilation rate is 200/sec – that is 40% of all batches. So it is good to keep the compilation rate as low as possible.

So the rate only shows that the system is high on compilations, but how do we know which queries are not cached?

Profiler can be used to find this out. I will explain here how to extract data from Profiler while testing the performance of a known SQL statement, but you will probably want to find which query is getting compiled over and over again. The results can be overwhelming and hard to filter. I suggest you store your results in a table in a different database (I use a different SQL instance as well) than the one you are monitoring. It will allow you to select all the events for a single SPID. The SPID will be easy to extract from Profiler.

Below are the steps to catch the poor queries for which the query plan is compiled more often than it needs to be.

STEP 1:

To check the I/O and execution time, enable the statistics output:

set statistics time on
set statistics io on

STEP 2:

Start a Profiler new trace and include following events:

  • Performance
    • Performance Statistics – Indicates that a compiled plan was cached for the first time, re-compiled or evicted from plan cache.
  • Stored Procedures
    • RPC:Completed
    • SP:CacheHit – Indicates when a procedure is found in cache.
    • SP:CacheInsert – Indicates when an item is inserted into procedure cache.
    • SP:CacheMiss – Indicates when a Stored Procedure is not found in the procedure cache.
    • SP:CacheRemove – Indicates when an item is removed from procedure cache.
    • SP:Recompile – Indicates that the Stored Procedure is recompiling.
  • TSQL
    • SQL:BatchCompleted
    • SQL:BatchStarting

Check "Show All Columns" and select every column for the Performance Statistics event. Click Run.

MY EXAMPLE:

To achieve this post's goal I removed all elements from the plan cache for testing purposes (!!! never do this in Production !!!)

dbcc freeproccache
go
dbcc dropcleanbuffers
go

After starting Profiler, I will run the query below:

set statistics time on
set statistics io on
Update t1 set c1 = 10

STEP 3:

Let it run for some time (10 min) and then identify the SPIDs of the rows where Performance Statistics, SP:CacheMiss and SP:CacheInsert are displayed. Filter those SPIDs that contain all three events.

You have now basically identified the process SPIDs for which the query plan was not found in the cache, was compiled, and was inserted into the cache.

MY EXAMPLE:

Execution Output:

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 90 ms.
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.
Table 't1'. Scan count 1, logical reads 17738, physical reads 3, read-ahead reads 1000, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 234 ms,  elapsed time = 2454 ms.

(1000 row(s) affected)

Profiler Output:

Profiler – Stored Procedure not in cache

The first indication that the query is not cached is given by the first occurrence of the Performance Statistics event class, which has a duration of 86 ms used to compile the statement within an ad-hoc query (EventSubClass: 2 – Batch:Plan). To see all types of EventSubClass descriptions, look here in the MSDN documentation. Afterwards, SQL checks whether the statement exists in the procedure cache – SP:CacheMiss shows that it is not there. The statement is then inserted into the procedure cache: SP:CacheInsert. The actual duration of the statement is 2544 ms. In the SQL statistics we can see a small difference between the compilation time reported by the time statistics (90 ms) and by Profiler (86 ms).

STEP 4:

I will execute the update again, now that it is cached, to see the difference.

Profiler – Stored Procedure found in cache

I no longer see the Performance Statistics event class, so the compiled plan was found in the cache. Moreover, I see a new event, SP:CacheHit, which confirms that the Query Optimizer found the statement in the procedure cache. The difference in duration is also obvious: 1223 ms versus 2544 ms. Same reads and writes, which is normal. CPU usage: 171 versus 234.

So SQL uses fewer physical resources when the query plan does not have to be compiled again. This is an important fact when designing web applications as well as when writing SQL code. The DBA should be able to report excessive SQL compilations and work with the DEV team to cache as much as possible.

Some good documents on this subject can be found also here:

http://www.sql-server-performance.com/articles/per/optimizing_sp_recompiles_p1.aspx

http://www.mssqltips.com/tip.asp?tip=1661

http://technet.microsoft.com/en-us/library/cc293623.aspx

http://www.sql-server-performance.com/tips/sql_server_performance_monitor_coutners_p1.aspx

Handling SQL Deadlocks – One Resource Conflict

From a DBA point of view, deadlocks occurring in a Production environment are probably the trickiest and most time-consuming problems to fix. Every time I came across one, it took me a while to understand where it was coming from, why, and how to avoid that particular deadlock. This post will go through some types of deadlocks and how to deal with them.

Here I will analyze a Deadlock caused by an After Insert trigger that acts like a constraint.

Usually people see the deadlock error only after the deadlock has already happened – thrown by client applications or DB monitoring tools (e.g. SQL Diagnostic Manager from Idera).

1. Provoke Deadlock:

a) Create a table with a primary key, and a trigger on it that checks that the same Security Number is not inserted for a new client.

CREATE TABLE [dbo].[Clients]
([clientid] [int] NOT NULL,[name] [nvarchar](50) NULL,[SecurityNR] [int] NULL,
 CONSTRAINT [PK_Clients] PRIMARY KEY CLUSTERED 
([clientid] ASC)) ON [PRIMARY]
GO
CREATE TRIGGER [dbo].[SecurityNR] ON [dbo].[Clients]
AFTER INSERT
AS
DECLARE @SecNR Int
SET @SecNR  = (select SecurityNR from Inserted)
IF EXISTS (SELECT count (1)
           FROM Clients c
           where c.SecurityNR = @SecNR having count (1) >1)
BEGIN
RAISERROR ('This client''s SecurityNR is assigned to another Client', 16, 1);
ROLLBACK TRANSACTION;
RETURN
END;

Let's assume that there are many concurrent transactions running on the server that insert one to many new clients inside one transaction.

b) Start a Transaction and Insert one client and do not commit it.

BEGIN TRANSACTION
INSERT INTO [test2].[dbo].[clients]       
 ([clientid],[name],[SecurityNR]) 
VALUES (3,'Chris',12345680)
GO

c) Start a second Transaction and Insert one client and do not commit it.

BEGIN TRANSACTION
INSERT INTO [test2].[dbo].[clients]       
 ([clientid],[name],[SecurityNR]) 
VALUES (4,'Lya',12345682)
GO

At this point the second Transaction will be blocked by the first one.

d) Return to the first transaction, insert another client, and commit the transaction.

BEGIN TRANSACTION
INSERT INTO [test2].[dbo].[clients]       
 ([clientid],[name],[SecurityNR]) 
VALUES (5,'Mary',12345681)
GO

Now, the Deadlock will occur.

Transaction (Process ID 56) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

2. Start Debugging:

Look at the error you caught. If you have a DB monitoring tool which captures deadlocks (I have worked with Idera and I recommend it), then you will probably already have some information about the deadlocked process.

Alert displayed in IDERA:

A deadlock occurred on server [SERVER NAME]. 

The deadlock victim was spid 58 with application name '[APPLICATION NAME]' 
by user '[USER NAME]' on host '[MACHINE NAME]'.
Last command issued:
Proc [Database Id = 13 Object Id = 1157579162]

Whether or not you have a monitoring tool, the next step is to activate the deadlock trace flag and analyze its output from the SQL Server Log after you encounter another deadlock:

-- SQL 2005 - 2008: 
DBCC TRACEON (1222, -1)

-- SQL 2000 :
DBCC TRACEON (1204, -1)

3. Collect Deadlock Trace Output

Here is the important info that must be extracted from the Log file.

A: deadlock-list

  • deadlock victim = process1816da088

B: process-list

  • process id = process1816da088, waitresource = KEY: 7:72057594040483840 (03000d8f0ecc), lockMode = S, spid = 53, clientapp = Microsoft SQL Server Management Studio – Query, loginname = irina-VAIOirina
    • frame procname = test2.dbo.SecurityNR
      IF EXISTS (SELECT count (1) FROM Clients c where c.SecurityNR = @SecNR having count (1) >1)
    • frame procname = adhoc
      INSERT INTO [test2].[dbo].[clients]([clientid],[name],[SecurityNR]) values(4,'Lya',12345682)
  • process id = process1816da508, waitresource = KEY: 7:72057594040483840 (0400b4b7d951), lockMode = S, spid = 54, clientapp = Microsoft SQL Server Management Studio – Query, loginname = irina-VAIOirina
    • frame procname = test2.dbo.SecurityNR
      IF EXISTS (SELECT count (1) FROM Clients c where c.SecurityNR = @SecNR having count (1) >1)
    • frame procname = adhoc
      INSERT INTO [test2].[dbo].[clients]([clientid],[name],[SecurityNR]) values(5,'Mary',12345681)

C: resource-list

  • keylock dbid = 7, objectname = test2.dbo.Clients, indexname = PK_Clients, mode = X
    • owner id = process1816da508, mode = X
    • waiter id = process1816da088, mode = S, requestType = wait
  • keylock dbid = 7, objectname = test2.dbo.Clients, indexname = PK_Clients, mode = X
    • owner id = process1816da088, mode = X
    • waiter id = process1816da508, mode = S, requestType = wait

4. Analyze the Deadlock Trace Output

In A: deadlock-list I see the deadlock victim's process number – process1816da088.
Lower, in B: process-list, I can see the SQL statements of the 2 processes caught in the deadlock:

  • process1816da088 – spid 53 – executed:

    INSERT INTO [test2].[dbo].[clients]([clientid],[name],[SecurityNR]) values(4,'Lya',12345682)
    IF EXISTS (SELECT count (1) FROM Clients c where c.SecurityNR = @SecNR having count (1) >1)

  • process1816da508 – spid 54 – executed:

    INSERT INTO [test2].[dbo].[clients]([clientid],[name],[SecurityNR]) values(5,'Mary',12345681)
    IF EXISTS (SELECT count (1) FROM Clients c where c.SecurityNR = @SecNR having count (1) >1)

Lower, in C: resource-list:

process1816da508 has an X (exclusive) lock on PK_Clients
process1816da088 waits for an S (shared) lock on PK_Clients

From these lines I can get the name of the first resource demanded by both processes: the primary key PK_Clients of the Clients table.

The second resource demanded by both processes seems to also be the primary key PK_Clients of the Clients table. It can be extracted from the other 2 locks held:

process1816da088 has an X (exclusive) lock on PK_Clients
process1816da508 waits for an S (shared) lock on PK_Clients

How can this be possible? In all the documentation a deadlock is described as a conflict over at least 2 different resources.

Now I have the same resource on which each of the concurrent processes holds an X lock and claims an S lock. The logical explanation relates to the fact that SQL would take row locks (the default locking level) if the clustered index didn't exist. With row locks the transactions would not block each other, because each one is modifying a different row of the table. The presence of the PK makes it a little different, but not completely.

After each insert SQL has to perform a Clustered Index Insert with the new value of the ClientID column. So it requests an exclusive (X) key lock on PK_Clients pointing to a different row of the clustered index. This still allows other concurrent transactions to update the index. So the first conclusion is that the deadlock problem happens inside the trigger.

After each insert the trigger is fired – the select from the trigger does an index scan on PK_Clients, thereby claiming shared locks. How do I know all this? Looking at the execution plan of the trigger code, I see a Clustered Index Scan to extract the SecurityNR value.

Shared locks read only committed data. It means that the second transaction will wait until the row from the first transaction is committed. But the first transaction has one more insert coming, which behaves exactly the same – only now it claims a shared lock on a PK_Clients key that is already exclusively locked by the second transaction: the row inserted by the second transaction.

This is how the locking occurred in my example:

Deadlock schema

 

5. Fix the Problem

There are several ways to fix this problem:

Avoid the index scan on PK_Clients when the trigger is fired –> create an index on the SecurityNR column. The trigger will then perform an index seek on the new non-clustered index instead of scanning PK_Clients.
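
A minimal sketch of such an index (the index name is my own):

CREATE NONCLUSTERED INDEX [ix_Clients_SecurityNR] ON [dbo].[Clients]
([SecurityNR] ASC)
GO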

Putting NOLOCK hints on the select inside the trigger would fix the deadlock as well, but this is very risky because you can get dirty reads and end up with logically corrupt data.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE – an option set per session/transaction. This is the highest isolation level and will prevent concurrent inserts, causing a lot of blocking. More on this here.

SQL 2005/2008 – Configure Memory (part 2)

How can we be sure that SQL is low on memory? And what should the max memory value be set to?

Here is a short guide on how to analyze memory needs for an Instance of SQL Server:
 
Use Perfmon Counters to understand SQL Memory needs
 
Considering that you have not yet set the maximum SQL Server memory limit, start Perfmon and include the 3 counters below while the system is under its heaviest use. A stress test would be good enough in order to catch the highest/lowest values.

  1. Memory: Available MBytes – reports the available physical memory. This gives a good indication of whether you are running low on physical memory. If the machine is running low on physical memory, paging will either be happening or will soon begin to happen. This is useful data for diagnosing OOM issues. If this counter is greater than 10% of the actual RAM in your machine, then you probably have more than enough RAM and don't need to worry.
  2. Process (sqlservr): Private Bytes – indicates the amount of memory being used that can't be shared with other processes. You should monitor this if you want to see how much memory your process uses. This counter also gives a good indication of how your application impacts the entire system – using a lot of private bytes has a big impact on the machine, since the memory cannot be shared with other processes.
  3. SQLServer: Buffer Manager: Page Life Expectancy – normal value: >= 300 sec. According to Microsoft, 300 seconds is the minimum target for page life expectancy. If the buffer pool flushes your pages in less than 300 seconds, you probably have a memory problem.

a) SQL is short on memory
 
The first indication that SQL/the machine is low on memory is that the SQL process will take as much memory as it finds free on the system – until the system runs out of memory. So check the first counter from the list. It should be over 10% of total RAM. If it's not, look at the third counter to see whether SQL is really hungry for memory or just very greedy. If Page Life Expectancy is above 300 sec but SQL still requires more memory, then it is time to limit SQL memory so that at least 1 GB is left for other OS processes. If, on the contrary, Available MBytes is very low and the Page Life Expectancy counter is below 300 sec, then you are actually forcing the system to cope with less memory than it needs. Add a few GB of RAM and check again.
 
b) SQL has sufficient memory
 
The second case is the easiest, meaning that SQL didn't grab all of the available memory and the system still has enough memory for its own processes. I usually take the value from Private Bytes and add a few GB (1-3) to set the maximum memory limit in the SQL engine settings.
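
A minimal sketch of setting that limit (the 6144 MB value is only an example – size it from your own Private Bytes measurement):

sp_configure 'show advanced options', 1
reconfigure
go
sp_configure 'max server memory (MB)', 6144
reconfigure
go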

In short, Page Life Expectancy is a very good indication that SQL has less memory than it would be comfortable with, and I suggest collecting this value periodically along with the other counters mentioned.

Basically, when PLE goes under 300 sec it means the activity on the disk will grow (more paging will occur) in order to fetch pages which should still have been in the buffer pool. At this point CPU usage will also grow in order to manage the high disk activity. So don't jump to conclusions if you see high disk activity or increasing CPU. Memory, disk and CPU are very tightly related.

In addition you could add these counters to check CPU and Disk activity:

  • Processor(_Total)\% Processor Time – determines the percentage of time the processor is busy by measuring the percentage of time the thread of the Idle process is running and then subtracting that from 100 percent. If you begin to see values of 70 percent or more for the Processor\% Processor Time counter, investigate your processor's activity further, as follows:
    1. Examine System\Processor Queue Length.
    2. Identify the processes that are running when Processor\% Processor Time and System\Processor Queue Length values are highest.
  • PhysicalDisk(_Total)\Current Disk Queue Length – used to determine the number of I/O requests queued for service. Notice that this might overstate the true length of the queue, because the counter includes both queued and in-service requests. If the value of Current Disk Queue Length exceeds twice the number of spindles, then you are likely developing a bottleneck. With a volume set, a queue that is never shorter than the number of active physical disks indicates that you are developing a bottleneck. Normally, for a disk with two spindles the value should be at most 2.