7 Common Design Errors that can be avoided

Being a DBA usually means fixing a lot of design issues when confronted with performance problems. If the design is planned well ahead, the DBA doesn't have much to do at the design stage. In the absence of a Database Architect, developers are preoccupied mostly with functional scripting rather than functional and optimal scripting. It is not the developers' fault: a database design should be performed together with a Database Specialist who can figure out the possible issues at the design phase. A lot of firms don't invest in this position and rely instead on the developers' professional experience. I came up with a list of frequent developer errors that should be avoided in order to keep SQL Server from thinking twice or more before doing a simple operation.

  1. When filtering results in the WHERE clause with equality or inequality operators (=, LIKE, <>), pay attention when comparing non-Unicode column types (varchar, char, text) against values of a Unicode type (nvarchar, nchar, ntext). Doing so makes SQL Server add a CONVERT_IMPLICIT operation to the execution plan, which is pointless extra work. Example:
    DECLARE @table AS TABLE
        (varchar_column varchar(50) PRIMARY KEY)
    INSERT INTO @table VALUES ('Maria')

    -- Matching types: the varchar parameter compares directly to the varchar column
    DECLARE @varchar_param AS varchar(50) = 'Maria'

    SELECT *
    FROM @table
    WHERE varchar_column = @varchar_param

    -- Mismatched types: comparing the varchar column to an nvarchar parameter
    -- adds a CONVERT_IMPLICIT operation to the execution plan
    DECLARE @Nvarchar_param AS nvarchar(50) = N'Maria'

    SELECT *
    FROM @table
    WHERE varchar_column = @Nvarchar_param
    
    

  2. For the same reason as above (CONVERT_IMPLICIT), always JOIN on columns of the same data type, not on different ones that SQL Server can auto-convert without an explicit CAST or CONVERT in your code. Here are all implicit and explicit conversions done by SQL Server: Data Type Conversion.
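    A minimal sketch of the kind of join to avoid; the Orders/Customers tables and their column types are hypothetical:

    -- o.CustomerCode is varchar(20), c.CustomerCode is nvarchar(20): mismatched types
    SELECT o.OrderId
    FROM dbo.Orders o
    JOIN dbo.Customers c
        ON o.CustomerCode = c.CustomerCode  -- CONVERT_IMPLICIT is applied here
    -- The fix is to alter one of the columns so both sides share the same type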
  3. Avoid using functions in WHERE clauses and JOIN conditions. If a function is called on a column that has an index on it, then the index might not be used as it should be. If you do use one, analyze the execution plan to make sure optimal index usage is made. One example of a bad usage of a function inside a WHERE clause follows below.
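    A minimal sketch, assuming a hypothetical dbo.Orders table with an index on OrderDate, together with a sargable rewrite:

    -- Non-sargable: the function wrapped around the column blocks an index seek
    SELECT OrderId
    FROM dbo.Orders
    WHERE YEAR(OrderDate) = 2012

    -- Sargable rewrite: a plain range predicate lets the index on OrderDate be used
    SELECT OrderId
    FROM dbo.Orders
    WHERE OrderDate >= '20120101'
      AND OrderDate < '20130101'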

  4. The NOLOCK hint is overused. Please be aware that NOLOCK causes dirty reads: your query can read uncommitted data from other transactions, data that might later be modified or rolled back, and you go along with it anyway. In the end, the transaction might act on data it should never have seen. An alternative to the NOLOCK hint is to enable Snapshot Isolation in SQL Server, sketched below.
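    A minimal sketch of the alternative; the database name is hypothetical:

    -- Enable snapshot isolation once per database
    ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON

    -- Readers then see the last committed version of the rows,
    -- avoiding dirty reads without taking shared locks
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT
    BEGIN TRANSACTION
    SELECT OrderId, Status FROM dbo.Orders
    COMMIT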
  5. Some stored procedures are created with parameters whose datatype differs from that of the assigned value. The same applies to variable declarations. Keep the datatype identical on both sides of every comparison or assignment, as in the sketch below.
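    A minimal sketch; the procedure, table, and column names are hypothetical:

    -- The parameter type matches the varchar(50) column it is compared against;
    -- declaring it as nvarchar(50) would force an implicit conversion instead
    CREATE PROCEDURE dbo.GetCustomerByName
        @Name varchar(50)
    AS
    BEGIN
        SELECT CustomerId, Name
        FROM dbo.Customer
        WHERE Name = @Name
    END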
  6. You can choose some datatypes in favor of others considering storage and the data to be inserted. There is no point in using INT if your values will only go up to 255, just as there is no point in using INT when you expect a lot of data, beyond its limit of roughly 2 billion. An ALTER COLUMN later on is not a valid solution. This means you should use the TINYINT datatype in the first case and BIGINT in the second.
    Also, if you will not handle strings with special characters or Chinese/Japanese words, then you should use one of the non-Unicode datatypes (char, varchar, text) rather than the Unicode ones, because of the doubled storage size. The same string stored in NVARCHAR format will be twice as big as in VARCHAR format, because SQL Server uses an extra byte to encode each character. That causes more disk writes and therefore more disk reads, and in the end less performance.
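    A quick way to check the doubled storage, as a minimal sketch:

    DECLARE @v varchar(50) = 'Maria'
    DECLARE @n nvarchar(50) = N'Maria'
    SELECT DATALENGTH(@v) AS varchar_bytes,   -- returns 5
           DATALENGTH(@n) AS nvarchar_bytes   -- returns 10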

  7. Usage of Common Table Expressions: ;WITH (CTE). They are very simple to write and easy to understand, but there is something to think about before using them everywhere.
    First of all, the CTE's table definition must be evaluated for each following operation that references it, each time. The data is not evaluated once for re-use later on, and it exists only inside the scope of the query. SQL Server treats the whole expression more like a view held in memory; it is not materialized like a temporary table. The advantage of a #temp table is that the data gets evaluated once and the results are then re-used in the join.
    Second, you cannot create indexes on a CTE's result set used in subsequent joins or filter operations. In conclusion, a CTE is advisable when you have a small data set to use in joins; when you want to join millions of rows, a CTE is not a good option. Choose temporary tables instead; in such cases they will usually run faster (see the sketch below).
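    A minimal sketch of both approaches; the Orders/Customers tables are hypothetical:

    -- CTE version: the definition is re-evaluated wherever the CTE is referenced
    ;WITH BigOrders AS
    (
        SELECT CustomerId, SUM(Amount) AS Total
        FROM dbo.Orders
        GROUP BY CustomerId
    )
    SELECT c.Name, b.Total
    FROM dbo.Customers c
    JOIN BigOrders b ON b.CustomerId = c.CustomerId

    -- Temp table version: evaluated once, and it can be indexed for the join
    SELECT CustomerId, SUM(Amount) AS Total
    INTO #BigOrders
    FROM dbo.Orders
    GROUP BY CustomerId

    CREATE CLUSTERED INDEX IX_BigOrders ON #BigOrders (CustomerId)

    SELECT c.Name, b.Total
    FROM dbo.Customers c
    JOIN #BigOrders b ON b.CustomerId = c.CustomerId

    DROP TABLE #BigOrders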

  1. 1 & 2) Not all implicit/explicit conversions are bad. First example will generate an implicit conversion of `Maria` to NVARCHAR data type which isn’t a problem from performance point of view. More, I think SQL Server will choose the right data type for `Maria` -> N`Maria` (NVARCHAR). But WHERE column_type_varchar = N’Maria’ can be a problem if the column’s collation is a SQL collation.
    3) There are two exceptions: CONVERT(DATE, DateTimeColumn) and computed columns, cases which can generate execution plan w. Index Seek if there are indexes.
    6) The Romanian language has also some chars that need special attention: DECLARE @x VARCHAR(100) = ‘AĂÂIÎSȘTȚ’; SELECT @x;DECLARE @y NVARCHAR(100) = N’AĂÂIÎSȘTȚ’; SELECT @y;
    7) Recursive CTE aren’t a bad idea but the performance depends (of course) on indexing and how the join is written.

  2. Very good point, and thanks for your feedback concerning the implicit convert between nvarchar and varchar. You're right, the issue is when comparing varchar column datatypes to nvarchar values, and I have repaired the mistake. At 6, I think I covered the idea that if you do have strings that contain special characters you should use the nvarchar datatype ("if you will not handle strings with special characters or Chinese/Japanese words…"). At 7, I have seen a CTE running for joins between millions of rows, and when I switched to temp tables there was a boost in performance from 30 minutes to 20 seconds. Just saying that it should be used carefully.

  3. Dear Irina,

    referring to point 1 of your blog post: what SQL Server version are you using? The query optimizer has changed significantly in many aspects.

    The described INDEX SCAN will be avoided by a type conversion [COMPUTE SCALAR] through the internal “GetRangeThroughConvert” operation.

    I checked your example with 2005, 2008R2 and 2012.
    The described behaviour appeared only in 2005.

    Best to you and your great blog, Uwe

    Uwe Ricken
    MCM: SQL Server 2008

  4. Hello, I am pretty sure the example from point one behaves exactly as described in the execution plans shown in the pictures, on 2008 R2 and 2012. I double-checked it now 🙂 Can you explain what you mean by “INDEX SCAN will be avoided by a type conversion [COMPUTE SCALAR] by internal ‘GetRangeThroughConvert’ operation”? Thanks for reading the blog!

  5. Hi Irina,

    If you use

    SET STATISTICS PROFILE ON;
    GO

    before your script you get the execution steps as output.
    For the second example (with “invalid” type) you get the following operation as StmtText:

    |–Compute Scalar(DEFINE:(([Expr1005],[Expr1006],[Expr1004])=GetRangeThroughConvert([@Nvarchar_param],[@Nvarchar_param],(62))))

    Paul White has written a pretty fine article about that new function:

    http://sqlblog.com/blogs/paul_white/archive/2012/01/18/dynamic-seeks-and-hidden-implicit-conversions.aspx

    Best, Uwe
    MCM: SQL Server 2008

  6. Your comment made me curious, and I checked this script on several servers running different versions. The strange thing is that indeed I get the Index Seek with the Compute Scalar operation added, but at the same time I get different execution plans on servers of the same version: one is a default instance, the other a named instance, on the same physical machine. Here is the little video demonstration, with and without SET STATISTICS PROFILE ON, which logically should not change the execution plan. I think the execution plan can differ between servers of the same version; SQL might think differently depending on the number of CPUs, memory assigned… So, I stick to the idea that assigning nvarchar values to varchar columns is not a good idea 🙂 Thank you so much for your interest!
