Tag Archives: choosing right datatype

7 Common Design Errors – that can be avoided

Being a DBA usually means fixing a lot of design issues when confronting with performance problems. If the design is done with good planning ahead, then the DBA doesn’t have much to do at the design part. Developers, in the absence of a Database Architect are preoccupied mostly about functional scripting rather that functional + optimal scripting. It is not developers fault, it is because a database design should be performed together with a Database Specialist who can figure out the possible issues at the design phase. A lot of firms don’t invest in this position but rely on developer’s professional experience. I came up with a list of possible and frequent Developer errors that should be avoided in order to prevent SQL thinking twice or more before doing a simple operation.

  1. When filtering the results by the WHERE clause, in equality or unequality operators (=, Like, <>), pay attention when assigning to Non-Unicode types of columns (varchar, char, text) values of Unicode type (nvarchar, nchar, ntext). Doing so, SQL will add a CONVERT_IMPLICIT operation at the execution plan which is a pointless thing to do. Example:
     (varchar_column varchar(50) PRIMARY KEY)
     INSERT INTO @table VALUES ('Maria')
    DECLARE @varchar_param AS varchar(50) = 'Maria'
    SELECT *
    FROM @table
    WHERE varchar_column = @varchar_param
    DECLARE @Nvarchar_param AS Nvarchar(50) = N'Maria'  
    SELECT *
    FROM @table
    WHERE varchar_column = @Nvarchar_param

  2. Based on the same reason as the upper one (CONVERT_IMPLICIT), always use in JOIN operations same types of columns, not different ones which SQL can auto convert without an explicit CAST or CONVERT in your code. Here are all implicit and explicit conversions done by SQL: Data Type Conversion
  3. Avoid using In WHERE clauses and JOIN conditions – functions. If a function is called for column that has an index on it, than the index might not be used as it should be. Or if you do, analyze the execution plan so that optimal index usage is made. The below query is one example of a bad usage of functions inside WHERE clause:

  4. NOLOCK hint is overused. Please be aware that NOLOCK causes dirty reads. This means that at the beginning of the transaction you can retrieve one set of data which by the end of the transaction might get modified but you don’t care about it and go along with it. In the end, the transaction conditions might or might not apply to some data that shouldn’t be touched after all. An alternative to NOLOCK hint is to enable Snapshot Isolation in SQL Server.
  5. Some Stored Procedures are created with parameters that have the datatype different than the assigned value. Same is valid for Variables declaration. Keep the datatype identical for each type of equality operation or assignment.
  6. You can choose some datatypes in favor of other considering: storage and data inserted. Sometimes there is no point in using INT if you will have values up to 255, same as using INT when you expect a lot of data – more that 2 billion. A Column Alter later on is not a valid solution. This means you should use TINYINT datatype in first case or BIGINT in second.
    Also, if you will not handle strings with special characters or chinese/japanese words, than you should use any of the non-unicode datatypes: char, varchar, text rather than the unicode ones because of the doubled size in storage. For example same string stored as NVARCHAR format will be twice as big the VARCHAR format, only because SQL will use one extra byte to encode each character. It causes more disk writes and therefore more disk reads, in the end less performance.

  7. Usage of Common Table Expression: ;WITH (CTE). It is very simple to write and easy to understand but, there is something to think about before using it everywhere.
    First of all, all recursive table definitions must be evaluated for each following operation, each time. The data is not evaluated once for re-use later on and is used only inside the scope of the query. The SQL treats the whole expression more like a view and is held in memory; it is not treated like a temporary table. The advantage with the #temp table is that, the data gets evaluated once and then the results are re-used in the join.
    Second, you can not create Indexes on recursive table definitions used in following Joins or filter operations. In conclusion, the CTE expression is advisable to be used when you have a small data set to use in joins but when you want to join millions of rows, than CTE is not a good option. You should choose temporary tables. It will always run faster.