Migrate MySQL instance from one machine to another using Replication (5.6)

MySQL is an opensource Database Management System and offers a very flexible platform from management point of view. It has a lot of options when the discussion comes to how to backup and restore a MySQL instance. This can be done in several ways, depending what you are after:

  • Consistency before speed – use mysqldump
  • Speed before consistency – use LVM backups or Percona toolkit.

I used mysqldump and Percona toolkit and I saw the advantages to each one of these methods.

mysqldump: can guarantee the consistency of the backup and the data behind it. It will take a lot of time because it practically creates the scripts for each Database Object (structure and data).

Percona: integrated solution for backup/recovery that makes a folder backup and can include corruption inside database objects because it is not checking the consistency of the objects. This method is faster and can be a good option if the database has been checked for corruption AND FIXED.

A mysqld (instance) can have one or more databases. It has a config file which can contain the configuration options for more MySQL instances. In this post I will explain how to move a machine with one MySQL instance, using the above mentioned methods.

In my case, I had a server with a MySQL instance (5.6 version) on it and I had to move the servers’s databases to another machine. This was a DEV server and there where 2 reasons why I chose to use MySQL Replication.

I will use hostname1(current) and hostname2 (new)

  1. I didn’t want to lock the DEV team from doing changes on db side, while the dump is being made and restored
  2. The specification was to keep the same MySQL server name on hostname2 that was used in connection strings and for my case it easier to down interface from hostname1 and put it up on hostname2 with little disruption of activity.

1. mysqldump method

1.1. prepare the MySQL instance on hostname2 and config file to match the hostname1 MySQL instance.

  • install the right version of MySQL 5.6.
  • change the config file to match the source MySQL my.cnf file. validate that on the new server you have the directories for data, innodb files, logging and startup variables if specified (ex: socket and pid).
  • create MySQL user if not exists
  • initialize MySQL database Data Directory through mysql_install_db

    It is important to make sure that the database directories and files are owned by the mysql login account so that the server has read and write access to them when you run it later. If you are running the command as root, specify
    the –user that will be used to run mysqld. The –datadir option is mandatory. mysql_install_db creates the mysql directory which will contain MySQL database files and should NOT exist.

    /usr/local/mysql-VERSION/scripts/mysql_install_db --user=mysql --datadir=/path-to/datadir/ 
  • check datadir permissions. change owner&group if necessary.
    ls -la /path-to/datadir/
    chown -R mysql:mysql /path-to/datadir/
  • start mysql server
  • /etc/init.d/mysql.server start

1.2. mysqldump with master-data argument on hostname1

when master-data is used with value=2 it causes the dump output to include a CHANGE MASTER TO statement that indicates the binary log coordinates (file name and position) of the dumped server. These are the master server coordinates from which the slave should start replicating after you load the dump file into the slave.

mysqldump --host=hostname1 --port=3306 --all-databases --add-drop-database --add-drop-table --master-data=2 --single-transaction > /drive_location_for_dump/backupfile.sql

1.3. copy and restore the dump on a new installed MySQL server from hostname2.
use rsync or scp – whatever you prefer.

Run from hostname1:

scp /drive_location_for_dump/backupfile.sql  youruser@hostname2:/drive_location_for_dump/

Run on hostname2:

mysql -h hostname2 -P3306 < /drive_location_for_dump/backupfile.sql

1.4. link the hostname2 MySQL to the hostname1 for a couple of hours/days in master – slave setup

Run from shell on hostname2:

head -n2000 /drive_location_for_dump/backupfile.sql | grep "CHANGE MASTER TO"

copy the line and execute in mysql:

CHANGE MASTER TO MASTER_HOST='hostname1', master_port = 3306, MASTER_LOG_FILE='binlog file' MASTER_LOG_POS=position
start slave;
show master status \G

1.5. make the flip between master and slave and promote slave as future master.

This step can be done by several methods and it depends on your network configuration.

2. Percona toolkit method

2.1. prepare the MySQL instance on hostname2 and config file to match the hostname1 MySQL instance.

Repeat the 1.1. steps from above method, EXCEPT the part with initializing MySQL database. Because you are doing a folder to folder copy MySQL database will be copied from hostname1 to hostname2.

2.2. Download and install the right version of Percona xtrabackup tool on hostname1 and hostname2
they have release notes for each version and you can check if your MySQL version is supported by one version or another.


Next, you can find all the steps to setup a slave to an existing master, explained here (https://www.percona.com/doc/percona-xtrabackup/2.1/howtos/setting_up_replication.html)

sysprocesses table – a handy tool for monitoring when you are not there

There are several SQL native queries to collect information about the server while confronting a bottleneck and I mean:

  • sp_who2 active – check active processes and if there are blockages
  • dbcc inputbuffer (spid) – see what the process behind a specific spid number is doing. This command is similar to using the ::fn_get_sql function that converts the sql_hadle into humanly readable text.
  • querying sys.dm_exec_requests and the other related DMVs

But all of these are practically transforming in a better form the output from master.dbo.sysprocesses table.
This table has a lot of useful information, but some of it may not be used at its real value by all of you. I want to show how you can use most of it and really understand its contents. I will not speak about the obvious columns which are displayed by the sp_who2 procedure call, such as: SPID, Status, Login, BlkBy, DBName, Command, CPUTime, DiskIO, LastBatch, ProgramName. There are all very nice and give you fast overview over your busy system if executed when the problem is noticed. I personally use it each time somebody is complaining about a heavy workload. But what if the workload happens when you are not at work? or not in front of a computer and receiving alerts? or maybe, you found that sometimes between 1 AM and 4 AM during the off hours, your system gets very busy, and you don’t have any extra expensive tool like (Diagnostic manager, Spotlight, Redgate etc) installed that will show you next day what really happened when your system was overused? So how do you explain to your IT director what happened and what can be done to overcome the future problems?

Here I will show you what you can do. It is free and I will be using only SQL tools, meaning the output from sysprocesses.
So, basically when you execute sp_who2 you will not see the underlining query that is behind a specific SPID. I bet you know that, this information is held inside sysprocesses under the same SPID number but in binary form. something like: 0x01000100099BA83AA0E4BEAD3100000000000000. Also you probably know that you can retrieve the sql text using the above mentioned function ::fn_get_sql. There are 2 other columns that can give you the exact statement if the statement that is executing and causing problems is inside a Stored Procedure. Then, you’ll know what part of your stored procedure needs more attention. These are stmt_start and stmt_end. They mark the starting and ending offset of the current SQL statement for the specified sql_handle.

What I am trying to say is that you could make a job inside the SQL Agent that will start at a specified hour and execute a stored procedure containing a Time based Cursor (from 1 AM to 4 AM) with a waiting delay of 3-4 seconds, that will insert the output from sysprocesses into a user table, available for you to query and see the history of all queries. Be careful though, when scripting the algorithm of the insert, because some SPID numbers can execute at a given time one query and after completion the SPID will be taken by another thread, other database. In order to not have a bunch of duplicate values and make the history table very hard to query, my advice is to insert once all the SPID numbers found in sysprocesses which don’t exist in your table, then that start the cursor based on existing SPID numbers from your table and insert only changed values for same SPID in: Status, loginame, hostname, cmd, dbid, and blocked from sysprocesses.

Example: MUST BE FILLED with right values.

CREATE TABLE [dbo].[your_table](
	[date] datetime,
	[spid] [smallint] NOT NULL,
	[kpid] [smallint] NOT NULL,
	[blocked] [smallint] NOT NULL,
	[waittype] [binary](2) NOT NULL,
	[waittime] [bigint] NOT NULL,
	[lastwaittype] [nchar](32) NOT NULL,
	[waitresource] [nchar](256) NOT NULL,
	[dbid] [smallint] NOT NULL,
	[uid] [smallint] NULL,
	[cpu] [int] NOT NULL,
	[physical_io] [bigint] NOT NULL,
	[memusage] [int] NOT NULL,
	[login_time] [datetime] NOT NULL,
	[last_batch] [datetime] NOT NULL,
	[ecid] [smallint] NOT NULL,
	[open_tran] [smallint] NOT NULL,
	[status] [nchar](30) NOT NULL,
	[sid] [binary](86) NOT NULL,
	[hostname] [nchar](128) NOT NULL,
	[program_name] [nchar](128) NOT NULL,
	[hostprocess] [nchar](10) NOT NULL,
	[cmd] [nchar](16) NOT NULL,
	[nt_domain] [nchar](128) NOT NULL,
	[nt_username] [nchar](128) NOT NULL,
	[net_address] [nchar](12) NOT NULL,
	[net_library] [nchar](12) NOT NULL,
	[loginame] [nchar](128) NOT NULL,
	[context_info] [binary](128) NOT NULL,
	[sql_handle] [binary](20) NOT NULL,
	[stmt_start] [int] NOT NULL,
	[stmt_end] [int] NOT NULL,
	[request_id] [int] NOT NULL

declare @datefrom as datetime 
declare @dateto as datetime
set @datefrom = getdate () 
set @dateto = dateadd(second, 120, @datefrom) - the loop will run for 2 minutes.

WHILE @datefrom < @dateto 


waitfor delay '00:00:03'

Insert into [yourtable] (date, column_list) select getdate(), column_list from sysprocesses
where SPID not in (select SPID from your_table) -- you can add filters for background processeses. 

DECLARE spid_cursor for 
select SPID from sysprocesses where SPID in (select SPID from your_table) 

[...] - open, fetch, while clauses 

Insert into your_table () 
    from sysprocesses a 
    where spid = @spid and 
    (a.status not in (select status from your_table b where b.spid=@SPID) or 
    [...] - the rest of the columns that must be compared 

[...] fetch next , close deallocate clauses 

set @datefrom = GETDATE ()

But there is one catch of course. sql_handle contains the MD5 hash of the entire batch text and is stored in the SQL Manager Cache (SQLMGR). So you could make an easy cursor that will insert the output from “select * from sysprocesses” but the actual SQL query might or might not be in Cache when you arrive at work and analyze them.
What you need to do is to convert the sql_handle into readable SQL text commands as it is collected. One way to do it is to create 2 user functions (one for the statement, one for the Stored Procedure name). These will serve as formula for 2 new computed columns inside your table.

create FUNCTION [dbo].[text_handle] (@handle binary(20), @start int, @end int)
RETURNS nvarchar (max) 
declare @text nvarchar(max) 
set @text = (SELECT  substring(text, (@start + 2)/2, 
CASE @end WHEN -1 THEN (datalength(text)) ELSE (@end -@start + 2)/2 END) 
FROM ::fn_get_sql(@handle))
return @text

create FUNCTION [dbo].[proc_handle] (@handle binary(20), @db int) 
RETURNS nvarchar (max) 
declare @proc nvarchar(max) 
set @proc = (SELECT 'DB:' + convert (nvarchar(4),@db) + ', SP: ' +  convert (nvarchar(max),text) FROM ::fn_get_sql(@handle))
return @proc

ALTER TABLE your_table
    ADD sql_text AS dbo.text_handle (sql_handle,stmt_start,stmt_end)

ALTER TABLE your_table	
    ADD sql_proc AS dbo.proc_handle (sql_handle,dbid)

That should be enough for you to start narrowing the problem and mauybe you will want to collect some perfmon counters as well, because a bottleneck usually doesn’t come from a single source. There might be disk pressure, cpu or memory leak that makes the SQL processes run poorly.

Remove members from Replica Set

I had to deal with it, since the storage used by our three-members Replica Set needed to be redesigned and to achieve this I needed to remove practicaly 2 members. The replica set had to stay online only with one Primary node. The guys from 10Gen explain in this link, a step-by-step method to remove ONE member from three. When you are left with 2 members in a Replica Set, it is not the best option, but also not the worst. You can attach a witness which can decide between the two of them which one will become Primary, when one node fails. So the idea is that, if a node fails (shuts down, powers off, etc.. ) the other node which is online will have the SECONDARY status when you issue rs.status() and the writes will be prohibited. You will only be able to query the mongod databases.

And this is why I started this post. The mongodb creators suggest 2 methods to remove one member from a Replica Set. I will describe below step-by-step a workaround method to remove 2 Secondary members from a three members Replica Set. Finally you will have one member Replica Set.

  • Issue db.isMaster() on any node to know which is primary and which are secondary members and to get their host names. You will see something like this:

    RS:PRIMARY> db.isMaster()
            "setName" : "RS",
            "ismaster" : true,
            "secondary" : false,
            "hosts" : [
            "primary" : "mongodb-01:27017" 
  • Login to first Secondary member you want to remove. I will do it first for mongodb-02:27017.
  • Connect to the mongod instance of mongodb-02 and issue db.shutdownServer()
  • Check that it’s down. Issue a Linux bash command: ps -Aef | grep mongod, and you should not see any mongod processes started.
  • Go to Primary node and issue rs.conf() and you will see that the server is still included in Replica Set configuration.

     "_id" : "RSProd",
            "version" : 4,
            "members" : [
                            "_id" : 0,
                            "host" : "mongodb-01:27017",                        
                            "_id" : 1,
                            "host" : "mongodb-02:27017"
                            "_id" : 2,
                            "host" : "mongodb-03:27017"
  • To remove it issue rs.remove(“mongodb-02:27017”)
    After that, when you issue again rs.conf() you will get:

     "_id" : "RSProd",
            "version" : 4,
            "members" : [
                            "_id" : 0,
                            "host" : "mongodb-01:27017",                        
                            "_id" : 1,
                            "host" : "mongodb-03:27017"

    In order to remove also the mongodb-03:27017 node from the remaining 2 server Replica Set, the above method will not work since, if one server from two is down, the mongodb setting is configured to not allow any writes to the remaining online server. So you need to do it with the all servers online.

  • Issue on Primary:
  • cfg = rs.conf()
    cfg.members.splice(1,1)  // first 1 in the brackets represents the id of the node from rs.conf(). 
     "_id" : "RSProd",
            "version" : 5,
            "members" : [
                            "_id" : 0,
                            "host" : "mongodb-01:27017",                        
  • Now you can go and stop the mongod service from mongodb-03:27017. Issue the bash command:
  • sudo service mongod stop

Collect Backup & Database Size over time & forecast

There are frequent cases when you have to create a graph with Backup size growth over time. For example, to estimate their growth in order to plan buying more space. If you don’t have any collection scripts put in place to gather sizes of databases or backups it is very simple to query msdb system database.

In msdb is stored each backup command along with database size, backup size, compressed backup size, file location, and many other usefull information.

The only case you cannot rely completely on msdb information is when you migrated user databases from a different server and chose to do it to a newly installed SQL server with new system databases. In this case, you will not find any information about backups, older than migration date in msdb. You will have to recover the information from older msdb backups if you archived them of course.

  • Here is one script that you can use to get the sum of backup sizes for all databases that are being backed up on your server for the last year.

    select convert (date,backup_start_date) Backup_Date,
    sum(compressed_backup_size/1073741824) as Sum_Compressed_Backup_Size_GB, 
    sum(backup_size/1073741824) as Sum_Backup_Size_GB,COUNT(database_name)
    from msdb.dbo.backupset 			
    where	[type] in ('D','I','L') 
    and backup_start_date > GETDATE ()-365 
    group by convert (date,backup_start_date)
    order by convert (date,backup_start_date)
  • This one got me the information I was looking for, namely: sum of all databases full backups + their diff backups from last full backup done. If you have for example the Full backup scheduled on Saturday then you will want to know the differential done after this one, in order to get the exact size needed per database to be restored if the need comes. Adapt the @dw_[…] variable to your needs.
  • &nbsp

    declare @dw_sunday date = (select max(CONVERT (date, backup_start_date)) from msdb..backupset where DATEPART (dw,backup_start_date) =1)
    declare @dw_monday date = (select max(CONVERT (date, backup_start_date)) from msdb..backupset where DATEPART (dw,backup_start_date) =2)
    declare @dw_tuesday date = (select max(CONVERT (date, backup_start_date)) from msdb..backupset where DATEPART (dw,backup_start_date) =3)
    declare @dw_wednesday date = (select max(CONVERT (date, backup_start_date)) from msdb..backupset where DATEPART (dw,backup_start_date) =4)
    declare @dw_thursday date = (select max(CONVERT (date, backup_start_date)) from msdb..backupset where DATEPART (dw,backup_start_date) =5)
    declare @dw_friday date = (select max(CONVERT (date, backup_start_date)) from msdb..backupset where DATEPART (dw,backup_start_date) =6)
    declare @dw_saturday date = (select max(CONVERT (date, backup_start_date)) from msdb..backupset where DATEPART (dw,backup_start_date) =7)
    ;with a as 
       CONVERT (date,backup_start_date) as data, 
        sum(compressed_backup_size/1048576) as [Backup_total(MB) Full]
        FROM msdb..backupset 
        where	[type] in ('D') and 
    convert (date,backup_start_date) = @dw_saturday
    group by  database_name,  CONVERT (date,backup_start_date)
    ) ,
    b as 
        convert (date,backup_start_date) as data, 
        sum(compressed_backup_size/1048576) as [Backup_total(MB) Diff]
        FROM msdb..backupset b
        inner join a on a.database_name = b.database_name and convert (date,backup_start_date) > CONVERT(date,a.data) 
        where	[type] in ('I') 
    group by  b.database_name,  CONVERT (date,backup_start_date))
    select a.database_name,
    		a.data as full_data, 
    		b.data as diff_data, 
    		a.[Backup_total(MB) Full], 
    		b.[Backup_total(MB) Diff]	
     from a, b
     a.database_name = b.database_name and
     b.data > a.data 
     order by 1,2,3
  • If you have Full backups done once per week (like I do) and Differential + Log Backups daily then, you will want to know the backup size per week, in order to be able to do a forecast for a number of weeks in advance.

    SELECT distinct
        datepart(wk,backup_start_date) AS weekNumber,
        sum(compressed_backup_size/1073741824) as [Backup_total(GB)],
        COUNT(database_name) as Files_Total
        FROM msdb.dbo.backupset 
    backup_start_date > GETDATE ()-365 and backup_start_date < '2014-01-01 00:00:00.000'
    group by  datepart(wk,backup_start_date)
    order by  datepart(wk,backup_start_date)
  • And this script will show you all backups (here are selected Full ones but you can change the letter from type column in Where clause)
    order by date descending, including information about database file names and their sizes. This will allow you to estimate the growth of Database files, not only their backups.

    select distinct b2.database_name, b2.backup_start_date, b2.type,
    		b2.backup_size/1048576 as backupsize_GB, m2.physical_device_name, b.file_size/1048576 as File_size_MB, b.logical_name 
    		as Logical_FileName
    from				msdb.dbo.backupfile b, 
    				msdb.dbo.backupset b2, 				
    				msdb.dbo.backupmediafamily m2
    where	b.backup_set_id=b2.backup_set_id and 		
    		b2.media_set_id = m2.media_set_id and
    		--b2.database_name = 'master' AND
    		b2.type = 'D' 
    order by 2 desc

    7 Common Design Errors – that can be avoided

    Being a DBA usually means fixing a lot of design issues when confronting with performance problems. If the design is done with good planning ahead, then the DBA doesn’t have much to do at the design part. Developers, in the absence of a Database Architect are preoccupied mostly about functional scripting rather that functional + optimal scripting. It is not developers fault, it is because a database design should be performed together with a Database Specialist who can figure out the possible issues at the design phase. A lot of firms don’t invest in this position but rely on developer’s professional experience. I came up with a list of possible and frequent Developer errors that should be avoided in order to prevent SQL thinking twice or more before doing a simple operation.

    1. When filtering the results by the WHERE clause, in equality or unequality operators (=, Like, <>), pay attention when assigning to Non-Unicode types of columns (varchar, char, text) values of Unicode type (nvarchar, nchar, ntext). Doing so, SQL will add a CONVERT_IMPLICIT operation at the execution plan which is a pointless thing to do. Example:
      DECLARE @table AS TABLE
       (varchar_column varchar(50) PRIMARY KEY)
       INSERT INTO @table VALUES ('Maria')
      DECLARE @varchar_param AS varchar(50) = 'Maria'
      SELECT *
      FROM @table
      WHERE varchar_column = @varchar_param
      DECLARE @Nvarchar_param AS Nvarchar(50) = N'Maria'  
      SELECT *
      FROM @table
      WHERE varchar_column = @Nvarchar_param

    2. Based on the same reason as the upper one (CONVERT_IMPLICIT), always use in JOIN operations same types of columns, not different ones which SQL can auto convert without an explicit CAST or CONVERT in your code. Here are all implicit and explicit conversions done by SQL: Data Type Conversion
    3. Avoid using In WHERE clauses and JOIN conditions – functions. If a function is called for column that has an index on it, than the index might not be used as it should be. Or if you do, analyze the execution plan so that optimal index usage is made. The below query is one example of a bad usage of functions inside WHERE clause:

    4. NOLOCK hint is overused. Please be aware that NOLOCK causes dirty reads. This means that at the beginning of the transaction you can retrieve one set of data which by the end of the transaction might get modified but you don’t care about it and go along with it. In the end, the transaction conditions might or might not apply to some data that shouldn’t be touched after all. An alternative to NOLOCK hint is to enable Snapshot Isolation in SQL Server.
    5. Some Stored Procedures are created with parameters that have the datatype different than the assigned value. Same is valid for Variables declaration. Keep the datatype identical for each type of equality operation or assignment.
    6. You can choose some datatypes in favor of other considering: storage and data inserted. Sometimes there is no point in using INT if you will have values up to 255, same as using INT when you expect a lot of data – more that 2 billion. A Column Alter later on is not a valid solution. This means you should use TINYINT datatype in first case or BIGINT in second.
      Also, if you will not handle strings with special characters or chinese/japanese words, than you should use any of the non-unicode datatypes: char, varchar, text rather than the unicode ones because of the doubled size in storage. For example same string stored as NVARCHAR format will be twice as big the VARCHAR format, only because SQL will use one extra byte to encode each character. It causes more disk writes and therefore more disk reads, in the end less performance.

    7. Usage of Common Table Expression: ;WITH (CTE). It is very simple to write and easy to understand but, there is something to think about before using it everywhere.
      First of all, all recursive table definitions must be evaluated for each following operation, each time. The data is not evaluated once for re-use later on and is used only inside the scope of the query. The SQL treats the whole expression more like a view and is held in memory; it is not treated like a temporary table. The advantage with the #temp table is that, the data gets evaluated once and then the results are re-used in the join.
      Second, you can not create Indexes on recursive table definitions used in following Joins or filter operations. In conclusion, the CTE expression is advisable to be used when you have a small data set to use in joins but when you want to join millions of rows, than CTE is not a good option. You should choose temporary tables. It will always run faster.

    In the heart of Sardinia – a summer paradise

    Besides SQL there are vacations… nice time spent with family and friends. I want to blog about one of ours and give you a hint where to go next time you think about traveling. This year, we again visited Sardinia. The third year in Sardinia calls for a post πŸ™‚

    OK… so we have a good motivation to go there as my mother/father-in-law moved there some years ago, but… it is really a good destination to peak for a wonderful vacation. We tried them all: as a couple – just us first time, two years after that with some close friends, and two another years after that with some other close friends but with kids attached to us this time πŸ™‚

    All of the cases have been just fabulous vacations, with new things to do there and visit. I guess it is known that the sea in Sardinia is very clean having all sorts of colours from blue to turquoise, and has different sort of beaches to peak from: sandy from very smooth to not so smooth (the sand that does not stick to your skin when you are wet), rocky, with mountains behind them, red sand and so on…

    Of course it is difficult to cover all of them and still keep the vacation air of relaxation. Even us after 3 vacations spent there, still have other parts of Sardinia to go to. We explored a big part from east side, some of the center side, little of the south and the west only from the car this year traveling from airport to our destination.

    All the times we stayed in Orosei – a small village, 90 km from Olbia (the island port). I would not recommend this village for a tourist as it is not so full of attractions being a small village, but for us it was the good destination having the relatives there as well. The first and the second time we came with our car from Romania to Sardinia, early booked tickets to the fairy from Livorno to Olbia (first time) and from (Piombino to Olbia). These are quite expensive. You can book them using www.aferry.com.

    The third time we flied to Alghero airport directly from Romania but we had a car borrowed to us. What I really wanted to say is that all the times we had a car at our end. It is very important as the distances are not big but if you really want to see the island a car is what you need. There are comapnies that let you rent cars, even in Orosei there was one πŸ™‚

    Enough talking, I will rather start uploading pictures and describe the locations from them.

    Orosei (part 1)

    A small village with a very nice beach with sand that does not stick to your skin (see down right picture) and you could get to Osala beach (see the middle and up left pictures) quite easy as it is very near. Because there are rocks inside the water, you can see different species of small and medium fish.

    Road to Osini

    It was one good morning when we agreed to drive two guys to their family in Osini. Google said it was 120 km done in 2 hours. Very nice. Except….
    It didn’t consider the excessive steep, narrow curves and the fact that some roads where vanished due to current landslides, and that we had to search for alternative paths to get there, path that didn’t exist on our GPS. It was a adventure… quite dangerous one. We did 4-5 hours towards Osini and 4 hours back to Orosei. Still we don’t regret anything from this exciting journey. From a certain point we just kept climbing and for one hour we didn’t descend at all, so when we saw the mark to “monte Olinie 1372 m” we had a good laugh. We saw road signs that were shot (later we found out it is for fun) Sardinians pigs, cows sitting on probably the highest hill for a cow to be… just wandering there, a crystal water and a rocky-sandy beach, and the most interesting story about a city shattered by flood and another one a little bit higher reconstructed as its clone. You could see them both from the road as magnets fixed to the mountain.

    Orosei (part 2)

    Second time when we came we tried to explore the very extended beach in front of Stagno Petrosu lake (Pond Petrosu in english) very close to Orosei. We almost stuck in the sand with the car due to the fact that we didn’t estimate the depth of it very good and wanted to get as closer to the beach as possible. Two man saw the deep problem we were in and helped drag the car out of it :))

    We wanted to see the sun rise from sea in Orosei, but it came behind a mountain, due to the golf shape but anyway… we had our share of pictures:

    Admired the Sardinian flora & fauna:

    Did some snorkeling again.


    AND… did the most breathtaking boat trip through the Orosei golf to explore cala Luna, cala Mariolu, cala Sisine, grotte del Fico, grotte del Bue Marino and its beach. It is really, without any exaggeration the most beautiful, like a fairy tail experience I have ever had on sea. Just watch:



    La Caletta and Biderosa

    One local man said we should go to La Caletta and Biderosa beaches. So we did but unfortunately I don’t have pictures from these places. I recommend though much more Biderosa for whole day stay, as there are 5 beaches to peak from, trees to hide from sun, and benches for picnic, and water… need I say more?

    In conclusion, wonderful island with wonderful places!
    Definitely a good vacation!

    Automated MongoDB backup

    I recently had to setup an automated job to backup some mongo databases. Here it is how you do it.

    1. Create users for each db that needs to be backed up with full rights in that database. Pay attention to NOT create the user in admin database and allow full rights for all other user db’s. Even if you will be able to connect with that user and query the collections for other databases, you will not be able to use mongodump.

    2. Write in shell file with .sh extension following script:

    # Dump Mongo Dbs database every night using mongodump
    # Author: by yrushka.com
    ## Binary path ##
    echo $BAK
    /usr/bin/mongodump  --port 27017 --username user1 -password user1pass --db db1 --out $BAK/`date +"%Y_%m_%d__"`db1
    /usr/bin/mongodump  --port 27017 --username user2 -password user2pass --db db2 --out $BAK/`date +"%Y_%m_%d__"`db2
    /usr/bin/mongodump  --port 27017 --username user3 -password user3pass --db db3 --out $BAK/`date +"%Y_%m_%d__"`db3

    3. Create a task in CRON that will execute the above script to a specific schedule.

    And that’s it.

    Analyze Indexes – with or without them?

    SQL Indexes – what a great subject to discuss on! I just love them. It is incredible to find out each time, what a proper index can do to your database…

    As far as I can think of there are 3 issues related to Indexes:

    1. Fragmentation
    2. Query Optimization
    3. Storage problems – I already blogged about it and if you follow the link you will see the whole story.

    I. Fragmentation

    Inside each DBA’s maintenance tasks, should be included a step which will collect fragmentation on all Production most critical databases.
    You can schedule the Index part once per month, or as frequent as you consider necessary.

    The SQL query is rather simple:

    1. Create a table with structure:

      use tempdb
      CREATE TABLE #indexes (
         ObjectName CHAR (255),
         ObjectId INT,
         IndexName CHAR (255),
         IndexId INT,
         Lvl INT,
         CountPages INT,
         CountRows INT,
         MinRecSize INT,
         MaxRecSize INT,
         AvgRecSize INT,
         ForRecCount INT,
         Extents INT,
         ExtentSwitches INT,
         AvgFreeBytes INT,
         AvgPageDensity INT,
         ScanDensity DECIMAL,
         BestCount INT,
         ActualCount INT,
         LogicalFrag DECIMAL,
         ExtentFrag DECIMAL)
    2. Open a cursor that contains 2 parameters: schema name and table name selected from INFORMATION_SCHEMA.TABLES and execute for each:

       INSERT INTO #indexes
    3. The execution output will be stored in a table with same structure as the output’s columns, used for future analysis.
      I only focus on Indexes that have a fragmentation higher than 5% and have more than 10000 pages. In theory, there is no point in rebuilding an index that occupies less than 24 pages. In practice, it’s generally recommended not to do anything for indexes with less than 10000 pages.
      Next step:

      delete from #indexes where LogicalFrag <= 5 or CountPages <= 10000

    The list resulted can be split in two action strategies:

    • Reorganize Indexes that have a fragmentation rate > 5% and < 30%
    • Rebuild Indexes that have a fragmentation rate > 30%

    Check this also for more info on fill factor importance and Alter Index syntaxes:

    Managing SQL Server Database Fragmentation
    How to check Fragmentation on SQL Server 2005

    II. Query Optimization

    I have been asked so many times, questions like: "Why this procedure is running so slowly?" or "why so many blockages?", "I have a new script and it takes ages until it is completed!" and I can go on. Almost every time, in a bigger or a smaller part - a lack or surplus of one/many indexes is involved.

    How to know if you have unused indexes

    Indexes in surplus - causes SQL to do as many index updates as many indexes are there on a table that include the same column which is modified in a single statement.
    I saw many cases with tables having tens of indexes. Some of them were doubled and some of them were created just following SQL hints from SQL Query Optimizer. There is more damage than good, when you follow literary SQL's hints. First of all, multiple updates on those many indexes which slow down the query, secondarily index disk usage which can get substantial.

    To understand which indexes are really unused and should be dropped at next maintenance window, SQL Server must not be restarted for some time (1 month - 6 months), considering that it will have exhausted all possible recurrent user queries, recurrent stored procedures, application transactions that run on it. This is because SQL will hold the statistics until next reboot. In order to have valuable statistics the SQL should have run for an enough period of time.

    1. Look at existing Indexes in your database:

      The script will avoid Primary Keys, Unique keys and already Disabled indexes.

      use your_database
      select i.name index_name, o.name [object_name], i.type_desc, o.type_desc , i.is_disabled, i.is_unique
      from sys.indexes i inner join sys.objects o on 
      i.[object_id] = o.[object_id] where o.type_desc = 'USER_TABLE' and i.name is not null 
      and i.is_primary_key = 0 and i.is_disabled = 0 and i.is_unique = 0
      order by o.name
    2. Check how many seeks/scans are done on above extracted indexes. You can filter the query by specifying one or more object names and index names. This select will display also each table's row count.

      Example 1: you see a number of 1867777 scans on a 2000000 rows table and close to 0 seeks. Big problem. It means the index is not well designed. It is used but not as it should be.
      Example 2: you see a number close to 0 both for seeks and scans columns. Why? The index has never been used since SQL was restarted.

      In conclusion, seeks column should filled with numbers. If both seeks and scans have numbers, the indexes should be left there. user lookups are used mostly by primary keys or clustered indexes, so it is unlikely you will something there.

      use your_database 
      IF (object_id('tempdb..#temp') IS NOT Null)
         DROP TABLE #temp   
      CREATE TABLE #temp (
      table_name sysname ,
      row_count INT,
      reserved_size VARCHAR(50),
      data_size VARCHAR(50),
      index_size VARCHAR(50),
      unused_size VARCHAR(50))
      INSERT #temp
      EXEC sp_msforeachtable 'sp_spaceused ''?'''
               I.[NAME] AS [INDEX NAME],
               USER_UPDATES, t1.row_count
                 ON I.[OBJECT_ID] = S.[OBJECT_ID]
                    AND I.INDEX_ID = S.INDEX_ID
                    inner join #temp t1 on S.object_id = object_id (t1.table_name)
      WHERE    OBJECTPROPERTY(S.[OBJECT_ID],'IsUserTable') = 1  
      and is_disabled = 0 and is_unique = 0 and i.name is not null -- and user_seeks =0
      and  OBJECT_NAME(s.[OBJECT_ID]) in ('object1') 
      and I.[name] IN ('index1', 'index2','index3')
      order by  OBJECT_NAME(s.[OBJECT_ID]) ,i.name
    3. Check leaf updates, inserts and deletes on indexes identified in previous step with zero scans/seeks.
      This is the ultimate proof that an index which is not used in optimization, is used to be updated at insert/update/delete operations --> bad performance.

      use your_database 
             I.[NAME] AS [INDEX NAME],
               ON I.[OBJECT_ID] = A.[OBJECT_ID]
                  AND I.INDEX_ID = A.INDEX_ID
      and OBJECT_NAME(A.[OBJECT_ID]) in ('object1') 
      order by  I.[NAME]
    4. At last, identify Indexes that don't do any harm, but overcrowd the Indexes list and use space, these are indexes completely useless and should be dropped.

      DECLARE @dbid INT
      SELECT @dbid = DB_ID(DB_NAME())
      I.INDEX_ID, i.object_id
      AND DATABASE_ID = @dbid) and I.NAME is not null 
      and is_unique =0 and is_disabled = 0

    How to update system tables SQL 2008 & 2012 (FIX: “Ad hoc updates to system catalogs are not allowed”)

    I came across an old version update that was enabled for all SQL 2005 + versions, meaning: update on system tables. In older versions, there was the possibility to update system tables, thus allowing you to perform some administrative tasks/fixes without having to shut down the server.
    I agree that updating system tables are not meant to be done by an usual user, and therefore hacking SQL should be done only by Database Administrators and by those who have enough knowledge to not ruin everything and put the production on hold πŸ™‚ yes… that D smile is on my face now.
    So, let’s assume I have restored the master database on a new machine and I don’t have enough space on C drive or I don’t want to use it, and want to put the databases on a SAN. Drive C was used by all user databases from the old server. In order to attach the databases rather that perform a restore on each one of them from backups (reasons: time consuming, backups are not up to date, backups don’t exist), I have to change the physical path for all user databases inside master database to point to an existing location with enough space, so that at the next SQL startup the databases will all come online as well. In this way, you can even change the databases storage by putting data files on a separate drive from the log files.

    Before I go on, I am strongly reminding that this method is not documented and should be used very carefully, and only by those who feel confident in their knowledge. Backup to master database is a must before every hack on it.

    These said… In past SQL 2000 version, the update should have been done on 2 system tables contained inside master database:

    • sysaltfiles – contains the physical location of each data / log file for each database and
    • sysdatabases – contains the physical location for each data file from each database.

    Today in SQL 2005 – 2012 versions, these 2 tables have been kept for backward compatibility as system views. SQL recommends that you use the current SQL Server system views instead. To find the equivalent system view or views, see Mapping SQL Server 2000 System Tables to SQL Server 2005 System Views.

    To spare you from research, sysaltfiles has been replaced by sys.master_files and sysdatabases by sys.databases.

    A simple select would show that all of these are views.

    use master
    select name, xtype from sysobjects
    where name in ('sysaltfiles','sysdatabases','master_files','databases')
    order by name 


    I want to update the system table which is on top of above system views. How do I know which one?
    In order to gain access to all system tables/views you need to open a DAC connection from SSMS. So go on and open a DAC connection for future queries. Don’t use the “Connect” option to make the DAC connection from SSMS but “NEW QUERY”.

    I will limit myself and run the select only on sysaltfiles and sysdatabases to look at their execution plan.
    I assume that master_files and databases have the same referenced tables.

    select name, filename from sysdatabases 

    For this query SQL will actually query sys.sysdbreg and sys.sysbrickfiles. In sys.sysdbreg SQL stores the database names and some additional information, but only sys.sysbrickfiles table is used as source for filenames location.

    select name, filename from sysaltfiles

    For this query SQL will actually query sys.sysbrickfiles.

    The following query will confirm that both are system tables. xtype = S stands for a system table.

    select name,xtype from sysobjects where name in ('sysbrickfiles','sysdbreg')

    In conclusion, we need to update only a single table: sys.sysbrickfiles.

    Le’t go and do that.

    1. Close all opened connections and Stop the SQL Server.

    3. Start SQL in single server mode
      From an elevated cmd (Run as administrator), access the right folder where sqlservr.exe executable is located and execute below commands. Normally you will find it at β€œC:\Program Files\Microsoft SQL Server\MSSQL[InstanceName]\MSSQL\Binn”. Run below commands:

      D:\>CD /D "C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn"
      C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn>sqlservr.exe -m

      You will see a lot of information in the shell, allow it to start the SQL and when it’s done, let the shell opened.

    5. Start another elevated cmd or open a DAC connection from SSMS to perform the update on your system table.

      • Connect to SQL from cmd:
      • Make sure you are connected to right server and then enable the update on system tables.

        select @@servername
        use master
        sp_configure 'allow updates', 1
        reconfigure with override

        A small note on this: If you had done it with SQL started in normal (multi) user mode, the above configuration change would have worked fine without any warning, but the update step on system table would have failed with the error message:
        “Ad hoc updates to system catalogs are not allowed”

      • Identify logical file names and their physical name locations for the database in question: – my Database Name is “data1”

        select lname, pname from sys.sysbrickfiles s1 
        inner join sys.sysdbreg s2 on s1.[dbid] = s2.[id] 
        where s2.name = 'data1' 
      • Perform the update.

        update sys.sysbrickfiles set pname = 'D:\MSSQL\data1.mdf' where lname ='data1';
        update sys.sysbrickfiles set pname = 'D:\MSSQL\data1_log.ldf' where lname ='data1_log';
        select lname, pname from sys.sysbrickfiles s1 
        inner join sys.sysdbreg s2 on s1.[dbid] = s2.[id] 
        where s2.name = 'data1' 
    7. Close the first cmd shell – It will stop SQL Server.

    9. Copy database data file and log file to the new “D:\MSSQL” location.

    11. Allow SQL Service Account full rights on “D:\MSSQL\” location – Right click on folder –> Properties –> Security –> Edit –> Add –> enter the SQL Service Account, in my case it was Network Service –> press OK –> check Full Control and press OK

    13. Restart SQL Server (Normal mode)
      The database should come online and its files will point to the new location.

    Key SQL performance counters – identify bottlenecks

    I was planning for a long time to write this post but time is never enough and timing is always not the right one πŸ™‚
    This time I managed to gather all the info and present a simple to follow, brief and explanatory list of most important performance counters used in SQL monitoring. I came up with it, by studying a lot of other similar documents from important SQL professionals and adding personal insight on each of them.

    Let’s begin by saying that these counters are helpful when are collected throughout the whole day with a refresh interval set at 1 minute. It can be done easily if you go to: Control Panel\Administrative Tools\Performance Monitor\Data Collector Sets\User defined. Right click and open: New–> Data Collector Set –> check on Create Manually –> check on Performance counters –> start adding performance counters.

    Analyze Disk
    1. Logical Disk

    • Disk Transfers/sec
    • - slow, needs attention : < (less than) than 80 I/O’s per second when 
      "Physical Disk\Avg. Disk sec/Transfer" is > (greater than) 25ms. 
      This may indicate too many virtual LUNs using the same physical disks on a SAN. 

    2. Physical Disk

    • Avg. Disk sec/Read – average time, in seconds, of a read of data to the disk
    • – very good	 	         < 10 ms 
      – okay               		 10-20 ms			
      – slow, needs attention   	 20-50 ms			 
      – Serious I/O bottleneck  	 > 50 ms 
    • Avg. Disk sec/Write – average time, in seconds, of a write of data to the disk
    • – very good	 	         < 10 ms 
      – okay               		 10-20 ms			
      – slow, needs attention   	 20-50 ms			 
      – Serious I/O bottleneck  	 > 50 ms 
    • Avg. Disk sec/Transfer – Disk latency
    • To determine the impact of excessive paging on disk activity, 
      multiply the values of the "Physical Disk\Avg. Disk sec/Transfer"
      and "Memory\Pages/sec counters.  If the product of these counters 	
      exceeds 0.1, paging is taking more than 10 percent of disk access time, 
      which indicates that you need more RAM.  If this occurs over a long 
      period, you probably need more memory. 
      However, "Page Life Expectancy" counter must be checked as well. 
    • Disk Bytes/sec – total number of bytes sent to the disk (write) and retrieved from the disk (read) over a period of one second.
      If the Perfmon capture interval is set for anything greater than one second, the average of the values captured is presented.
    • Disk Read Bytes/sec
    • Disk Write Bytes/sec
    • Disk Reads/sec
    • Disk Writes/sec

    The above 5 are useful in analyzing any ups or downs for a certain period. It will help you know the nr of reads/writes in normal behavior and if a problem occurs you can link several values and see if the difference is notable.

    Analyze Memory
    1. Memory

    • Available MBytes – amount of physical memory available to processes running on the computer, in Megabytes
      - low– < than 10% available
      - very low – less than 5% available
    • Page file usage %
    • - good - < 70 % 
    • Pages input/sec - the rate at which pages are read from disk to resolve hard page faults. Hard page faults occur when a process refers to a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk.
    • - good - < 10
      - bad  - > 15 

    • Pages/sec - rate at which pages are read/written to disk
    • - slow on a slow disk subsystem  - 100 - 600 
      - slow on a powerful disk system - 600 - 2500 
      - very slow                      - 2500 - 5000
      - extremely slow                 - > 5000 

    2. SQL Server: Memory Manager

    • Memory Grants pending - Number of waits on processes
    • should stay close to 0 
    • SQL Cache Memory (KB) - how much physical RAM is devoted to SQL Server’s data cache
    • It should be = (Total amount of RAM in the server) - (RAM used by Windows Server) 
      - (RAM used by SQL Server) - (and any utilities you have running on the server). 
      If much lower, than it may be possible that you aren't allowing SQL Server to dynamically
      allocate RAM, and instead have accidentally specified that SQL Server uses less RAM 
      than it should have for optimal performance. This is not a critical counter for memory
      leak but it's worth looking at it. 

    3. SQL Server : Buffer Manager

    • Buffer cache hit ratio - how oftem SQL Server can get data from buffer rather than disk
    • - good - > 90 % OLAP systems 
      - good - > 95% OLTP systems
    • Free list stalls/sec - the frequency of requests for database buffer pages are suspended because there's no buffer available
    • - good - < 2 
    • Free pages - total nr of 8k data pages on all free lists
    • - good -  > 640 
    • Lazy writes / sec - nr of times per sec that lazy writer moves dirty pages from buffer to disk to free buffer space
    • - good - < 20 
    • Page lookups/sec - nr of requests to find a page in the buffer. This counter should be analyzed in relation with "SQL statistics\ Batch Requests/sec" counter.

    • (Page Lookups/sec) / (Batch Requests/sec) - reveals if query plan is efficient or inefficient
    • -good - < 100 
    • Page reads/sec - reveals index, memory problems
    • - good - < 90 
    • Page writes/sec
    • - good - < 90
    • Page Life Expectancy - time (seconds) that a data page set is kept into buffer memory.
    • - good - > 300 

      Hint: Less than that on a regular basis is a clear indication that your SQL Server needs more RAM allocated to function properly.

    Analyze CPU
    1. System

    • Processor Queue Length
    • - good         - < 4 per cpu 
      - also good    - 4-6  
      - fair         - 7 - 12 
      - bad          - > 12 is 
    • Context Switches/sec - happens when a higher priority thread preempts a lower priority thread that is currently running or when a high priority thread blocks
      - high        - > 5000 
      - very high   - > 15000 

    2. Processor

    • Processor time % (total & per process)
    • - Fair - <80% 
    • Privileged time % (total & per process)
    • - Fair - < 75% (higher bottleneck)
    • Interrupt Time % - indicates the percentage of time the processor spends receiving and servicing hardware interrupts
    • - bad         - > 30 % (hardware or driver problem) 
      - very bad    - > 50%

    Analyze Query Performance
    1. SQL Server : Access Methods IO Intensive Operations

    • Forwarded records /sec - Occurs when a column's data gets expanded more (for varchar types) than the initially defined space. It is happening only for tables without clustered indexes.
    • - good - < 10 per 100 batch requests/sec

      Hint: Evaluate creating clustered indexes for heap tables.

    • (Index searches /sec) / (Full scans/sec) - Monitors the ratio between Index searches (good) over Full Scans (bad) on base tables or indexes
    • - bad - > 1000 and CPU high 

      Hint: Queries with missing indexes or too many rows requested will have a large number of logical reads and an increased CPU time.
      For OLTP applications, optimize for more index searches and less scans (preferably, 1 full scan for every 1000 index searches). Index and table scans are expensive I/O operations.

    • Pages splits / sec - Occurs when a 8k page fills and must be split to a new 8k page
    • - fair - < 20 per 100 batch requests/sec 

      Hint: If higher, than you have a problem that can be solved by clustered index, good fill factor.
      An increased fill factor helps to reduce page splits because there is more room in data pages before it fills up and a page split has to occur.

    2. SQL Statistics

    • Batch requests/sec - shows you how busy is your server.
    • - busy server  - > 1000 
      - very busy    - > 3000 

      Hint: A typical 100 Mbs network card is only able to handle about 3000 batch requests per second. If you have a server that is this busy, you may need
      to have two or more network cards, or go to a 1 Gbs network card.

    • SQL Compilations/Sec - The number of times per second that SQL Server compilations have occurred.
    • - good - as low as possible.
      - bad  -  > 100 
    • SQL ReCompilations/Sec - The number of times per second that SQL Server recompilations have occurred.
    • - good - as low as possible.
      - bad  -  > 100 

      Hint: Evaluate the number of (re)compilations linked to the number of Batch requests/sec, and check if the resulting ratio is less than 30% or higher. If higher than you should start checking which Stored procedures, or ad-hoc queries are getting compiled / recompiled over and over again and causing CPU usage.
      I wrote about this in detail at How to find out which Queries have non-reusable Query Plans? and Top 5 Steps To Determine Bad Performance of Queries

    So basically these counters should give you an overall view if your system is low on CPU, Memory or Disk resources or if you should re-design some your queries, tables, indexes. The perfmon counters are a very good indication of how tuned is your system. However, these should be seen together with SQL statistics that can be gathered from the multitude of dynamic views available starting SQL 2005 + editions. In a next post I will write about those important SQL counters, how I prefer to call them. If you start collecting windows counters and SQL counters you should have no problem in identifying a certain bottleneck happened in a timeframe you where not at office or skipped the critical period and couldn't analyze on time everything.

    Have a great time until then!