Friday, May 18, 2012

SQL Server : Find out Nth highest salary of employee using RANK() or DENSE_RANK() functions in SQL Server


This article is going to hash out some interesting question which is also one of the favorite questions asked in SQL Server interviews.  Hope, you guys got it. J
Yeap, you are right that is “Find out Nth  highest salary of employee”.
In the interviews, we used to get many answers for this questions like using CURSORS, Sub Query etc.,  which is involved lot of SQL statements with low performance.
This can be achieved very easily by using RANK()or DENSE_RANK()functions which were first introduced in SQL Server 2005.  It may be old function but it’s worth of sharing. J
Let us create sample table with few records for this demo.
CREATE TABLE [DBO].[Employee]
(
      [EmpCode]   INT   IDENTITY    (1,1),
      [EmpName]   VARCHAR(100)      NOT NULL,
      [Salary]    NUMERIC(10,2)     NOT NULL,
      [DeptName]  VARCHAR(100)      NOT NULL
)
GO

INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Priya',60000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Chaitu',55000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Praveen',35000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Sathish',57000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Ramana',62000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Kiran',75000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Krishna',78534,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Sravani',23000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Mahesh',23500,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Raman',45000,'Accounts');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Raghu',35250,'Accounts');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Noha',27000,'Sales');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Sushma',29500,'Sales');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Sekhar',30000,'Sales');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Ravi',30000,'Sales');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Harini',35000,'Sales');
Hope, successfully the above statements were executed and the records are inserted into [Employee]table.
Let us execute the below query to check the inserted records:
SELECT
      [EmpCode],[EmpName],[Salary]
FROM
      [DBO].[Employee] WITH (NOLOCK)
GO

The output would be as shown below :
Now, the question is to display 4th highest salary of employee. Before that let us execute and find manually what is the 4th highest salary in the table using the following query. 
SELECT
      [EmpCode],[EmpName],[Salary]
FROM
      [DBO].[Employee] WITH (NOLOCK)
ORDER BY
      [Salary] DESC
GO

The output would be as shown below :
Now, let us find out the 4th highest salary of employee using RANK()function:
As we have seen from the above output, the fourth highest salary is “60000” of employee “Priya”, let us execute the below SQL statement and confirm whether it  gives the same output.

SELECT
      [EmpCode],[EmpName],[Salary]
FROM(     SELECT
      [EmpCode],[EmpName],[Salary]
    ,RANK() OVER (ORDER BY [Salary] DESC) AS [Highest]
FROM
      [DBO].[Employee] WITH (NOLOCK)
) AS High
WHERE
      [Highest] = 4
GO

Yeap, this is what we expected right. J

Let me add few more records for other scenarios where we may not use RANK()function, for this example let us insert few more records.
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Yash',60000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Bhaskar',60000,'Accounts');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Swetha',75000,'IT');
INSERT INTO [DBO].[Employee] ([EmpName],[Salary],[DeptName]) VALUES ('Shailaja',45000,'Sales');
Just execute the below query to check the inserted records:

SELECT
      [EmpCode],[EmpName],[Salary]
FROM
      [DBO].[Employee] WITH (NOLOCK)
GO
Now, again the question is to display 4th highest salary of employee. Before that let us execute and find manually what is the 4th highest salary in the table using the following query.

As we know the 4th highest salary is “60000”and there are three employees having the same salary. Let us execute the above query and check the result now.

SELECT
      [EmpCode],[EmpName],[Salary]
FROM(     SELECT
      [EmpCode],[EmpName],[Salary]
    ,RANK() OVER (ORDER BY [Salary] DESC) AS [Highest]
FROM
      [DBO].[Employee] WITH (NOLOCK)
) AS High
WHERE
      [Highest] = 4
GO

The output would not be as we expected. L
Let us execute the same query with including [Highest] column in the SELECT statement.
SELECT
      [EmpCode],[EmpName],[Salary],[Highest]
FROM
(    
SELECT
      [EmpCode],[EmpName],[Salary]
    ,RANK() OVER (ORDER BY [Salary] DESC) AS [Highest]
FROM
      [DBO].[Employee] WITH (NOLOCK)
) AS High
GO
From the above query output, we can understand that “60000” salary is went to 5th position, It’s because If two or more rows tie for a rank, each tied rows receives the same rank.
Also, you can see, it’s skipped the 3rd position because there are 2 guys in 2nd position.
I know what you are thinking, you do not want to skip any position even in tied situation. Am I right?  J
Yeap, For this kind of situation we can go for DENSE_RANK()function.
Let us execute the same query with replacing DENSE_RANK() function instead of  RANK().
SELECT
      [EmpCode],[EmpName],[Salary],[Highest]
FROM
(    
SELECT
      [EmpCode],[EmpName],[Salary]
    , DENSE_RANK() OVER (ORDER BY [Salary] DESC) AS [Highest]
FROM
      [DBO].[Employee] WITH (NOLOCK)
) AS High
GO
From the above query output, you can understand even the values are tied, it is giving rank without any gaps.
Now, let us execute the below statement to find again the 4th highest though we know it is  “60000”.  
SELECT
      [EmpCode],[EmpName],[Salary]
FROM

(     SELECT
      [EmpCode],[EmpName],[Salary]
    , DENSE_RANK() OVER (ORDER BY [Salary] DESC) AS [Highest]
FROM
      [DBO].[Employee] WITH (NOLOCK)
) AS High
WHERE
      [Highest] = 4
GO
Yeap, we got the same result as we expected. J
The same query can be modified slightly , so that we can identify the Nth  highest salary of employee.
The below query will give us the 2nd highest salary of employee. If you want, you can change the value and execute. J
DECLARE @iHightest      INT
SET @iHightest = 2
SELECT
      [EmpCode],[EmpName],[Salary]
FROM
(    
SELECT
      [EmpCode],[EmpName],[Salary]
    , DENSE_RANK() OVER (ORDER BY [Salary] DESC) AS [Highest]
FROM
      [DBO].[Employee] WITH (NOLOCK)
) AS High
WHERE
      [Highest] = @iHightest
GO
The output would be :
Sometime, we may need to find out  Nth  highest salary of each department. Here, we go. J
DECLARE @iHightest      INT
SET @iHightest = 2
SELECT
      [EmpCode],[EmpName],[Salary],[DeptName]
FROM
(    
SELECT
      [EmpCode],[EmpName],[Salary],[DeptName]
    , DENSE_RANK() OVER (PARTITION BY [DeptName] ORDER BY [Salary] DESC) AS [Highest]
FROM
      [DBO].[Employee] WITH (NOLOCK)
) AS High
WHERE
      [Highest] = @iHightest
ORDER BY
[DeptName]
GO
The output would be :
From the above result, we could get the 2nd highest salary for each department. Have you noticed that in the SQL statement I have used PARTITION BY clause
Which divides the result set produced by the FROM clause into partitions to which the DENSE_RANK()function is applied. 
Hope, you have enjoyed reading this article and share your comments. J
Note :
  1. RANK()or DENSE_RANK()will work in SQL Server 2005 and above versions.
  2. For more info refer http://msdn.microsoft.com/en-us/library/ms173825

Tuesday, May 8, 2012

SSIS : Step by step process of creating a deployment Manifest file

It may be a simple process to create a deployment manifest file in SSIS for experienced guys, but may be difficult for those who are new to SSIS. So, this article will walk you through the steps to create a manifest file, in other words, deployment utility in SSIS.

Before we start discussion, let me tell you what is manifest file in SSIS. :)

Once we complete the development of the package, the next step would be deploying the package in different machine / server. The deployment utility which contains the files you need to deploy like packages, configuration files, Readme file (which might be placed in Miscellaneous folder) etc., These files are included automatically in the deployment utility when we built the package.

Now, Its time to create a manifest file in SSIS. :)

For this example, I have already created one sample package and the package folder will contains the following files before you build a solution or package.

When you build a package as shown below :

After you built the package, you could see one extra folder is created in the name of “bin” inside the package folder which will contain the deployment utility once we create it.

By default, the output path would be “bin”, but this folder name can be changed by selecting the project and right click properties as below :

Just open the bin folder and check, you could see only ImportCSV.dtsx file.

As I said before, its very easy process to create a deployment manifest file which I am going to show now :

Select your project and right click -> Go to Properties -> Deployment Utility and make CreateDeploymentUtility as "True" which is by default “False”.

The deployment utility path can be changed in DeploymentOutputPath as when required. :)

Click “OK” button.

Again, Let us go and build / rebuild solution or package.

Once you built, just go to “bin” folder and check, you could see a new folder as “Deployment”.

Open the folder and could see the manifest file has been created in the name of “ImportCSV.SSISDeploymentManifest” which will be used to send for deployment in other machines / servers.

Let us test, how it works? Double click the manifest file and follow the steps as below :

Press “Next” button.

You can deploy the package either in “File System” or “SQL Server” deployment. For this demo, let us go with default option i.e “File System” deployment. Press "Next" button.

Change the folder location, if you want, otherwise press “Next” button.

Press “Next” button.

Here, you can see the installation details and click “Finish” button to complete the deployment.
In this example, I have not included any configuration or miscellaneous files, so you could see only “ImportCSV.dtsx” file in the deployed folder.

Keep reading this page for XML configuration and adding miscellaneous files in the SSIS package.

Hope, you have enjoyed by reading this article and your comments on this article would be highly appreciated.:)

Note :

  1. Please keep in mind that you cannot deploy a single package, if you have multiple packages in the project. In this case all the packages will be deployed.
  2. For further info, please refer Books Online.