Q: What is a self-join?
A: A self-join is a standard SQL Server join that joins a table to itself. You can perform a self-join when all the data you require is in a single table, and the data you need to extract relates to other data in the same table. You join a table to itself using a field or combination of fields that have matching data in different records. To perform a self-join, you must use a table name for each instance of the table, so that each instance has a separate name. Use a self-join when you want to compare values in a column to other values in the same column. The data type of the columns on which the join is made must be the same or you’ll need to cast them to the same type.
For example, in employee information records, each record might have a field for an employee’s ID number and also a field for the supervisor’s ID number. To retrieve a list of employees and their managers you need to join the employee information table to itself.
Q: Is a self-join an inner join or an outer join?
A: A self-join can be either an inner join or an outer join. Let’s look at some examples that show how using either an inner join or an outer join affects the self-join.
First, run the following code to create a table called SaleOnDate with two columns, DateOfMonth and Sold- Qty, and populate it with sample data:
USE TestData GO CREATE TABLE SaleOnDate (DateOfMonth INT, SoldQty INT) GO INSERT INTO SaleOnDate (DateOfMonth, SoldQty) SELECT 1, 100 UNION ALL SELECT 2, 150 UNION ALL SELECT 3, 200 UNION ALL SELECT 4, 225 UNION ALL SELECT 6, 250 UNION ALL SELECT 7, 280 UNION ALL SELECT 8, 300 GO
The SaleOnDate table contains information about the day of the month and the quantity of items sold on that day. Figure 1 shows the contents of this table, which you can see by running the following query:
- SELECT * FROM SaleOnDate
Next, run the following query to find the difference between a named day and the next day:
SELECT sd1.DateOfMonth, (sd2.SoldQty - sd1.SoldQty) DifferenceInSale FROM SaleOnDate sd1 INNER JOIN SaleOnDate sd2 ON sd1. DateOfMonth + 1= sd2.DateOfMonth GO
This query demonstrates the use of an inner join when joining the SaleOnDate table to itself. As you can see in Figure 2, the result set doesn’t contain information about days 4 and 8, because days 5 and 9 don’t exist in the database. This is because the conditions specified in the join clause haven’t been met.
Next, let’s look at the use of a left outer join, also called a left join, to join the SaleOnDate table to itself. The following code demonstrates the use of a left join when joining the SaleOnDate table to itself:
SELECT sd1.DateOfMonth, ISNULL(CAST((sd2.SoldQty - sd1. SoldQty) AS VARCHAR(10)), ‘No Data’) DifferenceInSale FROM SaleOnDate sd1 LEFT JOIN SaleOnDate sd2 ON sd1. DateOfMonth + 1= sd2.DateOfMonth GO
The result set, shown in Figure 3, now contains information about days 4 and 8. Because days 5 and 9 don’t exist in the database, the result set contains NULL for days 4 and 8. NULL values are displayed as No Data because the ISNULL function was used in the SELECT clause.