T-SQL Travel Puzzle

I got this nice puzzle from Chip Smithson. The data involved contains travel information. Use the following code to create a table called Travel and populate it with sample data:

SET NOCOUNT ON;

USE tempdb;

 

IF OBJECT_ID('dbo.Travel', 'U') IS NOT NULL

  DROP TABLE dbo.Travel;

 

CREATE TABLE dbo.Travel

(

  travelerid    INT          NOT NULL,

  destination   VARCHAR(25NOT NULL,

  travelstart   DATETIME2(0) NOT NULL,

  milestraveled INT          NOT NULL,

  PRIMARY KEY(travelerid, travelstart)

);

 

INSERT INTO dbo.Travel(travelerid, destination, travelstart, milestraveled) VALUES

  (2, 'England', '20110212 10:00:00',    0),

  (2, 'England', '20110212 12:00:00',  200),

  (2, 'England', '20110212 14:00:00',  300),

  (2, 'England', '20110213 08:00:00',  500),

  (2, 'Germany', '20110215 10:00:00',  700),

  (2, 'Germany', '20110216 03:00:00',  900),

  (2, 'Germany', '20110216 08:00:00', 1100),

  (2, 'Germany', '20110218 06:00:00', 1300),

  (2, 'England', '20110219 12:00:00', 1700),

  (2, 'England', '20110219 14:00:00', 1900),

  (2, 'England', '20110221 10:00:00', 2300),

  (7, 'USA',     '20110314 10:00:00',    0),

  (7, 'USA',     '20110314 12:00:00',  100),

  (7, 'USA',     '20110314 14:00:00',  200),

  (7, 'Canada''20110315 08:00:00',  300),

  (7, 'Canada''20110317 10:00:00',  900),

  (7, 'Canada''20110318 03:00:00', 1200),

  (7, 'USA',     '20110318 08:00:00', 1300),

  (7, 'USA',     '20110320 06:00:00', 1500),

  (7, 'USA',     '20110321 12:00:00', 1600),

  (7, 'Canada''20110321 14:00:00', 1800),

  (7, 'Canada''20110322 10:00:00', 2500);

 

Each row in the table Travel represents one leg of a traveler’s itinerary, with information about the traveler ID, destination, travel start date and time, and total miles travelled so far. The primary key of the table is defined on the combination of attributes travelerid and travelstart. As you can see, each row contains only the flight destination, not the source. The information about the origin of the flight appears in the chronologically previous flight for the same traveler. Observe that the table contains a row with a milestraveled value of 0 for each traveler representing the start of the itinerary.

The challenge is to calculate for each traveler the total number of miles travelled domestically in each country. International flights are not interesting for the purposes of this challenge. The desired result for the given sample data are (in no specific order):

travelerid  destination               totalmiles

----------- ------------------------- -----------

7           Canada                    1600

2           England                   1100

2           Germany                   600

7           USA                       500

 

To test the validity of your solution use the above sample data. To test its performance, you need a larger table. Use the following code to populate the table Travel with a larger set:

-- helper table returning a sequence of integers of a requested size

IF OBJECT_ID('dbo.GetNums', 'IF') IS NOT NULL

  DROP FUNCTION dbo.GetNums;

GO

CREATE FUNCTION dbo.GetNums(@n AS BIGINT) RETURNS TABLE

AS

RETURN

  WITH

    L0   AS(SELECT 1 AS c UNION ALL SELECT 1),

    L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),

    L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),

    L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),

    L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),

    L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),

    Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)

  SELECT TOP (@n) n FROM Nums ORDER BY n;

GO

 

-- populate table with large sample data for performance testing

TRUNCATE TABLE dbo.Travel;

 

INSERT INTO dbo.Travel WITH(TABLOCK)

    (travelerid, destination, travelstart, milestraveled)

  SELECT

    NTILE(2) OVER(ORDER BY n) AS travelerid,

    'Destination ' + CAST((NTILE(8) OVER(ORDER BY n) - 1) % 2 + 1 AS VARCHAR(10)) AS destination,

    DATEADD(second, (n-1) % 500000, '20110101') AS travelstart,

    (n-1) % 500000 AS milestraveled

  FROM dbo.GetNums(1000000);

 

Please send your solutions to itzik@solidq.com with the subject T-SQL Challenge. Feel free to also post them here as comments if you like.

I’ll provide the solutions in my next post.

Good luck!

BG

Discuss this Blog Entry 4

on Aug 17, 2010
;WITH cteTravel(TravelerID, Destination, TotalMiles)
AS (
SELECT TravelerID,
Destination,
MAX(MilesTraveled) - MIN(MilesTraveled) AS TotalMiles
FROM (
SELECT TravelerID,
Destination,
MilesTraveled,
ROW_NUMBER() OVER (PARTITION BY TravelerID ORDER BY MilesTraveled) AS Yak,
ROW_NUMBER() OVER (PARTITION BY TravelerID, Destination ORDER BY MilesTraveled) AS Yik
FROM dbo.Travel
) AS x
GROUP BY TravelerID,
Destination,
Yak - Yik
)
SELECT TravelerID,
Destination,
SUM(TotalMiles) AS TotalMiles
FROM cteTravel
GROUP BY TravelerID,
Destination























on Aug 17, 2010
My bad. Just realized clustered index is over travelerid and travelstart (not milestraveled).

;WITH cteTravel(TravelerID, Destination, TotalMiles)
AS (
SELECT TravelerID,
Destination,
MAX(MilesTraveled) - MIN(MilesTraveled) AS TotalMiles
FROM (
SELECT TravelerID,
Destination,
MilesTraveled,
ROW_NUMBER() OVER (PARTITION BY TravelerID ORDER BY TravelStart) AS Yak,
ROW_NUMBER() OVER (PARTITION BY TravelerID, Destination ORDER BY TravelStart) AS Yik
FROM dbo.Travel
) AS x
GROUP BY TravelerID,
Destination,
Yak - Yik
)
SELECT TravelerID,
Destination,
SUM(TotalMiles) AS TotalMiles
FROM cteTravel
GROUP BY TravelerID,
Destination























on Aug 18, 2010
Hi Itzik,

this is a nice case of "unpacking running totals", so a solution with a single table scan can use previous operator, but it not exists...
my solution use an inline function to emulate it but this don't work good as the wonderful Pesomannen's solution.

create function fi_NextTravel(@travelerid int, @travelstart datetime2(0))
returns table as return

select top 1 *
from dbo.Travel
where travelerid=@travelerid and travelstart>@travelstart
order by travelstart
go
with r as(
select *, Ml=0 from dbo.Travel where milestraveled=0

union all

select n.*, case n.destination when r.destination then n.milestraveled-r.milestraveled else 0 end
from r
cross apply dbo.fi_NextTravel(r.travelerid, r.travelstart) n
)
select travelerid, destination, milestraveld=sum(Ml) from r
group by travelerid, destination
option(maxrecursion 0)
go
























on Aug 17, 2010
;WITH CTE_Ord AS (
SELECT ROW_NUMBER() OVER(PARTITION BY t.travelerid ORDER BY t.travelstart) AS Ord
,t.travelerid
,t.destination
,t.travelstart
,t.milestraveled
FROM dbo.Travel AS T
)
SELECT COFrom.travelerid,COFrom.destination,SUM(COTo.milestraveled - COFrom.milestraveled) milestraveled
FROM CTE_Ord AS COFrom
INNER JOIN CTE_Ord AS COTo
ON COFrom.travelerid = COTo.travelerid
AND COFrom.destination = COTo.destination
AND COFrom.Ord = COTo.Ord - 1
GROUP BY COFrom.travelerid,COFrom.destination













Please or Register to post comments.

What's Puzzled By T-SQL Blog?

T-SQL tips and logical puzzles from Itzik Ben-Gan.

Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×