In "Intervals and Counts, Part 1," I started a series about temporal intervals and computing all sorts of counts related to them. For sample data I provided a table called Sessions holding information about sessions that connect to applications. Use the code in Listing 1 to create the Sessions table and fill it with a small set of sample data to check the correctness of your solutions. Use the code in Listing 2 to create a helper function called GetNums, and use it to fill the Sessions table with a large set of sample data for performance testing purposes.
The first task I discussed last month was one presented to me by Adam Machanic, involving computing counts of overlapping discrete intervals. I presented two solutions. The first solution used window aggregate and offset functions and took 37 seconds to complete on my system against the large set of sample data. The second solution used window ranking and offset functions and took 22 seconds to complete. This month I discuss how you can further improve the second solution, as well as how to handle a variation of the task that involves packing of abutting intervals with the same count.
Improving Solution for Counts During Discrete Intervals
Before we explore ways to improve the second solution I presented last month, let's first revisit what that solution and its plan looked like. Listing 3 provides the solution code, and Figure 1 shows its plan (using SQL Sentry Plan Explorer).
The cool thing about this plan is that the optimizer didn't require any explicit sort operations. Throughout the plan, the optimizer relied exclusively on the ordered scans of the indexes idx_start and idx_end to compute all window functions, including the ones based on the unified set of events. However, one thing missing from this plan is parallelism. Especially with large sets, parallelism could be important to reduce run times.
I learned a trick from Adam that can improve parallelism treatment when your querying logic needs to apply work for each group of rows independently. In our case, the work needs to be done for each group of rows that are associated with the same application independently. I described this technique in "SQL Server 2005's APPLY, Part 2."
The technique involves querying the table holding the distinct groups, and using the CROSS APPLY operator to apply the original query logic per group. With enough data, SQL Server will usually use a parallel plan and distribute the work to the different threads. Each group's data is processed by one thread, and multiple groups can be processed by multiple threads in parallel. This strategy tends to distribute the work nicely between the threads, and often better than parallel plans—when such are chosen by default—to process window functions. But in our case the optimizer didn't even choose a parallel plan for the original query in Listing 3.
To use the APPLY strategy, you create an inline table function that operates on only one application's intervals. This means that the function will accept the application as an input parameter (@app). The query returned by the function is fundamentally based on the logic of the query in Listing 3, but you need to apply a few changes:
- Remove the app column from all SELECT clauses.
- Remove PARTITION BY app from all window functions.
- Add the filter WHERE app = @app to the queries in the CTE C1; those are the queries that generate the chronological sequence of events.
Use the code in Listing 4 to create the IntervalCounts function, which implements such an inline table function.
Next, to apply the function to each application from the Apps table, use the following query:
SELECT A.app, IC.*
FROM dbo.Apps AS A
CROSS APPLY dbo.IntervalCounts(A.app) AS IC;
Figure 2 shows the plan for this query. You can see the efficient handling of parallelism in the plan. This query completed in about 10 seconds on my system—less than half the run time of the original query!
Counts During Discrete Intervals with Packing
The task that's the focus of this section is a variation of the previous task. Like before, you need to produce the counts of overlapping discrete intervals, but you also need to pack abutting discrete intervals that have the same count. Figure 3 shows a graphical depiction of the task.
Notice the intervals for app3. The original task required you to produce four discrete intervals. The new task requires you to pack the three abutting intervals with the same count, resulting in two discrete intervals: one starting at 8:00 and ending at 8:30 with a count of 2, and another starting at 8:30 and ending at 10:00 with a count of 1. Figure 4 shows the desired result for the small set of sample data (sorted by app and starttime for clarity, although there's no requirement from the solution to return the result sorted).
app starttime endtime cnt
----- -------------------- -------------------- ----
app1 2013-02-12 08:30:00 2013-02-12 08:45:00 2
app1 2013-02-12 08:45:00 2013-02-12 09:00:00 1
app1 2013-02-12 09:00:00 2013-02-12 09:15:00 2
app1 2013-02-12 09:15:00 2013-02-12 09:30:00 4
app1 2013-02-12 09:30:00 2013-02-12 10:30:00 2
app1 2013-02-12 10:30:00 2013-02-12 10:45:00 1
app1 2013-02-12 10:45:00 2013-02-12 11:00:00 2
app1 2013-02-12 11:00:00 2013-02-12 11:30:00 3
app1 2013-02-12 11:30:00 2013-02-12 12:30:00 2
app1 2013-02-12 12:30:00 2013-02-12 14:30:00 1
app2 2013-02-12 08:30:00 2013-02-12 08:45:00 1
app2 2013-02-12 09:00:00 2013-02-12 09:30:00 1
app2 2013-02-12 11:45:00 2013-02-12 12:00:00 1
app2 2013-02-12 12:30:00 2013-02-12 12:45:00 1
app2 2013-02-12 12:45:00 2013-02-12 13:00:00 2
app2 2013-02-12 13:00:00 2013-02-12 13:30:00 3
app2 2013-02-12 13:30:00 2013-02-12 14:00:00 2
app2 2013-02-12 14:00:00 2013-02-12 15:30:00 1
app2 2013-02-12 15:30:00 2013-02-12 16:30:00 2
app2 2013-02-12 16:30:00 2013-02-12 17:00:00 1
app3 2013-02-12 08:00:00 2013-02-12 08:30:00 2
app3 2013-02-12 08:30:00 2013-02-12 10:00:00 1
Listing 5 shows the complete solution, including packing logic. I'll explain it in steps. The code in the CTEs C1 and C2 is similar to the code used in the solution without the packing. That's the part that generates the chronological sequence of events, as well as a start ordinal near start events (s), an end ordinal near end events (e), and a combined ordinal near all events (se). Next, the following query is used to define the CTE C3:
s - (se - s), -- count of active intervals after start event
(se - e) - e) -- count of active intervals after end event
) AS cnt,
LEAD(ts) OVER(PARTITION BY app ORDER BY ts, type, keycol) AS nextts
The query computes the count of active intervals after start events (s - (se - s)) and the count after end events ((se - e) - e). Then the COALESCE function returns whichever of the two that isn't NULL. The result of this computation (named cnt) is simply the count of active intervals after the current event.
The query also uses the LEAD function to return the timestamp (ts) of the next event, naming the resulting column nextts. Figure 5 shows the output of this query for application app3.
app ts type keycol s e se cnt nextts
----- ----- ----- ------- ----- ----- --- ---- -------
app3 08:00 1 61 1 NULL 1 1 08:00
app3 08:00 1 62 2 NULL 2 2 08:30
app3 08:30 -1 61 NULL 1 3 1 09:00
app3 09:00 -1 62 NULL 2 4 0 09:00
app3 09:00 1 63 3 NULL 5 1 09:30
app3 09:30 -1 63 NULL 3 6 0 09:30
app3 09:30 1 64 4 NULL 7 1 10:00
app3 10:00 -1 64 NULL 4 8 0 NULL
Next, the following query is used to define the CTE C4:
ts AS starttime,
nextts AS endtime,
CASE WHEN cnt = LAG(cnt) OVER(PARTITION BY app ORDER BY ts)
THEN NULL ELSE 1 END isstart
WHERE ts <> nextts
AND nextts IS NOT NULL
The query filters only the events where the current ts is different than nextts because they're the events representing interesting discrete intervals. The ordering used in the window functions ensures that if multiple start or end events happen at the same point in time, the one appearing last before the timestamp changes will have a count of active intervals after all have been applied. This count reflects the count of overlapping sessions during that discrete interval.
The query also filters only events where nextts isn't NULL. The very last end event will have a NULL in the nextts column, and that's not an interesting event for us to return.
The query uses a CASE expression with the LAG function to check whether the count of the current interval is the same as the count of the previous abutting interval. If they're the same, the computation returns 1; otherwise it returns NULL. The resulting column is named isstart. When the current interval starts a new packed group—even when the group is made of only the self-interval—the isstart flag is 1. When the interval continues a packed group, the isstart flag is NULL. Figure 6 shows the output of this query for app3.
app starttime endtime cnt isstart
----- ---------- -------- ---- -----------
app3 08:00 08:30 2 1
app3 08:30 09:00 1 1
app3 09:00 09:30 1 NULL
app3 09:30 10:00 1 NULL
The next step is to compute an identifier for each packed interval. This is done by the query that defines the CTE C5:
COUNT(isstart) OVER(PARTITION BY app ORDER BY starttime
ROWS UNBOUNDED PRECEDING) AS grp
The packed interval identifier is simply computed as the running count of the isstart flag until the current interval. Figure 7 shows the output of this query, where you can see that two packed intervals were identified, marked as grp 1 and grp 2.
app starttime endtime cnt isstart grp
----- ---------- -------- ---- -------- ----
app3 08:00 08:30 2 1 1
app3 08:30 09:00 1 1 2
app3 09:00 09:30 1 NULL 2
app3 09:30 10:00 1 NULL 2
Finally, the outer query against C5 removes all intervals with a count of 0 and groups the remaining rows by app and the packed interval identifier, returning the start and end times for each packed interval. Note that it's very important not to remove the intervals with the count of 0 before the isstart flag is computed. Doing so might cause intervals that don't abut, but have the same count, to be packed together. After the isstart flag is computed, it's safe to remove those noninteresting intervals. The output of this query is the desired output shown previously in Figure 4.
Figure 8 shows the plan for the query in Listing 5, which took 88 seconds to complete on my system.
To improve the performance of this solution, you can use the APPLY technique, as you did to improve the performance of the solution for the previous task. Use the code in Listing 6 to create the IntervalCounts inline function, implementing the same logic as in the query in Listing 5, but for a single input application (@app). Then use the following code to apply the function to each application from the Apps table:
SELECT A.app, IC.*
FROM dbo.Apps AS A
CROSS APPLY dbo.IntervalCounts(A.app) AS IC;
When I ran this code on my system with eight logical CPUs, unfortunately I didn't get a parallel plan. I got the serial plan shown in Figure 9, which took 54 seconds to complete.
Interestingly, this solution is still faster than the one in Listing 5 even though parallelism wasn't used. But it should run much faster if you can find a way to get a parallel plan. It would be nice if T-SQL supported a MINDOP hint, but it doesn't. However, here's a trick that does the job. Add an artificial cross join with a table expression that returns only one row; this won't change the result, but it makes the optimizer think that there are, for example, 100 rows. Here's how you do it:
DECLARE @n AS BIGINT = 1;
SELECT A.app, IC.*
FROM dbo.Apps AS A
CROSS APPLY dbo.IntervalCounts(A.app) AS IC
CROSS JOIN (SELECT TOP (@n) * FROM dbo.Apps) AS B
OPTION (OPTIMIZE FOR (@n = 100));
This is enough to cause the optimizer to choose a parallel plan, as Figure 10 shows. This time, the code completed in 24 seconds!
Still More To Come
There are many intriguing challenges related to intervals and counts. This article covered two main topics. One was an improved solution using the APPLY operator for computing counts during discrete intervals. Another was computing counts during discrete intervals including packing logic. Next month, I'll continue this series by covering additional tasks involving intervals and counts.