On October 29 I provided a challenge to write a T-SQL function that

implements a primality test for an input value (of a BIGINT datatype).

You can find the challenge here.

Thanks to all those who sent solutions. I know that many of you gave it

a try but didnâ€™t send a solution since you couldnâ€™t come up with a faster

solution than the one posted by Steve Kass. ;-)

Iâ€™ll only provide a brief overview of primality tests here since the subject

is covered thoroughly in Wikipedia

(http://en.wikipedia.org/wiki/Primality_test). Iâ€™ll focus on the T-SQL

solutions assuming you are familiar with the concepts.

Primality tests is a well researched subject in computer science, but

as far as I can tell, all existing efficient algorithms are suited for

iterative/recursive implementations (e.g., the solution posted by Steve

Kass based on the Miller-Rabin test for strong pseudoprimality). I

havenâ€™t found any efficient algorithm that is suited for set-based

implementations, or at least, I couldnâ€™t find or think of any set-based

adaptations of existing efficient algorithms. One of the reasons for

providing this T-SQL challenge was that in my research, I couldnâ€™t find

any attempts (at least successful ones) to address the problem with

set-based logicâ€”such that a SQL based solution would be very fast.

You never know, maybe if enough attention will be given to trying to

solve the problem with set-based logic, some brilliant mind will come

up with a very fast solution based on a fresh set-based approach. Iâ€™m

not losing hopeâ€¦ Iâ€™ll leave the puzzle open, so if youâ€™re on to

something, Iâ€™d love to hear about it.

Primality tests fall under three categories: NaÃ¯ve Methods, Probabilistic

Techniques and Fast Deterministic Techniques.

**NaÃ¯ve Techniques **

NaÃ¯ve algorithms are the slowest. The slowest of the naÃ¯ve methods is:

Given an integer input n:

If n < 2, it is not a prime. If any integer m between 2 and n-1 divides n,

n is a composite, otherwise n is a prime.

Of course, you can do much better with a naÃ¯ve method. The first

improvement can be achieved by lowering the upper boundary value of

m. Let n be a composite expressed as the product ab, either a or b

must be smaller than or equal to the square root of n. So itâ€™s sufficient

to set the upper boundary of m to floor(sqrt(n)). Â All of the solutions I

got to the puzzle besides Steveâ€™s implemented such a naÃ¯ve method,

but all of them used iterative logic. Unlike other existing methods, naÃ¯ve

methods are in fact suited for set-based implementations. Though still

slow compared to probabilistic and fast deterministic techniques, naÃ¯ve

methods can be implemented with much faster set-based solutions

than iterative ones.

Iâ€™ll create all objects in my examples in a database called primesdb:

set nocount on;

use master;

go

if db_id('primesdb') is null create database primesdb;

go

use primesdb;

go

First, you need a fast way to produce a range of integers (BIGINT in

our case). The following code creates a function called fn_nums that

returns a set of integers in the range @min, @max:

-- function returns a nums table in the range @min - @max

if object_id('dbo.fn_nums') is not null

Â drop function dbo.fn_nums;

go

create function dbo.fn_nums(@min as bigint, @max as bigint) returns table

as

Â

return

Â with

Â Â Â l0 as(select 0 as c union all select 0),

Â Â Â l1 as(select 0 as c from l0 as a, l0 as b),

Â Â Â l2 as(select 0 as c from l1 as a, l1 as b),

Â Â Â l3 as(select 0 as c from l2 as a, l2 as b),

Â Â Â l4 as(select 0 as c from l3 as a, l3 as b),

Â Â Â l5 as(select 0 as c from l4 as a, l4 as b),

Â Â Â l6 as(select 0 as c from l5 as a, l5 as b),

Â Â Â nums as(select row_number() over(order by c) as n from l6)

Â select @min + n - 1 as n from nums where n <= @max - @min + 1;

go

You can now implement the first version of the fn_isprime function by

checking whether any integer up to the square root of the input @n

divides @n:

-- function fn_isprime, version 1

if object_id('dbo.fn_isprime') is not null

Â drop function dbo.fn_isprime;

go

create function dbo.fn_isprime(@n as bigint) returns bit

Â with returns null on null input

as

begin

Â

Â declare @explicitprime as bigint;

Â set @explicitprime = 23;

Â

Â if @n < @explicitprime

Â Â Â if @n in (2, 3, 5, 7, 11, 13, 17, 19) return 1 else return 0;

Â

Â if @n%2 = 0 or @n%3 = 0 or @n%5 = 0 or @n%7 = 0

Â or @n%11 = 0 or @n%13 = 0 or @n%17 = 0 or @n%19 = 0 return 0;

Â

Â if @n < cast(square(@explicitprime) as bigint) return 1;

Â

Â declare

Â Â Â @sqrt as bigint,

Â Â Â @from as bigint,

Â Â Â @toÂ Â as bigint;

Â

Â set @sqrt = cast(sqrt(@n) as bigint);

Â

Â set @from = @explicitprime;

Â set @toÂ Â = @sqrt;

Â

Â return

Â Â Â case when exists

Â Â Â Â Â (select * from dbo.fn_nums(@from, @to) where @n%n = 0)

Â Â Â then 0 else 1 end

Â

end;

go

Â

The function first checks whether the input is an obvious prime or

nonprime. You achieve this by explicitly handling primes below some

prime number you choose (call it @explicitprime) instead of using the

fn_nums function. For example, say you set @explicitprime to 23. If

@n < 23, the function checks explicitly if @n is one of the known

primes below 23 or not. If @n >= 23, the function checks explicitly if

any of the primes below 23 divides @n. If the previous tests did not

detect the answer and @n < 529 (square(23)), @n is a prime. Only if

the previous tests for obvious primes or nonprimes did not detect the

answer, the function queries the fn_nums table, with the range 23, floor

(sqrt(@n)), and checks if any integer in the range divides @n.

You can further optimize the solution by checking only divisors that

have potential to be primes. For example, the preliminary tests already

checked whether 2 divides @n, so you can query about half the rows

from fn_nums, and produce only odd divisors with the expression

n*2+1: