On October 29 I provided a challenge to write a T-SQL function that

implements a primality test for an input value (of a BIGINT datatype).

You can find the challenge here.

Thanks to all those who sent solutions. I know that many of you gave it

a try but didn’t send a solution since you couldn’t come up with a faster

solution than the one posted by Steve Kass. ;-)

I’ll only provide a brief overview of primality tests here since the subject

is covered thoroughly in Wikipedia

(http://en.wikipedia.org/wiki/Primality_test). I’ll focus on the T-SQL

solutions assuming you are familiar with the concepts.

Primality tests is a well researched subject in computer science, but

as far as I can tell, all existing efficient algorithms are suited for

iterative/recursive implementations (e.g., the solution posted by Steve

Kass based on the Miller-Rabin test for strong pseudoprimality). I

haven’t found any efficient algorithm that is suited for set-based

implementations, or at least, I couldn’t find or think of any set-based

adaptations of existing efficient algorithms. One of the reasons for

providing this T-SQL challenge was that in my research, I couldn’t find

any attempts (at least successful ones) to address the problem with

set-based logic—such that a SQL based solution would be very fast.

You never know, maybe if enough attention will be given to trying to

solve the problem with set-based logic, some brilliant mind will come

up with a very fast solution based on a fresh set-based approach. I’m

not losing hope… I’ll leave the puzzle open, so if you’re on to

something, I’d love to hear about it.

Primality tests fall under three categories: Naïve Methods, Probabilistic

Techniques and Fast Deterministic Techniques.

**Naïve Techniques **

Naïve algorithms are the slowest. The slowest of the naïve methods is:

Given an integer input n:

If n < 2, it is not a prime. If any integer m between 2 and n-1 divides n,

n is a composite, otherwise n is a prime.

Of course, you can do much better with a naïve method. The first

improvement can be achieved by lowering the upper boundary value of

m. Let n be a composite expressed as the product ab, either a or b

must be smaller than or equal to the square root of n. So it’s sufficient

to set the upper boundary of m to floor(sqrt(n)). All of the solutions I

got to the puzzle besides Steve’s implemented such a naïve method,

but all of them used iterative logic. Unlike other existing methods, naïve

methods are in fact suited for set-based implementations. Though still

slow compared to probabilistic and fast deterministic techniques, naïve

methods can be implemented with much faster set-based solutions

than iterative ones.

I’ll create all objects in my examples in a database called primesdb:

set nocount on;

use master;

go

if db_id('primesdb') is null create database primesdb;

go

use primesdb;

go

First, you need a fast way to produce a range of integers (BIGINT in

our case). The following code creates a function called fn_nums that

returns a set of integers in the range @min, @max:

-- function returns a nums table in the range @min - @max

if object_id('dbo.fn_nums') is not null

drop function dbo.fn_nums;

go

create function dbo.fn_nums(@min as bigint, @max as bigint) returns table

as

return

with

l0 as(select 0 as c union all select 0),

l1 as(select 0 as c from l0 as a, l0 as b),

l2 as(select 0 as c from l1 as a, l1 as b),

l3 as(select 0 as c from l2 as a, l2 as b),

l4 as(select 0 as c from l3 as a, l3 as b),

l5 as(select 0 as c from l4 as a, l4 as b),

l6 as(select 0 as c from l5 as a, l5 as b),

nums as(select row_number() over(order by c) as n from l6)

select @min + n - 1 as n from nums where n <= @max - @min + 1;

go

You can now implement the first version of the fn_isprime function by

checking whether any integer up to the square root of the input @n

divides @n:

-- function fn_isprime, version 1

if object_id('dbo.fn_isprime') is not null

drop function dbo.fn_isprime;

go

create function dbo.fn_isprime(@n as bigint) returns bit

with returns null on null input

as

begin

declare @explicitprime as bigint;

set @explicitprime = 23;

if @n < @explicitprime

if @n in (2, 3, 5, 7, 11, 13, 17, 19) return 1 else return 0;

if @n%2 = 0 or @n%3 = 0 or @n%5 = 0 or @n%7 = 0

or @n%11 = 0 or @n%13 = 0 or @n%17 = 0 or @n%19 = 0 return 0;

if @n < cast(square(@explicitprime) as bigint) return 1;

declare

@sqrt as bigint,

@from as bigint,

@to as bigint;

set @sqrt = cast(sqrt(@n) as bigint);

set @from = @explicitprime;

set @to = @sqrt;

return

case when exists

(select * from dbo.fn_nums(@from, @to) where @n%n = 0)

then 0 else 1 end

end;

go

The function first checks whether the input is an obvious prime or

nonprime. You achieve this by explicitly handling primes below some

prime number you choose (call it @explicitprime) instead of using the

fn_nums function. For example, say you set @explicitprime to 23. If

@n < 23, the function checks explicitly if @n is one of the known

primes below 23 or not. If @n >= 23, the function checks explicitly if

any of the primes below 23 divides @n. If the previous tests did not

detect the answer and @n < 529 (square(23)), @n is a prime. Only if

the previous tests for obvious primes or nonprimes did not detect the

answer, the function queries the fn_nums table, with the range 23, floor

(sqrt(@n)), and checks if any integer in the range divides @n.

You can further optimize the solution by checking only divisors that

have potential to be primes. For example, the preliminary tests already

checked whether 2 divides @n, so you can query about half the rows

from fn_nums, and produce only odd divisors with the expression

n*2+1: