A frequent request that pops up in public SQL Server newsgroups is how to perform base conversions. That is, you store values as character strings containing digits in a given base and want to convert them to a target base. Typically, you need to store values in a nondecimal base when you have an application that works with nondecimal values (e.g., serial numbers represented in base 36, binary bitmaps stored in base 2). A decimal base uses the digits 0 through 9. A base n that's smaller than 10 uses the digits 0 through n − 1. A base that's greater than 10 uses the digits 0 through 9 plus alphabetical characters starting with A. For example, base 16 uses the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and the characters A through F, where A represents the decimal value 10, B represents 11, and so on. However, SQL Server doesn't support working with values expressed in a nondecimal base. Thus, you have a problem if you store such values and need to perform arithmetic manipulations, such as calculating the result of 1101 + 1010 expressed in base 2.
This month, we'll explore how to convert a value in any given base to a decimal value so that you can apply any arithmetic manipulation. Next month, I'll show you how to convert the resulting decimal value back to a given base. As is typical for T-SQL query problems, writing a base converter requires a lot of logical manipulation. Even if you don't really need a base converter, writing one in T-SQL is good practice for polishing your T-SQL techniques. Consider the task as a puzzle or an exercise in logic—and try to come up with your own solution before looking at mine. You can also check the solution to last month's logical puzzle in "The Logical Puzzle," and try to solve this month's logic problem.
Converting to Decimals
Given a character string representing a value in a certain base, your task is to convert the input to a decimal value of the bigint data type. Run the script that Listing 1 shows to create table T1 and populate it with three values, each in a different base. The id column is an IDENTITY column that functions as the primary key, val is a character string that contains the value, and base is the value's base. You need to write T-SQL code that converts all input values from T1 in the given bases to decimal (base 10) values. Table 1 shows the desired output. Try to come up with a set-based solution, which will probably be more efficient than an iterative solution.
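Listing 1 isn't reproduced here, so the following is only a minimal sketch of what such a script might look like. The table and column names come from the description above; the column sizes, the CHECK constraint, and the three sample rows are assumptions chosen for illustration.

```sql
-- Hypothetical stand-in for Listing 1; sample rows are illustrative only.
CREATE TABLE dbo.T1
(
  id   int         NOT NULL IDENTITY PRIMARY KEY,
  val  varchar(64) NOT NULL,                        -- digits of the value in the given base
  base int         NOT NULL CHECK (base BETWEEN 2 AND 36)
);

INSERT INTO dbo.T1 (val, base) VALUES ('1F', 16);   -- 31 decimal
INSERT INTO dbo.T1 (val, base) VALUES ('1101', 2);  -- 13 decimal
INSERT INTO dbo.T1 (val, base) VALUES ('2Z', 36);   -- 2 * 36 + 35 = 107 decimal
```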
To describe my solution, I'll first explain the logic behind converting a value in a given base to a decimal value. The decimal value of an input v of a base b that contains n digits is SUM(1st digit × b^0 + 2nd digit × b^1 + 3rd digit × b^2 + ... + nth digit × b^(n−1)), where the first digit is the right-most digit, the second is the second digit from the right, and so on. For example, the hexadecimal (base 16) value 1F is 15 × 16^0 + 1 × 16^1 = 15 + 16 = 31 decimal. The first hex digit F (read from right to left) is equal to 15 decimal, and the second hex digit 1 is equal to 1 decimal.
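If you want to check the arithmetic, you can spell out the positional formula by hand in T-SQL. This is just a sanity check of the formula for two sample values (hex 1F and binary 1101), not part of the solution itself; the column aliases are made up for the example.

```sql
-- Each term is digit * base raised to (position - 1), counting positions from the right.
SELECT 15 * POWER(16, 0) + 1 * POWER(16, 1) AS dec_1F,    -- 15 + 16 = 31
       1 * POWER(2, 0) + 0 * POWER(2, 1)
     + 1 * POWER(2, 2) + 1 * POWER(2, 3)   AS dec_1101;   -- 1 + 0 + 4 + 8 = 13
```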
The first step in the calculation is to break each input value into its individual digits from right to left. You can easily achieve this by joining T1 with an auxiliary table of numbers, which you create and populate by running the code in Listing 2. The join condition you use is
n <= LEN(val)
You specify the following expression in the SELECT list to extract the individual digits:
SUBSTRING(val, LEN(val) - n + 1, 1)
For example, the value 1F will generate two result rows: one with n equal to 1 and the digit F, and another with n equal to 2 and the digit 1.
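The first step can be sketched as a standalone query. To keep the example self-contained, it assumes inline stand-ins for T1 and the auxiliary Nums table, built with the VALUES row constructor (which needs SQL Server 2008 or later); the sample values and the eight-row number list are illustrative.

```sql
-- Step 1: one row per digit, numbered from the right (pos = 1 is the right-most digit).
SELECT T1.val,
       Nums.n AS pos,
       SUBSTRING(T1.val, LEN(T1.val) - Nums.n + 1, 1) AS digit
FROM (VALUES ('1F', 16), ('1101', 2)) AS T1(val, base)                 -- stand-in for T1
  JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8)) AS Nums(n)      -- stand-in for Nums
    ON Nums.n <= LEN(T1.val)
ORDER BY T1.val, pos;
```

For 1F, the query returns the digit F at position 1 and the digit 1 at position 2, as described above.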
The second step is a bit tricky. Now that you've calculated the digit position (n) and extracted the digit in the nth position, calculate the digit's decimal value. You can perform this calculation by finding the character position of the extracted digit within the string '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ' and subtracting 1. For example, the digit F appears as the 16th character in that string; 16 − 1 gives you its decimal value, 15. In T-SQL, the expression to perform this calculation is:
CHARINDEX(SUBSTRING(val, LEN(val) - n + 1, 1),
'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') - 1
Run the query in Listing 3 to apply the previous two steps, which break each source value into its individual digits and their respective positions and calculate the decimal value of each digit. Table 2 shows the results of this query.
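A query along these lines might look like the following sketch. It is not Listing 3 itself: the inline T1 and Nums stand-ins and their sample rows are assumptions. The aliases pos, digit, and decdigit follow the names used in the text.

```sql
-- Steps 1 and 2: split each value into digits and map each digit to its decimal value.
SELECT T1.val, T1.base,
       Nums.n AS pos,
       SUBSTRING(T1.val, LEN(T1.val) - Nums.n + 1, 1) AS digit,
       CHARINDEX(SUBSTRING(T1.val, LEN(T1.val) - Nums.n + 1, 1),
                 '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') - 1 AS decdigit
FROM (VALUES ('1F', 16), ('1101', 2)) AS T1(val, base)                 -- stand-in for T1
  JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8)) AS Nums(n)      -- stand-in for Nums
    ON Nums.n <= LEN(T1.val)
ORDER BY T1.val, pos;
-- Caveats: CHARINDEX returns 0 for a character that isn't in the digit string,
-- so an invalid digit yields decdigit = -1; whether lowercase letters match
-- depends on the collation in effect.
```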
The third step in solving the problem is to multiply decdigit by base raised to the power of the digit's position minus 1: decdigit × base^(pos − 1). The fourth and final step is to group the rows by id (a group for each source value) and calculate the sum of the results of the third step's calculations.
To perform the last two steps, create a derived table from the query that Listing 3 shows. In the outer query, group the data by id, val, and base, and for each group return the result of the following T-SQL expression:
SUM(decdigit * POWER(CAST(base AS bigint), pos-1))
The base is stored as a regular int data type. The reason for converting the base to a bigint data type is that the POWER() function returns the same data type as the first argument's data type. If you want to support decimal values up to the highest possible integer value (maximum bigint value), the first argument for the POWER() function must be a bigint data type. Listing 4 shows the complete solution that generates the desired results, which Table 1 contains.
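Putting the four steps together, the complete query could be sketched as follows. This is an illustration rather than Listing 4: T1 and Nums are replaced by inline stand-ins with hypothetical sample rows, and the result alias decval is made up for the example.

```sql
-- Steps 3 and 4: weight each digit by base^(pos - 1) and sum per source value.
SELECT id, val, base,
       -- CAST to bigint so that POWER() returns bigint rather than int
       SUM(decdigit * POWER(CAST(base AS bigint), pos - 1)) AS decval
FROM (SELECT T1.id, T1.val, T1.base,
             Nums.n AS pos,
             CHARINDEX(SUBSTRING(T1.val, LEN(T1.val) - Nums.n + 1, 1),
                       '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') - 1 AS decdigit
      FROM (VALUES (1, '1F', 16), (2, '1101', 2), (3, '2Z', 36)) AS T1(id, val, base)
        JOIN (VALUES (1), (2), (3), (4), (5), (6), (7), (8)) AS Nums(n)
          ON Nums.n <= LEN(T1.val)) AS D
GROUP BY id, val, base
ORDER BY id;
```

With these sample rows, the query should return 31, 13, and 107.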
Encapsulating the Logic
If you want to encapsulate the conversion logic, you can create a user-defined function (UDF) that accepts the source value and base as inputs and returns the decimal value. There's nothing really special about the function besides the fact that you query only the Nums table. Instead of having multiple inputs in a table like T1, you can use only one input value and base. You specify the solution query in the function's RETURN clause.
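As a rough sketch of such a function (a stand-in written from the description above, not Listing 5 itself), the code below assumes a dbo.Nums table with an integer column n and creates a small one if it's missing. The parameter names, the varchar(64) size, and the 64-row limit are assumptions.

```sql
-- Create a small auxiliary numbers table if one doesn't exist yet.
IF OBJECT_ID('dbo.Nums', 'U') IS NULL
BEGIN
  CREATE TABLE dbo.Nums (n int NOT NULL PRIMARY KEY);
  INSERT INTO dbo.Nums (n)
    SELECT n
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
          FROM sys.all_objects) AS R
    WHERE n <= 64;   -- enough positions for a 64-character input
END
GO

-- Fails if dbo.fn_basetodec already exists; drop it first in that case.
CREATE FUNCTION dbo.fn_basetodec (@val AS varchar(64), @base AS int)
RETURNS bigint
AS
BEGIN
  -- Same logic as the set-based query, but for a single value and base:
  -- digit value * base^(position - 1), summed over all positions.
  RETURN
   (SELECT SUM((CHARINDEX(SUBSTRING(@val, LEN(@val) - n + 1, 1),
                          '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') - 1)
               * POWER(CAST(@base AS bigint), n - 1))
    FROM dbo.Nums
    WHERE n <= LEN(@val));
END
GO
```

The GO lines are batch separators for tools such as SQL Server Management Studio or sqlcmd; CREATE FUNCTION has to be the first statement in its batch.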
Run the code in Listing 5 to create the dbo.fn_basetodec() function. To test the function, invoke it and use the value 1F and the base 16 as inputs:
SELECT dbo.fn_basetodec('1F', 16)
You'll get the decimal value 31 as a result.
Now that you have a method for converting a value in any given base (up to base 36) to decimal, you can perform arithmetic operations between inputs in any given bases. For example, to calculate the result of adding two base 2 values, 1101 + 1010, simply run
SELECT dbo.fn_basetodec('1101', 2)
+ dbo.fn_basetodec('1010', 2)
This code produces the decimal result 23. If you want to convert the resulting decimal value to base 2, you need to write a program that converts the value from base 10 to base 2, or better yet, to any desired target base. Next month, I'll show you how to write such a converter.