Wednesday 17 December 2014

SQL Scalar Value Function Inefficiencies

Are scalar value functions (SVF) really the best place to centralise the logic in your DB?

Although I’m against it many developers place business logic regardless of complexity into the database through the use of inline case statements where this could be a simple calculation to calculate a shipping cost for example. As the database grows and the number of SQL queries increases where they have a dependency on this calculation it becomes good practice to centralise the code for reuse rather than duplicating the logic in numerous queries.

One of the options for centralising this logic is through the use of scalar value functions where these calls are placed inline in the SQL statement, either in the SELECT or as a predicate in the WHERE clause. Using these inside a query actually impedes the performance as the SVF call is executed in a different context to the main query, thus causing an additional cost to the analyser.

If we take a simple Products data table with 100k records as an example (this simply generates some random numbers for the properties of a product):

set nocount on

create table dbo.temp_Products
(
 [ProductID] int,
 [Type] tinyint,
 [Price] float,
 [Weight] float 
)

declare @rowcnt int = 0
while (@rowcnt <= 100000)
begin
 insert into dbo.temp_Products
 select @rowcnt, 1+rand()*4, 1+rand()*100, 1+rand()*10

 set @rowcnt = @rowcnt + 1
end

And a scalar value function which does a simple logical calculation for calculating a shipping cost:

create function dbo.usvf_CalculateShipping
(
    @PricePerKG float = 1,
    @Type tinyint,
    @WeightInKG float = 1
)
returns float
as
begin

    return  /*get the appropriate factor to apply*/
            case
                when @Type = 1 then 0.1
                when @Type = 2 then 0.2
                when @Type = 3 then 0.35
                when @Type = 4 then 0.43
            end * @PricePerKG * @WeightInKG

end
go

When we call the SVF inside a query the resulting performance is impaired relative to calling the logic inline. To compare the differences between the two take the following two SQL statements:

select ProductID, case
                when [Type] = 1 then 0.1
                when [Type] = 2 then 0.2
                when [Type] = 3 then 0.35
                when [Type] = 4 then 0.43
            end *Price*[Weight] from temp_Products

select ProductID, dbo.usvf_CalculateShipping(Price, [Type], [Weight]) from temp_Products

The first query which runs the logic inline produces a result in a quickier time in comparison to the second query. The results are as follows:

(100001 row(s) affected)
Table 'temp_Products'. Scan count 1, logical reads 390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 78 ms,  elapsed time = 212 ms.

(100001 row(s) affected)
Table 'temp_Products'. Scan count 1, logical reads 390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 842 ms,  elapsed time = 938 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

To get around this the solution would be to use an inline table-valued function (TVF) where the logic resides, this logic would then be injected into the plan of the query calling this function, affectively placing the logic inline.

create function dbo.utvf_CalculateShipping
(
    @PricePerKG float = 1,
    @Type tinyint,
    @WeightInKG float = 1
)
returns table as return 
select case
         when @Type = 1 then 0.1
         when @Type = 2 then 0.2
         when @Type = 3 then 0.35
         when @Type = 4 then 0.43
       end * @PricePerKG * @WeightInKG as Shipping
go

If I run the following two queries you’ll find the performance of both are almost identical:

select ProductID, case
                when [Type] = 1 then 0.1
                when [Type] = 2 then 0.2
                when [Type] = 3 then 0.35
                when [Type] = 4 then 0.43
            end *Price*[Weight] from temp_Products
select ProductID, tt.Shipping from temp_Products t cross apply dbo.utvf_CalculateShipping(Price, Type, Weight) tt



(100001 row(s) affected)
Table 'temp_Products'. Scan count 1, logical reads 390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 109 ms,  elapsed time = 310 ms.

(100001 row(s) affected)
Table 'temp_Products'. Scan count 1, logical reads 390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 109 ms,  elapsed time = 285 ms.

No comments:

Post a Comment