Click on the banner to learn about and purchase my database training on Azure

SQL Server - Performance Comparison between Scalar Function and CLR Scalar Function

Views: 1.792 views
This post is the 5 part of 10 in the series. Performance tuning
Reading Time: 13 minutes

Hello people,
All right with you ??

In this post, which will be the blog's 200 number, I would like to talk about two subjects that I particularly like about SQL Server, which is the creation of .NET (CLR) routines within the SQL Server database and performance tuning.

Coincidental or not, my post number 100 was the Introduction to SQL Common Language Runtime (CLR) in SQL Server, which was my first blog post about the CLR.

My intention in this post, is to demonstrate the performance gain that can be obtained by using Scalar functions of the CLR instead of the Scalar function written in T-SQL (UDF - User Defined Function). In the tests carried out, the vast majority of the T-SQL functions that were migrated to CLR functions have a great performance gain simply due to the optimizations of the Microsoft .NET Framework compared to the SQL Server engine.

In many situations, we see that indexes are created correctly, statistics updated, and without having to rewrite the query, you can achieve a great performance gain by simply changing the programming language of scalar functions.

Will CLR function always be better?

Some may have some questions about the performance of scalar functions of the CLR, which I will clarify:

- Will the scalar functions of the CLR always perform better than functions?
Certainly not! I've done a lot of tests comparing different functions, for different purposes and from my experience, the scalar functions written in C # on CLR do have a better performance, but I've seen some cases where even optimizing the code to the maximum, the function T-SQL performs better.

- What if we compare the performance of native functions with the CLR functions?
Functions, in general, always degrade performance if a database query. However, as much as the CLR scalar functions are quite performative, the native SQL Server functions (eg CONVERT, CAST, DATEADD, etc.) generally perform better when compared.

Comparing performance in practice

After making this introduction on the subject, it's time to really demonstrate what was said here. And nothing better than performing performance tests in practice to convince you that the CLR can in many cases provide a great performance gain in your SQL Server queries.

To generate the mass of test data, I used this script:

The source code for the fncRand () function, as well as the explanation of why I use this function instead of RAND (), you can find in the post. SQL Server - Msg 443 Invalid use of a side-effecting operator 'rand' within a function.

fncFirstDayMes

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

WHERE Performance Comparison

fncLastDayMes

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

WHERE Performance Comparison

fncSplit

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison
Notice that, after waiting 5 mins, I ended up canceling and starting the tests again, but with only 10.000 records for the T-SQL function, compared to the CLR function running on 100.000 records. But even though the CLR function is running 10x more, see what happened in the results…

WHERE Performance Comparison

fncBase64_Encode

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

fncBase64_Decode

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

fncValida_CPF

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

fncRetrieveNumbers

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

fncMes

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

fncConverte_In_Hours

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

fncFormat_Document

View content
T-SQL Source Code:

CLR Source Code:

Output:

SELECT Performance Comparison

Conclusion

After 10 performance tests between Scalar functions T-SQL x CLR, it can be concluded that in all the cases presented (I chose 10 from 90 functions that I have in 2 languages, randomly), the CLR function had a huge performance gain. , both CPU and runtime. In the company where I work as a DBA, I tested all 90 functions and in 87 cases, the function written in C # was executed at least half the time of the T-SQL function.

However, as I said above, this does not mean that in 100% of cases this will occur. Like everything related to performance tuning, there is no perfect tip that applies to 100% of situations. Before changing a T-SQL function to SQL, do several tests to ensure that there was really a significant performance gain when creating the function in CLR.

In the first 2 examples, where the functions are very simple, I could also compare with native SQL Server functions, without using UDF functions. In this case, it was observed that without using UDF function and applying native functions directly in SELECT and / or WHERE, it ends up being faster than CLR. However, the functions are often quite complex, which makes it difficult and even impossible to use them without creating a UDF function.

Functions in SQL Server are a great way to encapsulate and reuse code, but at the same time can end up compromising the performance of your queries. In this case, the solution that acts as a compromise between native SQL Server function and T-SQL scalar function would be the CLR scalar function, which outperforms the T-SQL function (in the vast majority of cases) and is close to of the native function and at the same time, enables the reuse and encapsulation of code.

Through this post, I hope I have shown you a real way to optimize SQL queries without having to change 1 line of code, without having to change anything in your instance. In many cases, indexes are being used as expected, statistics are up to date, no warning in the execution plan, but when using a T-SQL function, query performance is poor.

In this case, using CLR functions can be a big improvement in your application / routine and gain precious seconds / CPU cycles in everyday life. As a result, it was possible to reduce several and several hours of CPU / processing daily in the instances of my work by simply replacing T-SQL functions with CLR functions and this can be a big differentiator in DBA life.

I hope you enjoyed this post.
Any questions or criticism, leave here in the comments.

Hug!