SQL Server - Breaking strings into sub-strings using a separator (Split string)

Views: 12.343
This post is the 1 part of 5 in the series. string Split
Reading Time: 3 minutes

Hello guys!
Good night!

Today I will present a function created by my friend Murilo Mielke, which allows you to break a string delimited by some (or some) characters in substrings. For the web developer, that's what makes the explode function of PHP or Java Split, Javascript, C #, etc ..

Basically, you have a string like the example below:

nome;nascimento;email
Nome 1;1994-05-29;[email protected]
Nome 2;1981-07-10;[email protected]
Nome 3;2001-02-27;[email protected]

Imagine that you want to retrieve only the name and email from the records above. Dividing each line using the character “;” as a separator, we have 3 sub-strings. This is exactly what the function below does:

Examples of use:

Using the CLR

Another alternative to solve this problem is to use CLR, a feature that allows you to create .NET (C # or VB.NET) code written within SQL Server, where it generally performs much better than T-SQL code. If you don't know what SQLCLR is, learn more by accessing the post. Introduction to SQL Common Language Runtime (CLR) in SQL Server.

View C # Function (CLR) Source Code

Performance test

While the two functions bring us the same result, the way they are executed is totally different. The T-SQL function is simpler to implement (just an F5), and the CLR function takes more work to be created in the database, as it will be necessary to create an assembly in Visual Studio to use it (if you don't have any). If you already have one created, then it's easy) in addition to the fact that it is written in C #, a language widely known by developers, but not so much by DBA's (All of this was already mentioned in the SQL CLR introduction post).

Given these facts, you may be asking yourself, “So why use the CLR function? What is the advantage? ”. And the answer is this: PERFORMANCE.

Query used for testing:

Result:
SQL Server - Split function CLR TSQL performance

As you can see, the test table has 81.753 lines, where each line has 4 words separated by the character "|", and I applied the function 4 times, to retrieve each of those words, from each line.

The result presented a very convincing argument for using the CLR function: While the T-SQL function took 213,2 seconds to do this processing, the CLR function (which does the same thing) needed only 3,6 seconds. A performance difference of almost 60 times more if using the CLR.
Amazing!

sql server split explode split string strings table valued function

sql server split explode split string strings table valued function

That simple!
Until the next post.