SQL Server - Using STRING_SPLIT to transform strings from a row into columns

Views: 1.002
This post is the 5 part of 5 in the series. string Split
Reading Time: 6 minutes

Speak guys !!
In this article I would like to bring a scenario that I am asked from time to time about how to solve, which is to transform strings from one row to columns, widely used for accounting accounts or cost center, for example, which was exactly the situation I was asked for. help for about 10 days and I decided to take some time to write this article.

Table example:

And then the person needed to break the numbers of the cost centers (separated by “.”) In different columns, in order to be able to set up a hierarchy of the cost centers and perform some analysis / report on this data.

Solution # 1 - STRING_SPLIT

If you are using the 2016+ version of SQL Server, you can use the function STRING_SPLIT to achieve the desired objective:

Result:

One of the users' biggest complaints about the STRING_SPLIT function is the lack of a numeric and incremental field returning the index of each line that was broken. In the example above, I had to create this line using a ranking function (row_number), but be aware that this does NOT guarantee the correct ordering.

To convert these various rows into columns, we can use aggregation functions with CASE's, assuming that the maximum level would be 10 levels:

Result:

And if you want to use a more dynamic solution, where the number of levels in the table is defined by the number of levels of the actual data, you can also use the code below:

Result:

Solution # 2 - fncSplitTexto

If you are using a pre-2016 version of SQL Server, the function STRING_SPLIT it cannot be used. However, you can very well use the fncSplitTexto function, which I made available in the article How to wrap a string in a substring table using a delimiter in SQL Server.

The advantage of using this function, in addition to backward compatibility, is that it already has the indexer by default.

Once created, its use is almost identical to that of STRING_SPLIT:

Result:

And to read the data dynamically, according to the existing columns in the data, you can use the code below:

Result:

The performance of both functions

If you’ve studied Performance Tuning a bit or read my article SQL Server - Performance Comparison between Scalar Function and CLR Scalar Function, I'm sure you already know that using STRING_SPLIT is much faster than using a UDF (User Defined Function), even though TVF (Table-valued Function).

Even so, I will show the performance difference of the native function (I will test with and without ROW_NUMBER) with a UDF function, in a data volume of approximately 20 thousand records:

As you could see, the performance difference is huge between a UDF and a native function, and the more data in the table, the bigger the difference. For this reason, I recommend always using the native function STRING_SPLIT, when possible.

Well guys, I hope you enjoyed this article.
A big hug and see you next time!