- SQL Server - Breaking strings into sub-strings using a separator (Split string)
- SQL Server - Table-valued function to break a string into lines up to N characters long
- How to wrap a string in a substring table using a delimiter in SQL Server
- SQL Server - Charindex: A Different Function to Break Split Strings
- SQL Server - Using STRING_SPLIT to transform strings from a row into columns
Speak guys !!
In this article I would like to bring a scenario that I am asked from time to time about how to solve, which is to transform strings from one row to columns, widely used for accounting accounts or cost center, for example, which was exactly the situation I was asked for. help for about 10 days and I decided to take some time to write this article.
And then the person needed to break the numbers of the cost centers (separated by “.”) In different columns, in order to be able to set up a hierarchy of the cost centers and perform some analysis / report on this data.
Solution # 1 - STRING_SPLIT
If you are using the 2016+ version of SQL Server, you can use the function STRING_SPLIT to achieve the desired objective:
1 2 3 4 5 6 7 8 9 10 11 |
DECLARE @Tabela TABLE ( Conta VARCHAR(100) ) INSERT INTO @Tabela VALUES ('1'), ('01.04'), ('01.06'), ('01.04.14'), ('01.04.18'), ('1.1.3'), ('1.5.6'), ('1.1.2'), ('1.4.3.1.4'), ('1.6.4.32') SELECT *, ROW_NUMBER() OVER(PARTITION BY A.Conta ORDER BY A.Conta) AS Nivel FROM @Tabela A CROSS APPLY STRING_SPLIT(A.Conta, '.') |
To convert these various rows into columns, we can use aggregation functions with CASE's, assuming that the maximum level would be 10 levels:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
DECLARE @Tabela TABLE ( Conta VARCHAR(100) ) INSERT INTO @Tabela VALUES ('1'), ('01.04'), ('01.06'), ('01.04.14'), ('01.04.18'), ('1.1.3'), ('1.5.6'), ('1.1.2'), ('1.4.3.1.4'), ('1.6.4.32') IF (OBJECT_ID('tempdb..#Tabela') IS NOT NULL) DROP TABLE #Tabela ;WITH dados AS ( SELECT A.Conta, [value] AS Palavra, ROW_NUMBER() OVER(PARTITION BY A.Conta ORDER BY A.Conta) AS Nivel FROM @Tabela A CROSS APPLY STRING_SPLIT(A.Conta, '.') ) SELECT A.Conta, MAX(CASE WHEN A.Nivel = 1 THEN CONVERT(INT, A.Palavra) END) AS Parte1, MAX(CASE WHEN A.Nivel = 2 THEN CONVERT(INT, A.Palavra) END) AS Parte2, MAX(CASE WHEN A.Nivel = 3 THEN CONVERT(INT, A.Palavra) END) AS Parte3, MAX(CASE WHEN A.Nivel = 4 THEN CONVERT(INT, A.Palavra) END) AS Parte4, MAX(CASE WHEN A.Nivel = 5 THEN CONVERT(INT, A.Palavra) END) AS Parte5, MAX(CASE WHEN A.Nivel = 6 THEN CONVERT(INT, A.Palavra) END) AS Parte6, MAX(CASE WHEN A.Nivel = 7 THEN CONVERT(INT, A.Palavra) END) AS Parte7, MAX(CASE WHEN A.Nivel = 8 THEN CONVERT(INT, A.Palavra) END) AS Parte8, MAX(CASE WHEN A.Nivel = 9 THEN CONVERT(INT, A.Palavra) END) AS Parte9, MAX(CASE WHEN A.Nivel = 10 THEN CONVERT(INT, A.Palavra) END) AS Parte10 FROM dados A GROUP BY A.Conta |
And if you want to use a more dynamic solution, where the number of levels in the table is defined by the number of levels of the actual data, you can also use the code below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 |
------------------------------------------------------ -- GERA OS DADOS ------------------------------------------------------ DECLARE @Tabela TABLE ( Conta VARCHAR(100) ) INSERT INTO @Tabela VALUES ('1'), ('01.04'), ('01.06'), ('01.04.14'), ('01.04.18'), ('1.1.3'), ('1.5.6'), ('1.1.2'), ('1.4.3.1.4'), ('1.6.4.32') ------------------------------------------------------ -- QUEBRA AS CONTAS EM PARTES ------------------------------------------------------ IF (OBJECT_ID('tempdb..#Partes') IS NOT NULL) DROP TABLE #Partes SELECT A.Conta, CONVERT(INT, [value]) AS Parte, ROW_NUMBER() OVER(PARTITION BY A.Conta ORDER BY A.Conta) AS Nivel INTO #Partes FROM @Tabela A CROSS APPLY STRING_SPLIT(A.Conta, '.') ------------------------------------------------------ -- GERA OS SCRIPTS PRA CRIAÇÃO E ATUALIZAÇÃO DINÂMICA ------------------------------------------------------ DECLARE @Contador INT = 1, @Total INT = (SELECT MAX(Nivel) FROM #Partes), @QueryCreate VARCHAR(MAX) = 'CREATE TABLE ##Tabela ( Conta VARCHAR(100)', @QueryUpdate VARCHAR(MAX) = '' WHILE(@Contador <= @Total) BEGIN SET @QueryCreate += ', Nivel' + CONVERT(VARCHAR(10), @Contador) + ' INT' SET @QueryUpdate += 'UPDATE A SET A.Nivel' + CONVERT(VARCHAR(10), @Contador) + ' = B.Parte FROM ##Tabela A JOIN #Partes B ON A.Conta = B.Conta WHERE B.Nivel = ' + CONVERT(VARCHAR(10), @Contador) + '; ' SET @Contador += 1 END SET @QueryCreate += ' )' ------------------------------------------------------ -- CRIA A TABELA DINAMICAMENTE ------------------------------------------------------ IF (OBJECT_ID('tempdb..##Tabela') IS NOT NULL) DROP TABLE ##Tabela EXEC(@QueryCreate) INSERT INTO ##Tabela ( Conta ) SELECT DISTINCT Conta FROM #Partes ------------------------------------------------------ -- ATUALIZA OS DADOS DINAMICAMENTE ------------------------------------------------------ EXEC(@QueryUpdate) ------------------------------------------------------ -- RESULTADO FINAL ------------------------------------------------------ SELECT * FROM ##Tabela |
Solution # 2 - fncSplitTexto
If you are using a pre-2016 version of SQL Server, the function STRING_SPLIT it cannot be used. However, you can very well use the fncSplitTexto function, which I made available in the article How to wrap a string in a substring table using a delimiter in SQL Server.
The advantage of using this function, in addition to backward compatibility, is that it already has the indexer by default.
Once created, its use is almost identical to that of STRING_SPLIT:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
DECLARE @Tabela TABLE ( Conta VARCHAR(100) ) INSERT INTO @Tabela VALUES ('1'), ('01.04'), ('01.06'), ('01.04.14'), ('01.04.18'), ('1.1.3'), ('1.5.6'), ('1.1.2'), ('1.4.3.1.4'), ('1.6.4.32') IF (OBJECT_ID('tempdb..#Tabela') IS NOT NULL) DROP TABLE #Tabela SELECT Conta, MAX(CASE WHEN B.Id = 1 THEN CONVERT(INT, B.Palavra) END) AS Parte1, MAX(CASE WHEN B.Id = 2 THEN CONVERT(INT, B.Palavra) END) AS Parte2, MAX(CASE WHEN B.Id = 3 THEN CONVERT(INT, B.Palavra) END) AS Parte3, MAX(CASE WHEN B.Id = 4 THEN CONVERT(INT, B.Palavra) END) AS Parte4, MAX(CASE WHEN B.Id = 5 THEN CONVERT(INT, B.Palavra) END) AS Parte5, MAX(CASE WHEN B.Id = 6 THEN CONVERT(INT, B.Palavra) END) AS Parte6, MAX(CASE WHEN B.Id = 7 THEN CONVERT(INT, B.Palavra) END) AS Parte7, MAX(CASE WHEN B.Id = 8 THEN CONVERT(INT, B.Palavra) END) AS Parte8, MAX(CASE WHEN B.Id = 9 THEN CONVERT(INT, B.Palavra) END) AS Parte9, MAX(CASE WHEN B.Id = 10 THEN CONVERT(INT, B.Palavra) END) AS Parte10 FROM @Tabela CROSS APPLY dbo.fncSplitTexto(Conta, '.') AS B GROUP BY Conta |
And to read the data dynamically, according to the existing columns in the data, you can use the code below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 |
------------------------------------------------------ -- GERA OS DADOS ------------------------------------------------------ DECLARE @Tabela TABLE ( Conta VARCHAR(100) ) INSERT INTO @Tabela VALUES ('1'), ('01.04'), ('01.06'), ('01.04.14'), ('01.04.18'), ('1.1.3'), ('1.5.6'), ('1.1.2'), ('1.4.3.1.4'), ('1.6.4.32') ------------------------------------------------------ -- QUEBRA AS CONTAS EM PARTES ------------------------------------------------------ IF (OBJECT_ID('tempdb..#Partes') IS NOT NULL) DROP TABLE #Partes SELECT Conta, B.Id, CONVERT(INT, B.Palavra) AS Parte INTO #Partes FROM @Tabela CROSS APPLY dbo.fncSplitTexto(Conta, '.') AS B ------------------------------------------------------ -- GERA OS SCRIPTS PRA CRIAÇÃO E ATUALIZAÇÃO DINÂMICA ------------------------------------------------------ DECLARE @Contador INT = 1, @Total INT = (SELECT MAX(Id) FROM #Partes), @QueryCreate VARCHAR(MAX) = 'CREATE TABLE ##Tabela ( Conta VARCHAR(100)', @QueryUpdate VARCHAR(MAX) = '' WHILE(@Contador <= @Total) BEGIN SET @QueryCreate += ', Parte' + CONVERT(VARCHAR(10), @Contador) + ' INT' SET @QueryUpdate += 'UPDATE A SET A.Parte' + CONVERT(VARCHAR(10), @Contador) + ' = B.Parte FROM ##Tabela A JOIN #Partes B ON A.Conta = B.Conta WHERE B.Id = ' + CONVERT(VARCHAR(10), @Contador) + '; ' SET @Contador += 1 END SET @QueryCreate += ' )' ------------------------------------------------------ -- CRIA A TABELA DINAMICAMENTE ------------------------------------------------------ IF (OBJECT_ID('tempdb..##Tabela') IS NOT NULL) DROP TABLE ##Tabela EXEC(@QueryCreate) INSERT INTO ##Tabela ( Conta ) SELECT DISTINCT Conta FROM #Partes ------------------------------------------------------ -- ATUALIZA OS DADOS DINAMICAMENTE ------------------------------------------------------ EXEC(@QueryUpdate) ------------------------------------------------------ -- RESULTADO FINAL ------------------------------------------------------ SELECT * FROM ##Tabela |
The performance of both functions
If you’ve studied Performance Tuning a bit or read my article SQL Server - Performance Comparison between Scalar Function and CLR Scalar Function, I'm sure you already know that using STRING_SPLIT is much faster than using a UDF (User Defined Function), even though TVF (Table-valued Function).
Even so, I will show the performance difference of the native function (I will test with and without ROW_NUMBER) with a UDF function, in a data volume of approximately 20 thousand records:
As you could see, the performance difference is huge between a UDF and a native function, and the more data in the table, the bigger the difference. For this reason, I recommend always using the native function STRING_SPLIT, when possible.
Well guys, I hope you enjoyed this article.
A big hug and see you next time!
show this widely used in groups and sub groups items an open chain of levels.