Reading Time: 8 minutesHello people,
Good afternoon!
In this post I will talk about a well-requested feature for Microsoft, and that it has not natively created natively in SQL Server, which is data concatenation using collations, already present in other DBMSs such as MySQL (GROUP_CONCAT), Oracle (XMLAGG) and PostgreeSQL ( STRING_AGG or ARRAY_TO_STRING (ARRAY_AGG ())).
A lot of people think this feature is the CONCAT () function, introduced in SQL Server 2012, but it only allows the concatenation of multiple columns of one row in one column and not the concatenation of multiple rows in one column.
Concatenating columns into grouped strings consists of transforming rows into a concatenated string so that you can group the values by some field, and the other column is concatenated, as shown below:

To create our test table, use the commands below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | IF (OBJECT_ID('dbo.Teste_Group_Concat') IS NOT NULL) DROP TABLE dbo.Teste_Group_Concat CREATE TABLE dbo.Teste_Group_Concat( Categoria VARCHAR(100), Descricao VARCHAR(100) ) INSERT INTO dbo.Teste_Group_Concat( Categoria, Descricao ) VALUES ('Brinquedo', 'Bola'), ('Brinquedo', 'Carrinho'), ('Brinquedo', 'Boneco'), ('Brinquedo', 'Jogo'), ('Cama e Mesa', 'Toalha'), ('Cama e Mesa', 'Edredom'), ('Informatica', 'Teclado'), ('Informatica', 'Mouse'), ('Informatica', 'HD'), ('Informatica', 'CPU'), ('Informatica', 'Memoria'), ('Informatica', 'Placa-Mae'), (NULL, 'TV') |
COALESCE How to concatenate columns in grouped strings using COALESCE
| DECLARE @Nomes VARCHAR(MAX) SELECT @Nomes = COALESCE(@Nomes + ', ', '') + Descricao FROM dbo.Teste_XML SELECT @Nomes |

As we can see above, using the COALESCE function, we always need to store the results in a variable, and for this reason, we cannot get the results grouped by category as we need. If you just want to convert columns to a string, without grouping, this solution will suit you, but not what we need for this post.
STUFF + FOR XML PATH How to concatenate columns in grouped strings using STUFF + FOR XML PATH
A practical and performative way to solve this problem is to use FOR XML PATH and STUFF to perform grouped concatenation, so the result is exactly what we expected:
| SELECT Categoria, STUFF(( SELECT ', ' + B.Descricao FROM dbo.Teste_Group_Concat B WHERE ISNULL(B.Categoria, '') = ISNULL(A.Categoria, '') ORDER BY B.Descricao FOR XML PATH('')), 1, 2, '' ) AS Descricao FROM dbo.Teste_Group_Concat A GROUP BY Categoria ORDER BY Categoria |

UPDATED How to concatenate columns in grouped strings using UPDATE
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 | DECLARE @Categoria VARCHAR(MAX), @Descricoes VARCHAR(MAX) DECLARE @Tabela TABLE ( Categoria VARCHAR(200), Descricao VARCHAR(200), Descricoes VARCHAR(200) ) INSERT @Tabela ( Categoria, Descricao ) SELECT Categoria, Descricao FROM dbo.Teste_Group_Concat ORDER BY Categoria, Descricao UPDATE @Tabela SET @Descricoes = Descricoes = COALESCE(CASE COALESCE(@Categoria, '') WHEN Categoria THEN @Descricoes + ', ' + Descricao ELSE Descricao END, ''), @Categoria = ISNULL(Categoria, '') SELECT Categoria, Descricoes = MAX(Descricoes) FROM @Tabela GROUP BY Categoria ORDER BY Categoria |

Another solution that allowed us to generate the data as our needs demand.
LOOPING WITH CURSOR How to concatenate columns in grouped strings using LOOPING with CURSOR
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 | DECLARE @Tabela TABLE ( Categoria VARCHAR(MAX), Descricoes VARCHAR(MAX) ) INSERT @Tabela ( Categoria, Descricoes ) SELECT Categoria, '' FROM dbo.Teste_Group_Concat GROUP BY Categoria DECLARE @Categoria VARCHAR(MAX), @Descricao VARCHAR(MAX), @Descricoes VARCHAR(MAX) DECLARE c CURSOR LOCAL FAST_FORWARD FOR SELECT Categoria, Descricao FROM dbo.Teste_Group_Concat ORDER BY Categoria, Descricao OPEN c FETCH c INTO @Categoria, @Descricao WHILE @@FETCH_STATUS = 0 BEGIN UPDATE @Tabela SET Descricoes += ', ' + @Descricao WHERE ISNULL(Categoria, '') = ISNULL(@Categoria, '') FETCH c INTO @Categoria, @Descricao END CLOSE c DEALLOCATE c SELECT Categoria, Descricoes = STUFF(Descricoes, 1, 1, '') FROM @Tabela ORDER BY Categoria |

This solution also met what we need, but like all looping, it's not performative. I particularly abhor the creation of cursors (unless you are STILL on SQL Server 2000), as a last resort I prefer to use WHILE, but we have better solutions here in this post.
LOOPING WITH WHILE How to concatenate columns in grouped strings using LOOPING with WHILE
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 | DECLARE @Tabela_Original TABLE ( Id INT IDENTITY(1, 1), Categoria VARCHAR(MAX), Descricao VARCHAR(MAX) ) INSERT @Tabela_Original ( Categoria, Descricao ) SELECT Categoria, Descricao FROM dbo.Teste_Group_Concat ORDER BY Categoria DECLARE @Tabela TABLE ( Id INT IDENTITY(1, 1), Categoria VARCHAR(MAX), Descricoes VARCHAR(MAX) ) INSERT @Tabela ( Categoria, Descricoes ) SELECT Categoria, '' FROM dbo.Teste_Group_Concat GROUP BY Categoria DECLARE @Categoria VARCHAR(MAX), @Descricao VARCHAR(MAX), @Descricoes VARCHAR(MAX), @Contador INT = 1, @Numero_Linhas INT = (SELECT COUNT(*) FROM @Tabela_Original) WHILE(@Contador <= @Numero_Linhas) BEGIN SELECT @Categoria = ISNULL(Categoria, ''), @Descricao = Descricao FROM @Tabela_Original WHERE Id = @Contador ORDER BY Categoria, Descricao UPDATE @Tabela SET Descricoes += ', ' + @Descricao WHERE ISNULL(Categoria, '') = ISNULL(@Categoria, '') SET @Contador = @Contador + 1 END SELECT Categoria, Descricoes = STUFF(Descricoes, 1, 1, '') FROM @Tabela ORDER BY Categoria |

Solution served, but like all looping, it is not performative.
CTE RECURSIVE How to concatenate columns in grouped strings using Recursive CTE
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | ;WITH Dados as ( SELECT Categoria, CONVERT(NVARCHAR(MAX), Descricao) AS Descricao, ROW_NUMBER() OVER (PARTITION BY Categoria ORDER BY Descricao) AS Linha FROM dbo.Teste_Group_Concat ), Tabela1 AS ( SELECT Categoria, Descricao, Linha FROM Dados WHERE Linha = 1 ), Resultado AS ( SELECT Categoria, Descricao, Linha FROM Tabela1 WHERE Linha = 1 UNION ALL SELECT Dados.Categoria, Resultado.Descricao + N', ' + Dados.Descricao, Dados.Linha FROM Dados JOIN Resultado ON Resultado.Categoria = Dados.Categoria AND Dados.Linha = Resultado.Linha + 1 ) SELECT Categoria, Descricoes = MAX(Descricao) FROM Resultado GROUP BY Categoria ORDER BY Categoria OPTION (MAXRECURSION 0); |

This solution uses the recursion technique to solve our problem without looping. Although not very easy to understand at first, it is a practical and very interesting solution.
SCALAR FUNCTION (UDF) How to Concatenate Columns in Grouped Strings Using Scalar Function (UDF)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | CREATE FUNCTION dbo.fncConcatCategoria ( @Ds_Categoria VARCHAR(MAX) ) RETURNS VARCHAR(MAX) WITH SCHEMABINDING AS BEGIN DECLARE @Retorno VARCHAR(MAX); SELECT @Retorno = COALESCE(@Retorno + ', ', '') + Descricao FROM dbo.Teste_Group_Concat WHERE ISNULL(Categoria, '') = ISNULL(@Ds_Categoria, '') ORDER BY Descricao RETURN @Retorno END GO SELECT Categoria, dbo.fncConcatCategoria(Categoria) AS Descricao FROM dbo.Teste_Group_Concat GROUP BY Categoria |

Very useful function and once created, becomes very practical to use. The problem with this solution is that it is not very performative and requires you to create a function for each table that you need to perform this solution on, that is, it is not very generic and you would need to create several functions that do pretty much the same thing in your data base.
SQL CLR (C #) How to concatenate columns in grouped strings using SCL CLR (C #)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 | using System; using System.Data.SqlTypes; using Microsoft.SqlServer.Server; using System.Text; [Serializable] [SqlUserDefinedAggregate( Format.UserDefined, IsInvariantToNulls = false, IsInvariantToDuplicates = false, IsInvariantToOrder = true, MaxByteSize = -1)] public struct concat : IBinarySerialize, INullable { private StringBuilder _accumulator; private string _delimiter; public bool IsNull { get; private set; } public void Init() { _accumulator = new StringBuilder(); _delimiter = string.Empty; this.IsNull = true; } public void Accumulate([SqlFacet(MaxSize = -1)]SqlString Value, [SqlFacet(MaxSize = -1)]SqlString Delimiter) { if (Value.IsNull) return; if (!Delimiter.IsNull & Delimiter.Value.Length > 0) { _delimiter = Delimiter.Value; if (_accumulator.Length > 0) _accumulator.Append(Delimiter.Value); } _accumulator.Append(Value.Value); IsNull = false; } public void Merge(concat Group) { if (_accumulator.Length > 0 & Group._accumulator.Length > 0) _accumulator.Append(_delimiter); _accumulator.Append(Group._accumulator.ToString()); } public SqlString Terminate() { return new SqlString(_accumulator.ToString()); } void IBinarySerialize.Read(System.IO.BinaryReader r) { _delimiter = r.ReadString(); _accumulator = new StringBuilder(r.ReadString()); if (_accumulator.Length != 0) this.IsNull = false; } void IBinarySerialize.Write(System.IO.BinaryWriter w) { w.Write(_delimiter); w.Write(_accumulator.ToString()); } } |

My favorite solution. Practical, extremely fast, generic and solves our problem. Not sure what CLR is or know, but not sure how to implement it in your database? Learn more by visiting my post Introduction to SQL Common Language Runtime (CLR) in SQL Server.
Performance Considerations
Let's now measure our solutions and find out which ones are the fastest and the slowest.
CLR

STUFF + FOR XML PATH

UPDATED

Recursive CTE

UDF scalar function

While loop

Cursor Loop

Right away we can eliminate the loopings, which were much worse. What happens if we insert more records? Like some 850.000 lines. What will the performance be like?
CLR

STUFF + FOR XML PATH

Since no other method was able to finish processing in less than 1 minute, although I used 850.000 lines with the CLR and FOR XML PATH, I will only use 158.000 lines (27%) for the rest of the methods to try to parse the results. We will see.
UPDATED

Until the time with UPDATE wasn't too bad. 5.7s to 158.000 lines is a reasonable time.
UDF scalar function

Recursive CTE

Well, I tried to wait for the Recursive CTE .. I even went to lunch and left running here, but after 2 hours and 22 minutes I couldn't wait to finish processing the 158 thousand lines (remembering that the CLR processed 850.000 in 3s and FOR XML in 1.5s) ..
Apparently our winner was FOR XML PATH, but if we are going to analyze the results, the CLR delivered the correct result, even with all this nonsense rows:

FOR XML PATH… It ended up getting lost and the results were not in the correct order.

Therefore, the best solution for this kind of situation is SQL CLR!
That's it folks!
Thanks for stopping by and see you in the next post.
sql server clr function stuff for xml path recursive cte convert columns rows grouped concat concatenation string how to convert concatenate columns to string
sql server clr function stuff for xml path recursive cte convert columns rows grouped concat concatenation string how to convert concatenate columns to string
Concatenate many rows into single text string
Concatenate many rows into single text string
Congratulations Dirceu, for sharing this valuable information, I take the opportunity to add a plugin to your topic that I found after your blog, with four functions already ready for the clr, just adding it to the bank.
http://groupconcat.codeplex.com/
André, thanks for the compliment and for sharing this link. I will take a look at it to learn more about.
Thanks for the reply, Dirceu!
I didn't remember SQL Sentry anymore. I used a version a long time ago, but I don't think it was like that. I am already installing here to evaluate again.
Thanks a lot!
Dirceu,
I would like to congratulate you on the article! Didactic and very practical, was calm to understand and compare the various solutions.
I have been particularly using the FOR XML with STUFF solution, which has served me well so far. But the SQL CLR solution was really good! Case to think for next implementations of this.
Just one question: did you use SQL Server or another database? In fact, I was wondering which tool you used in these graphical comparisons, which I thought was really cool.
Thank you!
Laércio
Laercio,
Good afternoon.
Thank you so much for visiting and feedback?
About your question, the database is SQL Server yes, but the tool I use to analyze the execution plan is SQL Sentry Plan Explorer FREE (I will make a post about this tool in the future).
In my opinion, this tool offers a better view of the data, as well as some features to manipulate the information that SQL Server Management Studio does not have, but would return something similar by Management Studio itself.
If you have any questions, do not hesitate to ask.