SQL Server - How to concatenate rows by grouping data by one column (Grouped concatenation)

Views: 26.513
Reading Time: 8 minutes

Hello people,
Good afternoon!

In this post I will talk about a well-requested feature for Microsoft, and that it has not natively created natively in SQL Server, which is data concatenation using collations, already present in other DBMSs such as MySQL (GROUP_CONCAT), Oracle (XMLAGG) and PostgreeSQL ( STRING_AGG or ARRAY_TO_STRING (ARRAY_AGG ())).

A lot of people think this feature is the CONCAT () function, introduced in SQL Server 2012, but it only allows the concatenation of multiple columns of one row in one column and not the concatenation of multiple rows in one column.

Concatenating columns into grouped strings consists of transforming rows into a concatenated string so that you can group the values ​​by some field, and the other column is concatenated, as shown below:

SQL Server - Grouped Concatenation convert rows into string

To create our test table, use the commands below:

COALESCE

How to concatenate columns in grouped strings using COALESCE

SQL Server - Grouped Concatenation convert rows into string - COALESCE

As we can see above, using the COALESCE function, we always need to store the results in a variable, and for this reason, we cannot get the results grouped by category as we need. If you just want to convert columns to a string, without grouping, this solution will suit you, but not what we need for this post.

STUFF + FOR XML PATH

How to concatenate columns in grouped strings using STUFF + FOR XML PATH

A practical and performative way to solve this problem is to use FOR XML PATH and STUFF to perform grouped concatenation, so the result is exactly what we expected:

SQL Server - Grouped Concatenation convert rows into string - STUFF FOR XML PATH

UPDATE

How to concatenate columns in grouped strings using UPDATE

SQL Server - Grouped Concatenation convert rows into string - UPDATE

Another solution that allowed us to generate the data as our needs demand.

LOOPING WITH CURSOR

How to concatenate columns in grouped strings using LOOPING with CURSOR

SQL Server - Grouped Concatenation convert rows into string - CURSOR

This solution also met what we need, but like all looping, it's not performative. I particularly abhor the creation of cursors (unless you are STILL on SQL Server 2000), as a last resort I prefer to use WHILE, but we have better solutions here in this post.

LOOPING WITH WHILE

How to concatenate columns in grouped strings using LOOPING with WHILE

SQL Server - Grouped Concatenation convert rows into string - WHILE

Solution served, but like all looping, it is not performative.

CTE RECURSIVE

How to concatenate columns in grouped strings using Recursive CTE

SQL Server - Grouped Concatenation convert rows into string - RECURSIVE CTE

This solution uses the recursion technique to solve our problem without looping. Although not very easy to understand at first, it is a practical and very interesting solution.

SCALAR FUNCTION (UDF)

How to Concatenate Columns in Grouped Strings Using Scalar Function (UDF)

SQL Server - Grouped Concatenation convert rows into string - SCALAR FUNCTION UDF

Very useful function and once created, becomes very practical to use. The problem with this solution is that it is not very performative and requires you to create a function for each table that you need to perform this solution on, that is, it is not very generic and you would need to create several functions that do pretty much the same thing in your data base.

SQL CLR (C #)

How to concatenate columns in grouped strings using SCL CLR (C #)

SQL Server - Grouped Concatenation convert rows into string - SQL CLR CSHARP

My favorite solution. Practical, extremely fast, generic and solves our problem. Not sure what CLR is or know, but not sure how to implement it in your database? Learn more by visiting my post Introduction to SQL Common Language Runtime (CLR) in SQL Server.

Performance Considerations

Let's now measure our solutions and find out which ones are the fastest and the slowest.

CLR
SQL Server - Grouped Concatenation convert rows into string - Performance - CLR

STUFF + FOR XML PATH
SQL Server - Grouped Concatenation convert rows into string - Performance - STUFF FOR XML PATH

UPDATE
SQL Server - Grouped Concatenation convert rows into string - Performance - UPDATE

Recursive CTE
SQL Server - Grouped Concatenation convert rows into string - Performance - CTE

UDF scalar function
SQL Server - Grouped Concatenation convert rows into string - Performance - UDF Function

While loop
SQL Server - Grouped Concatenation convert rows into string - Performance - WHILE

Cursor Loop
SQL Server - Grouped Concatenation convert rows into string - Performance - CURSOR

Right away we can eliminate the loopings, which were much worse. What happens if we insert more records? Like some 850.000 lines. What will the performance be like?

CLR
SQL Server - Grouped Concatenation convert rows into string - Performance - CLR2

STUFF + FOR XML PATH
SQL Server - Grouped Concatenation convert rows into string - Performance - STUFF FOR XML PATH

Since no other method was able to finish processing in less than 1 minute, although I used 850.000 lines with the CLR and FOR XML PATH, I will only use 158.000 lines (27%) for the rest of the methods to try to parse the results. We will see.

UPDATE
SQL Server - Grouped Concatenation convert rows into string - Performance - UPDATE

Until the time with UPDATE wasn't too bad. 5.7s to 158.000 lines is a reasonable time.

UDF scalar function
SQL Server - Grouped Concatenation convert rows into string - Performance - UDF Function3

Recursive CTE
SQL Server - Grouped Concatenation convert rows into string - Performance - CTE Recursive

Well, I tried to wait for the Recursive CTE .. I even went to lunch and left running here, but after 2 hours and 22 minutes I couldn't wait to finish processing the 158 thousand lines (remembering that the CLR processed 850.000 in 3s and FOR XML in 1.5s) ..

Apparently our winner was FOR XML PATH, but if we are going to analyze the results, the CLR delivered the correct result, even with all this nonsense rows:
SQL Server - Grouped Concatenation convert rows into string - Performance - CLR4

Already FOR XML PATH… It got lost and the results were not in the correct order.
SQL Server - Grouped Concatenation convert rows into string - Performance - STUFF FOR XML PATH4

Therefore, the best solution for this kind of situation is SQL CLR!

That's it folks!
Thanks for stopping by and see you in the next post.

sql server clr function stuff for xml path recursive cte convert columns rows grouped concat concatenation string how to convert concatenate columns to string

sql server clr function stuff for xml path recursive cte convert columns rows grouped concat concatenation string how to convert concatenate columns to string

Concatenate many rows into single text string

Concatenate many rows into single text string