SQL Server - How to Convert an HTML String to Text (Remove HTML Tags) Using CLR (C#)

Hey guys!
How's everything going? Well, I hope!

In this post, I will show you how to convert an HTML string to text (removing the HTML tags) using CLR (C#). If you are new to the blog, have never heard of CLR, or don't know how to create your first project with this powerful SQL Server feature, which lets you write C# or VB.NET code and run it from the database, take a look at the post Introduction to SQL Common Language Runtime (CLR) in SQL Server.

In 2014, I wrote the post Removing HTML Tags from a String in SQL Server. So you might be asking, “Dirceu, if you have already written a post about this, why write another one using the CLR?” The answer is very simple: much simpler code and PERFORMANCE! As I explained in the post SQL Server - Performance Comparison between Scalar Function and CLR Scalar Function, CLR functions generally perform much better than T-SQL scalar UDFs, in some cases up to 200 times faster.

Code used for record creation:

To demonstrate the performance difference between the two functions, I populated a table with just 6,000 records, all holding the same string of basic HTML code. The larger the data volume, the greater the performance difference between the CLR function and the T-SQL UDF. You can check the result below:

CLR function compared with the T-SQL function: 173x faster

Function Source Code:

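To convert the HTML to text, I use the HtmlDecode method of the WebUtility class, which lives in the System.Net namespace. HtmlDecode turns HTML entities (for example &amp; or &lt;) back into plain characters; it does not strip tags by itself. The WebUtility class is available only from .NET Framework 4.0 onward, so the function can only be used on SQL Server 2012 or later (SQL Server 2005 and 2008 host CLR 2.0, which covers .NET Framework up to 3.5).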

The Fl_Quebra_Linha parameter controls whether the <br> tag (and its variants) is replaced with a line break in the text. If you pass 0 (false) in this parameter, the <br> tags are replaced with an empty string instead, that is, simply removed.
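
For readers who want a feel for how such a function can be put together, here is a minimal sketch of the approach described above. It is not the original source code: the class name HtmlText, the helper Strip, the function name fncRemoveHtml and the parameter name Ds_Html are all hypothetical, and stripping tags with a regular expression is an assumption on my part (it is a naive technique that can be fooled by a ">" inside an attribute value or by script/style contents). Only Fl_Quebra_Linha and WebUtility.HtmlDecode come from the post.

```csharp
using System.Data.SqlTypes;
using System.Net;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;

public static class HtmlText
{
    // Matches <br>, <br/>, <br /> and their upper-case variants.
    private static readonly Regex BrTag =
        new Regex(@"<br\s*/?>", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Matches any other tag, e.g. <p>, </div>, <a href="...">.
    // Assumption: a simple regex is good enough for basic HTML.
    private static readonly Regex AnyTag =
        new Regex(@"<[^>]+>", RegexOptions.Compiled);

    // Plain helper (hypothetical) so the logic can be tried outside SQL Server.
    public static string Strip(string html, bool keepLineBreaks)
    {
        // 1. Turn <br> variants into a newline, or drop them entirely.
        string text = BrTag.Replace(html, keepLineBreaks ? "\n" : string.Empty);

        // 2. Remove every remaining tag.
        text = AnyTag.Replace(text, string.Empty);

        // 3. Decode entities such as &amp; (WebUtility requires .NET 4.0+).
        return WebUtility.HtmlDecode(text);
    }

    // Hypothetical SQL CLR entry point; names are illustrative only.
    [SqlFunction(IsDeterministic = true, IsPrecise = true)]
    public static SqlString fncRemoveHtml(SqlString Ds_Html, SqlBoolean Fl_Quebra_Linha)
    {
        if (Ds_Html.IsNull) return SqlString.Null;

        // IsTrue is false for both 0 and NULL, so NULL behaves like 0 here.
        return new SqlString(Strip(Ds_Html.Value, Fl_Quebra_Linha.IsTrue));
    }
}
```

In this sketch, entities are decoded last on purpose: text such as &lt;b&gt; should end up as the literal characters <b> in the output, not be mistaken for a tag and removed. Once the assembly is deployed, a function like this would be called from T-SQL the same way as any scalar function, passing 1 or 0 in the second argument.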

That's it folks!
I hope you enjoyed the post and see you next time.
