Click on the banner to learn about and purchase my database training on Azure

SQL Server - How to Convert an HTML String to Text (Remove HTML Tags) Using CLR (C #)

Views: 7.381 views
Reading Time: 2 minutes

Hey guys!
All right with you? I hope so, huh!

In this post, I will demonstrate to you how to convert an HTML string to text (Remove HTML tags) using CLR (C #). If you are new to the blog or have never heard of CLR or don't know how to create your first project using this powerful SQL Server tool, which lets you create C # or VB.NET code and run it through the database, give it a try. list in post Introduction to SQL Common Language Runtime (CLR) in SQL Server.

In 2014, I made the post Removing HTML Tags from a String in SQL Server and therefore, you must be asking yourself: “Dirceu, if you have already made a post about it, why do another one using the CLR?” and the answer to that is very simple: Much simpler code and PERFORMANCE! As I explained in the post SQL Server - Performance Comparison between Scalar Function and CLR Scalar Function, CLR functions generally deliver much better performance than UDF T-SQL functions, up to 200 times faster.

Code used for record creation:

To demonstrate the difference in performance of the two functions, I populated a table with only 6.000 records, all with the same string containing basic HTML code. And the larger the data volume, the greater the performance difference between the CLR and UDF T-SQL functions. The result you can check below:

Comparison of CLR function with faster T-SQL function: 173x

Function Source Code:

To remove HTML tags, I use the HtmlDecode method of the WebUtility class, which belongs to the System.Net library. This method is available only from the .NET Framework 4.0 and therefore can only be used on SQL Server 2012 or later (SQL Server versions 2005 and 2008 use the .NET Framework 3.5)

The parameter Fl_Quebra_Linha serves to replace the tag (and its variants) for a line break in the text. If you enter a value of 0 (false) in this parameter, line breaks will be replaced by an empty string.

That's it folks!
I hope you enjoyed the post and see you next time.

sql server converter convert string text html remove remove html tags for text plain text

sql server converter convert string text html remove remove html tags for text plain text