Click on the banner to learn about and purchase my database training on Azure

SQL Server – How to filter and separate the number and text of a string

Views: 209
Reading Time: 7 minutes

Hey guys!
In this article, I'm going to share with you a series of options to filter and separate the number and text of a string using T-SQL functions and commands in SQL Server, which is a very common need in the daily lives of those who develop SQL queries several or want to validate the data in some table.

Test base creation

For the examples I'm going to demonstrate in this article, I'll use the following script below:

Result:

How to filter the numeric part and text part of a string

A very common need too is to filter the numeric part and the text part of a string. Let's learn how to do this?

Example 1 – How to return rows with only numbers (NOT LIKE)

One of the frequent ways I see people trying to achieve this goal is to use the NOT LIKE operator to ignore lines that have certain characters.

Result:

Well.. The result was achieved, but the query turned out to be too large, difficult to maintain, poor performance and may need changes depending on the collation, especially if you are using a case-sensitive collation.

Example 2 – How to return rows with only numbers (ISNUMERIC)

In order to try to deliver a simpler and more effective solution, let's try to use another way to achieve this result through the function INUMERIC(), which will return 0 if the given parameter has any non-numeric character.

Observation: ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols like the dollar sign ($).
Script used:

Result:

Once again, the desired result was achieved and the query was very simple and objective. But what if I want to return rows that contain only numbers anyway, ignoring even decimals, (+), (-) signs, currency symbols, etc?

Example 3 – How to return rows with only numbers (NOT LIKE and Regular Expression)

Another way to be able to return rows with only numbers is using NOT LIKE and Regular Expressions in SQL Server.

If you didn't know it was possible to use regular expression (Regex or Regexp) in SQL Server or want to dig deeper, check out my article SQL Server - How to use regular expressions (RegExp) in your database.

Example:

Result:

In this case, we return lines that don't have letters, but are still returning symbols...

Let's change our query a little bit and try again, but this time, I want to return the lines where I DON'T HAVE any characters that are NOT in the range 0 to 9 (only numbers):

Result:

Show!! Now I finally have the result I wanted! Only lines that only have numbers.

Example 4 – How to return lines with only letters

In a scenario now opposite to the previous one, I want to do the opposite. I want to return lines with just letters. So the query is as simple as the one in our last example, right?

Example:

Result:

Hey.. Nothing returned! But there was a string "Dirceu Resende" that should return... Ah, but there's the space... Let's try again:

Result:

Ready 🙂

Example 5 – How to return lines that contain special characters

If you want to identify which lines have special characters, it's also pretty easy.

Example:

Result:

It worked! This is very useful to identify possible errors in data integrations and those invisible and hidden characters in the middle of the string (Read more about this in my article SQL Server - How to identify and replace “strange things” (hidden characters) in strings and tables)

How to separate the numeric part and the text part of a string

Unlike the examples above, here what I really want is to return the numeric part of the lines, where it is equal to “12345678909”, even if there are other characters in the middle of the string.

Example:

Result:

I can also do the same thing with the textual part, where I'll return the lines whose textual phrase is “Dirceu Resende”:

Result:

Analyzing all the lines of this table, now also using the 2 functions:

Result:

Another example to help fix the idea:

Liked these functions, right? Below is the source code for both:

Source code of fncRecupera_Letras

Source code of fncRecupera_Numbers

Important: In general, user functions (UDF) can cause big performance problems, especially if used in the WHERE/JOIN clause and in very large tables. To learn more about this, read my post. SQL Server - Using Calculated Columns (or Computed Columns) for Performance Tuning.

Another tip: In general, user functions (UDF) are slow to run on large tables. Another alternative to greatly improve the performance of functions is to use SQLCLR functions, as I demonstrate the gains in the article SQL Server - Performance Comparison between Scalar Function and CLR Scalar Function.

How to improve query performance using these functions

Another nice tip is that you can even create a calculated, indexed column to make searches much faster:

Now I'm going to try to include the columns calculated using the functions:

And the result is an error message:

Msg 4936, Level 16, State 1, Line 39
Computed column 'Numeric_Part' in table 'ExampleData' cannot be persisted because the column is non-deterministic.

As I explained about this error in the article SQL Server - Using Calculated Columns (or Computed Columns) for Performance Tuning, to use function as calculated column, the function must be deterministic. In this case, to make these 2 functions deterministic, let's add the WITH SCHEMABINDING clause in the CREATE FUNCTION command:

Now I will try to include again the columns calculated using the functions, which have now been created as deterministic (WITH SCHEMABINDING):

And the command was executed successfully. Let's look at our table again:

And now, we can index these columns normally:

And now when I want to query this calculated column, it will have excellent performance (of course, depending on how the query is done and how the indexes were created):

And that's it folks!
I hope you enjoyed this post and see you next time!