Click on the banner to learn about and purchase my database training on Azure

SQL Server - The danger of using JOIN between columns of different data types

Views: 689 views
Reading Time: 3 minutes

Hello everybody!
Good night!

Today I came across a critical performance issue in a production environment, where a particular query (which can be executed multiple times per second) was experiencing an already known and old slow-running problem (taking between 21 and 30 seconds per run), but today was the day to settle this once and for all.

For many DBA's (including me) and database modeling enthusiasts, this type of problem should never occur, but unfortunately in practice it is not what happens. Analyzing the query, which by the way was quite large, with several CASES, LEFT JOIN'S, OR's, use of function, etc. I could observe that there were many cases of implicit conversion and when analyzing the real execution plan, it was very clear the reason for the slowness :

SQL Server - JOIN Predicate Columns Different DataTypes 2

Analyzing the readings on disk, we can see that the readings in a specific table are very high, especially considering that the query uses in the WHERE clause an equality term in the main table of the query, informing a code that is primary key and is in the clustered index, that is, it should be an extremely fast and optimized query processing few records.

SQL Server - JOIN Predicate Columns Different DataTypes 3

Fix that there is a warning in select. And by hovering over it, we can see this information:
SQL Server - Differential Datatypes Performance Problems 2

Well, we have the implicit conversion problem going on. This occurs when two columns of different data types are compared and then the bank needs to convert this manually during the query.

Looking at the query JOINS, I could see that the columns involved in the comparison were of different types. This means that all records involved in JOIN must be converted to the same type, and then check whether they should be restricted by clauses in JOIN or not. If the volume of records is too high, this can greatly increase processing time and disk reads (as shown in the example).

A simple ALTER TABLE command on the column that was JOINING the table that had too many readings to match the data types solved the problem:

SQL Server - JOIN Predicate Columns Different DataTypes 1

SQL Server - JOIN Predicate Columns Different DataTypes 4

Another solution would be to create an intermediate table containing a part of the query, and converting the column already to the correct type. With this intermediate table would be JOIN with the rest of the query, and now, with the same data type, there was no longer the implicit conversion problem and the query would be running more optimally.

That's it folks!
Regards and see you next post!