Hello everybody!
All quiet?
In this article, I would like to share with you a problem I had recently, which several columns from various tables in a given database used a different collation from the DB standard, causing joins and WHERE conditions between VARCHAR / CHAR / columns. NVARCHAR with different collations, the bank returns us the following error message:
What is a COLLATION?
Collation is nothing more than the form of character encoding that a database uses to interpret them.
A Collation is a grouping of these characters in a certain order (each Collation has a different order), where the “A” is a different character than the “a”, if the collation is case-sensitive (case sensitive) and the “A” is different from “á”, if the collation is Accent-Sensitive (accent differentiation).
COLLATION has three levels of hierarchy:
- Server
- Data base
- Column
If the database is created without specifying which collation to use, it will be created with the server's collation (operating system language). When a table is created without specifying the collation of text columns (VARCHAR, NVARCHAR, CHAR, etc.), the database collate will be used as the table collation.
In SQL Server, the name of the Collation follows the following naming pattern:
SQL_CollationDesignator_CaseSensitivity_AccentSensitivity_KanatypeSensitive_WidthSensitivity
Collation Example:
SQL_Latin1_General_CP1_CS_AS
Where:
- CollationDesignator: Specifies the basic grouping rules used by Windows grouping, where sorting rules are based on the alphabet or language.
- CaseSensitivity: CI specifies case-insensitive, CS specifies case-insensitive.
- AccentSensitivity: AI specifies that it does not differentiate accents, AS specifies that it differentiates accent.
- KanatypeSensitive: Omitted specifies not distinguishing kana characters, KS specifies distinguishing kana characters.
- WidthSensitivity: Omitted specifies that it does not distinguish width, WS specifies that it distinguishes width.
If a column is using a COLLATION case sensitive (CS), a query like SELECT * FROM Table WHERE Column LIKE '% Oracle%' will return the “Oracle” record, but it will not return the “oracle” record.
The same thing happens with a column using a COLLATION accent sensitive (AS). A query like SELECT * FROM Table WHERE Column LIKE '% JOÃO%' will return the record "João", but will not return the record "Joao".
For a complete list of collations by region and language, visit this link, remembering that the most used in Portuguese (Brazil) is SQL_Latin1_General_CP1_CI_AI (or SQL_Latin1_General_CP1_CS_AS).
To learn more about what Collation is, take a look at the article. SQL Server - Cannot resolve the collation conflict between… in the same as operation..
How to identify the default collation?
To identify the default collation of a database, simply use the query below:
1 2 3 4 5 6 |
USE [dirceuresende] GO SELECT collation_name FROM sys.databases WHERE database_id = DB_ID() |
To identify your instance's default collation, just use the command below:
1 |
SELECT SERVERPROPERTY('Collation') |
Test base generation
To standardize the examples and help with the practical understanding of this article, I will use a simple script to generate sample data (tests):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
------------------------------------------------------------------------ -- Geração dos dados para teste ------------------------------------------------------------------------ CREATE TABLE dbo.Cliente ( CPF VARCHAR(11) COLLATE Latin1_General_CS_AS NOT NULL PRIMARY KEY CLUSTERED, Nome VARCHAR(50) ) INSERT INTO dbo.Cliente ( CPF, Nome ) VALUES ( '12345678909', 'Joãozinho' ) CREATE NONCLUSTERED INDEX SK01 ON dbo.Cliente(CPF) INCLUDE(Nome) CREATE TABLE dbo.Pedido ( Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED, Dt_Pedido DATETIME DEFAULT GETDATE(), Vl_Pedido MONEY, CPF_Cliente VARCHAR(11) COLLATE Latin1_General_CS_AS CONSTRAINT [FK_Cliente] FOREIGN KEY(CPF_Cliente) REFERENCES dbo.Cliente(CPF) NOT NULL ) INSERT INTO dbo.Pedido ( Vl_Pedido, CPF_Cliente ) VALUES ( 9.99, -- Vl_Pedido - money '12345678909' -- Id_Cliente - int ) CREATE NONCLUSTERED INDEX SK02 ON dbo.Pedido(Id, Dt_Pedido, CPF_Cliente) |
How to standardize the collation of all columns?
If you want to standardize the collation of all columns in a database so that they use a specific collation or database standard, I developed a script that will identify all non-standard columns and execute an ALTER TABLE command. to apply the change.
Since there are dependencies (constraints, indexes, and foreign keys), the script will drop the identified dependent objects, apply the changes, and recreate those objects. Because of this, I strongly recommend that you test well before running this script in production. Preferably make a structure backup first.
Script to change the collation of all columns (and dependencies) that have a different collation than database
View source
Script to change the collation of all columns (and dependencies) to the desired collation:
View source
If the script identifies many columns to change and PRINT output is getting cut off because of the size of the generated strings, you can use the fncSplitTexto function, which I made available in the post. How to wrap a string in a substring table using a delimiter in SQL Server and replace the end of the script with this snippet here:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
------------------------------------------------------------------------- -- Executa os scripts ------------------------------------------------------------------------- PRINT '---------- Dropando as contraints FK' IF (NULLIF(LTRIM(RTRIM(@CmdDropFK)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdDropFK, ';') IF (@Fl_Debug = 0) EXEC(@CmdDropFK) PRINT '---------- Desativando as Check Contraints' IF (NULLIF(LTRIM(RTRIM(@CmdDisableCK)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdDisableCK, ';') IF (@Fl_Debug = 0) EXEC(@CmdDisableCK) PRINT '---------- Dropando os índices' IF (NULLIF(LTRIM(RTRIM(@CmdDropIndex)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdDropIndex, ';') IF (@Fl_Debug = 0) EXEC(@CmdDropIndex) PRINT '---------- Dropando os índices em constraints' IF (NULLIF(LTRIM(RTRIM(@CmdDropIndexConstraint)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdDropIndexConstraint, ';') IF (@Fl_Debug = 0) EXEC(@CmdDropIndexConstraint) PRINT '---------- Alterando a tabela' IF (NULLIF(LTRIM(RTRIM(@CmdAlterTable)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdAlterTable, ';') IF (@Fl_Debug = 0) EXEC(@CmdAlterTable) PRINT '---------- Reativando as constraints' IF (NULLIF(LTRIM(RTRIM(@CmdEnableCK)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdEnableCK, ';') IF (@Fl_Debug = 0) EXEC(@CmdEnableCK) PRINT '---------- Recriando os índices' IF (NULLIF(LTRIM(RTRIM(@CmdCreateIndex)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdCreateIndex, ';') IF (@Fl_Debug = 0) EXEC(@CmdCreateIndex) PRINT '---------- Recriando os índices em constraints' IF (NULLIF(LTRIM(RTRIM(@CmdCreateIndexConstraint)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdCreateIndexConstraint, ';') IF (@Fl_Debug = 0) EXEC(@CmdCreateIndexConstraint) PRINT '---------- Recriando as FKs' IF (NULLIF(LTRIM(RTRIM(@CmdCreateFK)), '') IS NOT NULL) SELECT * FROM dirceuresende.dbo.fncSplitTexto(@CmdCreateFK, ';') IF (@Fl_Debug = 0) EXEC(@CmdCreateFK) |
I hope you enjoyed this post and see you next time!
PS: Remember to back up and test ENOUGH if you plan to apply this script in Production.
If you have questions or problems, leave it here in the comments.
Error executing the penultimate script:
Msg 7738, Level 16, State 2, Line 1
Cannot enable compression for object 'xyz'. Only SQL Server Enterprise Edition supports compression.
Hello friend, congratulations on your class .. fantastic. Now I ask: and when you cannot change the collate but need to use, for example, queries in two different banks (tables with JOIN), the collates are different .. you can just set the collate, make the query, and leave without changing ???
Dirceu,
If in the section “Run the scripts” I comment on the EXEC lines -> “IF (@Fl_Debug = 0) EXEC (@CmdDropFK)” can I see all the ALTER TABLE, DROP etc commands without being executed? This is because I have a database with many nonstandard columns in the base but I would like to arrange it by groups of related tables, and not all at once. So copy and paste the generated commands only the ones I need.
Good afternoon
Thank you very much for the script, it helped a lot.
Here are some notes to help those in need:
If your database has many foreign keys and you try to run this script for only a few tables, the changes may not be made, since there may be a foreign key related to the executed table, but belonging to a table from which it was not related to script processing.
- Always run the script for the entire database.
If you are running this script on a sqlserver express, the parameters below must be removed from the commands for creating the indexes / constraints:
If it is not removed, the indexes will not be recreated because these parameters are only available for the enterprise version of sqlserver.
- DATA_COMPRESSION = PAGE
- SORT_IN_TEMPDB = OFF
If your database is very large, the varchar (max) variables will not support concatenating so much information and the string will be cut when it reaches 8000 characters… .To solve this question, I created a table with a field called command and another called order, and I changed the script to instead of concatenating the variables, insert command by command in that table according to their order of execution and then after entering all the commands, I created another script to scan command by command of that table and execute it.
Hi Dirceu.
I thank you for your time dedicated to disseminating knowledge.
I performed this procedure but unfortunately it didn't work.
The script even changed the SQL SERVER collation but DATABASE I want to change the tables did not work.
SQL version
Microsoft SQL Server 2012 (SP4) (KB4018073)
Do you have any tips?
Hi Dirceu.
Thanks so much for this post. But I have a problem. When I run your script, it runs right, but does not change the collation of the columns.
In the result, it points out how the script executed successfully.
Have an idea what it could be?
Thanks so much for the help.
Hello Almedir. All right?
Still having this problem? Anything warns me.
Hello Dirceu, good afternoon!
The post is really cool, but I have a doubt, in my case there are more than 43 thousand columns that need to be changed, so when running the script the result of the script exceeds the maximum allowed by the “message” field of the management studio and I end up not seeing all the commands.
Have a solution?
Thanks in advance!
Hey Kelvin, okay?
I edited the post to answer your question ..
If you have questions just say it!
Thank you so much Dirceu!