Hello people,
Alright?
In this post I would like to show how to calculate the difference in years (age) between two dates in SQL Server, showing some solutions and a performance analysis between these options in order to identify the most performative way to perform this calculation.
As you know, to resolve this issue it is not enough to perform a simple DATEDIFF (YEAR, @Dt_Inicial, @Dt_Final), since the DATEDIFF (YEAR…) function will return the years difference without considering whether the dates have already “completed birthdays” .
In a simple explanation, the DATEDIFF (YEAR…) function returns the calculation YEAR (@Dt_Final) - YEAR (@Dt_Inicial).
I will demonstrate why this does not work for age calculation:
As you can see, it is not enough to use the DATEDIFF (YEAR ..) function to calculate age. In the first example, with only 1 day difference between dates, the function returned to me that this difference is 1 year.
In the second example, the function returns me 30 years apart, but the first date is 09 / 04 and the second is 28 / 05, that is, I haven't completed 1 year yet and the correct value would be 29 years.
How to calculate age using T-SQL
Now I will demonstrate two simple ways for age calculation using Transact-SQL
Age Calculation Using Transact-SQL and CASE
To perform this calculation, we calculate the year difference between the dates and decrease 1 if the month and day of the start date is less than the month and day of the end date.
1 2 3 4 5 |
CREATE FUNCTION [dbo].[fncCalcula_Idade] (@Dt_Nascimento DATETIME, @Dt_Hoje DATETIME) RETURNS INT AS BEGIN RETURN DATEDIFF(YEAR, @Dt_Nascimento, @Dt_Hoje) + CASE WHEN (MONTH(@Dt_Nascimento) > MONTH(@Dt_Hoje) OR (MONTH(@Dt_Nascimento) = MONTH(@Dt_Hoje) AND DAY(@Dt_Nascimento) > DAY(@Dt_Hoje))) THEN -1 ELSE 0 END END |
Age calculation using Transact-SQL and 365.25 division
To perform this calculation, we calculated the difference in days between the dates and divided this amount by 365,25. But why this 0,25? The answer is simple .. Leap years. This “technique” is authored by my friend and BI specialist, Lucas Galon.
1 2 3 4 5 |
CREATE FUNCTION [dbo].[fncCalcula_Idade2] (@Dt_Nascimento DATETIME, @Dt_Hoje DATETIME) RETURNS INT AS BEGIN RETURN DATEDIFF(DAY, @Dt_Nascimento, @Dt_Hoje) / 365.25 END |
How to calculate age using CLR
Now I am going to demonstrate two simple ways of calculating age using CLR, which aims to achieve a great performance gain by virtue of optimizations of the C # language and the Microsoft .NET Framework.
Age calculation using CLR and manual calculations
To perform this calculation, we calculate the year difference between the dates and decrease 1 if the month and day of the start date is less than the month and day of the end date.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
using System.Data.SqlTypes; public partial class UserDefinedFunctions { [Microsoft.SqlServer.Server.SqlFunction] public static SqlInt32 fncCalcula_Idade(SqlDateTime Dt_Nascimento, SqlDateTime Dt_Hoje) { if (Dt_Nascimento.IsNull || Dt_Hoje.IsNull) return SqlInt32.Null; var anos = Dt_Hoje.Value.Year - Dt_Nascimento.Value.Year; if (Dt_Hoje.Value.Month < Dt_Nascimento.Value.Month || (Dt_Hoje.Value.Month == Dt_Nascimento.Value.Month && Dt_Hoje.Value.Day < Dt_Nascimento.Value.Day)) anos--; return anos; } } |
Age calculation using CLR and 365.25 division
To perform this calculation, we calculate the difference of days between the dates and divide this value by 365.25, as mentioned above, but now using the CLR.
1 2 3 4 5 6 7 8 9 10 11 |
using System; using System.Data.SqlTypes; public partial class UserDefinedFunctions { [Microsoft.SqlServer.Server.SqlFunction] public static SqlInt32 fncCalcula_Idade2(SqlDateTime Dt_Nascimento, SqlDateTime Dt_Hoje) { return (Dt_Nascimento.IsNull || Dt_Hoje.IsNull) ? SqlInt32.Null : Convert.ToInt32(Math.Floor((Dt_Hoje.Value - Dt_Nascimento.Value).TotalDays / 365.25)); } } |
Results of the functions presented
Performance Tests
Now that I've demonstrated a few different solutions for age calculation in SQL Server, I'll compare the performance of all of them and see which one is the most performative.
As I had commented in the post SQL Server - Performance Comparison between Scalar Function and CLR Scalar Function, CLR functions generally will perform better than T-SQL functions at the same time as inline functions (without creating UDF) generally perform better than CLR functions. Let's see how each solution works out.
For these tests, I will use a base of 131.072 random dates to calculate age against the current date, and the script to generate this base I will make available below:
Test mass generation script
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
IF (OBJECT_ID('dbo.fncRand') IS NOT NULL) DROP FUNCTION dbo.fncRand GO CREATE FUNCTION dbo.fncRand(@Numero BIGINT) RETURNS BIGINT AS BEGIN RETURN (ABS(CHECKSUM(PWDENCRYPT(N''))) / 2147483647.0) * @Numero END GO IF (OBJECT_ID('dbo.Datas') IS NOT NULL) DROP TABLE dbo.Datas CREATE TABLE dbo.Datas ( Data_Inicio DATETIME ) DECLARE @Contador INT = 1, @Total INT = 17 INSERT INTO dbo.Datas ( Data_Inicio ) VALUES ( '1970-01-01' ) WHILE(@Contador <= @Total) BEGIN INSERT INTO dbo.Datas ( Data_Inicio ) SELECT DATEADD(SECOND, dbo.fncRand(1491696000), '1970-01-01') FROM dbo.Datas SET @Contador += 1 END |
Test script
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
-- Funções inline SELECT DATEDIFF(YEAR, Data_Inicio, GETDATE()) + CASE WHEN (MONTH(Data_Inicio) > MONTH(GETDATE()) OR (MONTH(Data_Inicio) = MONTH(GETDATE()) AND DAY(Data_Inicio) > DAY(GETDATE()))) THEN -1 ELSE 0 END FROM dbo.Datas SELECT CAST(DATEDIFF(DAY, Data_Inicio, GETDATE()) / 365.25 AS INT) FROM dbo.Datas -- Funções T-SQL UDF SELECT dbo.fncCalcula_Idade(Data_Inicio, GETDATE()) FROM dbo.Datas SELECT dbo.fncCalcula_Idade2(Data_Inicio, GETDATE()) FROM dbo.Datas -- Funções CLR SELECT CLR.dbo.fncCalcula_Idade(Data_Inicio, GETDATE()) FROM dbo.Datas SELECT CLR.dbo.fncCalcula_Idade2(Data_Inicio, GETDATE()) FROM dbo.Datas |
As expected, inline functions performed better, with CLR performing similarly and T-SQL functions taking much longer than other 2 solutions. In addition, the 365.25 split solution was better than using CASE in all 3 scenarios.
That's it folks!
I hope you have learned to calculate age correctly and with the best performance possible.
A hug and see you next.
It will greatly help in designing abs performance analysis.
Cristiano,
Thanks for the feedback!
If you are studying performance, read the post https://www.dirceuresende.com/blog/sql-server-comparacao-de-performance-entre-scalar-function-udf-e-clr-scalar-function/, which may help you too.
Hug!