- SQL Server - Introduction to Performance Tuning Study
- Understanding Index Functioning in SQL Server
- SQL Server - How to Identify a Slow or “Heavy” Query in Your Database
- SQL Server - Performance Tuning Tips: Implicit Conversion? NEVER!
- SQL Server - Performance Comparison between Scalar Function and CLR Scalar Function
- SQL Server - Performance Tuning Tips: What is the Difference Between Seek Predicate and Predicate?
- SQL Server - Using Calculated Columns (or Computed Columns) for Performance Tuning
- SQL Server - How to identify and collect time consuming query information using Extended Events (XE)
- SQL Server - How to identify all missing indexes (Missing indexes) of a database
- SQL Server and Azure SQL Database: How to Identify Key Lookup occurrences through the plancache
Speak guys!
In this post, I would like to share with you a script to identify all Missing indexes of a database in SQL Server, Managed Instance or Azure SQL Database. Remembering that for the execution of this script, you will need the “View server state” permission on the instance.
I had already shared this script in the articles Understanding Index Functioning in SQL Server e SQL Server - Useful DBA Queries You Always Have to Look for on the Internet, but Google didn't correctly index who searched for this script.
A reflection on indices
One of the daily tasks of a DBA is to identify indexes in the database that could improve the performance of queries that are impacting the environment. This should always be the objective of an index: Improve the performance of queries that are impacting the environment or could have a gain in a critical process.
In most environments, you will not be able to create indexes to cover all queries that come to the database. Indexes have a storage cost (they take up space, often the indexes take up more space than the table itself) and they slow down the performance of write operations, in addition to requiring maintenance processes like reorganize/rebuild to keep fragmentation low .
For these reasons mentioned above, you cannot create indexes without any control. Indices must be very well thought out to try to have the best cost benefit possible. If an index is not being used, it must be deleted.
Another tip I always talk about is: Many times, you will need to create several non-clustered indexes, because your clustered index was poorly planned. Always assess whether the clustered index is optimized for the way queries are made against the database. It's no use creating the clustered index on the autoincrement column if it's not used as a filter in the main queries.
Don't just rely on numbers and big runtime reductions. If an index is serving to reduce the time of a query from 1h to 1 second, but it's a query executed only 1x a day, at dawn, and it's not bringing any big gains, it's not helping. It would have been better to try to optimize a query that used to take 5 seconds and now it takes 0ms but runs 1 million times a day.
Always try to think hard about the value the index will add to business processes before creating it. As I said, you can't create an index for every query that arrives in the database. Focus on what really matters and it's actually being a problem, causing the system to slow down, making a customer wait while the data is loaded, things like that.
What are Missing Indexes?
In order to help DBAs quickly identify situations where an index could make a difference, SQL Server has a set of DMVs dm_db_missing_index_% that bring information about these missing indexes (Missing indexes).
This information is automatically generated by the SQL Server Engine based on the cached execution plans, that is, based on the queries that have already been sent to the database, SQL Server identifies some indexes that could cover these queries to provide better performance. , considering existing indexes, filtered columns and returned columns.
After this analysis, the result is available for consultation using the script that I'm going to share with you, and the DBA can quickly analyze the suggestions and identify what makes sense to create and what doesn't.
Don't go out creating indexes without analyzing them first, even if they are Missing index returned by this script. The best possible index will not always be returned, and often there is already an index very similar to the one suggested, where it is more worthwhile to merge the existing index with the suggested index, changing the current index and only including the new suggested columns. This is much better than keeping 2 indexes almost the same.
And believe me: It is very common to see several and several indexes created on clients, where the DBA didn't want to have the work or to rename the indexes, let alone analyze if there are better ways to create the suggested index.
How to identify all missing indexes (Missing indexes) of a database
With the query below, you will be able to view SQL Server index suggestions based on Missing Index statistics. To generate this data, the set of Missing Index DMV's (dm_db_missing_index_%) will be used.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
SELECT db.[name] AS [DatabaseName], id.[object_id] AS [ObjectID], OBJECT_NAME(id.[object_id], db.[database_id]) AS [ObjectName], gs.avg_total_user_cost * ( gs.avg_user_impact / 100.0 ) * ( gs.user_seeks + gs.user_scans ) AS ImprovementMeasure, gs.[user_seeks] * gs.[avg_total_user_cost] * ( gs.[avg_user_impact] * 0.01 ) AS [IndexAdvantage], 'CREATE INDEX [IX_' + OBJECT_NAME(id.[object_id], db.[database_id]) + '_' + REPLACE(REPLACE(REPLACE(ISNULL(id.[equality_columns], ''), ', ', '_'), '[', ''), ']', '') + CASE WHEN id.[equality_columns] IS NOT NULL AND id.[inequality_columns] IS NOT NULL THEN '_' ELSE '' END + REPLACE(REPLACE(REPLACE(ISNULL(id.[inequality_columns], ''), ', ', '_'), '[', ''), ']', '') + '_' + LEFT(CAST(NEWID() AS [NVARCHAR](64)), 5) + ']' + ' ON ' + id.[statement] + ' (' + ISNULL(id.[equality_columns], '') + CASE WHEN id.[equality_columns] IS NOT NULL AND id.[inequality_columns] IS NOT NULL THEN ',' ELSE '' END + ISNULL(id.[inequality_columns], '') + ')' + ISNULL(' INCLUDE (' + id.[included_columns] + ')', '') AS [ProposedIndex], id.[statement] AS [FullyQualifiedObjectName], id.[equality_columns] AS [EqualityColumns], id.[inequality_columns] AS [InEqualityColumns], id.[included_columns] AS [IncludedColumns], gs.[unique_compiles] AS [UniqueCompiles], gs.[user_seeks] AS [UserSeeks], gs.[user_scans] AS [UserScans], gs.[last_user_seek] AS [LastUserSeekTime], gs.[last_user_scan] AS [LastUserScanTime], gs.[avg_total_user_cost] AS [AvgTotalUserCost], gs.[avg_user_impact] AS [AvgUserImpact], gs.[system_seeks] AS [SystemSeeks], gs.[system_scans] AS [SystemScans], gs.[last_system_seek] AS [LastSystemSeekTime], gs.[last_system_scan] AS [LastSystemScanTime], gs.[avg_total_system_cost] AS [AvgTotalSystemCost], gs.[avg_system_impact] AS [AvgSystemImpact], CAST(CURRENT_TIMESTAMP AS [SMALLDATETIME]) AS [CollectionDate] FROM [sys].[dm_db_missing_index_group_stats] gs WITH ( NOLOCK ) JOIN [sys].[dm_db_missing_index_groups] ig WITH ( NOLOCK ) ON gs.[group_handle] = ig.[index_group_handle] JOIN [sys].[dm_db_missing_index_details] id WITH ( NOLOCK ) ON ig.[index_handle] = id.[index_handle] JOIN [sys].[databases] db WITH ( NOLOCK ) ON db.[database_id] = id.[database_id] WHERE db.[database_id] = DB_ID() --AND gs.avg_total_user_cost * ( gs.avg_user_impact / 100.0 ) * ( gs.user_seeks + gs.user_scans ) > 10 ORDER BY [IndexAdvantage] DESC OPTION ( RECOMPILE ); |
As you saw, in the script above, we used SQL Server DMV's to identify the missing indexes in the base. Another alternative, which even allows us to see the execution plan, is to go directly to the DMV's of the plancache, and extract this data from the XML of the execution plan.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 |
IF (OBJECT_ID('tempdb..#MissingIndexInfo') IS NOT NULL) DROP TABLE #MissingIndexInfo ;WITH XMLNAMESPACES ( DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan' ) SELECT query_plan, n.value('(@StatementText)[1]', 'VARCHAR(4000)') AS sql_text, n.value('(//MissingIndexGroup/@Impact)[1]', 'FLOAT') AS impact, DB_ID(REPLACE(REPLACE(n.value('(//MissingIndex/@Database)[1]', 'VARCHAR(128)'), '[', ''), ']', '')) AS database_id, OBJECT_ID(n.value('(//MissingIndex/@Database)[1]', 'VARCHAR(128)') + '.' + n.value('(//MissingIndex/@Schema)[1]', 'VARCHAR(128)') + '.' + n.value('(//MissingIndex/@Table)[1]', 'VARCHAR(128)')) AS OBJECT_ID, n.value('(//MissingIndex/@Database)[1]', 'VARCHAR(128)') + '.' + n.value('(//MissingIndex/@Schema)[1]', 'VARCHAR(128)') + '.' + n.value('(//MissingIndex/@Table)[1]', 'VARCHAR(128)') AS statement, ( SELECT DISTINCT c.value('(@Name)[1]', 'VARCHAR(128)') + ', ' FROM n.nodes('//ColumnGroup') AS t(cg) CROSS APPLY cg.nodes('Column') AS r(c) WHERE cg.value('(@Usage)[1]', 'VARCHAR(128)') = 'EQUALITY' FOR XML PATH('') ) AS equality_columns, ( SELECT DISTINCT c.value('(@Name)[1]', 'VARCHAR(128)') + ', ' FROM n.nodes('//ColumnGroup') AS t(cg) CROSS APPLY cg.nodes('Column') AS r(c) WHERE cg.value('(@Usage)[1]', 'VARCHAR(128)') = 'INEQUALITY' FOR XML PATH('') ) AS inequality_columns, ( SELECT DISTINCT c.value('(@Name)[1]', 'VARCHAR(128)') + ', ' FROM n.nodes('//ColumnGroup') AS t(cg) CROSS APPLY cg.nodes('Column') AS r(c) WHERE cg.value('(@Usage)[1]', 'VARCHAR(128)') = 'INCLUDE' FOR XML PATH('') ) AS include_columns INTO #MissingIndexInfo FROM ( SELECT query_plan FROM ( SELECT DISTINCT plan_handle FROM sys.dm_exec_query_stats WITH ( NOLOCK ) ) AS qs OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) tp WHERE tp.query_plan.exist('//MissingIndex') = 1 ) AS tab(query_plan) CROSS APPLY query_plan.nodes('//StmtSimple') AS q(n) WHERE n.exist('QueryPlan/MissingIndexes') = 1; -- Trim trailing comma from lists UPDATE #MissingIndexInfo SET equality_columns = LEFT(equality_columns, LEN(equality_columns) - 1), inequality_columns = LEFT(inequality_columns, LEN(inequality_columns) - 1), include_columns = LEFT(include_columns, LEN(include_columns) - 1); SELECT * FROM #MissingIndexInfo ORDER BY [impact] DESC; |
Now just click on the XML (first column – query_plan) to view the execution plan (and see the Missing index warning)
Right-click and select “Missing Index Details…”
And check the automatically generated script for creating the index:
To better understand performance tuning and understand what is Seek operation, Scan, etc. read the article SQL Server - Introduction to Performance Tuning Study.
Conclusion
Closing this article, I hope you enjoyed this tip. She will definitely be very useful in your daily life as a DBA, especially for beginners.
Remember my tips and don't go around creating every suggested index and always analyze before creating it to see if this index could be better optimized or merged with an existing one.
Also analyze whether this index will really add value to the business or will it just take up disk space and improve a query that will not make a difference, such as a query executed 1x a day, at dawn.
And that's it folks!
A big hug and until the next post!