Click on the banner to learn about and purchase my database training on Azure

SQL Server - Introduction to Performance Tuning Study

Views: 10.688 views
This post is the 1 part of 10 in the series. Performance tuning
Reading Time: 14 minutes

Hello people,

In this post I would like to start talking about a subject I really like, which is Performance Tuning, which I have already lectured on 2 Chapter SQL Server ES Meeting - 10 / 06 / 2017.

This theme is always among the most sought after by database professionals, developers and companies looking for experienced DBA consultancies. Unlike many areas of the database, Performance Tuning always requires an analysis of the problem and several tests before any action. Although there are good practices for this, there is no magic formula or “cake recipe” that will always solve performance problems, regardless of the scenario and environment.

I hope you enjoy this series 🙂

Why is Performance Tuning so important?

  • “In today's world, most people don't have the patience to wait for one site to load for a long time, and they eventually hit another one right away.”
  • “On average, users leave any site if it doesn't load on their mobile phone within three seconds” (Google)
  • One of the biggest reasons for the failure of Windows Vista was that it was an extremely heavy and slow system (
  • “HP's TouchPad tablet was one of the most anticipated launches of 2011 - and soon became the biggest failure of the year. The device, created to compete with Apple's iPad, lasted only seven weeks on the market before HP made the decision to end it, citing weak sales as a reason. The reason? Consumers immediately recognized that the cell phone was too slow ”(
  • “55% of Canadian students admit to being stressed by slow computers in Hourglass Syndrome.” (
  • “66% of Americans are stressed by slow computers and 23% describe themselves as very stressed because of that.” (

In today's scenario, users no longer have the patience to wait for a screen to load or the return of a button click. Users want quick system responses, and for this, the database must enable the systems to be able to query the data in the database without generating system waits.

For this, the DBA needs to be knowledgeable to identify possible slowness, which may be caused by several factors:

  • No table indexes or inefficient indexes for a given query
  • Index or Table Fragmentation
  • Locks
  • Wait events (Ex: PAGEIOLATCH_EX, OLEDB, etc)
  • Poorly written query
  • CPU overloaded
  • High disk read time
  • Information volume too large
  • Tempdb containment
  • Network issues

How is Performance Tuning work?

To perform a performance tuning analysis, I will list the activities that are usually required to achieve your goal:

Understand the problem

This activity of the performance tuning process consists of identifying in a macro way, the origin of the slowness in the environment. If you already know which routine you are going to try to optimize, this process is already complete.

In this activity, you should refer to your instance's routines and data collections to analyze, for example:

  • High time duration routines
  • Routines that have a high CPU time
  • Routines with high I / O volume (reads / writes)
  • Server Log Analysis
  • Analysis of instance wait events
  • Block and deadlock history analysis
  • History analysis of running routines at slow time
  • Block and deadlock history analysis
  • Index and Table Fragmentation Reports
  • Table reports without statistics update
  • Missing index reports

In order to understand and identify the problem, you must have all (or most) of the controls listed above. They will provide you with data and information about what happens in your instance when a performance issue occurs.

Make the diagnosis

Now that you have identified what is causing the slowness, now is the time to find out why it is occurring. In this step, we will take a micro look at the specific routine that is performing poorly in order to unravel it and find out which queries can be optimized.

In this activity, we have several ways to help with the analysis, such as:

  • Execution plan analysis to identify how the query is being done internally (and identify potential anomalies such as implicit conversion)
  • Analysis of indices used to ensure they are covering the heaviest queries (covering index)
  • Use of SET STATISTICS IO and TIME to measure the amount of readings / writes that are performed for each object queried, as well as the response time of each operation.
  • In many cases, the database structure is optimized, but the query is poorly written. Thus, the query may be changed to optimize the use of existing indexes.
  • Analyze index histograms to validate if it is really being effective and making sense

Apply optimization tips and techniques

Once you have identified the queries that are performing poorly, it is time to apply the performance tuning techniques, be it query tuning, index creation / alteration, statistics update, etc.

In this step, you will probably need to test more than one improvement to achieve the best result, depending on how your environment is doing. In many cases, you will need one more change for the query to be optimized.

Tests, Tests, Tests and then, more Tests!

This step, along with the Apply optimization tips and techniques step, are probably the ones that will most need your time. All performance tuning work requires changes to be tested and validated before being applied to the environment.

As I mentioned earlier, there is no “cake recipe” that will always be more performative in all situations (although there are good practices that GENERALLY perform better). I have seen several cases in which a change gave a great performance gain in one environment, and in the other it ended up having no effect. Remember: There are N factors that can influence your bank's performance and all must be considered.

From my experience, I recommend you: NEVER apply a performance improvement without testing ENOUGH before applying. This is especially recommended if you have made changes to the query to achieve better performance. In my day-to-day, I make changes to queries in both the structural and business rules, so the test turns out to be even more important.

Apply optimization

In the final step of the generally quieter Performance Tuning process, you will apply the suggested changes to your environment after an intense battery of testing has already been done.

If you have more than one environment in your company (Ex: DEV, HOM, PRD), it is very interesting to go up the change by environment in order to observe the system behavior after the changes.

An extremely important part of this step is the collection of the results obtained after applying the changes. This part of the Performance Tuning project is vital to gaining confidence in DBA's work and being able to show the advantages gained from day to day by investing time (and hence money) in that work within the company.

Performance Tuning - Indexes

Indexes in the database are one of the most important (perhaps most) structures for query performance. Indexes are disk structures associated with a table or view that expedite row retrieval by sorting data into binary trees.

To make an analogy, it would be like the index or summary of a book, so you can quickly search for what you are searching for. Although indexes are extremely useful and effective, they must be created with great rigor and discretion: Indexes take up disk space, that is, if you create many indexes, they can take up more space than the table itself.

In addition, indexes speed queries, but slow down write operations (INSERT, UPDATE, DELETE) because these operations need to update the table and data for all related indexes. If you have a table that has a lot of writes and few readings (eg log / history table), this table may not be a good candidate for indexing. WATCH OUT!

Some great candidates to be part of an index are frequently accessed computed (calculated) columns, columns that are part of a Foreign Key (FK), table identity column.

Another great candidate for creating an index can be identified after further analysis where you have identified a very heavy query that is often performed in your environment. In this case, you will create a specific index for this query, which will have all the columns used by this query, in a technique known as Covering Index.

If you want to dig deeper into the indexes in SQL Server, I recommend reading the post. Understanding Index Functioning in SQL Server.

Performance Tuning - The Execution Plan

Execution Plan is a graphical tool to help DBA / Developer understand how queries are being performed in the database and interpreted by the query optimizer. This tool is extremely important for anyone considering doing database query optimization work because only by understanding how the query is being done internally is it possible to identify points for improvement.

Example of an execution plan being analyzed

What can we extract from a plan?

  • Are you using index?
  • What index was used?
  • Parallel work?
  • How much data?
  • What is the most costly operation?
  • Which operator was used?

How do we read a plan?

Arrows indicate volume of records processed:

Percentage of cost of each operation

Detailed Information through ToolTips

Index Suggestion

What is the difference between estimated and current plan?

Estimated plan:

  • Useful for development where you cannot execute the query
  • Very useful in scenarios where the original query takes too long to process
  • Does not work with temporary objects
  • Does not identify any warnings (eg Residual I / O)
  • Based on statistics (if outdated, may distort reality)

Current plan:

  • The original query is executed in the database.
  • At the end of operations, the execution plan that was actually used by the query optimizer is shown.

How to view the execution plan of my query?

To be able to see the query execution plan you will execute, just select one of the 2 options marked in the print below:

Interface option to display execution plan

However, they have different behaviors: When selecting the option “Include Actual Execution Plan”, the queries made from that moment on will return the generated plan at the end of the execution.

The option “Display Estimated Execution Plan” must be used when selecting the queries in which you want to view the estimated plan.

Regardless of the form you choose, once you select the option, simply execute your query (Current Plan) or view the plan of selected queries (Estimated Plan) to view the execution plan graphically in your Management Studio (SSMS):

Another way to view the estimated execution plan is through SET commands:


To graphically view the execution plan, just click on the generated XML:

How to view the execution plan of a running query?

To view the execution plan of the running queries, simply execute the query below:

If you want to know all the information about running queries, including CPU usage, tempdb, reads, writes, the query execution plan itself, and more, take a look at the simplified version of sp_whoisactive I made available in the post. SQL Server - Query to return running queries (sp_WhoIsActive without consuming TempDB).

How to view cached execution plans?

Generally when you execute a query, SQL Server will generate an execution plan for it and leave that plan cached in the database, in case that same query is executed again, the query optimizer does not need to parse the query and generate a new plan. Therefore, you can view the plans that are cached in SQL Server.

Remember that every time the instance is restarted the cached plans are discarded. SQL Server also keeps only the most commonly used plans, as it has no way to store the plans for each query already performed on the instance.

To view cached plans, simply use the query below:


How to clear cached execution plans?

As I mentioned above, keeping execution plans cached is good practice to prevent the query optimizer from having to parse queries and generating new execution plans unnecessarily. However, in many situations it is important that you can clear a particular plan from the cache or even all the plans.

To clear all cache from your instance, simply execute the command below:

To clear all cache from a given database, run this command:

To clear the cache of a stored procedure or function, simply execute an alter procedure / alter function command that the plan is recreated.

An interesting option is hint WITH RECOMPILE, which when used to change an object causes a new execution plan to be created with each call of that object.

Another way to force recompilation of an object and generate a new execution plan is by using the sp_recompile internal Stored Procedure:

What are the main operators of the Execution Plan?

In this session, I will list the most common operators that you will see most often during your analysis.

Table Scan
- Operator that consists of reading ALL the data in the table to find the information that must be returned.
- Happens in table without cluster index
- In general, high cost operation

Clustered Index Scan
- Operator that consists of reading ALL the data of the CLUSTERED index to find the information that must be returned.
- In general, it is usually a little faster than Table Scan, as the data is already sorted in the index.
- Index Scan can be a reflection of an expensive Lookup

Clustered Index Seek and NonClustered Index Seek
- Algorithm that is usually extremely efficient to return specific records.
- As the data is already sorted in the index, you can use more efficient algorithms such as QuickSort and ShellSort.

Key Lookup and RID Lookup
When a query is performed on a table, SQL Server Query Optimizer will determine the best method of accessing data according to the statistics collected and choose the one with the lowest cost.

Because the clustered index is the table itself, generating a large amount of data, the lowest cost nonclustered index is often used for the query. This can be a problem because often the query is selecting columns where not all are indexed, so that a non-clustered index is used to search the indexed information (Index Seek NonClustered) and the clustered index is also used to return the remaining information, where the nonclustered index has a pointer to the exact position of the information in the clustered index (or the ROWID if the table has no clustered index).

SQL Server - 2 Key Lookup

This operation is called a Key lookup, for tables with clustered index or RID Lookup (RID = Row ID) for tables that do not have clustered index (called HEAP tables) and by generating 2 read operations for a single query, should be avoided whenever possible.

Key lookup
SQL Server - 2 Key Lookup Execution Plan

RID Lookup
SQL Server - RID Lookup Execution Plan 2

To avoid KeyLookup just use the Covering Index technique, which consists of adding to the NonClustered Index (INCLUDE) the main columns that are used in the table queries. This makes the query optimizer able to get all the information by reading only the chosen index, without having to read the clustered index as well.

However, close attention should be paid to index modeling. It is not recommended to add all table columns to the nonclustered index as it will become so large that it will no longer be effective and the query optimizer may even decide not to use it and prefer the Index Scan operator, which does so. sequential reading of the entire index, impairing query performance.

To avoid RID Lookup, simply create the clustered index on the table and pay attention to any Key Lookup events that may arise.

- Operator who is usually very heavy on queries, especially with large volumes of data
- Line to line processing
- Generally, it can be discarded and ordered in the application (fight DBA x DEV)

Stream Aggregate
- Queries with grouping (GROUP BY, DISTINCT, etc.)
- Operator who is usually very heavy in consultations

Compute Scalar
- Operator used in queries with expressions, mathematical calculations or conversions (CAST, CONVERT)

Nested loops
- Very efficient algorithm
- Ideal for scenarios with few registrations
- For each row of the outer table, sweep them all on the inner table.
- Low CPU and memory consumption
- Table variable (@Tabela) will ALWAYS use Nested Loops, regardless of the number of records, because as there are no statistics, the estimated number of rows is always 1 (unless you force with hint).

How it works internally:

Algorithm used:

Merge Join
- Efficient, but needs orderly data
- Ideal for scenarios with many records
- If the data is not sorted, Merge Join may request sorting through a Sort Merge Join
- If both tables do not have a single index, Merge Join Many to Many occurs - tables without PK, use tempdb, less efficient
- Merge Join and its impact on TEMPDB - Consumption much higher than the Nested Loop, as the matches are made in memory and the volume of data is usually higher as well.
- It generally consumes little CPU and memory. Often found in queries with covering indexes.

How it works internally:

Algorithm used:

Some Troubleshooting Tips

- Lookup: Solve with Covering index or INCLUDE
- Index Scan can be a reflection of an expensive Lookup! Watch!
- Beware of Conversions, especially conversions in the WHERE clause. They can be deadly for your consultation!
- Pay attention to outdated statistics: Estimated number <> current number

- Sniffing Parameter: Occurs when the same SP has several types of behaviors according to the informed parameters, making the plan used is not appropriate. Try creating different procedures for each parameter or adding WITH RECOMPILE or Hint OPTION (RECOMPILE) clause.

Well, that's it, guys!
This post is just an introduction to the “art” of Performance Tuning and I was already in debt for about 3 months, when I gave my talk at the 2nd meeting of SQL Server ES and I did not do this post to complement the presentation for those who were not.

I hope to be coming back soon with new posts on this subject, which I find so interesting.
A hug and see you next!