SQL Server 2008 – Fourteen tips for optimising query performance for applications by Tony Wright

June 23, 2010

1. Aim for the ideal of a single SQL call per page request.

Firstly, I should explain what I mean by page request. When you open a page, the page may need to be populated with content from the database. That call to the database to populate that page is a page request. If that page is a data entry page and requires data to be written back to the database, then clicking on the Update button will also perform a page request. So it’s impractical to aim for one single database request per page, because a page is likely to make many page requests, especially a more sophisticated page.

Now, if you make multiple requests to the database, each request needs to be authenticated and executed, and then the results returned. I have seen applications where there are literally hundreds of database calls just to update a single grid. I have seen cases where every row of the grid forces a call to the database to retrieve the individual row, that row gets updated, then a call is made back to the database to commit that data. That’s a lot of network traffic to achieve what could ultimately be done in a single batch!

If you must touch the database, prefer to use a stored procedure call that takes all the parameters you need, including user information, and returns all the data required to populate the page. This has become less popular in recent times because applications are becoming more modular (web parts) and technologies such as LINQ to SQL don’t easily return multiple resultsets in a single call. Modern databases such as SQL Server 2008 do batch inserts, updates and deletes, but they can’t really do a good job of batching if you are selecting individual items, modifying the data then writing it back to the database row by row.
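As a rough sketch of the single-call idea, the procedure below takes the user and the item being viewed and returns everything the page needs as two result sets in one round trip. All of the names (dbo.GetOrderPage, dbo.Orders, dbo.OrderDetails and their columns) are hypothetical and only there for illustration.

```sql
-- Hypothetical procedure: every object and column name here is illustrative.
CREATE PROCEDURE dbo.GetOrderPage
    @UserId  int,
    @OrderId int
AS
BEGIN
    SET NOCOUNT ON;

    -- Result set 1: the order header, restricted to the calling user.
    SELECT o.OrderId, o.OrderDate, o.CustomerId
    FROM dbo.Orders AS o
    WHERE o.OrderId = @OrderId
      AND o.CreatedByUserId = @UserId;

    -- Result set 2: the detail rows that populate the grid.
    SELECT d.OrderDetailId, d.ProductId, d.Quantity, d.UnitPrice
    FROM dbo.OrderDetails AS d
    INNER JOIN dbo.Orders AS o ON o.OrderId = d.OrderId
    WHERE d.OrderId = @OrderId
      AND o.CreatedByUserId = @UserId;
END
```

From ADO.NET, a caller would typically read the first result set and then move to the second with SqlDataReader.NextResult(), so the page is populated from a single database call.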

So, does every one of my applications have a maximum of one database request per page request? Of course not! It’s an ideal. I often don’t have the time to go making every page perform optimally from the start, so my pages usually average around 2 or 3 requests per page request. Practically speaking, I prefer to use LINQ to SQL to make stored procedure calls and only return single resultsets. PLINQ, which will give us the ability to make multiple database requests in parallel, will improve this somewhat. Unfortunately, it is unlikely to fix the scenario where each row of a grid is retrieved, updated and written back individually. It might speed this scenario up overall, however it is still far less efficient than a single call!

There are also scenarios where page content won’t change, or it changes infrequently. These days, you could potentially load an entire table into cache and perform LINQ queries to retrieve the data from there. It is far harder to debug applications where data is stored in cache, but it can be done. By doing this, you can get the total number of queries per page down significantly below one request per page on average. The work is then performed within your web server, which is a cheaper resource. You also reduce the impact on the database, and you’ll potentially be able to scale your site up, as SQL Server itself will only need to service the more important transactions. It is far easier and more cost efficient to scale up by adding extra web servers than it is to scale up your database to handle a much higher volume of traffic.

2. Turn on SQL Profiler

This one’s easy, because most good developers already do it. If you turn on the profiler and execute your page, you should be able to see all the traffic caused by a single request. This is important, because you will be able to see a number of things. Firstly, you can look for candidates for queries that can be combined. You may also see situations that shouldn’t need to be done each time you request the page. You may also see situations where the exact same query is executed multiple times.

This case is a special one, because within an ASP.Net page, the event model can sometimes cause double execution of queries. This is usually done to recreate an old version of the page which can then be used to process post backs, and then a second query is executed as a result of the post back to change the state of the page. This can be pretty nasty depending on what you are doing. For example, if you were changing pages in the grid, you wouldn’t want it to query the old page just to process the request for the next page. Page.IsPostBack is intended to prevent this kind of scenario, but it doesn’t always work out like this, especially if you have controls that were dynamically added to the page. In this scenario, there would be twice as much work done to retrieve a single page, as the database would potentially need to sort the data twice, run the row-number routine against every candidate row, and so on. Of course, it has some smarts to reduce the amount of processing, but it still takes more time, and it still retrieves twice the data, which needs to be passed across the network, and rendered to the page possibly twice.

The profiler can also help you determine where to look in your code for problems. On numerous occasions I have identified code that I wasn’t expecting to be executed just by seeing an unexpected query via SQL Profiler.

And finally, the profiler can help you determine which query in a batch is taking the longest.

3. Preprocess expensive queries

One of my customers requires a hierarchy consisting of every company corporate structure tree in Australia. Some of those companies have holdings in around 40 to 50 thousand companies. If these were created on the fly, and a few of their customers requested these corporate structures at the same time, the impact on the server could be significant. What I did here was to stage the data in fixed temporary tables and then populate a table with the hierarchies already calculated and in the correct order. The time taken to retrieve the corporate structure, then, is the same as simply retrieving a user interface page of rows. This processing occurred overnight, as it is acceptable for the data to be out of date by one day.

4. Review query plans to ensure optimally performing sql statements.

People have different ways of analysing and improving queries that they need to execute. The one I like to analyse is the query plan. Now, the hardcore people look at the text output of the query plan, however I prefer the graphical view. From SQL Server Management Studio, select Query from the menu, then choose Include Actual Execution Plan. The next time you execute the query or stored procedure, it gives a graphical representation of the query execution in one of the tabs adjacent to the results and messages tabs. The rule of thumb is to look at the relative expense of each subsection of the query and see if you can improve the performance of the more expensive parts. You work from top to bottom and right to left and aim to replace the icons (which represent underlying query choices) with more efficient ones.
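For those who do prefer the text output, a minimal sketch looks like the following. SET SHOWPLAN_TEXT has to be the only statement in its batch, and while it is on, queries are compiled but not executed. The table and filter value are placeholders.

```sql
SET SHOWPLAN_TEXT ON;
GO

-- Not executed while SHOWPLAN_TEXT is on; only the estimated plan is returned.
-- Sales.SalesOrderHeader and the CustomerID value are placeholders.
SELECT soh.SalesOrderID, soh.OrderDate
FROM Sales.SalesOrderHeader AS soh
WHERE soh.CustomerID = 42;
GO

SET SHOWPLAN_TEXT OFF;
GO
```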

5. Identify table scans or index scans

Make sure your tables are using indexes. Table scans and index scans are candidates for adding or improving indexes you have. On very small tables, using a scan often doesn’t really matter. Also, when using an “exists” test, the query processing engine may decide that it’s more efficient just to perform a table scan of a table rather than use an index. This is perfectly acceptable. However, on larger tables with more rows, indexes are necessary and more efficient devices.

Each table has a maximum of one clustered index, and the rest are non-clustered. With the clustered index, the physical data within the table is actually organised in the order of the index, so it is the most efficient if it is used by the query optimiser. With the non-clustered indexes, a separate structure is kept within the database that contains the indexed columns in the order of that index. So if you have a lot of these indexes, a significant amount of extra processing may be required to keep them up to date. However, these indexes are usually smaller than the original table, so reading them is cheaper and can significantly improve the performance of a query. So if data entry speed and disk space are not big issues, having lots of indexes is fine and a good choice.

6. Identify key lookups

Within the query plan, identify Key Lookups (these used to be called Bookmark Lookups). If there are any Key Lookups, they can be removed from the query plan by adding columns as include columns to the non-clustered index that appears alongside the Key Lookup in the plan. Include columns are fantastic, because they ultimately mean that when the candidate rows are found in the index, there will be no requirement to go back to the original table and retrieve the included columns because they are already in the index. So in this case, the speed of the query approaches that of a clustered index. Don’t worry, the data is kept the same between the include column and the original table, so you lose nothing here. The only downside is that included columns are added to the index and so index size increases, and data entry may slow down slightly because there are more references to maintain.
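A minimal sketch of that fix, assuming a hypothetical dbo.Orders table that is clustered on OrderId and already has a non-clustered index called IX_Orders_CustomerId on CustomerId alone:

```sql
-- With an index on CustomerId only, this query needs a Key Lookup
-- to fetch OrderDate and TotalDue from the clustered index.
SELECT OrderId, OrderDate, TotalDue
FROM dbo.Orders
WHERE CustomerId = 42;

-- Rebuild the same index with the extra columns included.
-- DROP_EXISTING assumes an index of this name already exists.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (CustomerId)
INCLUDE (OrderDate, TotalDue)
WITH (DROP_EXISTING = ON);
```

After the rebuild, the same query can usually be answered from the index alone. OrderId does not need to be included because the clustered key is already carried in every non-clustered index.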

7. Check the order of columns in your where clause

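Ensure the order of columns in your “where” clause is the same as the order within your index, otherwise it may not choose your index. I’ve seen plenty of cases where scans are performed instead of seeks simply because the columns filtered in the where clause do not line up with the index. In particular, make sure the where clause filters on the leading column of the index; an index on (State, Suburb) is of little help to a query that filters only on Suburb.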
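To illustrate how index column order interacts with the where clause, here is a small sketch using a hypothetical dbo.Customers table. The exact plan depends on your data and statistics, so treat the comments as what you would typically expect rather than a guarantee.

```sql
-- Hypothetical table and index; names are illustrative only.
CREATE NONCLUSTERED INDEX IX_Customers_State_Suburb
ON dbo.Customers (State, Suburb);

-- Filters on the leading column (State) and then Suburb,
-- matching the index order, so an index seek is possible.
SELECT CustomerId
FROM dbo.Customers
WHERE State = N'VIC'
  AND Suburb = N'Carlton';

-- Filters only on the second column of the index. Without the
-- leading column, the optimiser will typically fall back to a scan.
SELECT CustomerId
FROM dbo.Customers
WHERE Suburb = N'Carlton';
```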

8. Ensure the where clause is ordered most restrictive to least restrictive.

This will make sure that the most efficient path is taken when matching data between indexes in your query. By restrictive, I mean that the data is more uniquely selectable. So a column with different data in every row is more restrictive than a column with much of the same data in every row. Also consider the size of the table, so that a table with less data in it may be selected first in a join over a table with more data in it. This can be a bit of a balancing act.

9. Remove CTEs

CTEs are kind of like temporary tables. But they are mostly disastrously inefficient. I had one query taking 20 seconds to execute, which was unacceptable. It had a significant number of CTEs so I changed them all to temp tables and executed the query. The query sped up and took around 1 second to execute, which was fast enough for the scenario and so I could move on. There is one valid purpose I can think of for using CTEs and that’s for producing hierarchies. But in general, they can be replaced by temporary tables anyway. That said, however…
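As a sketch of the kind of rewrite described here, the two forms below return the same rows. The tables (dbo.Orders, dbo.Customers) and columns are hypothetical, and whether the temp table version is actually faster depends on the query, so compare the two on your own data.

```sql
-- CTE version: the CTE is expanded into the outer query when it is compiled.
WITH CustomerTotals AS (
    SELECT CustomerId, SUM(TotalDue) AS TotalSpent
    FROM dbo.Orders
    GROUP BY CustomerId
)
SELECT c.CustomerName, t.TotalSpent
FROM dbo.Customers AS c
INNER JOIN CustomerTotals AS t ON t.CustomerId = c.CustomerId;

-- Temp table version: the aggregate is materialised once and can be indexed.
SELECT CustomerId, SUM(TotalDue) AS TotalSpent
INTO #CustomerTotals
FROM dbo.Orders
GROUP BY CustomerId;

CREATE CLUSTERED INDEX IX_CustomerTotals ON #CustomerTotals (CustomerId);

SELECT c.CustomerName, t.TotalSpent
FROM dbo.Customers AS c
INNER JOIN #CustomerTotals AS t ON t.CustomerId = c.CustomerId;

DROP TABLE #CustomerTotals;
```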

10. Remove Temp Tables

The creation of temp tables adds to the overhead required to run your overall query. In some scenarios, I have removed temporary tables and replaced them with fixed tables and had significant performance improvement.

If the temp table is created to enable the merging of data from similar data sources, then prefer a union instead. Unions, in general, are far cheaper than temp tables.
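A minimal sketch of the difference, assuming two hypothetical tables (dbo.OnlineOrders and dbo.StoreOrders) with the same column layout:

```sql
-- Temp table approach: two inserts into tempdb, then a read back out.
CREATE TABLE #AllOrders (CustomerId int, OrderDate datetime, TotalDue money);

INSERT INTO #AllOrders (CustomerId, OrderDate, TotalDue)
SELECT CustomerId, OrderDate, TotalDue FROM dbo.OnlineOrders;

INSERT INTO #AllOrders (CustomerId, OrderDate, TotalDue)
SELECT CustomerId, OrderDate, TotalDue FROM dbo.StoreOrders;

SELECT CustomerId, OrderDate, TotalDue FROM #AllOrders;

DROP TABLE #AllOrders;

-- Union approach: one statement, nothing written to tempdb up front.
SELECT CustomerId, OrderDate, TotalDue FROM dbo.OnlineOrders
UNION ALL
SELECT CustomerId, OrderDate, TotalDue FROM dbo.StoreOrders;
```

UNION ALL keeps duplicate rows, which matches what the temp table version returns. Plain UNION removes duplicates and has to do extra work to find them, so use it only when you need that behaviour.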

#Temp tables are created in tempdb. @Temp table variables are often assumed to live purely in memory, but they are backed by tempdb as well, and under memory pressure their pages are written to disk like any others. tempdb requires disk writes and reads, and so will be slower than accessing the original table data.

11. Remove Cursors

Cursors are among the most expensive constructs you can use. There are special cases where they should be used, but it’s better to train yourself to use standard set based statements than cursors. If you want to understand when it’s reasonable to use a cursor, take a look at Greg Low’s article. But in general, avoid them like you would the plague. http://msmvps.com/blogs/greglow/archive/2007/02/04/improving-asp-net-session-state-database-performance-by-reducing-blocking.aspx
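To make the contrast concrete, here is a sketch of the same price update written both ways. dbo.Products and its columns are hypothetical, and the two versions are alternatives: running both would apply the increase twice.

```sql
-- Alternative 1: cursor. One UPDATE statement is executed per row.
DECLARE @ProductId int;

DECLARE product_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT ProductId FROM dbo.Products WHERE Discontinued = 0;

OPEN product_cursor;
FETCH NEXT FROM product_cursor INTO @ProductId;

WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Products
    SET UnitPrice = UnitPrice * 1.10
    WHERE ProductId = @ProductId;

    FETCH NEXT FROM product_cursor INTO @ProductId;
END

CLOSE product_cursor;
DEALLOCATE product_cursor;

-- Alternative 2: set based. One statement the optimiser can plan as a whole.
UPDATE dbo.Products
SET UnitPrice = UnitPrice * 1.10
WHERE Discontinued = 0;
```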

12. Reduce the number of joins in your queries

If you significantly reduce the number of joins in your query, you will have a vast improvement in speed of the query. There are a couple of ways to do this. You could stage the data in a denormalised table, or in Enterprise edition you can create a view and put an index on that view. Again, it depends on how immediate your requirement is for having the latest data. It is often acceptable for reporting to build these tables overnight due to the fact that a single day often has no impact on the benefits associated with a particular report.

13. Remove all declared variables

If everything is compilable within your stored proc, then there will be no need for the query engine to perform extra work to determine how the query will look after the declared variables are taken into consideration. So it will perform most optimally. So how do you remove declared variables? Well, for starters, you can pass them as parameters of your stored procedure call. If you find you do need to have declared variables, you can create a second stored procedure that does the main work and pass the declared variables into that. So the passed variables become parameters in the second query.
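The two-procedure approach might look something like the sketch below. The procedure and table names are hypothetical. The outer procedure does the calculation that would otherwise need a declared variable inside the main query, and the inner procedure receives the result as an ordinary parameter.

```sql
-- Inner procedure: does the main work and only ever sees parameters.
CREATE PROCEDURE dbo.GetRecentOrders_Inner
    @FromDate datetime
AS
BEGIN
    SET NOCOUNT ON;

    SELECT OrderId, CustomerId, OrderDate
    FROM dbo.Orders
    WHERE OrderDate >= @FromDate;
END
GO

-- Outer procedure: works out the value, then passes it on as a parameter.
CREATE PROCEDURE dbo.GetRecentOrders
    @DaysBack int
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @FromDate datetime;
    SET @FromDate = DATEADD(day, -@DaysBack, GETDATE());

    EXEC dbo.GetRecentOrders_Inner @FromDate = @FromDate;
END
GO
```

The design choice here is that the inner procedure is compiled with the actual parameter value available, whereas the value of a local variable is not known when the statement that uses it is compiled.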

14. Give this fact sheet to your database developers


It’s a little bit out of date, but it’s the best one I have found so far. It shows the relative expensiveness of different data types, sargability of joins, includes various helpful information, and also describes many of the icons found in the query plans. Someone should tell that guy to update it.

14 Tips for optimising SQL Server 2008 Queries in ASP.Net

June 19, 2010

This article is now here.

Sql Server 2005 – Twelve Tips For Optimizing Query Performance by Tony Wright

May 26, 2008

1. Turn on the execution plan, and statistics

The first thing you need to do is to use the tools that help you determine whether a query done one way is better than another. That’s what we’re trying to do. Comparing the original query to a new query that we come up with is the best way to evaluate the benefits of any changes.

To do this, go into SQL Server Management Studio and select the Query menu. Select “Include Actual Execution Plan”. This turns on the graphical Execution Plan when you execute a query, and that can be found in the bottom pane after the execution.

In the Execution Plan, you can mouse over the components of the plan and it provides tooltip information boxes. The box contains Estimated Subtree Cost, which can be used to help determine whether one query is better than another. Of course, it’s not always right, as some query parts are not included in the execution plan, but it helps. It is also helpful to know the estimated number of rows, which is also found in this tooltip box.

Next, turn on statistics. Type the following statement:

SET STATISTICS IO ON

This causes statistics to be output to the Messages tab in the bottom pane. The information you want here is mainly logical reads and physical reads. Logical reads are page reads from memory. Physical reads are page reads from disk. This stat can be a little deceptive as it doesn’t include CPU in the metric, but in general, the fewer page reads, the less work done, and so the more performant the query will be.

To balance the above two measures, you should also compare the actual execution times. To do this, execute the following statement:

SET STATISTICS TIME ON

This also has issues, as blocking and contention issues affect the output time. You should execute the query a few times to determine how accurate the time shown is.

2. Use Clustered Indexes

Having the clustered index on the primary key is sometimes not the most efficient place for the clustered index to be. A clustered index is the most performant type of index. The whole table is sorted according to the clustered index. If the table is involved in lots of joins based on the primary key, it is probably the right place for it to be, but if you are continually filtering or grouping on other columns in a table, then you should possibly consider changing the primary key index to Non-Clustered, and putting the clustered index on those filtered or grouped columns.

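The following statements remove an existing clustered index on the primary key and replace it with a non-clustered index:

ALTER TABLE MySchema.SalesOrderHeader
   DROP CONSTRAINT PK_SalesOrderHeader -- constraint name assumed
ALTER TABLE MySchema.SalesOrderHeader
   ADD CONSTRAINT PK_SalesOrderHeader PRIMARY KEY NONCLUSTERED (SalesOrderID) -- key column assumed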

Then the following statement adds a new clustered index to a table.

CREATE CLUSTERED INDEX IX_SalesOrderHeader_OrderID -- index name assumed
ON MySchema.SalesOrderHeader (OrderID)

3. Use Indexed Views

Indexed Views have been around for a while. A view is like a named query, and these days you can add indexes to them. If used correctly, they can cause a massive improvement in execution times, often better than a clustered index with covering columns on the original table. Also, in SQL Server Developer Edition and Enterprise Edition, a view index will also be automatically used if it is the best index even if you don’t actually specify the view in your query!

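CREATE VIEW MySchema.SalesByCustomer
WITH SCHEMABINDING
AS
SELECT soh.SalesTerritoryID, soh.CustomerID,
   SUM(sod.Quantity * sod.UnitPrice) AS TotalSales, -- alias assumed; view columns must be named
   COUNT_BIG(*) AS DetailCount -- required before a view with GROUP BY can be indexed
FROM MySchema.SalesOrderHeader soh
INNER JOIN MySchema.SalesOrderDetail sod
   ON (soh.SalesOrderID = sod.SalesOrderID)
GROUP BY soh.SalesTerritoryID, soh.CustomerID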

Note the use of the schema binding attribute. This prevents you from changing underlying tables while this view exists, and is necessary if you want to add an index. Some people avoid indexed views for this reason, as the maintenance becomes more complicated as further dependencies to the view are created. The following statement adds an index:

CREATE UNIQUE CLUSTERED INDEX IX_SalesByCustomer -- index name assumed
ON MySchema.SalesByCustomer(
   SalesTerritoryID, CustomerID
)

4. Use Covering Indexes

Covering indexes built with included columns are a feature that was newly added in SQL Server 2005. Basically, you can create an index optimised for the query itself based on joins, filters and grouping, and then add additional columns that can be retrieved directly from the index for use in select statements, as follows:

CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_OrderId -- index name assumed
ON MySchema.SalesOrderDetail(OrderId)
INCLUDE (Quantity, UnitPrice)

The above statement causes a non-clustered index to be created on the SalesOrderDetail table. If queries are executed on the OrderId column, the index will be used, and if the only other columns being retrieved are Quantity and UnitPrice, then the query optimiser doesn’t need to retrieve any extra columns from the underlying table. It can just use the index. Because the query optimiser doesn’t need to query the original table, performance is improved.

5. Keep your clustered index small.

One thing you need to consider when determining where to put your clustered index is how big the key for that index will be. The problem here is that the key to the clustered index is also used as the key for every non-clustered index in the table. So if you have a large clustered index on a table with a decent number of rows, the size could blow out significantly. In the case where there is no clustered index on a table, this could be just as bad, because it will use the row pointer, which is 8 bytes per row.

6. Avoid cursors

A bit of a no-brainer. Cursors are less performant because every FETCH statement executed is equivalent to another SELECT statement execution that returns a single row. The optimiser can’t optimise a CURSOR statement, instead optimising the queries within each execution of the cursor loop, which is undesirable. Given that most CURSOR statements can be re-written using set logic, they should generally be avoided. There is a case where it is useful to use cursors, however, and that is when you actually want a row by row execution. In the case where there’s blocking occurring in a large batch of rows, that’s exactly what you want.

7. Archive old data

Another no-brainer, so I won’t say much. If you want to improve query performance, give the optimiser less work to do. If you can cut down the number of rows the query has to deal with, then performance will improve. I have no problem with people creating audit triggers to move historical data into other tables for this reason. Alternatively, if you don’t need your data after a certain period of time, back up your database and remove the data.
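One way to move old rows out, sketched under some assumptions: dbo.Orders and dbo.OrdersArchive are hypothetical tables with matching columns, the archive table has no triggers or foreign keys (OUTPUT INTO does not allow them on the target), and a two-year cutoff is acceptable. Deleting in batches keeps each transaction short.

```sql
DECLARE @Cutoff datetime;
SET @Cutoff = DATEADD(year, -2, GETDATE());

WHILE 1 = 1
BEGIN
    -- Delete a batch of old rows and copy them to the archive in one statement.
    DELETE TOP (5000) FROM dbo.Orders
    OUTPUT deleted.OrderId, deleted.CustomerId, deleted.OrderDate, deleted.TotalDue
    INTO dbo.OrdersArchive (OrderId, CustomerId, OrderDate, TotalDue)
    WHERE OrderDate < @Cutoff;

    -- Stop once there is nothing left to move.
    IF @@ROWCOUNT = 0 BREAK;
END
```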

8. Partition your data correctly

These days, you don’t actually have to move old data out of a table to improve query performance. You can partition your table into a number of data segments based on a partition function. The query optimiser can use the partition function to look at rows only on the most appropriate filegroup. To create partitions, you need a partition function and a partition scheme.

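CREATE PARTITION FUNCTION myRangePartitionFunction(int)
AS RANGE LEFT FOR VALUES (1, 100, 1000) -- boundary values assumed; three boundaries give four partitions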

Once the partition function is created, you can then apply the function to a partition scheme for a table.

CREATE PARTITION SCHEME myRangePartitionScheme
AS PARTITION myRangePartitionFunction
TO (filegrp1, filegrp2, filegrp3, filegrp4)

Then it’s just a matter of creating the table to use the partition scheme on the column you decided to partition on:

CREATE TABLE mySchema.myPartitionTable(
   col1 int,
   col2 nvarchar(100)
)
ON myRangePartitionScheme(col1)
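Once the table exists, a quick way to check how rows are being spread across partitions is the $PARTITION function. The sketch below assumes the function and table names used above; the filter value in the second query is just a placeholder.

```sql
-- Row count per partition, using the partition function defined earlier.
SELECT $PARTITION.myRangePartitionFunction(col1) AS PartitionNumber,
       COUNT(*) AS RowsInPartition
FROM mySchema.myPartitionTable
GROUP BY $PARTITION.myRangePartitionFunction(col1)
ORDER BY PartitionNumber;

-- A filter on the partitioning column lets the optimiser
-- skip partitions that cannot contain matching rows.
SELECT col1, col2
FROM mySchema.myPartitionTable
WHERE col1 = 50;
```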

9. Remove user-defined inline scalar functions

Inline scalar functions are convenient if you want to return a single value, but at the expense of performance. They look somewhat like stored procedures, and they can be used in SQL statements. The problem is that they are not expanded and therefore not optimised into the query plan by the query optimiser. Bad news, because it turns a Seek into a Scan. Queries like this may appear to be performant in the Execution plans and also in the IO statistics, but when you run the query, it can perform really really badly. No seriously, really bad.

Here’s an example of what I’m talking about:

CREATE FUNCTION dbo.fnGetPostalCode(
   @Suburb nvarchar(100),
   @State nvarchar(10)
)
RETURNS int -- return type assumed from the -1 default
AS
BEGIN
   RETURN ISNULL( (
      SELECT PostalCode
      FROM dbo.PostalCode
      WHERE Suburb = @Suburb
      AND State = @State
   ), -1 );
END

The following statement will only perform a clustered index scan, not a seek, and on a big table this could seriously affect performance.

SELECT s.SalesPersonID, s.SuburbName, s.State,
   dbo.fnGetPostalCode(s.SuburbName, s.State)
      AS PostalCode
FROM dbo.SalesPerson s

You can have a look at the details by clicking on SQL Server Management Studio’s Query menu, and selecting “Include Actual Execution Plan”

One way to get around this is to simply inline the underlying query from the function, as follows:

SELECT s.SalesPersonID, s.SuburbName, s.State,
   ISNULL( (SELECT PostalCode
         FROM dbo.PostalCode
         WHERE Suburb = s.SuburbName
         AND State = s.State), -1)
      AS PostalCode
FROM dbo.SalesPerson s

Inlining the SQL statement like this will perform significantly better.

10. Use APPLY

The APPLY operator was created for the situation where you would otherwise put multiple inline nested queries in the one statement. For example, take the following statement:

SELECT soh.SalesOrderID,
   Quantity=(SELECT TOP 1 (Quantity)
             FROM Sales.SalesOrderDetail
             WHERE SalesOrderID = soh.SalesOrderID),
   UnitPrice=(SELECT TOP 1 (UnitPrice)
              FROM Sales.SalesOrderDetail
              WHERE SalesOrderID = soh.SalesOrderID)
FROM Sales.SalesOrderHeader soh

This performs an extra query, retrieving data from another table using the same criterion. This can now be replaced with the following:

SELECT soh.SalesOrderID, soh.OrderDate, a.*
FROM Sales.SalesOrderHeader soh
CROSS APPLY ( -- use OUTER APPLY to keep orders that have no detail rows
   SELECT TOP (1) sod.UnitPrice, sod.Quantity
   FROM Sales.SalesOrderDetail sod
   WHERE sod.SalesOrderId = soh.SalesOrderId
   ORDER BY sod.Quantity DESC
) as a

11. Use computed columns

Computed columns are derived from other columns in a table. By creating and indexing a computed column, you can turn what would otherwise be a scan into a seek. For example, if you needed to calculate SalesPrice and you had a Quantity and UnitPrice column, multiplying them in the SQL inline would cause a table scan as it multiplied the two columns together for every single row. Create a computed column called SalesPrice, then index it, and the query optimiser will no longer need to retrieve the UnitPrice and Quantity data and do a calculation – it’s already done.
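A minimal sketch of that approach, assuming the MySchema.SalesOrderDetail table used earlier has Quantity and UnitPrice columns of exact numeric types. The index name and the filter value are placeholders. The GO separators matter, because a batch cannot reference a column that is added earlier in the same batch.

```sql
-- Add the computed column and store its value with the row.
ALTER TABLE MySchema.SalesOrderDetail
ADD SalesPrice AS (Quantity * UnitPrice) PERSISTED;
GO

-- Index the computed column so filters on it can seek.
CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_SalesPrice
ON MySchema.SalesOrderDetail (SalesPrice);
GO

-- The filter can now use the index instead of multiplying every row.
SELECT OrderId, SalesPrice
FROM MySchema.SalesOrderDetail
WHERE SalesPrice > 1000;
GO
```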

12. Use the correct transaction isolation level

If there are a lot of rows in your table, multiple concurrent requests to that table could cause contention if the correct transaction isolation level is not set. If requests are repeatedly blocked, it could be time to consider whether to change.

For example, READ UNCOMMITTED is equivalent to dirty reads, or NOLOCK. That is, if a transaction is in the middle of processing and you read a row, the data may not be valid, especially if multiple inserts/updates are occurring that require atomicity. This is the most performant and it ignores locking altogether, but is generally not allowed by good design and is a special case.

READ_COMMITTED_SNAPSHOT uses row versioning, so that any data read will be the transactionally consistent version of the data that existed at the start of the statement (with the related SNAPSHOT isolation level, at the start of the transaction). Internally, versioned copies of modified rows are placed in tempdb until they are no longer needed. Except when the database is being recovered, snapshot transactions do not request locks when reading data, and therefore do not block other transactions from writing data. Transactions writing data also do not block other transactions reading data.

There are various other transaction isolation levels, including REPEATABLE READ and SERIALIZABLE, that you can look at to determine whether they are appropriate for your needs.
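For reference, the options discussed above are switched on roughly as follows. MyDatabase and dbo.Orders are placeholder names. Turning on READ_COMMITTED_SNAPSHOT needs the database to have no other active connections at the time, so it is normally done in a maintenance window.

```sql
-- Database-level options (MyDatabase is a placeholder name).
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

-- Session-level: SNAPSHOT gives a consistent view for the whole transaction.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    SELECT COUNT(*) FROM dbo.Orders;
COMMIT TRANSACTION;

-- Session-level: dirty reads, equivalent to NOLOCK on every table in the query.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM dbo.Orders;

-- Back to the default.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```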