Source : http://sqlperformance.com/2013/06/t-sql-queries/missing-index
Kevin Kline (@kekline) and I recently held a query tuning webinar (well, one in a series, actually), and one of the things that came up is the tendency of folks to create any missing index that SQL Server tells them will be a good thing™. They can learn about these missing indexes from the Database Engine Tuning Advisor (DTA), the missing index DMVs, or an execution plan displayed in Management Studio or SQL Sentry Plan Explorer (all of which just relay information from exactly the same place):
The problem with just blindly creating this index is that SQL Server has decided that it is useful for a particular query (or handful of queries), but completely and unilaterally ignores the rest of the workload. As we all know, indexes are not "free" – you pay for indexes both in raw storage as well as maintenance required on DML operations. It makes little sense, in a write-heavy workload, to add an index that helps make a single query slightly more efficient, especially if that query is not run frequently. It can be very important in these cases to understand your overall workload and strike a good balance between making your queries efficient and not paying too much for that in terms of index maintenance.
So an idea I had was to "mash up" information from the missing index DMVs, the index usage stats DMV, and information about query plans, to determine what type of balance currently exists and how the overall benefit of adding the index might fare.
Missing indexes
First, we can take a look at the missing indexes that SQL Server currently suggests:
This shows the table(s) and column(s) that would have been useful in an index, how many compiles/seeks/scans would have been used, and when the last such event happened for each potential index. You can also include columns like
s.avg_total_user_cost
and s.avg_user_impact
if you want to use those figures to prioritize.Plan operations
Next, let's take a look at the operations used in all of the plans we have cached against the objects that have been identified by our missing indexes.
A friend over on dba.SE, Mikael Eriksson, suggested the following two queries which, on a larger system, will perform much better than the XML / UNION query I cobbled together above, so you could experiment with those first. His ending comment was that he "not surprisingly found out that less XML is a good thing for performance. :)" Indeed.
Now in the
#planops
table you have a bunch of values for plan_handle
so that you can go and investigate each of the individual plans in play against the objects that have been identified as lacking some useful index. We're not going to use it for that right now, but you can easily cross-reference this with:
Now you can click on any of the output plans to see what they're currently doing against your objects. Note that some of the plans will be repeated, since a plan can have multiple operators that reference different indexes on the same table.
Index usage stats
Next, let's take a look at index usage stats, so we can see how much actual activity is currently running against our candidate tables (and, particularly, updates).
Don't be alaramed if very few or no plans in the cache show updates for a particular index, even though the index usage stats show that those indexes have been updated. This just means that the update plans aren't currently in cache, which could be for a variety of reasons – for example, it could be a very read-heavy workload and they've been aged out, or they're all single-use and
optimize for ad hoc workloads
is enabled.Putting it all together
The following query will show you, for each suggested missing index, the number of reads an index might have assisted, the number of writes and reads that have currently been captured against the existing indexes, the ratio of those, the number of plans associated with that object, and the total number of use counts for those plans:
If your write:read ratio to these indexes is already > 1 (or > 10!), I think it gives reason for pause before blindly creating an index that could only increase this ratio. The number of
potential_read_ops
shown, however, may offset that as the number becomes larger. If the potential_read_ops
number is very small, you probably want to ignore the recommendation entirely before even bothering to investigate the other metrics – so you could add a WHERE
clause to filter some of those recommendations out.
A couple of notes:
- These are read and write operations, not individually measured reads and writes of 8K pages.
- The ratio and comparisons are largely educational; it could very well be the case that 10,000,000 write operations all affected a single row, while 10 read operations could have had substantially more impact. This is just meant as a rough guideline and assumes that read and write operations are weighted roughly the same.
- You may also use slights variations on some of these queries to find out – outside of the missing indexes SQL Server is recommending – how many of your current indexes are wasteful. There are plenty of ideas about this online, including this post by Paul Randal (@PaulRandal).
I hope that gives some ideas for gaining more insight into your system's behavior before you decide to add an index that some tool told you to create. I could have created this as one massive query, but I think the individual parts will give you some rabbit holes to investigate, if you so wish.