How to determine Index fragmentation and then defragment a database

Index fragmentation is always an issue with big data. I have written some articles before:
How to rebuild all indexes
How to script all indexes
How to examine index fragmentation in SQL Server 2012

The easy way:

SELECT
DB_NAME(DPS.DATABASE_ID) AS [DatabaseName]
, OBJECT_NAME(DPS.OBJECT_ID) AS TableName
, SI.NAME AS IndexName
, DPS.INDEX_TYPE_DESC AS IndexType
, DPS.AVG_FRAGMENTATION_IN_PERCENT AS AvgPageFragmentation
, DPS.PAGE_COUNT AS PageCounts
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL , NULL, NULL) DPS --N'LIMITED') DPS
INNER JOIN sysindexes SI
ON DPS.OBJECT_ID = SI.ID
AND DPS.INDEX_ID = SI.INDID
ORDER BY AVG_FRAGMENTATION_IN_PERCENT DESC

As a rule, all indexes with a fragmentation state over 30% should be defragmented:

— Index neu erstellen
EXEC sp_MSforeachtable @command1=”print ‘Erstelle Index für ?’ ALTER INDEX ALL ON ? REBUILD WITH (FILLFACTOR = 90)”
GO

— Index reorganisieren
EXEC sp_MSforeachtable @command1=”print ‘Reorgansiere Index für ?’ ALTER INDEX ALL ON ? REORGANIZE”
GO

How to examine index fragmentation in SQL Server 2012

In this exercise, execute the following T-SQL code, examine the fragmentation reported, and defragment the indexes that show 100 percent fragmentation.

SELECT SCHEMA_NAME(so.schema_id) AS [SchemaName],
OBJECT_NAME(idx.OBJECT_ID) AS [TableName],
idx.name AS [IndexName],
idxstats.index_type_desc AS [Index_Type_Desc],
CAST(idxstats.avg_fragmentation_in_percent AS decimal(5,2)) AS [Frag_Pct],
idxstats.fragment_count,
idxstats.page_count,
idx.fill_factor
FROM sys.dm_db_index_physical_stats
(DB_ID(), NULL, NULL, NULL, ‘DETAILED’) idxstats
INNER JOIN
sys.indexes idx
ON idx.OBJECT_ID = idxstats.OBJECT_ID
AND idx.index_id = idxstats.index_id

INNER JOIN sys.objects so
ON so.object_id = idx.object_id

WHERE idxstats.avg_fragmentation_in_percent > 20

ORDER BY idxstats.avg_fragmentation_in_percent DESC

Index fragmentation
Index fragmentation

How to create a filtered index

This is the quick-and-dirty method of creating a filtered index:

T-SQL
USE [AdventureWorks2012]
GO
SET ANSI_PADDING ON
GO
CREATE NONCLUSTERED INDEX [NC_Person_Address_City_spi_pc] ON [Person].[Address] ([City] ASC) INCLUDE ([StateProvinceID], [PostalCode])
WHERE [City] = ‘Seattle’ ON [PRIMARY] GO

You can find more about filtered indexes on Introduction to SQL Server Filtered Indexes by Seth Delconte

Find missing indexes for SQL Server 2012

The following example generates a listing of missing indexes that could be beneficial for T-SQL batches that have been submitted to the query optimizer since the last time the SQL Server service was restarted:

T-SQL

SELECT user_seeks * avg_total_user_cost * (avg_user_impact * 0.01) AS [Index_Useful] ,
igs.last_user_seek ,
id.statement AS [Statement] ,
id.equality_columns ,
id.inequality_columns ,
id.included_columns ,
igs.unique_compiles ,
igs.user_seeks ,
igs.avg_total_user_cost ,
igs.avg_user_impact
FROM sys.dm_db_missing_index_group_stats AS igs
INNER JOIN sys.dm_db_missing_index_groups AS ig ON igs.group_handle = ig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS id ON ig.index_handle = id.index_handle
ORDER BY [Index_Useful] DESC;

Query Execution Results
Query Execution Results
Missing Index Details - Full listing showing the desirability status, columns to be used and user seeks
Missing Index Details – Full listing showing the desirability status, columns to be used and user seeks

As you can see, the most required index missing in the database above is one on the status table.
We can create the missing index using the following command:

CREATE INDEX IX_ERPStatus ON tblERPStatus (SystemID, ERPDocumentNumber)

An alternative to the Count(*) function.

Sometimes rapid code development doesn’t always produce the most efficient code.
Take the age old line of code

SELECT COUNT(*) FROM MyTable.

Obviously this will give you the row count for a table, but at what cost?

Doing any SELECT * from a table will ultimately result in a table or clustered index scan.

ExecutionPlan

Turning on STATISTICS IO on reveals 4534 logical reads just to return the row count of 1,133,242.

Table 'tblERPInvoiceData'. Scan count 1, logical reads 4534, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Starting with SQL Server 2005, Microsoft introduced a DMV, sys.dm_db_partition_stats, that provides you with the same information at a fraction of the cost. It requires a little more coding, but once you turn on STATISTICS IO, you will see the performance benefit.

USE [db_name];
SELECT
s.name AS 'SchemaName'
,o.name AS 'TableName'
,SUM(p.row_count) AS 'RowCount'
FROM sys.dm_db_partition_stats p
JOIN sys.objects o ON o.object_id = p.object_id
JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE p.index_id < 2 AND o.type = 'U'
AND s.name = 'dbo'
AND o.name = 'tblERPInvoiceData'
GROUP BY s.name,o.name
ORDER BY s.name,o.name;
GO

Since we’re querying a DMV, we never touch the base table. We can see here we only need 16 logical reads to return the same row count of 1,133,242, and the tblERPInvoiceData table is nowhere in our execution plan.
Capture

Table 'sysidxstats'. Scan count 1, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 0, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysclsobjs'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

By using the DMV, we have improved the query performance and reduced the total I/O count by nearly 100%. Another added benefit of using the DMV, is we won’t need locks on the base table and therefore will avoid the possibility of blocking other queries hitting that table.