What is fragmentation in Azure SQL server?
Fragmentation in Azure SQL Server, or any SQL Server, refers to the condition where the data pages of a database or index are not physically stored in an efficient, contiguous manner. Fragmentation can occur over time as data is inserted, updated, and deleted from tables and indexes. It can negatively impact database performance and efficiency. There are two primary types of fragmentation in SQL Server:
Index Fragmentation:
- Internal Fragmentation (Page Fragmentation): This occurs when index pages hold more free space than they need to, so the same rows are spread over more pages than necessary (low page density). Internal fragmentation wastes space in the data file and in memory.
- External Fragmentation: This occurs when the logical order of an index's pages no longer matches their physical order in the data file. It can lead to increased disk I/O operations, particularly for range scans, slowing down query performance.
Data Fragmentation:
- Data fragmentation refers to fragmentation within the actual data pages of tables. This can occur when data is inserted or deleted, causing gaps in data page storage.
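Each of these kinds of fragmentation has a corresponding column in sys.dm_db_index_physical_stats. The query below is a minimal sketch, assuming that the 'SAMPLED' scan mode is affordable for the size of your database (it reads more pages than 'LIMITED'); the column aliases are illustrative only.

```sql
-- One row per index (or heap) partition in the current database.
SELECT OBJECT_SCHEMA_NAME(ips.object_id) AS schema_name,
       OBJECT_NAME(ips.object_id)        AS table_name,
       i.name                            AS index_name,               -- NULL for a heap
       ips.index_type_desc,
       ips.avg_fragmentation_in_percent   AS external_fragmentation_pct,
       ips.avg_page_space_used_in_percent AS page_density_pct,        -- low value = internal fragmentation
       ips.forwarded_record_count,                                    -- heaps only; NULL for indexes
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id  = ips.index_id
WHERE ips.alloc_unit_type_desc = 'IN_ROW_DATA'
ORDER BY ips.page_count DESC;
```

A high external_fragmentation_pct points to out-of-order pages, a low page_density_pct points to half-empty pages, and a non-zero forwarded_record_count points to fragmentation inside a heap.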
Fragmentation can have a significant impact on database performance because it can increase the number of physical reads and writes required for data retrieval and modification. This results in slower query response times and reduced overall system performance.
To address fragmentation in Azure SQL Server, you can perform index maintenance tasks, such as index rebuilding or reorganizing, which involve physically reordering the data pages within an index to reduce fragmentation. Additionally, you can monitor fragmentation levels and implement a regular maintenance plan to keep fragmentation in check.
Azure SQL Database automates much of the platform upkeep, such as backups and patching, and its automatic tuning feature can create and drop indexes for you. It does not, however, rebuild or reorganize fragmented indexes automatically, so index maintenance remains a task you plan and run yourself.
Efficiently managing fragmentation is a critical aspect of database administration to ensure optimal database performance, especially in cloud-based database services like Azure SQL Database.
Why does the index fragmentation percentage vary?
Index fragmentation percentages can vary in a database due to a combination of factors related to data modification and storage management. Here are some of the reasons why index fragmentation percentages may fluctuate:
- Data Modification:
Inserts, updates, and deletes of rows in tables can lead to index fragmentation. When new data is added or existing data is modified or deleted, it may no longer fit efficiently in the existing data pages, causing fragmentation.
- Page Splits:
Page splits occur when a data page in an index is full, and a new row needs to be inserted. SQL Server will split the full page into two new pages, leading to physical fragmentation. Page splits are a common cause of fragmentation.
- Rebuilding and Reorganizing:
The process of index rebuilding or reorganizing can affect fragmentation levels. Rebuilding an index typically results in a defragmented index, while reorganizing can reduce fragmentation but may not eliminate it completely.
- Concurrency:
High levels of concurrent data modifications can lead to increased fragmentation. Multiple transactions attempting to modify the same index simultaneously can result in fragmentation.
- Data Distribution:
Uneven data distribution in the index key can contribute to fragmentation. If certain values are concentrated in a specific range, it may lead to page splits when new data is inserted.
- Maintenance and Scheduling:
The timing and scheduling of index maintenance tasks can impact fragmentation. Frequent maintenance can help keep fragmentation levels low, while irregular or infrequent maintenance can allow fragmentation to accumulate.
- Index Fill Factor:
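The fill factor setting for an index can influence fragmentation. The fill factor determines the percentage of each leaf-level page that is filled with data when the index is created or rebuilt; the remainder is left empty for future growth. A lower fill factor leaves more free space for future data modifications, which reduces page splits, but it lowers page density, so the index occupies more pages.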
- Storage System Performance:
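The underlying storage system's performance, including disk speed and I/O throughput, does not change the fragmentation percentage itself, but it determines how strongly a given level of fragmentation is felt: on slower storage, the extra page reads caused by fragmentation cost more.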
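Of the factors above, fill factor is the one you set directly. The following is a minimal sketch, assuming a table dbo.Products with an index IX_Products_ProductID; the value 90 is purely illustrative and not a recommendation.

```sql
-- Inspect the fill factor currently stored for each index on the table.
-- A value of 0 means the server default is used, which behaves like 100.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       i.fill_factor
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.Products')
  AND i.name IS NOT NULL;

-- Rebuild the index so that leaf pages are filled to 90%,
-- leaving 10% free on each page to absorb future inserts and updates.
ALTER INDEX IX_Products_ProductID ON dbo.Products
REBUILD WITH (FILLFACTOR = 90);
```

The fill factor is applied only when the index is created or rebuilt; pages fill up again as data changes, so it delays page splits rather than preventing them.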
Problems caused by an increase in fragmentation percentage
An increase in fragmentation percentage in a database can lead to several performance-related problems, affecting query performance and overall database efficiency. Here are the key problems that can occur as fragmentation increases:
- Slower Query Performance: Fragmentation can significantly slow down query execution because SQL Server must perform more I/O operations to retrieve data. As fragmentation increases, the database engine may need to read more data pages, resulting in longer query response times.
- Increased Disk I/O: Fragmentation can lead to a higher number of physical read and write operations on disk. This can strain the storage system, especially in environments with high data modification rates, leading to reduced disk performance and slower data retrieval.
- Reduced Buffer Pool Efficiency: Fragmented indexes and tables may consume more space in the buffer pool, which is a portion of memory used to cache frequently accessed data. As more pages are needed to store fragmented data, there is less room in the buffer pool for other data, potentially causing increased disk I/O.
- Increased Maintenance Overhead: Indexes with high fragmentation require more frequent index maintenance, such as rebuilding or reorganizing. This maintenance can consume additional system resources and increase database management overhead.
- Locking and Blocking Issues: As queries encounter fragmentation-related performance issues, there may be more locking and blocking incidents. Lock contention can result from longer query execution times, leading to concurrency problems and application delays.
- Reduced Backup and Restore Performance: Fragmented data can impact backup and restore operations. Backing up or restoring a fragmented database can take longer due to the increased I/O requirements.
- Space Wastage: Fragmentation results in wasted disk space, as data pages are not used as efficiently as they could be. Over time, this can lead to larger data files and increased storage costs.
- Application Slowdown: Application users may experience slower response times and degraded user experience due to increased query response times caused by fragmentation.
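The space wastage and buffer pool effects can be estimated from page density. The sketch below assumes 8 KB pages (the SQL Server page size) and uses the 'SAMPLED' mode so that avg_page_space_used_in_percent is populated; the result is an approximation, not an exact measurement.

```sql
-- Approximate free space held inside allocated pages, per index.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name                     AS index_name,
       ips.page_count,
       ips.avg_page_space_used_in_percent,
       CAST(ips.page_count * 8.0 / 1024 AS DECIMAL(18, 2)) AS allocated_mb,
       CAST(ips.page_count
            * (100.0 - ips.avg_page_space_used_in_percent) / 100.0
            * 8.0 / 1024 AS DECIMAL(18, 2))                AS approx_free_mb
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id  = ips.index_id
WHERE ips.alloc_unit_type_desc = 'IN_ROW_DATA'
ORDER BY approx_free_mb DESC;
```

For example, an index with 10,000 pages at 60% page density holds roughly 31 MB of empty space that still has to be read from disk and cached in memory.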
How to find Index Fragmentation using the T-SQL statement
To find index fragmentation for all objects in the database:
SELECT S.name as 'Schema',
T.name as 'Table',
I.name as 'Index',
DDIPS.avg_fragmentation_in_percent,
DDIPS.page_count
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS DDIPS
INNER JOIN sys.tables T on T.object_id = DDIPS.object_id
INNER JOIN sys.schemas S on T.schema_id = S.schema_id
INNER JOIN sys.indexes I ON I.object_id = DDIPS.object_id
AND DDIPS.index_id = I.index_id
WHERE DDIPS.database_id = DB_ID()
and I.name is not null
AND DDIPS.avg_fragmentation_in_percent > 0
ORDER BY DDIPS.avg_fragmentation_in_percent desc
Output:
In this example output, the highest average fragmentation is 99%, which calls for action: either REBUILD or REORGANIZE the index. Both are index maintenance options of the ALTER INDEX statement, and both can also be run from SSMS.
To find index fragmentation for a particular table in the database:
SELECT OBJECT_SCHEMA_NAME(ips.object_id) AS schema_name,
OBJECT_NAME(ips.object_id) AS object_name,
i.name AS index_name,
i.type_desc AS index_type,
ips.avg_fragmentation_in_percent,
ips.avg_page_space_used_in_percent,
ips.page_count,
ips.alloc_unit_type_desc
FROM sys.dm_db_index_physical_stats(6, 1213403542, DEFAULT, DEFAULT, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i ON ips.object_id = i.object_id
AND ips.index_id = i.index_id;
Output:
SELECT clause:
The query selects specific columns to display in the result set.
OBJECT_SCHEMA_NAME(ips.object_id) AS schema_name:
Retrieves the schema name of the object associated with the index.
OBJECT_NAME(ips.object_id) AS object_name:
Retrieves the name of the object (e.g., table) associated with the index.
i.name AS index_name:
Retrieves the name of the index.
i.type_desc AS index_type:
Retrieves the type of the index (e.g., clustered, non-clustered).
ips.avg_fragmentation_in_percent:
Retrieves the average fragmentation percentage of the index.
ips.avg_page_space_used_in_percent:
Retrieves the average page space used percentage.
ips.page_count:
Retrieves the number of pages used by the index.
ips.alloc_unit_type_desc:
Retrieves the allocation unit type description (e.g., IN_ROW_DATA for data pages).
FROM clause:
The query retrieves this information from the sys.dm_db_index_physical_stats dynamic management function. This function provides physical statistics about index fragmentation and page usage.
INNER JOIN clause:
Joins the result from sys.dm_db_index_physical_stats with the sys.indexes system view using the object ID and index ID.
The query retrieves index fragmentation, space usage, and related details for the indexes of one object (object ID 1213403542) in one database (database ID 6). The 'LIMITED' mode is the fastest option because it scans the smallest number of pages and does not read the leaf level of the index. As a consequence, columns that depend on the leaf pages, such as avg_page_space_used_in_percent, are returned as NULL; use 'SAMPLED' or 'DETAILED' when you need page density figures.
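A variant of the same query can avoid hard-coded IDs and return page density as well. This is a minimal sketch, assuming a hypothetical table dbo.Products in the current database; substitute your own table name. The NULL check matters because passing a NULL object ID makes the function return every object in the database.

```sql
-- Resolve the object ID from a name instead of hard-coding it.
DECLARE @object_id int = OBJECT_ID(N'dbo.Products');

IF @object_id IS NULL
BEGIN
    RAISERROR(N'Table dbo.Products was not found in the current database.', 16, 1);
    RETURN;
END;

SELECT OBJECT_SCHEMA_NAME(ips.object_id) AS schema_name,
       OBJECT_NAME(ips.object_id)        AS object_name,
       i.name                            AS index_name,
       i.type_desc                       AS index_type,
       ips.avg_fragmentation_in_percent,
       ips.avg_page_space_used_in_percent,   -- populated in SAMPLED and DETAILED modes
       ips.page_count,
       ips.alloc_unit_type_desc
FROM sys.dm_db_index_physical_stats(DB_ID(), @object_id, NULL, NULL, 'SAMPLED') AS ips
INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id  = ips.index_id;
```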
How to find Object IDs for above Query:
SELECT
i.object_id,
t.NAME AS TableName,
s.Name AS SchemaName,
p.rows,
SUM(a.total_pages) * 8 AS TotalSpaceKB,
CAST(ROUND(((SUM(a.total_pages) * 8) / 1024.00), 2) AS NUMERIC(36, 2)) AS TotalSpaceMB,
SUM(a.used_pages) * 8 AS UsedSpaceKB,
CAST(ROUND(((SUM(a.used_pages) * 8) / 1024.00), 2) AS NUMERIC(36, 2)) AS UsedSpaceMB,
(SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB,
CAST(ROUND(((SUM(a.total_pages) - SUM(a.used_pages)) * 8) / 1024.00, 2) AS NUMERIC(36, 2)) AS UnusedSpaceMB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
LEFT OUTER JOIN
sys.schemas s ON t.schema_id = s.schema_id
WHERE
t.NAME NOT LIKE 'dt%'
AND t.is_ms_shipped = 0
AND i.OBJECT_ID > 255
GROUP BY
i.object_id,
t.Name, s.Name, p.Rows
ORDER BY
TotalSpaceMB DESC, t.Name
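If only the IDs are needed, without the space figures, the built-in metadata functions are a shorter route. The sketch below assumes a hypothetical table named dbo.Products.

```sql
-- ID of the current database and of one named table.
SELECT DB_ID()                      AS database_id,
       OBJECT_ID(N'dbo.Products')   AS object_id;   -- NULL if the table does not exist

-- Object IDs of all user tables in the current database.
SELECT s.name AS schema_name,
       t.name AS table_name,
       t.object_id
FROM sys.tables AS t
INNER JOIN sys.schemas AS s
        ON s.schema_id = t.schema_id
ORDER BY s.name, t.name;
```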
How to decide which table's index fragmentation is high and needs a rebuild?
Percentage of High Fragmentation Limit | Provide the limit (in percentage) of fragmentation above which an index is termed a highly fragmented index in this text box. By default, the value specified here is 30. This means that if 30% or more of a monitored index is found to be fragmented, then such indexes are counted as highly fragmented indexes. |
Percentage of Very High Fragmentation Limit | Provide the limit (in percentage) of fragmentation above which an index is termed a very highly fragmented index in this text box. By default, the value specified here is 50. This means that if 50% or more of a monitored index is found to be fragmented, then such indexes are counted as very highly fragmented indexes. |
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
Highly fragmented SQL indexes | Indicates the number of highly fragmented indexes. | Number | If 30% – 49% of an index is found to be fragmented, then such an index is counted as a highly fragmented index. Ideally, the value of this measure should be 0. A high value indicates high index fragmentation. High levels of fragmentation can cause disk I/O to mount, queries to run for long periods, and the overall performance of the database server to deteriorate. Use the detailed diagnosis of this measure to identify highly fragmented indexes. Once the affected indexes are isolated, take the necessary steps to correct the fragmentation. Towards this end, SQL provides the following statements: DROP INDEX followed by CREATE INDEX; CREATE INDEX WITH DROP_EXISTING; DBCC INDEXDEFRAG; DBCC DBREINDEX |
Very highly fragmented SQL indexes | Indicates the number of indexes that are very highly fragmented. | Number | If 50% or more of an index is found to be fragmented, then such an index is counted as a very highly fragmented index. Ideally, the value of this measure should be 0. A high value indicates high index fragmentation. High levels of fragmentation can cause disk I/O to mount, queries to run for long periods, and the overall performance of the database server to deteriorate. Use the detailed diagnosis of this measure to identify very highly fragmented indexes. Once the affected indexes are isolated, take the necessary steps to correct the fragmentation. Towards this end, SQL provides the following statements: DROP INDEX followed by CREATE INDEX; CREATE INDEX WITH DROP_EXISTING; DBCC INDEXDEFRAG; DBCC DBREINDEX |
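The two measures in the table can be approximated with a query. This is a minimal sketch that applies the default limits of 30% and 50%; the band labels are illustrative, and a partitioned index is counted once per partition.

```sql
-- Count indexes per fragmentation band using the 30% / 50% limits.
SELECT b.fragmentation_band,
       COUNT(*) AS index_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
CROSS APPLY (VALUES (CASE
                         WHEN ips.avg_fragmentation_in_percent >= 50 THEN 'Very high (50% or more)'
                         WHEN ips.avg_fragmentation_in_percent >= 30 THEN 'High (30% - 49%)'
                         ELSE 'Below limit'
                     END)) AS b (fragmentation_band)
WHERE ips.index_id > 0                           -- skip heaps
  AND ips.alloc_unit_type_desc = 'IN_ROW_DATA'
GROUP BY b.fragmentation_band
ORDER BY b.fragmentation_band;
```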
Rebuild and Reorganize Index using SQL Server Management Studio (SSMS)
Find and expand the table in Object Explorer >> Open Indexes >> Right-click on the target index >> Rebuild or Reorganize.
REBUILD and REORGANIZE are the two choices available for defragmenting the index. Ideally, run either operation during off-peak hours to limit the impact on other transactions and users. Microsoft SQL Server Enterprise Edition supports both online and offline index REBUILD.
REBUILD INDEX using Query
Step 1: Identify Fragmented Indexes
Before you begin the index rebuilding process, it’s essential to identify which indexes are fragmented and need maintenance. Use the following query to get a list of fragmented indexes:
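-- List fragmented indexes with a fragmentation level greater than 10%
-- The index name comes from sys.indexes; the DMF itself does not expose it
SELECT
OBJECT_NAME(ips.object_id) AS TableName,
i.name AS IndexName,
ips.index_type_desc AS IndexType,
ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) AS ips
INNER JOIN sys.indexes AS i
ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 10;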
This query lists indexes with a fragmentation level greater than 10%. You can adjust the threshold based on your requirements.
Step 2: Backup the Database
Before performing any significant maintenance tasks, such as index rebuilding, it’s best practice to back up your database. This ensures data safety and provides a recovery point in case of unexpected issues during maintenance.
Step 3: Rebuild Indexes
Now, let’s rebuild the identified fragmented indexes. You can use the ALTER INDEX statement for this task. For example, to rebuild an index named IX_Products_ProductID on a table named Products, use the following SQL statement:
-- Rebuild the specified index on the table
ALTER INDEX IX_Products_ProductID ON dbo.Products REBUILD;
Replace IX_Products_ProductID with the name of the index you want to rebuild and dbo.Products with the table and schema name.
Step 4: Monitor Progress
Index rebuilding can be a resource-intensive process, and it’s essential to monitor its progress and resource utilization. You can use SQL Server Management Studio or other monitoring tools to keep track of the operation. Be sure to check for completion and resource consumption.
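One way to watch a running rebuild from a second session is to query sys.dm_exec_requests. The sketch below assumes you have permission to view server or database state, and it finds the request by searching the statement text for ALTER INDEX, which is a simple heuristic rather than an exact match.

```sql
-- Show currently running requests whose batch text contains ALTER INDEX.
SELECT r.session_id,
       r.status,
       r.command,
       r.wait_type,
       r.total_elapsed_time AS elapsed_ms,
       r.cpu_time           AS cpu_ms,
       r.logical_reads,
       t.text               AS batch_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID              -- exclude this monitoring query itself
  AND t.text LIKE N'%ALTER INDEX%';
```

Running it repeatedly shows whether the operation is progressing (cpu_ms and logical_reads keep growing) or waiting (wait_type is populated and the counters stall).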
Step 5: Update Statistics
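Rebuilding an index also refreshes the statistics attached to that index, as if they had been updated with a full scan. Column statistics that are not tied to an index are not touched, so update those separately to make sure the query optimizer has accurate information. Use the following SQL command, which limits the update to column statistics so that the freshly rebuilt index statistics are not replaced by sampled ones:
-- Update column statistics for the table
UPDATE STATISTICS dbo.Products WITH FULLSCAN, COLUMNS;
Replace dbo.Products with the name of your table.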
Step 6: Review and Schedule Regular Maintenance
Index maintenance is not a one-time task; it should be part of your regular database maintenance strategy. You can schedule index maintenance jobs to run at off-peak times to minimize the impact on your production environment.
Step 7: Understand What Azure SQL Database Automates
If you’re using Azure SQL Database, note that the platform does not rebuild or reorganize indexes for you. Automatic tuning can create and drop indexes, and the maintenance window feature only controls when planned platform maintenance takes place. Index maintenance therefore still has to be scheduled by you, for example with elastic jobs or Azure Automation.
Note:
REBUILD Index can be set online or offline using the below SQL commands:
--Basic Rebuild Command (offline by default)
ALTER INDEX Index_Name ON Table_Name REBUILD;
--REBUILD Index with ONLINE OPTION (use ONLINE = OFF to request an offline rebuild explicitly)
ALTER INDEX Index_Name ON Table_Name REBUILD WITH (ONLINE = ON);
If the REBUILD runs offline, the underlying table is not accessible until the rebuild completes, which also affects other transactions that touch the table. A rebuild recreates the index, so it generates new statistics for that index and writes the corresponding log records to the database transaction log.
For example, before rebuilding the index, let’s capture the current page allocation for the IX_abc index on the Sales.abc table in the abc database. The query uses sys.dm_db_database_page_allocations, an undocumented function, so treat its output as diagnostic only.
SELECT OBJECT_NAME(IX.object_id) as table_name, si.name, extent_page_id, allocated_page_page_id, previous_page_page_id, next_page_page_id
FROM sys.dm_db_database_page_allocations(DB_ID('abc'), OBJECT_ID('Sales.abc'),NULL, NULL, 'DETAILED') IX
INNER JOIN sys.indexes si on IX.object_id = si.object_id AND IX.index_id = si.index_id
WHERE si.name = 'IX_abc'
ORDER BY allocated_page_page_id
Here, 1961 pages exist in the data file for this index, and the first five pages, in page-number order, are 861, 862, 1627, 1628, and 1904. Now, let’s rebuild the index using SSMS.
Once the index REBUILD completes successfully, run the same T-SQL query again to capture the page allocation for the same index.
After the rebuild, the page count is 1457, down from 1961. The first five allocated pages of the index also have new page references, which indicates that the index was dropped and created again. Checking the fragmentation percentage for the same index again shows that it is now 0.1%.
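Rebuild all indexes of a table together
To rebuild all indexes of a table in one statement, use ALTER INDEX ALL. The older DBCC DBREINDEX() command takes a table name, not a database name, as its first argument and is deprecated, so prefer ALTER INDEX in new scripts.
ALTER INDEX ALL ON Schema_Name.Table_Name REBUILD;
-- Deprecated equivalent: DBCC DBREINDEX ('Schema_Name.Table_Name');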
REORGANIZE INDEX
The REORGANIZE option defragments the leaf level of an index by physically reordering the leaf pages to match their logical order and by compacting the pages to remove unused space. REORGANIZE does not drop and recreate the index; it restructures the data on the existing pages. Unlike REBUILD, REORGANIZE has no offline option (it always runs online) and does not update statistics. This operation is less resource-intensive than rebuilding the index and is often a good choice for moderately fragmented indexes.
Take the fragmentation reading for the database
The index fragmentation percentage is 98.39 before REORGANIZE, and 459 pages are allocated to the index.
ALTER INDEX abc_index ON Sales.Orderabc REORGANIZE
Run the query below before and after the reorganize to compare the allocated pages:
SELECT OBJECT_NAME(IX.object_id) as table_name, si.name, extent_page_id, allocated_page_page_id,
previous_page_page_id, next_page_page_id
FROM sys.dm_db_database_page_allocations(DB_ID('abc'), OBJECT_ID('Sales.Orderabc'),NULL, NULL, 'DETAILED') IX
INNER JOIN sys.indexes si on IX.object_id = si.object_id AND IX.index_id = si.index_id
WHERE si.name = 'abc_index'
ORDER BY allocated_page_page_id
After the REORGANIZE, the total page count decreased to 331, from 459 before.
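To reorganize all indexes of a table in one statement, use ALTER INDEX ALL with REORGANIZE. The older DBCC INDEXDEFRAG() command does the same job but is deprecated:
ALTER INDEX ALL ON Schema_Name.Table_Name REORGANIZE;
-- Deprecated equivalent: DBCC INDEXDEFRAG('DatabaseName', 'TableName');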
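Because REORGANIZE leaves statistics untouched, it can be worth refreshing them as a separate step. The sketch below reuses the abc_index and Sales.Orderabc names from the example above; whether a statistics update is needed depends on how much the data has changed, so treat this as optional.

```sql
-- Reorganize the index, then refresh the statistics that belong to it.
ALTER INDEX abc_index ON Sales.Orderabc REORGANIZE;

UPDATE STATISTICS Sales.Orderabc abc_index;

-- Confirm when each statistics object on the table was last updated.
SELECT s.name                              AS statistics_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'Sales.Orderabc');
```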
As seen, there is a substantial difference between index REBUILD and REORGANIZE, and users can choose between them based on the index fragmentation percentage. There is no fixed standard; database administrators usually follow a rule of thumb and adjust it to the index size and the type of data.
A common rule of thumb:
- When the fragmentation percentage is between 15 and 30: REORGANIZE
- When the fragmentation percentage is greater than 30: REBUILD
REBUILD with the ONLINE option is especially useful when the database has no off-peak window in which index maintenance can run.
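The rule of thumb above can be turned into a query that proposes a command for each index. This is a minimal sketch: the 15% and 30% limits come from the list above, while the 1000-page minimum is an assumption added here to skip small indexes, where the fragmentation percentage means little. The query only generates statements; it does not execute them.

```sql
-- Generate a suggested ALTER INDEX statement per fragmented index.
-- A partitioned index appears once per partition.
SELECT 'ALTER INDEX ' + QUOTENAME(i.name)
       + ' ON ' + QUOTENAME(OBJECT_SCHEMA_NAME(ips.object_id))
       + '.'    + QUOTENAME(OBJECT_NAME(ips.object_id))
       + CASE
             WHEN ips.avg_fragmentation_in_percent > 30 THEN ' REBUILD;'
             ELSE ' REORGANIZE;'
         END AS suggested_command,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id  = ips.index_id
WHERE i.type IN (1, 2)                            -- clustered and nonclustered rowstore indexes
  AND ips.alloc_unit_type_desc = 'IN_ROW_DATA'
  AND ips.avg_fragmentation_in_percent >= 15
  AND ips.page_count >= 1000                      -- assumed minimum size, adjust as needed
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

Review the generated statements before running them, and add WITH (ONLINE = ON) to the REBUILD commands where an offline rebuild would block the workload.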
Tip : Rebuild All Database Indexes
Manually rebuilding indexes can be time-consuming and error prone. A Microsoft Engineer has provided a stored procedure that builds a list of indexes in the database and rebuilds or reorganizes them as appropriate.
The code can be found here: https://raw.githubusercontent.com/yochananrachamim/AzureSQL/master/AzureSQLMaintenance.txt
After you create the stored procedure, run as follows:
EXEC AzureSQLMaintenance 'index'
We recommend you inspect and change the code as required. Run the procedure when the database is not busy. The procedure is processor and I/O intensive.
Conclusion
In conclusion, fragmentation is a common issue in SQL Server databases that can negatively impact query performance and overall database efficiency. It occurs when data pages of tables and indexes are not stored efficiently, leading to various problems. Here are the key takeaways regarding fragmentation:
- Types of Fragmentation: Fragmentation in SQL Server can be categorized into two main types: index fragmentation and data fragmentation.
- Index Fragmentation: Index fragmentation occurs when the data pages of an index are not stored efficiently. It can be further divided into internal fragmentation (within index pages) and external fragmentation (physical storage on disk).
- Data Fragmentation: Data fragmentation refers to fragmentation within the actual data pages of tables, often caused by data modifications.
- Problems Caused by Fragmentation: An increase in fragmentation percentage can lead to slower query performance, increased disk I/O, reduced buffer pool efficiency, more frequent maintenance overhead, locking and blocking issues, reduced backup and restore performance, and space wastage.
- Managing Fragmentation: To address fragmentation, you should regularly monitor fragmentation levels, prioritize maintenance based on fragmentation severity, schedule regular index maintenance, and adjust the maintenance frequency based on your database’s specific needs.
- Index Maintenance: Index maintenance involves index rebuilding and reorganizing. Rebuilding creates a new index, while reorganizing rearranges the existing index pages to reduce fragmentation. Choose the appropriate method based on the level of fragmentation and database requirements.
- Database Maintenance Plans: A well-structured database maintenance plan that includes regular index maintenance is essential for maintaining optimal database performance.
- Balance Maintenance Overhead: It’s important to strike a balance between proactive index maintenance and the potential overhead it may introduce. Consider your database’s specific needs and usage patterns when deciding on a maintenance schedule.
- Regular Monitoring: Regularly monitoring and maintaining your indexes is crucial for ensuring that your database continues to operate efficiently and that fragmentation is kept in check.