Heaps (Tables without Clustered Indexes)
SQL Server 2014
A heap is a table without a clustered index. One or more nonclustered
indexes can be created on tables stored as a heap. Data is stored in the heap
without specifying an order. Usually data is initially stored in the order in
which is the rows are inserted into the table, but the Database Engine can move
data around in the heap to store the rows efficiently; so the data order cannot
be predicted. To guarantee the order of rows returned from a heap, you must use
the ORDER BY clause. To specify the order for storage
of the rows, create a clustered index on the table, so that the table is not a
heap.|
|
|
There are sometimes good reasons to leave a table as a heap instead of
creating a clustered index, but using heaps effectively is an advanced skill.
Most tables should have a carefully chosen clustered index unless a good
reason exists for leaving the table as a heap. |
When to Use a Heap
If a table is a heap and does not have any nonclustered indexes, then the entire table must be examined (a table scan) to find any row. This can be acceptable when the table is tiny, such as a list of the 12 regional offices of a company.When a table is stored as a heap, individual rows are identified by reference to a row identifier (RID) consisting of the file number, data page number, and slot on the page. The row id is a small and efficient structure. Sometimes data architects use heaps when data is always accessed through nonclustered indexes and the RID is smaller than a clustered index key.
When Not to Use a Heap
Do not use a heap when the data is frequently returned in a sorted order. A clustered index on the sorting column could avoid the sorting operation.Do not use a heap when the data is frequently grouped together. Data must be sorted before it is grouped, and a clustered index on the sorting column could avoid the sorting operation.
Do not use a heap when ranges of data are frequently queried from the table. A clustered index on the range column will avoid sorting the entire heap.
Do not use a heap when there are no nonclustered indexes and the table is large. In a heap, all rows of the heap must be read to find any row.
Managing Heaps
To create a heap, create a table without a clustered index. If a table already has a clustered index, drop the clustered index to return the table to a heap.To remove a heap, create a clustered index on the heap.
To rebuild a heap to reclaim wasted space, create a clustered index on the heap, and then drop that clustered index.
|
|
|
Creating or dropping clustered indexes requires rewriting the entire
table. If the table has nonclustered indexes, all the nonclustered indexes
must all be recreated whenever the clustered index is changed. Therefore,
changing from a heap to a clustered index structure or back can take a lot of
time and require disk space for reordering data in tempdb. |
Clustered and Nonclustered Indexes
SQL Server 2014
An index is an on-disk structure associated with a table or view that speeds
retrieval of rows from the table or view. An index contains keys built from one
or more columns in the table or view. These keys are stored in a structure
(B-tree) that enables SQL Server to find the row or rows associated with the
key values quickly and efficiently. A table or view can contain the following types of indexes:
·
Clustered
o
Clustered indexes sort and store the data rows
in the table or view based on their key values. These are the columns included
in the index definition. There can be only one clustered index per table,
because the data rows themselves can be sorted in only one order.
o
The only time the data rows in a table are stored
in sorted order is when the table contains a clustered index. When a table has
a clustered index, the table is called a clustered table. If a table has no
clustered index, its data rows are stored in an unordered structure called a
heap.
·
Nonclustered
o
Nonclustered indexes have a structure separate
from the data rows. A nonclustered index contains the nonclustered index key
values and each key value entry has a pointer to the data row that contains the
key value.
o
The pointer from an index row in a nonclustered
index to a data row is called a row locator. The structure of the row locator
depends on whether the data pages are stored in a heap or a clustered table.
For a heap, a row locator is a pointer to the row. For a clustered table, the
row locator is the clustered index key.
o
You can add nonkey columns to the leaf level of
the nonclustered index to by-pass existing index key limits, 900 bytes and 16
key columns, and execute fully covered, indexed, queries. For more information,
see Create
Indexes with Included Columns.
Both clustered and nonclustered indexes can be unique. This means no two
rows can have the same value for the index key. Otherwise, the index is not
unique and multiple rows can share the same key value.Indexes are automatically maintained for a table or view whenever the table data is modified.
Indexes and Constraints
Indexes are automatically created when PRIMARY KEY and UNIQUE constraints are defined on table columns. For example, when you create a table and identify a particular column to be the primary key, the Database Engine automatically creates a PRIMARY KEY constraint and index on that column.How Indexes Are Used by the Query Optimizer
Well-designed indexes can reduce disk I/O operations and consume fewer system resources therefore improving query performance. Indexes can be helpful for a variety of queries that contain SELECT, UPDATE, DELETE, or MERGE statements. Consider the query SELECT Title, HireDate FROM HumanResources.Employee WHERE EmployeeID = 250 in the AdventureWorks2012 database. When this query is executed, the query optimizer evaluates each available method for retrieving the data and selects the most efficient method. The method may be a table scan, or may be scanning one or more indexes if they exist.When performing a table scan, the query optimizer reads all the rows in the table, and extracts the rows that meet the criteria of the query. A table scan generates many disk I/O operations and can be resource intensive. However, a table scan could be the most efficient method if, for example, the result set of the query is a high percentage of rows from the table.
When the query optimizer uses an index, it searches the index key columns, finds the storage location of the rows needed by the query and extracts the matching rows from that location. Generally, searching the index is much faster than searching the table because unlike a table, an index frequently contains very few columns per row and the rows are in sorted order.
The query optimizer typically selects the most efficient method when executing queries. However, if no indexes are available, the query optimizer must use a table scan. Your task is to design and create indexes that are best suited to your environment so that the query optimizer has a selection of efficient indexes from which to select. SQL Server provides the Database Engine Tuning Advisor to help with the analysis of your database environment and in the selection of appropriate indexes.