WRONG BELIEFS ABOUT SQL INDEXES

Mohammed Ragab

3 min readMar 9, 2019

1- Databases don’t need indexes

Many of developers developed and deployed database without indexes

For these reasons

- “It’s a small database that won’t get much data.”

- “It’s just a proof of concept and won’t be around for long.”

- “It’s not an important application, so performance isn’t important.”

All of these reasons are easy to break down, even databases that are expected to be small can start growing quickly.

unimportant database and application wouldn’t have been created if there weren’t a need or someone wasn’t interested in expending resources for the features and fitting a database into memory doesn’t mean it will be fast

· Try to run the following query and see the I/O Statistics for Table with No Index

SELECT * INTO MyTable FROM Sales.SalesOrderDetail;GOSET STATISTICS IO ON SET NOCOUNT ONGOSELECT SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber,OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal FROM MyTable WHERE CarrierTrackingNumber = ‘4911–403C-98’;GOSET STATISTICS IO OFFGOTable ‘MyTable’. Scan count 1, logical reads 1496, physical reads 0, read-ahead reads 0, lob logical reads 0,lob physical reads 0, lob read-ahead reads 0.

Try to add an index to “CarrierTrackingNumber” Column and run the query and see the I/O Statistics for Table with index

Table MyTable. Scan count 1, logical reads 15 physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

It could be argued that 1,496 isn’t a lot of input/output (I/O). This might be true given the size of some databases

and the amount of data in today’s world. But the I/O of a query shouldn’t be compared to the performance

of the rest of the world; it needs to be compared to it's potential I/O, the needs of the application, and the platform on which it is deployed.

2- Primary Keys Are Always Clustered

If a primary key is created after the clustered index is built, then the primary key will be created as a nonclustered index.

Try to run the following query and see the information about created indexes

CREATE TABLE dbo.MyTable1 ( RowID int NOT NULL,Column1 nvarchar(128),Column2 nvarchar(128));ALTER TABLE dbo.MyTable1 ADD CONSTRAINT PK_MyTable1 PRIMARY KEY (RowID);GOCREATE TABLE dbo.MyTable2(RowID int NOT NULL,Column1 nvarchar(128),Column2 nvarchar(128));CREATE CLUSTERED INDEX CL_MyTable2 ON dbo.MyTable2 (RowID);ALTER TABLE dbo.MyTable2 ADD CONSTRAINT PK_MyTable2 PRIMARY KEY (RowID);GOSELECT OBJECT_NAME(object_id) AS table_name ,name ,index_id ,type ,type_desc,is_unique ,is_primary_keyFROM sys.indexes WHERE object_id IN (OBJECT_ID(‘dbo.MyTable1’),OBJECT_ID(‘dbo.MyTable2’));· First, I created table1 with Primary key in first then I created· The table2 with clustered index in first then I created the primary keytable_name name index_id type type_desc is_unique is_primary_keyMyTable1 PK_MyTable1 1 1 CLUSTERED 1 1MyTable2 CL_MyTable2 1 1 CLUSTERED 0 0MyTable2 PK_MyTable2 2 2 NONCLUSTERED 1 1Table2 have clustered index and the primary key created with NONCLUSTERED

3- Any Column Can Be Filtered in Multicolumn Indexes

An index does not need to use all the columns in a table. It does, however, need to start with the

leftmost column in an index and use the columns from left to right, in order. This is why the order of the columns in an index is so important.

To demonstrate this wrong belief, try the following query create the table

Mytable4 based on Sales.SalesOrderHeader with a primary key on SalesOrderID

In “AdventureWorks2016” database and create indexes for each the following columns (OrderDate, DueDate, ShipDate)

IF OBJECT_ID(‘dbo.MyTable4’) IS NOT NULLDROP TABLE dbo.MyTable4GOSELECT SalesOrderID, OrderDate, DueDate, ShipDate INTO dbo.MyTable4 FROM Sales.SalesOrderHeader;GOALTER TABLE dbo.MyTable4 ADD CONSTRAINT PK_MyTable4 PRIMARY KEY CLUSTERED (SalesOrderID);GOCREATE NONCLUSTERED INDEX IX_MyTable4 ON dbo.MyTable4 (OrderDate, DueDate, ShipDate);GO

Then try to select query filtered by the leftmost (OrderDate)

SELECT OrderDate FROM dbo.MyTable4 WHERE OrderDate = ‘2011–07–17 00:00:00.000’

You found by filtering on the leftmost column, the query uses a seek operation

Then try to select query filtered by the rightmost (ShipDate)

SELECT OrderDate FROM dbo.MyTable4 WHERE ShipDate = ‘2011–07–17 00:00:00.000’

by filtering on the rightmost column, uses a scan operation on IX_MyTable4. Instead of being able to go

directly to the records that match the OrderDate, the query needs to check all records to determine which match

the filter. While the index is used, it isn’t able to actually filter the rows.

4- Clustered indexes store records in physical order.

One of the more pervasive wrong beliefs commonly held is the idea that a clustered index

stores the records in a table in their physical order when on disk. This seems

to be primarily driven by confusion between what is stored on a page and where records are stored on those pages.

WRONG BELIEFS ABOUT SQL INDEXES

1- Databases don’t need indexes

Written by Mohammed Ragab

No responses yet