I post articles that are related to daily work and not commonly shown in textbooks. Please follow me or throw me a thumb if you find this article useful.
If you are looking for quick tips to tune the SQL performance, please check my first article: SQL Performance Tuning.
Hardware Upgrade < System Design < Database Design and Architecture.
Hardware Upgrade > System Design > DB D&A > SQL Query & Index
For API-based databases, there are principles for optimizing SQL performance:
- Reduce data request: proper data types and compress set up and proper index utilization.
- CPU: Avoid sorting and whole table scan.
- Reduce data returns: only return data that are needed to reduce Disk IO and Internet IO.
- Reduce interactions: DML sets and stored procedures help reduce DB connection times.
Summarize them into 2 topics: Index Management and Volume Reduce. We will focus on Index Management.
SQL Performance Tuning listed situations where SQL does not use an index.
First, let us go through the underlying concept of the Index.
Note: Morden DB has upgrades and tuning of this concept. This article only illustrates the basic architectural design of the index system.
Index is like the “content” of the book. A table without a primary key is organized randomly row by row on the disk, instead, a table with a primary key index(or any other clustered index) is organized based on the b+ tree algorithm.
The tree algorithm will reduce the complexity from O(n) to O(Log n) as math.log(number of rows, number of branches).
A Non-Clustered index logical order of index does not match physical stored order on the disk. The indexes on the above charts are all connected, thus leading to an organized physical order. A Non-Clustered Index is independent of other Non-Clustered indexes. It is also slower but smaller compared to Clustered Index.
However, DB has to rebuild the index after DML of update, insert and delete. We can dive deeper into this topic:
Insert: insert will insert the data into the table and then into the index, thus insert will require index update for each option.
Not necessary to read: During the operation, SQL will check this value in the index. If the value is there, the new value will be inserted into the free space right behind; if no value matches, the value will be added to the leaf, then the value and the pointer will be added to the branch node. During this process, if the page is full, a new page will be created and half the index will be moved to the new page.
Delete: When a record is deleted, all related index will be deleted. The cost is smaller compared to insert.
Update: Only the field that has index will be updated. The cost is smaller compared to insert.
Please google for “creating automatic rebuilding SQL Index maintenance plan”.
As a DBA, we need to evaluate the considerations and priorities of DML operations(OLAP/OLTP) and choose the amount, choice, and maintenance plan for the index.
To debug long queries and see how index can help with SSMS execution plan:
Identity Operations in the plan: Clustered Index Scan, Clustered Index Seek, Compute Scalar, Delete, Filter, Hash Match, Insert.
Optimize Related Operations:
Table Scan: Create clutered index.
Clustered Index Scan: Happens when there is non-clustered index inactive. Create a non-clustered index to help.
Hash Join: No index when two tables are joined. Created index on the columns used to join tables.
Nested Loops: Non-Clustered Index does not cover all the columns in the select.
RID Lookup: In the table, you have a non-clustered index but do not have clustered index. In this case, engine will search based on row ID. The clustered index should be created on this table.