SQL Performance Tuning

Sean Zhang
3 min readJul 18, 2022

--

SQL Textbooks teach us how to get the results by writing queries. In the production environment, query execution time matters just as much since it influences user experience or even leads to a production incident of “Runtime Out” due to platform threshold setup. For example, Tableau Server Admin usually set up a “runtime threshold” to automatically fail the backgrounder extract refresh if a query runs too long.

There are two major topics when we talk about SQL performance: Index Utilization and Avoiding Whole Table Reference. Before you apply any of these practices to your slow query, try to use EXPLAIN or EXPLAIN QUERY PLAN on the query to see how the database interprets it.

Here are the tips/tricks/best practices I have utilized in my SQL journey:

  1. String search

When we perform LIKE %%, SQL automatically gives up index reference.

--Before Tuning
SELECT * FROM USER WHERE USER_NAME LIKE '%Sean%'
--After Tuning
SELECT * FROM USER WHERE USER_NAME LIKE 'Sean%'

2. Avoid OR and IN Operator

OR operator is the “behind logic” of IN operator. E.g.

USER_ID IN (1, 3, 5) is translated into USER_ID = 1 OR USER_ID = 3 OR USER_ID = 5. Both IN and OR operators require SQL to rebuild the execution plan, which can be an expensive step sometimes.

--For OR Operator
--Before Tuning
SELECT * FROM USER WHERE USER_ID = 1 OR USER_ID = 5
--After Tuning
SELECT * FROM USER WHERE USER_ID = 1
UNION
SELECT * FROM USER WHERE USER_ID = 5
--For IN or NOT IN operators
----If it is numeric filed:
--Before Tuning:
SELECT * FROM USER WHERE USER_ID IN (1,2,3,4,6)
--After Tuning:
SELECT * FROM USER WHERE USER_ID BETWEEN (1,6) AND USER_ID!=5
----If it is string field:
USE EXISTS in a subquery

Utilizing these practices can be grammarly redundant sometimes. You don’t have to abandon the SQL-told way if the query runs within seconds. These practices can help if you have a long execution time.

3. WHERE 1=1

This query has been used by me for quite a long time since it makes the indent looks much neater in SQL. I just learned that WHERE 1=1 can actually cause SQL to do a table scan instead of an index scan. Need to validate this one!

4. ORDER BY aligning with WHERE

When the fields in ORDER BY and WHERE are the same, the index can be utilized more efficiently compared to when they are not.

A lot of times ORDER BY can be unnecessary when your query result is just fed into another platform like Tableau, where they will be sorted again, either on the same field or even a different field. If ORDER BY is needed, the numeric field is faster than the string(varchar) field.

5. UNION ALL vs UNION

UNION ALL is faster since UNION removes duplicates. If you are sure that there is no duplicate, use UNION ALL. If you are not sure about the duplicate, you can use UNION ALL and then remove duplicates in your code later, especially in the situation when you have to perform multiple UNIONs.

6. Multiple JOINs

When you are joining multiple tables together, try to put the big tables at first and smaller tables at last. MS SQL and Oracle SQL read tables from the right to the left and reading larger tables first makes the query faster. However, MySQL uses the opposite read sequence: you should put smaller tables first for MySQL.

7. Functions

A stored Procedure generally performs faster than a function. It is also more powerful compared to functions.

8. The WHERE magic

SQL interprets WHERE clause from the bottom to the top. Putting the WHERE clause which can filter out most data at last.

--

--

Sean Zhang
Sean Zhang

Written by Sean Zhang

Data Science | Machine Learning| Data Engineer

Responses (1)