In J2EE enterprise applications, the performance of the database queries significantly impacts the application’s efficiency. Slow or poorly performed queries delay the response to the end user, decrease productivity, and eventually lead to business loss. There are various techniques that are used for SQL query optimization. In this article, we will learn about SQL optimization techniques that we can use to improve the database query performance.
SQL Query Optimizer techniques
1. Indexing in DBMS
Indexing can be used for SQL query optimization. In general, we create an index on the column that is frequently used for retrieval or that comes inside the where clause. For example, if we have an employee table and we want to fetch all the employees whose salary is $50,000, we will create an index on the salary columns as it will come in the where clause.
<!!--Select all the employees whose salary is 50000 -->>
SELECT * FROM emp_tbl WHERE salary = 50000;
<!-- We create index on salary column as following: -->>
CREATE INDEX sal_index ON emp_tbl(salary);
Internally, the database engine will create a B-tree around the salary column, which actually improves the performance during the retrieval of data using indexed columns.
2. Replace UNION in SQL with UNION ALL operator
When we want to combine the data from two or more SELECT statements and we are aware that the data does not contain duplicate values (say, for example, transaction_id, as we know transaction id is unique for every transaction), we should use the UNION ALL operator instead of UNION. Both the operators combine the data, but the UNION will check for duplication also and slow down the performance, while UNION ALL simply concatenates the result set without removing the duplicate elements.
<!-- UNION operator will combine the results from two query and remove the duplicate elements -->>
SELECT col1, col2 FROM table1
UNION
SELECT col1, col2 FROM table2;
<!-- UNION ALL simply combines the result of two or more query -->>
SELECT col1, col2 FROM table1
UNION ALL
SELECT col1, col2 FROM table2;
3. Use VARCHAR instead of CHAR
When we want to declare a column of string type, we should make it varchar instead of char type. The char may add the trailing spaces. For example, if we declare emp_name char(50) and not if the employee’s name has less than 50 characters, say 30 or 40 only, then extra space will be added.
4. Comparison should be done in terms of equality; avoid using <> or !=
Whenever we want to compare something, we should try using the equals operation instead of not equal (<>) operator, because the <> or!= operators skip the indexes in the table.
<!-- <> this is not equal operator and it will skip the index from the table -->>
SELECT * FROM emp_tbl WHERE salary <> 50000;
5. Use EXISTS in SQL instead of COUNT in SQL
Sometime when we want to check whether some element is present in the table or not, we generally check its count, and if it is greater than 0 or some value, we say the element is present. Instead, we can use the EXISTS or NOT EXISTS command. As the COUNT will do a full table scan, which is time-consuming. EXISTS or NOT EXISTS will also do a full table scan, but as soon as they find the first elements, they complete their execution.
6. Minimum use of DISTINCT in SQL keyword
We should avoid using the DISTINCT keyword where ever possible, as it takes almost double the amount of execution time.
7. Numeric fields should be stored in numeric data structure
In general, we create varchar for most fields. We should use the appropriate data structure to store the data. For example, if we have an age field, this should be stored in numeric format (int) instead of varchar, and this will definitely improve the query performance.
8. Use DESCRIBE in SQL instead of SELECT statement
When we want to check the schema or check the columns of the tables, we do use select statements, and this will work fine. But if we have huge volumes of records in our table and we perform SELECT * FROM table, this may lead to system failure. We can simply check the schema using the describe keyword, or we may use limit with select statements.