Execution Plan Analysis for SQL Query Performance Optimization

Problem Description
An execution plan is a detailed, step-by-step description, generated by the database optimizer, of how a SQL query will be executed. When a query runs slowly, how can we identify the performance bottleneck by analyzing its execution plan, and what effective optimizations can we propose?

Solution Process

  1. Obtaining the Execution Plan
    First, you need to know how to ask the database how it intends to execute your SQL statement. The command varies slightly across database management systems:

    • MySQL: Add EXPLAIN or EXPLAIN FORMAT=JSON before the SQL statement. For example: EXPLAIN SELECT * FROM users WHERE name = 'Alice';
    • PostgreSQL: Use EXPLAIN or EXPLAIN (ANALYZE, BUFFERS). ANALYZE will actually execute the statement and show real runtime, while BUFFERS displays cache usage.
    • Oracle: Use EXPLAIN PLAN FOR, then query the PLAN_TABLE.
    The crucial first step is to obtain the execution plan; it is your window into the database's "thought process."
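    For instance, the PostgreSQL and Oracle variants above might look like this in practice (reusing the users table from the MySQL example; DBMS_XPLAN.DISPLAY is one common way to read PLAN_TABLE):

    -- PostgreSQL: ANALYZE actually executes the query and reports real timings
    EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM users WHERE name = 'Alice';

    -- Oracle: populate PLAN_TABLE, then display it
    EXPLAIN PLAN FOR SELECT * FROM users WHERE name = 'Alice';
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);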
  2. Understanding Key Metrics
    Execution plans are usually displayed as a table or tree structure with many columns. You need to focus on the following key metrics:

    • Access Type (the type column in MySQL): This describes how the database looks up rows in a table. Common values, from most to least efficient, are:
      • system / const: Looking up a single row via primary key or unique index, fastest.
      • eq_ref: In join queries, using primary key or unique index for association.
      • ref: Using a non-unique index for lookup.
      • range: Using an index for range scans (e.g., BETWEEN, IN, >).
      • index: Full index scan (traversing the entire index tree).
      • ALL: Full table scan (scanning all rows in the table), least efficient, must be avoided for large datasets.
    • Possible Keys: Which indexes the optimizer considers potentially usable.
    • Key (Actual Key Used): Which index the optimizer finally chooses. If this is NULL, no index is used.
    • Rows: The estimated number of rows the database needs to scan to return the result. Lower is better.
    • Extra Information (Extra): Contains very important details, such as:
      • Using where: The server layer applies additional filtering after the storage engine returns rows. If Rows is large but few rows survive the filter, much of that work is wasted.
      • Using index: All data required by the query can be obtained from the index ("covering index"), eliminating the need to look up the data rows ("table lookup"), which is highly efficient.
      • Using temporary: Requires creating a temporary table to handle the query (e.g., GROUP BY, ORDER BY on non-indexed columns), often needs optimization.
      • Using filesort: Cannot use an index to complete sorting, requires sorting in memory or on disk, which is less efficient.
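    To see the last two flags in action, here is a minimal sketch (it assumes the orders table from the next step, where created_at and product_id are not indexed; actual Extra values vary by MySQL version and table statistics):

    -- No index on created_at, so the sort cannot use an index tree:
    EXPLAIN SELECT * FROM orders ORDER BY created_at LIMIT 10;
    -- Extra typically shows "Using filesort"

    -- Grouping on a non-indexed column usually needs a temporary table:
    EXPLAIN SELECT product_id, COUNT(*) FROM orders GROUP BY product_id;
    -- Extra typically shows "Using temporary"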
  3. Analyzing the Execution Plan: A Step-by-Step Example
    Suppose we have an orders table with fields id (primary key), user_id, product_id, amount, created_at, and an index on user_id.
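
    A minimal MySQL sketch of that schema (the index name idx_user_id is an assumption, chosen to match the EXPLAIN output below):

    CREATE TABLE orders (
      id         BIGINT PRIMARY KEY AUTO_INCREMENT,
      user_id    BIGINT NOT NULL,
      product_id BIGINT NOT NULL,
      amount     DECIMAL(10, 2) NOT NULL,
      created_at DATETIME NOT NULL,
      INDEX idx_user_id (user_id)  -- the existing single-column index
    );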

    Slow Query: SELECT * FROM orders WHERE user_id = 100 AND amount > 50;

    Step 1: Obtain the Execution Plan (using MySQL as an example)
    EXPLAIN SELECT * FROM orders WHERE user_id = 100 AND amount > 50;

    Step 2: Interpret the Results
    Suppose you see the following key information:

    • type: ref (Good, using index lookup)
    • key: idx_user_id (Indeed using the user_id index)
    • rows: 500 (Estimates scanning 500 rows)
    • Extra: Using where (each matching index entry triggers a table lookup for the full row, which is then filtered on amount > 50 at the server layer)
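
    In tabular form, that EXPLAIN row would look roughly like this (values illustrative, some columns omitted):

    +----+-------------+--------+------+---------------+-------------+------+-------------+
    | id | select_type | table  | type | possible_keys | key         | rows | Extra       |
    +----+-------------+--------+------+---------------+-------------+------+-------------+
    |  1 | SIMPLE      | orders | ref  | idx_user_id   | idx_user_id |  500 | Using where |
    +----+-------------+--------+------+---------------+-------------+------+-------------+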

    Step 3: Identify the Bottleneck
    Although the query uses an index, the bottleneck is Using where. The database first uses the idx_user_id index to quickly find all records with user_id=100 (say 500 records). Then, it must use the primary key id of these 500 records to go back to the main data table (a process called "table lookup") to fetch the complete data for each row. Finally, it checks whether amount in these rows is greater than 50. If only 10 records ultimately meet the condition, most of the table lookups are wasted.
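
    On MySQL 8.0.18 and later, EXPLAIN ANALYZE can confirm this empirically: it actually runs the query and reports estimated versus actual row counts at each step (exact output format varies by version):

    -- Runs the query for real; compare estimated rows to actual rows
    EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 100 AND amount > 50;
    -- Expect the index lookup on idx_user_id to return ~500 rows,
    -- with the filter on amount > 50 discarding all but ~10 of them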

  4. Proposing Optimization Solutions
    Based on the above analysis, the optimization goal is to reduce both the number of table lookups and the amount of data that must be filtered afterward.

    • Solution 1: Create a Composite Index
      Create an index that covers both the lookup condition and the filter condition, for example a composite index on (user_id, amount); a DDL sketch follows this list.

      • Principle: On the (user_id, amount) index tree, the database can directly find all records with user_id=100, and because those records are sorted by amount within the index, it can quickly locate the ones with amount > 50. More importantly, if the query only needs user_id, amount, and id (which InnoDB stores in every secondary index), the index already contains all the required information (a "covering index"), avoiding table lookups altogether. In our example the query is SELECT *, which needs all columns, so table lookups cannot be eliminated entirely, but they drop drastically (from about 500 to around 10, matching the final result count).
      • Re-check the Execution Plan: After creating the index, run EXPLAIN again. You might see type: range (due to the range query amount>50), key showing the newly created composite index, and Extra might show Using index condition (indicating Index Condition Pushdown, ICP, a more efficient filtering method). Performance is significantly improved.
    • Solution 2: Rewrite the Query (if applicable)
      Sometimes you can guide the optimizer toward a better plan by changing how the query is written: for example, if business logic allows, turn a range predicate into an equality predicate. Optimizer hints (such as USE INDEX) are also available, but they are usually a last resort.
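
    A sketch of Solution 1 in MySQL (the index name idx_user_amount is a hypothetical choice):

    -- Composite index: equality column first, range column second
    CREATE INDEX idx_user_amount ON orders (user_id, amount);

    -- Re-check the plan; expect type: range and key: idx_user_amount
    EXPLAIN SELECT * FROM orders WHERE user_id = 100 AND amount > 50;

    -- The old single-column index is now redundant (the composite index
    -- also serves lookups on user_id alone) and can usually be dropped:
    -- DROP INDEX idx_user_id ON orders;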

Summary
Analyzing execution plans is a "diagnostic" process: Obtain the plan -> Interpret key metrics (pay special attention to access type and extra information) -> Identify performance bottlenecks (often full table scans, temporary tables, filesorts, or excessive table lookups) -> Propose targeted optimization solutions (most commonly creating appropriate composite/covering indexes). Mastering this workflow is core to solving SQL performance issues.