Backend Performance Optimization: Slow Query Analysis and SQL Tuning

Description
A slow query is an SQL statement whose execution time exceeds a predefined threshold; such statements are a common source of database performance bottlenecks. Slow query analysis is the process of monitoring, locating, and optimizing these statements, and it draws on SQL writing, index usage, and database configuration. In high-concurrency systems, a query that takes only 100 milliseconds longer than it should can still bring the system down: each request holds its connection longer, so requests queue up faster than they complete.
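
The accumulation effect can be made concrete with a little arithmetic. The sketch below assumes a fixed pool of connections that each run one query at a time; the pool size, request rate, and latencies are hypothetical numbers chosen for illustration, not measurements.

```python
def max_throughput(pool_size: int, query_seconds: float) -> float:
    """Upper bound on queries per second when each connection runs one query at a time."""
    return pool_size / query_seconds


def backlog_growth(arrival_qps: float, pool_size: int, query_seconds: float) -> float:
    """Requests per second that pile up once arrivals exceed capacity (0 if they do not)."""
    return max(0.0, arrival_qps - max_throughput(pool_size, query_seconds))


# Hypothetical workload: 50 connections, 1000 requests/s,
# and a query whose latency grows from 20 ms to 120 ms.
fast = backlog_growth(1000, 50, 0.020)
slow = backlog_growth(1000, 50, 0.120)
print(f"backlog growth at 20 ms:  {fast:.0f} req/s")
print(f"backlog growth at 120 ms: {slow:.0f} req/s")
```

Under these assumptions the extra 100 milliseconds drops capacity below the arrival rate, so the queue grows every second until requests time out.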

Problem-Solving Process

1. Discovering Slow Queries
Step Description: First, identify which SQL statements are considered "slow queries".

  • Enable Slow Query Log: Set slow_query_log=ON in MySQL and define a threshold in seconds (e.g., long_query_time=1).
  • Utilize Monitoring Tools: Analyze logs using tools like pt-query-digest from Percona Toolkit, or capture slow SQL in real-time via APM (e.g., SkyWalking).
  • Key Metrics: Focus on execution time, rows scanned, rows returned, and lock wait time.
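
The threshold idea behind the slow query log can also be sketched at the application level. The example below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and timed_query and slow_log are hypothetical names, not part of any MySQL tooling.

```python
import sqlite3
import time

LONG_QUERY_TIME = 1.0  # seconds; mirrors the long_query_time=1 example above

slow_log = []  # hypothetical in-memory stand-in for a slow query log


def timed_query(conn, sql, params=(), threshold=LONG_QUERY_TIME):
    """Run a query and record it in slow_log if it takes longer than threshold."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    if elapsed > threshold:
        slow_log.append({"sql": sql, "seconds": elapsed, "rows_returned": len(rows)})
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
conn.executemany("INSERT INTO t (v) VALUES (?)", [(i,) for i in range(1000)])

# A threshold of 0 flags practically every query, which makes the mechanism visible.
timed_query(conn, "SELECT COUNT(*) FROM t WHERE v % 7 = 0", threshold=0.0)
print(slow_log)
```

In production the server-side slow query log is usually preferable, since it also captures rows examined and lock time, which this wrapper cannot see.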

2. Analyzing Execution Plans
Step Description: Use the EXPLAIN command to parse the SQL execution path and locate bottlenecks.

  • Check the type Field: ALL (full table scan) or index (full index scan) usually means that no index is narrowing the rows read; prefer access types such as range, ref, eq_ref, or const where possible.
  • Examine the Extra Field:
    • Using filesort: Sorting is not using an index; ORDER BY needs optimization.
    • Using temporary: A temporary table is used, common in GROUP BY or complex JOIN operations.
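    • Using where: The server filters rows with the WHERE condition after the storage engine returns them. This does not by itself mean a table lookup; rows are read from the table whenever Using index is absent from Extra.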
  • Focus on the rows Field: When the estimated number of scanned rows is much larger than the returned rows, the index may be ineffective.
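
The before-and-after workflow of reading a plan can be tried without a MySQL server. The sketch below is an assumption-laden stand-in: it uses SQLite's EXPLAIN QUERY PLAN, whose output differs from MySQL's EXPLAIN (there are no type, Extra, or rows columns). Roughly, SCAN plays the role of type=ALL and SEARCH ... USING INDEX the role of an indexed lookup. The table and index names are hypothetical.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(5000)],
)

query = "SELECT * FROM orders WHERE user_id = ?"

before = plan(conn, query, (42,))  # no usable index yet: expect a full scan
conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
after = plan(conn, query, (42,))   # the planner can now search the index

print("before:", before)
print("after: ", after)
```

The same habit applies to MySQL: capture the plan, change one thing, and capture it again rather than guessing which index the optimizer will pick.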

3. Index Optimization
Step Description: Adjust indexes based on issues identified in the execution plan.

  • Avoid Redundant Indexes: For example, a separate index (a) is redundant if a composite index (a, b) already exists.
  • Leftmost Prefix Principle: A composite index is usable only when the query constrains its leading column(s) (e.g., index (a, b) can serve WHERE a=1 or WHERE a=1 AND b=2, but generally not WHERE b=1 alone).
  • Covering Index Optimization: Include SELECT fields in the index as much as possible to avoid table lookups (e.g., when the index includes (a, b), querying SELECT a, b FROM table WHERE a=1 can return data directly from the index).
  • Index Selectivity Principle: Prioritize indexes on high-cardinality fields with many distinct values (e.g., user ID); indexes on low-cardinality fields (e.g., gender) filter out few rows and are less effective.
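
The leftmost prefix and covering index rules above can be observed directly in a query planner. This is a minimal sketch using SQLite as a stand-in for MySQL, so the plan wording differs from MySQL's EXPLAIN; the items table and idx_ab index are hypothetical.

```python
import sqlite3


def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c TEXT)")
conn.executemany(
    "INSERT INTO items (a, b, c) VALUES (?, ?, ?)",
    [(i % 50, i % 7, f"row-{i}") for i in range(2000)],
)
conn.execute("CREATE INDEX idx_ab ON items (a, b)")

# Leading column constrained, and every selected column is in the index: covering.
covering = plan(conn, "SELECT a, b FROM items WHERE a = 1")

# Leading column constrained, but c is not in the index: index plus table lookup.
lookup = plan(conn, "SELECT c FROM items WHERE a = 1")

# Only the second column constrained: the index cannot narrow the search.
no_prefix = plan(conn, "SELECT c FROM items WHERE b = 1")

for label, p in (("covering", covering), ("lookup", lookup), ("no_prefix", no_prefix)):
    print(label, p)
```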

4. SQL Statement Rewriting
Step Description: Eliminate inefficient operations by rewriting the statement itself.

  • Avoid SELECT *: Query only necessary fields to reduce data transfer and memory usage.
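  • Use JOIN Instead of Subqueries: Correlated subqueries in particular may be re-evaluated for every outer row, and older MySQL versions often executed WHERE ... IN (subquery) this way; rewriting as a JOIN lets the optimizer use indexes on the join columns. Confirm with EXPLAIN, since newer optimizers may already convert such subqueries to semi-joins.
  • Pagination Optimization: When the offset is large (e.g., LIMIT 10000,10), the server still reads and discards the first 10000 rows. Prefer keyset pagination: remember the last id of the previous page and query WHERE id > last_id ORDER BY id LIMIT 10. A literal WHERE id > 10000 returns the same page only if ids are contiguous and start at 1.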
  • Batch Operations Instead of Loops: For example, use INSERT INTO table VALUES (a1,b1),(a2,b2) instead of multiple single-row inserts.
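
The pagination rewrite is easy to get subtly wrong, so here is a small sketch of keyset pagination checked against OFFSET pagination. It uses SQLite as a stand-in for MySQL; the events table and the helper names are hypothetical, and rows are deleted on purpose to leave gaps in the id sequence.

```python
import sqlite3

PAGE = 10

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)",
    [(i, f"event-{i}") for i in range(1, 20001)],
)
conn.execute("DELETE FROM events WHERE id % 3 = 0")  # leave gaps in the ids


def page_by_offset(offset):
    """Classic pagination: the engine must walk past `offset` rows first."""
    sql = "SELECT id FROM events ORDER BY id LIMIT ? OFFSET ?"
    return [r[0] for r in conn.execute(sql, (PAGE, offset))]


def page_after(last_seen_id):
    """Keyset pagination: seek directly to the first id after the previous page."""
    sql = "SELECT id FROM events WHERE id > ? ORDER BY id LIMIT ?"
    return [r[0] for r in conn.execute(sql, (last_seen_id, PAGE))]


previous_page = page_by_offset(9990)          # the page just before offset 10000
keyset_page = page_after(previous_page[-1])   # continue from its last id
offset_page = page_by_offset(10000)

print(keyset_page == offset_page)             # same rows, without the deep offset
print(page_after(10000) == offset_page)       # literal id > 10000 differs once ids have gaps
```

The trade-off is that keyset pagination only supports moving to the next or previous page from a known position, not jumping to an arbitrary page number.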

5. Database Parameter Tuning
Step Description: Adjust configurations to match query characteristics.

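  • Adjust Buffer Pool: For example, set innodb_buffer_pool_size to 70% to 80% of memory on a dedicated database server to improve the cache hit rate.
  • Optimize Sorting and Temporary Tables: A larger sort_buffer_size lets more filesort operations finish in memory instead of spilling to disk; it is allocated per session, so raise it cautiously. The size of in-memory temporary tables is governed separately by settings such as tmp_table_size and max_heap_table_size.
  • Connection Limits: Cap server connections with max_connections and size application connection pools below that limit, so that excessive concurrency does not slow down overall performance.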
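
As a rough aid for the buffer pool guideline, the helper below turns a machine's memory size into a candidate innodb_buffer_pool_size. It is a sketch under stated assumptions: the 0.75 fraction, the 128 MiB chunk size, and the instance count are illustrative defaults that should be checked against the MySQL version and configuration in use, and suggest_buffer_pool_bytes is a hypothetical name.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def suggest_buffer_pool_bytes(total_ram_bytes, fraction=0.75,
                              chunk_bytes=128 * MIB, instances=1):
    """Take a fraction of RAM, rounded down to a multiple of chunk_bytes * instances.

    Assumption: the server expects the buffer pool size to be a multiple of
    chunk size times instance count; verify this for your MySQL version.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    unit = chunk_bytes * instances
    target = int(total_ram_bytes * fraction)
    return max(unit, (target // unit) * unit)


size = suggest_buffer_pool_bytes(32 * GIB)
print(f"innodb_buffer_pool_size = {size // MIB}M")
```

The result is only a starting point; the cache hit rate and the memory needed by other processes on the host should decide the final value.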

6. Architectural-Level Optimization
Step Description: When single-machine optimization reaches its limits, architectural improvements are needed.

  • Read/Write Separation: Route slow queries (e.g., complex report SQL) to read-only replicas to avoid impacting writes on the primary database.
  • Caching Strategy: Add Redis caching for complex queries whose results rarely or never change (e.g., historical data statistics).
  • Data Splitting: Split databases by business domain (e.g., user database, order database) to isolate load, and shard large tables horizontally (e.g., by user ID) to reduce single-table data volume.
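
The caching strategy is commonly implemented as a cache-aside read: check the cache, fall back to the database on a miss, then store the result. The sketch below illustrates that pattern only. TTLCache is a hypothetical in-process stand-in for Redis, SQLite stands in for MySQL, and the sales table and key format are invented for the example.

```python
import sqlite3
import time


class TTLCache:
    """Hypothetical in-process stand-in for Redis: get/set with expiry."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


def monthly_total(conn, cache, month, stats):
    """Cache-aside read of an expensive aggregate over historical data."""
    key = f"report:monthly_total:{month}"
    cached = cache.get(key)
    if cached is not None:
        stats["hits"] += 1
        return cached
    stats["misses"] += 1
    sql = "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE month = ?"
    total = conn.execute(sql, (month,)).fetchone()[0]
    cache.set(key, total, ttl_seconds=3600)
    return total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("2023-01", 100), ("2023-01", 250), ("2023-02", 75)])

cache, stats = TTLCache(), {"hits": 0, "misses": 0}
first = monthly_total(conn, cache, "2023-01", stats)   # miss: runs the SQL
second = monthly_total(conn, cache, "2023-01", stats)  # hit: served from the cache
print(first, second, stats)
```

This fits historical statistics well because stale entries are harmless; data that changes often would also need explicit invalidation on writes.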

Summary
Slow query optimization requires a multi-dimensional approach combining monitoring, execution plan analysis, index design, and SQL refactoring. It is an iterative process. In practice, establish a slow query alert mechanism and regularly review high-frequency SQL to systematically improve database performance.