SQL NULL Value Handling and Three-Valued Logic Analysis

SQL NULL Value Handling and Three-Valued Logic Analysis

Problem Description:
In SQL, the NULL value represents missing, unknown, or inapplicable data. Due to its special nature, it cannot be handled directly with conventional comparison operators (such as =, <>). Instead, three-valued logic (TRUE, FALSE, UNKNOWN) is introduced. Interviews often test understanding of NULL, including how to detect NULL, the impact of NULL on query results, and the behavior of three-valued logic in WHERE, JOIN, or aggregate operations.

Solution Process:

  1. The Nature of NULL

    • NULL is not equal to an empty string or 0; it signifies "value does not exist." For example, a NULL value in a table's "phone number" field might mean the user did not provide a number, rather than the number being an empty string.
    • Any arithmetic operation (e.g., NULL + 5) or comparison (e.g., NULL = 10) with NULL results in NULL.
  2. Correct Methods for Detecting NULL

    • Use IS NULL or IS NOT NULL instead of = NULL:
      -- Incorrect: WHERE column = NULL (always evaluates to UNKNOWN, returns no results)
      -- Correct: WHERE column IS NULL
      
    • Example: Query all users who have not provided an email:
      SELECT name FROM users WHERE email IS NULL;
      
  3. Behavior of Three-Valued Logic in WHERE Clauses

    • Rows are returned when a condition evaluates to TRUE; they are not returned when it evaluates to FALSE or UNKNOWN.
    • Example:
      SELECT * FROM products WHERE price <> 100;
      
      If a row's price is NULL, the comparison NULL <> 100 results in UNKNOWN, and that row will not be included in the result.
  4. Impact of NULL on JOIN Operations

    • JOINs using = (e.g., INNER JOIN) exclude rows where NULL values are matched because NULL = NULL evaluates to UNKNOWN.
    • Example: When joining two tables on a column that may contain NULL, explicit handling of NULL is needed:
      SELECT * 
      FROM table1 t1
      LEFT JOIN table2 t2 
         ON t1.id = t2.id OR (t1.id IS NULL AND t2.id IS NULL);
      
  5. NULL Handling in Aggregate Functions

    • COUNT(column) ignores NULL values, counting only non-NULL values; COUNT(*) counts all rows.
    • Functions like SUM and AVG ignore NULL, but return NULL if all values are NULL.
    • Example: When calculating the average score, NULL scores are excluded from the calculation:
      SELECT AVG(score) FROM exams; -- Result does not include rows with NULL
      
  6. Using COALESCE or IFNULL to Convert NULL

    • Replace NULL with a default value to avoid calculation errors:
      SELECT name, COALESCE(email, 'Not Provided') AS email_info FROM users;
      

Key Summary:

  • Always use IS NULL to detect NULL, avoiding direct comparison.
  • Three-valued logic may lead to query results differing from expectations; pay special attention in WHERE, JOIN, and aggregate scenarios.
  • Handle NULL using COALESCE, CASE WHEN, or default values to ensure correct data logic.