Detailed Explanation of I/O Multiplexing Technology for Backend Performance Optimization

Problem Description
I/O multiplexing is a technique in which a single thread monitors multiple I/O connections simultaneously, addressing the resource waste of the "one thread per connection" model used with traditional blocking I/O. Interviewers often probe the implementation principles, performance differences, and practical application scenarios of select/poll/epoll.

Background of Technological Evolution

  1. Blocking I/O Model: Each connection requires a separate thread for handling, leading to massive thread context switching overhead as the number of connections increases.
  2. Non-blocking I/O Model: The thread never blocks, but it must repeatedly poll every connection, wasting CPU cycles re-checking fds that have no data (see the sketch after this list).
  3. I/O Multiplexing: The operating system provides a unified interface to monitor multiple connections, performing actual I/O operations only when data is ready.
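
To make the cost of model 2 concrete, here is a minimal sketch of a non-blocking read loop; the function name busy_poll and the parameter sock are hypothetical, and sock is assumed to be an already-connected socket. read() returns immediately, and when no data is available it fails with EAGAIN/EWOULDBLOCK, so the loop spins and burns CPU doing nothing useful.

```c
/* Hypothetical helper: busy-polls a socket the caller has already
 * connected. Illustrates why the pure non-blocking model wastes CPU. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

void busy_poll(int sock) {
    /* Put the socket into non-blocking mode. */
    fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK);

    char buf[4096];
    for (;;) {
        ssize_t n = read(sock, buf, sizeof(buf));
        if (n > 0) {
            /* ... handle the data ... */
        } else if (n == 0) {
            break;                                   /* peer closed */
        } else if (errno != EAGAIN && errno != EWOULDBLOCK) {
            break;                                   /* real error */
        }
        /* No data yet: the loop immediately retries, burning CPU. */
    }
}
```

I/O multiplexing replaces this spin with a single blocking call (select/poll/epoll_wait) that puts the thread to sleep until some fd is actually ready.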

Implementation Principle of select

  1. Data Structure: Uses fd_set (file descriptor set) to record monitored connections via a bitmap.
  2. Workflow (see the sketch after this list):
    • The application copies all file descriptors (fds) to be monitored to kernel space.
    • The kernel traverses the fd set to check readiness status.
    • Returns the set of ready fds to the application.
    • The application must traverse all fds to identify ready connections.
  3. Limitations:
    • fd_set has a fixed size (FD_SETSIZE, typically 1024).
    • Each call copies the entire fd set into the kernel, and select overwrites the set in place, so it must be rebuilt before every call.
    • Both the kernel and the application must traverse all fds to find the ready ones.
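
The workflow and limitations above can be seen in a minimal sketch of a select()-based server loop; the function name serve_with_select is hypothetical, and listen_fd is assumed to be a bound, listening TCP socket created by the caller.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>

void serve_with_select(int listen_fd) {
    int clients[FD_SETSIZE];
    int nclients = 0;
    char buf[4096];

    for (;;) {
        /* select() overwrites the set, so rebuild the bitmap every loop. */
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(listen_fd, &readfds);
        int maxfd = listen_fd;
        for (int i = 0; i < nclients; i++) {
            FD_SET(clients[i], &readfds);
            if (clients[i] > maxfd) maxfd = clients[i];
        }

        /* The whole set is copied in and scanned by the kernel: O(maxfd). */
        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0) {
            perror("select");
            break;
        }

        if (FD_ISSET(listen_fd, &readfds) && nclients < FD_SETSIZE - 1) {
            int c = accept(listen_fd, NULL, NULL);
            if (c >= 0) clients[nclients++] = c;
        }

        /* The application must also scan every fd to find the ready ones. */
        for (int i = 0; i < nclients; i++) {
            if (FD_ISSET(clients[i], &readfds)) {
                ssize_t n = read(clients[i], buf, sizeof(buf));
                if (n <= 0) {                 /* EOF or error: drop client */
                    close(clients[i]);
                    clients[i--] = clients[--nclients];
                }
            }
        }
    }
}
```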

Improved Solution: poll

  1. Data Structure: Uses an array of pollfd structures, overcoming the fd quantity limit of select.
  2. Improvements:
    • The caller supplies a pollfd array of any size, removing the fixed FD_SETSIZE cap.
    • Separate events (requested) and revents (returned) fields mean the array does not have to be rebuilt between calls (see the sketch after this list).
  3. Remaining Issues:
    • Still requires copying the full fd set.
    • Both kernel and application need to traverse all connections.
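
Here is the same loop sketched with poll(), under the same assumptions as the select example: the caller-sized pollfd array removes the FD_SETSIZE cap and the events/revents split avoids rebuilding the array, but the kernel still copies and scans the full array on every call.

```c
#include <poll.h>
#include <unistd.h>
#include <sys/socket.h>

#define MAX_CONNS 65536

void serve_with_poll(int listen_fd) {
    static struct pollfd pfds[MAX_CONNS];
    int nfds = 1;
    char buf[4096];

    pfds[0].fd = listen_fd;
    pfds[0].events = POLLIN;                 /* events we want to watch */

    for (;;) {
        if (poll(pfds, nfds, -1) < 0)        /* kernel walks all nfds slots */
            break;

        if ((pfds[0].revents & POLLIN) && nfds < MAX_CONNS) {
            int c = accept(listen_fd, NULL, NULL);
            if (c >= 0) {
                pfds[nfds].fd = c;           /* no rebuild needed next time: */
                pfds[nfds].events = POLLIN;  /* results arrive in .revents   */
                nfds++;
            }
        }
        for (int i = 1; i < nfds; i++) {
            if (pfds[i].revents & POLLIN) {
                ssize_t n = read(pfds[i].fd, buf, sizeof(buf));
                if (n <= 0) { close(pfds[i].fd); pfds[i--] = pfds[--nfds]; }
            }
        }
    }
}
```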

Core Breakthrough of epoll

  1. Three Core Functions:

    • epoll_create: Creates an epoll instance.
    • epoll_ctl: Incrementally adds, modifies, or removes monitored fds (EPOLL_CTL_ADD/MOD/DEL).
    • epoll_wait: Waits for ready events (no need to pass the fd set).
  2. Underlying Mechanisms:

    • Red-black tree stores all monitored fds, ensuring efficient lookup.
    • Ready list stores fds with events.
    • Callback mechanism: Automatically adds fds to the ready list when they become ready.
  3. Trigger Modes Explained:

    • Level-Triggered (LT): Keeps notifying on every epoll_wait call as long as the fd remains readable/writable.
    • Edge-Triggered (ET): Notifies only once per state transition (e.g., from "no data" to "data available"), so the application must drain the fd completely or risk never being woken again (see the sketch below).
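
The following sketch ties the three functions, the registration model, and ET mode together; serve_with_epoll is a hypothetical name, listen_fd is assumed to be a non-blocking listening socket, and error handling is trimmed for brevity.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/socket.h>

#define MAX_EVENTS 1024

void serve_with_epoll(int listen_fd) {
    int epfd = epoll_create1(0);

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[MAX_EVENTS];
    char buf[4096];

    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);  /* only ready fds */
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                int c = accept(listen_fd, NULL, NULL);
                if (c < 0) continue;
                /* ET requires non-blocking sockets. */
                fcntl(c, F_SETFL, fcntl(c, F_GETFL, 0) | O_NONBLOCK);
                struct epoll_event cev = { .events = EPOLLIN | EPOLLET,
                                           .data.fd = c };
                epoll_ctl(epfd, EPOLL_CTL_ADD, c, &cev);   /* register once */
            } else {
                /* ET mode: drain the fd until read() reports EAGAIN. */
                for (;;) {
                    ssize_t r = read(fd, buf, sizeof(buf));
                    if (r > 0) continue;                   /* keep draining */
                    if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                        break;                             /* fully drained */
                    close(fd);                             /* EOF or error */
                    break;
                }
            }
        }
    }
}
```

Note the drain-until-EAGAIN loop in the client branch: it is exactly the ET-mode requirement described above, and it is why ET mandates non-blocking sockets.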

Performance Comparison Experimental Data

  1. Performance as Connection Count Increases:
    • select/poll: Per-call cost grows linearly with the number of watched fds, so throughput declines O(n).
    • epoll: Per-call cost depends only on the number of ready events, so throughput stays essentially constant.
  2. In a test with 10,000 concurrent connections:
    • select/poll CPU usage exceeds 80%.
    • epoll CPU usage stays below 5%.

Practical Application Scenarios

  1. Nginx Network Model: Uses epoll + non-blocking I/O to handle massive concurrent connections.
  2. Redis Event-Driven Architecture: Implements single-threaded high-performance network processing based on epoll.
  3. Game Servers: Uses ET mode to reduce event notification frequency and boost performance.

Key Points for Programming Practice

  1. ET Mode Considerations:
    • Must use non-blocking sockets.
    • Must loop read/write until the call fails with EAGAIN/EWOULDBLOCK.
    • Otherwise unread data can be stranded in the socket buffer with no further notification.
  2. Connection Management:
    • Use non-blocking connect for connection establishment (see the sketch after this list).
    • Set reasonable timeout values to avoid prolonged blocking.
  3. Buffer Design:
    • Maintain independent read/write buffers for each connection.
    • Use memory pools to reduce memory allocation overhead.
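
As an illustration of the connection-management point, here is a sketch of a non-blocking connect with a bounded timeout; the helper name connect_with_timeout is hypothetical. connect() on a non-blocking socket returns -1 with errno == EINPROGRESS, the caller waits for writability, and SO_ERROR reports the final outcome.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>

int connect_with_timeout(int sock, const struct sockaddr *addr,
                         socklen_t addrlen, int timeout_ms) {
    fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK);

    if (connect(sock, addr, addrlen) == 0)
        return 0;                           /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;                          /* immediate failure */

    /* A connecting socket becomes writable when the handshake finishes. */
    struct pollfd pfd = { .fd = sock, .events = POLLOUT };
    if (poll(&pfd, 1, timeout_ms) <= 0)
        return -1;                          /* timeout or poll error */

    int err = 0;
    socklen_t len = sizeof(err);
    getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &len);
    return err == 0 ? 0 : -1;               /* SO_ERROR holds the result */
}
```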

Best Practice Recommendations

  1. For fewer than roughly 1,000 connections, select/poll performance is usually sufficient.
  2. Prefer epoll on Linux; use kqueue on FreeBSD and macOS.
  3. Edge-Triggered mode suits high-throughput scenarios; Level-Triggered mode is easier to program.
  4. Combine with thread pools to handle time-consuming business logic, avoiding blocking the event loop.