Detailed Explanation of TCP Keep-Alive Mechanism

I. Knowledge Point Description
The TCP Keep-Alive mechanism is an optional connection monitoring mechanism used to detect whether the other end of an idle TCP connection is still alive and reachable. When a connection has been idle for a configured time, the local end starts sending probe packets to the remote end. If a configured number of consecutive probes go unanswered, the connection is considered dead and is closed to release local resources. Note that this is unrelated to HTTP Keep-Alive, which reuses one connection for multiple HTTP requests.

II. Detailed Working Mechanism

Step 1: Mechanism Activation and Parameters
By default, the TCP Keep-Alive mechanism is disabled. The application must enable it explicitly for each socket by setting the SO_KEEPALIVE socket option. Its behavior is controlled by three core parameters. Their names and defaults vary across operating systems; the following uses the Linux system-wide settings as an example (on Linux, each can also be overridden per socket with the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT options):

tcp_keepalive_time (default 7200 seconds): The duration of idle time before sending the first keep-alive probe packet.
tcp_keepalive_intvl (default 75 seconds): The time interval between sending consecutive probe packets.
tcp_keepalive_probes (default 9): The maximum number of unanswered probe packets sent before the connection is declared failed.
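
As a minimal sketch of how an application might enable the mechanism and tune these parameters for a single socket, the following Python snippet sets SO_KEEPALIVE and then applies the per-socket Linux overrides. The helper name enable_keepalive and the values 60, 10, and 5 are illustrative assumptions, not recommendations; the hasattr checks are there because the TCP_KEEP* constants are not defined on every platform.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Enable TCP keep-alive on one socket and, where supported, override the system defaults.

    idle     -> per-socket equivalent of tcp_keepalive_time (seconds)
    interval -> per-socket equivalent of tcp_keepalive_intvl (seconds)
    count    -> per-socket equivalent of tcp_keepalive_probes
    """
    # Keep-alive is off by default; this switches it on for this socket only.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket overrides. These constants exist on Linux but not everywhere,
    # so each one is guarded.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
# A non-zero value means keep-alive is now enabled on this socket.
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
sock.close()
```

Sockets that are not configured this way keep using the system-wide values listed above once SO_KEEPALIVE is set.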

Step 2: Idle Timing
Once the Keep-Alive mechanism is enabled for a connection, the system maintains an idle timer for that connection. Whenever a packet (data or a TCP acknowledgment) is exchanged over the connection, the timer is reset. If the idle time reaches the threshold set by tcp_keepalive_time (e.g., 7200 seconds, or 2 hours), probing begins.

Step 3: Sending Keep-Alive Probe Packets
After the idle timer expires, the local end sends a Keep-Alive probe packet to the remote end. This packet has the following characteristics:

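It is a TCP segment with the ACK flag set that carries no new data (some implementations include a single garbage byte).
Its sequence number is one less than the next sequence number the local end would send (i.e., snd_nxt - 1), which is also one less than the sequence number the remote end expects to receive next.
Because this sequence number refers to data the remote end has already received and acknowledged, the remote end treats the segment as out of window and replies with an ACK carrying the sequence number it actually expects. Nothing is delivered to the upper-layer application on either side.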
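
The sequence-number arithmetic can be illustrated with a small sketch. TCP sequence numbers are 32-bit values that wrap around, so "minus one" has to be taken modulo 2^32. The function name keepalive_probe_seq is a hypothetical helper for illustration, not part of any TCP stack's API.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def keepalive_probe_seq(snd_nxt):
    """Sequence number used by a keep-alive probe: one below the next byte to send."""
    return (snd_nxt - 1) % SEQ_MODULUS

# Ordinary case: the probe points at the last byte that was already sent.
print(keepalive_probe_seq(1001))  # 1000
# Wraparound case: snd_nxt has just wrapped to 0.
print(keepalive_probe_seq(0))     # 4294967295
```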

Step 4: Processing Remote End Responses
After sending the probe packet, the local end enters a waiting state. The following scenarios may occur:

Normal Response from Remote End (Connection Healthy):
- Scenario: The remote host is running normally, and the connection is not interrupted.
- Process: Upon receiving this "unusual" ACK, the remote end replies with a normal ACK packet.
- Local Action: After receiving this response, the local end confirms that the remote end is alive. The idle timer is then reset. If the connection remains idle, probing will start again after the next tcp_keepalive_time period.
Remote End Crashes and Reboots (Connection Reset):
- Scenario: The remote host crashes and reboots during the idle period, losing all previous connection state information.
- Process: The local end sends a probe packet. The rebooted remote end has no record of this connection, so according to TCP rules it replies with an RST (Reset) packet.
- Local Action: Upon receiving the RST packet, the local end immediately determines the connection as invalid and closes the local connection.
No Response from Remote End (Connection Interrupted or Remote Host Down):
- Scenario: The remote host is down, the network link is completely interrupted, or an intermediate firewall discards the packets.
- Process: After the local end sends the probe packet, it receives no reply (ACK or RST) within tcp_keepalive_intvl (e.g., 75 seconds).
- Local Action: The local end does not give up immediately. It waits for tcp_keepalive_intvl and then sends a second probe packet. This process repeats until a response is successfully received or the cumulative number of probe packets sent reaches the limit of tcp_keepalive_probes (e.g., 9 times).
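
The three scenarios above can be summarized as a small decision loop. The sketch below is a simplified model, not kernel code: it ignores real timing and only tracks what comes back for each probe. The names Reply and run_probe_cycle are hypothetical, and max_probes plays the role of tcp_keepalive_probes.

```python
from enum import Enum

class Reply(Enum):
    ACK = "ack"    # remote end is alive and answered the probe
    RST = "rst"    # remote end has no state for this connection
    NONE = "none"  # nothing came back within tcp_keepalive_intvl

def run_probe_cycle(replies, max_probes=9):
    """Model one probing cycle.

    replies: what comes back for probe 1, probe 2, ...; probes beyond the
             end of the list are treated as unanswered.
    Returns (outcome, probes_sent), where outcome is "alive", "reset",
    or "timed_out".
    """
    replies = list(replies)
    for sent in range(1, max_probes + 1):
        reply = replies[sent - 1] if sent <= len(replies) else Reply.NONE
        if reply is Reply.ACK:
            return "alive", sent      # reset the idle timer, keep the connection
        if reply is Reply.RST:
            return "reset", sent      # close the connection immediately
        # Reply.NONE: wait tcp_keepalive_intvl, then send the next probe
    return "timed_out", max_probes    # every probe went unanswered

print(run_probe_cycle([Reply.ACK]))                          # ('alive', 1)
print(run_probe_cycle([Reply.NONE, Reply.NONE, Reply.RST]))  # ('reset', 3)
print(run_probe_cycle([]))                                   # ('timed_out', 9)
```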

Step 5: Determining Connection Failure and Closing
If tcp_keepalive_probes probe packets are sent consecutively without any valid response, the local TCP stack concludes that the remote end is unreachable (host down or network path broken). It then tears down the connection locally and reports an error to the upper-layer application on its next I/O operation (on Linux, this is typically ETIMEDOUT).
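
From the application's point of view, the failure only becomes visible as an error on a later socket call. The following sketch shows one way an application might interpret that error. It assumes Linux-style error codes (ETIMEDOUT after unanswered probes, ECONNRESET after an RST); the same codes can also arise for other reasons, so the messages describe a likely cause only. Both function names are hypothetical.

```python
import errno

def explain_socket_error(exc):
    """Map an OSError from a socket call to a likely keep-alive related cause."""
    if exc.errno == errno.ETIMEDOUT:
        return "peer unreachable: probes went unanswered"
    if exc.errno == errno.ECONNRESET:
        return "peer reset the connection (it may have rebooted)"
    return "other socket error: " + str(exc)

def recv_or_explain(sock, nbytes=4096):
    """Read from sock; return (data, None) on success or (None, explanation) on failure."""
    try:
        return sock.recv(nbytes), None
    except OSError as exc:
        return None, explain_socket_error(exc)
```

A caller would typically close the socket and reconnect when the second element of the returned pair is not None.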

III. Mechanism Summary and Key Points

Purpose: To clean up "half-open connections" and "zombie connections," thereby releasing system resources.
Trigger Condition: Connection idle time exceeds tcp_keepalive_time.
Probing Process: Send probes at intervals of tcp_keepalive_intvl, up to a maximum of tcp_keepalive_probes times.
Total Timeout: The maximum time from the last activity on the connection to the connection finally being closed is approximately tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes. With the default parameters, this is 7200 + 75 * 9 = 7875 seconds (about 2 hours and 11 minutes).
Application Scenarios: Suitable for scenarios requiring long-lived connections with infrequent data interaction, such as database connection pools and long-connection proxies. Due to its long default timeout and because it is not a mandatory part of the TCP standard, many applications choose to implement their own more flexible heartbeat mechanisms at the application layer.
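
The total-timeout formula from the summary can be checked with a few lines of Python. This is only the arithmetic; the function name worst_case_detection_seconds is a hypothetical helper, and the second call uses made-up tuning values to show how much shorter detection becomes with more aggressive settings.

```python
def worst_case_detection_seconds(keepalive_time=7200, keepalive_intvl=75, keepalive_probes=9):
    """Approximate time from last activity until a dead peer is detected."""
    return keepalive_time + keepalive_intvl * keepalive_probes

# Linux defaults: 7200 + 75 * 9
print(worst_case_detection_seconds())            # 7875
# Illustrative tuned values: 60 + 10 * 5
print(worst_case_detection_seconds(60, 10, 5))   # 110
```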