TCP's Retransmission Timeout Mechanism and RTT Estimation

Problem Description

TCP achieves reliable delivery in part through a retransmission timer whose duration is the Retransmission Timeout (RTO). When a sender does not receive an acknowledgment (ACK) before the timer expires, it retransmits the data. Choosing the timeout is a trade-off: if it is too short, the sender retransmits data that was only delayed, not lost; if it is too long, the sender waits idle after a real loss and throughput suffers. TCP therefore estimates the Round-Trip Time (RTT) dynamically and derives the RTO from it. This article covers how RTT is measured, how RTO is calculated, and how the calculation has been refined.

Key Concepts: RTT and RTO

RTT (Round-Trip Time): The time interval from sending a data packet to receiving the ACK that covers it.
RTO (Retransmission Timeout): How long the sender waits for an ACK before retransmitting. It should exceed the smoothed RTT by a margin that grows with how much the RTT varies.

Step 1: Initial RTT Estimation and RTO Calculation

TCP cannot predict the initial RTT but needs to set an initial RTO:

Standard Specification: RFC 6298 sets the initial RTO to 1 second (the earlier RFC 2988 used 3 seconds). Common implementations, such as Linux, also default to 1 second.
First Measurement: Once a data packet has been sent and its ACK received, the first SampleRTT is available, and the RTO is adjusted dynamically from then on.

Step 2: Smoothed RTT Estimation (SRTT)

To keep a single unusual sample from swinging the estimate, TCP uses an exponentially weighted moving average to calculate the Smoothed RTT (SRTT):

SRTT = (1 - α) * SRTT + α * SampleRTT

Recommended α value: 0.125 (i.e., the new SampleRTT accounts for 12.5% of the weight).
Initial value: For the first measurement, SRTT is directly set to the SampleRTT.

Example:

If SampleRTT = 200ms, initial SRTT = 200ms.
If the next SampleRTT = 300ms, then the updated SRTT = 0.875×200 + 0.125×300 = 212.5ms.
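
The update rule and the worked example above can be reproduced with a minimal Python sketch. The function name `update_srtt` and the use of `None` to mean "no measurement yet" are illustrative choices made here, not part of any TCP stack.

```python
from typing import Optional

ALPHA = 0.125  # recommended smoothing factor from the text


def update_srtt(srtt_ms: Optional[float], sample_rtt_ms: float,
                alpha: float = ALPHA) -> float:
    """Return the new smoothed RTT in milliseconds.

    srtt_ms is None before the first measurement (an assumption of this
    sketch); in that case SRTT is set directly to the sample.
    """
    if srtt_ms is None:
        return sample_rtt_ms
    return (1 - alpha) * srtt_ms + alpha * sample_rtt_ms


srtt = update_srtt(None, 200.0)  # first measurement: SRTT = 200 ms
srtt = update_srtt(srtt, 300.0)  # second measurement
print(srtt)  # 212.5
```

With α = 0.125, a single 300ms sample moves the estimate by only 12.5ms, which is the smoothing effect the formula is meant to provide.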

Step 3: Calculating RTT Deviation (RTTVAR)

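SRTT alone says nothing about how much the RTT varies. TCP therefore tracks an RTT deviation (RTTVAR) that reflects jitter:

Dev = |SampleRTT - SRTT|
RTTVAR = (1 - β) * RTTVAR + β * Dev

Recommended β value: 0.25 (a higher weight than α, so the deviation estimate responds quickly to changes).
Update order: In RFC 6298, Dev is computed against the SRTT from before the current sample, so RTTVAR is updated first and SRTT second.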
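
The deviation update can be sketched in Python as follows. This assumes the deviation is measured against the SRTT value held before the current sample is folded in, which is the order RFC 6298 specifies. `update_rttvar` is a hypothetical helper name, and the numbers are illustrative only.

```python
BETA = 0.25  # recommended weight for the newest deviation


def update_rttvar(rttvar_ms: float, srtt_ms: float, sample_rtt_ms: float,
                  beta: float = BETA) -> float:
    """Return the new RTTVAR in milliseconds.

    srtt_ms must be the SRTT from *before* it is updated with this sample.
    """
    dev = abs(sample_rtt_ms - srtt_ms)
    return (1 - beta) * rttvar_ms + beta * dev


# Illustrative numbers: previous RTTVAR 50 ms, SRTT 200 ms, new sample 300 ms.
print(update_rttvar(50.0, 200.0, 300.0))  # 62.5
```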

Step 4: Final RTO Formula

Combining SRTT and deviation, the RTO calculation formula is:

RTO = SRTT + 4 * RTTVAR

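Coefficient 4: An empirical value chosen so that RTO covers the vast majority of delay fluctuations.
Lower Bound Constraint: RFC 6298 says that a computed RTO below 1 second should be rounded up to 1 second (to avoid premature retransmissions). Some implementations use a smaller floor (see Step 6).

Example:
If SRTT = 212.5ms and RTTVAR = 50ms, the formula gives 212.5 + 4×50 = 412.5ms. With the 1-second lower bound, the RTO actually used would be 1 second; with a 200ms floor, it would be 412.5ms.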
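
A short Python sketch of the RTO formula with its lower bound applied. The helper name `compute_rto` and its parameters are assumptions made for illustration. The 60-second upper clamp is also an assumption of this sketch: RFC 6298 permits an optional maximum provided it is at least 60 seconds, but the text above does not discuss one.

```python
def compute_rto(srtt_ms: float, rttvar_ms: float,
                min_rto_ms: float = 1000.0,
                max_rto_ms: float = 60000.0) -> float:
    """Return RTO = SRTT + 4 * RTTVAR, clamped to [min_rto_ms, max_rto_ms]."""
    raw = srtt_ms + 4 * rttvar_ms
    return max(min_rto_ms, min(raw, max_rto_ms))


# Same inputs as the example: the raw value is 412.5 ms.
print(compute_rto(212.5, 50.0))                    # 1000.0 (1-second floor)
print(compute_rto(212.5, 50.0, min_rto_ms=200.0))  # 412.5  (200 ms floor)
```

The floor, not the formula, decides the result whenever SRTT + 4 * RTTVAR falls below it, which is why the choice of minimum RTO matters so much on low-latency paths.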

Step 5: Karn's Algorithm Handling Retransmission Ambiguity

When an ACK is received after a packet retransmission, it is impossible to determine whether the ACK corresponds to the original packet or the retransmitted packet (retransmission ambiguity).
Karn's Algorithm Rules:

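Do not take a SampleRTT from a packet that has been retransmitted (SRTT and RTTVAR are left unchanged).
Use exponential backoff on each timeout: RTO = 2 × current RTO (so that repeated timeouts do not add more load to a path that may already be congested).
Keep the backed-off RTO until an ACK arrives for a packet that was not retransmitted. That ACK yields a valid SampleRTT, and normal measurement and RTO calculation resume.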
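
The rules above can be sketched in Python. This is a simplified model, not how any real stack is structured: the class `KarnTimer`, its method names, the per-sequence-number bookkeeping, and the 60-second cap on backoff are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Segment:
    sent_at_ms: float
    retransmitted: bool = False


@dataclass
class KarnTimer:
    rto_ms: float = 1000.0
    max_rto_ms: float = 60000.0  # assumed cap on the backed-off RTO
    in_flight: Dict[int, Segment] = field(default_factory=dict)

    def on_send(self, seq: int, now_ms: float) -> None:
        """Record the first transmission of a segment."""
        self.in_flight[seq] = Segment(sent_at_ms=now_ms)

    def on_timeout(self, seq: int, now_ms: float) -> None:
        """Retransmit: mark the segment as ambiguous and double the RTO."""
        seg = self.in_flight[seq]
        seg.retransmitted = True
        seg.sent_at_ms = now_ms
        self.rto_ms = min(self.rto_ms * 2, self.max_rto_ms)

    def on_ack(self, seq: int, now_ms: float) -> Optional[float]:
        """Return a SampleRTT, or None if the segment was retransmitted."""
        seg = self.in_flight.pop(seq)
        if seg.retransmitted:
            return None  # ambiguous ACK: do not update SRTT or RTTVAR
        return now_ms - seg.sent_at_ms


timer = KarnTimer()
timer.on_send(seq=1, now_ms=0.0)
timer.on_timeout(seq=1, now_ms=1000.0)
print(timer.rto_ms)                        # 2000.0
print(timer.on_ack(seq=1, now_ms=1200.0))  # None
timer.on_send(seq=2, now_ms=1200.0)
print(timer.on_ack(seq=2, now_ms=1450.0))  # 250.0
```

The sketch leaves the RTO at its backed-off value after the ambiguous ACK. A caller would feed the 250.0 sample into the SRTT and RTTVAR updates and recompute the RTO from there.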

Step 6: Modern TCP Optimizations (RFC 6298)

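RFC 6298 and modern implementations such as Linux refine the classic algorithm:

First Measurement (RFC 6298): SRTT = SampleRTT, RTTVAR = SampleRTT / 2, which makes the first computed RTO equal to 3 × SampleRTT before any lower bound is applied.
Minimum RTO (Linux): Typically set to 200ms (instead of the 1 second in RFC 6298), which suits low-latency networks where a 1-second wait would be far longer than the RTT.
Clock Precision: Use fine-grained timers (microsecond level in Linux) to reduce estimation errors.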
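
Putting the pieces together, the following Python sketch shows how the first-measurement rule and the per-sample updates might combine into one estimator. The class `RttEstimator` and its attribute names are hypothetical, and the minimum RTO is a parameter so that either the 1-second or the 200ms floor can be tried.

```python
from typing import Optional


class RttEstimator:
    """Minimal RTO estimator following the update rules described above."""

    def __init__(self, min_rto_ms: float = 1000.0) -> None:
        self.srtt_ms: Optional[float] = None  # None until the first sample
        self.rttvar_ms: float = 0.0
        self.min_rto_ms = min_rto_ms
        self.rto_ms = 1000.0  # initial RTO before any measurement

    def on_sample(self, sample_ms: float) -> float:
        """Fold in one valid SampleRTT and return the new RTO."""
        if self.srtt_ms is None:
            # First measurement: SRTT = R, RTTVAR = R / 2.
            self.srtt_ms = sample_ms
            self.rttvar_ms = sample_ms / 2
        else:
            # RTTVAR is updated first, using the SRTT from before this sample.
            dev = abs(sample_ms - self.srtt_ms)
            self.rttvar_ms = 0.75 * self.rttvar_ms + 0.25 * dev
            self.srtt_ms = 0.875 * self.srtt_ms + 0.125 * sample_ms
        self.rto_ms = max(self.min_rto_ms, self.srtt_ms + 4 * self.rttvar_ms)
        return self.rto_ms


est = RttEstimator(min_rto_ms=200.0)
print(est.on_sample(200.0))  # 600.0
print(est.on_sample(300.0))  # 612.5
```

With the default 1-second floor, both calls would return 1000.0 instead, since the computed values stay below the minimum.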

Summary and Significance

Dynamic Adaptability: SRTT and RTTVAR track changes in network delay as new samples arrive.
Robustness: Karn's algorithm keeps ambiguous samples from retransmitted packets out of the estimate, and exponential backoff reduces load during congestion.
Efficiency Balance: RTO is neither too long (which would waste time waiting after a real loss) nor too short (which would cause unnecessary retransmissions).

This mechanism is one of the core foundations of TCP reliability. Congestion control also depends on it: when the RTO expires, TCP treats the timeout as a sign of congestion, shrinks its congestion window, and restarts from slow start.