TCP Retransmission Mechanism and Timeout Calculation
Problem Description
The TCP retransmission mechanism is the core mechanism for ensuring reliable data transmission. When data packets sent by the sender are lost or corrupted, the mechanism retransmits the data. Interviewers will examine your understanding of retransmission triggers, timeout calculation, and various retransmission strategies.
I. Basic Principles of Retransmission Mechanism
-
Necessity of Retransmission
- The IP protocol itself does not guarantee reliable data transmission; packets may be lost due to network congestion, router failures, etc.
- TCP achieves reliability through acknowledgment (ACK) and retransmission mechanisms.
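- The sender arms a retransmission timer when it sends data; the timer's duration is the retransmission timeout (RTO). If no ACK arrives before the timer expires, the sender retransmits. Conceptually there is one timer per segment, although RFC 6298 describes a single timer per connection that covers the oldest unacknowledged segment.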
-
Basic Retransmission Flow
Sender sends data segment Seq=1 → Starts RTO timer
↓
Receiver receives data → Replies with ACK=2 (expecting next sequence number)
↓
Sender receives ACK=2 → Cancels the timer, sends subsequent data
↓
If RTO expires without receiving ACK → Retransmits the Seq=1 data segment
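The flow above can be sketched as a small simulation. This is only an illustration under simplifying assumptions: `send_with_retransmit` and its `delivery_outcomes` argument are hypothetical names, and each attempt is reduced to a boolean (ACK arrived in time, or the RTO expired) rather than using real sockets or timers.

```python
def send_with_retransmit(segment_seq, delivery_outcomes, max_attempts=5):
    """Simulate sending one segment until it is acknowledged.

    delivery_outcomes is a hypothetical list of booleans: True means the
    segment and its ACK both got through on that attempt, False means the
    RTO timer expired first.
    """
    log = []
    attempts = delivery_outcomes[:max_attempts]
    for attempt, delivered in enumerate(attempts, start=1):
        log.append(f"attempt {attempt}: send Seq={segment_seq}, start RTO timer")
        if delivered:
            # ACK carries the next expected sequence number.
            log.append(f"attempt {attempt}: got ACK={segment_seq + 1}, cancel timer")
            return True, log
        log.append(f"attempt {attempt}: RTO expired, retransmit")
    return False, log


# First transmission is lost, the retransmission is acknowledged.
ok, events = send_with_retransmit(1, [False, True])
for line in events:
    print(line)
```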
II. Calculation of Timeout (RTO)
-
Core Concept: RTT (Round-Trip Time)
- RTT refers to the total time from data transmission to ACK receipt.
- It needs dynamic calculation because network latency changes.
-
Classic Algorithm: Jacobson/Karels Algorithm
- Calculate Smoothed RTT (SRTT): SRTT = α × SRTT + (1-α) × RTT Sample
- Calculate RTT Deviation (RTTVAR): RTTVAR = β × RTTVAR + (1-β) × |RTT Sample - SRTT|, where SRTT is the value from before the current sample was applied.
- Calculate RTO: RTO = SRTT + 4 × RTTVAR
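- Typical values: α=0.875, β=0.75 (weights on the previous estimate; RFC 6298 writes the same update with α=1/8 and β=1/4 as weights on the new sample). Before any RTT has been measured, the initial RTO is usually set to 1 second.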
-
Specific Calculation Example
Initial: SRTT=1s, RTTVAR=0.5s
Newly measured RTT sample=1.2s
New SRTT = 0.875×1 + 0.125×1.2 = 1.025s
New RTTVAR = 0.75×0.5 + 0.25×|1.2-1| = 0.425s
New RTO = 1.025 + 4×0.425 = 2.725s
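The same calculation can be expressed as a short function. This is a minimal sketch, not a full estimator: `update_rto` is a hypothetical name, `alpha` and `beta` are the weights on the previous estimates as in the formulas above, and details such as clock granularity and minimum/maximum RTO bounds are left out.

```python
def update_rto(srtt, rttvar, sample, alpha=0.875, beta=0.75):
    """Return (new_srtt, new_rttvar, new_rto) after one RTT sample.

    RTTVAR is updated first so that it uses the previous SRTT,
    which matches the worked example.
    """
    new_rttvar = beta * rttvar + (1 - beta) * abs(sample - srtt)
    new_srtt = alpha * srtt + (1 - alpha) * sample
    new_rto = new_srtt + 4 * new_rttvar
    return new_srtt, new_rttvar, new_rto


srtt, rttvar, rto = update_rto(srtt=1.0, rttvar=0.5, sample=1.2)
print(f"SRTT={srtt:.3f}s RTTVAR={rttvar:.3f}s RTO={rto:.3f}s")
# SRTT=1.025s RTTVAR=0.425s RTO=2.725s
```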
III. Evolution of Retransmission Mechanisms
-
Standard Timeout Retransmission (RTO Retransmission)
- The simplest retransmission method: retransmit upon timeout.
- Disadvantage: The sender reacts to a loss only after waiting a full RTO, which is usually well above the RTT, so throughput suffers.
-
Fast Retransmit
- Trigger condition: Receipt of 3 duplicate ACKs (dup ACKs).
- Principle: Duplicate ACKs indicate that later data has arrived at the receiver but an earlier segment is missing.
- Immediately retransmits the lost packet without waiting for timeout.
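The trigger condition can be sketched as a counter over the incoming ACK stream. This is a simplified illustration: `fast_retransmit_trigger` is a hypothetical name, ACKs are plain integers, and real stacks also check that a duplicate ACK carries no data and no window update.

```python
def fast_retransmit_trigger(acks, threshold=3):
    """Return the sequence number to retransmit, or None.

    acks is the stream of cumulative ACK numbers seen by the sender.
    The first ACK for a given number is not a duplicate; only repeats count.
    """
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # The receiver is still waiting for this sequence number.
                return ack
        else:
            last_ack = ack
            dup_count = 0
    return None


# One original ACK=2 followed by three duplicates triggers a retransmit of Seq=2.
print(fast_retransmit_trigger([2, 2, 2, 2]))
```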
Example:
Send: Seq=1,2,3,4,5 (Seq=2 is lost)
Receive: ACK=2 (for Seq=1), then ACK=2, ACK=2, ACK=2 (3 duplicate ACKs, triggered by Seq=3,4,5)
→ Immediately retransmit Seq=2
-
Selective Acknowledgment (SACK)
- Traditional ACK only acknowledges consecutive data; SACK can acknowledge non-consecutive data blocks.
- The receiver informs the sender of the received data ranges via the SACK option.
- The sender can retransmit only the actually lost packets.
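To illustrate how a sender might turn SACK information into a retransmission list, the sketch below computes the gaps between the cumulative ACK and the reported blocks. `sack_holes` is a hypothetical helper; sequence numbers are plain integers with exclusive right edges, and sequence-number wraparound is ignored.

```python
def sack_holes(cum_ack, sack_blocks):
    """Return the [start, end) ranges missing below the highest SACKed byte.

    cum_ack: next sequence number the receiver expects (cumulative ACK).
    sack_blocks: (left_edge, right_edge) pairs, right edge exclusive.
    """
    holes = []
    cursor = cum_ack
    for left, right in sorted(sack_blocks):
        if left > cursor:
            # Nothing has been reported for cursor..left, so it is a gap.
            holes.append((cursor, left))
        cursor = max(cursor, right)
    return holes


print(sack_holes(1000, [(2000, 3000), (4000, 5000)]))
# [(1000, 2000), (3000, 4000)]
```

Only the two reported gaps are candidates for retransmission; the blocks 2000-3000 and 4000-5000 do not need to be sent again.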
-
Duplicate Selective Acknowledgment (D-SACK)
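- An extension of SACK that lets the receiver report segments it has received more than once.
- Tells the sender whether a retransmission was actually needed (for example, only the ACK was lost, or the segment was merely reordered), which helps it avoid unnecessary retransmissions later.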
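A sender can recognize a D-SACK by inspecting the first SACK block, following the rule described in RFC 2883. The sketch below is an assumption-laden illustration: `is_dsack` is a hypothetical name, blocks are (left, right) integer pairs with exclusive right edges, and sequence-number wraparound is ignored.

```python
def is_dsack(cum_ack, sack_blocks):
    """Return True if the first SACK block reports duplicate data."""
    if not sack_blocks:
        return False
    first_left, first_right = sack_blocks[0]
    # Case 1: the block lies below the cumulative ACK, so the data
    # had already been acknowledged when this copy arrived.
    if first_right <= cum_ack:
        return True
    # Case 2: the block is contained in the second block, so the data
    # had already been selectively acknowledged.
    if len(sack_blocks) > 1:
        second_left, second_right = sack_blocks[1]
        return second_left <= first_left and first_right <= second_right
    return False


print(is_dsack(4000, [(3000, 3500)]))                # duplicate below the ACK
print(is_dsack(4000, [(5000, 5500), (4500, 6000)]))  # duplicate inside a SACKed range
print(is_dsack(4000, [(5000, 6000)]))                # ordinary SACK
```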
IV. Optimization Strategies in Practical Applications
-
Binary Exponential Backoff for Timeout Retransmission
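- First timeout: The retransmission fires after the currently calculated RTO.
- Each timeout: RTO = RTO × 2 for the next attempt (exponential backoff), up to an upper bound; RFC 6298 allows a cap of at least 60 seconds.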
- Avoids exacerbating network congestion.
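The resulting wait times can be listed with a few lines of code. This is a sketch only: `backoff_schedule` is a hypothetical name, and the 60-second cap is an assumed value (implementations choose their own upper bound).

```python
def backoff_schedule(initial_rto, retries, max_rto=60.0):
    """Return the wait (in seconds) before each successive retransmission."""
    waits = []
    rto = initial_rto
    for _ in range(retries):
        waits.append(rto)
        # Double the timeout after every expiry, but never exceed the cap.
        rto = min(rto * 2, max_rto)
    return waits


print(backoff_schedule(1.0, 6))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```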
-
Spurious Timeout and Eifel Detection Algorithm
- Problem: Sometimes ACKs are merely delayed, not lost (spurious timeout).
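- Eifel Algorithm: Relies on the TCP Timestamps option. The sender records the timestamp carried by the retransmitted segment. If the first ACK that arrives afterwards echoes a timestamp older than that recorded value, the ACK was triggered by the original transmission, which indicates a spurious timeout.
- The sender can then restore the congestion control state it had before the timeout.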
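The core comparison behind Eifel detection is small enough to show directly. This is a minimal sketch under stated assumptions: `is_spurious_timeout` and its parameter names are hypothetical, timestamps are plain integers from the sender's own clock, and 32-bit timestamp wraparound is ignored.

```python
def is_spurious_timeout(first_retransmit_tsval, ack_tsecr):
    """Return True if the ACK was triggered by the original transmission.

    first_retransmit_tsval: timestamp the sender put in the first retransmission.
    ack_tsecr: timestamp echoed back by the first ACK received after it.
    """
    # An echoed timestamp older than the retransmission's timestamp can only
    # come from the original segment, so the timeout fired too early.
    return ack_tsecr < first_retransmit_tsval


# Original sent at t=100, retransmitted at t=400.
print(is_spurious_timeout(first_retransmit_tsval=400, ack_tsecr=100))  # True
print(is_spurious_timeout(first_retransmit_tsval=400, ack_tsecr=400))  # False
```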
-
Early Retransmit
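- When only a few segments are in flight (a small send window or the tail of a transfer), a loss may not generate 3 dup ACKs.
- Solutions: Lower the duplicate ACK threshold (RFC 5827 uses the number of outstanding segments minus 1, e.g., 2 dup ACKs with 3 segments in flight) or use timers for assistance.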
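The threshold adjustment can be sketched as follows, loosely following the segment-based variant in RFC 5827. `dupack_threshold` and its parameters are hypothetical names, and the lower bound of 2 outstanding segments is a guard added for this illustration (with a single segment in flight no duplicate ACK can arrive at all).

```python
def dupack_threshold(outstanding_segments, has_new_data_to_send, default=3):
    """Return how many duplicate ACKs should trigger a fast retransmit."""
    small_flight = 2 <= outstanding_segments < 4
    if small_flight and not has_new_data_to_send:
        # Too few segments in flight to ever produce the default
        # number of duplicate ACKs, so lower the threshold.
        return outstanding_segments - 1
    return default


print(dupack_threshold(3, has_new_data_to_send=False))   # 2
print(dupack_threshold(10, has_new_data_to_send=False))  # 3
```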
V. Implementation Details in Modern TCP
-
Relevant Parameters in Linux
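- /proc/sys/net/ipv4/tcp_retries1: Number of unacknowledged RTO retransmissions after which TCP reports a suspected problem to the network layer so that the route can be re-checked (default 3).
- /proc/sys/net/ipv4/tcp_retries2: Number of unacknowledged RTO retransmissions after which an established connection is abandoned (default 15).
- The kernel converts tcp_retries2 into a time limit based on exponential backoff rather than counting retransmissions strictly; the connection is terminated once that limit is exceeded.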
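To get a feel for what these settings mean in seconds, the sketch below reads a parameter and estimates the total wait implied by a retry count. The helper names are hypothetical, and the estimate assumes a minimum RTO of 200 ms and a maximum of 120 s, which are the usual Linux constants; a real connection's RTO starts from its measured RTT, so the actual time will differ.

```python
from pathlib import Path


def read_tcp_sysctl(name):
    """Return the integer value of a net.ipv4 sysctl, or None if unavailable."""
    path = Path("/proc/sys/net/ipv4") / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None  # not on Linux, or the file is not readable


def total_retransmit_wait(retries, rto_min=0.2, rto_max=120.0):
    """Estimate seconds until give-up: one capped, doubling wait per attempt."""
    # retries + 1 waits: the original transmission plus each retransmission.
    return sum(min(rto_min * 2 ** i, rto_max) for i in range(retries + 1))


retries2 = read_tcp_sysctl("tcp_retries2")
print("tcp_retries2 =", retries2)
print(f"{total_retransmit_wait(15):.1f} s")
# 924.6 s
```

With the default of 15 this gives roughly 15 minutes, the same figure the kernel documentation quotes as a hypothetical timeout.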
-
Timestamp Option (TCP Timestamps)
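- Resolves RTT measurement ambiguity: without timestamps, an ACK for a retransmitted segment may correspond to either the original transmission or the retransmission.
- Each ACK echoes the timestamp of the segment that triggered it, so the sender can compute an unambiguous RTT sample even for retransmitted data.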
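The difference the option makes can be sketched in one function. Without an echoed timestamp, the sketch falls back to the common rule (Karn's algorithm) of discarding samples from retransmitted segments. `rtt_sample` and its parameters are hypothetical, and all times are assumed to come from the same sender clock in the same unit.

```python
def rtt_sample(now, first_send_time, was_retransmitted, tsecr=None):
    """Return an RTT sample, or None when the measurement is ambiguous."""
    if tsecr is not None:
        # The echoed timestamp identifies exactly which transmission was ACKed.
        return now - tsecr
    if was_retransmitted:
        # Cannot tell which copy the ACK belongs to, so take no sample.
        return None
    return now - first_send_time


print(rtt_sample(now=500, first_send_time=100, was_retransmitted=True))             # None
print(rtt_sample(now=500, first_send_time=100, was_retransmitted=True, tsecr=400))  # 100
print(rtt_sample(now=500, first_send_time=100, was_retransmitted=False))            # 400
```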
Understanding these details of the TCP retransmission mechanism explains how TCP achieves reliable transmission and helps you diagnose and solve transmission performance issues in practical network programming.