TCP Sticky Packet and Unpacking Problem
Problem Description
TCP sticky packets and unpacking are common issues in network programming. They refer to the phenomenon in TCP-based communication where the boundaries between consecutive data packets sent by the sender become blurred by the time the receiver reads them: multiple packets may be combined into one large packet (sticky packets), or a single packet may be split into multiple smaller ones (unpacking). This is a direct consequence of TCP being a byte-stream-oriented protocol.
Root Cause
The TCP protocol itself does not preserve message boundaries. It treats data from the application layer as an unstructured byte stream. It only guarantees reliable and ordered transmission of this byte stream, but does not care about the meaning or segmentation of these bytes. Therefore, applications must define their own protocol to distinguish between different messages.
Common Scenarios
- Causes of Sticky Packets:
- Sender: Due to the Nagle algorithm (which improves network efficiency by coalescing small writes into fewer packets), the sender may merge multiple small data packets sent within a short interval into one larger TCP segment (see the sketch after this list).
- Receiver: The receiver's application layer does not read data from the socket buffer in a timely manner, causing multiple packets to accumulate in the receive buffer and stick together.
- Causes of Unpacking:
- When the size of a data packet to be sent exceeds the Maximum Segment Size (MSS) of a TCP segment, TCP will split the packet at the transport layer.
- When the packet size exceeds the Maximum Transmission Unit (MTU) of the data link layer (typically 1500 bytes, including IP and TCP headers), the IP layer will also fragment it. However, TCP typically avoids IP fragmentation through MSS negotiation.
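The Nagle behavior mentioned above can be disabled per socket when an application needs small writes to go out immediately. Below is a minimal Python sketch, assuming a standard blocking socket (the address is a placeholder); note that this only reduces sender-side merging and does not remove the need for application-layer framing.

```python
import socket

# Illustrative example: disable the Nagle algorithm on a connected TCP socket.
# This asks the kernel to send small writes immediately instead of coalescing
# them, but the receiver can still observe arbitrary byte-stream boundaries,
# so application-layer framing is still required.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 9000))  # placeholder address for illustration
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```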
Solution Approach
The fundamental solution is to define a clear boundary for each data packet at the application layer protocol. The receiver then correctly extracts each complete packet from the received byte stream based on this boundary rule.
Below are several mainstream solutions, explained progressively:
Solution 1: Fixed-Length Messages
This is the simplest method, where each application-layer message packet is defined to have a fixed length.
- Workflow:
- Sender: If the actual data length is less than the defined length, it is padded with predefined characters (such as spaces or \0).
- Receiver: Reads a fixed length of data from the buffer each time. For example, if each packet is defined as 100 bytes, the receiver reads 100 bytes each time, which must be a complete message (a minimal sketch appears at the end of this solution).
- Advantages: Simple to implement and efficient to parse.
- Disadvantages: Extremely inflexible. Wastes bandwidth for messages much smaller than the fixed length; cannot handle messages longer than the fixed length. Rarely used in practical, complex business scenarios.
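To make the workflow concrete, here is a minimal Python sketch of fixed-length framing, assuming a blocking socket; PACKET_SIZE, send_fixed, and recv_fixed are illustrative names, and \0 is used as the padding character.

```python
import socket

PACKET_SIZE = 100  # fixed length agreed on by both sides (illustrative value)

def send_fixed(sock: socket.socket, data: bytes) -> None:
    # Pad the payload to PACKET_SIZE with b"\0" so every packet has the same length.
    if len(data) > PACKET_SIZE:
        raise ValueError("message longer than the fixed packet size")
    sock.sendall(data.ljust(PACKET_SIZE, b"\0"))

def recv_fixed(sock: socket.socket) -> bytes:
    # Read exactly PACKET_SIZE bytes (recv may return less, so loop), then strip padding.
    buf = b""
    while len(buf) < PACKET_SIZE:
        chunk = sock.recv(PACKET_SIZE - len(buf))
        if not chunk:  # peer closed the connection mid-packet
            raise ConnectionError("connection closed mid-packet")
        buf += chunk
    return buf.rstrip(b"\0")
```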
Solution 2: Special Delimiters
A special character or string is appended to the end of each application-layer message as a delimiter, e.g., newline \n, carriage return and newline \r\n, or a custom delimiter like ##END##.
- Workflow:
- Sender: Appends the delimiter to the end of each message before sending it.
- Receiver: Continuously reads the byte stream until the predefined delimiter is encountered. The data from the start up to the delimiter constitutes one complete message (see the sketch at the end of this solution).
- Advantages: Flexible, with variable message length.
- Disadvantages: The message content itself must not contain the delimiter, otherwise it will cause parsing errors. Therefore, if the message content is not controllable, escaping delimiters within the content (similar to handling quotes in JSON) is necessary, which adds complexity.
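Below is a minimal Python sketch of delimiter-based framing, assuming a blocking socket and a newline delimiter; the function names and the caller-provided bytearray buffer are illustrative choices, not a fixed API.

```python
import socket

DELIMITER = b"\n"  # illustrative delimiter; must never appear inside a message

def send_delimited(sock: socket.socket, data: bytes) -> None:
    # Append the delimiter so the receiver can locate the message boundary.
    sock.sendall(data + DELIMITER)

def recv_delimited(sock: socket.socket, buffer: bytearray) -> bytes:
    # Keep reading until a delimiter shows up; bytes after it stay in `buffer`
    # because they may already belong to the next (stuck-together) message.
    while DELIMITER not in buffer:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before delimiter")
        buffer.extend(chunk)
    message, _, rest = bytes(buffer).partition(DELIMITER)
    buffer[:] = rest
    return message
```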
Solution 3: Declaring Message Length in the Header (Most Common and Standard Method)
Design an "envelope" (message header) for each application-layer message, containing the most crucial information—the length of the message body. The message body is the actual data to be transmitted.
- Workflow:
- Define Protocol Format: A message typically consists of two parts.
- Message Header: Contains fixed-length fields, one of which must indicate the length of the Message Body. The header can also include other information like version number, sequence number, command type, etc.
- Message Body: The actual, variable-length business data to be transmitted.
For example, a simple protocol can be designed as: the first 4 bytes (an int32) of the header represent the message body length, followed by the body.
[4-byte length][N-byte message body]
- Sender:
a. First, create the message body.
b. Calculate the byte length N of the message body.
c. Write the length N into a fixed-size field (e.g., 4 bytes) as the message header.
d. Concatenate the header and body and send it via TCP.
- Receiver:
a. First, attempt to read the fixed-length message header from the socket buffer (i.e., read exactly 4 bytes first, leaving any remaining bytes in the buffer).
b. If even 4 bytes are not yet available, the header is incomplete; continue waiting.
c. After successfully reading 4 bytes, parse the message body length N.
d. Then, continue reading the next N bytes from the buffer.
e. If the currently readable bytes in the buffer are fewer than N, the message body is incomplete; wait for the remaining data to arrive.
f. Once N bytes are successfully read, a complete application-layer message is obtained. Pass this message to the business logic for processing, then start parsing the next message (repeat step a). A sketch of this sender/receiver logic appears at the end of this solution.
- Advantages: High efficiency, robust parsing, and no delimiter conflict issues. It is the most mainstream approach in the industry (used by protocols like HTTP, Redis, etc.).
- Disadvantages: Slightly more complex to implement than the previous two methods, requiring precise control over the number of bytes read.
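The sender and receiver steps above map directly to code. Below is a minimal Python sketch of length-prefix framing, assuming a blocking socket and a 4-byte big-endian unsigned length header; send_message, recv_exactly, and recv_message are illustrative names.

```python
import socket
import struct

HEADER = struct.Struct(">I")  # 4-byte unsigned big-endian length field

def send_message(sock: socket.socket, body: bytes) -> None:
    # Sender steps a-d: prefix the body with its length, then send header + body.
    sock.sendall(HEADER.pack(len(body)) + body)

def recv_exactly(sock: socket.socket, n: int) -> bytes:
    # Block until exactly n bytes are read; this covers the "wait for more data"
    # cases in receiver steps b and e.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    return buf

def recv_message(sock: socket.socket) -> bytes:
    # Receiver steps a-f: read the 4-byte header, parse N, then read the N-byte body.
    (body_len,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, body_len)
```

In practice, implementations usually also validate the parsed length against an upper bound before allocating a buffer, to guard against malformed or malicious headers.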
Summary
TCP sticky packets and unpacking are not flaws of the TCP protocol but characteristics of its stream-oriented nature. The key to solving this problem lies in designing an application-layer protocol with clear boundaries. Among the three solutions, the "Header + Body" (with length declaration) approach has become the most common choice due to its flexibility and reliability. Understanding and implementing this unpacking logic is a fundamental skill for network programming engineers.