Detailed Explanation of HTTP Caching Mechanism
HTTP caching is a core technology for web performance optimization. It works by storing copies of requested resources and reusing them for subsequent requests, rather than fetching them again from the server. This reduces latency, lowers bandwidth consumption, and decreases server load.
I. Basic Concepts and Classification of Caching
Caching is primarily divided into two categories:
- Private Cache: Exclusive to a single user, such as browser cache.
- Shared Cache: Can be shared by multiple users, such as proxy server cache, CDN cache.
II. Key Header Fields for Cache Control
1. Cache-Control (HTTP/1.1)
This is the most commonly used cache control field. Main directives include:
- no-cache: The response can be cached, but the cache must revalidate it with the server before each use.
- no-store: Prohibits storing the response in any cache (used for sensitive data).
- max-age=3600: The time (in seconds) for which the response is considered fresh.
- public: Allows any cache, including shared caches, to store the response.
- private: Only allows the user's private cache (e.g., the browser) to store the response; shared caches such as proxies and CDNs must not.
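To make the directive syntax concrete, here is a minimal sketch of how a Cache-Control value could be split into its directives. The function name parse_cache_control is hypothetical, and the sketch assumes no directive carries a quoted argument that itself contains a comma.

```python
from typing import Dict, Optional


def parse_cache_control(header_value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}.

    Simplification (assumption): quoted arguments containing commas
    are not handled.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive, so normalize to lowercase.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("public, max-age=3600"))
# {'public': None, 'max-age': '3600'}
```

Directives without an argument (public, no-store) map to None, while max-age keeps its value as a string so the caller decides how to convert it.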
2. Expires (HTTP/1.0)
Specifies an absolute expiration time for the resource, e.g., Expires: Tue, 21 Oct 2025 07:28:00 GMT.
Because it depends on the client clock being synchronized with the server's, it is now largely superseded by max-age. When a response carries both, max-age takes precedence.
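The relationship between the two headers can be sketched as a small freshness calculation. This is an illustration under stated assumptions, not a complete implementation: the function name freshness_lifetime is hypothetical, and the Expires branch measures the lifetime against the response's Date header.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(cache_control: Optional[str],
                       expires: Optional[str],
                       date: Optional[str]) -> Optional[float]:
    """Return the freshness lifetime in seconds, or None if no header defines one."""
    # max-age wins over Expires when both are present.
    if cache_control:
        for part in cache_control.split(","):
            name, sep, value = part.strip().partition("=")
            if name.strip().lower() == "max-age" and sep and value.strip().isdigit():
                return float(value.strip())
    # Fall back to Expires, measured relative to the Date header.
    if expires and date:
        try:
            delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        except (TypeError, ValueError):
            return 0.0  # an unparsable date is treated as already expired
        return max(delta.total_seconds(), 0.0)
    return None


lifetime = freshness_lifetime(
    "max-age=60",
    "Tue, 21 Oct 2025 07:28:00 GMT",
    "Tue, 21 Oct 2025 06:28:00 GMT",
)
print(lifetime)  # 60.0, because max-age overrides the one-hour Expires window
```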
III. Cache Validation Mechanisms
When a cached resource expires or requires validation, the browser sends a validation request to the server:
1. Conditional Request Headers
- If-Modified-Since: Used with Last-Modified.
- If-None-Match: Used with ETag (more precise).
2. Example Validation Process
First Request:
GET /api/data
Response: ETag: "xyz123", Last-Modified: Tue, 21 Oct 2025 07:28:00 GMT
Request after cache expires:
GET /api/data
If-None-Match: "xyz123"
If-Modified-Since: Tue, 21 Oct 2025 07:28:00 GMT
Server validation:
- Resource unchanged → Returns 304 Not Modified (empty body)
- Resource changed → Returns 200 OK + new resource
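The server side of this exchange might look roughly like the sketch below. The function name validate and the plain-dict representation of request headers are assumptions made for illustration; a real server would use case-insensitive header lookup. When If-None-Match is present it is evaluated and If-Modified-Since is ignored.

```python
from email.utils import parsedate_to_datetime
from typing import Dict


def _strip_weak(tag: str) -> str:
    # Validation compares ETags weakly, so a "W/" prefix is ignored.
    return tag[2:] if tag.startswith("W/") else tag


def validate(request_headers: Dict[str, str], current_etag: str, last_modified: str) -> int:
    """Return 304 if the client's cached copy is still valid, otherwise 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates:
            return 304
        current = _strip_weak(current_etag)
        return 304 if any(_strip_weak(t) == current for t in candidates) else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            unchanged = (parsedate_to_datetime(last_modified)
                         <= parsedate_to_datetime(if_modified_since))
        except (TypeError, ValueError):
            return 200  # unparsable dates: fall back to a full response
        return 304 if unchanged else 200

    return 200  # unconditional request


status = validate({"If-None-Match": '"xyz123"'}, '"xyz123"', "Tue, 21 Oct 2025 07:28:00 GMT")
print(status)  # 304
```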
IV. Complete Cache Decision Flow
The browser's cache decision follows this logic tree:
1. Is there a cached copy?
   - No → Request directly from the server.
   - Yes → Proceed to the next step.
2. Is the cache fresh? (Check max-age/Expires)
   - Fresh, and no-cache is not set → Use cache directly (200 from cache).
   - Not fresh, or no-cache is set → Proceed to the next step.
3. Does it need revalidation before use? (Check no-cache or mandatory validation such as must-revalidate)
   - No validation needed (e.g., the response permits stale-while-revalidate) → Use cache but validate asynchronously.
   - Validation needed → Send conditional request to the server.
4. Server validation result
   - 304 Not Modified → Update cache freshness, use cached copy.
   - 200 OK → Replace cache with the new resource.
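The first three steps of this flow can be sketched as a single decision function. The CachedEntry type, its field names, and the returned action strings are all hypothetical. The asynchronous branch is modeled with a stale-while-revalidate window, which is an assumption about when a stale copy may be served without waiting for validation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedEntry:
    age_seconds: float                    # how long the copy has been stored
    max_age_seconds: float                # freshness lifetime from max-age/Expires
    no_cache: bool = False                # response carried no-cache
    stale_while_revalidate: float = 0.0   # extra window for serving stale (assumption)


def decide(entry: Optional[CachedEntry]) -> str:
    """Return the action a cache would take for a request."""
    # Step 1: no cached copy, so go to the server.
    if entry is None:
        return "fetch"
    # Step 2: a fresh copy is used directly unless no-cache forces validation.
    fresh = entry.age_seconds < entry.max_age_seconds
    if fresh and not entry.no_cache:
        return "use-cache"
    # Step 3: a stale copy may be served while revalidating in the background.
    staleness = entry.age_seconds - entry.max_age_seconds
    if not entry.no_cache and staleness < entry.stale_while_revalidate:
        return "use-cache-and-revalidate-in-background"
    return "revalidate"


print(decide(CachedEntry(age_seconds=10, max_age_seconds=3600)))  # use-cache
```

The "revalidate" result corresponds to sending the conditional request; step 4 then depends on whether the server answers 304 or 200.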
V. Practical Configuration Examples
Static Resources (Long-term Caching)
Cache-Control: public, max-age=31536000 // Cache for 1 year
ETag: "abc123"
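Combine with filename hashing: when the content changes, the filename (and therefore the URL) changes too, so clients fetch the new version immediately instead of reusing the year-long cached copy.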
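As a rough illustration of filename hashing, the sketch below derives a versioned filename from the file's content. The function name hashed_filename, the use of SHA-256, and the 8-character digest length are assumptions; build tools typically do this automatically with their own schemes.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(path: str, content: bytes, digest_length: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_length]
    p = PurePosixPath(path)
    # "static/app.js" becomes "static/app.<digest>.js"
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_name = hashed_filename("static/app.js", b"console.log('v1');")
new_name = hashed_filename("static/app.js", b"console.log('v2');")
print(old_name != new_name)  # True: changed content yields a new URL
```

Because the name depends only on the content, an unchanged file keeps its URL and stays cached, while any edit produces a URL that no cache has seen before.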
API Responses (Cautious Caching)
Cache-Control: no-cache // Validate before each use
ETag: "xyz789"
Sensitive Data (No Caching)
Cache-Control: no-store
VI. Best Practices for Cache Strategies
- Static Resources: Long cache time + filename versioning.
- Dynamic Content: Short cache or no-cache + appropriate validation.
- Personalized Data: Use the private directive so that shared caches cannot store one user's data and serve it to another, avoiding information leakage.
- Critical Resources: Use Cache-Control to ensure availability.
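These recommendations could be collected into a small lookup, sketched below. The category names, the cache_control_for helper, and the "private, no-cache" value for personalized data are illustrative assumptions; the static, dynamic, and sensitive values mirror the configuration examples in Section V.

```python
# Hypothetical mapping from resource category to a Cache-Control value.
CACHE_POLICIES = {
    "static": "public, max-age=31536000",  # long-term caching with hashed filenames
    "dynamic": "no-cache",                 # store, but validate before each use
    "personalized": "private, no-cache",   # assumption: keep out of shared caches
    "sensitive": "no-store",               # never stored
}


def cache_control_for(kind: str) -> str:
    """Return the Cache-Control value for a resource category."""
    # Unknown categories fall back to the most conservative policy.
    return CACHE_POLICIES.get(kind, "no-store")


print(cache_control_for("static"))  # public, max-age=31536000
```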
By properly configuring these caching strategies, you can significantly improve website performance while ensuring the timeliness and accuracy of data.