The Complete Process from Entering a URL in the Browser to Page Display
1. Topic Description
This interview question assesses the overall understanding of the web technology stack, covering knowledge from multiple layers such as network protocols, browser rendering engines, and operating systems. The question requires a complete description of all key steps that occur from the moment a user enters a web address (e.g., https://www.example.com) in the browser's address bar until the webpage content is fully displayed in the browser.
2. Knowledge Explanation
Step 1: URL Parsing and Checking
- Input Processing: The browser first checks the string entered by the user. If it's not a valid URL format, the browser might submit it as a search query to the default search engine.
- Protocol Completion: If the user only enters "www.example.com", the browser will automatically complete the protocol part, such as "https://".
- HSTS Check: The browser checks its HSTS (HTTP Strict Transport Security) data: the preloaded list shipped with the browser, plus hosts that sent a Strict-Transport-Security response header on an earlier visit. If the website is covered, the browser forces an HTTPS connection, even if the user entered HTTP.
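The three checks in Step 1 can be sketched roughly as follows. This is a minimal illustration, not how any real browser is implemented: the HSTS_HOSTS set, the SEARCH_URL value, and the "contains a space or has no dot" heuristic are all assumptions made for the example.

```python
from urllib.parse import quote_plus, urlsplit

# Hypothetical stand-ins for browser internals (not real browser data).
HSTS_HOSTS = {"www.example.com"}
SEARCH_URL = "https://search.example/?q="


def normalize_input(text: str) -> str:
    """Turn address-bar input into the URL to load, or into a search URL."""
    text = text.strip()
    # Input processing: treat anything with a space or without a dot as a
    # search query. Real browsers use far more elaborate heuristics.
    if " " in text or "." not in text:
        return SEARCH_URL + quote_plus(text)
    # Protocol completion: add a scheme when the user omitted it.
    if "://" not in text:
        text = "https://" + text
    parts = urlsplit(text)
    # HSTS check: upgrade http to https for hosts known to require it.
    if parts.scheme == "http" and parts.hostname in HSTS_HOSTS:
        parts = parts._replace(scheme="https")
    return parts.geturl()
```

For example, under these assumptions normalize_input("www.example.com") yields https://www.example.com, and an explicit http:// URL for the same host is upgraded to https://.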
Step 2: Checking Cache
- Browser Cache: Before initiating a network request, the browser first checks its own local cache to see if the resource for that URL is already cached and if the cache hasn't expired.
- Cache Lookup Order: Typically Service Worker cache -> Memory Cache -> Disk Cache. If all of these miss, the request goes out over the network, where an intermediate caching server (like a CDN), if one is configured, may answer it instead of the origin server.
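As a rough sketch of the lookup described above, the following walks a list of cache tiers in order and returns the first entry that has not expired. The CachedResponse class and the find_fresh function are hypothetical names, and freshness is reduced to a single max-age value; real browser caches apply more detailed rules.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedResponse:
    body: bytes
    stored_at: float  # time the entry was stored, in seconds since the epoch
    max_age: float    # freshness lifetime in seconds (assumed known)

    def is_fresh(self, now: float) -> bool:
        return now - self.stored_at < self.max_age


def find_fresh(url, tiers, now=None):
    """Return (tier_name, entry) for the first tier holding a fresh entry.

    tiers is an ordered list of (name, dict) pairs. Returns None on a miss,
    in which case the browser would go to the network.
    """
    now = time.time() if now is None else now
    for name, cache in tiers:
        entry = cache.get(url)
        if entry is not None and entry.is_fresh(now):
            return name, entry
    return None
```

The tiers argument would be ordered as in the text, for example [("service worker", {...}), ("memory", {...}), ("disk", {...})].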
Step 3: DNS Domain Name Resolution
If the cache misses, the server's IP address needs to be found via the domain name.
- Query Order:
- Browser Cache: The browser first looks in its own cache.
- Operating System Cache: Queries the local Hosts file (C:\Windows\System32\drivers\etc\hosts or /etc/hosts) and the operating system's DNS cache.
- Router Cache: Queries the local router's cache.
- ISP DNS Server: If not found in the above, the request is sent to the Internet Service Provider's (ISP) DNS server.
- Recursive Query: On a cache miss, the ISP's DNS server acts as a recursive resolver: it resolves the name on the client's behalf by querying other servers one after another, starting from the DNS root servers. The process is: root domain name server -> .com top-level domain name server -> example.com authoritative domain name server, finally obtaining the IP address corresponding to www.example.com.
- DNS Optimization: DNS resolution results are cached at each of the above stages to reduce subsequent query time.
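The query order above can be illustrated with a toy model that needs no network access. Everything in it is made up for the example: the ROOT, TLD, and AUTH tables, the server labels, and the address 192.0.2.10 (taken from a range reserved for documentation). TTL handling is omitted.

```python
# Toy zone data: each table maps a name to the next server to ask,
# or, at the authoritative level, to the final address.
ROOT = {"com": "tld-com"}
TLD = {"tld-com": {"example.com": "auth-example"}}
AUTH = {"auth-example": {"www.example.com": "192.0.2.10"}}


def resolve_from_root(name: str) -> str:
    """Walk root -> TLD -> authoritative, as the resolver does on a miss."""
    labels = name.split(".")
    tld_server = ROOT[labels[-1]]                         # root refers to the .com servers
    auth_server = TLD[tld_server][".".join(labels[-2:])]  # .com refers to example.com's servers
    return AUTH[auth_server][name]                        # authoritative answer


def lookup_ip(name, caches):
    """Check each cache in order, then fall back to full resolution.

    caches is an ordered list of (label, dict) pairs, e.g. browser, OS, router.
    Returns (ip, where_it_was_found).
    """
    for label, cache in caches:
        if name in cache:
            return cache[name], label
    ip = resolve_from_root(name)
    for _, cache in caches:
        cache[name] = ip  # every layer caches the answer
    return ip, "resolver"
```

Calling lookup_ip twice with the same caches shows the optimization from the last bullet: the first call goes all the way to the resolver, the second is answered by the browser cache.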
Step 4: Establishing a TCP Connection
After obtaining the IP address, the browser needs to establish a reliable TCP connection with the server.
- TCP Three-Way Handshake:
- SYN: The client (browser) sends a SYN packet (sequence number x) to the server, requesting to establish a connection.
- SYN-ACK: Upon receiving it, the server replies with a SYN-ACK packet (acknowledgment number x+1, sequence number y), indicating agreement to connect.
- ACK: The client sends an ACK packet (acknowledgment number y+1), and the connection is successfully established.
- Special Handling for HTTPS: If the HTTPS protocol is used, after the TCP connection is established, a TLS handshake is also required to negotiate encryption keys, verify the server's identity, and establish a secure encrypted channel.
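The sequence and acknowledgment numbers in the handshake can be checked with a small simulation. This is only a sketch of the arithmetic; the function name three_way_handshake is an assumption, and no real packets are sent.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(x: int, y: int):
    """Return the three segments as (flags, seq, ack) tuples.

    x and y are the initial sequence numbers chosen by client and server.
    """
    syn = ("SYN", x, None)                            # client -> server
    syn_ack = ("SYN-ACK", y, (x + 1) % SEQ_SPACE)     # server -> client, acknowledges x
    ack = ("ACK", (x + 1) % SEQ_SPACE, (y + 1) % SEQ_SPACE)  # client -> server, acknowledges y
    return [syn, syn_ack, ack]
```

With x = 100 and y = 300, the server acknowledges 101 and the client acknowledges 301, matching the x+1 and y+1 in the steps above.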
Step 5: Browser Sends HTTP Request
Once the TCP connection is established, the browser sends an HTTP request message to the server.
- Request Message Structure:
- Request Line: Contains the method (GET/POST, etc.), URL path, and HTTP version (e.g., GET /index.html HTTP/1.1).
- Request Headers: Contains a lot of information, such as Host (hostname), User-Agent (browser identity), Accept (acceptable response types), Cookie, etc.
- Request Body: For methods like POST, it contains the data to be submitted (e.g., form data).
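To make the message structure concrete, here is a minimal sketch that assembles an HTTP/1.1 request as bytes. The build_request helper is hypothetical, and it covers only the text-based HTTP/1.1 format (HTTP/2 and HTTP/3 frame messages differently).

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # Tell the server how many body bytes follow the headers.
        lines.append(f"Content-Length: {len(body)}")
    # Lines end with CRLF; an empty line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

For example, build_request("GET", "/index.html", "www.example.com", {"Accept": "text/html"}) produces the request line from the text followed by the Host and Accept headers and an empty body.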
Step 6: Server Processes the Request and Returns a Response
The server receives the request and processes it.
- Processing Flow: The web server (e.g., Nginx, Apache) receives the request and might forward it to an application server (e.g., Tomcat, Django) or backend service (e.g., Node.js, PHP). The application server executes the corresponding business logic, such as querying a database, and then generates an HTTP response.
- Response Message Structure:
- Status Line: Contains the HTTP version, status code (e.g., 200), and status message (e.g., OK).
- Response Headers: Contains Content-Type (content type, e.g., text/html), Content-Length (content length), Set-Cookie, etc.
- Response Body: The actual requested content, such as an HTML document.
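The response structure can be illustrated by splitting a raw HTTP/1.1 response into its three parts. The parse_response helper is a simplified, hypothetical sketch: it keeps only the last value of a repeated header (so multiple Set-Cookie lines would collapse) and does not handle chunked or compressed bodies.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status line fields, headers, and body."""
    # The first empty line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, code, reason, headers, body
```

Given a response starting with "HTTP/1.1 200 OK", this returns the version, the integer 200, the message "OK", a dictionary of lowercased header names, and the body bytes.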
Step 7: Browser Receives Response and Parses/Renders
After receiving the HTTP response, the browser begins parsing and rendering the page. This is a complex process, often referred to as the "critical rendering path".
- 1. Building the DOM Tree:
- Process: The browser converts the received HTML byte data into a string, then through lexical analysis, converts it into a series of Tokens, and finally builds a DOM (Document Object Model) tree. The DOM tree reflects the nested structure of the HTML.
- 2. Building the CSSOM Tree:
- Process: Simultaneously, the browser parses CSS (including external CSS files, inline styles, and <style> tags), building a CSSOM (CSS Object Model) tree. It determines the style for each DOM node.
- 3. Executing JavaScript:
- Blocking: When the parser encounters a <script> tag (without async or defer attributes), it pauses the building of the DOM tree, then downloads and executes the script immediately, because JS might modify the DOM or CSSOM.
- Optimization: Using the async (asynchronous download, executed as soon as the download finishes) or defer (asynchronous download, executed after DOM parsing is complete) attributes avoids blocking the parser while the script downloads.
- 4. Building the Render Tree:
- Process: The DOM tree and CSSOM tree are merged to generate the render tree. The render tree only contains elements that need to be displayed on the page (elements styled with display: none are excluded).
- 5. Layout (Reflow):
- Process: Calculates the exact size and position within the viewport for each node in the render tree. This process is also called "reflow".
- 6. Painting:
- Process: Using the geometry calculated during layout, the browser fills in the pixels for each node, including text, colors, borders, shadows, etc.
- 7. Compositing:
- Process: If the page has layers (e.g., using properties like transform, opacity), the browser paints each layer separately and then composites them in the compositor thread, finally displaying them on the screen.
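The first and fourth stages above (building the DOM tree, then deriving the render tree) can be sketched with Python's standard html.parser module. This is a heavily simplified model, not a browser algorithm: the Node and DomBuilder classes are hypothetical, the input is assumed to have every tag explicitly closed, text nodes are ignored, and the styles dictionary keyed by element id stands in for real CSSOM selector matching.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class DomBuilder(HTMLParser):
    """Builds a tiny element-only DOM tree from well-nested HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # currently open elements

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()


def render_tree(node, styles):
    """Return nested (tag, children) tuples, skipping display: none subtrees."""
    visible = []
    for child in node.children:
        style = styles.get(child.attrs.get("id"), {})
        if style.get("display") == "none":
            continue  # the element and its descendants are not rendered
        visible.append(render_tree(child, styles))
    return (node.tag, visible)
```

Feeding '<div id="a"><p id="b"></p><span id="c"></span></div>' to a DomBuilder and calling render_tree(builder.root, {"b": {"display": "none"}}) leaves the div and span in the result but drops the p element.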
Step 8: Closing the TCP Connection
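- TCP Four-Way Teardown (also called the four-way wave): When the connection is no longer needed, it is closed. With HTTP/1.1 persistent connections this typically happens after the connection has been idle for a period of time, rather than immediately when the page finishes loading.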
- FIN: One party (e.g., the client) sends a FIN packet, indicating data transmission is complete.
- ACK: The other party replies with an ACK packet for confirmation.
- FIN: The other party also sends a FIN packet.
- ACK: The original party replies with an ACK packet, and the connection is completely closed.
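The four segments above correspond to state changes on each side of the connection. The sketch below models only the subset of the TCP state machine used in a normal close; the state names follow the TCP specification, while the TRANSITIONS table, the event strings, and the run function are names assumed for this example.

```python
# (current state, event) -> next state
TRANSITIONS = {
    # Side that closes first (e.g., the client)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # the final ACK is sent here
    ("TIME_WAIT", "timeout"): "CLOSED",
    # Other side (e.g., the server)
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # the first ACK is sent here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path
```

Running the closing side's events gives ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED, while the other side goes through ESTABLISHED, CLOSE_WAIT, LAST_ACK, CLOSED. The separate CLOSE_WAIT state is why the second and third segments are distinct: the other side may still have data to send before it issues its own FIN.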
Summary
The entire process is interlinked, involving the complete network protocol stack from the application layer (HTTP/DNS) to the transport layer (TCP) to the network layer (IP), as well as the complex rendering mechanisms inside the browser. Understanding this process is crucial for web performance optimization and troubleshooting.