The interconnected network stack

Internet protocols are best thought of as a stack of layers. Ethernet provides physical data transfer and a point-to-point link between two devices. IP provides a layer of addressing, allowing routers and large-scale networks to exist, but it's connectionless. Packets are fired into the ether, with no indication of whether they arrived or not. TCP adds a layer of reliable transmission by using sequence numbers, acknowledgement, and retransmission.

Finally, application-level protocols like HTTP are layered on top of TCP. At this level, we already have addressing and the illusion of reliable transmission and persistent connections. IP and TCP save application developers from constantly reimplementing packet retransmission and addressing and so on.
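To make that layering concrete, here's a minimal sketch in Python (the host and request are purely illustrative): the application opens a TCP connection and speaks HTTP over it, while IP addressing, Ethernet framing, and TCP retransmission all happen below the socket API.

```python
import socket

# Open a TCP connection. Addressing (IP), framing (Ethernet), and reliable
# delivery (TCP sequence numbers, ACKs, retransmission) all happen below
# this call; application code only sees a connected byte stream.
sock = socket.create_connection(("example.com", 80))

# Speak the application-level protocol (HTTP/1.1) over that stream.
sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

response = b""
while chunk := sock.recv(4096):
    response += chunk
sock.close()

print(response.split(b"\r\n")[0])  # e.g. b'HTTP/1.1 200 OK'
```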

The independence of these layers is important. For example, when packets were lost during my 88.5 MB video transfer, the Internet's backbone routers didn't know; only my machine and the web server knew. Dozens of duplicate ACKs from my computer were all dutifully routed over the same routing infrastructure that lost the original packet. It's possible that the router responsible for dropping the lost packet was also the router carrying its replacement milliseconds later. This is an important point for understanding the Internet: the routing infrastructure doesn't know about TCP; it only routes. (There are exceptions to this, as always, but it's generally true.)
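As a rough sketch of that separation of concerns (the function and addresses are hypothetical, for illustration only), a router's forwarding decision needs nothing beyond the IP header; the TCP bytes that follow, duplicate ACKs included, are opaque payload to it.

```python
import struct

def forwarding_address(ip_packet: bytes) -> str:
    """Illustrative only: extract the one field a router actually routes on.

    Everything after the IP header -- the TCP header, sequence numbers,
    ACKs, retransmitted data -- is opaque payload from the router's
    point of view.
    """
    # The destination address occupies bytes 16-19 of the IPv4 header.
    dst = struct.unpack("!4B", ip_packet[16:20])
    return ".".join(str(octet) for octet in dst)

# e.g. forwarding_address(packet) -> '203.0.113.9'
```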

Layers of the protocol stack operate independently, but they weren't designed independently. Higher-level protocols tend to be built on lower-level ones: HTTP is built on TCP is built on IP is built on Ethernet. Design decisions in lower levels often influence decisions in higher levels, even decades later.

Ethernet is old and concerns the physical layer, so its needs set the base parameters. An Ethernet payload is at most 1,500 bytes.

The IP packet needs to fit within an Ethernet frame. IP has a minimum header size of 20 bytes, so the maximum payload of an IP packet is 1,500 - 20 = 1,480 bytes.

Likewise, the TCP packet needs to fit within the IP packet. TCP also has a minimum header size of 20 bytes, leaving a maximum TCP payload of 1,480 - 20 = 1,460 bytes. In practice, other headers and protocols can reduce this further, so 1,400 bytes is a conservative TCP payload size.
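The same arithmetic as a small Python sketch, using the minimum header sizes from above:

```python
ETHERNET_PAYLOAD_MAX = 1500   # maximum Ethernet payload, in bytes
IP_HEADER_MIN = 20            # minimum IPv4 header size
TCP_HEADER_MIN = 20           # minimum TCP header size

max_ip_payload = ETHERNET_PAYLOAD_MAX - IP_HEADER_MIN   # 1,480 bytes
max_tcp_payload = max_ip_payload - TCP_HEADER_MIN       # 1,460 bytes

print(max_ip_payload, max_tcp_payload)  # 1480 1460
```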

The 1,400 byte limit influences modern protocols' designs. For example, HTTP requests are generally small. If we fit them into one packet instead of two, we reduce the probability of losing part of the request, with a correspondingly reduced likelihood of TCP retransmissions. To squeeze every byte out of small requests, HTTP/2 specifies compression for headers, which are usually small. Without context from TCP, IP, and Ethernet, this seems silly: why add compression to a protocol's headers to save only a few bytes? Because, as the HTTP/2 spec says in the introduction to section 2, compression allows "many requests to be compressed into one packet".
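To get a feel for the sizes involved, this sketch (with invented but typical header values) serializes a small browser-style request and checks it against the conservative 1,400-byte budget; the fewer bytes each request takes, the more requests can share a single packet.

```python
# A small, typical GET request serialized as HTTP/1.1 text (values invented).
request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64)\r\n"
    "Accept: text/html,application/xhtml+xml\r\n"
    "Accept-Encoding: gzip, deflate, br\r\n"
    "Cookie: session=abc123\r\n"
    "\r\n"
)

size = len(request.encode("ascii"))
print(size, size <= 1400)  # comfortably inside one conservative TCP payload
```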

HTTP/2 does header compression to meet the constraints of TCP, which come from constraints in IP, which come from constraints in Ethernet, which was developed in the 1970s, introduced commercially in 1980, and standardized in 1983.

One final question: why is the Ethernet payload size set at 1,500 bytes? There's no deep reason; it's just a nice trade-off point. There are 42 bytes of non-payload data needed for each frame. If the payload maximum were only 100 bytes, just 70% (100/142) of the time would be spent sending payload. A payload of 1,500 bytes means about 97% (1500/1542) of the time is spent sending payload, which is a nice level of efficiency. Pushing the packet size higher would require larger buffers in the devices, which we can't justify simply to get another percent or two of efficiency. In short: HTTP/2 has header compression because of the RAM limitations of networking devices in the late 1970s.
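The trade-off is easy to reproduce. Here's a quick sketch using the numbers above (42 bytes of per-frame overhead), with a larger hypothetical payload included to show the diminishing returns:

```python
FRAME_OVERHEAD = 42  # bytes of non-payload data per Ethernet frame

def payload_efficiency(payload_bytes: int) -> float:
    """Fraction of wire time spent sending payload rather than overhead."""
    return payload_bytes / (payload_bytes + FRAME_OVERHEAD)

for payload in (100, 1500, 9000):
    print(payload, f"{payload_efficiency(payload):.1%}")
# 100   70.4%
# 1500  97.3%
# 9000  99.5%   <- much larger buffers for only a couple more points
```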

This is one section of The Programmer's Compendium's article on Network Protocols, which contains more details and context.