Networking

HTTP is widely deployed and accepted, and supports Range: requests. But is it optimal from our point of view? HTTP's control data consists of textual key: value pairs, and some of this control data is sent for every distinct block of data to be transferred. If a file is downloaded all at once, there is only one set of HTTP headers, so the overhead is negligible; once we begin transferring lots of disjoint blocks, this overhead must be quantified.

At its most basic, HTTP transfers one block of data per connection. Each request carries a header like Range: bytes=1024-2047 and each response carries a corresponding Content-range: bytes 1024-2047 header. But the full set of response headers runs to eight lines or so:

HTTP/1.1 206 Partial Content
Date: Sat, 30 Oct 2004 17:28:36 GMT
Server: Apache/2.0.51p1 (Eduserv/Unix) PHP/4.3.8
Last-Modified: Thu, 16 Sep 2004 04:35:27 GMT
ETag: "3a0c62-27dbe000-935ae1c0"
Accept-Ranges: bytes
Content-Length: 1024
Content-range: bytes 1024-2047

This totals some 265 characters, a 25% overhead on a 1024-byte block. There are also overheads at lower layers: most web servers send the headers in a separate packet, so an extra set of TCP/IP headers must be allowed for too (some servers optimise this away and arrange to transmit the headers and data at once, but fewer will do so for requests with unusual headers like Range).
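To make the 25% figure concrete, the response headers quoted above can be reconstructed and measured against the 1024-byte payload they accompany. A back-of-the-envelope sketch; the exact count depends on whether line endings are counted as one byte or as the CRLF actually sent on the wire:

```python
# Reconstruct the Apache response headers quoted above, with the CRLF
# line endings actually used on the wire.
headers = "\r\n".join([
    "HTTP/1.1 206 Partial Content",
    "Date: Sat, 30 Oct 2004 17:28:36 GMT",
    "Server: Apache/2.0.51p1 (Eduserv/Unix) PHP/4.3.8",
    "Last-Modified: Thu, 16 Sep 2004 04:35:27 GMT",
    'ETag: "3a0c62-27dbe000-935ae1c0"',
    "Accept-Ranges: bytes",
    "Content-Length: 1024",
    "Content-range: bytes 1024-2047",
]) + "\r\n\r\n"          # blank line terminating the header section

BLOCK = 1024             # payload returned for this single-range request
overhead = len(headers) / BLOCK
print(f"{len(headers)} header bytes, {overhead:.0%} overhead per block")
```

Counting CRLF line endings pushes the figure slightly above 265 bytes, but the conclusion is the same: roughly a quarter of the traffic for each 1024-byte block is header overhead.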

HTTP allows the client to request multiple ranges at once. It then replies with a single set of headers for the reply as a whole, and encodes the content as "multipart/byteranges", using a multipart MIME encoding. This encoding results in an extra header being emitted in front of each block, but this extra header is smaller, with just 3 lines per block:

HTTP/1.1 206 Partial Content
Date: Sat, 30 Oct 2004 17:28:36 GMT
Server: Apache/2.0.51p1 (Eduserv/Unix) PHP/4.3.8
Last-Modified: Thu, 16 Sep 2004 04:35:27 GMT
ETag: "3a0c62-27dbe000-935ae1c0"
Accept-Ranges: bytes
Content-Length: 2159
Content-Type: multipart/byteranges; boundary=3e7ad816c6e011b3

--3e7ad816c6e011b3
Content-type: application/octet-stream
Content-range: bytes 1024-2047

[...]

--3e7ad816c6e011b3
Content-type: application/octet-stream
Content-range: bytes 3072-4095

[...]
--3e7ad816c6e011b3--

This reduces the overhead per block to around 90 bytes, a significant saving (although the full HTTP headers are still sent once per request, so the total overhead remains somewhat higher). There is a risk that this encoding puts more load back on the server, so it would not be advisable to request very large numbers of ranges in a single request. This area needs discussion with web server developers, to decide where the balance lies between less traffic and more server overhead.
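The client must then split the multipart body back into blocks. A simplified sketch of such a parser (the function name is my own, not zsync's): it splits on the boundary delimiter and reads the Content-range header of each part, assuming well-formed output with CRLF line endings as in the Apache example above.

```python
import re

def parse_byteranges(body: bytes, boundary: str):
    """Split a multipart/byteranges body into (start, end, data) tuples.

    Simplified sketch: assumes well-formed output with CRLF line
    endings, as in the Apache example above; a robust client would use
    a full MIME parser.
    """
    delim = b"--" + boundary.encode("ascii")
    parts = []
    # split()[1:-1] drops the (empty) preamble and the closing "--" marker
    for chunk in body.split(delim)[1:-1]:
        head, _, data = chunk.partition(b"\r\n\r\n")
        if data.endswith(b"\r\n"):   # the CRLF before the next boundary
            data = data[:-2]         # belongs to the delimiter, not the data
        m = re.search(rb"Content-range: bytes (\d+)-(\d+)", head, re.IGNORECASE)
        if m:
            parts.append((int(m.group(1)), int(m.group(2)), data))
    return parts

# A two-part body in the same shape as the example above:
body = (b"--3e7ad816c6e011b3\r\n"
        b"Content-type: application/octet-stream\r\n"
        b"Content-range: bytes 1024-2047\r\n"
        b"\r\n" + b"A" * 1024 + b"\r\n"
        b"--3e7ad816c6e011b3\r\n"
        b"Content-type: application/octet-stream\r\n"
        b"Content-range: bytes 3072-4095\r\n"
        b"\r\n" + b"B" * 1024 + b"\r\n"
        b"--3e7ad816c6e011b3--\r\n")
blocks = parse_byteranges(body, "3e7ad816c6e011b3")
```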

Using multiple ranges alleviates the network-level problems too: it means fewer requests, and servers (Apache, at least) do not issue the boundary headers in separate packets, so the total number of packets falls too. Note that HTTP servers must not issue a multipart/byteranges response when only a single range is requested.

HTTP/1.1 allows a further improvement, because the client and server can hold a connection open and issue multiple requests. The client can send multiple requests (each of which can include multiple ranges, as described above) over the same connection. This saves the overhead of connection setup and shutdown. It also allows the TCP stacks to reach their best data transfer speed: TCP implementations usually use a slow-start algorithm, transmitting slowly at first and then increasing the speed until packet loss begins, as a way of feeling out the available bandwidth between the two ends. Transfer speed matters because, even though zsync transmits less data, it could still take longer than a full transfer if the transfer rate were much lower. TCP stacks are also free to perform other optimisations, such as the Nagle algorithm, under which small outgoing segments are delayed and coalesced to reduce the number of packets sent.
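The effect of a persistent connection can be demonstrated with nothing more than Python's standard library. The stub server below is a stand-in for illustration only (it ignores Range and returns fixed 1024-byte bodies, unlike a real file server); the point is that two requests travel over one HTTPConnection, and the underlying socket can be checked for reuse:

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

class StubHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"    # advertise 1.1 so keep-alive is the default
    def do_GET(self):
        body = b"x" * 1024           # stand-in for a requested block
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):    # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/blob")
first = conn.getresponse().read()
sock = conn.sock                     # TCP socket used for the first request

conn.request("GET", "/blob")         # no new connection setup here
second = conn.getresponse().read()
reused = conn.sock is sock           # same socket => connection was kept open

server.shutdown()
print(len(first), len(second), reused)
```

With HTTP/1.0 and Connection: close, the second request would instead pay for a fresh TCP handshake and a new slow-start ramp.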

Finally, HTTP/1.1 allows pipelining: the client may submit multiple requests without waiting for the response to each before issuing the next. This is the difference between a full-duplex and a half-duplex connection between the two ends: while the client transmits little to the server, it would clearly be less than ideal if the server had to pause after finishing one block and wait for instructions for the next. This could be worked around by opening multiple connections to the server (so that while one was waiting the other would still be transmitting), but that would be far more complicated to implement and would leave it to the arbitrary choice of the server and of the network which connection received the most bandwidth.
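On the wire, pipelining simply means writing several requests before reading any responses. The sketch below (a hypothetical helper, not code from zsync) builds such a batch as raw bytes, grouping up to 20 ranges into each request's Range header:

```python
def pipelined_requests(host, path, ranges, per_request=20):
    """Build raw bytes for a pipelined batch of HTTP/1.1 multi-range GETs.

    ranges is a list of inclusive (start, end) byte offsets. Up to
    per_request of them are grouped into each Range header, and the
    requests are concatenated so the whole batch can be written to one
    socket before any response is read (pipelining).
    """
    requests = []
    for i in range(0, len(ranges), per_request):
        spec = ",".join(f"{start}-{end}" for start, end in ranges[i:i + per_request])
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"      # HTTP/1.1 requires a Host header
            f"Range: bytes={spec}\r\n"
            "\r\n"
        )
    return "".join(requests).encode("ascii")

# 25 one-KiB ranges: one request carrying 20 ranges, then one carrying 5
batch = pipelined_requests("example.com", "/image.iso",
                           [(k * 4096, k * 4096 + 1023) for k in range(25)])
```

The host and path here are placeholders; a real client would also need to match responses back to requests in order, since HTTP/1.1 returns pipelined responses strictly in request order.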

zsync-0.0.1 used HTTP/1.0, with 5 ranges per request, a single connection to the server, and a new connection for every request. It could manage 200-350kbps on my ADSL line. zsync-0.0.2 uses HTTP/1.1, keeping the connection open as long as possible, pipelining its requests, and asking for up to 20 ranges per request; it achieves a sustained 480kbps, which is about the practical limit of my 512kbps ADSL line.

To minimise network load and maximise transfer speed, it is essential for any zsync implementation to use multiple ranges per request, HTTP/1.1 persistent connections and pipelining. See [[Pipe1999]] for more discussion of the performance advantages of HTTP/1.1; although much of that paper is concerned with links between documents and retrieving links from small, partially downloaded files, its material on HTTP/1.1 and pipelining is very relevant.