Control File Generation

The zsyncmake program generates a .zsync file for a given data file. It calculates the checksums of each block, then prepends a header giving the file name, length, and download URL. I chose a simple "key: value" format for the header. The header data is all text (with the exception of Z-Map2, described below), so the administrator can easily edit filenames and URLs if they change. A typical .zsync file header looks like this:

zsync: 0.0.1
Filename: Packages
Blocksize: 1024
Length: 12133882
URL: http://localhost/~cph/Packages
SHA-1: 97edb7d0d7daa7864c45edf14add33ec23ae94f8
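
As an illustration of how little machinery this format needs, the following Python sketch reads such a header. It is not zsync's own parser: the function name read_zsync_header is hypothetical, and the sketch assumes that a blank line ends the header and that every header line is plain text, so it does not handle the binary block that follows a Z-Map2 line.

```python
import io
from typing import BinaryIO, Dict


def read_zsync_header(stream: BinaryIO) -> Dict[str, str]:
    """Read "Key: value" lines up to the first blank line.

    Assumption: a blank line terminates the header. The stream is left
    positioned just after that blank line, where the block data would start.
    """
    header: Dict[str, str] = {}
    for raw in iter(stream.readline, b""):
        line = raw.rstrip(b"\r\n")
        if not line:
            break  # blank line: end of header (assumed terminator)
        key, sep, value = line.partition(b": ")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        header[key.decode("ascii")] = value.decode("ascii")
    return header


# The sample header from the text, followed by stand-in binary data.
sample = io.BytesIO(
    b"zsync: 0.0.1\n"
    b"Filename: Packages\n"
    b"Blocksize: 1024\n"
    b"Length: 12133882\n"
    b"URL: http://localhost/~cph/Packages\n"
    b"SHA-1: 97edb7d0d7daa7864c45edf14add33ec23ae94f8\n"
    b"\n"
    b"\x00\x01\x02"
)
header = read_zsync_header(sample)
blocksize = int(header["Blocksize"])
```

Because values are kept as plain strings, an edited Filename or URL line is picked up with no further processing; only numeric fields such as Blocksize and Length need converting.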

I have also chosen to include a SHA-1 checksum of the final file in the control file. Firstly, this serves as a safety check: early versions of zsync will doubtless have some bugs, and a final checksum will help catch any failures and flag them before they cause a problem. Secondly, I am aware of the parallel with BitTorrent, which (I think) provides a similar final check. Downloading a file in chunks and merging odd chunks from the local system gives plenty of chance for mistakes, and I would not blame users for being sceptical about whether the jigsaw all fits together at the end! A final checksum gives this assurance. In any case, the extra work falls only on the client: the program that makes the control file has to read the data through once and only once, and can easily calculate a checksum while doing so.
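
The single-pass point can be sketched in Python as follows. This is an illustration under stated assumptions, not zsyncmake's code: scan_file is a hypothetical name, and zlib.adler32 merely stands in for whatever per-block checksum the control file actually records.

```python
import hashlib
import io
import zlib
from typing import BinaryIO, List, Tuple


def scan_file(stream: BinaryIO, blocksize: int) -> Tuple[List[int], int, str]:
    """Read the data once, producing per-block checksums, the total
    length, and the SHA-1 of the whole file."""
    sha1 = hashlib.sha1()
    block_sums: List[int] = []
    length = 0
    while True:
        block = stream.read(blocksize)
        if not block:
            break
        sha1.update(block)                       # whole-file checksum
        block_sums.append(zlib.adler32(block))   # stand-in block checksum
        length += len(block)
    return block_sums, length, sha1.hexdigest()


data = b"example data " * 500
sums, length, digest = scan_file(io.BytesIO(data), 1024)
```

The client can run the same hashlib.sha1 calculation over its reassembled file and compare the result with the SHA-1 header line.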

zsyncmake automatically detects gzip files, and switches to inflating the contained data and recording blocks for this uncompressed data instead. It also adds a Z-Map2 line, followed by another block of data in the header, which provides the map between the deflated and underlying data. I have had to include a locally customised copy of (part of) zlib to manage this. Each entry in the zmap is a pair of offsets: one in the deflated stream, and the corresponding offset in the inflated stream. Two bytes are used for each offset, except that one bit of the inflated offset field is used as a flag to indicate whether the entry marks a block header. Together, this gives the client enough information to read from the compressed file: it finds the block header preceding the data it wants and a pair of offsets in the inflated stream that enclose the required data, and then downloads between the corresponding offsets in the compressed stream.
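
The lookup just described can be sketched in Python. This is a simplified model rather than zsync's implementation: ZMapEntry and locate are hypothetical names, and the sketch assumes the two-byte fields have already been decoded into absolute offsets sorted by position, with the flag bit split out into a boolean.

```python
from bisect import bisect_left, bisect_right
from typing import List, NamedTuple, Tuple


class ZMapEntry(NamedTuple):
    deflated: int      # offset in the compressed stream
    inflated: int      # corresponding offset in the uncompressed stream
    block_start: bool  # True if this entry marks a block header


def locate(zmap: List[ZMapEntry], start: int, end: int) -> Tuple[int, int, int]:
    """For the uncompressed range [start, end), return three offsets in
    the compressed stream: the preceding block header, and the lower and
    upper bounds of the data to download."""
    inflated = [e.inflated for e in zmap]
    lo = bisect_right(inflated, start) - 1  # last entry at or before start
    hi = bisect_left(inflated, end)         # first entry at or after end
    if lo < 0 or hi >= len(zmap):
        raise ValueError("range is not enclosed by the map")
    hdr = lo
    while hdr >= 0 and not zmap[hdr].block_start:
        hdr -= 1                            # walk back to a block header
    if hdr < 0:
        raise ValueError("no block header precedes the range")
    return zmap[hdr].deflated, zmap[lo].deflated, zmap[hi].deflated


# Made-up map for illustration only.
zmap = [
    ZMapEntry(0, 0, True),
    ZMapEntry(100, 300, False),
    ZMapEntry(220, 700, False),
    ZMapEntry(350, 1000, True),
    ZMapEntry(480, 1400, False),
]
header_at, fetch_from, fetch_to = locate(zmap, 400, 650)
```

With the made-up map above, a request for uncompressed bytes 400 to 650 resolves to the block header at compressed offset 0 and the compressed range 100 to 220.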

zsyncmake also has an option to compress a file, in a way optimised for zsync, before starting to make the .zsync. I chose to build this gzip compressor into zsyncmake, as it is too specialised to be of interest outside of zsync.

The zsync version is included to allow future clients to be backward compatible with older .zsync files. There is also a Min-Version header, which warns off clients that are too old to use this file, and a Safe header, which gives a list of other headers that older clients can safely ignore if they choose.
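
A client-side check built on these headers might look like the Python sketch below. Several details are assumptions made for illustration and are not taken from zsync itself: the function names are hypothetical, versions are compared as dotted integers, the Safe value is treated as a whitespace-separated list, and an unrecognised header that Safe does not mention is treated as a reason to refuse the file.

```python
from typing import Dict, List, Set, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "0.0.1" into (0, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def compatibility_problems(
    header: Dict[str, str], client_version: str, known: Set[str]
) -> List[str]:
    """Return the reasons this client should not use the control file
    (an empty list means it can proceed)."""
    problems: List[str] = []
    min_version = header.get("Min-Version")
    if min_version and parse_version(client_version) < parse_version(min_version):
        problems.append(f"file requires zsync {min_version} or later")
    # Assumed format: header names separated by whitespace.
    safe = set(header.get("Safe", "").split())
    for key in header:
        if key not in known and key not in safe:
            problems.append(f"unknown header {key} is not marked safe to ignore")
    return problems


KNOWN = {"zsync", "Filename", "Blocksize", "Length", "URL", "SHA-1",
         "Min-Version", "Safe"}
example = {"zsync": "0.0.2", "Min-Version": "0.0.1",
           "Safe": "Extra-Info", "Extra-Info": "x"}
issues = compatibility_problems(example, "0.0.1", KNOWN)
```

Here Extra-Info is an invented header name: an old client does not recognise it, but because it is listed under Safe the file is still accepted.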