Fast zlib compression

Description


This is an implementation of the fast deflate algorithm for zlib (to be exact, here is a new implementation of the longest_match function). In my tests it works about 2.5-10x times faster than original method with a slightly better compression ratio (i.e. it never compress worse than original zlib).

The library exists in Assembly and C source code formats. Asm version is prepared only for 32-bit Intel architecture, however C version produces exactly the same output, and works just 5% slower than Asm version. The C code could be used for 64 bit platforms, and in my tests it works at nearly the same performance as 32-bit assembly version. 

 

Tests


Please note: these tests are obsolete and kept here for historical reasons! New test results are here.

For testing I have prepared 3 files about 50Mb each. The first file consists of various C/C++ sources (51,923,629 bytes), the second file is Windows executables (50,115,981 bytes of the exe/dll files from the Windows system directory), and 56,481,852 bytes ps (PostScript) file. I have tested compression with the following tools:

  • original zlib, compiled with C and asm versions (contrib/masmx86) of the longest_match function
  • my version of the longest_match function (x86 assembly) 
  • pkzip 4.0

The machine for testing is AMD Athlon 64 X2 4200+.

Zlib is tested using test program from the directory contrib/testzlib. Tested with compression levels 9 (maximal) and 8.

 

What is tested C/C++ source codeWin32 executablesPostScript file

size, bytes

time, ssize, bytestime, ssize, bytestime, s
zlib C version, -9 11,101,198 49.2 21,277,563 27.2 6,116,386 64.5
zlib ASM version -9 11,101,198 17.6 21,277,563 12.3 6,116,386 20.9
zlib MY ASM version -9 11,079,702 7.9 (123%) 21,256,264 8.6 (43%) 6,089,374 11.9 (76%)
zlib C version -8 11,125,415 22.9 21,285,601 18.5 6,162,581 35.9
zlib ASM version -8 11,125,415 10.3 21,285,601 9.9 6,162,581 13.7
zlib MY ASM version -8 11,083,776 7.2 (43%) 21,257,169 8.1 (22%) 6,108,081 10.2 (34%)
pkzip -9 11,116,529 4.5 21,218,554 4.9 6,085,662 9.5

As you can see, my implementation is 20-120% faster than its asm analogue from the zlib distribution, and provides slightly better compression ratio than original zlib. Also notice that -8 compression level with this algorithm is still better than -9 compression level for original algorithm.

 

Algorithm


The idea is not mine. I've seen the algorithm used here in very old MS-DOS archiver "AIN" (it's very hard to find information about it now). This archiver was the best before Win32 era - its compression ratio and speed were much better than PKZIP 2.04 available at that time (PKZIP's deflate were used zlib's tricks).

I have performed search and did not found any patents about lookup for matches for the LZ77 algorithm, so I suppose it is absolutely legal to use it in other applications.

The implementation is my own (original version were written on the 8086 assembly). I have used NASM assembler because it has capabilities to create OBJ files for any x86 compiler (or at least for most of them). I have successfully compiled and tested my library with WatcomC and Visual C 6-9.

Also I've checked new PKZIP internals - it looks like it's current deflate algorithm is based on reversed AIN source code - it uses the same principles and very similar programming tricks for data compression. Better ratio (in comparison to zlib) is achieved using adaptive TOO_FAR threshold depending on the file type.

 

Source code and binaries


You may find everything on GitHub: https://github.com/gildor2/fast_zlib. Build instructions are available in README.md. Precompiled binaries are in "Releases" section on GitHub.

 

Discussion


 Everything is here