Volodymyr Mykhailyk blog: Integration of JS Deflate Compression with other platforms

This post describes how to integrate JavaScript Deflate implementations with other platforms such as Java.

I will describe what needs to be done to compress a string (even one with non-ASCII characters) in JavaScript and decompress it in Java. So let's start.

JavaScript part:
To support international strings and be able to decompress them correctly, the first step is to convert the string into a sequence of UTF-8 bytes. There are plenty of similar functions on the web. Below is one example:

	function stringToByteArray(str) {
	    // Encodes str as UTF-8; each pushed element is a one-character string holding one byte (0-255)
	    var b = [], i, unicode;
	    for (i = 0; i < str.length; i++) {
	        // codePointAt() combines a surrogate pair into one code point (charCodeAt() would not)
	        unicode = str.codePointAt(i);
	        if (unicode > 0xffff) i++; // skip the second half of the surrogate pair
	        // 0x00000000 - 0x0000007f -> 0xxxxxxx
	        if (unicode <= 0x7f) {
	            b.push(String.fromCharCode(unicode));
	        // 0x00000080 - 0x000007ff -> 110xxxxx 10xxxxxx
	        } else if (unicode <= 0x7ff) {
	            b.push(String.fromCharCode((unicode >> 6) | 0xc0));
	            b.push(String.fromCharCode((unicode & 0x3f) | 0x80));
	        // 0x00000800 - 0x0000ffff -> 1110xxxx 10xxxxxx 10xxxxxx
	        } else if (unicode <= 0xffff) {
	            b.push(String.fromCharCode((unicode >> 12) | 0xe0));
	            b.push(String.fromCharCode(((unicode >> 6) & 0x3f) | 0x80));
	            b.push(String.fromCharCode((unicode & 0x3f) | 0x80));
	        // 0x00010000 - 0x001fffff -> 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
	        } else {
	            b.push(String.fromCharCode((unicode >> 18) | 0xf0));
	            b.push(String.fromCharCode(((unicode >> 12) & 0x3f) | 0x80));
	            b.push(String.fromCharCode(((unicode >> 6) & 0x3f) | 0x80));
	            b.push(String.fromCharCode((unicode & 0x3f) | 0x80));
	        }
	    }

	    // Join into a "binary string" (one character per byte) so that charCodeAt()
	    // and string concatenation work on the result in the code that follows
	    return b.join("");
	}
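
As a cross-check, in environments that provide the standard TextEncoder API (modern browsers and recent Node.js versions, which is an assumption about your runtime and not something the original code relies on), the same UTF-8 byte sequence can be obtained without a hand-written encoder. The helper name utf8BinaryString is hypothetical:

```javascript
// Hypothetical helper: UTF-8 encode a string and return it as a "binary string"
// (one character per byte), the form used for byte data throughout this post.
function utf8BinaryString(str) {
  var bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  var out = "";
  for (var i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

var encoded = utf8BinaryString("é€");
// "é" (U+00E9) takes 2 bytes and "€" (U+20AC) takes 3 bytes in UTF-8
console.log(encoded.length); // 5
```

Comparing the output of both functions on a few non-ASCII samples is a quick way to catch encoding mistakes before compression is involved.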


To produce a correct Deflate byte sequence we also need some adjustments. The Deflate libraries reviewed here are not compatible by default with the ZLIB format available in many languages (including java.util.zip.Deflater). These JS libraries produce output in the so-called raw Deflate form. The Java inflater, like any other ZLIB library, can decompress this data, but it requires additional information: a header and a checksum. The ZLIB format itself is specified in RFC 1950.

Header: During experiments I discovered that the Java deflater/inflater accepts and produces the following two bytes in the header: 0x78 0xDA
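
For context, the ZLIB specification (RFC 1950) calls these two bytes CMF and FLG. In CMF, 0x78 means the Deflate method with a 32K window. In FLG, the low five bits are a check value that makes the 16-bit number CMF*256 + FLG a multiple of 31, bit 5 signals a preset dictionary, and the top two bits are an informational compression-level hint. The header therefore does not depend on the data being compressed. A minimal sketch of that check (the function name isValidZlibHeader is hypothetical):

```javascript
// Hypothetical helper: checks the two ZLIB header bytes against the RFC 1950 rules
function isValidZlibHeader(cmf, flg) {
  var isDeflate = (cmf & 0x0f) === 8;            // CM = 8 means Deflate
  var windowOk = (cmf >> 4) <= 7;                // CINFO up to 7 (32K window)
  var checkOk = ((cmf << 8) | flg) % 31 === 0;   // FCHECK makes the pair a multiple of 31
  var noPresetDictionary = (flg & 0x20) === 0;   // FDICT must be clear for plain streams
  return isDeflate && windowOk && checkOk && noPresetDictionary;
}

console.log(isValidZlibHeader(0x78, 0xDA)); // true
console.log(isValidZlibHeader(0x78, 0x9C)); // true
console.log(isValidZlibHeader(0x78, 0xDB)); // false
```

Both 0x78 0xDA and 0x78 0x9C pass because they differ only in the compression-level hint.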

Checksum: it should be calculated using the Adler-32 algorithm. The code to compute an Adler-32 checksum is quite simple:
Note: before returning, the function converts the checksum to a byte array (most significant byte first) so that it can be concatenated with the other data.

	function adler32(data) {
	    var MOD_ADLER = 65521;
	    var a = 1, b = 0;
	    var index;

	    // Process each byte of the data in order
	    for (index = 0; index < data.length; ++index) {
	        a = (a + data.charCodeAt(index)) % MOD_ADLER;
	        b = (b + a) % MOD_ADLER;
	    }
	    // Adler checksum as integer
	    var adler = a | (b << 16);

	    // Adler checksum as byte array, most significant byte first
	    return String.fromCharCode(((adler >> 24) & 0xff),
	        ((adler >> 16) & 0xff),
	        ((adler >> 8) & 0xff),
	        ((adler >> 0) & 0xff));
	}
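
One way to sanity-check an Adler-32 implementation is to compare it against a known value: the checksum of the ASCII string "Wikipedia" is 0x11E60398. The sketch below uses a hypothetical adler32Number variant that returns the checksum as an unsigned number instead of as bytes:

```javascript
// Hypothetical variant of adler32() that returns an unsigned 32-bit number
function adler32Number(data) {
  var MOD_ADLER = 65521;
  var a = 1, b = 0;
  for (var i = 0; i < data.length; i++) {
    a = (a + data.charCodeAt(i)) % MOD_ADLER;
    b = (b + a) % MOD_ADLER;
  }
  // Bitwise operators yield signed 32-bit values; ">>> 0" reinterprets as unsigned
  return ((b << 16) | a) >>> 0;
}

console.log(adler32Number("Wikipedia").toString(16)); // "11e60398"
```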


The final code, which produces compressed data that can easily be decompressed on other platforms, looks like this:
Note: the checksum must be calculated on the original byte array, before compression.

	var originalString = "some data that should be compressed";
	//convert string to bytes array
	var originalBytes = stringToByteArray(originalString);
	//generate header as byte array
	var headerBytes = String.fromCharCode(120, 218);
	//compress data
	var compressedBytes = compress(originalBytes);
	//calculate checksum
	var checksumBytes = adler32(originalBytes);
	//create final byte array
	var resultBytes = headerBytes + compressedBytes + checksumBytes;
	//convert it to base64.
	var base64String = window.btoa(resultBytes);
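
The same framing can be checked outside the browser. The sketch below assumes Node.js and its built-in zlib module (neither is used in the original post): zlib.deflateRawSync stands in for the JS compression library, the output is wrapped with the header and the Adler-32 trailer, and a standard ZLIB inflater is asked to accept the result. The names adler32Bytes and wrapAsZlib are hypothetical:

```javascript
var zlib = require("zlib");

// Adler-32 over a Buffer, returned as 4 bytes, most significant byte first
function adler32Bytes(buf) {
  var a = 1, b = 0;
  for (var i = 0; i < buf.length; i++) {
    a = (a + buf[i]) % 65521;
    b = (b + a) % 65521;
  }
  var out = Buffer.alloc(4);
  out.writeUInt32BE(((b << 16) | a) >>> 0, 0);
  return out;
}

// header + raw Deflate data + checksum of the ORIGINAL (uncompressed) bytes
function wrapAsZlib(originalBytes) {
  var header = Buffer.from([0x78, 0xDA]);
  var raw = zlib.deflateRawSync(originalBytes);
  return Buffer.concat([header, raw, adler32Bytes(originalBytes)]);
}

var original = Buffer.from("some data that should be compressed, é€", "utf8");
var base64String = wrapAsZlib(original).toString("base64");

// inflateSync expects the ZLIB format and verifies the Adler-32 trailer
var roundTrip = zlib.inflateSync(Buffer.from(base64String, "base64"));
console.log(roundTrip.toString("utf8") === original.toString("utf8")); // true
```

If the checksum were computed over the compressed bytes instead, the inflater would reject the stream, which is why the note above matters.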


After base64String is transferred to another platform, it can easily be decompressed using a zlib library.

Example of a simple Java class that performs the decompression:

	package com.vmykhailyk.compression.deflate;

	import org.apache.commons.codec.binary.Base64;

	import java.io.ByteArrayOutputStream;
	import java.util.zip.DataFormatException;
	import java.util.zip.Inflater;

	public class DeflateDecompressor {
	    private Inflater decompressor;
	    private ByteArrayOutputStream outputStream;

	    public DeflateDecompressor() {
	        decompressor = new Inflater();
	        outputStream = new ByteArrayOutputStream();
	    }

	    public String decompress(String data) {
	        try {
	            decompressor.reset();
	            outputStream.reset();
	            decompressor.setInput(Base64.decodeBase64(data));

	            byte[] buffer = new byte[1024];
	            while (!decompressor.finished()) {
	                int dataLength = decompressor.inflate(buffer);
	                // inflate() returns 0 when it cannot make progress (for example on
	                // truncated input); fail instead of looping forever
	                if (dataLength == 0 && !decompressor.finished()
	                        && (decompressor.needsInput() || decompressor.needsDictionary())) {
	                    throw new DataFormatException("Incomplete or unsupported ZLIB stream");
	                }
	                outputStream.write(buffer, 0, dataLength);
	            }

	            // The JavaScript side encoded the string as UTF-8 before compressing
	            return new String(outputStream.toByteArray(), "UTF-8");
	        } catch (DataFormatException e) {
	            e.printStackTrace();
	        } catch (Exception e) {
	            e.printStackTrace();
	        }
	        return null;
	    }
	}


Summary:
Deflate integration is not complicated. It works fine with international strings (as long as you convert the data to a byte array first), and there is plenty of code on all platforms to decompress the data. In the next post I will describe how to integrate LZW libraries with other platforms.


8 comments:

Ethan, November 17, 2011 at 8:34 AM
Hi Volodymyr,
Thanks for your post. It's very helpful to me.
I have a question: where is the "compress()" method from in "var compressedBytes = compress(originalBytes);"? Are you using dankogai/js-deflate or any other js libs?

Unknown, November 23, 2011 at 8:58 AM
Hi Ethan,

The compress() function is an abstraction of the call to the actual
compression library.

In case of dankogai-js-deflate it will be :
RawDeflate.deflate(data, level);

In case of onicios-deflate:
zip_deflate(data, level);

Anonymous, August 10, 2013 at 2:37 AM
Hi,

do you have a working example of this? I implemented this end to end and the Java side is throwing exceptions.

Thanks!

Unknown, December 4, 2013 at 12:24 AM
I have this working, but if I change the text, the decompressor does not work any more. How did you decide what to set the header with?

KK, February 1, 2014 at 11:13 AM
Hi,

var headerBytes = String.fromCharCode(120, 218);

Can you please explain to me how this gives the header bytes? I mean, will the parameters 120 and 218 ever change?

Friday, November 4, 2011
