I will describe what needs to be done to successfully compress a string (including international characters) and decompress it using Java.
All the LZW implementations I located:
- Work only with characters whose code point is below 256 (that is, no international characters). So you need to convert the original string into its byte representation first, as with the previous library.
- Produce an array of integers as the compression output. So you need to decide how this array will be transferred over XMLHttpRequest.
Regarding the second item, there are a lot of options:
- JSON data
- Bencode
- Serialization to string with separators
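As a rough illustration of how the three options differ, the sketch below encodes the same small array of integers each way. The variable names are made up for this example, and the Bencode line only handles a flat list of integers (a list is written as `l...e` and each integer as `i<number>e`); it is not a full Bencode encoder.

```javascript
var codes = [123, 5468, 215, 4, 6543];

// Option 1: JSON data
var asJson = JSON.stringify(codes);
// "[123,5468,215,4,6543]"

// Option 2: Bencode (flat list of integers only)
var asBencode = "l" + codes.map(function (n) {
  return "i" + n + "e";
}).join("") + "e";
// "li123ei5468ei215ei4ei6543ee"

// Option 3: serialization to a string with separators
var asSeparated = codes.join(",");
// "123,5468,215,4,6543"

// Reading the separated form back into integers on the receiving side
var restored = asSeparated.split(",").map(function (s) {
  return parseInt(s, 10);
});
```

For short arrays the separated string is the smallest of the three and the easiest to parse on the other side, which is presumably why it is convenient for a test.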
In the current example, for simplicity, I am sending the data as a serialized string. For example:
//array of integers
var t = [123, 5468, 215, 4, 6543];
//will be serialized using:
//t.join(",")
//to the following string:
"123,5468,215,4,6543"
JavaScript code:
In order to compress the data on the JS side you need to perform 3 steps:
1. Convert the string to a byte array. Note: the byte array must then be represented as a single solid string in which every character is a single byte (code: array.join("")), because the LZW libraries accept only strings as input.
2. Compress the data using any LZW library.
3. Serialize the result.
For this test I am converting the array to its text representation.
Note: In a real-world application this is not the most efficient way to transmit integer arrays.
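The conversion helper used in step 1 is not shown in this post, so here is a minimal sketch of what it might look like. It assumes the standard `TextEncoder` API is available (modern browsers and Node.js) and that the helper returns an array of one-character strings, one per UTF-8 byte, so that `.join("")` gives the byte string described above. The actual helper may differ.

```javascript
// Hypothetical implementation: one single-character string per UTF-8 byte.
function stringToByteArray(str) {
  var bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  var chars = [];
  for (var i = 0; i < bytes.length; i++) {
    // Every resulting character has a code in the range 0..255.
    chars.push(String.fromCharCode(bytes[i]));
  }
  return chars;
}

// ASCII text is unchanged; each Cyrillic letter becomes two characters.
var asciiBytes = stringToByteArray("abc").join("");
var cyrillicBytes = stringToByteArray("Привіт").join("");
```

After this step every character of the string fits in the initial 256-entry LZW dictionary, at the cost of a longer input for non-ASCII text.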
var originalString = "some data that should be compressed";
//convert string to bytes array
var originalBytes = stringToByteArray(originalString).join("");
//compress data
var compressedData = compress(originalBytes);
//serialize the data
var serializedData = compressedData.join(",");
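The `compress` function above comes from whichever LZW library you choose. For reference, here is a minimal sketch of a textbook LZW compressor whose output format matches what the Java decompressor in this post expects: an initial dictionary of 256 single-character entries and new codes assigned from 256 upwards. It is an illustration under those assumptions, not the code of any particular library.

```javascript
// Sketch of a textbook LZW compressor. The input must be a "byte string":
// every character code has to be in the range 0..255.
function compress(uncompressed) {
  var dictionary = new Map();
  var dictSize = 256;
  for (var i = 0; i < 256; i++) {
    dictionary.set(String.fromCharCode(i), i);
  }

  var w = "";
  var result = [];
  for (var j = 0; j < uncompressed.length; j++) {
    var c = uncompressed.charAt(j);
    var wc = w + c;
    if (dictionary.has(wc)) {
      // Keep extending the current match.
      w = wc;
    } else {
      // Emit the code for the longest known prefix and learn the new sequence.
      result.push(dictionary.get(w));
      dictionary.set(wc, dictSize++);
      w = c;
    }
  }
  // Emit the code for whatever is left.
  if (w !== "") {
    result.push(dictionary.get(w));
  }
  return result;
}

var sample = compress("TOBEORNOTTOBEORTOBEORNOT").join(",");
// "84,79,66,69,79,82,78,79,84,256,258,260,265,259,261,263"
```

Codes below 256 are plain bytes, and codes from 256 upwards refer to sequences learned during compression, which is why the output is an array of integers rather than a string of bytes.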
Java code:
The code below accepts the serialized string shown above ("123,5468,215,4,6543") as input and decompresses it to a UTF-8 string. Use the decompress method as the entry point.
Note: This code doesn't contain any input validation or error handling, but it should give a basic idea of how to decompress the data on the receiving side. The decompressData method is taken from here.
package com.vmykhailyk.compression.lzw;

import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LZWModuleDecompression {
    /**
     * Decompress a list of output ks to a string.
     */
    private String decompressData(List<Integer> compressed) {
        // Build the dictionary.
        int dictSize = 256;
        Map<Integer, String> dictionary = new HashMap<Integer, String>();
        for (int i = 0; i < 256; i++)
            dictionary.put(i, "" + (char) i);
        String w = "" + (char) (int) compressed.remove(0);
        String result = w;
        for (int k : compressed) {
            String entry;
            if (dictionary.containsKey(k))
                entry = dictionary.get(k);
            else if (k == dictSize)
                entry = w + w.charAt(0);
            else
                throw new IllegalArgumentException("Bad compressed k: " + k);
            result += entry;
            // Add w+entry[0] to the dictionary.
            dictionary.put(dictSize++, w + entry.charAt(0));
            w = entry;
        }
        return result;
    }

    private byte[] getCharsAsBytes(String decompressed) {
        int length = decompressed.length();
        ByteBuffer buffer = ByteBuffer.allocate(length);
        for (int i = 0; i < length; i++) {
            buffer.put((byte) decompressed.codePointAt(i));
        }
        return buffer.array();
    }

    public String decompress(String data) {
        try {
            String[] intsAsString = data.split(",");
            ArrayList<Integer> integers = new ArrayList<Integer>();
            for (String anIntsAsString : intsAsString) {
                integers.add(Integer.parseInt(anIntsAsString));
            }
            String decompressed = decompressData(integers);
            return new String(getCharsAsBytes(decompressed), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return "";
    }
}
Summary:
The code for LZW algorithms is very small, fast, and simple. It lets you quickly compress the required data and decompress it without any problems. However, before using it you need to decide how the integer arrays will be transmitted between the compressor and the decompressor. Also, the compression ratio is not as good as Deflate's.
Previous Post: Integration of JS Deflate Compression with other platforms
Starting Post: Libraries and Test conditions
Next Post: TBD: Integration of LZMA compression with other platforms.