# LZW Algorithm

What is LZW Algorithm?

    LZW Algorithm is a universal Lossless
data compression algorithm created by Abraham Lempel , Jacob Ziv and Terry Welch.

This Algorithm is simple to implement and has the potential for very high Throughput in hardware
implementations.

LZW can be made really fast; it grabs a fixed number of bits from input stream, so bit parsing is
very easy. Table lookup is automatic.

[LZW] is a data compression method that takes
advantage of this repetition. The original version of
the method was created by Lempel and Ziv in 1978 (LZ78) and was further refined by Welch in 1984,
hence the LZW acronym. Like any adaptive/dynamic compression method, the idea is to
(2) read data piece by piece,
(3) and update the model and encode the data as you go along.
LZW is a "dictionary"-based compression algorithm. This means that instead of tabulating character counts
and building trees (as for Huffman encoding), LZW encodes data by referencing a dictionary. Thus, to encode
a substring, only a single code number, corresponding to that substring's index in the dictionary, needs to
be written to the output file. Although LZW is often explained in the context of compressing text files, it
can be used on any type of file. However, it generally performs best on files with repeated substrings, such as
text files.


##Here I am trying to demonstrate example of compression and decompression of string 'banana_bandana' using LZW algorithm Now trying to create dictionary with index of above individual strings.

Index Entry
0 a
1 b
2 d
3 n
4 _

# Compression

## Pseudo code

string s;
char ch;
...

s = empty string;
while (there is still data to be read)
{
if (dictionary contains s+ch)
{
s = s+ch;
}
else
{
encode s to output file;
s = ch;
}
}
encode s to output file;


## code

    # Building and initializing the dictionary.
dictionary_size = 256
dictionary = {chr(i): i for i in range(dictionary_size)}
string = ""             # String is null.
compressed_data = []    # variable to store the compressed data.

# iterating through the input symbols.
# LZW Compression algorithm
for symbol in data:
string_plus_symbol = string + symbol # get input symbol.
if string_plus_symbol in dictionary:
string = string_plus_symbol
else:
compressed_data.append(dictionary[string])
if(len(dictionary) <= maximum_table_size):
dictionary[string_plus_symbol] = dictionary_size
dictionary_size += 1
string = symbol

if string in dictionary:
compressed_data.append(dictionary[string])


## Example

Input Current String Encoded output New Dictionary Index
b b Nothing none none
ba ba 1 ba 5
ban an 1,0 an 6
bana na 1,0,3 na 7
banan an - - -
banana ana 1,0,3,6 ana 8
banana_ a_ 1,0,3,6,0 a_ 9
banana_b _b 1,0,3,6,0,4 _b 10
banana_ba ba - - -
banana_ban ban 1,0,3,6,0,4,5 ban 11
banana_band nd 1,0,3,6,0,4,5,3 nd 12
banana_banda da 1,0,3,6,0,4,5,3,2 da 13
banana_bandan an - - -
banana_bandana ana 1,0,3,6,0,4,5,3,2,8 - -

# Decompression

## Pseudocode

string entry;
char ch;
int prevcode, currcode;
...

prevcode = read in a code;
decode/output prevcode;
while (there is still data to read)
{
currcode = read in a code;
entry = translation of currcode from dictionary;
output entry;
ch = first char of entry;
add ((translation of prevcode)+ch) to dictionary;
prevcode = currcode;
}


##code

    maximum_table_size = pow(2,int(n))
file = open(input_file, "rb")
compressed_data = []
next_code = 256
decompressed_data = ""
string = ""

while True:
if len(rec) != 2:
break
(data, ) = unpack('>H', rec)
compressed_data.append(data)

# Building and initializing the dictionary.
dictionary_size = 256
dictionary = dict([(x, chr(x)) for x in range(dictionary_size)])

# iterating through the codes.
# LZW Decompression algorithm
for code in compressed_data:
if not (code in dictionary):
dictionary[code] = string + (string[0])
decompressed_data += dictionary[code]
if not(len(string) == 0):
dictionary[next_code] = string + (dictionary[code][0])
next_code += 1
string = dictionary[code]


## Example

Encoded input Dictionary Translation Decoded Input Current String New Dictionary Index
1 1=b b - none none
1,0 0=a ba b ba 5
1,0,3 3=n ban a an 6
1,0,3,6 6=an banan n na 7
1,0,3,6,0 0=a banana an ana 8
1,0,3,6,0,4 4=_ banana_ a a_ 8
1,0,3,6,0,4 5=ba banana_ba _ _b 10
1,0,3,6,0,4,5,3 3=n banana_ban ba ban 11
1,0,3,6,0,4,5,3 2=d banana_band n nd 12
1,0,3,6,0,4,5,3,2,8 8=ana banana_bandana d da 13