Please use this identifier to cite or link to this item:

Huffman coding with unequal letter costs

Authors Golin, Mordecai J. View this author's profile
Kenyon, Claire
Young, Neal E.
Issue Date 2002
Summary In the standard Huffman coding problem, one is given a set of words and for each word a positive frequency. The goal is to encode each word w as a codeword c(w) over a given alphabet. The encoding must be prefix free (no codeword is a prefix of any other) and should minimize the weighted average codeword size ∑w freq(w) \c(w)\. The problem has a well-known polynomial-time algorithm due to Huffman [15]. Here we consider the generalization in which the letters of the encoding alphabet may have non-uniform lengths. The goal is to minimize the weighted average codeword length ∑wfreq(w) cost(c(w)), where cost(s) is the sum of the (possibly non-uniform) lengths of the letters in s. Despite much previous work, the problem is not known to be NP-hard, nor was it previously known to have a polynomial-time approximation algorithm. Here we describe a polynomial-time approximation scheme (PTAS) for the problem.
Language English
Format Technical report
Files in this item:
File Description Size Format
200202.pdf 127568 B Adobe PDF