Let $\Sigma_1=\{\mu_1,\dots,\mu_N\}$ be a finite alphabet and $X$ a random variable that takes each value $\mu_i$ with probability $p_i$. Define the entropy $H(X)$ of $X$.
Suppose $\Sigma_2=\{0,1\}$ and $c:\Sigma_1\to\Sigma_2^*$ is a decipherable code. Write down an expression for the expected word length $E(S)$ of $c$.
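For concreteness, here is a small Python sketch that evaluates the entropy and the expected word length for a dyadic example; the distribution and word lengths below are illustrative assumptions, not part of the question.

```python
import math

# Illustrative distribution and codeword lengths (assumptions for this sketch).
p = [0.5, 0.25, 0.125, 0.125]      # p_i = P(X = mu_i)
s = [1, 2, 3, 3]                   # word lengths of a decipherable code c

# Entropy: -sum_i p_i * log2(p_i); terms with p_i = 0 contribute 0.
H = -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Expected word length: sum_i p_i * s_i.
ES = sum(pi * si for pi, si in zip(p, s))

print(f"H(X) = {H}, E(S) = {ES}")  # both equal 1.75 for this dyadic example
```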
Prove that the minimum expected word length $S^*$ of a decipherable code $c:\Sigma_1\to\Sigma_2^*$ satisfies
$$H(X)\leqslant S^*<H(X)+1.$$
[You may use Kraft's and Gibbs' inequalities, provided they are clearly stated.]
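As a numerical illustration of both bounds (the distribution below is an assumption), one can take $s_i=\lceil-\log p_i\rceil$, check Kraft's inequality, and verify that the resulting expected length lies in $[H(X),H(X)+1)$:

```python
import math

# Illustrative (non-dyadic) distribution -- an assumption for this sketch.
p = [0.4, 0.3, 0.2, 0.1]
H = -sum(pi * math.log2(pi) for pi in p)

# The lengths s_i = ceil(-log2 p_i) satisfy Kraft's inequality, so a
# decipherable code with these lengths exists; its expected length is then
# an upper bound for S*, and it is strictly less than H(X) + 1.
s = [math.ceil(-math.log2(pi)) for pi in p]
kraft = sum(2.0 ** -si for si in s)
ES = sum(pi * si for pi, si in zip(p, s))

assert kraft <= 1.0
assert H <= ES < H + 1
print(f"H(X) = {H:.4f}, Kraft sum = {kraft:.4f}, E(S) = {ES:.4f}")
```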
Suppose a decipherable binary code has word lengths $s_1,\dots,s_N$. Show that
$$N\log N\leqslant s_1+\cdots+s_N.$$
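A quick numerical check of this bound for one set of word lengths satisfying Kraft's inequality (the length multiset is an assumed example):

```python
import math

# Word lengths of a decipherable binary code (an assumed example): Kraft's
# inequality holds, and N * log2(N) <= s_1 + ... + s_N as claimed.
s = [1, 2, 3, 3]
N = len(s)

assert sum(2.0 ** -si for si in s) <= 1.0     # Kraft sum = 1
assert N * math.log2(N) <= sum(s)             # 4 * 2 = 8 <= 9
```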
Suppose $X$ is a source that emits $N$ sourcewords $a_1,\dots,a_N$, and $p_i$ is the probability that $a_i$ is emitted, where $p_1\geqslant p_2\geqslant\cdots\geqslant p_N$. Let $b_1=0$ and $b_i=\sum_{j=1}^{i-1}p_j$ for $2\leqslant i\leqslant N$. Let $s_i=\lceil-\log p_i\rceil$ for $1\leqslant i\leqslant N$. Now define a code $c$ by $c(a_i)=b_i^*$, where $b_i^*$ is the (fractional part of the) binary expansion of $b_i$ to $s_i$ binary places. Prove that this defines a decipherable code.
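The construction can be sketched directly in Python, together with a prefix-freeness check; the helper name `shannon_code` and the example distribution are assumptions for this sketch only.

```python
import math
from fractions import Fraction
from itertools import combinations

def shannon_code(p):
    """c(a_i) = first s_i binary places of b_i = p_1 + ... + p_{i-1},
    with s_i = ceil(-log2 p_i); probabilities must be non-increasing."""
    assert all(p[i] >= p[i + 1] for i in range(len(p) - 1))
    codewords, b = [], Fraction(0)
    for p_i in p:
        s_i = math.ceil(-math.log2(p_i))
        bits, x = [], b
        for _ in range(s_i):               # binary expansion of b_i, truncated
            x *= 2
            if x >= 1:
                bits.append("1")
                x -= 1
            else:
                bits.append("0")
        codewords.append("".join(bits))
        b += p_i                           # b_{i+1} = b_i + p_i
    return codewords

# Illustrative distribution, written with exact fractions (an assumption).
p = [Fraction(2, 5), Fraction(3, 10), Fraction(1, 5), Fraction(1, 10)]
code = shannon_code(p)
print(code)                                # ['00', '01', '101', '1110']

# No codeword is a prefix of another, so the code is prefix-free,
# hence decipherable.
assert all(not u.startswith(v) and not v.startswith(u)
           for u, v in combinations(code, 2))
```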
What does it mean for a code to be optimal? Is the code $c$ defined in the previous paragraph in terms of the $b_i^*$ necessarily optimal? Justify your answer.
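One way to probe this numerically (the two-symbol distribution below is an illustrative assumption) is to compare the expected word length of the code built from the $b_i^*$ with that of the obvious two-word code:

```python
import math

# With p = (0.99, 0.01) -- an illustrative assumption -- the construction
# above gives lengths ceil(-log2 p_i) = (1, 7), so E(S) = 0.99 + 0.07 = 1.06,
# while the decipherable code a_1 -> 0, a_2 -> 1 has E(S) = 1.
p = [0.99, 0.01]
s = [math.ceil(-math.log2(pi)) for pi in p]
ES = sum(pi * si for pi, si in zip(p, s))
print(s, round(ES, 4))                     # [1, 7] 1.06
```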