B1.14

Information Theory
Part II, 2001

Let $p_{1}, \ldots, p_{n}$ be a probability distribution, with $p^{*}=\max_{i}\left[p_{i}\right]$. Prove that

$$
\begin{aligned}
&(\text{i}) \quad -\sum_{i} p_{i} \log p_{i} \geqslant -p^{*} \log p^{*}-\left(1-p^{*}\right) \log \left(1-p^{*}\right); \\
&(\text{ii}) \quad -\sum_{i} p_{i} \log p_{i} \geqslant \log \left(1 / p^{*}\right); \text{ and} \\
&(\text{iii}) \quad -\sum_{i} p_{i} \log p_{i} \geqslant 2\left(1-p^{*}\right).
\end{aligned}
$$

All logarithms are to base 2.

[Hint: To prove (iii), it is convenient to use (i) for $p^{*} \geqslant \frac{1}{2}$ and (ii) for $p^{*} \leqslant \frac{1}{2}$.]
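As a quick numerical sanity check of (i)–(iii), not part of the question itself, the sketch below samples random distributions with NumPy (my choice of tooling) and verifies the three lower bounds on the entropy.

```python
# Sanity check of inequalities (i)-(iii); illustrative only, not a proof.
# All logarithms are to base 2, matching the statement above.
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    """Shannon entropy -sum p_i log2 p_i, treating 0 log 0 as 0."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

for _ in range(10_000):
    n = rng.integers(2, 8)
    p = rng.dirichlet(np.ones(n))      # a random probability distribution
    H = entropy(p)
    p_star = p.max()

    # (i) H >= binary entropy of (p*, 1 - p*)
    assert H >= entropy(np.array([p_star, 1 - p_star])) - 1e-9
    # (ii) H >= log2(1 / p*)
    assert H >= np.log2(1 / p_star) - 1e-9
    # (iii) H >= 2 (1 - p*)
    assert H >= 2 * (1 - p_star) - 1e-9

print("inequalities (i)-(iii) hold on all sampled distributions")
```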

Random variables $X$ and $Y$ with values $x$ and $y$ from finite 'alphabets' $I$ and $J$ represent the input and output of a transmission channel, with the conditional probability $p(x \mid y)=\mathbb{P}(X=x \mid Y=y)$. Let $h(p(\cdot \mid y))$ denote the entropy of the conditional distribution $p(\cdot \mid y)$, $y \in J$, and $h(X \mid Y)$ denote the conditional entropy of $X$ given $Y$. Define the ideal observer decoding rule as a map $f: J \rightarrow I$ such that $p(f(y) \mid y)=\max_{x \in I} p(x \mid y)$ for all $y \in J$. Show that under this rule the error probability

$$
\pi_{\mathrm{er}}(y)=\sum_{\substack{x \in I \\ x \neq f(y)}} p(x \mid y)
$$

satisfies $\pi_{\mathrm{er}}(y) \leqslant \frac{1}{2} h(p(\cdot \mid y))$, and that the expected value satisfies

$$
\mathbb{E}\,\pi_{\mathrm{er}}(Y) \leqslant \tfrac{1}{2} h(X \mid Y).
$$
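Again as an illustration rather than part of the question, the sketch below applies the ideal observer rule to an arbitrary joint distribution for $(X, Y)$ (the matrix is made up for the example) and checks both bounds numerically.

```python
# Sanity check of the ideal observer bounds on an illustrative channel.
import numpy as np

def entropy(p):
    """Shannon entropy in bits, treating 0 log 0 as 0."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Joint distribution P(X=x, Y=y) over I = {0,1,2} (rows) and J = {0,1} (columns).
joint = np.array([[0.30, 0.05],
                  [0.10, 0.25],
                  [0.05, 0.25]])
p_y = joint.sum(axis=0)              # marginal of Y
post = joint / p_y                   # column y holds p(. | y)

h_X_given_Y = 0.0
expected_err = 0.0
for y in range(joint.shape[1]):
    p_xy = post[:, y]
    f_y = np.argmax(p_xy)            # ideal observer decoding rule f(y)
    err = 1.0 - p_xy[f_y]            # pi_er(y) = sum over x != f(y) of p(x|y)
    h_y = entropy(p_xy)
    assert err <= 0.5 * h_y + 1e-12  # pi_er(y) <= (1/2) h(p(.|y))
    expected_err += p_y[y] * err
    h_X_given_Y += p_y[y] * h_y

# E pi_er(Y) <= (1/2) h(X|Y)
assert expected_err <= 0.5 * h_X_given_Y + 1e-12
print(f"E pi_er(Y) = {expected_err:.4f} <= {0.5 * h_X_given_Y:.4f} = h(X|Y)/2")
```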