# Shannon

The creator of Information Theory

# On Channel Capacity

In this post I will discuss channel capacity. The question I am going to address here is: at what rate can data be transmitted through a channel? A channel is defined as a probability map ${P_{Y|X}(y|x)}$; that is, given an input, with what probability does it produce each possible output. Here, X is our input and Y is the output of the channel. Let us take an example.

The Noisy Typewriter: Consider a funny typewriter which behaves in the following manner. When you press the key A, it writes the letter A on the paper with probability 1/2 or the letter B with probability 1/2. When you press the key B, it writes the letter B with probability 1/2 or the letter C with probability 1/2, and so on for each of the letters as shown in the figure below (for Z it will write the letter A with probability 1/2, wrapping around).

Consider the following problem: Alice wants to send a message to Bob (who knows the channel characteristics, i.e. ${P_{Y|X}(y|x)}$). Alice types the message on a sheet of paper using the funny typewriter described above. Bob is then given the sheet of paper on which the message was typed. Can he determine which letters were typed to produce it? Clearly, if Alice used all 26 keys to write the message, then Bob cannot determine which key strokes (inputs) produced the result on the paper. For example, if the letter on the paper is ‘B’, it could have been produced by typing either ‘A’ or ‘B’.

We now ask the following question: is there a way in which Alice can use this typewriter so that Bob can determine the typed message without any error? What if Alice restricts herself to typing messages using only the letters in the set ${\chi = \{A, C, E, \ldots, Y\}}$? If Bob is given a message produced from the keys in ${\chi}$, then he can tell which keys were used to produce it, since each output symbol now maps back to a single input. The same is true for the set ${\chi^{c} = \{B, D, \ldots, Z\}}$. By restricting the set of input characters, Alice has made the sets of possible outputs disjoint: only the letter A can cause the outputs A and B (see the above figure), and no other letter typed from the set ${\chi}$ can cause these outputs.

In information theory, information is generally measured in bits. A set of 13 letters can be represented by ${log_{2}13}$ bits. Since the bits representing the letters from the set ${\chi}$ can be sent and then decoded without error, we say that Alice can communicate with Bob at a rate of ${log_{2}13}$ bits per channel use (here, per keystroke). Shannon answered this question in his paper “A Mathematical Theory of Communication”.
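The noisy typewriter and the zero-error codebook described above can be simulated in a few lines. This is a sketch I am adding for illustration; the names `channel`, `decode`, and `CODEBOOK` are my own, not from Shannon's paper.

```python
import random

# The 26 letters A..Z.
ALPHABET = [chr(ord('A') + i) for i in range(26)]

def channel(key):
    """Noisy typewriter: key i prints letter i or letter (i+1) mod 26,
    each with probability 1/2."""
    i = ALPHABET.index(key)
    return ALPHABET[(i + random.choice([0, 1])) % 26]

# Alice restricts herself to every other key: A, C, E, ..., Y (13 symbols).
CODEBOOK = ALPHABET[::2]

def decode(output):
    # An output y can only have come from y itself (if y is in the codebook)
    # or from the key just before it, so Bob decodes without error.
    if output in CODEBOOK:
        return output
    return ALPHABET[(ALPHABET.index(output) - 1) % 26]

message = "ACE"
received = "".join(channel(k) for k in message)
decoded = "".join(decode(y) for y in received)
assert decoded == message  # holds for every run, despite the noise
```

The assertion never fails: whatever the coin flips inside `channel` do, each received letter maps back to exactly one codebook key.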

Let us look at the problem in a more general setting. Alice wants to send some message to Bob.

1. Alice uses the input alphabet ${\mathcal{X}}$.
2. Bob receives the output from the output alphabet ${\mathcal{Y}}$.
3. Both Alice and Bob know the channel characteristics.

At what rate can Alice send information to Bob so that Bob can decode the message without any error? To devise such a coding scheme, Alice must use a set of codewords (strings over the alphabet ${\mathcal{X}}$) such that Bob can map each output he receives back to a single input. This means the rate at which Alice can transmit information is ${log_{2}C}$, where ${C}$ is the number of codewords Alice can create from the alphabet ${\mathcal{X}}$ such that the sets of outputs Bob can receive are disjoint across codewords.
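The search for inputs with disjoint output sets can be sketched for small alphabets. Below is a simple greedy sketch of my own (not Shannon's construction, and not optimal for every channel: maximizing the set is an independent-set problem on the channel's confusion graph). It represents a channel as a dict mapping each input symbol to the set of outputs it can produce.

```python
import math

def zero_error_inputs(channel):
    """Greedily pick inputs whose possible-output sets are pairwise
    disjoint; any message over these inputs decodes without error."""
    chosen, covered = [], set()
    for x, outputs in channel.items():
        if covered.isdisjoint(outputs):
            chosen.append(x)
            covered |= outputs
    return chosen

# The noisy typewriter from above: key i can print letter i or i+1 (mod 26).
typewriter = {chr(ord('A') + i): {chr(ord('A') + i),
                                  chr(ord('A') + (i + 1) % 26)}
              for i in range(26)}

inputs = zero_error_inputs(typewriter)   # A, C, E, ..., Y
rate = math.log2(len(inputs))            # log2(13) bits per channel use
```

For the noisy typewriter the greedy pass happens to recover the full 13-symbol set ${\chi}$, giving a rate of ${log_{2}13}$ bits per channel use.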