This article is from the Biological Information Theory and Chowder Society FAQ, by Thomas D. Schneider toms@ncifcrf.gov.
If someone says that information = uncertainty = entropy, then they are
confused, or something was not stated that should have been. Those
equalities lead to a contradiction, since the entropy of a system increases
as the system becomes more disordered. Under this confusion, information
would correspond to disorder.
Always take information to be a decrease in uncertainty at the receiver and
you will get straightened out:
R = Hbefore - Hafter
where H is the Shannon uncertainty:
H = - sum (from i = 1 to number of symbols) Pi log2 Pi (bits per symbol)
and Pi is the probability of the ith symbol. If you don't understand this,
please refer to "Is There a Quick Introduction to Information Theory
Somewhere?".
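The two formulas can be tried out in a few lines of Python. This is only a sketch: the function name and both probability lists are made up for illustration (four equally likely symbols before receipt, one symbol strongly favored after receipt).

```python
import math

def shannon_uncertainty(probs):
    """H = -sum(Pi * log2(Pi)) in bits per symbol; terms with Pi == 0 add nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical numbers, not taken from any real channel.
h_before = shannon_uncertainty([0.25, 0.25, 0.25, 0.25])  # before receipt
h_after = shannon_uncertainty([0.97, 0.01, 0.01, 0.01])   # after receipt
r = h_before - h_after                                    # information gained

print(f"Hbefore = {h_before:.3f}, Hafter = {h_after:.3f}, R = {r:.3f} bits")
```

Note that R is smaller than Hbefore here because some uncertainty remains after receipt.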
Imagine that we are in communication and that we have agreed on an alphabet.
Before I send you a bunch of characters, you are uncertain (Hbefore) as to
what I'm about to send. After you receive a character, your uncertainty goes
down (to Hafter). Hafter is never zero because of noise in the communication
system. Your decrease in uncertainty is the information (R) that you gain.
Since Hbefore and Hafter are state functions, this makes R a function of
state. It allows you to lose information (it's called forgetting). You can
put information into a computer and then remove it in a cycle.
Many of the statements in the early literature assumed a noiseless channel,
so the uncertainty after receipt is zero (Hafter=0). This leads to the
SPECIAL CASE where R = Hbefore. But Hbefore is NOT "the uncertainty", it is
the uncertainty of the receiver BEFORE RECEIVING THE MESSAGE.
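To see the special case numerically, here is a small sketch. It assumes a simple channel model that is not part of the discussion above: two equally likely symbols, each flipped with a given error rate, so that the uncertainty left after receipt is the uncertainty of the pair (error rate, 1 - error rate). The function names are invented for this example.

```python
import math

def binary_uncertainty(p):
    """Uncertainty in bits of a two-outcome distribution (p, 1 - p)."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0)

def information_gained(error_rate):
    """R = Hbefore - Hafter under the assumed symbol-flipping channel."""
    h_before = binary_uncertainty(0.5)        # 1 bit before receipt
    h_after = binary_uncertainty(error_rate)  # uncertainty left after receipt
    return h_before - h_after

for e in (0.0, 0.01, 0.1, 0.5):
    print(f"error rate {e:4.2f}: R = {information_gained(e):.3f} bits")
```

Only at an error rate of zero does R equal Hbefore. At an error rate of 0.5 the received symbol tells you nothing, and R drops to zero even though Hbefore has not changed.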
A way to see this is to work out the information in a bunch of DNA binding
sites.
Definition of "binding": many proteins stick to certain special spots on DNA
to control genes by turning them on or off. The only thing that
distinguishes one spot from another spot is the pattern of letters
(nucleotide bases) there. How much information is required to define this
pattern?
Here is an aligned listing of the binding sites for the cI and cro proteins
of the bacteriophage (i.e., virus) named lambda:
alist 5.66 aligned listing of:
* 96/10/08 19:47:44, 96/10/08 19:31:56, lambda cI/cro sites
piece names from:
* 96/10/08 19:47:44, 96/10/08 19:31:56, lambda cI/cro sites
The alignment is by delila instructions
The book is from: -101 to 100
This alist list is from: -15 to 15
                      ------                   ++++++
                      111111--------- +++++++++111111
                      5432109876543210123456789012345
                      ...............................
OL1 J02459 35599 +  1 tgctcagtatcaccgccagtggtatttatgt
    J02459 35599 -  2 acataaataccactggcggtgatactgagca
OL2 J02459 35623 +  3 tttatgtcaacaccgccagagataatttatc
    J02459 35623 -  4 gataaattatctctggcggtgttgacataaa
OL3 J02459 35643 +  5 gataatttatcaccgcagatggttatctgta
    J02459 35643 -  6 tacagataaccatctgcggtgataaattatc
OR3 J02459 37959 +  7 ttaaatctatcaccgcaagggataaatatct
    J02459 37959 -  8 agatatttatcccttgcggtgatagatttaa
OR2 J02459 37982 +  9 aaatatctaacaccgtgcgtgttgactattt
    J02459 37982 - 10 aaatagtcaacacgcacggtgttagatattt
OR1 J02459 38006 + 11 actattttacctctggcggtgataatggttg
    J02459 38006 - 12 caaccattatcaccgccagaggtaaaatagt
                                     ^
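As a rough sketch of how R = Hbefore - Hafter could be applied to this listing, the Python below computes an information value for each of the 31 aligned positions. Two things here are assumptions rather than statements from the text: Hbefore is taken to be 2 bits (four equally likely bases), and no correction is made for the small sample of twelve sequences. The names used are invented for this example.

```python
import math
from collections import Counter

# The twelve aligned sequences from the listing, positions -15 to +15.
sites = [
    "tgctcagtatcaccgccagtggtatttatgt",
    "acataaataccactggcggtgatactgagca",
    "tttatgtcaacaccgccagagataatttatc",
    "gataaattatctctggcggtgttgacataaa",
    "gataatttatcaccgcagatggttatctgta",
    "tacagataaccatctgcggtgataaattatc",
    "ttaaatctatcaccgcaagggataaatatct",
    "agatatttatcccttgcggtgatagatttaa",
    "aaatatctaacaccgtgcgtgttgactattt",
    "aaatagtcaacacgcacggtgttagatattt",
    "actattttacctctggcggtgataatggttg",
    "caaccattatcaccgccagaggtaaaatagt",
]

H_BEFORE = 2.0  # assumption: four equally likely bases before binding

def uncertainty(column):
    """Shannon uncertainty (bits) of the base frequencies in one column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())

# zip(*sites) walks the alignment column by column.
info = [H_BEFORE - uncertainty(col) for col in zip(*sites)]

for pos, r in zip(range(-15, 16), info):
    print(f"{pos:+3d}  {r:.2f} bits")
print(f"total: {sum(info):.2f} bits per site (uncorrected)")
```

Positions where all twelve sequences carry the same base, such as the c at -5 and the g at +5, have Hafter = 0 and so reach the full 2 bits under these assumptions. Positions where the bases vary come out lower.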
 