PRML Notes - 1.6 Information Theory

Chilly_Rain posted @ 2012年1月27日 17:19 in Pattern Recognition and Maching Learning with tags 信息论 , 2301 阅读

信息量和香农熵

一个变量取值的信息量可以看作是它带来的“使人惊讶的程度”,一个必然事件没有任何信息量,而一个极其偶然的事件的发生则会使人非常“惊讶”,因而包括大量信息。

自然地,信息量的概率就与变量的概率分布联系在了一起。香农熵(Shannon Entropy)成功表达了一个离散型变量所带来的平均信息量:

\[ H(x)=-\sum_xp(x)log_2p(x)\]

注意到$limit_{p->0} plnp=0$,因此计算某个变量的香农熵时只考虑非零取值即可。另外,香农熵是非负的。

无噪声编码定理:香农熵是传递一个变量状态所需要的比特数的下界。也就是说,在期望意义下,对一个变量的取值进行编码所需要的最小的比特数即为香农熵。一般情况下,香农熵对数的底取2。

对于一个概率分布,当概率集中于较少的某几个取值时(绝大多数情况下变量会取少数的几个值之一),香农熵的值会较低,相反地,如果概率在各种取值上比较平均(几乎无法判断变量会取哪个值),那么香农熵会较高。使用拉格朗日乘子法(约束概率分布的归一化)计算香农熵的最大值,可知当概率分布是均匀分布时,香农熵可取到最大值$H=lnM$,其中M为变量的状态总数(所有可能取值的个数)。因此,香农熵也可以看作是一个变量不确定度的度量。

物理上关于香农熵的解释:mulitplicity, microstate, macrostate, weight

连续型变量的微分熵

对于一个连续型变量,无法直接使用上面香农熵的定义。可以近似地对连续型变量的取值进行离散化,将整个取值范围划分成宽度为$\Delta$的小区域。均值定值告诉我们,在每个小区域内总存在一个值$x_i$,使得以下等式成立

$$\int_{i\Delta}^{(i+1)\Delta} p(x)dx=p(x_i)\Delta$$

因此,我们可以把每个落入第i个小区域的的点赋予$x_i$。这样,我们就可以套用离散型变量的香农熵公式

$$H_{\Delta}=-\sum_ip(x_i)\Delta ln(p(x_i)\Delta)=-\sum_ip(x_i)\Delta ln(p(x_i))-ln\Delta$$

而当$\Delta$趋近于0时,上式最右侧第二项趋近于0,而第一个项则趋近的表达式称为微分熵(differential entropy):

$$H(x)=-\int p(x)lnp(x)dx$$

仍然使用拉格朗日乘子法,约束均值和方差,以及概率分布的归一化,可知在均值和方差一定的情况下,使微分熵最大的概率分布为正态分布。而正态分布的微分熵表达式为

$$H(x)=\frac{1}{2}(1+ln(2\pi\sigma^2))$$

由以上的表达式可知,香农熵随着方差而增大。同时,我们也可以看出,与离散型变量的香农熵不同,微分熵可以是负的。

条件熵(conditional entropy)

$$H[y|x]=-\int \int p(x,y)lnp(y|x)dydx$$

$$H[x,y]=H[y|x]+H[x]$$

相对熵(relative entropy)

依然从编码角度来考虑,若一个变量的真实分布为$p(x)$,而我们实际上使用了$q(x)$来对这个变量进行编码,那么由此而使用了的多余的比特数定义为相对熵或者KL距离(Kullback-Leibler divergence)

$$KL(p||q)=-\int p(x)ln(q(x))dx - (-\int p(x)ln(p(x))dx) = -\int p(x)ln(\frac{q(x)}{p(x)})dx$$

注意到,虽然名为距离,但是KL距离(相对熵)没有对称性。另外,相对熵是非负的,当且仅当$p(x)=q(x)$时相对熵取零。其证明用到了以下内容:

凸函数定义为$f(\lambda a + (1-\lambda)b)<=\lambda f(a)+(1-\lambda)f(b)$。等价地,函数的二阶导数各处均非负。如果仅当$\lambda=0,1$时等号成立,那个这个函数称为严格凸函数。凸函数的相反数为凹函数。香农熵为凹函数。

简森不等式(Jensen's inequality)

$f(\sum_i \lambda_ix_i)<=\sum_i\lambda_if(x_i)$,其中$\lambda_i>=0, \sum_i\lambda_i=1$, $f(x)$为凸函数

如果$\lambda_i=p(x_i)$,那么有

  • 连续型:$$f(\int xp(x)dx)<=\int f(x)p(x)dx$$
  • 离散型:$$f(E[x])<=E[f(x)]$$

于是,可证相对熵的非负性

$$KL(p||q)=-\int p(x)ln(\frac{q(x)}{p(x)})dx >= -ln(\int q(x)dx)=0$$

其中$-lnx$严格凸函数,因而当且仅当$p=q$时取等号。

相对熵与似然函数的关系

假设未知真实分布为$p(x)$,我们希望使用一个参数模型$q(x|\theta)$结合N个观测数据来确定一个最优的$\theta$来模拟真实分布。一种自然的方法是使用KL距离做为误差函数,以最小化$p(x)$$q(x|\theta)$的KL距离为标准来确定最优的参数值。

$$KL(p||q)=\sum_i (-lnq(x_i|\theta)+lnp(x_i))$$

将上面的误差函数相对于参数$\theta$求导,可知:最小化KL距离等价于最大化似然函数

互信息(mutual information)

互信息描述了两个变量之间互相包含关于对方的信息量。定义为两个分布$p(x,y)$$p(x)p(y)$之间的KL距离

$$I(x, y)=KL(p(x,y)||p(x)p(y))=-\int \int p(x, y)ln(\frac{p(x)p(y)}{p(x,y)})dxdy$$

根据相对熵的非负性可知,互信息是非负的,当仅且当两个变量相互独立时互信息为零。

$$I(x, y)=H(x)-H(x|y)=H(y)-H(y|x)$$

由此可知,互信息可以看作,当已知一个变量的情况下,另一个变量不确定性降低的程度。

 

[终于搞定了第一章,春节假期任务完成^_^]

Avatar_small
NCERT Term 2 Sample 说:
2022年9月23日 17:25

In every school Generally Teachers will analyse students’ skills in a particular subject after completion of a certain amount of syllabus in the decided time. As part of this Analysis, they conduct Examinations regularly to test their student’s knowledge of the subject in the completed syllabus, that’s the way every 3rd class students also need to participate in Term 2 Exams for all subjects & languages of the course. In every school Generally Teachers will analyse students’ skills in a particular subject after completion of a certain amount of syllabus in the decided time. NCERT Term 2 Sample Paper Class 3 As part of this Analysis, they conduct Examinations regularly to test their student’s knowledge of the subject in the completed syllabus, that’s the way every 3rd class students also need to participate in Term 2 Exams for all subjects & languages of the course.

Avatar_small
Australia Public Hol 说:
2023年1月18日 18:45

Australia holidays are viewed differently with their State and county-wide celebrations, and there is different segregation that is shown as per the divided state and territory, and this list of public holidays is declared and is declared when the entire nation does go off from work. Australia Public Holidays 2023 These holidays are permanent holidays that are applicable to everyone who is working in certain states. National holidays are taken as most important, as every stream of working employees and companies does grant them leave which is for free.

Avatar_small
PNB Net Banking 说:
2023年1月28日 23:29

Punjab National Bank is a popular yet most preferred National Bank in India, and the Net Banking service for the bank allows its customers to utilize different services online without investing in the branch. PNB Net Banking As the modern days are invoking with the technology, the services from Banks through their Net Banking feature are in good spike, and there are multiple benefits for the customer who does get the access to the PNB Net Banking service.

Avatar_small
mutthu 说:
2023年4月02日 14:00

The Unified Multi-Purpose ID (UMID) is a Philippine identity card which was introduced in 2010. The card was developed as a single card for the relations between several government-related agencies. Previously, all the Indian Railways employees and pensioners who were part of the medical health and services had to carry a physical card/ health book by applying with railway employee application which is now replaced with a UMID card that can be easily accessed from Online browser or even your Smartphone app as well.In this article, we will be going over about Unique Medical Identity Card given by Indian Railways to its employees. We will also look into the registration, login, password retrieval, status check and process to download the UMID card as well.

Avatar_small
sample-questions-pa 说:
2023年7月01日 18:59

Our reporting team plans to release the Education & Recruitment Update for all age groups and provide inside coverage to show the real picture of current occurrences. We intend to publish news divided into General, Political, Crime, Sports, Entertainment, Education, and World News as an initiative of professionalsample-questions-paper.in writers who have come together for dedicated news coverage of recent events across the nation (India). Our goal is to meet the needs of people of all age groups.


登录 *


loading captcha image...
(输入验证码)
or Ctrl+Enter