Harbin Institute of Technology, Shenzhen: Pattern Recognition, Key Exam Knowledge Points

Let λ(αi | ωj) be the loss incurred for taking action αi when the state of nature is ωj; action αi assigns the sample to class i. The conditional risk is

    R(αi | x) = Σj λ(αi | ωj) P(ωj | x),  i = 1, ..., a.

Select the action αi for which R(αi | x) is minimum. The overall risk R is then minimum, and R in this case is called the Bayes risk: the best reasonable result that can be achieved! Writing λij for the loss incurred for deciding ωi when the true state of nature is ωj, the discriminant functions are:

    gi(x) = -R(αi | x): the maximum discriminant corresponds to minimum risk;
    gi(x) = P(ωi | x): the maximum discriminant corresponds to the maximum posterior;
    gi(x) ∝ p(x | ωi) P(ωi), or equivalently gi(x) = ln p(x | ωi) + ln P(ωi).

The problem thus turns from estimating the likelihood density into estimating the parameters of a normal distribution. Maximum likelihood estimation and Bayesian estimation give nearly identical results, but the two methods differ in concept.
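
As a concrete illustration of the minimum-risk rule above, here is a minimal Python sketch; the loss matrix and posterior values are made-up illustrative numbers, not taken from the notes:

    import numpy as np

    # lam[i, j] = loss lambda(a_i | w_j) for taking action a_i
    # when the true state of nature is w_j (illustrative values).
    lam = np.array([[0.0, 2.0],
                    [1.0, 0.0]])

    # Posterior probabilities P(w_j | x) for one sample (illustrative).
    post = np.array([0.3, 0.7])

    # Conditional risk R(a_i | x) = sum_j lam[i, j] * P(w_j | x).
    risk = lam @ post

    # Bayes decision: take the action with minimum conditional risk.
    action = np.argmin(risk)
    print(risk, "-> choose action", action)   # [1.4 0.3] -> action 1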

Please present the basic ideas of the maximum likelihood estimation method and the Bayesian estimation method. When do these two methods have similar results?
I. Maximum likelihood views the parameters as quantities whose values are fixed but unknown. The best estimate of their value is defined to be the one that maximizes the probability of obtaining the samples actually observed.
II. Bayesian methods view the parameters as random variables having some known prior distribution. Observation of the samples converts this to a posterior density, thereby revising our opinion about the true values of the parameters.
III. Under the condition that the number of training samples approaches infinity, the estimate of the mean obtained using the Bayesian estimation method is almost identical to that obtained using the maximum likelihood estimation method.

The minimum-risk decision usually has a lower classification accuracy than the minimum-error Bayesian decision; however, the minimum-risk decision can avoid possible high risks and losses.

Bayesian parameter estimation method; PCA-based recognition procedure (a code sketch follows the list):
1. Vectorize the samples.
2. Compute the mean of all training samples.
3. Compute the covariance matrix.
4. Compute the eigenvectors and eigenvalues of the covariance matrix; build the feature space.
5. Extract features of all samples: compute the feature value of every sample.
6. Compute the feature value of the test sample.
7. Compute the features of the training samples in the same way as the above step.
8. Find the nearest training sample and take it as the result.
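
A minimal Python sketch of the listed procedure, assuming the samples arrive as rows of a matrix X; all function and variable names here are illustrative:

    import numpy as np

    def pca_fit(X, k):
        """Steps 1-4: mean, covariance, eigen-decomposition, feature space."""
        mean = X.mean(axis=0)
        cov = np.cov(X - mean, rowvar=False)
        vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
        axes = vecs[:, ::-1][:, :k]           # top-k principal axes
        return mean, axes

    def pca_transform(X, mean, axes):
        """Steps 5-7: project samples onto the feature space."""
        return (X - mean) @ axes

    def classify(x_test, X_train, y_train, mean, axes):
        """Step 8: return the label of the nearest training sample."""
        f_train = pca_transform(X_train, mean, axes)
        f_test = pca_transform(x_test[None, :], mean, axes)
        nearest = np.argmin(np.linalg.norm(f_train - f_test, axis=1))
        return y_train[nearest]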

Exercises

1. How to use the prior and the likelihood to calculate the posterior? What is the formula?

    P(ωj | x) = p(x | ωj) · P(ωj) / p(x)

2. What is the difference in the ideas of the minimum-error Bayesian decision and the minimum-risk Bayesian decision? What is the condition that makes the minimum-error Bayesian decision identical to the minimum-risk Bayesian decision?
Answer: The minimum-error Bayesian decision minimizes the classification error of the Bayesian decision; the minimum-risk Bayesian decision minimizes the risk of the Bayesian decision. In a two-class problem, if λ11 = λ22 = 0 and λ12 = λ21, the so-called symmetric loss function, then the minimum-risk Bayesian decision and the minimum-error Bayesian decision are clearly identical. If R(α1 | x) < R(α2 | x), action α1 ("decide ω1") is taken, where

    R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)
    R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)

3. A person takes a lab test for nuclear radiation and the result is positive. The test returns a correct positive result in 99% of the cases in which the nuclear radiation is actually present, and a correct negative result in 95% of the cases in which the nuclear radiation is not present. Furthermore, 3% of the entire population are radioactively contaminated. Is this person contaminated?
Answer: Let ω1 denote "contaminated", ω2 denote "not contaminated", and let x denote a positive test result. Then P(ω1) = 0.03, P(ω2) = 0.97, p(x | ω1) = 0.99, p(x | ω2) = 0.05, and by the Bayes formula

    P(ω1 | x) = 0.99 × 0.03 / (0.99 × 0.03 + 0.05 × 0.97) ≈ 0.38
    P(ω2 | x) = 0.05 × 0.97 / (0.99 × 0.03 + 0.05 × 0.97) ≈ 0.62

According to the Bayesian decision rule, since P(ω2 | x) > P(ω1 | x), this person is not contaminated.
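
The same computation in Python, for checking the arithmetic (variable names are mine):

    # Bayes rule for the radiation example: w1 = contaminated, w2 = clean.
    p_w1, p_w2 = 0.03, 0.97
    p_pos_w1, p_pos_w2 = 0.99, 0.05     # P(positive | w1), P(positive | w2)

    p_pos = p_pos_w1 * p_w1 + p_pos_w2 * p_w2    # total probability p(x)
    post_w1 = p_pos_w1 * p_w1 / p_pos            # ~0.38
    post_w2 = p_pos_w2 * p_w2 / p_pos            # ~0.62
    print(post_w1, post_w2)  # post_w2 > post_w1 -> decide "not contaminated"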

4. Please present the basic ideas of the maximum likelihood estimation method and the Bayesian estimation method. When do these two methods have similar results?
Answer:
I. Given a sample set, Bayesian estimation seeks an estimator of a true parameter of the underlying population distribution such that the resulting Bayes risk is minimized. (Put another way: treat the parameter to be estimated as a random variable obeying some prior distribution; the process of observing the samples converts the prior density into a posterior density, so the sample information revises the initial estimate of the parameter.)
II. The idea of maximum likelihood estimation is simple: given the observed experimental outcome, find the parameter value that makes this outcome most likely and take it as the estimate of the true parameter.
III. When the number of training samples approaches infinity, the estimate of the mean obtained using the Bayesian estimation method is almost identical to that obtained using the maximum likelihood estimation method.
Aside: prior + samples.

5. Please present the nature of principal component analysis.
Answer: Principal component analysis (also called principal factor analysis) uses the idea of dimensionality reduction to turn many indicators into a few comprehensive indicators. PCA captures the component that varies the most, and the component that varies the most contains the main information of the samples. We also say that PCA is the optimal representation method, which allows us to obtain the minimum reconstruction error. As the transform axes of PCA are orthogonal, it is also referred to as an orthogonal transform method. PCA is also a de-correlation method. PCA can also be used as a compression method and is able to obtain a high compression ratio.

6. Describe the basic idea and possible advantage of Fisher discriminant analysis.
Answer: The Fisher criterion is a classical pattern recognition method. It interprets the product of the normal vector of a linear method with a sample as the projection of the sample vector onto the unit normal vector. The result it obtains is similar to the Bayesian decision for normal distributions with equal covariance matrices, which shows that if the two class distributions really are concentrated around their respective means, the Fisher criterion can achieve a small error rate. It is supervised: it maximizes the between-class distance while minimizing the within-class distance, and it exploits the training samples to produce the transform axes. (The number of effective Fisher transform axes is c - 1; a singular within-class scatter matrix can be avoided by PCA + FDA. A code sketch follows.)
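
A minimal two-class sketch of the Fisher direction, assuming (as the note above suggests) that the within-class scatter matrix is nonsingular; otherwise apply PCA first:

    import numpy as np

    def fisher_direction(X1, X2):
        """Two-class Fisher: w = Sw^-1 (m1 - m2), samples as rows."""
        m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
        # Within-class scatter matrix Sw = S1 + S2.
        S1 = (X1 - m1).T @ (X1 - m1)
        S2 = (X2 - m2).T @ (X2 - m2)
        Sw = S1 + S2
        w = np.linalg.solve(Sw, m1 - m2)
        return w / np.linalg.norm(w)   # unit normal vector

    # Projecting a sample x onto the Fisher axis: score = w @ x.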

7. What is the K-nearest-neighbor classifier? Is it reasonable?
Answer: The basic idea of the nearest-neighbor method is to assign the test sample x to the class that appears most often among its k nearest neighbors: first find the class of each of the k nearest neighbors of x, then decide the class of x by majority. The k-nearest-neighbor rule thus classifies x by assigning it the label most frequently represented among the k nearest samples; in other words, a decision is made by examining the labels of the k nearest neighbors and taking a vote (a code sketch follows). One caveat: when the samples are relatively sparse, deciding the class of x from the k nearest samples alone, without considering the differences in their distances, is inappropriate, especially when k is large.
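
A minimal Python sketch of the voting rule (names are illustrative):

    import numpy as np
    from collections import Counter

    def knn_classify(x, X_train, y_train, k=5):
        """Label x by majority vote among its k nearest training samples."""
        dist = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
        nearest = np.argsort(dist)[:k]               # indices of k nearest
        vote = Counter(y_train[i] for i in nearest)
        return vote.most_common(1)[0][0]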

8. Is it possible that a classifier can obtain a higher accuracy on any dataset than any other classifier?
Answer: Obviously this is impossible, for many reasons. No.

9. Please describe the over-fitting problem.
Answer: Making the hypothesis excessively complex in order to obtain a consistent hypothesis is called over-fitting. Imagine a learning algorithm that produces an over-fitted classifier: it classifies the training samples with 100% accuracy (given any document from the training set, it never errs), but precisely in order to classify the training data perfectly, its construction becomes so intricate and its rules so strict that it judges any document differing even slightly from the training data as not belonging to the class. In short, an over-fitted classifier divides too finely and too specifically. Over-fitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. A model which has been over-fit will generally have poor predictive performance, as it can exaggerate minor fluctuations in the data.

10. Usually a more complex learning algorithm can obtain a higher accuracy in the training stage. So, should a more complex learning algorithm be favored?
Answer: No. There are no context-independent or usage-independent reasons to favor one learning or classification method over another to obtain good generalization performance. When confronting a new pattern recognition problem, we need to focus on the prior information, the data distribution, the amount of training data, and the cost or reward functions. The Ugly Duckling Theorem, an analogous theorem that addresses features and patterns, shows that in the absence of assumptions we should not prefer any learning or classification algorithm over another.

11. Under the condition that the number of training samples approaches infinity, the estimate of the mean obtained using the Bayesian estimation method is almost identical to that obtained using the maximum likelihood estimation method. Is this statement correct?
Answer: Yes; the reasoning is the same as in Question 4.

12. Can the minimum squared error procedure be used for binary classification?
Answer: Yes, the minimum squared error procedure, solving Ya = b, can be used for binary classification. A simple way to set b: if sample yi is from the first class, set bi to 1; if yi is from the second class, set bi to -1. Another simple way to set b (with n1 and n2 samples out of n in the two classes): if yi is from the first class, set bi to n/n1; if yi is from the second class, set bi to -n/n2. A code sketch follows.
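
A minimal sketch of the pseudoinverse solution, under the assumption that rows of Y are augmented samples [1, x1, ..., xd] (names are illustrative):

    import numpy as np

    def mse_train(Y, b):
        """Solve Ya = b in the least-squares sense: a = pinv(Y) b."""
        return np.linalg.pinv(Y) @ b

    def mse_predict(Y, a):
        """Positive discriminant -> class 1, negative -> class 2."""
        return np.where(Y @ a >= 0, 1, 2)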

13. Can you devise a minimum squared error procedure to perform multiclass classification?
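
The notes leave this exercise open. One common construction, sketched here as an assumption rather than as the notes' own answer, replaces the target vector b with a one-of-c target matrix B, giving one weight vector per class:

    import numpy as np

    def mse_multiclass_train(Y, labels, c):
        """B[i, j] = 1 if sample i belongs to class j, else 0."""
        B = np.zeros((Y.shape[0], c))
        B[np.arange(Y.shape[0]), labels] = 1.0   # labels in {0, ..., c-1}
        return np.linalg.pinv(Y) @ B             # one column per class

    def mse_multiclass_predict(Y, A):
        return np.argmax(Y @ A, axis=1)          # largest discriminant wins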

14. Which kind of applications is the Markov model suitable for?
Answer: The Markov model has found greatest use in problems such as speech recognition or gesture recognition. Its three canonical problems are the evaluation problem, the decoding problem, and the learning problem. (A sketch of the evaluation problem follows.)
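
As an illustration of the evaluation problem, here is a minimal forward-algorithm sketch for a hidden Markov model; the matrix names and conventions are assumptions, not from the notes:

    import numpy as np

    def forward(A, B, pi, obs):
        """P(observation sequence | model) by the forward algorithm.
        A[i, j] = P(state j | state i), B[i, k] = P(symbol k | state i),
        pi[i] = P(initial state i), obs = list of symbol indices."""
        alpha = pi * B[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
        return alpha.sum()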

15. For the minimum squared error procedure based on Ya = b (Y is the matrix consisting of all the training samples), if we have a proper b and criterion function, then this minimum squared error procedure might be equivalent to Fisher discriminant analysis. Is this presentation correct?
Answer: Yes, for a proper choice of b; see p. 198 of the Chinese textbook (p. 289 of the English PDF), Section 5.8.2, and the lecture slides on Docin.

16. Suppose that the number of training samples approaches infinity; then the minimum-error Bayesian decision will perform better than any other classifier, achieving a lower classification error rate. Do you agree with this?
Answer: To be determined. (Note that the minimum-error Bayesian decision attains the minimum achievable error rate, so no classifier can obtain a lower error rate, although another classifier may match it.)

17. What are the upper and lower bounds of the classification error rate of the K-nearest-neighbor classifier?
Answer: The error rate of the k-nearest-neighbor rule differs for different k. For k = 1, the nearest-neighbor case, the lower and upper bounds are the Bayes error rate P* and P*(2 - c/(c-1) · P*) ≤ 2P* respectively, where c is the number of classes. As k increases, the upper bound gradually approaches the lower bound, the Bayes error rate P*. As k approaches infinity, the error rate of the k-nearest-neighbor rule converges to the Bayes error rate P*.
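
A small helper for checking these bounds numerically (the function name is mine):

    def nn_error_bounds(p_star, c):
        """Asymptotic 1-NN bounds for a c-class problem."""
        lower = p_star
        upper = p_star * (2 - c / (c - 1) * p_star)   # <= 2 * p_star
        return lower, upper

    print(nn_error_bounds(0.1, 2))   # (0.1, 0.18)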
