Stanford University Machine Learning: Complete Problems and Answers Collection
CS229 Machine Learning (Problems and Answers), Stanford University

Contents
(1) Assignment 1 (Supervised Learning)
(2) Assignment 1 Solutions (Supervised Learning)
(3) Assignment 2 (Kernels, SVMs, and Theory)
(4) Assignment 2 Solutions (Kernels, SVMs, and Theory)
(5) Assignment 3 (Learning Theory and Unsupervised Learning)
(6) Assignment 3 Solutions (Learning Theory and Unsupervised Learning)
(7) Assignment 4 (Unsupervised Learning and Reinforcement Learning)
(8) Assignment 4 Solutions (Unsupervised Learning and Reinforcement Learning)
(9) Problem Set #1: Supervised Learning
(10) Problem Set #1 Answer
(11) Problem Set #2: Naive Bayes, SVMs, and Theory
(12) Problem Set #2 Answer

CS229, Public Course

Problem Set #1:
Supervised Learning

1. Newton's method for computing least squares

In this problem, we will prove that if we use Newton's method to solve the least squares optimization problem, then we only need one iteration to converge to $\theta^*$.

(a) Find the Hessian of the cost function $J(\theta) = \frac{1}{2}\sum_{i=1}^{m}\left(\theta^T x^{(i)} - y^{(i)}\right)^2$.

(b) Show that the first iteration of Newton's method gives us $\theta^* = (X^T X)^{-1} X^T y$, the solution to our least squares problem.

2. Locally-weighted logistic regression

In this problem you will implement a locally-weighted version of logistic regression, where we weight different training examples differently according to the query point. The locally-weighted logistic regression problem is to maximize

$$\ell(\theta) = -\frac{\lambda}{2}\theta^T\theta + \sum_{i=1}^{m} w^{(i)}\left[y^{(i)}\log h_\theta(x^{(i)}) + (1 - y^{(i)})\log(1 - h_\theta(x^{(i)}))\right].$$

The $-\frac{\lambda}{2}\theta^T\theta$ here is what is known as a regularization parameter, which will be discussed in a future lecture, but which we include here because it is needed for Newton's method to perform well on this task. For the entirety of this problem you can use the value $\lambda = 0.0001$.

Using this definition, the gradient of $\ell(\theta)$ is given by

$$\nabla_\theta \ell(\theta) = X^T z - \lambda\theta$$

where $z \in \mathbb{R}^m$ is defined by $z_i = w^{(i)}\left(y^{(i)} - h_\theta(x^{(i)})\right)$, and the Hessian is given by

$$H = X^T D X - \lambda I$$

where $D \in \mathbb{R}^{m \times m}$ is a diagonal matrix with $D_{ii} = -w^{(i)} h_\theta(x^{(i)})\left(1 - h_\theta(x^{(i)})\right)$.

For the sake of this problem you can just use the above formulas, but you should try to derive these results for yourself as well.

Given a query point $x$, we compute the weights

$$w^{(i)} = \exp\left(-\frac{\|x - x^{(i)}\|^2}{2\tau^2}\right).$$
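As a quick illustration of the weight formula above, here is a Python/NumPy sketch (the assignment's starter code is in MATLAB; the function and variable names here are hypothetical):

```python
import numpy as np

def lwlr_weights(X_train, x_query, tau):
    """Compute w^(i) = exp(-||x - x^(i)||^2 / (2 tau^2)) for every training point."""
    # Squared Euclidean distance from the query point to each training example.
    sq_dists = np.sum((X_train - x_query) ** 2, axis=1)
    return np.exp(-sq_dists / (2.0 * tau ** 2))

# Example: a query point coinciding with a training point receives weight 1,
# and weights decay with distance.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, -2.0]])
w = lwlr_weights(X_train, np.array([0.0, 0.0]), tau=0.5)
```

Note how small a bandwidth $\tau = 0.5$ makes the far point's weight: its squared distance of 13 gives $e^{-26}$, which is effectively zero.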
Much like the locally weighted linear regression that was discussed in class, this weighting scheme gives more weight to the "nearby" points when predicting the class of a new example.

(a) Implement the Newton-Raphson algorithm for optimizing $\ell(\theta)$ for a new query point $x$, and use this to predict the class of $x$.

The q2/ directory contains data and code for this problem. You should implement the y = lwlr(Xtrain, ytrain, x, tau) function in the lwlr.m file. This function takes as input the training set (the Xtrain and ytrain matrices, in the form described in the class notes), a new query point x and the weight bandwidth tau. Given this input the function should 1) compute weights $w^{(i)}$ for each training example, using the formula above, 2) maximize $\ell(\theta)$ using Newton's method, and finally 3) output $y = 1\{h_\theta(x) > 0.5\}$ as the prediction.

We provide two additional functions that might help. The [Xtrain, ytrain] = loaddata; function will load the matrices from files in the data/ folder. The function plotlwlr(Xtrain, ytrain, tau, resolution) will plot the resulting classifier (assuming you have properly implemented lwlr.m). This function evaluates the locally weighted logistic regression classifier over a large grid of points and plots the resulting prediction as blue (predicting $y = 0$) or red (predicting $y = 1$). Depending on how fast your lwlr function is, creating the plot might take some time, so we recommend debugging your code with resolution = 50; and later increasing it to at least 200 to get a better idea of the decision boundary.

(b) Evaluate the system with a variety of different bandwidth parameters $\tau$. In particular, try $\tau = 0.01, 0.05, 0.1, 0.5, 1.0, 5.0$. How does the classification boundary change when varying this parameter?
Can you predict what the decision boundary of ordinary (unweighted) logistic regression would look like?
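Steps 1) through 3) above can be sketched in Python/NumPy as follows (the assignment asks for a MATLAB lwlr.m; this is a hypothetical translation that uses the gradient and Hessian formulas given earlier, with a fixed iteration count and logit clipping added for numerical safety rather than taken from the original):

```python
import numpy as np

def lwlr(X_train, y_train, x, tau, lam=1e-4, n_iter=20):
    """Locally-weighted logistic regression via Newton's method (sketch).

    Returns the predicted label 1{h_theta(x) > 0.5} for the query point x,
    using grad = X^T z - lam*theta and H = X^T D X - lam*I.
    """
    m, n = X_train.shape
    # 1) Weights w^(i) = exp(-||x - x^(i)||^2 / (2 tau^2)).
    w = np.exp(-np.sum((X_train - x) ** 2, axis=1) / (2.0 * tau ** 2))
    # 2) Maximize l(theta) with Newton's method, starting from theta = 0.
    theta = np.zeros(n)
    for _ in range(n_iter):
        logits = np.clip(X_train @ theta, -30.0, 30.0)   # clip to avoid overflow
        h = 1.0 / (1.0 + np.exp(-logits))                # h_theta(x^(i))
        z = w * (y_train - h)
        grad = X_train.T @ z - lam * theta               # gradient of l(theta)
        D = np.diag(-w * h * (1.0 - h))
        H = X_train.T @ D @ X_train - lam * np.eye(n)    # Hessian (negative definite)
        theta = theta - np.linalg.solve(H, grad)         # Newton update
    # 3) Predict with the sigmoid at the query point.
    h_query = 1.0 / (1.0 + np.exp(-np.clip(x @ theta, -30.0, 30.0)))
    return int(h_query > 0.5)
```

On a small separable toy set (for example, three points near (1, 1) labeled 1 and three near (-1, -1) labeled 0), querying near each cluster returns that cluster's label.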
3. Multivariate least squares

So far in class, we have only considered cases where our target variable $y$ is a scalar value. Suppose that instead of trying to predict a single output, we have a training set with multiple outputs for each example:

$$\{(x^{(i)}, y^{(i)}),\ i = 1, \ldots, m\}, \quad x^{(i)} \in \mathbb{R}^n,\ y^{(i)} \in \mathbb{R}^p.$$

Thus for each training example, $y^{(i)}$ is vector-valued, with $p$ entries. We wish to use a linear model to predict the outputs, as in least squares, by specifying the parameter matrix $\Theta$ in

$$y = \Theta^T x,$$

where $\Theta \in \mathbb{R}^{n \times p}$.

(a) The cost function for this case is

$$J(\Theta) = \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{p}\left((\Theta^T x^{(i)})_j - y^{(i)}_j\right)^2.$$

Write $J(\Theta)$ in matrix-vector notation (i.e., without using any summations). Hint: Start with the $m \times n$ design matrix

$$X = \begin{bmatrix} (x^{(1)})^T \\ (x^{(2)})^T \\ \vdots \\ (x^{(m)})^T \end{bmatrix}$$

and the $m \times p$ target matrix

$$Y = \begin{bmatrix} (y^{(1)})^T \\ (y^{(2)})^T \\ \vdots \\ (y^{(m)})^T \end{bmatrix}$$

and then work out how to express $J(\Theta)$ in terms of these matrices.

(b) Find the closed form solution for $\Theta$ which minimizes $J(\Theta)$. This is the equivalent of the normal equations for the multivariate case.

(c) Suppose instead of considering the multivariate vectors $y^{(i)}$ all at once, we instead compute each variable $y^{(i)}_j$ separately for each $j = 1, \ldots, p$. In this case, we have $p$ individual linear models, of the form

$$y^{(i)}_j = \theta_j^T x^{(i)}, \quad j = 1, \ldots, p.$$

(So here, each $\theta_j \in \mathbb{R}^n$.) How do the parameters from these $p$ independent least squares problems compare to the multivariate solution?
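A quick numerical check relating parts (b) and (c) can be sketched in NumPy, assuming the standard multivariate closed form $\Theta = (X^T X)^{-1} X^T Y$ (the data below is synthetic, chosen only to exercise the shapes):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 50, 3, 2
X = rng.normal(size=(m, n))   # m x n design matrix
Y = rng.normal(size=(m, p))   # m x p target matrix

# Multivariate closed form: Theta = (X^T X)^{-1} X^T Y, an n x p matrix.
Theta = np.linalg.solve(X.T @ X, X.T @ Y)

# p independent least squares problems, one per output column y_j.
Theta_cols = np.column_stack(
    [np.linalg.solve(X.T @ X, X.T @ Y[:, j]) for j in range(p)]
)

# The columns of the multivariate solution coincide with the p
# independent per-output solutions.
print(np.allclose(Theta, Theta_cols))  # True
```

This agreement is the point of part (c): because $J(\Theta)$ decomposes as a sum over output coordinates, solving all $p$ outputs jointly gives the same parameters as solving them one column at a time.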