Stanford University Machine Learning: Complete Problems and Answers Collection
CS229 Machine Learning (Problems and Answers), Stanford University

Contents
(1) Assignment 1 (Supervised Learning)
(2) Assignment 1 Solutions (Supervised Learning)
(3) Assignment 2 (Kernels, SVMs, and Theory)
(4) Assignment 2 Solutions (Kernels, SVMs, and Theory)
(5) Assignment 3 (Learning Theory and Unsupervised Learning)
(6) Assignment 3 Solutions (Learning Theory and Unsupervised Learning)
(7) Assignment 4 (Unsupervised Learning and Reinforcement Learning)
(8) Assignment 4 Solutions (Unsupervised Learning and Reinforcement Learning)
(9) Problem Set #1: Supervised Learning
(10) Problem Set #1 Answer
(11) Problem Set #2: Naive Bayes, SVMs, and Theory
(12) Problem Set #2 Answer

CS229, Public Course

Problem Set #1:
Supervised Learning

1. Newton's method for computing least squares

In this problem, we will prove that if we use Newton's method to solve the least squares optimization problem, then we only need one iteration to converge to $\theta^*$.

(a) Find the Hessian of the cost function $J(\theta) = \frac{1}{2}\sum_{i=1}^{m}\left(\theta^T x^{(i)} - y^{(i)}\right)^2$.

(b) Show that the first iteration of Newton's method gives us $\theta^* = (X^T X)^{-1} X^T y$, the solution to our least squares problem.

2. Locally-weighted logistic regression

In this problem you will implement a locally-weighted version of logistic regression, where we weight different training examples differently according to the query point. The locally-weighted logistic regression problem is to maximize

$$\ell(\theta) = -\frac{\lambda}{2}\theta^T\theta + \sum_{i=1}^{m} w^{(i)}\left[y^{(i)}\log h_\theta(x^{(i)}) + (1 - y^{(i)})\log(1 - h_\theta(x^{(i)}))\right].$$

The $-\frac{\lambda}{2}\theta^T\theta$ here is what is known as a regularization parameter, which will be discussed in a future lecture, but which we include here because it is needed for Newton's method to perform well on this task. For the entirety of this problem you can use the value $\lambda = 0.0001$.

Using this definition, the gradient of $\ell(\theta)$ is given by

$$\nabla_\theta \ell(\theta) = X^T z - \lambda\theta$$

where $z \in \mathbb{R}^m$ is defined by $z_i = w^{(i)}\left(y^{(i)} - h_\theta(x^{(i)})\right)$, and the Hessian is given by

$$H = X^T D X - \lambda I$$

where $D \in \mathbb{R}^{m \times m}$ is a diagonal matrix with $D_{ii} = -w^{(i)} h_\theta(x^{(i)})\left(1 - h_\theta(x^{(i)})\right)$.

For the sake of this problem you can just use the above formulas, but you should try to derive these results for yourself as well.

Given a query point $x$, we compute the weights

$$w^{(i)} = \exp\left(-\frac{\|x - x^{(i)}\|^2}{2\tau^2}\right).$$
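As a quick illustration of the weight formula above, here is a Python/NumPy sketch (the assignment's starter code is in MATLAB; the function and variable names here are hypothetical):

```python
import numpy as np

def lwlr_weights(X_train, x_query, tau):
    """Compute w^(i) = exp(-||x - x^(i)||^2 / (2 tau^2)) for every training point."""
    # Squared Euclidean distance from the query point to each training example.
    sq_dists = np.sum((X_train - x_query) ** 2, axis=1)
    return np.exp(-sq_dists / (2.0 * tau ** 2))

# Example: a query point coinciding with a training point receives weight 1,
# and weights decay with distance.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, -2.0]])
w = lwlr_weights(X_train, np.array([0.0, 0.0]), tau=0.5)
```

Note how small a bandwidth $\tau = 0.5$ makes the far point's weight: its squared distance of 13 gives $e^{-26}$, which is effectively zero.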
Much like the locally weighted linear regression that was discussed in class, this weighting scheme gives more weight to the "nearby" points when predicting the class of a new example.

(a) Implement the Newton-Raphson algorithm for optimizing $\ell(\theta)$ for a new query point $x$, and use this to predict the class of $x$.

The q2/ directory contains data and code for this problem. You should implement the y = lwlr(Xtrain, ytrain, x, tau) function in the lwlr.m file. This function takes as input the training set (the Xtrain and ytrain matrices, in the form described in the class notes), a new query point x and the weight bandwidth tau. Given this input the function should 1) compute weights $w^{(i)}$ for each training example, using the formula above, 2) maximize $\ell(\theta)$ using Newton's method, and finally 3) output $y = 1\{h_\theta(x) > 0.5\}$ as the prediction.

We provide two additional functions that might help. The [Xtrain, ytrain] = loaddata; function will load the matrices from files in the data/ folder. The function plotlwlr(Xtrain, ytrain, tau, resolution) will plot the resulting classifier (assuming you have properly implemented lwlr.m). This function evaluates the locally weighted logistic regression classifier over a large grid of points and plots the resulting prediction as blue (predicting $y = 0$) or red (predicting $y = 1$). Depending on how fast your lwlr function is, creating the plot might take some time, so we recommend debugging your code with resolution = 50; and later increasing it to at least 200 to get a better idea of the decision boundary.

(b) Evaluate the system with a variety of different bandwidth parameters $\tau$. In particular, try $\tau = 0.01, 0.05, 0.1, 0.5, 1.0, 5.0$. How does the classification boundary change when varying this parameter?
Can you predict what the decision boundary of ordinary (unweighted) logistic regression would look like?
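Steps 1) through 3) above can be sketched in Python/NumPy as follows (the assignment asks for a MATLAB lwlr.m; this is a hypothetical translation that uses the gradient and Hessian formulas given earlier, with a fixed iteration count and logit clipping added for numerical safety rather than taken from the original):

```python
import numpy as np

def lwlr(X_train, y_train, x, tau, lam=1e-4, n_iter=20):
    """Locally-weighted logistic regression via Newton's method (sketch).

    Returns the predicted label 1{h_theta(x) > 0.5} for the query point x,
    using grad = X^T z - lam*theta and H = X^T D X - lam*I.
    """
    m, n = X_train.shape
    # 1) Weights w^(i) = exp(-||x - x^(i)||^2 / (2 tau^2)).
    w = np.exp(-np.sum((X_train - x) ** 2, axis=1) / (2.0 * tau ** 2))
    # 2) Maximize l(theta) with Newton's method, starting from theta = 0.
    theta = np.zeros(n)
    for _ in range(n_iter):
        logits = np.clip(X_train @ theta, -30.0, 30.0)   # clip to avoid overflow
        h = 1.0 / (1.0 + np.exp(-logits))                # h_theta(x^(i))
        z = w * (y_train - h)
        grad = X_train.T @ z - lam * theta               # gradient of l(theta)
        D = np.diag(-w * h * (1.0 - h))
        H = X_train.T @ D @ X_train - lam * np.eye(n)    # Hessian (negative definite)
        theta = theta - np.linalg.solve(H, grad)         # Newton update
    # 3) Predict with the sigmoid at the query point.
    h_query = 1.0 / (1.0 + np.exp(-np.clip(x @ theta, -30.0, 30.0)))
    return int(h_query > 0.5)
```

On a small separable toy set (for example, three points near (1, 1) labeled 1 and three near (-1, -1) labeled 0), querying near each cluster returns that cluster's label.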
3. Multivariate least squares

So far in class, we have only considered cases where our target variable $y$ is a scalar value. Suppose that instead of trying to predict a single output, we have a training set with multiple outputs for each example:

$$\{(x^{(i)}, y^{(i)}),\ i = 1, \ldots, m\}, \quad x^{(i)} \in \mathbb{R}^n,\ y^{(i)} \in \mathbb{R}^p.$$

Thus for each training example, $y^{(i)}$ is vector-valued, with $p$ entries. We wish to use a linear model to predict the outputs, as in least squares, by specifying the parameter matrix $\Theta$ in

$$y = \Theta^T x,$$

where $\Theta \in \mathbb{R}^{n \times p}$.

(a) The cost function for this case is

$$J(\Theta) = \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{p}\left((\Theta^T x^{(i)})_j - y^{(i)}_j\right)^2.$$

Write $J(\Theta)$ in matrix-vector notation (i.e., without using any summations). Hint: Start with the $m \times n$ design matrix

$$X = \begin{bmatrix} (x^{(1)})^T \\ (x^{(2)})^T \\ \vdots \\ (x^{(m)})^T \end{bmatrix}$$

and the $m \times p$ target matrix

$$Y = \begin{bmatrix} (y^{(1)})^T \\ (y^{(2)})^T \\ \vdots \\ (y^{(m)})^T \end{bmatrix}$$

and then work out how to express $J(\Theta)$ in terms of these matrices.

(b) Find the closed form solution for $\Theta$ which minimizes $J(\Theta)$. This is the equivalent of the normal equations for the multivariate case.

(c) Suppose instead of considering the multivariate vectors $y^{(i)}$ all at once, we instead compute each variable $y^{(i)}_j$ separately for each $j = 1, \ldots, p$. In this case, we have $p$ individual linear models, of the form

$$y^{(i)}_j = \theta_j^T x^{(i)}, \quad j = 1, \ldots, p.$$

(So here, each $\theta_j \in \mathbb{R}^n$.) How do the parameters from these $p$ independent least squares problems compare to the multivariate solution?
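A quick numerical check relating parts (b) and (c) can be sketched in NumPy, assuming the standard multivariate closed form $\Theta = (X^T X)^{-1} X^T Y$ (the data below is synthetic, chosen only to exercise the shapes):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 50, 3, 2
X = rng.normal(size=(m, n))   # m x n design matrix
Y = rng.normal(size=(m, p))   # m x p target matrix

# Multivariate closed form: Theta = (X^T X)^{-1} X^T Y, an n x p matrix.
Theta = np.linalg.solve(X.T @ X, X.T @ Y)

# p independent least squares problems, one per output column y_j.
Theta_cols = np.column_stack(
    [np.linalg.solve(X.T @ X, X.T @ Y[:, j]) for j in range(p)]
)

# The columns of the multivariate solution coincide with the p
# independent per-output solutions.
print(np.allclose(Theta, Theta_cols))  # True
```

This agreement is the point of part (c): because $J(\Theta)$ decomposes as a sum over output coordinates, solving all $p$ outputs jointly gives the same parameters as solving them one column at a time.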