Neural Network Theory and Applications
Homework Assignment 3

oxstar, SJTU

January 19, 2012

1 Data Preprocessing

First we used svm-scale of LibSVM to scale the data. Scaling has two main advantages: it prevents attributes in greater numeric ranges from dominating those in smaller numeric ranges, and it avoids numerical difficulties during the calculation [1]. We linearly scaled each attribute to the range [-1, +1].
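The linear scaling step can be sketched as follows. This is only a minimal illustration of what svm-scale does conceptually, not its actual implementation; the function names `fit_scaling` and `apply_scaling` are invented for this sketch. Note that the parameters fitted on the training data should also be applied to the test data.

```python
def fit_scaling(data):
    """Collect the per-attribute minimum and maximum over the training data."""
    n_attr = len(data[0])
    mins = [min(row[j] for row in data) for j in range(n_attr)]
    maxs = [max(row[j] for row in data) for j in range(n_attr)]
    return mins, maxs

def apply_scaling(row, mins, maxs, lower=-1.0, upper=1.0):
    """Map each attribute linearly from [min_j, max_j] to [lower, upper]."""
    out = []
    for x, lo, hi in zip(row, mins, maxs):
        if hi == lo:                # constant attribute: pin to the lower bound
            out.append(lower)
        else:
            out.append(lower + (upper - lower) * (x - lo) / (hi - lo))
    return out
```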
2 Model Selection

We tried three different kernel functions, namely linear, polynomial and RBF.

linear:
K(x_i, x_j) = x_i^T x_j

polynomial:
K(x_i, x_j) = (\gamma x_i^T x_j + r)^d, \gamma > 0

radial basis function (RBF):
K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2), \gamma > 0

The penalty parameter C and the kernel parameters (\gamma, r, d) must be chosen.
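The three kernel functions can be written out directly; the sketch below uses LibSVM's default values r = 0 and d = 3 mentioned later in the text (the function names are ours, not LibSVM's API):

```python
import math

def linear_kernel(xi, xj):
    # K(xi, xj) = xi . xj
    return sum(a * b for a, b in zip(xi, xj))

def polynomial_kernel(xi, xj, gamma=1.0, r=0.0, d=3):
    # K(xi, xj) = (gamma * xi . xj + r)^d, gamma > 0
    return (gamma * linear_kernel(xi, xj) + r) ** d

def rbf_kernel(xi, xj, gamma=1.0):
    # K(xi, xj) = exp(-gamma * ||xi - xj||^2), gamma > 0
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)
```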
We used grid-search [1] on C and \gamma, while r and d were left at their default values of 0 and 3. In Figure 1, we present the contour maps used for choosing the proper parameters. We only searched for local maxima, since the global maximum is usually difficult to find and the running time increases dramatically as the parameter values grow. Note that ovr stands for the one-versus-rest task decomposition method, while ovo is short for one-versus-one and pvp for part-versus-part. The linear kernel has no private parameters, so we only had to search for the penalty parameter C; the results are shown in Figure 2. The final selection of each parameter is presented in Table 1.
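The grid search over C and \gamma can be sketched as below. Here `cv_accuracy` is a stand-in for training an SVM with the given parameters and returning its cross-validation accuracy (in the report this was done with LibSVM); the grid follows the report's lg(cost) axes, i.e. C is stepped in powers of 10.

```python
def grid_search(cv_accuracy, log10c_values, gamma_values):
    """Exhaustively evaluate every (C, gamma) pair and keep the best one."""
    best = (None, None, float("-inf"))   # (C, gamma, accuracy)
    for log10c in log10c_values:
        C = 10.0 ** log10c               # the plots use lg(cost) on the x-axis
        for gamma in gamma_values:
            acc = cv_accuracy(C, gamma)
            if acc > best[2]:
                best = (C, gamma, acc)
    return best
```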
Table 1: Parameter Selection for Each Decomposition Method and Kernel

Decomposition      Kernel        C      \gamma
one-versus-rest    RBF           10     1.0
                   Polynomial    0.1    0.7
                   Linear        1      -
one-versus-one     RBF           1      1.5
                   Polynomial    0.01   0.2
                   Linear        0.1    -
part-versus-part   RBF           1      0.1
                   Polynomial    0.01   0.4
                   Linear        1      -
Figure 1: Grid Search for the RBF and Polynomial Kernels [contour maps of accuracy over lg(cost) and \gamma: (a) RBF (ovr), (b) Polynomial (ovr), (c) RBF (ovo), (d) Polynomial (ovo), (e) RBF (pvp), (f) Polynomial (pvp)]
Figure 2: Parameter Search for the Linear Kernel [accuracy against lg(cost) for (a) ovr, (b) ovo, (c) pvp]

3 Experiments

3.1 Task Decomposition Methods

Several multi-class classification techniques have been proposed for SVM models. The most typical approach to task decomposition is the so-called one-versus-rest method, which separates each class from all the others. Assume that we construct N two-class classifiers; a test datum is classified to C_i iff the ith classifier recognizes it. However, more than one classifier may recognize it; in that case we assign the datum to the C_i whose classifier gives the largest decision value. On the other hand, if no classifier recognizes it, we assign it to the C_i whose classifier gives the smallest decision value for classifying it to the rest class.

One-versus-one, which combines all possible two-class classifiers, is another methodology for dealing with multi-class problems [3]. The number of classifiers grows super-linearly with the number of classes, but the running time may not, because each divided problem is much smaller.
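The one-versus-rest decision rule above, and the vote counting used for one-versus-one, can be sketched as follows. This is an illustration of the combination logic only; the decision values and pairwise winners would come from the trained LibSVM classifiers, and the function names are ours.

```python
def ovr_predict(decision_values):
    """decision_values[i] is the output of the 'class i vs. rest' classifier.
    Taking the argmax covers all three cases described in the text: exactly
    one, several, or no classifiers recognizing the datum."""
    return max(range(len(decision_values)), key=lambda i: decision_values[i])

def ovo_predict(pairwise_winners, n_classes):
    """pairwise_winners maps each class pair (i, j), i < j, to the class that
    won that two-class contest; the most-voted class is returned."""
    votes = [0] * n_classes
    for winner in pairwise_winners.values():
        votes[winner] += 1
    return max(range(n_classes), key=lambda i: votes[i])
```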
We used an election strategy to make the final decisions: if the number of classifiers assigning a datum to the ith class is the largest, we say that this datum belongs to C_i.

The part-versus-part method is another choice [4]. Any two-class problem can be further decomposed into a number of two-class sub-problems, as small as needed, which makes the method good at dealing with unbalanced classification problems. As shown in Table 2, the number of training data in each class of our dataset is indeed unbalanced.
Table 2: Number of Training Data in Each Class

Class   Number of Training Data   Class   Number of Training Data
0       537                       6       74
1       994                       7       584
2       32                        8       1545
3       91                        9       100
4       689                       10      1338
5       38                        11      43

We used the MAX-MIN strategy to make the final decisions. We also had to determine the size of the minimum parts, which strongly affects the classification performance. From Figure 3 we chose 200 as the size of each sub-class, because the accuracy reaches a local maximum there; choosing 1600 would make no sense, since no class holds that many training data and nothing would be subdivided.

3.2 Results

In our experiments, we used the Java version of LibSVM [2].
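The MAX-MIN combination used for part-versus-part in Section 3.1 can be sketched as follows, assuming the min-max rule of the modular decomposition in [4]: each two-class problem (i vs. j) is split into sub-classifiers over pairs of sub-parts, outputs sharing the same positive sub-part are combined with MIN, and those MIN outputs are combined with MAX. The exact module structure in the report may differ; this is our reading of the strategy's name.

```python
def max_min_combine(sub_outputs):
    """sub_outputs[u][v] is the decision value of the sub-classifier trained on
    positive sub-part u vs. negative sub-part v of one two-class problem.
    Assumed MAX-MIN rule: MIN over the negative sub-parts, then MAX over
    the positive sub-parts, giving the combined decision value."""
    return max(min(row) for row in sub_outputs)
```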
Figure 3: Relationship between Sub-class Size and Accuracy [accuracy against the number of training data per part, N = 25 to 1600]
Figure 4: Performance of Each Task Decomposition Method and Each Kernel [(a) accuracy (%) and (b) running time (s) of the RBF, polynomial and linear kernels under ovr, ovo and pvp]

The accuracy and running time are shown in Figures 4a and 4b, respectively.

3.3 Discussion

Compared with ovo and pvp, the one-versus-rest decomposition method always has the worst accuracy, no matter which kernel is used. However, due to its simple procedure, only N classifiers are required for an N-class problem, so the scalability of this method is better than that of the others. The one-versus-one decomposition method performed the best in our experiments.