Research on Transfer Learning Algorithms - Fuzhen Zhuang (New.ppt)
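INSTITUTE OF COMPUTING TECHNOLOGY

Research on Transfer Learning Algorithms
Fuzhen Zhuang
Institute of Computing Technology, Chinese Academy of Sciences
April 18, 2016

[Figure: Training Data, Classifier, Unseen Data (, long, T); annotations "good!" and "What if"]
[Slide: Traditional supervised machine learning (1/2). From Prof. Qiang Yang. 2022/11/10]

Traditional supervised machine learning (2/2)
- Traditional supervised learning rests on two basic assumptions:
  - training and test data come from the same source and are independent and identically distributed;
  - enough labeled training samples are available.
- In practical applications these assumptions usually cannot be satisfied!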
[Figure: training set, test set, classifier]

Transfer learning
- Real-world learning scenarios (for example, HP news and Lenovo news):
  - the data come from different sources and their distributions are inconsistent;
  - manually labeling training samples is time-consuming and labor-intensive.
- Transfer learning is a new machine learning approach that uses existing knowledge to solve problems in different but related domains.
- It relaxes the two basic assumptions of traditional machine learning.

Transfer learning scenarios (1/4)
- Transfer learning scenarios are everywhere.
[Figure: knowledge transfer in image classification; knowledge transfer in news web page classification (HP news, Lenovo news)]

Heterogeneous feature spaces
- Training: text
  - "The apple is the pomaceous fruit of the apple tree, species Malus domestica in the rose family Rosaceae."
  - "Banana is the common name for a type of fruit and also the herbaceous plants of the genus Musa which produce this commonly eaten fruit."
- Future: images (apples, bananas)
[Slide: Transfer learning scenarios (2/4). From Prof. Qiang Yang.]
Xin Jin, Fuzhen Zhuang, Sinno Jialin Pan, Changying Du, Ping Luo, Qing He: Heterogeneous Multi-task Semantic Feature Learning for Classification. CIKM 2015: 1847-1850.

[Figure: a classifier trained on Electronics and tested on Electronics reaches 84.60%; a classifier trained on DVD and tested on Electronics reaches 72.65%. Drop!]
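The drop from 84.60% to 72.65% can be illustrated in miniature. The following is a minimal sketch on synthetic data, not the DVD/Electronics experiment from the slide: the two 2-D "domains", the 60-degree rotation between them, the sample sizes, and the nearest-centroid classifier are all assumptions chosen only to show how a classifier trained on one distribution loses accuracy on a related but different one.

```python
import math
import random


def sample_domain(n, angle_deg, rng):
    """Draw n labelled 2-D points. The two class centres sit at distance 2
    from the origin, on opposite sides, along the direction angle_deg."""
    theta = math.radians(angle_deg)
    cx, cy = 2.0 * math.cos(theta), 2.0 * math.sin(theta)
    data = []
    for _ in range(n):
        label = rng.choice([0, 1])
        sign = 1.0 if label == 1 else -1.0
        point = (sign * cx + rng.gauss(0, 1), sign * cy + rng.gauss(0, 1))
        data.append((point, label))
    return data


def fit_centroids(data):
    """Nearest-centroid 'training': average the points of each class."""
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for (x, y), label in data:
        sums[label][0] += x
        sums[label][1] += y
        sums[label][2] += 1
    return {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}


def accuracy(centroids, data):
    """Fraction of points whose nearest centroid carries the true label."""
    correct = 0
    for (x, y), label in data:
        pred = min(
            centroids,
            key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
        )
        correct += pred == label
    return correct / len(data)


rng = random.Random(0)
source_train = sample_domain(2000, 0, rng)   # hypothetical source domain
target_train = sample_domain(2000, 60, rng)  # hypothetical related target domain
target_test = sample_domain(2000, 60, rng)

in_domain_acc = accuracy(fit_centroids(target_train), target_test)
cross_domain_acc = accuracy(fit_centroids(source_train), target_test)
print(f"train on target, test on target: {in_domain_acc:.3f}")
print(f"train on source, test on target: {cross_domain_acc:.3f}")
```

Because the source classifier separates the classes along a direction that is only partly aligned with the target classes, it still beats chance on the target data but falls clearly short of the in-domain classifier.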
[Slide: Transfer learning scenarios (3/4). From Prof. Qiang Yang.]

[Figure: DVD, Electronics, Book, Kitchen, Clothes, Video game, Fruit, Hotel, Tea. Impractical!]
[Slide: Transfer learning scenarios (4/4). From Prof. Qiang Yang.]

Outline
- Concept Learning for Transfer Learning
  - Concept Learning based on Non-negative Matrix Tri-factorization for Transfer Learning
  - Concept Learning based on Probabilistic Latent Semantic Analysis for Transfer Learning
- Transfer Learning using Auto-encoders
  - Transfer Learning from Multiple Sources with Autoencoder Regularization
  - Supervised Representation Learning: Transfer Learning with Deep Auto-encoders

Concept Learning for Transfer Learning
Concept Learning based on Non-negative Matrix Tri-factorization for Transfer Learning

Introduction
- Many traditional learning techniques work well only under the assumption:
  training and test data follow the same distribution.
[Figure: Training (labeled), Classifier, Test (unlabeled); training and test data come from different companies]
- Enterprise news classification, including the classes "Product Announcement", "Business scandal", "Acquisition", ...
- Product announcement (excerpt fragments):
  - HP news: HP's just-released LaserJet Pro P1100 printer and the LaserJet Pro M1130 and M1210 multifunction printers, price performance.
  - Lenovo news: Announcement for Lenovo ThinkPad ThinkCentre price $150 off Lenovo K300 desktop using coupon code. Lenovo ThinkPad ThinkCentre price $200 off Lenovo IdeaPad U450p laptop using ... their performance
- HP news and Lenovo news: different distribution. Fail!
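One way to see why the two sources differ is to compare the words they use. The following is a minimal sketch, assuming the two short keyword lists below stand in for the HP and Lenovo announcements; they are hand-picked from the excerpts, not a full tokenisation, and the Jaccard overlap is just one convenient measure, not something the slides prescribe.

```python
# Hand-picked keywords from the two example announcements (an assumption,
# not a full tokenisation of the excerpts).
hp_keywords = ["LaserJet", "printer", "price", "performance"]
lenovo_keywords = ["ThinkPad", "ThinkCentre", "price", "performance"]


def vocabulary(tokens):
    """Lower-case the tokens and drop duplicates."""
    return {t.lower() for t in tokens}


hp_vocab = vocabulary(hp_keywords)
lenovo_vocab = vocabulary(lenovo_keywords)

shared = hp_vocab & lenovo_vocab        # words used by both companies
hp_only = hp_vocab - lenovo_vocab       # words specific to HP news
lenovo_only = lenovo_vocab - hp_vocab   # words specific to Lenovo news

# Jaccard overlap: shared words divided by all distinct words.
jaccard = len(shared) / len(hp_vocab | lenovo_vocab)

print("shared:     ", sorted(shared))
print("HP only:    ", sorted(hp_only))
print("Lenovo only:", sorted(lenovo_only))
print(f"Jaccard overlap: {jaccard:.2f}")
```

Only the generic words price and performance are shared; the product names are specific to each company, so a classifier that leans on them in one source has little to go on in the other.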
Motivation (1/3)
Example analysis
- Product announcement (excerpt fragments):
  - HP news: HP's just-released LaserJet Pro P1100 printer and the LaserJet Pro M1130 and M1210 multifunction printers, price performance.
  - Lenovo news: Announcement for Lenovo ThinkPad ThinkCentre price $150 off Lenovo K300 desktop using coupon code. Lenovo ThinkPad ThinkCentre price $200 off Lenovo IdeaPad U450p laptop using ... their performance
- Word concept "Product":
  - HP: LaserJet, printer, price, performance
  - Lenovo: ThinkPad, ThinkCentre, price, performance
- The two are related and share some common words: announcement, price, performance.
- The word concept indicates the document class: Product announcement.

Motivation (2/3)
Example analysis:
- HP: LaserJet, printer, price, performance, etc.
- Lenovo: ThinkPad, ThinkCentre, price, performance, etc.
- The words expressing the same word concept are domain-dependent.
- The word concept "Product" indicates the document class "Product announcement".
- The association between word concepts and document classes is domain-independent.

Motivation (3/3)
Further observations:
- Different domains may use the same keywords to express the same concept (denoted as identical concept).
- Different domains may also use different keywords to express the same concept (denoted as alike concept).
- Different domains may also have their own distinct concepts (denoted as distinct concept).
- The identical and alike concepts are used as the shared concepts for knowledge transfer.
- We try to model these three kinds of concepts simultaneously for transfer learning text classification.

Preliminary Knowledge
Basic formula of matrix tri-factorization:
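[Equation lost in extraction]
where
- the input X is the word-document co-occurrence matrix;
- [symbol lost] denotes concept information, which may vary in different domains;
- F denotes the document classification information;
- [symbol lost] indeed is the associ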