MathorCup Outstanding Paper, Problem B
Judges' scoring and notes
Team number: 20017
Problem: B
Title: Books Recommendation
Abstract
With the development of information technology and the Internet, we are gradually moving from an era of information scarcity into an era of information overload. Both information consumers and information producers face many challenges. Recommendation is an important tool for resolving this contradiction, so it is widely used in Internet products and applications.
Question One:
The data in the annex user_book_score.txt show that the book ID and the user ID affect book scores. The data in the annex book_tag.txt show that book labels indicate the types of books; the labels of a book can be looked up by its book ID, and they affect users' ratings of books. In the annex user_read_history.txt, the books a user has browsed reflect the kinds of books that user likes, which in turn affects the score. In the annex user_social.txt, the friends a user follows reflect the relationship between the user's choice of books and liking for them. From these, the connection between users and books is drawn, which affects the score. The five factors that influence a user's rating of a book are therefore:
book ID, book labels, the user's book browsing history, the friends the user follows, and user ID.
Question Two:
The user scores for the books in the annex predict.txt are predicted from the factors that affect users' book scores. First, a KMO test is run on the data in SPSS; the test gives sig < 0.05, so principal component analysis is applied. To ensure reliable results, the book ID and user ID data are standardized, and the five main indicators are entered into the component analysis. When three principal components are extracted, the cumulative contribution rate reaches 81.371%. These three principal components are then used in a factor analysis to obtain the integrated principal component model, from which the user ratings for the books in predict.txt are finally predicted.
Question Three:
The analysis uses each user's book browsing history and book scores. On the one hand, the similarity of the labels of the books in a user's browsing history is used to infer the kinds of books the user prefers; on the other hand, a reader's favorite books can be judged from the similarity among the books that reader rated highly. Combining these two sources of information, a collaborative filtering recommendation algorithm is used to compute the similarity between users and to select the two or three books with the highest overall similarity to recommend to each user.
Keywords:
overlap, similarity, principal component analysis, factor analysis, collaborative filtering algorithm
1 PROBLEM RESTATEMENT
With the development of information technology and the Internet, people have gradually entered an information-overloaded world. At this point, both information consumers and information producers face great challenges:
For information consumers, finding the information they are interested in among a large amount of information is a very difficult task. For information producers, making the information they produce stand out and attract users' attention is also a very tough problem.
Recommendation is a key tool for resolving this contradiction. It is widely used in Internet products and applications, including related search, topic recommendation, product recommendation in e-commerce, and dating and friend recommendation on social networks.
We have obtained user behavior data from a well-known online bookstore, including users' ratings of books, tag information, and users' social relationships. Please complete the following tasks based on the data.
1 Analyze the factors that affect users' ratings of books;
2 Develop a model to predict the book scores in the attachment predict.txt;
3 For the users in the annex predict.txt, recommend to each user three books that the user has not read before.
2 PROBLEM ANALYSIS
Question One:
To analyze the factors that affect users' ratings of books, the data in the annexes book_tag.txt, user_book_score.txt, user_read_history.txt and user_social.txt are imported into the software. After data processing, the data in user_book_score.txt show that the book ID and the user ID are the main factors affecting the score. Analysis of the data in the annexes book_tag.txt, user_social.txt and user_read_history.txt shows that labels represent different types of books and thus affect the score. The labels of the various books can be found by looking up the books in a user's browsing history by book ID. From the number of friends a user follows, the relationship between users' choices and their favorite books can be analyzed. Finally, the connection between users and books is drawn, which affects the score.
Question Two:
To predict the users' scores for the books in the annex predict.txt, the model must build on the factors already identified as affecting users' book scores, and it is constructed by principal component analysis:
The factors in users' assessment of books are considered. First, a KMO test is run on the data in software; the test gives sig < 0.05, so principal component analysis is applied. Because the book ID and user ID are merely symbols, the raw indicator data need to be standardized to ensure reliable results. The five indicators are then entered into the component analysis. When three principal components are extracted, the cumulative contribution rate is 81.371%. Using these three principal components in a factor analysis, the integrated principal component model is obtained and is ultimately used to predict the user ratings for the books in predict.txt.
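The standardization and component-selection steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' SPSS procedure: the eigenvalues are hypothetical (they are not the values behind the 81.371% figure), the function names are invented for this example, and `composite_score` uses a common construction in which each retained component is weighted by its share of the retained variance, since the integrated model formula itself is not available here.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score a column: subtract the mean, divide by the population std. dev."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def components_to_keep(eigenvalues, threshold):
    """Return (k, cumulative) where k is the smallest number of components
    whose cumulative contribution rate reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = []
    running = 0.0
    for value in ordered:
        running += value / total  # contribution rate of this component
        cumulative.append(running)
    for k, rate in enumerate(cumulative, start=1):
        if rate >= threshold:
            return k, cumulative
    return len(ordered), cumulative


def composite_score(component_scores, eigenvalues):
    """Assumed integrated score: components weighted by their variance share."""
    total = sum(eigenvalues)
    return sum(s * ev / total for s, ev in zip(component_scores, eigenvalues))


# Hypothetical eigenvalues for five standardized indicators (NOT the paper's).
example_eigenvalues = [2.0, 1.2, 0.9, 0.6, 0.3]
k, cumulative = components_to_keep(example_eigenvalues, threshold=0.80)
print(k, [round(c, 3) for c in cumulative])
```

With these made-up eigenvalues the first three components pass the 80% threshold, which mirrors the situation reported in the text where three components are retained.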
Question Three:
The analysis uses the books each user has viewed and the scores given to books. On the one hand, the similarity of the tags of the books in a user's browsing history is used to infer the user's favorite kinds of books. On the other hand, a user's preferences can be judged from the similarity among the books the user rated highly. Combining these two sources of information, a collaborative filtering recommendation algorithm is used to compute the similarity between users and to select the three books with the highest overall similarity to recommend to each user.
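A user-based collaborative filtering step of this kind can be sketched as follows. The text does not state which similarity measure is used; this sketch assumes Pearson correlation over the books two users have both rated, which is one common choice. The user IDs, book IDs, ratings and function names are all hypothetical.

```python
from math import sqrt


def pearson_similarity(ratings_u, ratings_v):
    """Pearson correlation over co-rated books; 0.0 when it is undefined."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[b] for b in common) / len(common)
    mean_v = sum(ratings_v[b] for b in common) / len(common)
    num = sum((ratings_u[b] - mean_u) * (ratings_v[b] - mean_v) for b in common)
    den_u = sqrt(sum((ratings_u[b] - mean_u) ** 2 for b in common))
    den_v = sqrt(sum((ratings_v[b] - mean_v) ** 2 for b in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)


def recommend(target, ratings, top_n=3):
    """Rank books the target has not rated by the similarity-weighted
    average rating of positively correlated users."""
    weighted_sum = {}
    weight_total = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = pearson_similarity(ratings[target], other_ratings)
        if sim <= 0:
            continue  # ignore dissimilar users
        for book, score in other_ratings.items():
            if book in ratings[target]:
                continue  # only recommend unseen books
            weighted_sum[book] = weighted_sum.get(book, 0.0) + sim * score
            weight_total[book] = weight_total.get(book, 0.0) + sim
    ranked = sorted(
        weighted_sum,
        key=lambda b: weighted_sum[b] / weight_total[b],
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical toy data: user -> {book: score}.
ratings = {
    "u1": {"b1": 5, "b2": 3, "b3": 4},
    "u2": {"b1": 4, "b2": 2, "b3": 3, "b4": 5, "b5": 2},
    "u3": {"b1": 1, "b2": 5, "b3": 2, "b6": 4},
}
print(recommend("u1", ratings))
```

In this toy data u2 rates the shared books in the same pattern as u1, so u2's unseen books are recommended, while u3 is negatively correlated with u1 and contributes nothing. Fewer than `top_n` books are returned when too few candidates exist.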
3 MODEL ASSUMPTIONS
1 The data follow a normal distribution.
2 The various factors are independent of each other.
3 The size of a characteristic value (eigenvalue) is taken as a measure of how strongly the corresponding principal component influences the indicators.
4 Errors in the overall user ratings of books that arise from the limited sample data are ignored.
4 SYMBOL DESCRIPTION
User rating data
Standardized book ID
Number of friends
Standardized user ID
Number of book labels
User browsing history
Principal component 1
Principal component 2
Principal component 3
Principal component
Set of items rated in common by the users
User ratings of books
Average user rating of books
Average user rating of books
5 ESTABLISHING AND SOLVING THE MODEL
5.1 Question One:
5.1.1 Establishment of the model:
According to the problem statement and the data given in the attachments, we analyze the three columns of data listed in the annex user_book_score.txt (user ID, book ID and score). The first stage is divided into three kinds of connection:
Connections
(1) the connection between book IDs;
(2) the connection between user IDs;
(3) the connection between user IDs and book IDs.
The second stage further analyzes the book label data given in the annex book_tag.txt:
(1) Users with the same hobbies select similar books, so the degree of overlap is high. Comparing the degree of label overlap between books determines how similar the books are;
(2) Then the book IDs in users' browsing histories in the annex user_read_history.txt are analyzed:
Comparing the similarity between the books users have read determines how strongly the users are connected.
(3) The data in the annex user_book_score.txt are analyzed. From the similarity of users' browsing histories and of their scores, the analysis leads back to the similarity between books and their labels; the connection between users and books is then obtained from the relationship between users' choices of books and their interests. The conclusion of the second stage is drawn from the first stage, giving the analysis of the factors that affect users' book ratings.
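The label-overlap comparison in step (1) can be made concrete with a small sketch. The text does not specify how overlap is measured; this example assumes the Jaccard index (shared labels divided by all labels of the two books), and the book IDs, labels and function names are hypothetical rather than taken from book_tag.txt.

```python
def tag_overlap(tags_a, tags_b):
    """Jaccard index of two label sets: |A & B| / |A | B| (0.0 if both empty)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def most_similar_book(book_id, book_tags):
    """Return the other book whose labels overlap most with book_id's labels."""
    candidates = [b for b in book_tags if b != book_id]
    return max(candidates, key=lambda b: tag_overlap(book_tags[book_id], book_tags[b]))


# Hypothetical mapping: book ID -> set of label IDs.
book_tags = {
    "book_1": {"t1", "t2", "t3"},
    "book_2": {"t2", "t3", "t4"},
    "book_3": {"t9"},
}
print(tag_overlap(book_tags["book_1"], book_tags["book_2"]))
print(most_similar_book("book_1", book_tags))
```

The same function applies unchanged to step (2) if the sets hold the book IDs in two users' browsing histories instead of labels.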
Figure 5-1 Analysis chart of the factors affecting users' book scores (panels: the first stage, the second stage)
In Figure 5-1, the relationship between book IDs influences the overlap of book labels, and the label overlap in turn reacts on the relationship between book IDs; the relationship between book IDs and the score interact. The relationship between books and user IDs affects the scores users give to the books they read, and those scores react on the relationship between books and user IDs; the relationship between books and user IDs and the score interact. The relationship between user IDs affects the books users read, and the books users read react on the relationship between user IDs.
5.1.2 Solving the model:
From Figure 5-1, the analysis chart of the factors affecting users' book scores, the factors that influence users' scores for books are:
(1) the book ID
(2) the number of book tags
(3) the books viewed in the user's history
(4) the number of friends the user follows
(5) the user ID
5.2