Best Machine Learning Resources for Getting Started - Machine Learning Mastery
This was a really hard post to write because I want it to be really valuable. I sat down with a blank page and asked the really hard question of what are the very best libraries, courses, papers and books I would recommend to an absolute beginner in the field of Machine Learning.
I really agonised over what to include and what to exclude. I had to work hard to put myself in the shoes of a programmer and beginner at machine learning and think about what resources would best benefit them.
I picked the best for each type of resource. If you are a true beginner and excited to get started in the field of machine learning, I hope you find something useful. My suggestion would be to pick one thing, one book or one library, and read it cover to cover or work through all of the tutorials. Pick one and stick to it, then once you master it, pick another and repeat. Let’s get into it.
Programming Libraries
I am an advocate of “learn just enough to be dangerous and start trying things”. This is how I learned to program and I’m sure many other people learned that way too. Know your limitations and exploit your strengths. If you know how to program, leverage that to get deep into machine learning fast. Then have the discipline to go and learn the math for the technique before you implement it in a production system.
Find a library and read the documentation, follow the tutorials and start trying things out. The following are the best open source machine learning programming libraries out there. I don’t think they are all suitable for use in your production system, but they are ideal for learning, exploring and prototyping.
Start with a library in a language you know well, then move on to other, more powerful libraries. If you’re a good programmer, you know you can move from language to language reasonably easily. It’s all the same logic, just differing syntax and APIs.
R Project for Statistical Computing:
This is an environment and a lisp-like scripting language. All the stats stuff you could ever want to do is provided in R, including amazing plotting. The Machine Learning category on CRAN (think: third-party Machine Learning packages) has code written by leaders in the field with state-of-the-art methods, as well as anything else you can think of. Learning R is a must if you want to prototype and explore quickly. It just might not be the first place you start.
WEKA:
This is a Data Mining workbench providing an API and a number of command-line and graphical user interfaces for the whole data mining life cycle. You can prepare data, visualize, explore, and build classification, regression and clustering models. Many algorithms are built in, and more are available as third-party plugins. Not related to WEKA, Mahout is a good Java framework for Machine Learning on Hadoop infrastructure if that is more your thing. If you’re new to big data and machine learning, stick with WEKA and learn one thing at a time.
Scikit Learn:
Machine Learning in Python built on top of NumPy and SciPy. If you are a Python or a Ruby programmer, this is the library for you. It’s friendly, powerful and comes with excellent documentation. Orange would be a good alternative if you’d like to try something else.
Octave:
If you are familiar with MATLAB or you’re a NumPy programmer looking for something different, consider Octave. It is an environment for numerical computing just like MATLAB and makes it easy to write programs to solve linear and non-linear problems, such as those that underlie most machine learning algorithms. If you have an engineering background, this might be a good place for you to start.
BigML:
Maybe you don’t want to do any programming. You can drive tools like WEKA completely without programming. You can go one step further and use services like BigML that offer machine learning interfaces on the web where you can explore building models all in the browser.
Pick a platform and use it to do your practical machine learning education. Don’t just read, do.
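To make the "linear problems" mentioned under Octave a little more concrete, here is a minimal sketch in plain Python of fitting a straight line to data by ordinary least squares. The function names `fit_line` and `predict` and the toy data are my own illustrative assumptions, not part of any library listed above; in practice you would lean on the routines your chosen library already provides.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean; zero means the slope is undefined.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    # Co-variation of x and y around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


# Made-up toy data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
model = fit_line(xs, ys)
```

Calling `predict(model, 4.0)` then extrapolates the fitted line to a new input. Writing something this small by hand once is a reasonable way to see what a library is doing for you before you rely on it.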
Video Courses
Video is a very popular way to get started in machine learning. I watch a lot of machine learning videos on YouTube and VideoLectures.Net. The risk is that all you will do is consume and fail to take action. I recommend you always take notes when watching a video, even if you discard the notes later. I also recommend trying out whatever it is you’re learning in the lecture.
Frankly, none of the video courses I have seen are really suitable for a true beginner. They all presuppose a working knowledge of at least linear algebra and probability theory, and more. Andrew Ng’s Stanford lectures are probably the best place to start for a course; otherwise there are one-off videos I recommend.
Stanford Machine Learning:
Available via Coursera and taught by Andrew Ng. In addition to enrolling, you can watch all the lectures anytime and get the handouts and lecture notes from the actual Stanford CS229 course. The course includes homework and quizzes and focuses on linear algebra and using Octave.
Caltech Learning from Data:
Available via edX and taught by Yaser Abu-Mostafa. All the lectures and materials are available on the Caltech site. Again, like the Stanford class, you can take it at your own pace and complete the homework and assignments. It covers similar subjects, goes into a little more detail and is more mathematical. The homework is probably too challenging for a beginner.
Machine Learning Category on VideoLectures.Net:
This is an easy place to drown in the overload of content. Look for videos that seem interesting and try them out. Bail if it’s at the wrong level or take notes if you’re enjoying it. I find I keep coming back to refresh myself on topics and to pick up entirely new topics. Also, it’s great to see what the masters of the field actually look like.
“Getting In Shape For The Sport Of Data Science” – Talk by Jeremy Howard:
A talk to a local R users group on the practical process for doing well in competitive machine learning. This is very valuable because so few people talk about what it’s actually like to work on a problem and how to do it. I not-so-secretly fantasise about funding a web reality TV show that follows participants in machine learning competitions. That’s how into it I am!
Overview Papers
If you are not used to reading research papers, you will find the language very stiff. A paper is like a snippet of a textbook, but describes an experiment or some other frontier of the field. Nevertheless, there are some papers that you might find interesting if you are looking to get started in machine learning.
The Discipline of Machine Learning:
A white paper defining the discipline of Machine Learning by Tom Mitchell. This was a piece of the argument Mitchell used to convince the President of CMU to create a standalone Machine Learning department for a subject that will still be around in 100 years (also see this short interview with Tom Mitchell).
A Few Useful Things to Know about Machine Learning:
This is a great paper because it pulls back from specific algorithms and motivates a number of important issues such as feature selection, generalizability and model simplicity. This is all good stuff to get right and think clearly about from the beginning.
I’ve only listed two important papers, because reading papers can really bog you down.
Beginner Machine Learning Books
There are a lot of machine learning books and very few are written for beginners. What is a beginner really?
Most likely you’re coming to machine learning from another field, most likely computer science, programming or statistics. Even then, most books expect you to have a grounding in at least linear algebra and probability theory.
Nevertheless, there are a few books out there that encourage eager programmers to get started by teaching the minimum intuition for an algorithm and pointing to tools and libraries so that you can run off and try things out. Most notably Programming Collective Intelligence, Machine Learning for Hackers and Data Mining: Practical Machine Learning Tools and Techniques, for Python, R, and Java respectively. If in doubt, grab one of these three books!
Books for Machine Learning Beginners
Programming Collective Intelligence: Building Smart Web 2.0 Applications (Affiliate Link):
This book was written for you, dear programmer. It’s lite on theory, heavy on code examples and practical web problems and solutions. Buy it, read it, do the exercises.
Machine Learning for Hackers (Affiliate Link):
I’d recommend this book after reading Programming Collective Intelligence (above). It again provides worked examples that are practical, but it has more of a data analysis flavor and uses R. I really like this book!
Machine Learning: An Algorithmic Perspective (Affiliate Link):
This book is like a more advanced version of Programming Collective Intelligence (above). It has similar aims (get programmers started in Machine Learning), but it includes maths and references as well as examples and snippets in Python. I’d recommend reading this after reading Programming Collective Intelligence if you’re still interested.
Data Mining: Practical Machine Learning Tools and Techniques, Third Edition (Affiliate Link):
I actually started with this book; it was the first edition, around the year 2000. I was a Java programmer and this book and the companion library WEKA provided a perfect environment for me to try things out, implement my own algorithms as plug-ins and generally practice Machine Learning and the broader process of Data Mining. I highly recommend this book and this path.
Machine Learning (Affiliate Link):
This is an old book and does include formulas and lots of references. It’s a textbook but is also very accessible with grounded motivations for each