Data Mining Example: The Advertising Placement Problem
SAS Tutorial: Enterprise Miner
Case description
Suppose that you work for a mail order enterprise that sends out a catalog of furnishings and housewares each month. As part of an upcoming sales campaign, you want to distribute a special catalog that is devoted to fine dining and contains kitchenware, dishes, and flatware. It's too expensive to send this catalog to all of your customers, so you need to target those most likely to buy. You do this by developing a targeting model and then using it to produce a new mailing list.
You have an extensive record of customer purchases. The data includes variables that indicate whether customers bought kitchenware, dishes, or flatware in the past two years. This purchase history has been used to create the CUSTDET1 data set, which contains 49 variables with the following labels:
Purchase, Dollars Spent, Yearly Income, Home Value, Order Frequency, Recency, Married, Name Prefix, Age, Sex, Telemarket Ind., Rents Apartment, Occupied <1 Year, Domestic Product, Apparel Purchase, Leisure Product, Luxury Items
Kitchen Product, Dishes Purchase, Flatware Purchase, Total Dining (kitch+dish+flat), Promo: 1-7 Months, Promo: 8-13 Months, $ Value per Mailing, Country Code, Total Returns, Mens Apparel, Home Furniture, Lamps Purchase, Linens Purchase, Blankets Purchase, Towels Purchase, Outdoor Product, Coats Purchase
Ladies Coats, Ladies Apparel, His/Her Apparel, Jewelry Purchase, Date 1st Order, Telemarket Order, Account Number, State Code, Race, Heating Type, Number of Cars, Number of Kids, Travel Time, Education Level, Job Category
Enterprise Miner
Based on SAS software, Enterprise Miner combines the data mining process with graphical ease of use. It delivers a broad range of predictive and descriptive models that you can apply, test, and compare to determine the best fit for the data. Most of this is done by manipulating icons: you connect nodes in a graphical workspace, adjust settings, and run the workflow. Here's an example of the workflow:
Nodes are a key concept in Enterprise Miner; most of the time you interact with the program by dragging and dropping, right-clicking, or double-clicking the node that corresponds to a particular task.
1. Invoke Enterprise Miner
After you start the SAS System, from the SAS menu bar, you can select Solutions → Analysis → Enterprise Miner.
Creating a new project
2. To create a new project, select File → New → Project. When the Create new project window appears, enter DiningList in the Name field. In the Location field, select the Browse button and create a new directory DiningList for the project. Click the Create button.
You have now created a directory for your project. EM has created three additional subdirectories: EMDATA, EMPROJ and REPORTS.
3. The DiningList project appears in the left window pane. Below it appears the default name of the workflow, Untitled. Select Untitled and enter the new name Propensity.
4. Go to the course website, download the tutorial file DataMining2003.sas7bdat, and save it in the emproj subdirectory.
Apply workspace nodes
You must first specify the source of the data for the mailing list project. To define a data source, you drag and drop a node onto the workspace.
View the nodes by clicking the Tools tab at the bottom of the left window pane. A grouped list of icons and labels appears; a small section is shown below:
You can also select nodes from the Enterprise Miner menu bar at the top of the window. The bar shows a larger version of some of the icons in the node list. (To read the name of a node in the menu bar, briefly hold the mouse pointer over its icon. The name will appear in a tooltip box.)
Define a data source
1. Left-click the node that is labeled Input Data Source and drag it to the right window pane. Release the node to drop it into the workspace.
2. Double-click the new node to specify the source data. The Input Data Source window appears, with the Data tab in the foreground.
3. Click the Select button. The SAS Data Set window appears. Select the emproj directory.
4. Select DataMining2003 from the list of tables. Click OK to close the window.
5. Close the Input Data Source window. A confirmation box appears.
6. Click Yes to save your changes.
Data Exploration
Apply the Insight node
Enterprise Miner includes an Insight node for exploring your project's data. It lets you check the data for missing values, outliers, and skewed distributions.
1. Drag and drop the Insight node onto the workspace. Place the new node under the Input Data Source node.
2. Connect the Input Data Source node to the Insight node:
   1. Hold the mouse pointer at the edge of the Input Data Source node until it becomes a pair of crosshairs.
   2. Left-click and quickly drag to the Insight node.
   3. An arrow between the nodes appears.
3. Double-click the Insight node. The Insight Settings window appears:
4. Notice that the data set name is not DataMining2003 but a new name that has been provided by Enterprise Miner. The Description field displays the original data set name.
5. Because of the large data stores you might work with, the Insight node defaults to examining a sample of 2,000 records. Because the DataMining2003 data set has only 1,966 records, click the Entire data set button. The Sample size field will change to Data set size to reflect the change.
6. Close the Insight Settings window. A confirmation box appears.
7. Click Yes to save your changes.
View Insight node output
8. Right-click the Insight node and select Run. A green border appears around the node as it reads your data, and then a confirmation box pops up.
9. Click Yes to view results. A tabular view of your data appears:
10. With the table view open, select Analyze → Distribution from the SAS System menu bar.
11. A window for selecting distribution variables appears. Select income. Click the Y button and then OK. A window with the distribution of the income variable appears.
Create a target: Transforming variables
You are trying to target the buyers of dining wares, but the variable DINING presents a problem. Because it contains the sum of KITCHEN, DISHES, and FLATWARE, its values range from 0 to 28. But you are looking for the buyers of any dining wares, represented by all values greater than 0. What you need, therefore, is a binary version of DINING, where the values greater than 0 are collapsed to 1.
You create variables in Enterprise Miner with the Transform Variables node.
1. Drag and drop the Transform Variables node onto the workspace.
2. Connect the Input Data Source node to the Transform Variables node.
3. Double-click the Transform Variables node. The Transform Variables window appears.
4. Click the Create Variable icon on the workspace menu bar. The Create Variable window appears.
5. Enter DINEBIN in the Name field.
6. Enter DINING No/Yes in the Label field.
7. Click Define. The Customize window appears.
8. Enter dining>0 in the DINEBIN(N)= formula field at the bottom of the window.
9. Click OK. The Create Variable window reappears with dining>0 displayed in the Formula field.
10. Click OK. The new variable DINEBIN appears in the Transform Variables window.
11. Close the Transform Variables window. A confirmation box appears.
12. Select Yes to save your changes.
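The effect of the dining>0 formula can be illustrated outside Enterprise Miner. The following is a minimal Python sketch, not part of the tutorial's workflow; the sample values and the function name to_binary_target are made up for illustration.

```python
# Hypothetical sample of DINING values (sum of KITCHEN, DISHES, FLATWARE).
dining_counts = [0, 3, 0, 1, 28, 0]


def to_binary_target(dining):
    """Collapse a purchase count to 1 (any purchase) or 0 (none)."""
    return int(dining > 0)


# DINEBIN is the binary version of DINING described above.
dinebin = [to_binary_target(d) for d in dining_counts]
print(dinebin)  # [0, 1, 0, 1, 1, 0]
```

Every customer with at least one dining purchase ends up with the same value, 1, regardless of how many items were bought.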
Modifying Attributes
You now need to identify DINEBIN as the model's target. This is done with the Data Set Attributes node.
1. Drag and drop the Data Set Attributes node onto the workspace to the right of the Input Data Source node. Connect the Transform Variables node to the Data Set Attributes node.
2. Double-click the Data Set Attributes node. The Data Set Attributes window appears.
3. Click the Variables tab.
Scroll down the list of variables until DINEBIN appears. Notice the grayed-out column Model Role and the white column New Model Role. Grayed-out columns reflect the original data set attributes and cannot be edited.
Role refers to the use of each variable. Most variables are treated as input variables in an attempt to predict the target. If you scroll down the list of variables, you will see that Enterprise Miner considers certain variables unsuitable as inputs (e.g., dates, or variables with a single value). Such variables are given the role rejected.
4. Right-click in the column New Model Role to the right of the variable DINEBIN.
5. Select Set New Model Role from the pop-up menu.
6. Select target.
7. You are trying to target the buyers of dining wares (for whom the variable DINEBIN = 1). However, other variables in the data set contain the same information: KITCHEN, DISHES, and FLATWARE. DINEBIN has a value of 1 whenever the customer bought any of these dining wares, so leaving them in as inputs would let the model predict the target from the target's own components. It is therefore necessary to exclude them from the analysis (set their New Model Role to rejected).
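The role assignments from the steps above can be pictured as a simple lookup table. This is only an illustrative Python sketch: the dictionary layout, the function assign_roles, and the AGE entry are assumptions and do not correspond to anything inside Enterprise Miner.

```python
# Hypothetical role table: variable name -> model role.
roles = {
    "DINEBIN": "input",
    "KITCHEN": "input",
    "DISHES": "input",
    "FLATWARE": "input",
    "INCOME": "input",
    "AGE": "input",  # assumed variable name, for illustration only
}

TARGET = "DINEBIN"
# Variables that carry the same information as the target.
LEAKY = {"KITCHEN", "DISHES", "FLATWARE"}


def assign_roles(roles, target, leaky):
    """Return a copy of roles with the target marked and leaky inputs rejected."""
    new_roles = dict(roles)
    new_roles[target] = "target"
    for name in leaky:
        if name in new_roles:
            new_roles[name] = "rejected"
    return new_roles


new_roles = assign_roles(roles, TARGET, LEAKY)
inputs = sorted(name for name, role in new_roles.items() if role == "input")
print(inputs)  # ['AGE', 'INCOME']
```

Only the variables that remain inputs would be offered to a model as predictors.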
Note: Within the Data Set Attributes window, the column Measurement refers to measurement level. This is the range of values that is found in each variable. There are five possible assignments:
unary - one value
   for example, a variable with a particular value that was used to create a data subset
binary - two values
   for example, the variable MARITAL that contains No or Yes
nominal - more than two non-numeric values, but no implied order
   for example, STATECOD that contains AK, AL, AR, AZ, etc.
ordinal - more than two but not more than ten numeric values, with implied order
   for example, NUMCARS that contains values from 0 to 3
interval - more than ten numeric values
   for example, AMOUNT that contains many different dollar values
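The five assignments listed above can be written as a small decision rule. The Python sketch below follows only the rules as stated in this note; it is not a description of how Enterprise Miner itself assigns measurement levels, and the function name and sample values are illustrative.

```python
def measurement_level(values):
    """Classify a variable by the number and type of its distinct values."""
    distinct = set(values)
    if not distinct:
        raise ValueError("variable has no values")
    n = len(distinct)
    if n == 1:
        return "unary"
    if n == 2:
        return "binary"
    # More than two distinct values: the level depends on whether they are numeric.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    if not numeric:
        return "nominal"
    return "ordinal" if n <= 10 else "interval"


print(measurement_level(["No", "Yes", "No"]))                # binary
print(measurement_level(["AK", "AL", "AR", "AZ"]))           # nominal
print(measurement_level([0, 1, 2, 3, 1]))                    # ordinal
print(measurement_level([i * 2.5 for i in range(25)]))       # interval
```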
A new order for values in Target Variable
When you build a model, Enterprise Miner considers the target event to be the first sorted value of the target variable. The default sort order is ascending. But the new target variable, DINEBIN, contains values of 0 and 1, with 1 representing the purchase of any dining wares. The values need to be in descending order for Enterprise Miner to aim at the intended target.
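To see why the sort order matters, the "first sorted value" rule can be mimicked in a few lines of Python. This is a sketch of the rule as described above, with a made-up function name and sample data, not Enterprise Miner code.

```python
def target_event(values, order="ascending"):
    """Return the first distinct value of the target under the given sort order."""
    if order not in ("ascending", "descending"):
        raise ValueError("order must be 'ascending' or 'descending'")
    levels = sorted(set(values), reverse=(order == "descending"))
    return levels[0]


# Hypothetical DINEBIN values for five customers.
dinebin_values = [0, 1, 0, 0, 1]

print(target_event(dinebin_values))                # 0: non-buyers would be the event
print(target_event(dinebin_values, "descending"))  # 1: buyers are the event
```

With ascending order the model would aim at customers who bought nothing; descending order puts the value 1 first, which is the intended target.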
To change the order of a target variable:
1.