Regrdiag
1. Does the relation between Y and the X's follow a linear pattern?
2. Are the residuals approximately normally distributed?
3. Are the residuals reasonably homoscedastic?
4. Are the residuals autocorrelated?
5. Do all data points contribute roughly equally to determining the position of the line, or do some points exert undue influence on the outcome?
6. Are there any outlying data points which clearly do not fit the general pattern?
A number of plots can be prepared, but it has to be remembered that they are not always sufficiently powerful and that they can be misleading. Such plots are as follows:
1. Scatterplot of y v x_i
Use with care, but may suggest non-linearity.
2. Residuals / Standardised Residuals v x_i
The presence of a curvilinear relationship suggests that a higher-order term, e.g. quadratic, should be added to the model, or that a transformation, such as a log, should be considered. Can indicate the existence of outliers, structural breaks and non-constant variance of the error term, i.e. heteroscedasticity.
3. Residuals v explanatory variables not in the model
The presence of a relationship would suggest that the explanatory variable should be included in the model.
4. Residuals v ŷ (the predicted values)
If the variance of the residuals changes with the predicted values, then heteroscedasticity is indicated. Outliers, non-linearity and structural breaks may also be indicated.
5. Residuals v time
In the case of time series data, correlation between the error terms can be detected, suggesting the presence of autocorrelation. This may indicate missing variable(s) in the model.
6. Variables v time
A problem associated with non-stationary variables, and frequently faced by econometricians when dealing with time series data, is the spurious regression problem. If at least one of the explanatory variables in a regression equation is non-stationary in the sense that it displays a distinct stochastic trend, it is very likely that the dependent variable in the equation will display a similar trend. If such a problem is detected, then error correction models (ECM) and cointegration analysis will have to be considered.
7. Normal plot of the residuals
The use of normality plots can help detect abnormalities with the data and the model. If the model is correctly specified, then the residuals should look like a sample from a normal distribution.
Note:
Any systematic pattern in the residuals of a regression equation should be regarded as suggestive of the possibility of misspecification.
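As a minimal sketch of the point made under plot 2, the Python fragment below fits a straight line to data that are really quadratic and prints the sign of each residual in order of x. The data and the helper names fit_line and residuals are hypothetical, chosen only for illustration; in practice the residuals would be plotted rather than printed.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def residuals(x, y):
    """Residuals e_i = y_i - (a + b*x_i) from the fitted line."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data: y is exactly quadratic in x, yet a straight line is fitted.
x = [float(v) for v in range(1, 10)]
y = [v ** 2 for v in x]

e = residuals(x, y)
signs = "".join("+" if ei > 0 else "-" for ei in e)
print(signs)  # residuals are positive at both ends and negative in the middle
```

The run of positive, then negative, then positive residuals is the curvilinear pattern that would suggest adding a quadratic term or transforming a variable.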
Normality Tests
Recall that one of the assumptions of the classical regression model is that the errors are normally distributed about their zero mean. The assumption is necessary if the inferential aspects of classical regression (t tests, F tests etc.) are to be valid in small samples.
There are several tests of normality that can be used.
1. Histogram of Residuals
A simple graphical device, but rather subjective.
2. Normal Probability Plot
A comparatively simple graphical device. MINITAB will produce normal scores by means of the NSCORES command. These can be plotted against the residuals, and an approximately straight line should be produced if the residuals are normally distributed. There should also be an extremely high correlation between the residuals and the normal scores; statistical tables will be required to check the significance of this correlation.
3. Normal Probability Tests produced by MINITAB
Anderson-Darling normality test
Ryan-Joiner normality test
Kolmogorov-Smirnov normality test
These tests are constructed using different assumptions about the data (for details see the HELP facility within MINITAB). They all take the null hypothesis to be one of normality. Therefore, normality of the residuals will be rejected if the quoted p-value is smaller than the significance level, α.
4. The Jarque-Bera Test for Normality
A test of normality which is found in a number of econometric packages is the Jarque-Bera (JB) test. This is an asymptotic, or large sample, test and is based on the OLS residuals. The test hinges on the values of skewness and kurtosis, which for a normal distribution are 0 and 3 respectively. These are measured by the third and fourth moments of the OLS residuals.
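Under a null hypothesis of normally distributed disturbances, we have skewness $\alpha_3 = 0$ and kurtosis $\alpha_4 = 3$.
It can be shown that
$$Z_3 = \frac{\alpha_3 \sqrt{n}}{\sqrt{6}} \quad \text{and} \quad Z_4 = \frac{(\alpha_4 - 3)\sqrt{n}}{\sqrt{24}}$$
both have a standard normal distribution in large samples.
Therefore, $Z_3^2 + Z_4^2$ will have a $\chi^2$ distribution with 2 df.
Hence,
$H_0$: residuals are normally distributed, i.e. $\alpha_3 = 0$ and $\alpha_4 = 3$
v
$H_1$: residuals are not normally distributed, i.e. $\alpha_3 \neq 0$ or $\alpha_4 \neq 3$ or both
We reject $H_0$ if
$$JB = Z_3^2 + Z_4^2 = \frac{n}{6}\,\alpha_3^2 + \frac{n}{24}\,(\alpha_4 - 3)^2 = n\left[\frac{\alpha_3^2}{6} + \frac{(\alpha_4 - 3)^2}{24}\right] > \chi^2_{(2)}$$
at the $\alpha$% significance level, where $n$ is the sample size, $\alpha_3$ is skewness and $\alpha_4$ is kurtosis.
Now, from the sample of residuals,
$$\mu_2 = \frac{\sum e_i^2}{n}, \quad \mu_3 = \frac{\sum e_i^3}{n}, \quad \mu_4 = \frac{\sum e_i^4}{n}$$
It can be shown that
$$\alpha_3^2 = \frac{\mu_3^2}{\mu_2^3} \quad \text{and} \quad \alpha_4 = \frac{\mu_4}{\mu_2^2}$$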
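Two of the checks above can be sketched in a few lines of Python: the correlation between the residuals and their normal scores, and the JB statistic built from the second, third and fourth moments of the residuals. This is an illustration under stated assumptions, not a reproduction of any package's routine. The plotting positions (rank - 3/8)/(n + 1/4) used for the normal scores are an assumption, ties are ignored, the function names are hypothetical, and the five residuals are made up so that the arithmetic can be checked by hand (a real JB test needs a large sample). The p-value uses the fact that the upper tail of a chi-square distribution with 2 df is exp(-x/2).

```python
import math
from statistics import NormalDist


def normal_scores(e):
    """Normal score for each residual, returned in the original order.

    Assumes plotting positions (rank - 3/8) / (n + 1/4) and no tied values.
    """
    n = len(e)
    order = sorted(range(n), key=lambda i: e[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))
    return scores


def correlation(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def jarque_bera(e):
    """Return (skewness, kurtosis, JB, large-sample p-value).

    Moments are taken about zero, as is appropriate for OLS residuals
    from a model that includes an intercept.
    """
    n = len(e)
    m2 = sum(ei ** 2 for ei in e) / n
    m3 = sum(ei ** 3 for ei in e) / n
    m4 = sum(ei ** 4 for ei in e) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n * (skew ** 2 / 6 + (kurt - 3) ** 2 / 24)
    p_value = math.exp(-jb / 2)  # upper tail of chi-square with 2 df
    return skew, kurt, jb, p_value


# Hypothetical residuals, far too few for a real test.
e = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(correlation(e, normal_scores(e)))
print(jarque_bera(e))
```

For these residuals the second and fourth moments are 2 and 6.8, so skewness is 0, kurtosis is 1.7 and JB = 5(1.3)^2/24, which is well below any usual chi-square critical value with 2 df.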
Outliers, Leverage and Influence
In regression analysis, we should always beware of points which do not fit the general pattern or which exert undue influence on the outcome of our numerical summaries. There are three types of data point which should concern us. These are:
an outlier, a point of high leverage, and an influential point.
An outlier in a regression is a data point which has a large residual (usually more than three standard deviations from the mean (= 0)).
A point of high leverage can be defined thus:
‘A data point has a high leverage if it is extreme in the X-direction, i.e. it is a disproportionate distance away from the middle range of the X-values’. These points can exert undue influence on the outcome of an OLS regression line. They are capable of exerting a strong pull on the slope of the regression line. Whether they do so or not is another matter.
An influential point is a point which, if removed from the sample, would markedly change the position of the least squares regression line. Hence, influential data points pull the regression line in their direction. Note that influential data points do not necessarily produce large residuals, that is, they are not always outliers as well, although they can be. It is precisely because they draw the regression line towards themselves that they may end up with small residuals. Conversely, an outlier is not necessarily an influential point, particularly when it is a point with little leverage.
In general we note:
outliers are not necessarily influential
but they can be so (depending on leverage)
yet high leverage points are not always influential
and influential points are not necessarily outliers.
The presence of outliers or of influential points often gives us a clear signal that our model is probably misspecified. In terms of visual displays, outliers can be spotted with residual plots, whereas influential points really need scatter plots, which are not always so meaningful when dealing with several explanatory variables. Apart from these graphical methods, we can also use some special statistics designed to detect outliers, points of leverage and points of influence.
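The distinctions drawn above can be illustrated with a small Python sketch for a regression on a single X. Everything in it is an assumption made for illustration: the data are invented, the leverage formula h_i = 1/n + (x_i - mean)^2 / Sxx is the one for simple regression, and influence is measured crudely as the change in the fitted slope when the point is deleted. Eight points lie exactly on y = 2x, and one extra point is added in three different positions.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def diagnose(x, y, i):
    """Leverage, residual and slope change on deletion for point i."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    a, b = fit_line(x, y)
    h = 1 / n + (x[i] - mean_x) ** 2 / sxx      # leverage in simple regression
    resid = y[i] - (a + b * x[i])               # residual from the full fit
    _, b_del = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    return h, resid, b - b_del                  # slope with minus slope without


# Hypothetical base data lying exactly on y = 2x.
base_x = [float(v) for v in range(1, 9)]
base_y = [2.0 * v for v in base_x]

# One extra point is appended in each scenario (all values are made up).
extra_points = {
    "high leverage, on the line": (20.0, 40.0),
    "high leverage, off the line": (20.0, 10.0),
    "outlier at the centre of the X values": (4.5, 30.0),
}

results = {}
for label, (px, py) in extra_points.items():
    x, y = base_x + [px], base_y + [py]
    results[label] = diagnose(x, y, len(x) - 1)
    h, resid, dslope = results[label]
    print(f"{label}: leverage={h:.3f} residual={resid:.2f} slope change={dslope:.3f}")
```

In the first scenario the point has high leverage but leaves the slope unchanged; in the second it drags the slope well away from 2 while its own residual is smaller in magnitude than some of the others; in the third it has a large residual but, sitting at the mean of X, does not alter the slope (only the intercept shifts).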
Studentised Residuals
A good way to detect outliers is to investigate one observation at a time, using an OLS regression with the relevant observation excluded, and testing whether the prediction error for that observation is significantly large. This can be done most easily by including an observation-specific dummy variable. For example, to investigate the ith observation in a data set, we define a dummy variable taking a value of unity for the ith observation and zero for all other observations. If we include this dummy in the OLS regression, its coefficient will equal the required prediction error. To test the prediction error for significance, we can examine its t ratio. This t ratio is referred to as a studentised residual. It has a Student's t distribution with (T-1-K-1) degrees of freedom, where T is the number of observations and K is the number of explanatory variables.
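We define the studentised residuals ($e_i^*$) as
$$e_i^* = \frac{e_i}{s(i)\sqrt{1 - h_i}} = \frac{s\,e_i'}{s(i)}$$
where $s(i)$ is the standard error estimate of the regression fitted after deleting the ith observation, $h_i$ is a measure of leverage, and $e_i' = e_i / (s\sqrt{1 - h_i})$ is the standardised residual, with $s$ the standard error estimate of the regression fitted to all observations.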
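As a hedged check on this definition, the Python sketch below computes the studentised residual in two ways for a regression on a single X (so K = 1): from the formula above, and as the t ratio of the prediction error obtained by refitting without the ith observation. The two should agree up to rounding. The data are hypothetical, with one deliberately odd Y value, and the function names are illustrative only.

```python
import math


def fit(x, y):
    """OLS fit of y = a + b*x; returns (a, b, s, mean_x, sxx), s on n - 2 df."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(sse / (n - 2)), mean_x, sxx


def studentised_residual(x, y, i):
    """e_i / (s(i) * sqrt(1 - h_i)), with s(i) from the fit without point i."""
    n = len(x)
    a, b, _, mean_x, sxx = fit(x, y)
    h = 1 / n + (x[i] - mean_x) ** 2 / sxx
    e = y[i] - (a + b * x[i])
    _, _, s_del, _, _ = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    return e / (s_del * math.sqrt(1 - h))


def prediction_t_ratio(x, y, i):
    """Prediction error for point i from the deleted fit, over its standard error."""
    xd, yd = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
    a, b, s_del, mean_xd, sxx_d = fit(xd, yd)
    error = y[i] - (a + b * x[i])
    se = s_del * math.sqrt(1 + 1 / len(xd) + (x[i] - mean_xd) ** 2 / sxx_d)
    return error / se


# Hypothetical data: roughly y = 2x, with an unusual Y value at x = 7.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 20.0, 16.1]

stud = [studentised_residual(x, y, i) for i in range(len(x))]
tval = [prediction_t_ratio(x, y, i) for i in range(len(x))]
for i, (u, v) in enumerate(zip(stud, tval)):
    print(i, round(u, 3), round(v, 3))
```

With T = 8 and K = 1, each value would be compared with a t distribution on T-1-K-1 = 5 degrees of freedom. The observation at x = 7 stands out because deleting it makes s(i) much smaller than s.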
Unusual Y values will clearly stand o