Statistical methods for research workers.docx

STATISTICAL METHODS FOR RESEARCH WORKERS

By Ronald A. Fisher (1925)

Posted April 2000

VI

THE CORRELATION COEFFICIENT

30. No quantity is more characteristic of modern statistical work than the correlation coefficient, and no method has been applied successfully to such various data as the method of correlation. Observational data in particular, in cases where we can observe the occurrence of various possible contributory causes of a phenomenon, but cannot control them, has been given by its means an altogether new importance. In experimental work proper its position is much less central; it will be found useful in the exploratory stages of an enquiry, as when two factors which had been thought independent appear to be associated in their occurrence; but it is seldom, with controlled experimental conditions, that it is desired to express our conclusion in the form of a correlation coefficient.

One of the earliest and most striking successes of the method of correlation was in the biometrical study of inheritance. At a time when nothing was known of the mechanism of inheritance, or of the structure of the germinal material, it was possible by this method to demonstrate the existence of inheritance, and to [p. 139] "measure its intensity"; and this in an organism in which experimental breeding could not be practised, namely, Man. By comparison of the results obtained from the physical measurements in man with those obtained from other organisms, it was established that man's nature is not less governed by heredity than that of the rest of the animate world. The scope of the analogy was further widened by demonstrating that correlation coefficients of the same magnitude were obtained for the mental and moral qualities in man as for the physical measurements.

These results are still of fundamental importance, for not only is inheritance in man still incapable of experimental study, and existing methods of mental testing are still unable to analyse the mental disposition, but even with organisms suitable for experiment and measurement, it is only in the most favourable cases that the several factors causing fluctuating variability can be resolved, and their effects studied, by Mendelian methods. Such fluctuating variability, with an approximately normal distribution, is characteristic of the majority of the useful qualities of domestic plants and animals; and although there is strong reason to think that inheritance in such cases is ultimately Mendelian, the biometrical method of study is at present alone capable of holding out hopes of immediate progress.

We give in Table 31 an example of a correlation table. It consists of a record in compact form of the stature of 1376 fathers and daughters. (Pearson and Lee's data.) The measurements are grouped in [p. 140-141] [table] [p. 142] inches, and those whose measurement was recorded as an integral number of inches have been split; thus a father recorded as of 67 inches would appear as 1/2 under 66.5 and 1/2 under 67.5. Similarly with the daughters; in consequence, when both measurements are whole numbers the case appears in four quarters. This gives the table a confusing appearance, since the majority of entries are fractional, although they represent frequencies. It is preferable, if bias in measurement can be avoided, to group the observations in such a way that each possible observation lies wholly within one group.
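The splitting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_cells` is hypothetical, groups are taken to be one inch wide and centred on the half-inch (66.5, 67.5, ...), and the heights passed in are made-up values, not entries from Pearson and Lee's table.

```python
import math
from fractions import Fraction

def split_cells(father, daughter):
    """Return {(father_group, daughter_group): weight} for one father-daughter pair.

    Assumption: groups are one inch wide and labelled by their centres
    (..., 66.5, 67.5, ...), so a whole-inch measurement lies on a boundary.
    """
    def groups(value):
        if value == math.floor(value):
            # On a group boundary: half the case goes to each neighbouring group.
            return [(value - 0.5, Fraction(1, 2)), (value + 0.5, Fraction(1, 2))]
        # Otherwise the case lies wholly within one group.
        return [(math.floor(value) + 0.5, Fraction(1))]

    return {
        (f_group, d_group): f_weight * d_weight
        for f_group, f_weight in groups(father)
        for d_group, d_weight in groups(daughter)
    }

# Hypothetical pair with both measurements whole numbers: four quarters.
print(split_cells(67, 63))
```

When only one of the two measurements is a whole number the case is split into two halves, and when neither is, it falls in a single cell with weight 1. The weights for any one pair always sum to 1, which is why the fractional entries still represent frequencies.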

The most obvious feature of the table is that cases do not occur in which the father is very tall and the daughter very short, and vice versa; the upper right-hand and lower left-hand corners of the table are blank, so that we may conclude that such occurrences are too rare to occur in a sample of about 1400 cases. The observations recorded lie in a roughly elliptical figure lying diagonally across the table. If we mark out the region in which the frequencies exceed 10 it appears that this region, apart from natural irregularities, is similar, and similarly situated. The frequency of occurrence increases from all sides to the central region of the table, where a few frequencies over 30 may be seen. The lines of equal frequency are roughly similar and similarly situated ellipses. In the outer zone observations occur only occasionally, and therefore irregularly; beyond this we could only explore by taking a much larger sample.

The table has been divided into four quadrants by [p. 143] marking out central values of the two variates; these values, 67.5 inches for the fathers and 63.5 inches for the daughters, are near the means. When the table is so divided it is obvious that the lower right-hand and upper left-hand quadrants are distinctly more populous than the other two; not only are more squares occupied, but the frequencies are higher. It is apparent that tall men have tall daughters more frequently than the short men, and vice versa. The method of correlation aims at measuring the degree to which this association exists.

The marginal totals show the frequency distributions of the fathers and the daughters respectively. These are both approximately normal distributions, as is frequently the case with biometrical data collected without selection. This marks a frequent difference between biometrical and experimental data. An experimenter would perhaps have bred from two contrasted groups of fathers of, for example, 63 and 72 inches in height; all his fathers would then belong to these two classes, and the correlation coefficient, if used, would be almost meaningless. Such an experiment would serve to ascertain the regression of daughter's height on father's height, and so to determine the effect on the daughters of selection applied to the fathers, but it would not give us the correlation coefficient, which is a descriptive observational feature of the population as it is, and may be wholly vitiated by selection.
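The contrast between regression and correlation under selection can be illustrated with a small simulation. This is a sketch under assumptions that are not in the text: the population is simulated in standardized units with a correlation of 0.5, the seed and sample size are arbitrary, and "selection" is modelled by keeping only fathers more than 1.5 standard deviations from the mean, standing in for the two contrasted groups.

```python
import math
import random

def slope_and_correlation(pairs):
    """Return (regression of y on x, correlation) for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Hypothetical population: x = father, y = daughter, both in standardized units.
rng = random.Random(0)
RHO = 0.5
population = []
for _ in range(20000):
    x = rng.gauss(0.0, 1.0)
    y = RHO * x + math.sqrt(1.0 - RHO ** 2) * rng.gauss(0.0, 1.0)
    population.append((x, y))

# Selection applied to the fathers only: keep the very short and the very tall.
selected = [(x, y) for x, y in population if abs(x) > 1.5]

slope_all, r_all = slope_and_correlation(population)
slope_sel, r_sel = slope_and_correlation(selected)
print(f"whole population: slope={slope_all:.3f}, r={r_all:.3f}")
print(f"selected fathers: slope={slope_sel:.3f}, r={r_sel:.3f}")
```

In runs of this kind the regression of y on x stays close to its population value after selection, while the correlation computed from the selected fathers is markedly inflated. It no longer describes the population as it is.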

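Just as normal variation with one variate may be specified by a frequency formula in which the [p. 144] logarithm of the frequency is a quadratic function of the variate, so with two variates the frequency may be expressible in terms of a quadratic function of the values of the two variates. We then have a normal correlation surface, for which the frequency may conveniently be written in the form

$$ df = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left( \frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2} \right) \right\} dx\,dy. $$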

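In this expression x and y are the deviations of the two variates from their means, σ₁ and σ₂ are the two standard deviations, and ρ is the correlation between x and y. The correlation in the above expression may be positive or negative, but cannot exceed unity in magnitude; it is a pure number without physical dimensions. If ρ = 0, the expression for the frequency degenerates into the product of the two factors

$$ \frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-x^2/2\sigma_1^2}\, dx \cdot \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-y^2/2\sigma_2^2}\, dy, $$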

showing that the limit of the normal correlation surface, when the correlation vanishes, is merely that of two normally distributed variates varying in complete independence. At the other extreme, when ρ is +1 or -1, the variation of the two variates is in strict proportion, so that the value of either may be calculated accurately from that of the other. In other words, we cease strictly to have two variates, but merely two measures of the same variable quantity.
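A minimal numerical check of the ρ = 0 case, assuming the normal correlation surface is the bivariate normal density in its standard form. The function names and the evaluation point are hypothetical, chosen only for illustration.

```python
import math

def normal_density(x, sigma):
    """Density of a single normal variate with mean 0 and standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def correlation_surface(x, y, sigma1, sigma2, rho):
    """Bivariate normal density for deviations x, y from the means (requires |rho| < 1)."""
    quadratic = (
        (x / sigma1) ** 2
        - 2.0 * rho * (x / sigma1) * (y / sigma2)
        + (y / sigma2) ** 2
    )
    normaliser = 2.0 * math.pi * sigma1 * sigma2 * math.sqrt(1.0 - rho ** 2)
    return math.exp(-quadratic / (2.0 * (1.0 - rho ** 2))) / normaliser

# With rho = 0 the surface equals the product of the two one-variate densities.
joint = correlation_surface(1.0, -0.5, 2.0, 3.0, 0.0)
product = normal_density(1.0, 2.0) * normal_density(-0.5, 3.0)
print(math.isclose(joint, product))
```

The function deliberately excludes ρ = ±1, where the density is undefined: the frequency then collapses onto a straight line, which corresponds to the remark that the two variates become two measures of the same quantity.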

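If we pick out the cases in which one variate has an assigned value, we have what is termed an array; [p. 145] the columns and rows of the table may, except as regards variation within the group limits, be regarded as arrays. With normal correlation the variation within an array may be obtained from the general formula, by giving x a constant value, (say) a, and dividing by the total frequency with which this value occurs; then we have

$$ df = \frac{1}{\sigma_2\sqrt{2\pi}\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left( \frac{y}{\sigma_2} - \frac{\rho a}{\sigma_1} \right)^2 \right\} dy, $$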

showing (i.) that the variation of y within the array is normal; (ii.) that the mean value of y for that array is ρaσ₂/σ₁, so that the regression of y on x is linear, with regression coefficient

$$ \rho\,\frac{\sigma_2}{\sigma_1}; $$

and (iii.) that the variance of y within the array is σ₂²(1 − ρ²), and is the same within each array. We may express this by saying that of the total variance of y the fraction (1 − ρ²) is independent of x, while the remaining fraction, ρ², is determined by, or calculable from, the value of x.
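Statements (ii.) and (iii.) about the array can be checked numerically. The sketch below assumes the standard bivariate normal form for the correlation surface; the parameter values (σ₁ = 2, σ₂ = 3, ρ = 0.6, a = 1.5) and function names are hypothetical. It fixes x = a, weights a fine grid of y values by the joint frequency, and compares the resulting mean and variance with ρaσ₂/σ₁ and σ₂²(1 − ρ²).

```python
import math

def array_mean(a, sigma1, sigma2, rho):
    """Mean of y in the array x = a: rho * a * sigma2 / sigma1."""
    return rho * a * sigma2 / sigma1

def array_variance(sigma2, rho):
    """Variance of y within any array: sigma2^2 * (1 - rho^2)."""
    return sigma2 ** 2 * (1.0 - rho ** 2)

def array_moments_numeric(a, sigma1, sigma2, rho, step=0.01, width=12.0):
    """Mean and variance of y at x = a, from the joint frequency on a grid of y."""
    n = int(width * sigma2 / step)
    ys = [k * step for k in range(-n, n + 1)]

    def weight(y):
        # Joint frequency at (a, y), up to a constant factor that cancels below.
        quadratic = (
            (a / sigma1) ** 2
            - 2.0 * rho * a * y / (sigma1 * sigma2)
            + (y / sigma2) ** 2
        )
        return math.exp(-quadratic / (2.0 * (1.0 - rho ** 2)))

    weights = [weight(y) for y in ys]
    total = sum(weights)  # plays the role of "the total frequency with which this value occurs"
    mean = sum(y * w for y, w in zip(ys, weights)) / total
    variance = sum((y - mean) ** 2 * w for y, w in zip(ys, weights)) / total
    return mean, variance

numeric_mean, numeric_variance = array_moments_numeric(1.5, 2.0, 3.0, 0.6)
print(numeric_mean, array_mean(1.5, 2.0, 3.0, 0.6))
print(numeric_variance, array_variance(3.0, 0.6))
```

Changing a moves the array mean along the regression line but leaves the array variance unchanged, which is the sense in which the fraction (1 − ρ²) of the variance of y is independent of x.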

These relations are reciprocal; the regression of x on y is linear, with regression coefficient ρσ₁/σ₂; the correlation ρ is thus the geometric mean of the two regressions. The two regression lines representing the mean value of x for given y, and the mean value of y for given x, cannot coincide unless ρ = ±1. The variation of x within an array in which y is fixed, is normal with variance equal to σ₁²(1 − ρ²), so that we may say that of the variance of x the fraction (1 − ρ²) [p. 146] is independent of y, and the remaining fraction, ρ², is determined by, or calculable from, the value of y.
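The geometric-mean statement follows in one step from the two regression coefficients, ρσ₂/σ₁ for y on x and ρσ₁/σ₂ for x on y. A short worked form, in which the standard deviations cancel:

```latex
\sqrt{\left(\rho\,\frac{\sigma_2}{\sigma_1}\right)\left(\rho\,\frac{\sigma_1}{\sigma_2}\right)}
  = \sqrt{\rho^2}
  = |\rho|
```

The square root gives only the magnitude; the sign of ρ is the sign shared by the two regression coefficients, since both standard deviations are positive.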

Such are the formal mathematical consequences of normal correlation. Much biometric data certainly shows a general agreement with the features to be expected on this assumption; though I am not aware that the question has been subjected to any sufficiently critical enquiry. Approximate agreement is perhaps all that is needed to justify the use of the correlation as a quantity descriptive of the population; its efficacy in this respect is undoubted, and it is not improbable that in some cases it affords a complete description of the simultaneous variation of the variates.

31. The Statistical Estimation of the Correlation

Just as the mean and the standard deviation of a normal population in one variate may be most satisfactorily estimated from the first two moments of the observed distribution, so the only satisfactory estimate of the correlation, when the variates are normally correlated, is found from the "product moment." If x and y represent the deviations of the two variates from their means, we calculate the three statistics s₁, s₂, r by the three equations

$$ n s_1^2 = S(x^2), \qquad n s_2^2 = S(y^2), \qquad n r s_1 s_2 = S(xy); $$

then s₁ and s₂ are estimates of the standard deviations σ₁ and σ₂, and
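The three equations can be turned directly into a short computation. The sketch below is an illustration only: the function name `product_moment` is hypothetical, S(...) is read as a sum over the sample, the divisor n is used exactly as in the equations, and the data are made-up numbers rather than values from the text.

```python
import math

def product_moment(xs, ys):
    """Return (s1, s2, r) from n*s1^2 = S(x^2), n*s2^2 = S(y^2), n*r*s1*s2 = S(xy).

    Here x and y in the formulas are deviations from the sample means.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [value - mean_x for value in xs]
    dev_y = [value - mean_y for value in ys]
    s1 = math.sqrt(sum(d * d for d in dev_x) / n)
    s2 = math.sqrt(sum(d * d for d in dev_y) / n)
    r = sum(dx * dy for dx, dy in zip(dev_x, dev_y)) / (n * s1 * s2)
    return s1, s2, r

# Hypothetical sample of three pairs.
s1, s2, r = product_moment([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
print(s1, s2, r)
```

Because n appears on both sides of the third equation, r does not depend on whether n or some other divisor is used for the two variances, provided the same divisor is used throughout.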
