Language Testing Study Materials 2.docx

Uploaded by: b****5 · Document ID: 7537632 · Uploaded: 2023-01-24 · Format: DOCX · Pages: 6 · Size: 18.48 KB
Language Testing Study Materials 2

Chapter 2

The Validity of Language Testing

What is validity?

A test is said to be valid if it measures accurately what it is intended to measure.

Validity has a number of aspects:

• Content validity

• Criterion-related validity

• Construct validity

• Face validity

• The use of validity

Content Validity:

A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.

e.g.

A grammar test must be made up of items testing knowledge or control of grammar.

But this in itself does not ensure content validity. The test would have content validity only if it included a proper sample of the relevant structures.

What the relevant structures are will depend upon the purpose of the test.

In order to judge whether or not a test has content validity, we need a specification of the skills or structures etc. that it is meant to cover.

Such a specification should be made at a very early stage in test construction.

It is not to be expected that everything in the specification will always appear in the test; there may simply be too many things for all of them to appear in a single test.

But it will provide the test constructor with the basis for making a principled selection of elements for inclusion in the test.

A comparison of test specification and test content is the basis for judgments as to content validity. Ideally these judgments should be made by people who are familiar with language teaching and testing but who are not directly concerned with the production of the test in question.

What is the importance of content validity?

First, the greater a test’s content validity, the more likely it is to be an accurate measure of what it is supposed to measure.

A test in which major areas identified in the specification are under-represented, or not represented at all, is unlikely to be accurate.

Secondly, such a test is likely to have a harmful backwash effect. Areas which are not tested are likely to become areas ignored in teaching and learning. Too often the content of tests is determined by what is easy to test rather than what is important to test.

The best safeguard against this is to write full test specifications and to ensure that the test content is a fair reflection of these.
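
The comparison of specification and content can be roughed out mechanically before the expert judges look at it. The following is a minimal sketch in Python, assuming each test item has been labelled with the single area it tests. The area names, the item list, and the function name coverage_report are all hypothetical and do not come from the notes.

```python
from collections import Counter

# Hypothetical specification: the areas the test is meant to cover.
specification = ["present perfect", "passives", "conditionals", "relative clauses"]

# Hypothetical test content: the area each item was written to test.
item_areas = [
    "present perfect", "present perfect", "present perfect",
    "passives", "conditionals",
]

def coverage_report(spec, items):
    """Return the share of test items devoted to each specified area."""
    counts = Counter(items)
    total = len(items)
    return {area: counts[area] / total for area in spec}

report = coverage_report(specification, item_areas)

# Areas named in the specification that no item tests at all.
missing = [area for area, share in report.items() if share == 0]

for area, share in report.items():
    print(f"{area}: {share:.0%} of items")
print("Not represented:", missing)
```

A tally like this only flags areas that are absent or thinly covered. Whether the sample is representative remains a matter for the judges described above.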

Discussion

Case 1

Do you think an achievement test for intermediate learners that contains just the same set of structures as one for advanced learners has content validity?

No.

Case 2

About 20 years ago, candidates in a university entrance examination in America were given a composition topic:

Is photography an art or a science? Discuss.

Do you think this test has validity?

No.

Case 3

The intention of other people concerned, such as the Minister of Defense, to influence the government leaders to adapt their policy to fit in with the demands of the right wing, cannot be ignored.

What is the subject of “cannot be ignored”?

A. the intention

B. other people concerned

C. the Minister of Defense

D. the demands of the right wing

What is this item meant to measure: reading comprehension or sentence structure?

Criterion-related validity:

Another approach to test validity is to see how far results on the test agree with those provided by some independent and highly dependable assessment of the candidate’s ability. This independent assessment is thus the criterion measure against which the test is validated.

There are essentially two kinds of criterion-related validity: concurrent validity and predictive validity.

What is concurrent validity?

Concurrent validity is established when the test and the criterion are administered at about the same time.

e.g.

Course objectives call for an oral component as part of the final achievement test.

The objectives may list a large number of ‘functions’ which students are expected to perform orally, to test all of which might take 45 minutes for each student. This could well be impractical.

Perhaps it is felt that only ten minutes can be devoted to each student for the oral component.

The question then arises:

Can such a ten-minute session give a sufficiently accurate estimate of the student’s ability with respect to the functions specified in the course objectives?

Is it a valid measure?

From the point of view of content validity, this will depend on how many of the functions are tested in the component, and how representative they are of the complete set of functions included in the objectives.

Every effort should be made when designing the oral component to give it content validity. Once this has been done, however, we can go further. We can attempt to establish the concurrent validity of the component.

How is this done?

We should choose at random a sample of all the students taking the test.

These students would then be subjected to the full 45-minute oral component necessary for coverage of all the functions, using perhaps four scorers to ensure reliable scoring.

This would be the criterion test against which the shorter test would be judged.

The students’ scores on the full test would be compared with the ones they obtained on the ten-minute session, which would have been conducted and scored in the usual way, without knowledge of their performance on the longer version.

If the comparison between the two sets of scores reveals a high level of agreement, then the shorter version of the oral component may be considered valid, inasmuch as it gives results similar to those obtained with the longer version.

If, on the other hand, the two sets of scores show little agreement, the shorter version cannot be considered valid; it cannot be used as a dependable measure of achievement with respect to the functions specified in the objectives.

Of course, if ten minutes really is all that can be spared for each student, then the oral component may be included for the contribution that it makes to the assessment of students’ overall achievement and for its backwash effect. But it cannot be regarded as an accurate measure in itself.
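
The first step of the procedure, choosing the criterion group at random, might look like the sketch below. The roster, the sample size of 20, and the function name draw_criterion_sample are hypothetical; the notes do not say how large the sample should be.

```python
import random

# Hypothetical roster: identifiers for every student taking the ten-minute test.
all_students = [f"student_{i:03d}" for i in range(1, 121)]

def draw_criterion_sample(students, sample_size, seed=None):
    """Pick students at random, without replacement, for the full 45-minute test."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(students, sample_size)

criterion_group = draw_criterion_sample(all_students, sample_size=20, seed=1)
print(len(criterion_group), "students selected for the 45-minute criterion test")
```

Drawing without replacement ensures no student is selected twice, and a random draw avoids hand-picking students whose results are already suspected.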

The account above speaks of ‘a high level of agreement’ and ‘little agreement’. How is the level of agreement measured?

There are standard procedures for comparing sets of scores. They yield a ‘validity coefficient’, a mathematical measure of similarity.

Perfect agreement between two sets of scores will result in a validity coefficient of 1.

Total lack of agreement will give a coefficient of zero.

To interpret a coefficient, it is best to square it. For example, a coefficient of 0.7 between the two oral tests, squared, gives 0.49; converted to a percentage, that is 49 percent.

On the basis of this, we can say that the scores on the short test predict 49 percent of the variation in scores on the longer test.

In broad terms, there is almost 50 percent agreement between one set of scores and the other.

A coefficient of 0.5 would signify 25 percent agreement;

a coefficient of 0.8 would indicate 64 percent agreement.

It is important to note that a ‘level of agreement’ of 50 percent does not mean that 50 percent of the students would each have equivalent scores on the two versions. We are dealing with an overall measure of agreement that does not refer to the individual scores of students.
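
The arithmetic above can be reproduced in a few lines. This sketch assumes the validity coefficient is computed as a Pearson product-moment correlation, which the notes do not state explicitly. The six pairs of scores and the function names are hypothetical.

```python
from math import sqrt

def validity_coefficient(short_scores, full_scores):
    """Pearson correlation between two sets of scores from the same students."""
    n = len(short_scores)
    mean_s = sum(short_scores) / n
    mean_f = sum(full_scores) / n
    cov = sum((s - mean_s) * (f - mean_f)
              for s, f in zip(short_scores, full_scores))
    var_s = sum((s - mean_s) ** 2 for s in short_scores)
    var_f = sum((f - mean_f) ** 2 for f in full_scores)
    return cov / sqrt(var_s * var_f)

def percent_agreement(coefficient):
    """Square the coefficient and express the result as a percentage."""
    return coefficient ** 2 * 100

# Hypothetical scores for six students on the short and the full oral test.
short = [12, 15, 9, 18, 14, 11]
full = [48, 61, 40, 70, 52, 50]

r = validity_coefficient(short, full)
print(f"coefficient = {r:.2f}, agreement = {percent_agreement(r):.0f} percent")

# The coefficients quoted in the notes, squared and shown as percentages.
for c in (0.5, 0.7, 0.8):
    print(c, "->", round(percent_agreement(c)), "percent")
```

Because the squared coefficient summarises the whole set of scores, it gives no information about how close any individual student's two scores are.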

What is predictive validity?

Predictive validity concerns the degree to which a test can predict candidates’ future performance.

e.g.

How well could a proficiency test predict a student’s ability to cope with a graduate course at a British university?

The choice of criterion measure raises interesting issues:

Should we rely on the subjective and untrained judgments of supervisors?

How helpful is it to use final outcome as the criterion measure when so many factors other than ability in English (such as subject knowledge, intelligence, motivation, health and happiness) will have contributed to every outcome?

Where outcome is used as the criterion measure, a validity coefficient of around 0.4 (only about 16 percent agreement when squared) is about as high as one can expect.

This is partly because of the other factors, and partly because those students whose English the test predicted would be inadequate are not normally permitted to take the course, and so the test’s (possible) accuracy in predicting problems for those students goes unrecognized. As a result, a validity coefficient of this order is generally regarded as satisfactory.
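
The second reason, that low scorers never take the course, can be illustrated with a toy simulation. Everything in this sketch is an assumption made for illustration: the scores are randomly generated, the outcome is modelled as test score plus unrelated noise, and the cutoff of 0.0 is an arbitrary admission threshold. It is not real data.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)

# Simulated proficiency test scores (standardised) for 5000 candidates.
test_scores = [rng.gauss(0, 1) for _ in range(5000)]

# Assumed model: course outcome = test score + other, unrelated factors.
outcomes = [t + rng.gauss(0, 1) for t in test_scores]

# Correlation if every candidate were allowed to take the course.
r_all = pearson(test_scores, outcomes)

# Correlation among admitted candidates only (score at or above the cutoff).
cutoff = 0.0
admitted = [(t, o) for t, o in zip(test_scores, outcomes) if t >= cutoff]
r_admitted = pearson([t for t, _ in admitted], [o for _, o in admitted])

print(f"all candidates: {r_all:.2f}, admitted only: {r_admitted:.2f}")
```

Under these assumptions the coefficient computed on admitted candidates alone comes out lower than the one computed on everyone, which is the effect the paragraph above describes.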

e.g.

To validate a placement test:

Placement tests attempt to predict the most appropriate class for any particular student. Validation would involve an enquiry, once courses were under way, into the proportion of students who were thought to be misplaced. It would then be a matter of comparing the number of misplacements (and their effect on teaching and learning) with the cost of developing and administering a test which would place students more accurately.
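
The proportion of misplaced students is a simple tally once teachers' judgments have been collected. The records, field names, and function name below are hypothetical, a minimal sketch rather than a prescribed procedure.

```python
def misplacement_rate(placements):
    """Share of students whom teachers judged to be in the wrong class."""
    misplaced = sum(1 for p in placements if p["misplaced"])
    return misplaced / len(placements)

# Hypothetical records gathered once courses are under way.
placements = [
    {"student": "A", "placed_in": "intermediate", "misplaced": False},
    {"student": "B", "placed_in": "intermediate", "misplaced": True},
    {"student": "C", "placed_in": "advanced", "misplaced": False},
    {"student": "D", "placed_in": "beginner", "misplaced": False},
]

rate = misplacement_rate(placements)
print(f"{rate:.0%} of students were thought to be misplaced")
```

The figure on its own settles nothing. As the paragraph above says, it has to be weighed against the cost of building and running a more accurate test.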

What criterion measure should we choose?

Should we choose an assessment of the student’s English as perceived by his or her supervisor at the university, or the outcome of the course (pass/fail etc.)?

Construct validity

A test, part of a test, or a testing technique is said to have construct validity if it can be demonstrated that it measures just the ability which it is supposed to measure.

The word ‘construct’ refers to any underlying ability (or trait) which is hypothesized in a theory of language ability.

One might hypothesize, for example, that the ability to read involves a number of sub-abilities, such as the ability to guess the meaning of unknown words from the context in which they are met.

It would be a matter of empirical research to establish whether or not such a distinct ability existed and could be measured. If we attempted to measure that ability in a particular test,
