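English Educational Measurement and Evaluation

Graduate Degree Course Examination Paper

School (department, institute): School of Foreign Languages    Major: English

Examination subject: English Educational Measurement and Evaluation    Second semester

Graduate student name: 戎竞雄    Student ID: 132300176

Examination score:

Supervisor's comments:

Supervisor's signature:

Date (year, month, day):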

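Notes

1. All degree course examination questions and answer papers must be bound together with this cover sheet.

The marking supervisor must grade the paper in red ink, enter the score and comments in the designated places on this cover sheet, and hand it to the academic affairs officer of the school (department, institute) office within two weeks (one month for essay-based examinations). The academic affairs officer records the results promptly and submits the transcript to the Graduate Office for central filing before the end of the semester or at the start of the following semester.

Examination questions and papers are kept by the school (department, institute) office.

2. Apart from computer printing paper and 16mo small-grid manuscript paper, degree course examinations must be written on the "degree course examination paper" printed by the Graduate Office.

3. Print this cover sheet double-sided on A4 paper, with these notes on the back.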

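Shanghai Normal University Standard Examination Paper

Academic year 2013~2014, second semester    Examination date: August 2014 (day not filled in)

Subject: English Educational Measurement and Evaluation

Program: Subject Teaching + Curriculum and Teaching Methodology (master's)    Class of 2013    Name: 戎竞雄    Student ID: 132300176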

Item: I, II, III, IV, V, VI, VII, VIII, Total

Points available: 100

Score:

I pledge to abide by the Shanghai Normal University Examination Room Rules and to take this examination honestly.

Signature: 戎竞雄

I. Answer the following questions in detail:

(45%, 15 points each)

1. Suggest the differences between proficiency tests and achievement tests. Give examples if necessary.

Answer:

A proficiency test assesses the general knowledge or skills commonly required for entry into a group of similar institutions. One example is TOEFL. Proficiency tests are norm-referenced tests because NRTs have all the qualities desirable for proficiency decisions. An achievement test, by contrast, must be designed with very specific reference to a particular course. Achievement tests are often directly based on course objectives and will therefore be criterion-referenced. Such tests will typically be administered at the end of a course to determine how effectively students have mastered the instructional objectives. Achievement tests must be not only very specifically designed to measure the objectives of a given course but also flexible enough to help teachers readily respond to what they learn from the test about the students' abilities, the students' needs, and the students' learning of the course objectives. One example is a final test given at the end of a course.

2. The following are two different kinds of score distribution.

What information do these two figures convey to us?

(Discuss from the score distribution of NRT and that of CRT)

Answer:

The first figure shows a norm-referenced test (NRT), which is designed to measure global language abilities (for instance, overall English language proficiency, including listening ability, reading comprehension, and so on). Each student's score on such a test is interpreted relative to the scores of all other students who took the test. Such comparisons are usually made with reference to the concept of the normal distribution (familiarly known as the bell curve). The purpose of an NRT is to spread students out along a continuum of scores so that those with "poor" language abilities are at one end of the normal distribution, while those with "high" abilities are at the other end (with the bulk of the students falling near the middle).

The second figure shows a criterion-referenced test (CRT), which is usually produced to measure well-defined and fairly specific objectives. Often these objectives are specific to a particular course or program. Each student's score is meaningful without reference to the other students' scores. A student's score on a particular objective indicates the percent of the knowledge or skill in that objective that the student has learned.

Moreover, the distribution of scores on a CRT need not be normal. If all the students know 100% of the material on all the objectives, then all the students should receive the same score.

The purpose of a CRT is to measure the amount of learning that a student has accomplished on each objective. In most cases, the students would know in advance what types of questions, tasks, and content to expect for each objective.
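The two ways of interpreting a score described above can be contrasted in a short Python sketch. The function names and all the numbers are hypothetical illustrations, not data from the two figures: a percentile rank reads a score against the group (the NRT view), while a percent-mastered figure reads it against an objective (the CRT view).

```python
def percentile_rank(score, all_scores):
    """Norm-referenced view: percent of test takers scoring below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def percent_mastered(items_correct, items_total):
    """Criterion-referenced view: percent of an objective's items answered correctly."""
    return 100.0 * items_correct / items_total


# Hypothetical raw scores for ten students on the same test.
group_scores = [35, 42, 48, 50, 50, 53, 55, 58, 61, 70]

# NRT interpretation: where does a score of 58 fall relative to the group?
nrt_view = percentile_rank(58, group_scores)

# CRT interpretation: 18 of 20 items on one objective answered correctly.
crt_view = percent_mastered(18, 20)

print(f"Percentile rank of 58: {nrt_view:.0f}")
print(f"Percent of objective mastered: {crt_view:.0f}%")
```

The percentile rank changes whenever the comparison group changes, whereas the percent mastered depends only on the student's own answers, which is why every student can legitimately receive 100% on a CRT.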

3. What are reliability and validity?

What is the relationship between reliability and validity?

To assess a candidate's oral language ability in an oral test, the examining body often asks two examiners to score that candidate's performance. Similarly, when an examiner is grading a composition for a certain test, e.g. TEM4, the same composition can be marked by the same examiner on two occasions. Explain in detail why such measures should be taken.

Answer:

Test reliability is defined as the extent to which the results can be considered consistent or stable. Test validity is defined here as the degree to which a test measures what it claims, or purports, to be measuring.

If teachers administer a placement test to their students on one occasion, they would like the scores to be very much the same if they were to administer the same test again. The degree to which a test is consistent, or reliable, can be estimated by calculating a reliability coefficient, which can go as high as +1.0 for a perfectly reliable test or as low as 0 when the results on the test are totally unreliable. Once the test has been administered twice and the pairs of scores for each student are lined up, simply calculate a Pearson product-moment correlation coefficient between the two sets of scores. The correlation coefficient will provide a conservative estimate (that is, an underestimate) of the reliability of the test over time. This reliability estimate can be interpreted as the percent of reliable variance on the test.

As for validity, if a test claims to measure proficiency in German listening comprehension, for example, that is just what it should assess.
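The test-retest procedure described in this answer (two administrations, paired scores, one Pearson product-moment correlation) can be sketched in a few lines of Python. This is only an illustration: the helper name `pearson_r` and the six pairs of scores are invented for the example, and the same calculation applies when the two score lists come from two examiners or from one examiner on two occasions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two scores each")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the two means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for six students on two administrations of one test.
first_administration = [62, 70, 75, 81, 88, 93]
second_administration = [65, 68, 78, 80, 85, 95]

reliability = pearson_r(first_administration, second_administration)
print(f"Test-retest reliability estimate: {reliability:.2f}")
```

A coefficient close to +1.0 would suggest that the two sets of scores rank the students in nearly the same way, while a value near 0 would suggest the scores are too inconsistent to rely on.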

II. Discussion (55%, 17 points each for 1 and 2, 21 points for 3)

1. Look at the following table and answer the questions that follow:

1) Calculate the total standard scores for the two students.

2) Compare the total standard scores of the two students, see which student scored higher, and explain briefly why a teacher had better use the total standard scores instead of the total raw scores.

Subject | Mean | Student A | Student B
Psychology | | |
Writing | | |
Listening Comprehension | | |
Reading | | |
Literature | | |
Total | | 424 | 417

Answer:

1)

Subject | Mean | Student A | Standard score (A) | Student B | Standard score (B)
Psychology | | | 0.667 | | -0.17
Writing | | | -0.556 | | 0.667
Listening Comprehension | | | 1.2 | |
Reading | | | 1.9 | | -0.8
Literature | | | 0.667 | | 2.33
Total | | 424 | 3.88 | 417 | 5.03

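2) The standard scores calculated in the table above show that although Student A's total raw score is higher than Student B's, Student B's standard scores are clearly higher than Student A's in three of the subjects, that is, in the majority of them.

In terms of total standard scores, Student A's total is 3.88 and Student B's is 5.03, which is higher.

Student B therefore performed better.

Different tests are heterogeneous and their raw scores are not on a common scale, so summing raw scores across tests is not meaningful and does not give a true picture of a student's overall performance.

The sounder procedure is therefore to convert the score for each subject into a standard score first, sum the standard scores, and then compare the totals to judge how Students A and B performed relative to each other.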
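The conversion argued for above can be sketched in Python. The sketch assumes the usual z-score definition, z = (raw score - mean) / standard deviation. The subjects, means, standard deviations, and raw scores below are hypothetical and chosen only to show the effect; they are not the values from the examination table.

```python
def standard_score(raw, mean, sd):
    """Convert a raw score to a standard (z) score: (raw - mean) / sd."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


def total_standard_score(raw_scores, norms):
    """Sum z scores across subjects; `norms` maps subject -> (mean, sd)."""
    return sum(
        standard_score(raw, *norms[subject])
        for subject, raw in raw_scores.items()
    )


# Hypothetical means, standard deviations, and raw scores (not the exam's data).
norms = {"Writing": (70, 10), "Reading": (80, 5)}
student_x = {"Writing": 85, "Reading": 80}  # raw total 165
student_y = {"Writing": 70, "Reading": 90}  # raw total 160

total_x = total_standard_score(student_x, norms)  # 1.5 + 0.0
total_y = total_standard_score(student_y, norms)  # 0.0 + 2.0
print(total_x, total_y)
```

In this made-up case the student with the higher raw total ends up with the lower total standard score, because ten raw points above the mean count for more on the test with the smaller spread of scores.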

2. Use SPSS 16.0

(1) to calculate the correlation coefficient between two sets of writing scores marked by two teachers for the same group of students and then make the scatter plot. Discuss whether there is any correlation between these two sets of scores.

(2) to do the dependent-samples t-test to see whether there is any significant difference between the two teachers' marking of the same paper. Report and discuss the result. If there is a significant difference, give suggestions as to how to solve this problem.

(3) to calculate the correlation coefficient between three sets of writing scores marked by three teachers for the same group of students and make the scatter plots between Rater 1 and Rater 2, Rater 1 and Rater 3, and Rater 2 and Rater 3. Discuss whether there is any correlation between these three sets of scores.

(4) to do ANOVA to see whether there is any significant difference among the three teachers' marking of the same paper (use LSD and SNK, and draw a Means Plot). Report and discuss the result.
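For readers who want to check the arithmetic behind tasks (2) and (4) outside SPSS, the following Python sketch computes the dependent-samples t statistic and a one-way ANOVA F statistic by hand. It is not SPSS output: the rater scores are hypothetical, the function names are made up for the example, and the sketch stops at the test statistics and degrees of freedom, leaving significance levels, scatter plots, and the LSD and SNK post hoc comparisons to SPSS. Treating the three raters as separate groups is an assumption of the one-way design.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Dependent-samples t statistic and degrees of freedom for paired scores."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # t = mean difference / standard error of the mean difference
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def one_way_anova_f(*groups):
    """One-way ANOVA F statistic with its (between, within) degrees of freedom."""
    all_scores = [s for g in groups for s in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical writing scores from three raters for the same six scripts.
rater1 = [12, 14, 11, 15, 13, 10]
rater2 = [13, 15, 13, 16, 15, 12]
rater3 = [11, 14, 12, 15, 12, 10]

t_stat, t_df = paired_t(rater1, rater2)
f_stat, df_b, df_w = one_way_anova_f(rater1, rater2, rater3)
print(f"paired t({t_df}) = {t_stat:.2f}")
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The statistics would then be compared with critical values for the stated degrees of freedom, or with the significance levels SPSS reports, before concluding whether the raters differ.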

（1）