Speech Recognition: Foreign Literature Translation, English-Language Source Text (Word file download).docx

Uploaded by: b****9 | Document ID: 12994423 | Upload date: 2022-10-01 | Format: DOCX | Pages: 14 | Size: 22.63 KB

Wayne Ward

MIT Laboratory for Computer Science, Cambridge, Massachusetts, USA
Oregon Graduate Institute of Science & Technology, Portland, Oregon, USA
Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

1 Defining the Problem

Speech recognition is the process of converting an acoustic signal, captured by a microphone or a telephone, to a set of words. The recognized words can be the final results, as for applications such as command and control, data entry, and document preparation. They can also serve as the input to further linguistic processing in order to achieve speech understanding, a subject covered in a separate section.

Speech recognition systems can be characterized by many parameters, some of the more important of which are shown in the accompanying figure. An isolated-word speech recognition system requires that the speaker pause briefly between words, whereas a continuous speech recognition system does not. Spontaneous, or extemporaneously generated, speech contains disfluencies and is much more difficult to recognize than speech read from a script. Some systems require speaker enrollment: a user must provide samples of his or her speech before using them. Other systems are said to be speaker-independent, in that no enrollment is necessary. Some of the other parameters depend on the specific task. Recognition is generally more difficult when vocabularies are large or have many similar-sounding words. When speech is produced in a sequence of words, language models or artificial grammars are used to restrict the combination of words.

The simplest language model can be specified as a finite-state network, where the permissible words following each word are given explicitly. More general language models approximating natural language are specified in terms of a context-sensitive grammar.
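A finite-state network of this kind can be sketched as a plain lookup table from each word to the words allowed to follow it. The vocabulary, the "<s>" and "</s>" boundary markers, and the function name below are all hypothetical, chosen only for illustration.

```python
# Hypothetical toy grammar: each word maps to the set of words that may follow it.
# "<s>" and "</s>" are assumed start and end markers, not taken from the text.
NETWORK = {
    "<s>": {"call", "dial"},
    "call": {"home", "office"},
    "dial": {"home", "office"},
    "home": {"</s>"},
    "office": {"</s>"},
}


def is_permitted(words, network=NETWORK):
    """Return True if every word-to-word transition is listed in the network."""
    path = ["<s>", *words, "</s>"]
    return all(nxt in network.get(cur, set()) for cur, nxt in zip(path, path[1:]))


print(is_permitted(["call", "home"]))  # True
print(is_permitted(["home", "call"]))  # False
```

A recognizer could use such a table to discard word hypotheses that the grammar does not allow, which is the sense in which the language model restricts the combination of words.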

One popular measure of the difficulty of the task, combining the vocabulary size and the language model, is perplexity, loosely defined as the geometric mean of the number of words that can follow a word after the language model has been applied (language modeling in general and perplexity in particular are discussed in a separate section). Finally, there are some external parameters that can affect speech recognition system performance, including the characteristics of the environmental noise and the type and the placement of the microphone.
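The loose definition of perplexity given above can be made concrete with a small sketch. The first function takes the geometric mean of the number of words allowed at each position. The second uses the usual probabilistic form, the exponential of the average negative log-probability, which reduces to the first when every permitted word is equally likely. The function names and the sample numbers are illustrative assumptions.

```python
import math


def geometric_mean_branching(branch_counts):
    """Geometric mean of the number of words allowed at each position."""
    if not branch_counts or any(c <= 0 for c in branch_counts):
        raise ValueError("branch_counts must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(c) for c in branch_counts) / len(branch_counts))


def perplexity(word_probs):
    """exp of the average negative log-probability assigned to each word."""
    if not word_probs or any(p <= 0 or p > 1 for p in word_probs):
        raise ValueError("word_probs must be a non-empty list of values in (0, 1]")
    return math.exp(-sum(math.log(p) for p in word_probs) / len(word_probs))


# Two positions: 2 words allowed at the first, 8 at the second.
print(geometric_mean_branching([2, 8]))  # approximately 4
# The same situation with uniform probabilities 1/2 and 1/8.
print(perplexity([1 / 2, 1 / 8]))        # approximately 4
```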

Speech recognition is a difficult problem, largely because of the many sources of variability associated with the signal. First, the acoustic realizations of phonemes, the smallest sound units of which words are composed, are highly dependent on the context in which they appear. These phonetic variabilities are exemplified by the acoustic differences of the same phoneme in different contexts. At word boundaries, contextual variations can be quite dramatic, making "gas shortage" sound like "gash shortage" in American English, and "devo andare" sound like "devandare" in Italian.

Second, acoustic variabilities can result from changes in the environment as well as in the position and characteristics of the transducer. Third, within-speaker variabilities can result from changes in the speaker's physical and emotional state, speaking rate, or voice quality. Finally, differences in sociolinguistic background, dialect, and vocal tract size and shape can contribute to across-speaker variabilities.

The accompanying figure shows the major components of a typical speech recognition system. The digitized speech signal is first transformed into a set of useful measurements or features at a fixed rate, typically once every 10 to 20 msec (see the section on signal representation and section 11.3 on digital signal processing). These measurements are then used to search for the most likely word candidate, making use of constraints imposed by the acoustic, lexical, and language models. Throughout this process, training data are used to determine the values of the model parameters.
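The fixed-rate measurement step can be illustrated with a minimal framing sketch. It slides a window over the samples every 10 msec and computes one number per frame. The 25 msec window length and the use of log energy as the feature are assumptions made for the example; real front ends compute richer spectral features.

```python
import math


def frame_log_energy(samples, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Return one log-energy value per frame, with frames starting every hop_ms."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop must each span at least one sample")
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Small floor (an assumed constant) avoids log(0) on silent frames.
        features.append(math.log(energy + 1e-10))
    return features


# Synthetic test signal: one second of a 440 Hz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
feats = frame_log_energy(tone, rate)
print(len(feats))  # 98 frames for one second of audio
```

With a 10 msec hop the feature sequence has roughly 100 entries per second of speech, and this sequence is what the search stage consumes.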

Speech recognition systems attempt to model the sources of variability described above in several ways. At the level of signal representation, researchers have developed representations that emphasize perceptually important speaker-independent features of the signal, and de-emphasize speaker-dependent characteristics. At the acoustic phonetic level, speaker variability is typically modeled using statistical techniques applied to large amounts of data. Speaker adaptation algorithms have also been developed that adapt speaker-independent acoustic models to those of the current speaker during system use (discussed in a separate section). Effects of linguistic context at the acoustic phonetic level are typically handled by training separate models for phonemes in different contexts; this is called context-dependent acoustic modeling.
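One common way to set up context-dependent models is to give each phoneme a label that includes its left and right neighbours, so that each distinct context is trained as a separate unit. The sketch below only builds such labels; the "left-center+right" notation, the "sil" boundary symbol, and the phoneme symbols are assumptions for illustration, not taken from the text.

```python
def to_triphones(phonemes, boundary="sil"):
    """Label each phoneme with its neighbours as 'left-center+right'."""
    padded = [boundary, *phonemes, boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


# Hypothetical phoneme sequence for a single word.
print(to_triphones(["g", "ae", "s"]))
# ['sil-g+ae', 'g-ae+s', 'ae-s+sil']
```

Each distinct label would then receive its own acoustic model, so the same phoneme is modeled separately in each context where it occurs.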

Word level variability can be handled by allowing alternate pronunciations of words in representations known as pronunciation networks. Common alternate pronunciations of words, as well as effects of diale
