Phonetic figures are sound.docx

Uploaded by: b****7 | Document ID: 24022896 | Uploaded: 2023-05-23 | Format: DOCX | Pages: 13 | Size: 136.85 KB

Phonetic figures

Phonetic figures are sound-related figures of speech. They encompass various stylistic means, namely alliteration, assonance, cacophony, paronomasia (pun), and onomatopoeia.

All of these concrete realisations of phonetic figures relate to sounds: they represent repetition of consonant and vowel sounds (alliteration and assonance), clashes of sounds (cacophony), "play upon the sounds and meanings of words" (pun), and imitation of sounds (onomatopoeia).

3. Phonetics and Theory of Speech Production

Speech processing and language technology contain many special concepts and terms. To understand how different speech synthesis and analysis methods work, we need some knowledge of speech production, articulatory phonetics, and related terminology. The basic theory of these topics is discussed briefly in this chapter. For more detailed information, see for example Fant (1970), Flanagan (1972), Witten (1982), O'Shaughnessy (1987), or Kleijn et al. (1998).

3.1 Representation and Analysis of Speech Signals

Continuous speech is a set of complicated audio signals, which makes producing them artificially difficult. Speech signals are usually considered voiced or unvoiced, but in some cases they are something between the two. Voiced sounds consist of a fundamental frequency (F0) and its harmonic components produced by the vocal cords (vocal folds). The vocal tract modifies this excitation signal, causing formant (pole) and sometimes antiformant (zero) frequencies (Witten 1982). Each formant frequency also has an amplitude and a bandwidth, and it may sometimes be difficult to define some of these parameters correctly. The fundamental frequency and the formant frequencies are probably the most important concepts in speech synthesis and in speech processing in general.

With purely unvoiced sounds, there is no fundamental frequency in the excitation signal and therefore no harmonic structure either, and the excitation can be considered white noise. The airflow is forced through a vocal tract constriction, which can occur in several places between the glottis and the mouth. Some sounds are produced with a complete stoppage of airflow followed by a sudden release, producing an impulsive turbulent excitation often followed by a more protracted turbulent excitation (Kleijn et al. 1998). Unvoiced sounds are also usually quieter and less steady than voiced ones. The differences between these are easy to see in Figure 3.2, where the second and last sounds are voiced and the others unvoiced. Whispering is a special case of speech. When a voiced sound is whispered, there is no fundamental frequency in the excitation and the first formant frequencies produced by the vocal tract are perceived.

Speech signals of the three vowels (/a/, /i/, /u/) are presented in the time and frequency domains in Figure 3.1. The fundamental frequency is about 100 Hz in all cases, and the formant frequencies F1, F2, and F3 of the vowel /a/ are approximately 600 Hz, 1000 Hz, and 2500 Hz, respectively. For the vowel /i/ the first three formants are 200 Hz, 2300 Hz, and 3000 Hz, and for /u/ 300 Hz, 600 Hz, and 2300 Hz. The harmonic structure of the excitation is also easy to perceive from the frequency-domain presentation.

Fig. 3.1. The time- and frequency-domain presentation of vowels /a/, /i/, and /u/.

It can be seen that the first three formants are inside the normal telephone channel (from 300 Hz to 3400 Hz), so the bandwidth needed for intelligible speech is not very wide. For higher quality, a bandwidth of up to 10 kHz may be used, which leads to a 20 kHz sampling frequency. Although the fundamental frequency is outside the telephone channel, the human hearing system is capable of reconstructing it from its harmonic components.
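The bandwidth arithmetic in this paragraph can be checked with a short sketch. The function names are illustrative, integer frequencies in hertz are assumed, and recovering the fundamental as the common spacing of the surviving harmonics is only a simplification of what the hearing system does, not a model of it.

```python
from functools import reduce
from math import gcd

def sampling_rate_for(bandwidth_hz):
    """Minimum sampling rate for a given bandwidth (Nyquist criterion)."""
    return 2 * bandwidth_hz

def harmonics_in_band(f0_hz, low_hz=300, high_hz=3400):
    """Harmonics of f0 that fall inside a band-limited channel."""
    first = -(-low_hz // f0_hz)   # ceiling division: lowest harmonic number in band
    last = high_hz // f0_hz       # highest harmonic number in band
    return [k * f0_hz for k in range(first, last + 1)]

def implied_fundamental(harmonics_hz):
    """Largest frequency of which every given harmonic is an integer multiple."""
    return reduce(gcd, harmonics_hz)

passed = harmonics_in_band(100)      # a 100 Hz voice over a telephone channel
print(sampling_rate_for(10_000))     # sampling rate for a 10 kHz bandwidth
print(passed[0], passed[-1])         # lowest and highest surviving harmonics
print(implied_fundamental(passed))   # fundamental implied by their spacing
```

The 100 Hz component itself never reaches the listener here, yet the spacing of the harmonics that do get through still points back to it.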

Another commonly used way to describe a speech signal is the spectrogram, which is a time-frequency-amplitude presentation of a signal. The spectrogram and the time-domain waveform of the Finnish word kaksi (two) are presented in Figure 3.2. Higher amplitudes are shown with darker gray levels, so the formant frequencies and trajectories are easy to perceive. Spectral differences between vowels and consonants are also easy to see. The spectrogram is therefore perhaps the most useful presentation for speech research. From Figure 3.2 it is easy to see that vowels have more energy and that it is concentrated at lower frequencies. Unvoiced consonants have considerably less energy, and it is usually concentrated at higher frequencies. With voiced consonants the situation is somewhere between these two. In Figure 3.2 the frequency axis is in kilohertz, but it is also quite common to use an auditory spectrogram, where the frequency axis is replaced with a Bark or Mel scale, which is normalized for hearing properties.

Fig. 3.2. Spectrogram and time-domain presentation of the Finnish word kaksi (two).
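One way to build the auditory frequency axis mentioned above is to map hertz onto a perceptual scale. The sketch below uses two common approximations, which are assumptions here because the text does not say which formulas it has in mind: the 2595 log10(1 + f/700) mel formula and Traunmüller's formula for the Bark scale.

```python
import math

def hz_to_mel(f_hz):
    """Hertz to mel, using the common 2595 * log10(1 + f / 700) approximation."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def hz_to_bark(f_hz):
    """Hertz to Bark, using Traunmueller's approximation."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

# Equal steps in hertz shrink on both scales as frequency rises.
for f in (100, 1000, 2000, 4000):
    print(f, round(hz_to_mel(f), 1), round(hz_to_bark(f), 2))
```

Both mappings are roughly linear at low frequencies and compress the upper part of the axis, which is why formant detail below about 1 kHz takes up more room in an auditory spectrogram.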

To determine the fundamental frequency or pitch of speech, a method called cepstral analysis may be used, for example (Cawley 1996, Kleijn et al. 1998). The cepstrum is obtained by first windowing the signal and computing its Discrete Fourier Transform (DFT), then taking the logarithm of the power spectrum, and finally transforming it back to the time domain with the Inverse Discrete Fourier Transform (IDFT). The procedure is shown in Figure 3.3.

Fig. 3.3. Cepstral analysis.

Cepstral analysis provides a method for separating the vocal tract information from the excitation. The reverse transformation can thus be carried out to provide a smoother power spectrum, a technique known as homomorphic filtering.
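The windowing, DFT, logarithm, and IDFT chain can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production pitch tracker: the naive O(N^2) transform, the Hamming window, the 50 to 400 Hz search range, and the synthetic 100 Hz test signal are all choices made for the example.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, or its inverse."""
    n = len(x)
    sign = 1 if inverse else -1
    twiddle = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    out = [sum(x[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def cepstral_f0(signal, fs, f_min=50.0, f_max=400.0):
    """Estimate F0 from the strongest cepstral peak inside a pitch range."""
    n = len(signal)
    # 1. Window the frame (Hamming window).
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(signal)]
    # 2. DFT, then log power spectrum (the small floor avoids log(0)).
    log_power = [math.log(abs(c) ** 2 + 1e-12) for c in dft(windowed)]
    # 3. IDFT back to the time (quefrency) domain.
    cepstrum = [c.real for c in dft(log_power, inverse=True)]
    # 4. Pick the largest value among lags corresponding to f_max..f_min.
    lo, hi = int(fs / f_max), int(fs / f_min)
    lag = max(range(lo, hi), key=lambda q: cepstrum[q])
    return fs / lag

# Synthetic voiced frame: harmonics of 100 Hz with 1/k amplitudes, 8 kHz sampling.
fs, f0, n = 8000, 100.0, 512
voiced = [sum(math.sin(2 * math.pi * k * f0 * t / fs) / k for k in range(1, 40))
          for t in range(n)]
print(cepstral_f0(voiced, fs))
```

The harmonic comb in the log spectrum repeats every F0 hertz, so it shows up as a peak at a lag of one pitch period, while the slowly varying spectral envelope stays at the low lags that the search range excludes.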

The fundamental frequency or intonation contour over the sentence is important for correct prosody and natural-sounding speech. The different contours are usually analyzed from natural speech in specific situations and with specific speaker characteristics and then applied to rules to generate the synthetic speech. The fundamental frequency contour can be viewed as the composite set of hierarchical patterns shown in Figure 3.4. The overall contour is generated by the superposition of these patterns (Sagisaka 1990). Methods for controlling the fundamental frequency contours are described later in Chapter 5.

Fig. 3.4. Hierarchical levels of fundamental frequency (Sagisaka 1990).
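The superposition idea can be illustrated with a toy calculation. Every number below (the 120 Hz starting level, the declination slope, the accent positions, heights, and widths) is invented for illustration and is not taken from the cited model; only the principle of summing a phrase-level pattern and local accent-level patterns comes from the text.

```python
import math

def phrase_component(t, start_hz=120.0, slope_hz_per_s=-10.0):
    """Slowly declining baseline over the whole phrase (hypothetical values)."""
    return start_hz + slope_hz_per_s * t

def accent_component(t, center_s, height_hz, width_s=0.15):
    """Local rise around an accented syllable, modelled here as a Gaussian bump."""
    return height_hz * math.exp(-((t - center_s) / width_s) ** 2)

def f0_contour(times, accents):
    """Superpose the phrase-level pattern and all accent-level patterns."""
    return [phrase_component(t) + sum(accent_component(t, c, h) for c, h in accents)
            for t in times]

times = [i * 0.05 for i in range(41)]   # 0 to 2 s in 50 ms steps
accents = [(0.4, 25.0), (1.3, 15.0)]    # (center in s, height in Hz), hypothetical
contour = f0_contour(times, accents)
print([round(f) for f in contour[::4]])
```

Keeping the levels as separate functions means a rule system can change the phrase-level declination without touching the accent shapes, and the other way round.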

3.2 Speech Production

Human speech is produced by the vocal organs presented in Figure 3.5. The main energy source is the lungs with the diaphragm. When speaking, the airflow is forced through the glottis between the vocal cords and the larynx to the three main cavities of the vocal tract: the pharynx and the oral and nasal cavities. From the oral and nasal cavities the airflow exits through the mouth and nose, respectively. The V-shaped opening between the vocal cords, called the glottis, is the most important sound source in the vocal system. The vocal cords may act in several different ways during speech. Their most important function is to modulate the airflow by rapidly opening and closing, causing a buzzing sound from which vowels and voiced consonants are produced. The fundamental frequency of vibration depends on the mass and tension of the vocal cords and is about 110 Hz, 200 Hz, and 300 Hz for men, women, and children, respectively. With stop consonants the vocal cords may move suddenly from a completely closed position, in which they cut the airflow completely, to a totally open position, producing a light cough or a glottal stop. With unvoiced consonants, such as /s/ or /f/, on the other hand, they may be completely open. An intermediate position may also occur, for example with phonemes like /h/.

Fig. 3.5. The human vocal organs. (1) Nasal cavity, (2) Hard palate, (3) Alveolar ridge, (4) Soft palate (Velum), (5) Tip of the tongue (Apex), (6) Dorsum, (7) Uvula, (8) Radix, (9) Pharynx, (10) Epiglottis, (11) False vocal cords, (12) Vocal cords, (13) Larynx, (14) Esophagus, and (15) Trachea.

The pharynx connects the larynx to the oral cavity. It has almost fixed dimensions, but its length may be changed slightly by raising or lowering the larynx at one end and the soft palate at the other end. The soft palate also isolates or connects the route from the nasal cavity to the pharynx. At the bottom of the pharynx are the epiglottis and the false vocal cords, which prevent food from reaching the larynx and isolate the esophagus acoustically from the vocal tract. The epiglottis, the false vocal cords, and the vocal cords are closed during swallowing and open during normal breathing.

The oral cavity is one of the most important parts of the vocal tract. Its size, shape, and acoustics can be varied by the movements of the palate, the tongue, the lips, the cheeks, and the teeth. The tongue in particular is very flexible: the tip and the edges can be moved independently, and the entire tongue can move forward, backward, up, and down. The lips control the size and shape of the mouth opening through which speech sound is radiated. Unlike the oral cavity, the nasal cavity has fixed dimensions and shape. Its length is about 12 cm and its volume about 60 cm³. The airstream to the nasal cavity is controlled by the soft palate.

From a technical point of view, the vocal system may be considered a single acoustic tube between the glottis and the mouth. The glottally excited vocal tract may then be approximated as a straight pipe closed at the vocal cords, where the acoustical impedance Zg = ∞, and open at the mouth (Zm = 0). In this case the volume-velocity transfer function of the vocal tract is (Flanagan 1972, O'Shaughnessy 1987)

V(\omega) = \frac{1}{\cos(\omega l / c)}    (3.1)

where l is the length of the tube, ω is the radian frequency, and c is the sound velocity. The denominator is zero at frequencies Fi = ωi/2π (i = 1, 2, 3, ...), where

\frac{\omega_i l}{c} = (2i - 1)\frac{\pi}{2}

and

F_i = \frac{(2i - 1)\,c}{4l}    (3.2)

If l = 17 cm, V(ω) is infinite at frequencies Fi = 500, 1500, 2500, ... Hz, which means resonances every 1 kHz starting at 500 Hz. If the length l is other than 17 cm, the frequencies Fi will be scaled by a factor of 17/l. The vocal tract may also be approximated with two or three sections of tube where the areas of adjacent sections are quite different [...]
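The resonance figures quoted above can be reproduced with a few lines of code. The sketch assumes the quarter-wavelength formula F_i = (2i - 1)c / (4l) for a uniform tube closed at one end and a sound velocity of 340 m/s, a value the text does not state but which is consistent with its 500, 1500, 2500 Hz figures for l = 17 cm. The function name is illustrative.

```python
def tube_resonances(length_m, count=3, sound_speed=340.0):
    """Resonances F_i = (2i - 1) * c / (4 * l) of a tube closed at one end."""
    return [(2 * i - 1) * sound_speed / (4.0 * length_m)
            for i in range(1, count + 1)]

print([round(f) for f in tube_resonances(0.17)])    # [500, 1500, 2500]
print([round(f) for f in tube_resonances(0.145)])   # shorter tube, scaled by 17/14.5
```

The second call shows the 17/l scaling: a shorter tube raises every resonance by the same factor.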
