Short-Time Analysis of Speech Signals

I. Objectives
1. Building on the theory covered in class, gain a deeper understanding of why speech signals are analyzed over short time intervals, and of the basic methods of short-time time-domain analysis.
2. Gain a deeper understanding of how the short-time average energy function and the short-time average zero-crossing count of a speech signal are computed, and why they matter.

II. Principles and Methods
The energy of a speech signal observed over a window of fixed duration varies markedly with time. Unvoiced segments (segments consisting mainly of unvoiced sounds) have much lower energy than voiced segments. The short-time zero-crossing count is also useful in speech analysis: for voiced sounds the energy is concentrated roughly below 3 kHz, whereas for unvoiced sounds most of the energy lies at higher frequencies. Voiced sounds can therefore be regarded as having a low average zero-crossing count, and unvoiced sounds a high one. Computing the short-time average energy and the short-time average zero-crossing count of a short speech segment thus makes it possible to separate unvoiced from voiced segments reasonably well, and from that to locate the transitions between unvoiced and voiced sounds within an utterance, the boundaries between initials and finals, and the boundaries between silence and speech. This is of great importance in speech recognition.

III. Equipment
Microcomputer; Matlab software environment.

IV. Experiment Content
1. Before the lab session, write the program in Matlab.
2. The program should perform windowing (framing), compute the short-time average energy and the short-time average zero-crossing count, and plot the resulting curves.
3. During the lab session, debug the program first, then process the signal.
4. Process the recorded speech data and display the results.
5. Analyze the speech segment on the basis of the curves and draw conclusions.
6. Change the window width (frame length) and repeat the analysis above.

V. Preparation and Lab Report Requirements
1. Review the relevant textbook material so as to understand the meaning of the short-time average energy function and the short-time average zero-crossing function, and how they are computed.
2. Consult Matlab reference material, then design and write a program with the functions listed above.

VI. Lab Report Requirements
1. The format and content requirements for the objectives, principles, procedure and methods are the same as for the other experiments.
2. Plot the short-time average energy and short-time average zero-crossing curves, labeling the speech segment, the window function used and its width. Describe the analysis and the reasoning behind each judgment, state the evidence, and give the conclusions.

VII. Questions
1. What are the main uses of short-time average energy and short-time average zero-crossing analysis of speech signals?
2. How does changing the window width (frame length) affect the characteristics of these short-time functions?

Appendix: the speech file used is one.wav.

Matlab procedure:
1. Create a new M-file (extension ".m") and write the program;
2. Choose File/Save and save the file to drive F;
3. Type the file name in the Command Window to run the program.

Syntax of selected Matlab functions:
Read a wav file: x=wavread(filename) (removed from recent MATLAB releases, where audioread takes its place)
Element-wise multiplication of arrays a and b: a.*b
Create a figure window: figure
Plot: plot(x)
Axis limits: axis([xmin xmax ymin ymax])
Axis labels: xlabel() ylabel()
Legend: legend()
First-order high-pass filter: y=filter([1 -0.9375],1,x)
Framing function: f=enframe(x,len,inc)
x is the input speech signal, len is the frame length and inc is the frame shift. The function returns an n-by-len matrix in which each row is one frame.

% Version 1: zero-crossing count computed with vectorized operations
x=wavread('3.wav');
figure;
subplot(4,1,1);plot(x);axis([1 length(x) -1 1]);ylabel('Speech');
enhance=filter([1 -0.9375],1,x);      % pre-emphasis; computed here, but the framing below uses x
FrameLen=240;
FrameInc=80;
yframe=enframe(x,FrameLen,FrameInc);
amp1=sum(abs(yframe),2);              % short-time magnitude per frame
subplot(4,1,2);plot(amp1);axis([1 length(amp1) 0 max(amp1)]);ylabel('Energy');legend('amp1=|x|');
amp2=sum(abs(yframe.*yframe),2);      % short-time energy per frame
subplot(4,1,3);plot(amp2);axis([1 length(amp2) 0 max(amp2)]);ylabel('Energy');legend('amp2=|x*x|');
% Loop form of the zero-crossing count (disabled in this version)
%zcr=zeros(size(yframe,1),1);
delta=0.02;
%for i=1:size(yframe,1)
%    x=yframe(i,:);
%    for j=1:length(x)-1
%        if x(j)*x(j+1)<0 && (x(j)-x(j+1))>delta
%            zcr(i)=zcr(i)+1;
%        end
%    end
%end
tmp1=enframe(x(1:end-1),FrameLen,FrameInc);
tmp2=enframe(x(2:end),FrameLen,FrameInc);
signs=(tmp1.*tmp2)<0;                 % adjacent samples differ in sign
diffs=(tmp1-tmp2)>0.02;               % and differ by more than the threshold
zcr=sum(signs.*diffs,2);
subplot(4,1,4);plot(zcr);axis([1 length(zcr) 0 max(zcr)]);ylabel('ZCR');legend('zcr');

% Version 2: zero-crossing count computed with an explicit loop
x=wavread('3.wav');
figure;
subplot(4,1,1);plot(x);axis([1 length(x) -1 1]);ylabel('Speech');
enhance=filter([1 -0.9375],1,x);      % pre-emphasis; computed here, but the framing below uses x
FrameLen=240;
FrameInc=80;
yframe=enframe(x,FrameLen,FrameInc);
amp1=sum(abs(yframe),2);
subplot(4,1,2);plot(amp1);axis([1 length(amp1) 0 max(amp1)]);ylabel('Energy');legend('amp1=|x|');
amp2=sum(abs(yframe.*yframe),2);
subplot(4,1,3);plot(amp2);axis([1 length(amp2) 0 max(amp2)]);ylabel('Energy');legend('amp2=|x*x|');
zcr=zeros(size(yframe,1),1);
delta=0.02;
for i=1:size(yframe,1)
    a=yframe(i,:);
    for j=1:length(a)-1
        % same condition as the vectorized form in version 1
        if a(j)*a(j+1)<0 && (a(j)-a(j+1))>delta
            zcr(i)=zcr(i)+1;
        end
    end
end
%tmp1=enframe(x(1:end-1),FrameLen,FrameInc);
%tmp2=enframe(x(2:end),FrameLen,FrameInc);
%signs=(tmp1.*tmp2)<0;
%diffs=(tmp1-tmp2)>0.02;
%zcr=sum(signs.*diffs,2);
subplot(4,1,4);plot(zcr);axis([1 length(zcr) 0 max(zcr)]);ylabel('ZCR');legend('zcr');

function f=enframe(x,win,inc)
%ENFRAME split signal up into (overlapping) frames: one per row. F=(X,WIN,INC)
%
% F = ENFRAME(X,LEN) splits the vector X up into
% frames. Each frame is of length LEN and occupies
% one row of the output matrix. The last few frames of X
% will be ignored if its length is not divisible by LEN.
% It is an error if X is shorter than LEN.
%
% F = ENFRAME(X,LEN,INC) has frames beginning at increments of INC
% The centre of frame I is X((I-1)*INC+(LEN+1)/2) for I=1,2,...
% The number of frames is fix((length(X)-LEN+INC)/INC)
%
% F = ENFRAME(X,WINDOW) or ENFRAME(X,WINDOW,INC) multiplies
% each frame by WINDOW(:)

% Copyright (C) Mike Brookes 1997
%
% Last modified Tue May 12 13:42:01 1998
%
% VOICEBOX home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You can obtain a copy of the GNU General Public License from
% ftp://prep.ai.mit.edu/pub/gnu/COPYING-2.0 or by writing to
% Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.

nx=length(x);
nwin=length(win);
if (nwin == 1)
    len = win;
else
    len = nwin;
end
if (nargin < 3)
    inc = len;                        % default frame shift: no overlap
end
nf = fix((nx-len+inc)/inc);           % number of complete frames
f = zeros(nf,len);
indf = inc*(0:(nf-1)).';              % start offset of each frame
inds = (1:len);                       % sample offsets within a frame
f(:) = x(indf(:,ones(1,len))+inds(ones(nf,1),:));
if (nwin > 1)
    w = win(:)';
    f = f .* w(ones(nf,1),:);
end

Spectral Analysis of Speech Signals and Its Applications

The program listed for this experiment (both versions of the script and the enframe function) is identical to the one given above for the short-time analysis experiment.

Cepstrum and Complex Cepstrum Analysis of Speech Signals

clc; clear;
tic,
[y,fs]=wavread('speech_10k.wav');
L=length(y);
fw=y.*hamming(L);                     % Hamming-windowed signal
r=real(log(fft(fw,L)));               % log magnitude spectrum
pfw=cceps(fw);                        % complex cepstrum
rpfw=rceps(fw);                       % real cepstrum
z=rpfw(1:30);                         % low-quefrency part of the real cepstrum
p=pfw(31:L);
logz=real(exp(fft(z,L)));
logp=real(fft(p));
subplot(3,2,1);plot(y);title('Original waveform')
subplot(3,2,3);plot(pfw);title('Complex cepstrum')
subplot(3,2,5);plot(rpfw);title('Real cepstrum')
subplot(3,2,6);plot(logz);title('Log magnitude spectrum after cepstral-domain filtering')
subplot(3,2,4);plot(r);title('Log magnitude spectrum')
subplot(3,2,2);plot(fw);title('Waveform after Hamming window')
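
The cepstral-domain filtering step in the cepstrum program can also be written out from the definition of the real cepstrum, which may help when the Signal Processing Toolbox functions rceps and cceps are not available. The following is only a sketch under stated assumptions: the input is a synthetic frame rather than speech_10k.wav, the Hamming window is written out by hand, and the symmetric lifter (variable names lifter, nc and smoothed are illustrative) is one common way to keep the low-quefrency part. It is not necessarily what the original program intends with z=rpfw(1:30).

```matlab
% Synthetic "voiced-like" frame (illustrative values, not real speech):
% an impulse train with a period of 80 samples passed through a one-pole filter.
L = 512;                               % frame length
excitation = zeros(L,1);
excitation(1:80:L) = 1;
frame = filter(1, [1 -0.9], excitation);

% Hamming window written out explicitly so that no toolbox is required
n = (0:L-1)';
w = 0.54 - 0.46*cos(2*pi*n/(L-1));
fw = frame .* w;

% Real cepstrum from its definition: inverse FFT of the log magnitude spectrum
logmag = log(abs(fft(fw)) + eps);      % eps guards against log(0)
c = real(ifft(logmag));

% Low-quefrency liftering: keep the first nc coefficients and their mirror
% images, so that the liftered cepstrum stays even-symmetric
nc = 30;
lifter = zeros(L,1);
lifter(1:nc) = 1;
lifter(L-nc+2:L) = 1;                  % mirror of coefficients 2..nc
smoothed = real(fft(c .* lifter));     % smoothed log magnitude spectrum
```

Plotting logmag and smoothed in the same axes, for example with plot([logmag smoothed]), shows the liftered curve following the envelope of the log magnitude spectrum while the fine harmonic ripple is removed.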
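
The claim in the short-time analysis experiment, that voiced-like segments show high energy and a low zero-crossing count while unvoiced-like segments show the opposite, can be checked without a recording. The sketch below is an assumption-laden illustration, not the handout's program: it uses a synthetic two-tone signal in place of a wav file, builds the frames with an index matrix instead of calling enframe, and counts plain sign changes without the 0.02 threshold. The frame length of 240 and frame shift of 80 are taken from the listings; all other names and values are illustrative.

```matlab
% Synthetic test signal: a strong 200 Hz tone ("voiced-like") followed by
% a weak 3 kHz tone ("unvoiced-like"). Values are illustrative only.
fs = 8000;                                   % assumed sampling rate, Hz
t  = (0:fs/2-1)'/fs;                         % 0.5 s per segment
x  = [0.8*sin(2*pi*200*t); 0.1*sin(2*pi*3000*t)];

frameLen = 240;                              % frame length in samples
frameInc = 80;                               % frame shift in samples
nf = fix((length(x) - frameLen + frameInc)/frameInc);   % number of full frames

% nf-by-frameLen index matrix: row i selects the samples of frame i
idx = repmat((0:nf-1)'*frameInc, 1, frameLen) + repmat(1:frameLen, nf, 1);
frames = x(idx);

energy = sum(frames.^2, 2);                  % short-time energy per frame

% Zero-crossing count: sign changes between adjacent samples in each frame
signChange = frames(:,1:end-1).*frames(:,2:end) < 0;
zcr = sum(signChange, 2);

firstHalf  = 1:floor(nf/2)-2;                % frames entirely inside the 200 Hz part
secondHalf = ceil(nf/2)+3:nf;                % frames entirely inside the 3 kHz part
fprintf('mean energy: %.3f vs %.3f\n', mean(energy(firstHalf)), mean(energy(secondHalf)));
fprintf('mean ZCR:    %.1f vs %.1f\n', mean(zcr(firstHalf)), mean(zcr(secondHalf)));
```

Rerunning the sketch with a different frameLen gives a quick feel for the window-width question: longer frames smooth both curves and blur the transition between the two segments, while shorter frames follow it more closely.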