Speech Experiment 1: Endpoint Detection

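Experiment 1: Speech Signal Endpoint Detection

I. Objectives

1. Learn to use MATLAB and master MATLAB programming methods.
2. Master the basic concepts, theory, and methods of speech processing.
3. Implement endpoint detection for noisy speech signals in MATLAB.
4. Learn to analyze and process signals with MATLAB.
5. Learn to detect the endpoints of a speech signal using the short-time zero-crossing rate and the short-time energy.

II. Equipment and Software

HP D538, MATLAB

III. Principle

Endpoint detection is a very important step in speech signal processing; its accuracy directly affects the speed and the results of all later processing. This experiment uses an endpoint detection algorithm that combines the short-time zero-crossing rate with the short-time energy. The short-time zero-crossing rate detects unvoiced sounds and the short-time energy detects voiced sounds; together they achieve endpoint detection when the signal-to-noise ratio is high.

The algorithm processes the input signal in two parts: short-time energy detection and short-time zero-crossing rate detection. Energy detection is the primary test and zero-crossing rate detection is the auxiliary test. Based on the statistical properties of speech, segments can be divided into three classes: unvoiced, voiced, and silence (including background noise). Short-time energy detection separates voiced sounds from silence well. Unvoiced sounds have low energy, so energy detection alone puts them below the energy threshold and misjudges them as silence; the short-time zero-crossing rate, however, can separate unvoiced sounds from silence. Combining the two tests identifies both the speech segments (unvoiced and voiced) and the silence segments.

1. Short-time energy

The short-time average energy E_n of a speech signal at time n is defined as

$$E_n = \sum_{m=-\infty}^{\infty} \left[ x(m)\, w(n-m) \right]^2 = \sum_{m=n-(N-1)}^{n} \left[ x(m)\, w(n-m) \right]^2$$

where N is the window length. The short-time average energy is therefore the windowed sum of squares of the samples in one frame. In the special case of a rectangular window,

$$E_n = \sum_{m=n-(N-1)}^{n} x^2(m)$$

2. Short-time zero-crossing rate

A zero crossing occurs when the signal passes through zero, and the zero-crossing rate is the number of times per second that the signal value passes through zero. For a discrete-time sequence, a zero crossing means that consecutive samples change sign, and the zero-crossing rate is the number of sign changes per sample. For a speech signal, it is the number of times the waveform crosses the horizontal axis (zero level) within one frame, which can be computed by counting sign changes between adjacent samples. If the window starts at n = 0, the short-time zero-crossing rate Z is

$$Z = \frac{1}{2} \sum_{n=0}^{N-1} \left| \operatorname{sgn}\!\left(S_w(n)\right) - \operatorname{sgn}\!\left(S_w(n-1)\right) \right|$$

$$\operatorname{sgn}(x) = \begin{cases} 1, & x \ge 0 \\ -1, & x < 0 \end{cases}$$

[...]

    if m > n
        n = m;
    else
        n = n;
        x = x';                               % make x a column vector
    end
    if nargin < 3, SAMP_FREQ = 16000; end
    if nargin < 4, l = SAMP_FREQ/40; end      % frame length in samples
    if nargin < 5, step = l/2; end            % frame shift in samples
    num_frames = ceil(n/step);                % number of frames
    x(n+1:n+2*l) = zeros(2*l,1);              % add zeros at the end of the speech signal
    i = (0:step:num_frames*step)';            % arithmetic progression of frame offsets
    j = i*ones(1,l);
    i = j + ones(num_frames+1,1)*(1:l);       % sample indices, one row per frame
    y = reshape(x(i), num_frames+1, l)';      % one frame per column
    y = (hanning(l)*ones(1,num_frames+1)).*y; % apply a Hanning window to every frame
    for i = 1:num_frames
        cmd = sprintf('yy(:,i)=%s(y(:,i));', func);  % apply func to frame i
        eval(cmd);
    end

Definition of melcepst

    function c=melcepst(s,fs,w,nc,p,n,inc,fl,fh)
    %MELCEPST Calculate the mel cepstrum of a signal C=(S,FS,W,NC,P,N,INC,FL,FH)
    %
    % Simple use: c=melcepst(s,fs)         % calculate mel cepstrum with 12 coefs, 256 sample frames
    %             c=melcepst(s,fs,'e0dD')  % include log energy, 0th cepstral coef, delta and delta-delta coefs
    %
    % Inputs:
    %     s    speech signal
    %     fs   sample rate in Hz (default 11025)
    %     nc   number of cepstral coefficients excluding 0th coefficient (default 12)
    %     n    length of frame (default power of 2 <30 ms)
    %     p    number of filters in filterbank (default floor(3*log(fs)))
    %     inc  frame increment (default n/2)
    %     fl   low end of the lowest filter as a fraction of fs (default = 0)
    %     fh   high end of highest filter as a fraction of fs (default = 0.5)
    %
    %     w    any sensible combination of the following:
    %
    %          'R'  rectangular window in time domain
    %          'N'  Hanning window in time domain
    %          'M'  Hamming window in time domain (default)
    %
    %          't'  triangular shaped filters in mel domain (default)
    %          'n'  hanning shaped filters in mel domain
    %          'm'  hamming shaped filters in mel domain
    %
    %          'p'  filters act in the power domain
    %          'a'  filters act in the absolute magnitude domain (default)
    %
    %          '0'  include 0th order cepstral coefficient
    %          'e'  include log energy
    %          'd'  include delta coefficients (dc/dt)
    %          'D'  include delta-delta coefficients (d^2c/dt^2)
    %
    %          'z'  highest and lowest filters taper down to zero (default)
    %          'y'  lowest filter remains at 1 down to 0 frequency and
    %               highest filter remains at 1 up to nyquist frequency
    %
    %          If 'ty' or 'ny' is specified, the total power in the fft is preserved.
    %
    % Outputs: c    mel cepstrum output: one frame per row
    %
    % Copyright (C) Mike Brookes 1997
    % Last modified Thu Jun 15 09:14:48 2000
    % VOICEBOX is a MATLAB toolbox for speech processing. Home page is at
    % http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
    %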
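As a small illustration of the two features defined in Section III, the following sketch computes the rectangular-window short-time energy and the per-frame zero-crossing count for a synthetic signal, then marks the frames whose energy exceeds a threshold. It is only a sketch under stated assumptions: the test signal (silence, a 200 Hz tone, silence), the sampling rate, the frame length `N`, the frame shift `hop`, and the threshold `energyThreshold` are all illustrative choices and are not taken from the experiment. It uses only base MATLAB functions.

```matlab
% Minimal sketch: short-time energy and zero-crossing count per frame
% (rectangular window). Signal, frame sizes and threshold are assumptions.
fs = 8000;                          % assumed sampling rate in Hz
t = (0:fs-1)' / fs;                 % time axis for one second
x = [zeros(2000,1); sin(2*pi*200*t(1:4000)); zeros(2000,1)];  % silence, tone, silence

N = 200;                            % frame (window) length in samples
hop = 100;                          % frame shift in samples
numFrames = floor((length(x) - N) / hop) + 1;

E = zeros(numFrames, 1);            % short-time energy of each frame
Z = zeros(numFrames, 1);            % zero crossings in each frame

for k = 1:numFrames
    frame = x((k-1)*hop + (1:N));   % samples of frame k
    E(k) = sum(frame .^ 2);         % rectangular window: sum of squared samples
    s = ones(N, 1);                 % sgn(x) =  1 for x >= 0
    s(frame < 0) = -1;              % sgn(x) = -1 for x <  0
    Z(k) = 0.5 * sum(abs(diff(s))); % half the summed sign differences
end

energyThreshold = 0.1 * max(E);     % hypothetical energy threshold
active = find(E > energyThreshold); % frames classified as high-energy
startFrame = active(1);
endFrame = active(end);
fprintf('High-energy frames: %d to %d\n', startFrame, endFrame);
```

This sketch thresholds the energy only. In the combined scheme described in Section III, the zero-crossing count `Z` would then be examined in the frames just outside `startFrame` and `endFrame` to recover low-energy unvoiced sounds; that step needs a second threshold, which would have to be tuned for real recordings.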