Digital signal

1. Speech recognition overview

Speech recognition is a technique that attempts to make machines understand human speech. Its function is to convert speech into equivalent written information, that is, to let computers understand what people say. Speech recognition is an interdisciplinary subject that takes speech as its object of study. It is an important research direction in speech signal processing and a branch of pattern recognition, and it draws on computer science, signal processing, physiology, linguistics, neuropsychology, artificial intelligence and other fields. It is also related to body language (facial expressions, gestures and other movements made while speaking help listeners understand the speaker). Its ultimate goal is natural-language communication between humans and machines.

1.1 Research history and current situation abroad

Research on speech recognition dates back to the 1950s. In 1952, the Audrey system at AT&T Bell Laboratories became the first speech recognition system able to recognize the ten English digits. In the late 1960s and early 1970s several fundamental ideas of speech recognition emerged. The most important achievements were linear predictive coding (LPC) and dynamic time warping (DTW), which effectively solved the problems of extracting features from speech signals and of matching utterances of unequal length. Vector quantization (VQ) and hidden Markov model (HMM) theory were proposed in the same period. Research developed further in the 1980s, and its hallmark was the successful application of HMMs and artificial neural networks (ANN) to speech recognition. In the 1990s, with the rapid development of computer technology and telecom applications, there was an urgent need for speech recognition systems to move from the laboratory into practical use. The most representative examples are IBM's ViaVoice and Dragon Systems' Dragon Dictate systems.
These systems support speaker adaptation: new users do not need to train on every word, and the recognition rate keeps improving with use.

1.2 Research history and current situation in China

China has also invested a great deal of effort in speech recognition research. The Institute of Automation and the Institute of Acoustics of the Chinese Academy of Sciences, other research institutions, and universities such as Tsinghua University are all engaged in research and development in this field. The intelligent computer expert group of the national 863 Program set up a dedicated research project on speech recognition technology, and the level of domestic research has basically caught up with that abroad.

2. Speech recognition process

According to how the output observation probability is described, hidden Markov models (HMMs) can be divided into discrete HMMs (DHMM) and continuous HMMs (CHMM). The two are similar. The difference is that a CHMM computes the state probability with a continuous probability density function, whereas a DHMM computes it using discrete vector quantization (VQ). In DHMM-based speech recognition, the speech signal is first divided into frames, and each frame is represented by a vector of feature parameters, so that the utterance becomes a time sequence of feature vectors. The feature vector of each frame is then converted into a VQ code, the resulting codes are used to train the HMM, and finally the recognition rate is tested.

3. Classification of speech signal analysis methods

Time-domain features are computed directly from the time-domain waveform of the speech signal.
Examples include short-time average energy, short-time average zero-crossing rate, formants, and pitch period.

Frequency-domain and cepstral-domain features are obtained by transforming the time-domain signal, and they reflect the frequency-domain characteristics of the speech signal. They include the Fourier spectrum, the cepstrum, and time-frequency spectra that also make use of the temporal information in the speech signal.

Auditory features are derived not from the speech signal itself but from the perceptual characteristics of the human auditory system.

4. Short-time analysis

The speech signal is a non-stationary, time-varying signal whose characteristics change over time. Fortunately, it is short-time stationary: within a short interval its characteristics remain essentially unchanged (they vary only slowly), so it can be treated as a relatively stable, quasi-stationary process. Short-time analysis therefore divides the speech signal into successive segments and processes each segment, called a frame, with the analysis methods used for stationary signals. A frame is generally 10 to 30 ms long. For the short-time analysis frames used in speech recognition, HMMs should be used for analysis, to voice signal processing of transient and non-stationar
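
The framing step of short-time analysis and two of the time-domain features named in section 3 (short-time average energy and short-time average zero-crossing rate) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the original text: the 8 kHz sample rate, the 25 ms frame length with a 10 ms hop, the synthetic 100 Hz sine used as input, and the helper names split_into_frames, short_time_energy and zero_crossing_rate.

```python
import math

def split_into_frames(samples, frame_len, hop_len):
    """Cut a sample sequence into fixed-length, possibly overlapping frames.

    A trailing partial frame is dropped.
    """
    last_start = len(samples) - frame_len
    return [samples[start:start + frame_len]
            for start in range(0, last_start + 1, hop_len)]

def short_time_energy(frame):
    """Mean of the squared sample values in one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Assumed test input: one second of a 100 Hz sine sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate)
          for n in range(sample_rate)]

# Assumed frame settings, chosen inside the 10 to 30 ms range.
frame_len = sample_rate * 25 // 1000   # 25 ms -> 200 samples
hop_len = sample_rate * 10 // 1000     # 10 ms -> 80 samples

frames = split_into_frames(signal, frame_len, hop_len)
energies = [short_time_energy(f) for f in frames]
zcrs = [zero_crossing_rate(f) for f in frames]
```

Each frame yields one energy value and one zero-crossing rate, so the utterance becomes a short sequence of per-frame features. In a DHMM system such per-frame vectors would next be mapped to VQ codes, a step this sketch does not cover.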