Acoustic phonetics

Acoustic parameters

- Fundamental frequency (F0, in Hz): perceived as pitch
- Amplitude (in dB): perceived as loudness
- Formants, frequencies above F0 (in Hz): perceived as quality; a difference in quality between vowel A and vowel B corresponds to different formant patterns
- Duration (in s): perceived as length

Speech sound consists of small variations in air pressure, which are audible when they occur between about 20 and 20,000 times per second (20 to 20,000 Hz). Every time the vocal cords open and close, there is a pulse of air from the lungs. In voiced sounds, the vibrating vocal cords chop up the stream of lung air so that pulses of high pressure alternate with pulses of lower pressure. When variations in air pressure in the form of sound waves reach the ear of the listener, they cause the eardrum to vibrate.

Sound wave

Sound is caused by a perturbation of air pressure; it is an alternating pattern of high-pressure and low-pressure areas. The red line (in the figure) represents the average background air pressure, that is, the air pressure that would exist if there were no sound waves. Points above this line represent higher pressure (more crowded air molecules); points below it represent lower pressure (less crowded air molecules). The waveform shows how amplitude varies over time. Amplitude is measured in decibels (dB), and the frequency of the sound wave is measured in hertz (Hz).

Properties of Sine Wave

The simplest kind of wave is often called a sine wave.

Combining Waves

If you're listening to waves from two sources at the same time:

- a high pressure from one source will tend to cancel out a low pressure from the other;
- two high pressures will reinforce each other and create an even higher pressure;
- two low pressures will reinforce each other and create an even lower pressure.

Any complex wave can be treated as a combination of simple sine waves. For most practical purposes in phonetics, we don't care about the actual complex waveform itself. We're only interested in the frequencies and amplitudes of the simple waves that it's made of.
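The idea that a complex wave is a sum of simple sine waves can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `complex_wave`, the 8000 Hz sample rate, and the component frequencies and amplitudes are all assumptions chosen for the example.

```python
import math

def complex_wave(components, duration_s, sample_rate=8000):
    """Sum sine components sample by sample.

    components: list of (frequency_hz, amplitude) pairs (hypothetical values).
    Returns a list of pressure deviations from the background level.
    """
    n_samples = round(duration_s * sample_rate)
    wave = []
    for i in range(n_samples):
        t = i / sample_rate  # time of this sample in seconds
        wave.append(sum(a * math.sin(2 * math.pi * f * t) for f, a in components))
    return wave

# A 100 Hz fundamental plus two weaker whole-number multiples (periodic signal).
wave = complex_wave([(100, 1.0), (200, 0.5), (300, 0.25)], duration_s=0.02)

# Two identical waves reinforce each other: the peaks double.
reinforced = complex_wave([(100, 1.0), (100, 1.0)], duration_s=0.02)

# A high pressure meeting an equal low pressure cancels out.
cancelled = complex_wave([(100, 1.0), (100, -1.0)], duration_s=0.02)
```

Because every component here is a whole-number multiple of 100 Hz, the summed wave repeats every 1/100 s, which is 80 samples at the assumed sample rate.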
(a), (b), and (c) are component frequencies, and (d) is a complex waveform.

When the signal (sound) is periodic, the sine waves in the complex wave are whole-number multiples of the same fundamental frequency (F0). When the signal is noise (aperiodic), the frequencies of the sine waves bear no fixed relationship to one another. Analysis of a complex sound into its components is called spectral analysis.

Wavelength: the length of a sound wave is the distance in space that one cycle occupies. One can measure from any point in one cycle to the corresponding point in the next cycle. High-frequency sounds occupy less space per cycle, and so have a shorter wavelength, than low-frequency sounds.

Pitch and Frequency

Frequency is the number of complete repetitions (cycles) of a waveform, or the number of vibrations, in one second. Frequency is measured in hertz (Hz). If the vocal cords make 200 complete opening and closing movements in a second, the frequency is 200 Hz. It is therefore possible to determine the frequency of a sound by counting the peaks of air pressure in a record of its waveform. Frequency determines pitch. When a speech sound goes up in frequency, that is, when the rate of vibration goes up, it also goes up in pitch. The pitch of a sound may be equated with its fundamental frequency.

Fundamental frequency

(1) F0 is determined by how many times the vocal folds vibrate in one second (if the vocal cords open and close 100 times per second, the glottal period is 10 milliseconds and F0 is 100 Hz).
(2) F0 is the first harmonic, or fundamental.

F0 (read as "F naught" or "F zero") is of particular importance in studies of intonation. F0 determines tone. Tone is connected with pitch variations; tones are what we perceive.

Pitch: Vowels have an intrinsic pitch that correlates with vowel height: high vowels have high pitch and low vowels have low pitch, and the difference can be as large as 20-25 Hz. Pitch patterns are steady, rising, or falling, and changing pitch has greater perceptual salience than steady pitch.
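The arithmetic behind frequency, glottal period, harmonics, and wavelength can be checked with a short Python sketch. The function names are hypothetical. The speed of sound of 343 m/s (air at roughly 20 °C) is an assumption that does not appear in the original text.

```python
def glottal_period_ms(f0_hz):
    """Duration of one vocal fold cycle in milliseconds."""
    return 1000.0 / f0_hz

def harmonics(f0_hz, count):
    """First `count` harmonics; the first harmonic is F0 itself."""
    return [f0_hz * k for k in range(1, count + 1)]

def wavelength_m(freq_hz, speed_of_sound=343.0):
    """Distance one cycle occupies, assuming sound travels at 343 m/s."""
    return speed_of_sound / freq_hz

print(glottal_period_ms(100))  # 100 cycles per second -> 10.0 ms per cycle
print(harmonics(200, 4))       # [200, 400, 600, 800]
print(wavelength_m(200), wavelength_m(2000))  # higher frequency, shorter wavelength
```

The last line illustrates why high-frequency sounds occupy less space per cycle: with the speed of sound held fixed, wavelength is inversely proportional to frequency.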
A speaker takes less time to produce a falling pitch than a rising pitch. All languages vary pitch to convey linguistic information.

Amplitude / Intensity / Loudness

Amplitude determines loudness. The loudness of a sound depends on the size of the variations in air pressure (its amplitude). Acoustic intensity, measured in dB, is the physical measure corresponding to loudness. A spectrogram can show how much acoustic energy is produced. Intensity is a physical characteristic of a sound, and loudness is the subjective property of a sound that is most directly related to intensity.

Resonance

A resonator is something that is set into forced vibration by another vibration. In speech, the resonances of the vocal tract are called formant frequencies, or simply formants. The resonant frequencies depend on the shape and size of the vocal tract.

Vocal fold vibration -> harmonics (H1 = F0, H2, H3, ...) -> oral, nasal, and pharyngeal cavities form resonators -> different shapes and sizes of the resonators resonate with and emphasize certain harmonics -> formants (F1, F2, F3) -> formant patterns -> quality of sounds (vowels and sonorants)

Formants

A formant is a concentration of acoustic energy, reflecting the way air from the lungs vibrates in the vocal tract. In a wide-band spectrogram (window length 0.005 s), formants show up as dark bands of energy; in a narrow-band spectrogram (window length 0.025 s), they show up as groups of adjacent harmonics. Wide-band spectrograms are therefore better for viewing formants, while narrow-band spectrograms are better for viewing harmonics. Formants are numbered. Vowels are characterized by three formants: F1, F2, and F3. A change in F1 and F2 signals a change of vowel.
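Why the two spectrogram settings show different things can be sketched numerically. This example assumes that 0.005 and 0.025 are analysis window lengths in seconds, and it uses the rough rule of thumb that a window of length T resolves frequency detail down to about 1/T Hz. The exact factor depends on the window shape, so treat the numbers as approximate. The function names and the F0 value of 120 Hz are hypothetical.

```python
def approx_bandwidth_hz(window_length_s):
    """Rule-of-thumb frequency resolution: roughly 1 / window length."""
    return 1.0 / window_length_s

def resolves_harmonics(window_length_s, f0_hz):
    """Harmonics are spaced F0 apart, so they are separated only when
    the analysis bandwidth is narrower than F0."""
    return approx_bandwidth_hz(window_length_s) < f0_hz

f0 = 120.0  # hypothetical speaker F0 in Hz

for label, window in [("wide-band", 0.005), ("narrow-band", 0.025)]:
    print(label, round(approx_bandwidth_hz(window)), "Hz,",
          "harmonics resolved:", resolves_harmonics(window, f0))
```

Under these assumptions the short window smears neighbouring harmonics together, which leaves the broad formant bands visible, while the long window keeps individual harmonics apart.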
Vowel height and F1: the higher the vowel, the lower F1.
Backness and F2 - F1: the larger the difference F2 - F1, the more front the vowel.

Stop transition & place of articulation

E.g., in sequences such as /a/ + stop + /a/ or /i/ + stop + /i/:

- adjacent to a labial stop, the formant pattern of the vowel exhibits a lowered F2;
- adjacent to an alveolar stop, the formant pattern of the vowel exhibits a raised F2;
- adjacent to a velar stop, the formant pattern of the vowel exhibits F2 and F3 converging toward the stop (the "velar pinch": F2 and F3 often come together in velar transitions).

Harmonic

A harmonic is a regular (periodic) component accompanying the fundamental frequency, which helps to identify a complex tone; harmonics above the fundamental are also called overtones. Harmonics are whole-number multiples of the fundamental frequency. For example, if the fundamental frequency is 200 Hz, the harmonics are at 200 Hz, 400 Hz, 600 Hz, 800 Hz, and so on. In this example, 200 Hz (F0 itself) is the first harmonic, 400 Hz is the second harmonic (the first overtone), 600 Hz is the third harmonic, and 800 Hz is the fourth. Harmonics imply a periodic signal; an aperiodic signal has no harmonics.

Distinguishing harmonics from formant frequencies

If we know the fundamental frequency, we know the harmonics. Varying the fundamental frequency changes the pitch of the vowel. This is not the case for the formant frequencies, which result from the position of the tongue and the shape of the mouth. When the tongue is high and front in the vocal tract, as in the vowel /i/, F2 is at its highest (about 2250 Hz) and F1 at its lowest (about 220 Hz). When the back of the tongue is high, as in the vowel /u/, F2 is lower (about 870 Hz) and F1 is higher (about 310 Hz) than in /i/.

Vowel Normalization

Men, women, and children differ in vocal tract length, and these differences result in different formant frequencies when they produce vowels. They also differ in their vocal folds: females have shorter vocal folds with less mass than males, so they tend to have higher average fundamental frequencies.
Children, whose vocal folds are even shorter, usually have much higher fundamental frequencies than adults. These physical differences between speakers result in considerable differences in the formant frequencies for particular vowels. Vowel normalization refers to the perceptual process of factoring out differences due to vocal tract differences. In fact, there is a considerable degree of overlap of vowel spaces between adults and children. Research with very young children has provided evidence that they are capable of normalizing vowels for different speakers (Kuhl, 1987; Lieberman, 1984). Infants can recognize the same vowel across different speakers despite the acoustic differences. Researchers have suggested that this capacity (vowel normalization) is innate and present at birth.

VOT

In voiced stop consonants, voicing begins almost simultaneously with the release of the occlusion of the airstream. In voiceless consonants, by contrast, there is a delay in the onset of vocal fold vibration after the occlusion is released. The difference between voiced and voiceless consonants is therefore one of the relative timing of the onset of vocal fold vibration. Voice Onset Time (VOT) is the duration of the period between the release of a plosive and the beginning of vocal fold vibration. This period is usually measured in milliseconds (ms).

Three types of VOT

- Zero VOT: the onset of vocal fold vibration coincides (approximately) with the plosive release.
- Positive VOT: there is a delay in the onset of vocal fold vibration after the plosive release.
- Negative VOT: the onset of vocal fold vibration precedes the plosive release.

[pʰ]: VOT > 0; [p]: VOT = 0; [b]: VOT < 0

Prevoicing

Voiced stop consonants have a relatively short VOT (a short voice lag), whereas voiceless consonants have a longer VOT (a long voice lag).
In many languages, the vibration of the vocal folds for voiced consonants may actually begin before the consonant is released (also called prevoicing). In this case, we speak of negative VOT.
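The three VOT types can be expressed as a small Python sketch. VOT is computed as the voicing onset time minus the release time, so prevoicing gives a negative value. The function names and the 10 ms tolerance used to decide what counts as "approximately simultaneous" are assumptions made for illustration, not values from the text.

```python
def vot_ms(release_ms, voicing_onset_ms):
    """Voice Onset Time: voicing onset relative to the plosive release."""
    return voicing_onset_ms - release_ms

def classify_vot(vot, tolerance_ms=10.0):
    """Sort a VOT value into the three types; the tolerance is hypothetical."""
    if vot < -tolerance_ms:
        return "negative VOT (prevoicing)"
    if vot > tolerance_ms:
        return "positive VOT (voice lag)"
    return "zero VOT (approximately simultaneous)"

# Hypothetical measurements, in ms from the start of a recording.
print(classify_vot(vot_ms(release_ms=100, voicing_onset_ms=160)))  # voicing after release
print(classify_vot(vot_ms(release_ms=100, voicing_onset_ms=105)))  # near-simultaneous
print(classify_vot(vot_ms(release_ms=100, voicing_onset_ms=40)))   # voicing before release
```

In practice the boundary between a short and a long voice lag differs across languages, so any fixed threshold like the one above would need to be chosen for the language being studied.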