US3855416A - Method and apparatus for phonation analysis leading to valid truth/lie decisions by fundamental speech-energy weighted vibratto component assessment - Google Patents

Method and apparatus for phonation analysis leading to valid truth/lie decisions by fundamental speech-energy weighted vibratto component assessment

Info

Publication number
US3855416A
Authority
US
United States
Prior art keywords
amplitude
detecting
peak
modulation
vibratto
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US00311422A
Inventor
F Fuller
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/26Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices

Definitions

  • Speech is the acoustic energy response of: (a) the voluntary motions of the vocal cords and the vocal tract which consists of the throat, the nose, the mouth, the tongue, the lips and the pharynx, and (b) the resonances of the various openings and cavities of the human head.
  • the primary source of speech energy is excess air under pressure, contained in the lungs. This air pressure is allowed to flow out of the mouth and nose under muscular control which produces modulation. This flow is controlled or modulated by the human speaker in a variety of ways.
  • the major source of modulation is the vibration of the vocal cords. This vibration produces the major component of the voiced speech sounds, such as those required when producing the vowel sounds in a normal manner. These voiced sounds, formed by the buzzing action of the vocal cords, contrast with the voiceless sounds, such as the letter s or the letter f, produced by the nose, tongue and lips. This action of voicing is known as phonation.
  • the basic buzz or pitch frequency, which establishes phonation, is different for men and women.
  • the basic pitch pulses of phonation contain many harmonics and overtones of the fundamental rate in both men and women.
  • the vocal cords are capable of a variety of shapes and motions. During the process of simple breathing, they are involuntarily held open and during phonation, they are brought together. As air is expelled from the lungs, at the onset of phonation, the vocal cords vibrate back and forth, alternately closing and opening. Current physiological authorities hold that the muscular tension and the effective mass of the cords are varied by learned muscular action. These changes strongly influence the oscillating or vibrating system.
  • phonation is established by or governed by two different structures in the pharynx, i.e., the vocal cord muscles and a mucous membrane called the conus elasticus. These two structures are acoustically coupled together at a mutual edge within the pharynx, and cooperate to produce two different modes of vibration.
  • a pitch cycle begins with a subglottal closure of the conus elasticus. This membrane is forced upward toward the coupled edge of the vocal cord muscle in a wave-like fashion, by air pressure being expelled from the lungs.
  • a small puff of air explosively occurs, giving rise to the open phase of vocal cord motion.
  • the subglottal closure is pulled shut by a suction which results from the aspiration of air through the glottis.
  • Shortly after this, the vocal cord muscles also close.
  • the two masses tend to vibrate in opposite phase. The result is a relatively long closed time, alternated with short sharp air pulses which may produce numerous overtones and harmonics.
  • the lowest frequency formant can be approximately identified with the pharyngeal cavity, resonating as a closed pipe.
  • the second formant arises in the mouth cavity.
  • the third formant is often considered related to the second resonance of the pharyngeal cavity.
  • the modes of the higher order formants are too complex to be very simply identified.
  • the frequency of the various formants varies greatly with the production of the various voiced sounds.
  • the fine structure of the fundamental pitch frequency, as well as the relative peak energy at high and low frequency regions appears to be an acoustic correlate of emotional content, transmitted through speech.
  • Other parameters thought to be related to the emotional transmission of information include: Phonetic Content, Gross Changes in Fundamental Frequency, Relative Energy Levels in Various Frequency Bands, and the Speech Envelope Amplitude. These parameters all contribute to the conveyance of emotion or a stressful condition existing in the speaker.
  • Speech analysis and the equipment for accomplishing the same has been developed for a variety of loosely related purposes.
  • One of the primary concerns is the transmission of speech with a high order of intelligibility and presence over a very reduced bandwidth.
  • the applicability of this particular art becomes obvious in civil and military communications.
  • Other fields in which speech analysis equipment is used are the voice operated printing or recording device, such as a typewriter, and systems, equipment and devices commanded and controlled by the spoken word or phrase. While these activities are interesting and valuable in themselves, they do not relate to the detection of emotional content of a speech wave nor to its use to determine the veracity of the speaker.
  • the fine structure of the basic phonation may be assessed and quantified by measurement of the amount of rapid amplitude modulation on the speech signal envelope of a spoken word and weighted by the peak amplitude in a selected frequency band. This rapid variation of the speech signal is called vibratto for the purposes of this application.
  • This invention discloses a means whereby the measure of vibratto in the speech envelope of a person under interrogation may be meaningfully quantified in real time, so that a Truth/Lie decision can be made.
  • Research into the vibratto component of the speech wave has conclusively demonstrated that the amount of vibratto correlates well with stress or emotional involvement which leads to the Truth/Lie decision.
  • the present invention provides a means for determining the truth and veracity of a speaker's response under interrogation by quantification of the vibratto content of his answer and weighting such quantification with the peak amplitude in a selected frequency band of his speech.
  • the vibratto is quantified by rectification, smoothing, time and amplitude discrimination, and level detection to produce a series of pulses of uniform width and amplitude. These pulses represent the vibratto content.
  • the speaker's voice peak is detected by processing the signal through a band pass filter, rectifier, smoothing filter, and peak-detection-and-hold circuit.
  • the vibratto pulse train is weighted by multiplying it by the detected peak amplitude in an analog manner.
  • the resulting pulse train is a series of pulses whose timing is related to the vibratto content of the speech and whose amplitude is proportional to the peak energy level in the fundamental pitch region as selected or defined by the band pass filter in the peak detector circuit.
  • the two signals may be recorded in a two-track chart recorder and the value of the vibratto pulse could be multiplied by the peak voltage reading by hand.
  • the vibratto measurement could also be quantified by digital counter with the resultant number being multiplied by the voltage reading of the peak energy circuit.
  • the output from the analog multiplier may be further quantified by measuring the average value with a simple DC voltmeter or with an averaging and recording instrument.
  • the present invention provides a visual real-time display of a weighted vibratto content of the speech of the subject from which a trained interrogator may derive the veracity thereof by comparison with a known truthful response.
  • An additional object of this invention is to detect this emotional or stressful condition while the person who is speaking is under direct and skillful interrogation.
  • a further object of this invention is to provide means whereby a valid Truth/Lie decision can be rendered by direct observations of the data readout of a voice or speech analysis system.
  • a still further object of this invention is to detect the emotional or stressful condition by analysis of the vibratto content of speech weighted by the peak amplitude of a selected frequency band during the same phonation utterance.
  • FIG. 1 is an oscillograph of a male voice responding with the word yes in the English language, in answer to a direct question at a bandwidth of 5 kHz,
  • FIG. 2 is an oscillograph of a male voice responding with the word no in the English language in answer to a direct question at a bandwidth of 5 kHz,
  • FIGS. 3a and 3b are oscillographs of a male voice responding yes in the English language as measured in the 150-300 Hz and 600-1200 Hz frequency regions, respectively,
  • FIGS. 4a and 4b are oscillographs of a male voice responding no in the English language as measured in the 150-300 Hz and 600-1200 Hz frequency regions, respectively,
  • FIGS. 5a and 5b are graphs of a yes response with and without emotional stress, respectively,
  • FIGS. 6a and 6b are graphs of a no response with and without emotional stress, respectively,
  • FIG. 7 is a block diagram of the weighted vibratto signal processing circuit,
  • FIG. 8 is a detailed schematic of the weighted vibratto signal processing circuit.
  • FIG. 1 shows an oscillograph of a male voice responding with the word yes in the English language in answer to a direct question at a bandwidth of 5 kHz.
  • the wave form contains two distinct sections, the first being for the ye sound and the second being for the unvoiced s sound. Since the first section of the yes signal wave form is a voiced sound being produced primarily by the vocal cords and conus elasticus, this portion will be processed to detect emotional stress content or vibratto modulation.
  • the male voice responding with the word no in the English language at a bandwidth of 5 kHz is shown in FIG. 2. This response has a single voiced section which will be analyzed by the present device to detect the presence of the vibratto or rapid modulation of the phonation constituent of the speech signal.
  • FIGS. 3 and 4 show oscillographs of the same male voice as in FIGS. 1 and 2 responding yes and no, respectively, in the English language as measured in the 150-300 Hz frequency region.
  • This spectral region contains a great deal of the fundamental energy of phonation.
  • When this band of frequencies is rectified and smoothed, the peak amplitude of the phonation will make an ideal weighting function.
  • FIG. 5a is a drawn replica of a portion of the response yes, delivered under emotional stress.
  • the rapid modulation or vibratto pulses can be seen extending above and below the normal envelope. These additional excursions occur as the result of non-symmetric action between the vocal cords and the conus elasticus.
  • the basic repetition period of this male voice is about 8.3 milliseconds.
  • FIG. 5b is a drawn replica of a portion of a male voice responding yes delivered under conditions of no emotional stress.
  • the smooth regular features of the pitch pulses can be easily seen.
  • FIG. 6a is a drawn replica of a portion of the same male voice responding no under a condition of emotional stress.
  • the vibratto modulations appear as distortions near the axis of averages, and as excessively high peaks in the positive direction. This non-regularity is the result of interaction in the pharynx between the vocal cords and the conus elasticus leading to an explosive type of formant excitation.
  • FIG. 6b is a drawn replica of a portion of the same male voice answering no to a non-stress question. The smoothness and regularity of the response can be readily seen.
  • a high fidelity acoustic transducer 2 is used to bring the transduced electrical energy of the phonation into the system.
  • the electrical signal divides into two channels.
  • the upper channel is concerned with the speech processing of the Fundamental Pitch Frequency, and the lower channel processes the Vibratto Component of the voice.
  • an amplifier 4 serves to increase the energy of the signal from the transducer and to isolate the following circuits from the output impedance of the transducer.
  • a band-pass filter 6 follows, which is selected or adjusted to accept only the fundamental pitch frequency of the voice. This is commonly found to be in the vicinity of 120 to 150 Hz for men and about 250 Hz for women, and thus a filter with a pass band of 150-300 Hz may be selected.
  • the signal at the output of band-pass filter 6 is tied to the energy detection means which comprises a rectifier 8 followed by a smoothing or low pass filter 10. By this pair of circuits, a voltage which is representative of the envelope of the phonation is produced. This envelope is then peak detected and held or stored in device 12 until all further processing has been accomplished. The circuit is then reset, either manually or automatically, as desired by the operator of the machine.
  • the electrical signal from the transducer 2 is amplified and isolated by amplifier 14.
  • the energy is again rectified by rectifier 16, isolated and amplified by amplifier 17 and smoothed by filter 18 to form an envelope of the signal.
  • the smoothing filter 18 must have its characteristics adjusted accordingly.
  • a stage of isolation amplification 20 is used to separate the processing of the smoothing filter from the time and amplitude discriminator 22. This latter circuit acts to compare the amplitude of the signal envelope with the time derivative of the signal envelope.
  • the output of the time and amplitude discriminator 22 is a zero based burst of pulses of a single polarity. These pulses will have a high proportion of the vibratto component.
  • the pulses which are of varying amplitude and width, are fed into a voltage comparator 24, a module of conventional design and commercially available.
  • the comparator compares the incoming wavetrain with a DC voltage from the potentiometer 26. The only pulses that are allowed to emerge from the comparator are those that are larger in voltage value than the set voltage of the potentiometer. This potentiometer is set by the operator to a value which is somewhat above the baseline.
  • the Vibratto Component is usually of high sharp pulses, so these are enhanced with a high setting of the potentiometer.
  • the output of the comparator is a series of pulses, predominantly consisting of the Vibratto Component, all of constant amplitude one for each pulse exceeding the preset level.
  • the detected and held envelope peak enters input port 27 of the analog multiplier 28, while the series of vibratto pulses enters input port 29. These two waveforms are multiplied together in the mathematical manner known as weighting. The result will be a series of pulses of increased amplitude at a period defined by the vibratto pulse train at input port 29 that will be recorded in the DC recorder 30 for observation and analysis.
  • A more detailed explanation of the preferred embodiment of the invention is shown schematically in FIG. 8.
  • the acoustic phonation of the subject under interrogation enters the system of instrumentation in one of two ways. It may, at the discretion of the operator, enter the system directly (in real time) from the transducer 50 or from tape recorder 54 and its transducer 52. These transducers are high fidelity microphone types of devices that reproduce the electrical analog of the acoustic signal, with a minimum of frequency and amplitude distortion. Switch means 56 is used to select either the microphone directly or the tape recorder output. As in FIG. 7, the upper channel in FIG. 8 pertains to the envelope processing and peak detection of a selected frequency band and the lower channel pertains to the envelope processing and vibratto quantification.
  • the switch means 56 is followed in the upper channel by an operational amplifier 58 with its gain determining resistors 60 and 62.
  • the broad band speech signal from the transducer and switch enters the band pass filter 64 and the spectral region typically from 150-300 Hz is allowed to pass. This exact region may be changed, depending upon the voice of the subject involved.
  • the region of 150-300 Hz is a good compromise and was employed in the preferred embodiment.
  • FIGS. 1 and 2 show the waveforms of a male human voice responding with the words yes and no in English when the bandwidth is limited to 150-300 Hz.
  • Another stage of isolation employing an operational amplifier 66, with its gain determining resistors 68 and 70, is used to isolate the band-pass filter from the converter of the dual polarity signals.
  • the conversion of the dual polarity signals out of the isolation amplifier is performed by a simple solid state diode 72 acting as a rectifier.
  • a reversed polarity diode would function nearly as well.
  • full wave or bridge rectifiers could also be used with attendant increase in signal level. If diodes with a particular characteristic, such as Square Law were employed, the instrument would operate upon a true measure of the power in the speech signal.
  • a further operational amplifier 74 with its gain setting resistors 76 and 78 serves to isolate the rectifier circuit from the rest of the instrument.
  • a smoothing filter 79 follows this amplifier.
  • the smoothing filter consists of a low-pass filter network that is comprised of a variable resistor 80 and a fixed capacitor 82. It can be seen that other types of low-pass filters could be employed here, to remove the higher frequency fluctuations of the rectified signal, thus rendering the output of the smoothing filter essentially that of the envelope of the speech wave, in the defined pass-band. The exact time constant of this filter is adjusted depending upon the pitch of the voice under assessment.
  • the smoothed envelope is amplified and isolated by operational amplifier 84 with its gain determining resistors 86 and 88. This envelope of speech energy then passes into a peak detect and hold circuit 90 of conventional design. Such circuits can be readily fabricated from a variety of components and modules by those skilled in the art.
  • a single circuit can perform this task, such as the single module named Infinite Sample Hold, manufactured by Hybrid Systems Corp.
  • This particular module has the advantage of preventing decay of the peak value until the circuit is reset by a signal on input lead 94 from the control means 96 which may be a simple switch.
  • the output peak value appears on lead 92 and is applied to one channel of the analog multiplier 98.
  • two voltages could be multiplied together, thereby weighting or modulating one voltage with the other. This process could be performed manually, by obtaining the individual values of the voltages involved. It is preferred, however, to employ one of the modern analog computer modules that have become available such as Model 107C analog multiplier/divider, manufactured by Hybrid Systems Corp.
  • the lower channel that processes the speech or phonation energy to obtain the Vibratto components functions in the following manner.
  • the electrical analog of the phonation enters the channel through an isolating operational amplifier 103 with its gain determining resistors 104 and 106.
  • This amplifier is used to provide isolation and linear amplification of the signal with no frequency discrimination.
  • This isolated and amplified signal is applied to a diode 108 where one polarity of the speech signal is allowed to pass into the following circuitry.
  • a diode connected in the opposite polarity could be used nearly as well.
  • a full wave rectification circuit or a bridge rectification circuit (not shown) could be used as well with a small additional complication of the circuit.
  • the electrical energy out of the diode, at the input of the following circuitry, is therefore predominantly and primarily of one polarity.
  • Operational amplifier 110 with its gain determining resistances 112 and 114 is used to isolate the diode circuit from the follow-on circuitry.
  • the follow-on circuitry consists of a smoothing filter 116 in the form of an R/C Integrator having a variable resistor 117 and a fixed capacitor 118. It can be seen, by those versed in the art, that a variety of different active and passive smoothing filters could be used to remove the high frequency energy of the phonation and to extract a signal which is representative of the envelope of the speech wave and yet retains the Vibratto Component.
  • the R/C Integrator that is used in the present preferred embodiment functions quite well and is simple to employ. The time constant is variable to afford adjustment for different voices of various fundamental frequencies.
  • the R/C Integrator is followed by another operational amplifier 120 with its gain determining resistors 122 and 124.
  • This operational amplifier isolates the processing of the R/C Integrator from the subsequent circuit action.
  • a special discriminator circuit that processes the time derivative of the incoming speech wave and the amplitude of the speech envelope at the same time.
  • the special differentiator circuit involves a fixed capacitor 126 and a variable resistor 128. These two components perform the time differentiation function.
  • the potentiometer 130 provides a measure of the undifferentiated signal envelope which is used to null out residual envelope energy. This component, connected as it is, performs the envelope amplitude discrimination function.
  • An operational amplifier 132 with its gain and performance determining resistances 134, 136 and 138, accepts the time derivative signal and the amplitude discrimination signal and provides effective base line restoration for most types of phonation.
  • Base line restoration can be accomplished in a variety of ways as well, as by various types of clamping and DC restoration circuits. Irrespective of the circuit of use, the output of the amplifier 132 becomes a series of pulses with a defined base line that contains the Vibratto Component and thus comprises the variable fine structure of the phonation. These pulses are of varying amplitude and width.
  • the output pulses from the time and amplitude discriminator are applied to a comparator circuit 140 which determines the level of statistically significant pulses. This level is determined by a knowledgeable operator of the equipment, and is controlled by adjustment of potentiometer 142. This control is shown to function at either a positive or a negative voltage level. When the polarity of the diode 108 is selected, the comparator voltage level must be adjusted to the polarity that will select either excess positive or excess negative peaks.
  • the potentiometer 142 may also be set at 0 volts, at which time the circuit becomes a conventional zero-crossing detector means. It has been found that the statistical significance of the Truth/Lie Decision process will improve if a level of voltage off the 0 baseline is selected for the comparator switching level.
  • This comparator may be a simple diode circuit or it may be a Schmitt trigger circuit, each with suitable voltage supplies, passive and active components. However, for simplicity and economy, a differential voltage comparator such as the Motorola MC1710 was used.
  • the output of voltage comparator 140 is a series of equiamplitude pulses, one for each input above the selected level. This pulse train enters the analog multiplier 98, where it is weighted by the output voltage level of the peak detect and hold module 90. The output pulse train thus has an amplitude varied by the peak detected signal and a spacing of the vibratto pulse train.
  • a method for detecting emotional stress in the utterance of an individual comprising:
  • a method as in claim 1 including rectifying and smoothing said selected frequency band before detecting.
  • a method as in claim 3 wherein said smoothing comprises integrating and said time and amplitude discriminating includes differentiating and DC base line restoration.
  • a method as in claim 1 including holding said detected peak amplitude for the duration of said utterance and resetting said held peak amplitude upon termination of said utterance.
  • a device for indicating emotional stress from the utterances of a human comprising:
  • a device as in claim 6 including rectifying means and smoothing means connected between said bandpassing means and said peak detecting means.
  • said band-passing means has a band of 150 to 300 Hz.
  • a device as in claim 6 including means connected between said modulation detecting means and said weighting means for comparing said detected modulation to a selected level.
  • said modulation detecting means includes differentiating means producing a series of varying amplitude pulses and said comparing means producing a series of uniform amplitude pulses for each varying amplitude pulse above said selected level.
  • a device as in claim 6 wherein said detecting means include differentiation means and baseline restoration means.
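The lower-channel stages described above (time and amplitude discriminator, then voltage comparator) can be sketched in software. This is only a minimal illustration, not the patented circuit: the function names, the `null_gain` and `level` parameters, and the synthetic envelopes are all assumptions made for the example. In particular, modelling the discriminator as the envelope derivative minus a scaled copy of the envelope, clamped at zero, is one reading of the nulling potentiometer and baseline restoration described in the text.

```python
import numpy as np

def discriminate(env, fs, null_gain):
    """Assumed model of the time and amplitude discriminator:
    time derivative of the envelope minus a scaled copy of the envelope
    (the nulling adjustment), clamped to a zero baseline so that only
    single-polarity pulses remain."""
    d_env = np.gradient(env) * fs          # derivative in units per second
    return np.maximum(d_env - null_gain * env, 0.0)

def compare(pulses, level):
    """Comparator: uniform-amplitude output wherever the input exceeds level."""
    return (pulses > level).astype(float)

def count_pulses(binary):
    """Count rising edges in a 0/1 pulse train."""
    return int(np.count_nonzero(np.diff(binary, prepend=0.0) > 0))

# Synthetic envelopes (hypothetical data): a slow, smooth envelope and the
# same envelope with rapid 40 Hz excursions added on top.
fs = 1000
t = np.arange(fs) / fs
calm = 0.5 * (1.0 - np.cos(2 * np.pi * t))
stressed = calm + 0.2 * np.maximum(np.sin(2 * np.pi * 40 * t), 0.0)

n_calm = count_pulses(compare(discriminate(calm, fs, 5.0), 10.0))
n_stressed = count_pulses(compare(discriminate(stressed, fs, 5.0), 10.0))
print(n_calm, n_stressed)
```

Setting `level` well above zero mirrors the text's remark that a comparator level off the baseline works better than a plain zero-crossing detector; the specific value used here is arbitrary.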

Abstract

A method and apparatus for indicating emotional stress in speech by detecting the presence of vibratto or rapid modulation and weighting the vibratto with a detected peak value of a preselected frequency band.

Description

United States Patent
Fuller
Dec. 17, 1974

[54] METHOD AND APPARATUS FOR PHONATION ANALYSIS LEADING TO VALID TRUTH/LIE DECISIONS BY FUNDAMENTAL SPEECH-ENERGY WEIGHTED VIBRATTO COMPONENT ASSESSMENT
Inventor: Fred Fuller, 4450 Park Chevy Chase, 20014
[22] Filed: Dec. 1, 1972
[21] Appl. No.: 311,422
[52] U.S. Cl.: 179/1 SA, 179/1 SP
[51] Int. Cl.: G10l 1/04
[58] Field of Search: 179/1 SA, 1 SB, 1 US, 15.55 R, 15.55 T; 128/206; 35/21

[56] References Cited

UNITED STATES PATENTS
3,346,694 10/1967 Brady 179/1 SA
3,592,969 7/1971 Yoshino 179/1 SA
3,268,661 8/1966 Coulter 179/1 SA

FOREIGN PATENTS OR APPLICATIONS
1,113,225 5/1968 Great Britain 179/1 SA

OTHER PUBLICATIONS
Lieberman & Michaels, Some Aspects of Fundamental Frequency & Envelope Amplitude As Related to the Emotional Content of Speech, J.A.S.A., 7/1962.

Primary Examiner: David L. Stewart
Attorney, Agent, or Firm: Fidelman, Wolffe, Leitner

14 Claims, 12 Drawing Figures

[Figure: block diagram. Upper channel: amplifier, band-pass filter (150-300 Hz), rectifier, low-pass (smoothing) filter, peak detect and hold. Lower channel: isolating amplifier, rectifier, low-pass (smoothing) filter, time and amplitude discriminator, voltage comparator. Both channels feed an analog multiplier followed by a recorder.]

BACKGROUND OF THE INVENTION

The present invention relates generally to voice signal analysis systems and more specifically to a method and apparatus for detecting emotional stress within a voice pattern. The presence of an emotional state will be used to determine the truthfulness of a response to questions asked by a skilled interrogator. This invention must be understood and examined in the light of my copending applications, Ser. Nos. 311,391 and 311,392. These inventions provide parts of the technology of this particular invention, which is different from and is an extension of the technology of both copending applications.
DESCRIPTION OF THE PRIOR ART

It has long been known that the voice may be, and often is, used to convey the emotions of the speaker. The emotional state of the speaker produces readily observable variation of measurable parameters of the voice.
Speech is the acoustic energy response of: (a) the voluntary motions of the vocal cords and the vocal tract which consists of the throat, the nose, the mouth, the tongue, the lips and the pharynx, and (b) the resonances of the various openings and cavities of the human head. The primary source of speech energy is excess air under pressure, contained in the lungs. This air pressure is allowed to flow out of the mouth and nose under muscular control which produces modulation. This flow is controlled or modulated by the human speaker in a variety of ways.
The major source of modulation is the vibration of the vocal cords. This vibration produces the major component of the voiced speech sounds, such as those required when producing the vowel sounds in a normal manner. These voiced sounds, formed by the buzzing action of the vocal cords, contrast with the voiceless sounds, such as the letter s or the letter f, produced by the nose, tongue and lips. This action of voicing is known as phonation.
The basic buzz or pitch frequency, which establishes phonation, is different for men and women. The vocal cords of a typical adult male vibrate or buzz at a frequency of about 120 Hz, whereas for women this basic rate is approximately an octave higher, near 250 Hz. The basic pitch pulses of phonation contain many harmonics and overtones of the fundamental rate in both men and women.
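As a quick check on these figures, the pitch period is the reciprocal of the pitch frequency. The symbols T and f_0 below are introduced here for illustration only and do not appear in the patent text.

```latex
T = \frac{1}{f_0}, \qquad
T_{\text{male}} \approx \frac{1}{120\ \text{Hz}} \approx 8.3\ \text{ms}, \qquad
T_{\text{female}} \approx \frac{1}{250\ \text{Hz}} = 4.0\ \text{ms}
```

An exact octave above 120 Hz would be 2 x 120 Hz = 240 Hz, which is consistent with the "approximately an octave higher, near 250 Hz" wording.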
The vocal cords are capable of a variety of shapes and motions. During the process of simple breathing, they are involuntarily held open and during phonation, they are brought together. As air is expelled from the lungs, at the onset of phonation, the vocal cords vibrate back and forth, alternately closing and opening. Current physiological authorities hold that the muscular tension and the effective mass of the cords are varied by learned muscular action. These changes strongly influence the oscillating or vibrating system.
Certain physiologists consider that phonation is established by or governed by two different structures in the pharynx, i.e., the vocal cord muscles and a mucous membrane called the conus elasticus. These two structures are acoustically coupled together at a mutual edge within the pharynx, and cooperate to produce two different modes of vibration.
In one mode, which seems to be an emotionally stable or non stressful timbre of voice, the conus elasticus and the vocal cord muscle vibrate as a unit in synchronism. Phonation in this mode sounds soft or mellow and few overtones are present.
In the second mode, a pitch cycle begins with a subglottal closure of the conus elasticus. This membrane is forced upward toward the coupled edge of the vocal cord muscle in a wave-like fashion, by air pressure being expelled from the lungs. When the closure reaches the coupled edge, a small puff of air explosively occurs, giving rise to the open phase of vocal cord motion. After the explosive puff of air has been released, the subglottal closure is pulled shut by a suction which results from the aspiration of air through the glottis. Shortly after this, the vocal cord muscles also close. Thus in this mode, the two masses tend to vibrate in opposite phase. The result is a relatively long closed time, alternated with short sharp air pulses which may produce numerous overtones and harmonics.
The balance of the respiratory tract and the nasal and cranial cavities give rise to a variety of resonances, known as formants in the physiology of speech. The lowest frequency formant can be approximately identified with the pharyngeal cavity, resonating as a closed pipe. The second formant arises in the mouth cavity. The third formant is often considered related to the second resonance of the pharyngeal cavity. The modes of the higher order formants are too complex to be very simply identified. The frequencies of the various formants vary greatly with the production of the various voiced sounds.
As pointed out in my copending applications, the fine structure of the fundamental pitch frequency, as well as the relative peak energy at high and low frequency regions, appears to be an acoustic correlate of emotional content, transmitted through speech. Other parameters thought to be related to the emotional transmission of information include: Phonetic Content, Gross Changes in Fundamental Frequency, Relative Energy Levels in Various Frequency Bands, and the Speech Envelope Amplitude. These parameters all contribute to the conveyance of emotion or a stressful condition existing in the speaker.
Speech analysis, and the equipment for accomplishing it, has been developed for a variety of loosely related purposes. One of the primary concerns is the transmission of speech with a high order of intelligibility and presence over a very reduced bandwidth. The applicability of this particular art becomes obvious in civil and military communications. Other fields in which speech analysis equipment is used are the voice operated printing or recording device, such as a typewriter, and systems, equipment and devices commanded and controlled by the spoken word or phrase. While these activities are interesting and valuable in themselves, they do not relate to the detection of emotional content of a speech wave nor to its use to determine the veracity of the speaker.
According to the present invention, the fine structure of the basic phonation may be assessed and quantified by measurement of the amount of rapid amplitude modulation on the speech signal envelope of a spoken word and weighted by the peak amplitude in a selected frequency band. This rapid variation of the speech signal is called vibratto for the purposes of this application.
This invention discloses a means whereby the measure of vibratto in the speech envelope of a person under interrogation may be meaningfully quantified in real time, so that a Truth/Lie decision can be made. Research into the vibratto component of the speech wave has conclusively demonstrated that the amount of vibratto correlates well with stress or emotional involvement which leads to the Truth/Lie decision.
There are many ways to detect and measure the amount of vibratto in the phonation of an emotionally involved person under interrogation. Frequency fluctuation in the basic pitch frequency could be quantified with the aid of a frequency discriminator, for example. In addition, variations in the time between successive pitch pulses could be obtained by conventional zero crossing analysis.
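The zero crossing alternative mentioned above can be illustrated with a short software sketch. This is not the circuit of the present invention; it is a minimal digital example under stated assumptions. The function names `upward_zero_crossings` and `period_jitter`, the use of linear interpolation, and the choice of mean absolute period difference as the variation measure are all hypothetical.

```python
import math

def upward_zero_crossings(samples, sample_rate):
    """Return the times (in seconds) at which the signal crosses zero going
    upward, using linear interpolation between adjacent samples."""
    times = []
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if a < 0.0 <= b:
            frac = -a / (b - a)  # fractional position of the crossing
            times.append((i - 1 + frac) / sample_rate)
    return times

def period_jitter(times):
    """Mean absolute difference between successive pitch periods (assumed
    measure of period-to-period variation)."""
    periods = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    if len(periods) < 2:
        return 0.0
    diffs = [abs(p2 - p1) for p1, p2 in zip(periods, periods[1:])]
    return sum(diffs) / len(diffs)

# Usage: a steady 120 Hz tone should show almost no period variation.
sample_rate = 8000  # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 120 * n / sample_rate - 0.5) for n in range(800)]
crossing_times = upward_zero_crossings(tone, sample_rate)
steady_jitter = period_jitter(crossing_times)
```

A perfectly periodic tone yields a jitter value near zero; irregular spacing of the pitch pulses would raise it.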
SUMMARY OF THE INVENTION
The present invention provides a means for determining the truth and veracity of a speaker's response under interrogation by quantification of the vibratto content of his answer and weighting such quantification with the peak amplitude in a selected frequency band of his speech. The vibratto is quantified by rectification, smoothing, time and amplitude discrimination, and level detection to produce a series of pulses of uniform width and amplitude. These pulses represent the vibratto content. Simultaneously, in a parallel circuit, the speaker's voice peak is detected by processing the signal through a band pass filter, rectifier, smoothing filter, and peak-detection-and-hold circuit. The vibratto pulse train is weighted by multiplying it by the detected peak amplitude in an analog manner. The resulting pulse train is a series of pulses whose timing is related to the vibratto content of the speech and whose amplitude is proportional to the peak energy level in the fundamental pitch region as selected or defined by the band pass filter in the peak detector circuit.
Though using an analog multiplier to weight the two functions is simple and straightforward, other means are available for producing the desired weighted function. For example, the two signals may be recorded on a two-track chart recorder and the value of the vibratto pulse could be multiplied by the peak voltage reading by hand. The vibratto measurement could also be quantified by a digital counter, with the resultant number being multiplied by the voltage reading of the peak energy circuit. Similarly, instead of using a DC chart recorder, the output from the analog multiplier may be further quantified by measuring the average value with a simple DC voltmeter or with an averaging and recording instrument.
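The digital counter and averaging alternatives described above can be sketched as follows. This is an illustrative example only: the pulse train is assumed to be already available as a list of 0/1 levels, and the function names `count_pulses`, `weighted_count`, and `average_weighted_output` are hypothetical.

```python
def count_pulses(pulse_train):
    """Count rising edges in a 0/1 pulse train, as a digital counter would."""
    count = 0
    previous = 0
    for level in pulse_train:
        if level and not previous:
            count += 1
        previous = level
    return count

def weighted_count(pulse_train, peak_volts):
    """Multiply the pulse count by the peak voltage reading."""
    return count_pulses(pulse_train) * peak_volts

def average_weighted_output(pulse_train, peak_volts):
    """Average value of the weighted pulse train, roughly what an averaging
    DC instrument would indicate."""
    if not pulse_train:
        return 0.0
    return sum(level * peak_volts for level in pulse_train) / len(pulse_train)

# Usage with made-up values: three pulses and a held peak of 2.0 volts.
pulses = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
total = weighted_count(pulses, 2.0)
mean_level = average_weighted_output(pulses, 2.0)
```

The count depends only on how many pulses occur, while the average also reflects how long each pulse lasts, so the two readings are not interchangeable.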
The present invention provides a visual real-time display of a weighted vibratto content of the speech of the subject from which a trained interrogator may derive the veracity thereof by comparison with a known truthful response.
OBJECTS OF THE INVENTION
It is an object of the present invention to provide a means for detecting a stressful or emotional condition in a human being who is speaking.
An additional object of this invention is to detect this emotional or stressful condition while the person who is speaking is under direct and skillful interrogation.
A further object of this invention is to provide means whereby a valid Truth/Lie decision can be rendered by direct observations of the data readout of a voice or speech analysis system.
A still further object of this invention is to detect the emotional or stressful condition by analysis of the vibratto content of speech weighted by the peak amplitude of a selected frequency band during the same phonation utterance.
Other objects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an oscillograph of a male voice responding with the word "yes" in the English language, in answer to a direct question, at a bandwidth of 5 kHz;
FIG. 2 is an oscillograph of a male voice responding with the word "no" in the English language, in answer to a direct question, at a bandwidth of 5 kHz;
FIGS. 3a and 3b are oscillographs of a male voice responding "yes" in the English language as measured in the 150-300 Hz and 600-1200 Hz frequency regions, respectively;
FIGS. 4a and 4b are oscillographs of a male voice responding "no" in the English language as measured in the 150-300 Hz and 600-1200 Hz frequency regions, respectively;
FIGS. 5a and 5b are graphs of a "yes" response with and without emotional stress, respectively;
FIGS. 6a and 6b are graphs of a "no" response with and without emotional stress, respectively;
FIG. 7 is a block diagram of the weighted vibratto signal processing circuit; and
FIG. 8 is a detailed schematic of the weighted vibratto signal processing circuit.
DESCRIPTION OF PREFERRED EMBODIMENTS
FIG. 1 shows an oscillograph of a male voice responding with the word "yes" in the English language in answer to a direct question at a bandwidth of 5 kHz. The wave form contains two distinct sections, the first being for the "ye" sound and the second being for the unvoiced "s" sound. Since the first section of the "yes" signal wave form is a voiced sound being produced primarily by the vocal cords and conus elasticus, this portion will be processed to detect emotional stress content or vibratto modulation. The male voice responding with the word "no" in the English language at a bandwidth of 5 kHz is shown in FIG. 2. This response has a single voiced section which will be analyzed by the present device to detect the presence of the vibratto or rapid modulation of the phonation constituent of the speech signal.
FIGS. 3 and 4 show oscillographs of the same male voice as in FIGS. 1 and 2 responding "yes" and "no", respectively, in the English language as measured in the 150-300 Hz frequency region. This spectral region contains a great deal of the fundamental energy of phonation. When this band of frequencies is rectified and smoothed, the peak amplitude of the phonation will make an ideal weighting function.
FIG. 5a is a drawn replica of a portion of the response "yes", delivered under emotional stress. The rapid modulation or vibratto pulses can be seen extending above and below the normal envelope. These additional excursions occur as the result of non-symmetric action between the vocal cords and the conus elasticus. The basic repetition period of this male voice is about 8.3 milliseconds.
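As a quick consistency check (an added worked calculation, not taken from the figures), a repetition period of about 8.3 milliseconds corresponds to a fundamental frequency of roughly 120 Hz, which agrees with the typical adult male pitch stated earlier in this description:

```latex
f_0 = \frac{1}{T} \approx \frac{1}{8.3 \times 10^{-3}\ \mathrm{s}} \approx 120\ \mathrm{Hz}
```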
FIG. 5b is a drawn replica of a portion of a male voice responding "yes", delivered under conditions of no emotional stress. The smooth, regular features of the pitch pulses can be easily seen.
FIG. 6a is a drawn replica of a portion of the same male voice responding "no" under a condition of emotional stress. The vibratto modulations appear as distortions near the axis of averages, and as excessively high peaks in the positive direction. This non-regularity is the result of interaction in the pharynx between the vocal cords and the conus elasticus leading to an explosive type of formant excitation.
FIG. 6b is a drawn replica of a portion of the same male voice answering "no" to a non-stress question. The smoothness and regularity of the response can be readily seen.
Thus it is an object of the present invention to isolate the rapid modulation of the phonation constituent of the speech signal envelope in order to detect the presence of emotional stress in the speaker.
The components for achieving the stated objectives are shown in block diagram form in FIG. 7. As in the previous copending applications, a high fidelity acoustic transducer 2 is used to bring the transduced electrical energy of the phonation into the system. The electrical signal divides into two channels. The upper channel is concerned with the speech processing of the Fundamental Pitch Frequency, and the lower channel processes the Vibratto Component of the voice.
In the upper channel, an amplifier 4 serves to increase the energy of the signal from the transducer and to isolate the following circuits from the output impedance of the transducer. A band-pass filter 6 follows, which is selected or adjusted to accept only the fundamental pitch frequency of the voice. This is commonly found to be in the vicinity of 120 to 150 Hz for men and about 250 Hz for women, and thus a filter with a bandwidth of 150-300 Hz may be selected. The signal at the output of band-pass filter 6 is tied to the energy detection means, which comprises a rectifier 8 followed by a smoothing or low pass filter 10. By this pair of circuits, a voltage which is representative of the envelope of the phonation is produced. This envelope is then peak detected and held or stored in device 12 until all further processing has been accomplished. The circuit is then reset, either manually or automatically, as desired by the operator of the machine.
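The upper channel described above can be approximated in software as follows. This is an illustrative digital sketch under stated assumptions, not the analog circuit of FIG. 7: the band-pass filter is approximated by cascaded first-order high-pass and low-pass sections, the rectifier is modeled as an ideal half-wave diode, and the function names, the sample rate, and the 30 Hz smoothing cutoff are hypothetical choices.

```python
import math

def rc_low_pass(samples, cutoff_hz, sample_rate):
    """First-order low-pass filter (discrete RC approximation)."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

def rc_high_pass(samples, cutoff_hz, sample_rate):
    """First-order high-pass filter (discrete RC approximation)."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out, y, prev_x = [], 0.0, 0.0
    for x in samples:
        y = alpha * (y + x - prev_x)
        prev_x = x
        out.append(y)
    return out

def upper_channel_peak(samples, sample_rate,
                       low_hz=150.0, high_hz=300.0, smooth_hz=30.0):
    """Band-limit, rectify, smooth, then detect and hold the envelope peak."""
    band = rc_low_pass(rc_high_pass(samples, low_hz, sample_rate),
                       high_hz, sample_rate)
    rectified = [max(v, 0.0) for v in band]       # half-wave rectifier
    envelope = rc_low_pass(rectified, smooth_hz, sample_rate)
    peak = 0.0
    for v in envelope:
        peak = max(peak, v)                       # peak detect and hold
    return peak
```

With these gentle first-order sections the band edges are far less sharp than a dedicated band-pass filter would give, so the sketch shows the order of operations rather than realistic selectivity.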
In the lower channel, the electrical signal from the transducer 2 is amplified and isolated by amplifier 14. The energy is again rectified by rectifier 16, isolated and amplified by amplifier 17 and smoothed by filter 18 to form an envelope of the signal. At this point the bandwidth of the signal in the lower channel is unrestricted. The smoothing filter 18 must have its characteristics adjusted accordingly. Following this filter, a stage of isolation amplification 20 is used to separate the processing of the smoothing filter from the time and amplitude discriminator 22. This latter circuit acts to compare the amplitude of the signal envelope with the time derivative of the signal envelope.
The output of the time and amplitude discriminator 22 is a zero-based burst of pulses of a single polarity. These pulses will have a high proportion of the vibratto component. The pulses, which are of varying amplitude and width, are fed into a voltage comparator 24, a commercially available module of conventional design. The comparator compares the incoming wavetrain with a DC voltage from the potentiometer 26. The only pulses that are allowed to emerge from the comparator are those that are larger in voltage value than the set voltage of the potentiometer. This potentiometer is set by the operator to a value which is somewhat above the baseline. It has been demonstrated experimentally that the Vibratto Component usually consists of high, sharp pulses, so these are enhanced with a high setting of the potentiometer. The output of the comparator is a series of pulses, predominantly consisting of the Vibratto Component, all of constant amplitude, one for each pulse exceeding the preset level.
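The comparator action described above can be sketched in a few lines. This is a minimal illustration, assuming the discriminator output is available as a list of sample values; the function name `comparator`, the `out_level` parameter, and the numeric values are hypothetical.

```python
def comparator(pulses, threshold, out_level=1.0):
    """Emit a constant level wherever the input exceeds the threshold set by
    the potentiometer, and zero elsewhere."""
    return [out_level if value > threshold else 0.0 for value in pulses]

# Usage with made-up values: only the two tall pulses pass a 0.5 threshold.
discriminator_output = [0.1, 0.6, 0.4, 0.9, 0.0]
uniform_pulses = comparator(discriminator_output, 0.5)
```

Raising the threshold suppresses the smaller pulses, which mirrors the effect of a higher potentiometer setting.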
The detected and held envelope peak enters input port 27 of the analog multiplier 28, while the series of vibratto pulses enters input port 29. These two waveforms are multiplied together in the mathematical manner known as weighting. The result will be a series of pulses of increased amplitude at a period defined by the vibratto pulse train at input port 29 that will be recorded in the DC recorder 30 for observation and analysis.
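The weighting step amounts to a sample-by-sample product of the two inputs. The following sketch assumes the vibratto pulses are unit-amplitude values and the held peak is a single number; the function name `weight_pulses` and the values shown are hypothetical.

```python
def weight_pulses(pulse_train, held_peak):
    """Multiply each sample of the pulse train by the held peak value."""
    return [level * held_peak for level in pulse_train]

# Usage with made-up values: pulse spacing is unchanged, amplitude becomes 0.8.
weighted = weight_pulses([0, 1, 0, 1, 1], 0.8)
```

The output keeps the timing of the pulse train while its height tracks the held peak, as described for the recorder input.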
A more detailed explanation of the preferred embodiment of the invention is shown schematically in FIG. 8.
The acoustic phonation of the subject under interrogation enters the system of instrumentation in one of two ways. It may, at the discretion of the operator, enter the system directly (in real time) from the transducer 50 or from tape recorder 54 and its transducer 52. These transducers are high fidelity microphone types of devices that reproduce the electrical analog of the acoustic signal, with a minimum of frequency and amplitude distortion. Switch means 56 is used to select either the microphone directly or the tape recorder output. As in FIG. 7, the upper channel in FIG. 8 pertains to the envelope processing and peak detection of a selected frequency band and the lower channel pertains to the envelope processing and vibratto quantification.
The switch means 56 is followed in the upper channel by an operational amplifier 58 with its gain determining resistors 60 and 62. The broad band speech signal from the transducer and switch enters the band pass filter 64 and the spectral region typically from 150-300 Hz is allowed to pass. This exact region may be changed, depending upon the voice of the subject involved. The region of 150-300 Hz is a good compromise and was employed in the preferred embodiment. FIGS. 1 and 2 show the waveforms of a male human voice responding with the words "yes" and "no" in English when the bandwidth is limited to 150-300 Hz.
Another stage of isolation, employing an operational amplifier 66, with its gain determining resistors 68 and 70, is used to isolate the band-pass filter from the converter of the dual polarity signals. In the preferred embodiment of the invention, the conversion of the dual polarity signals out of the isolation amplifier is performed by a simple solid state diode 72 acting as a rectifier. A reversed polarity diode would function nearly as well. In addition, full wave or bridge rectifiers could also be used with attendant increase in signal level. If diodes with a particular characteristic, such as Square Law, were employed, the instrument would operate upon a true measure of the power in the speech signal. A further operational amplifier 74 with its gain setting resistors 76 and 78 serves to isolate the rectifier circuit from the rest of the instrument. A smoothing filter 79 follows this amplifier.
The smoothing filter consists of a low-pass filter network that is comprised of a variable resistor 80 and a fixed capacitor 82. It can be seen that other types of low-pass filters could be employed here, to remove the higher frequency fluctuations of the rectified signal, thus rendering the output of the smoothing filter essentially that of the envelope of the speech wave, in the defined pass-band. The exact time constant of this filter is adjusted depending upon the pitch of the voice under assessment. The smoothed envelope is amplified and isolated by operational amplifier 84 with its gain determining resistors 86 and 88. This envelope of speech energy then passes into a peak detect and hold circuit 90 of conventional design. Such circuits can be readily fabricated from a variety of components and modules by those skilled in the art. It is preferred to use a single circuit to perform this task such as a single module named Infinite Sample Hold, manufactured by Hybrid Systems Corp. This particular module has the advantage of preventing decay of the peak value until the circuit is reset by a signal on input lead 94 from the control means 96 which may be a simple switch. The output peak value appears on lead 92 and is applied to one channel of the analog multiplier 98. There are many ways that two voltages could be multiplied together, thereby weighting or modulating one voltage with the other. This process could be performed manually, by obtaining the individual values of the voltages involved. It is preferred, however, to employ one of the modern analog computer modules that have become available such as Model 107C analog multiplier/divider, manufactured by Hybrid Systems Corp.
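The hold-until-reset behavior described above can be sketched in software. This is an illustration of the described behavior only, not a model of the named commercial module; the class name `PeakHold`, its methods, and the assumption of a non-negative envelope are hypothetical.

```python
class PeakHold:
    """Track the largest envelope value seen since the last reset."""

    def __init__(self):
        self.held = 0.0  # assumes a non-negative envelope

    def update(self, value):
        """Feed one envelope sample and return the currently held peak."""
        if value > self.held:
            self.held = value
        return self.held

    def reset(self):
        """Clear the held value, as the control means does before a question."""
        self.held = 0.0

# Usage with made-up values: the peak of one utterance is held, then cleared.
detector = PeakHold()
for sample in [0.2, 0.7, 0.3]:
    detector.update(sample)
peak_for_utterance = detector.held
detector.reset()
```

Unlike an analog capacitor-based hold, this sketch has no droop, which corresponds to the stated advantage of preventing decay of the peak value until reset.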
The lower channel that processes the speech or phonation energy to obtain the Vibratto components functions in the following manner. The electrical analog of the phonation enters the channel through an isolating operational amplifier 103 with its gain determining resistors 104 and 106. This amplifier is used to provide isolation and linear amplification of the signal with no frequency discrimination. This isolated and amplified signal is applied to a diode 108 where one polarity of the speech signal is allowed to pass into the following circuitry. A diode connected in the opposite polarity could be used nearly as well. A full wave rectification circuit or a bridge rectification circuit (not shown) could be used as well with a small additional complication of the circuit.
The electrical energy out of the diode, at the input of the following circuitry, is therefore predominantly and primarily of one polarity. Operational amplifier 110, with its gain determining resistances 112 and 114, is used to isolate the diode circuit from the follow-on circuitry. The follow-on circuitry consists of a smoothing filter 116 in the form of an R/C Integrator having a variable resistor 117 and a fixed capacitor 118. It can be seen, by those versed in the art, that a variety of different active and passive smoothing filters could be used to remove the high frequency energy of the phonation and to extract a signal which is representative of the envelope of the speech wave and yet retains the Vibratto Component. The R/C Integrator that is used in the present preferred embodiment functions quite well and is simple to employ. The time constant is variable to afford adjustment for different voices of various fundamental frequencies.
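The effect of adjusting the variable resistor follows the standard first-order relations for an R/C network, a time constant of R times C and a cutoff frequency of 1/(2πRC). The sketch below applies these relations; the component values are hypothetical, since none are given in this description, and the function names are illustrative.

```python
import math

def rc_time_constant(resistance_ohms, capacitance_farads):
    """Time constant of a simple R/C network, in seconds."""
    return resistance_ohms * capacitance_farads

def rc_cutoff_hz(resistance_ohms, capacitance_farads):
    """Corner frequency (-3 dB) of a first-order R/C low-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohms * capacitance_farads)

def resistance_for_time_constant(time_constant_s, capacitance_farads):
    """Resistor setting needed to reach a desired time constant."""
    return time_constant_s / capacitance_farads

# Usage with assumed values: 10 kilohms and 1 microfarad.
tau = rc_time_constant(10_000, 1e-6)
corner = rc_cutoff_hz(10_000, 1e-6)
```

A larger resistance lengthens the time constant and lowers the corner frequency, which smooths more heavily and would suit a lower-pitched voice.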
The R/C Integrator is followed by another operational amplifier 120 with its gain determining resistors 122 and 124. This operational amplifier isolates the processing of the R/C Integrator from the subsequent circuit action. Following the isolation amplifier 120 is a special discriminator circuit that processes the time derivative of the incoming speech wave and the amplitude of the speech envelope at the same time. The special differentiator circuit involves a fixed capacitor 126 and a variable resistor 128. These two components perform the time differentiation function. The potentiometer 130 provides a measure of the undifferentiated signal envelope which is used to null out residual envelope energy. This component, connected as it is, performs the envelope amplitude discrimination function. An operational amplifier 132, with its gain and performance determining resistances 134, 136 and 138, accepts the time derivative signal and the amplitude discrimination signal and provides effective base line restoration for most types of phonation.
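One possible software reading of the time and amplitude discriminator is sketched below. This is an interpretation, not the circuit of FIG. 8: it assumes the output is a scaled time derivative of the envelope minus a scaled copy of the envelope itself, with negative results clipped to zero as a stand-in for base line restoration. The function name and the gain parameters `diff_gain` and `null_gain` are hypothetical.

```python
def time_amplitude_discriminator(envelope, sample_rate,
                                 diff_gain=0.002, null_gain=0.5):
    """Combine the envelope's time derivative with a nulling fraction of the
    envelope, keeping only positive excursions (assumed single polarity)."""
    out = []
    previous = envelope[0] if envelope else 0.0
    for value in envelope:
        derivative = (value - previous) * sample_rate  # finite difference
        previous = value
        combined = diff_gain * derivative - null_gain * value
        out.append(max(combined, 0.0))                 # clip to a zero base line
    return out

# Usage with made-up values: a sudden step in the envelope yields one pulse.
pulses = time_amplitude_discriminator([1.0, 1.0, 1.0, 2.0, 2.0], 1000)
```

A slowly varying envelope produces no output because the derivative term never exceeds the nulling term, whereas a rapid excursion does.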
Base line restoration can also be accomplished in a variety of other ways, such as by various types of clamping and DC restoration circuits. Irrespective of the circuit used, the output of the amplifier 132 becomes a series of pulses with a defined base line that contains the Vibratto Component and thus comprises the variable fine structure of the phonation. These pulses are of varying amplitude and width.
The output pulses from the time and amplitude discriminator are applied to a comparator circuit 140 which determines the level of statistically significant pulses. This level is determined by a knowledgeable operator of the equipment, and is controlled by adjustment of potentiometer 142. This control is shown to function at either a positive or a negative voltage level. When the polarity of the diode 108 is selected, the comparator voltage level must be adjusted to the polarity that will select either excess positive or excess negative peaks. The potentiometer 142 may also be set at 0 volts, at which time the circuit becomes a conventional zero-crossing detector means. It has been found that the statistical significance of the Truth/Lie Decision process will improve if a level of voltage off the 0 baseline is selected for the comparator switching level. This comparator may be a simple diode circuit or it may be a Schmitt trigger circuit, each with suitable voltage supplies, passive and active components. However, for simplicity and economy, a differential voltage comparator such as the Motorola MC1710 was used. The output of voltage comparator 140 is a series of equiamplitude pulses, one for each input above the selected level. This pulse train enters the analog multiplier 98, where it is weighted by the output voltage level of the peak detect and hold module 90. The output pulse train thus has an amplitude varied by the peak detected signal and a spacing of the vibratto pulse train.
The algebraic analog product of these two voltages emerges from the multiplier module on lead 100 and enters a DC chart recorder 146. The responses of each utterance of phonation from the subject under interrogation are thereby recorded for comparison and analysis. The chart recorder is also under the control (on cable 148) of the control means 96. By this means, prior to the act of asking a particular question, the operator/interrogator erases or resets the sample and hold voltage level from module 90 and primes the chart recorder. Immediately after the question is asked, the circuits are all enabled by the control means so that the phonation of the subject may be processed by the instrumentation. Although the invention has been described and illustrated in detail, it is to be clearly understood that the same is by way of illustration and example only and is not to be taken by way of limitation, the spirit and scope of the invention being limited only by the terms of the appended claims.

Claims (14)

1. A method for detecting emotional stress in the utterance of an individual comprising: converting said utterance to an electrical signal; selecting a frequency band of said electrical signals; detecting the peak amplitude of said selected frequency band; simultaneously with said frequency selecting and peak detecting, smoothing said electrical signals to form an envelope, detecting rapid aperiodic amplitude modulation on said envelope, and selecting detected modulation exceeding a preselected amplitude; weighting said selected modulating with said detected peak amplitude; and displaying said weighted signal which is indicative of emotional stress.
2. A method as in claim 1 including rectifying and smoothing said selected frequency band before detecting.
3. A method as in claim 1 wherein detecting rapid modulation includes time and amplitude discriminating said smoothed signal.
4. A method as in claim 3 wherein said smoothing comprises integrating and said time and amplitude discriminating includes differentiating and D.C. base line restoration.
5. A method as in claim 1 including holding said detected peak amplitude for the duration of said utterance and resetting said held peak amplitude upon termination of said utterance.
6. A device for indicating emotional stress from the utterances of a human comprising: means for converting said utterances into electrical signals; means connected to said converting means for passing a frequency band of said electrical signals; means connected to said band-passing means for detecting the peak amplitude of said passed electrical signals; means connected to said converting means for shaping said electrical signals into an envelope; means connected to said shaping means for detecting rapid aperiodic amplitude modulation on said envelope; means connected to said peak detecting means and said modulation detecting means for weighting said detected modulation with said detected peak amplitude; and means connected to said weighting means for indicating said weighted signal.
7. A device as in claim 6 including rectifying means and smoothing means connected between said band-passing means and said peak detecting means.
8. A device as in claim 7 wherein said band-passing means has a band of 150 to 300Hz.
9. A device as in claim 7 wherein said peak detecting means also hold said detected peak until reset.
10. A device as in claim 6 including means connected between said modulation detecting means and said weighting means for comparing said detected modulation to a selected level.
11. A device as in claim 10 wherein said modulation detecting means includes differentiating means producing a series of varying amplitude pulses and said comparing means producing a series of uniform amplitude pulses for each varying amplitude pulse above said selected level.
12. A device as in claim 11 wherein said weighting means produces a series of pulses whose spacing is that of said comparing means pulses and whose amplitude is proportional to said detected peak.
13. A device as in claim 10 wherein said shaping means includes rectifying means and integrating means.
14. A device as in claim 6 wherein said detecting means include differentiation means and baseline restoration means.
US3855416A (en): Application US00311422A, priority date 1972-12-01, filing date 1972-12-01, publication date 1974-12-17. Status: Expired - Lifetime. Country: US.
US7511606B2 (en) 2005-05-18 2009-03-31 Lojack Operating Company Lp Vehicle locating unit with input voltage protection
US20100060461A1 (en) * 2008-09-08 2010-03-11 Sprague Phillip R Psychophysiological Touch Screen Stress Analyzer
US20100090834A1 (en) * 2008-10-13 2010-04-15 Sandisk Il Ltd. Wearable device for adaptively recording signals
US20100211394A1 (en) * 2006-10-03 2010-08-19 Andrey Evgenievich Nazdratenko Method for determining a stress state of a person according to a voice and a device for carrying out said method
US7869586B2 (en) 2007-03-30 2011-01-11 Eloyalty Corporation Method and system for aggregating and analyzing data relating to a plurality of interactions between a customer and a contact center and generating business process analytics
US20110178803A1 (en) * 1999-08-31 2011-07-21 Accenture Global Services Limited Detecting emotion in voice signals in a call center
US7995717B2 (en) 2005-05-18 2011-08-09 Mattersight Corporation Method and system for analyzing separated voice data of a telephonic communication between a customer and a contact center by applying a psychological behavioral model thereto
US8023639B2 (en) 2007-03-30 2011-09-20 Mattersight Corporation Method and system determining the complexity of a telephonic communication received by a contact center
US8094803B2 (en) 2005-05-18 2012-01-10 Mattersight Corporation Method and system for analyzing separated voice data of a telephonic communication between a customer and a contact center by applying a psychological behavioral model thereto
US8094790B2 (en) 2005-05-18 2012-01-10 Mattersight Corporation Method and software for training a customer service representative by analysis of a telephonic interaction between a customer and a contact center
US8718262B2 (en) 2007-03-30 2014-05-06 Mattersight Corporation Method and system for automatically routing a telephonic communication base on analytic attributes associated with prior telephonic communication
US9043214B2 (en) 2005-04-22 2015-05-26 Qualcomm Incorporated Systems, methods, and apparatus for gain factor attenuation
US9257122B1 (en) 2012-08-06 2016-02-09 Debra Bond Cancro Automatic prediction and notification of audience-perceived speaking behavior
US10419611B2 (en) 2007-09-28 2019-09-17 Mattersight Corporation System and methods for determining trends in electronic communications

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3268661A (en) * 1962-04-09 1966-08-23 Melpar Inc System for determining consonant formant loci
US3346694A (en) * 1965-06-02 1967-10-10 Bell Telephone Labor Inc Speech level measuring apparatus
GB1113225A (en) * 1965-01-05 1968-05-08 Nat Res Dev Apparatus for distinguishing between voiced and unvoiced sounds in a speech signal
US3592969A (en) * 1968-07-24 1971-07-13 Matsushita Electric Ind Co Ltd Speech analyzing apparatus

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3268661A (en) * 1962-04-09 1966-08-23 Melpar Inc System for determining consonant formant loci
GB1113225A (en) * 1965-01-05 1968-05-08 Nat Res Dev Apparatus for distinguishing between voiced and unvoiced sounds in a speech signal
US3346694A (en) * 1965-06-02 1967-10-10 Bell Telephone Labor Inc Speech level measuring apparatus
US3592969A (en) * 1968-07-24 1971-07-13 Matsushita Electric Ind Co Ltd Speech analyzing apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Lieberman & Michaels, Some Aspects of Fundamental Frequency & Envelope Amplitude As Related to the Emotional Content of Speech, J.A.S.A. 7/1962, pp. 922-927. *

Cited By (80)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4142067A (en) * 1977-06-14 1979-02-27 Williamson John D Speech analyzer for analyzing frequency perturbations in a speech pattern to determine the emotional state of a person
US4383135A (en) * 1980-01-23 1983-05-10 Scott Instruments Corporation Method and apparatus for speech recognition
US4335276A (en) * 1980-04-16 1982-06-15 The University Of Virginia Apparatus for non-invasive measurement and display nasalization in human speech
US4444199A (en) * 1981-07-21 1984-04-24 William A. Shafer Method and apparatus for monitoring physiological characteristics of a subject
US6591238B1 (en) * 1983-08-11 2003-07-08 Stephen E. Silverman Method for detecting suicidal predisposition
US5148483A (en) * 1983-08-11 1992-09-15 Silverman Stephen E Method for detecting suicidal predisposition
US5976081A (en) * 1983-08-11 1999-11-02 Silverman; Stephen E. Method for detecting suicidal predisposition
US5077800A (en) * 1988-10-14 1991-12-31 Societe Anonyme Dite: Laboratorie D'audiologie Dupret-Lefevre S.A. Electronic device for processing a sound signal
US5134657A (en) * 1989-03-13 1992-07-28 Winholtz William S Vocal demodulator
US6006188A (en) * 1997-03-19 1999-12-21 Dendrite, Inc. Speech signal processing for determining psychological or physiological characteristics using a knowledge base
US7165033B1 (en) 1999-04-12 2007-01-16 Amir Liberman Apparatus and methods for detecting emotions in the human voice
US20030023444A1 (en) * 1999-08-31 2003-01-30 Vicki St. John A voice recognition system for navigating on the internet
US7590538B2 (en) 1999-08-31 2009-09-15 Accenture Llp Voice recognition system for navigating on the internet
US20020194002A1 (en) * 1999-08-31 2002-12-19 Accenture Llp Detecting emotions using voice signal analysis
US20070162283A1 (en) * 1999-08-31 2007-07-12 Accenture Llp: Detecting emotions using voice signal analysis
US20110178803A1 (en) * 1999-08-31 2011-07-21 Accenture Global Services Limited Detecting emotion in voice signals in a call center
US6463415B2 (en) * 1999-08-31 2002-10-08 Accenture Llp 69voice authentication system and method for regulating border crossing
US7222075B2 (en) 1999-08-31 2007-05-22 Accenture Llp Detecting emotions using voice signal analysis
US8965770B2 (en) 1999-08-31 2015-02-24 Accenture Global Services Limited Detecting emotion in voice signals in a call center
US6427137B2 (en) * 1999-08-31 2002-07-30 Accenture Llp System, method and article of manufacture for a voice analysis system that detects nervousness for preventing fraud
US7627475B2 (en) 1999-08-31 2009-12-01 Accenture Llp Detecting emotions using voice signal analysis
US6724887B1 (en) 2000-01-24 2004-04-20 Verint Systems, Inc. Method and system for analyzing customer communications with a contact center
US7062443B2 (en) 2000-08-22 2006-06-13 Silverman Stephen E Methods and apparatus for evaluating near-term suicidal risk using vocal parameters
US20020077825A1 (en) * 2000-08-22 2002-06-20 Silverman Stephen E. Methods and apparatus for evaluating near-term suicidal risk using vocal parameters
US7139699B2 (en) 2000-10-06 2006-11-21 Silverman Stephen E Method for analysis of vocal jitter for near-term suicidal risk assessment
US7565285B2 (en) 2000-10-06 2009-07-21 Marilyn K. Silverman Detecting near-term suicidal risk utilizing vocal jitter
EP1256937A3 (en) * 2001-05-11 2004-09-29 Sony France S.A. Emotion recognition method and device
EP1256937A2 (en) * 2001-05-11 2002-11-13 Sony France S.A. Emotion recognition method and device
US6719707B1 (en) 2001-06-15 2004-04-13 Nathan Montgomery Apparatus and method for performing musical perception sound analysis on a system
US7451079B2 (en) 2001-07-13 2008-11-11 Sony France S.A. Emotion recognition method and device
US20030055654A1 (en) * 2001-07-13 2003-03-20 Oudeyer Pierre Yves Emotion recognition method and device
US20040105464A1 (en) * 2002-12-02 2004-06-03 Nec Infrontia Corporation Voice data transmitting and receiving system
US7839893B2 (en) * 2002-12-02 2010-11-23 Nec Infrontia Corporation Voice data transmitting and receiving system
US8484036B2 (en) * 2005-04-01 2013-07-09 Qualcomm Incorporated Systems, methods, and apparatus for wideband speech coding
US8364494B2 (en) 2005-04-01 2013-01-29 Qualcomm Incorporated Systems, methods, and apparatus for split-band filtering and encoding of a wideband signal
US20060282263A1 (en) * 2005-04-01 2006-12-14 Vos Koen B Systems, methods, and apparatus for highband time warping
US20060277042A1 (en) * 2005-04-01 2006-12-07 Vos Koen B Systems, methods, and apparatus for anti-sparseness filtering
US8332228B2 (en) 2005-04-01 2012-12-11 Qualcomm Incorporated Systems, methods, and apparatus for anti-sparseness filtering
US8260611B2 (en) 2005-04-01 2012-09-04 Qualcomm Incorporated Systems, methods, and apparatus for highband excitation generation
US8069040B2 (en) 2005-04-01 2011-11-29 Qualcomm Incorporated Systems, methods, and apparatus for quantization of spectral envelope representation
US8244526B2 (en) 2005-04-01 2012-08-14 Qualcomm Incorporated Systems, methods, and apparatus for highband burst suppression
US20070088558A1 (en) * 2005-04-01 2007-04-19 Vos Koen B Systems, methods, and apparatus for speech signal filtering
US8140324B2 (en) 2005-04-01 2012-03-20 Qualcomm Incorporated Systems, methods, and apparatus for gain coding
US8078474B2 (en) 2005-04-01 2011-12-13 Qualcomm Incorporated Systems, methods, and apparatus for highband time warping
US9043214B2 (en) 2005-04-22 2015-05-26 Qualcomm Incorporated Systems, methods, and apparatus for gain factor attenuation
US10104233B2 (en) 2005-05-18 2018-10-16 Mattersight Corporation Coaching portal and methods based on behavioral assessment data
US10021248B2 (en) 2005-05-18 2018-07-10 Mattersight Corporation Method and system for analyzing caller interaction event data
US9571650B2 (en) 2005-05-18 2017-02-14 Mattersight Corporation Method and system for generating a responsive communication based on behavioral assessment data
US8094803B2 (en) 2005-05-18 2012-01-10 Mattersight Corporation Method and system for analyzing separated voice data of a telephonic communication between a customer and a contact center by applying a psychological behavioral model thereto
US8094790B2 (en) 2005-05-18 2012-01-10 Mattersight Corporation Method and software for training a customer service representative by analysis of a telephonic interaction between a customer and a contact center
US7995717B2 (en) 2005-05-18 2011-08-09 Mattersight Corporation Method and system for analyzing separated voice data of a telephonic communication between a customer and a contact center by applying a psychological behavioral model thereto
US9692894B2 (en) 2005-05-18 2017-06-27 Mattersight Corporation Customer satisfaction system and method based on behavioral assessment data
US9432511B2 (en) 2005-05-18 2016-08-30 Mattersight Corporation Method and system of searching for communications for playback or analysis
US9357071B2 (en) 2005-05-18 2016-05-31 Mattersight Corporation Method and system for analyzing a communication by applying a behavioral model thereto
US9225841B2 (en) 2005-05-18 2015-12-29 Mattersight Corporation Method and system for selecting and navigating to call examples for playback or analysis
US8781102B2 (en) 2005-05-18 2014-07-15 Mattersight Corporation Method and system for analyzing a communication by applying a behavioral model thereto
US10129402B1 (en) 2005-05-18 2018-11-13 Mattersight Corporation Customer satisfaction analysis of caller interaction event data system and methods
US7511606B2 (en) 2005-05-18 2009-03-31 Lojack Operating Company Lp Vehicle locating unit with input voltage protection
US8594285B2 (en) 2005-05-18 2013-11-26 Mattersight Corporation Method and system for analyzing separated voice data of a telephonic communication between a customer and a contact center by applying a psychological behavioral model thereto
US20070033009A1 (en) * 2005-08-05 2007-02-08 Samsung Electronics Co., Ltd. Apparatus and method for modulating voice in portable terminal
US20070192108A1 (en) * 2006-02-15 2007-08-16 Alon Konchitsky System and method for detection of emotion in telecommunications
US20100211394A1 (en) * 2006-10-03 2010-08-19 Andrey Evgenievich Nazdratenko Method for determining a stress state of a person according to a voice and a device for carrying out said method
US9270826B2 (en) 2007-03-30 2016-02-23 Mattersight Corporation System for automatically routing a communication
US8891754B2 (en) 2007-03-30 2014-11-18 Mattersight Corporation Method and system for automatically routing a telephonic communication
US8718262B2 (en) 2007-03-30 2014-05-06 Mattersight Corporation Method and system for automatically routing a telephonic communication base on analytic attributes associated with prior telephonic communication
US9124701B2 (en) 2007-03-30 2015-09-01 Mattersight Corporation Method and system for automatically routing a telephonic communication
US10129394B2 (en) 2007-03-30 2018-11-13 Mattersight Corporation Telephonic communication routing system based on customer satisfaction
US9699307B2 (en) 2007-03-30 2017-07-04 Mattersight Corporation Method and system for automatically routing a telephonic communication
US8983054B2 (en) 2007-03-30 2015-03-17 Mattersight Corporation Method and system for automatically routing a telephonic communication
US20090024050A1 (en) * 2007-03-30 2009-01-22 Searete Llc, A Limited Liability Corporation Of The State Of Delaware Computational user-health testing
US7869586B2 (en) 2007-03-30 2011-01-11 Eloyalty Corporation Method and system for aggregating and analyzing data relating to a plurality of interactions between a customer and a contact center and generating business process analytics
US8023639B2 (en) 2007-03-30 2011-09-20 Mattersight Corporation Method and system determining the complexity of a telephonic communication received by a contact center
US10419611B2 (en) 2007-09-28 2019-09-17 Mattersight Corporation System and methods for determining trends in electronic communications
US10601994B2 (en) 2007-09-28 2020-03-24 Mattersight Corporation Methods and systems for determining and displaying business relevance of telephonic communications between customers and a contact center
US8264364B2 (en) 2008-09-08 2012-09-11 Phillip Roger Sprague Psychophysiological touch screen stress analyzer
US20100060461A1 (en) * 2008-09-08 2010-03-11 Sprague Phillip R Psychophysiological Touch Screen Stress Analyzer
US8031075B2 (en) 2008-10-13 2011-10-04 Sandisk Il Ltd. Wearable device for adaptively recording signals
US20100090834A1 (en) * 2008-10-13 2010-04-15 Sandisk Il Ltd. Wearable device for adaptively recording signals
US8258964B2 (en) 2008-10-13 2012-09-04 Sandisk Il Ltd. Method and apparatus to adaptively record data
US9257122B1 (en) 2012-08-06 2016-02-09 Debra Bond Cancro Automatic prediction and notification of audience-perceived speaking behavior

Similar Documents

Publication Publication Date Title
US3855416A (en) Method and apparatus for phonation analysis leading to valid truth/lie decisions by fundamental speech-energy weighted vibratto component assessment
US3855418A (en) Method and apparatus for phonation analysis leading to valid truth/lie decisions by vibratto component assessment
US3971034A (en) Physiological response analysis method and apparatus
US3855417A (en) Method and apparatus for phonation analysis lending to valid truth/lie decisions by spectral energy region comparison
Lieberman Some acoustic measures of the fundamental periodicity of normal and pathologic larynges
Pabon et al. Automatic phonetogram recording supplemented with acoustical voice-quality parameters
US6697457B2 (en) Voice messaging system that organizes voice messages based on detected emotion
EP1038291B1 (en) Apparatus and methods for detecting emotions
US6427137B2 (en) System, method and article of manufacture for a voice analysis system that detects nervousness for preventing fraud
US6353810B1 (en) System, method and article of manufacture for an emotion detection system improving emotion recognition
US4862503A (en) Voice parameter extractor using oral airflow
US4817155A (en) Method and apparatus for speech analysis
Sondhi Measurement of the glottal waveform
US4335276A (en) Apparatus for non-invasive measurement and display nasalization in human speech
EP0054365A1 (en) Speech recognition systems
Holmes Formant excitation before and after glottal closure
CN112397074A (en) Voiceprint recognition method based on MFCC (Mel frequency cepstrum coefficient) and vector element learning
Howard Peak-picking fundamental period estimation for hearing prostheses
Fritzell Inverse filtering
US3387090A (en) Method and apparatus for displaying speech
US3925616A (en) Apparatus for determining the glottal waveform
JPH0797279B2 (en) Voice recognizer
Alpert Feedback effects of audition and vocal effort on intensity of voice
US4401850A (en) Speech analysis apparatus
Hamlet Vocal compensation: An ultrasonic study of vocal fold vibration in normal and nasal vowels