CN1835077B - Automatic speech recognizing input method and system for Chinese names - Google Patents

Automatic speech recognizing input method and system for Chinese names

Info

Publication number
CN1835077B
Authority
CN
China
Prior art keywords
character
identification
voice
name
syllable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2005100545841A
Other languages
Chinese (zh)
Other versions
CN1835077A (en
Inventor
王瑞璋
蔡锦和
黄良声
沈家麟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Delta Electronics Inc
Delta Optoelectronics Inc
Original Assignee
Delta Optoelectronics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Delta Optoelectronics Inc
Priority to CN2005100545841A
Publication of CN1835077A
Application granted
Publication of CN1835077B
Legal status: Expired - Fee Related

Abstract

The invention relates to a method for inputting Chinese names by automatic speech recognition, comprising: (a) a user inputs a first voice describing a name to be recognized, the name comprising a plurality of characters; (b) a full-name recognition network unit recognizes the first voice to obtain a recognition result; (c) the result is sent to a character confirmation unit; (d) the character confirmation unit confirms the characters in the result; (e) if step (d) confirms that the characters are correct, the confirmed result is output; (f) if step (d) finds that one of the characters is wrong, the user inputs a second voice, in a chosen input type, describing the misrecognized character; and (g) a description recognition unit corresponding to that type recognizes the second voice and sends its recognition result to the character confirmation unit.

Description

Automatic speech recognition input method and system for Chinese names
Technical field
The present invention relates to an automatic speech recognition input method and system, and in particular to an automatic speech recognition input method and system for Chinese names.
Background
As automatic speech recognition technology matures, automated programs built on it can take over tedious routine work and thereby save substantial labor costs.
For example, current nationwide directory-assistance services (for example, 104 and 105) rely on human operators to recognize the names being queried and offer no automatic speech recognition input. China Telecom, which provides the 104 directory-assistance service, needs thousands of staff to handle the huge volume of queries. Automating this service would bring significant benefits, whether in saving labor costs or in deploying human resources more appropriately.
Existing automatic speech recognition systems for Chinese names all train a language model that treats each complete Chinese name as one phrase unit. When a user speaks a name, the recognition engine compares the input against the language model trained on those names, and the system outputs a complete Chinese name once recognition finishes. Such systems, however, are suitable only for a small set of names (roughly several thousand entries). With a large set of names (tens of thousands up to millions of entries), the recognition success rate drops sharply. Existing systems are therefore limited to applications such as the telephone extension directory of an ordinary company. On a nationwide directory system, users have limited patience and would probably not tolerate a system with a very low recognition rate, so deployment remains difficult.
In view of these shortcomings of the known technology, the applicant, after extensive testing and research, developed the present automatic speech recognition input method and system for Chinese names.
Summary of the invention
The main aspect of the present invention is to provide an automatic speech recognition input method for Chinese names, comprising the steps of: (a) a user inputs a first voice describing a name to be recognized, the name comprising a plurality of characters; (b) a full-name recognition network unit (Name Net Recognizer) recognizes the first voice to obtain a name recognition result; (c) the name recognition result is sent to a character confirmation unit (Character Confirmation Unit); (d) the character confirmation unit confirms each character of the name recognition result individually; (e) if step (d) confirms that every character is recognized correctly, the confirmed name recognition result is output; (f) if step (d) finds that one of the characters is misrecognized, the user inputs a second voice, in a chosen input type, describing the misrecognized character; (g) a description recognition unit corresponding to that input type recognizes the second voice and sends its recognition result to the character confirmation unit; and (h) steps (d) to (g) are repeated.
According to the above aspect, step (b) further comprises: (b1) obtaining a feature parameter of the first voice; and (b2) recognizing the first voice with the full-name recognition network unit according to the feature parameter.
According to the above aspect, step (b1) further comprises: (b11) pre-processing the first voice; and (b12) obtaining the feature parameter of the first voice.
According to the above aspect, step (b11) further comprises: amplifying the first voice signal; normalizing the first voice signal (normalization); applying pre-emphasis to the first voice signal (pre-emphasis); multiplying the first voice by a Hamming window (Hamming Window); and passing the first voice through a low-pass filter or a high-pass filter.
According to the above aspect, step (b12) further comprises: applying a fast Fourier transform (Fast Fourier Transform, FFT) to the first voice; and computing the Mel-frequency cepstral coefficients (Mel-Frequency Cepstrum Coefficients, MFCC) of the first voice.
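The pre-processing steps of (b11) can be illustrated with a short sketch. This is a minimal pure-Python illustration under stated assumptions, not the patent's implementation: the pre-emphasis coefficient 0.97, the frame length, the hop size and all function names are hypothetical choices, and the amplification and low-pass/high-pass filtering stages are left out.

```python
import math

def normalize(signal):
    """Peak normalization: scale samples so the largest magnitude is 1.0."""
    peak = max((abs(s) for s in signal), default=0.0)
    return [s / peak for s in signal] if peak > 0 else list(signal)

def pre_emphasize(signal, alpha=0.97):
    """Pre-emphasis y[n] = x[n] - alpha * x[n-1]; alpha is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def hamming_window(length):
    """Hamming window w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def windowed_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames and multiply each by the window."""
    window = hamming_window(frame_len)
    return [[signal[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, hop)]
```

A caller would typically chain the functions, for example `windowed_frames(pre_emphasize(normalize(samples)), 256, 128)`, before computing spectral features on each frame.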
According to the above aspect, step (d) further comprises: outputting, one by one, a plurality of character hypotheses for each character; and having the user select the correct character from the character hypotheses.
According to the above aspect, the output plays, by voice, a descriptive phrase for each character hypothesis.
According to the above aspect, the output displays each character hypothesis on a screen.
According to the above aspect, the user makes the selection by voice input.
According to the above aspect, the user makes the selection by key-press input.
According to the above aspect, step (g) further comprises: (g1) obtaining a feature parameter of the second voice; and (g2) recognizing the second voice with the description recognition unit according to the feature parameter.
According to the above aspect, step (g1) further comprises: (g11) pre-processing the second voice; and (g12) obtaining the feature parameter of the second voice.
According to the above aspect, step (g11) further comprises: amplifying the second voice signal; normalizing the second voice signal (normalization); applying pre-emphasis to the second voice signal (pre-emphasis); multiplying the second voice by a Hamming window (Hamming Window); and passing the second voice through a low-pass filter or a high-pass filter.
According to the above aspect, step (g12) further comprises: applying a fast Fourier transform (Fast Fourier Transform, FFT) to the second voice; and computing the Mel-frequency cepstral coefficients (Mel-Frequency Cepstrum Coefficients, MFCC) of the second voice.
According to the above aspect, in step (f) the user describes the misrecognized character with a character phrase type.
According to the above aspect, the description recognition unit corresponding to the character phrase type is a character description recognition unit (Character Description Recognizer, CDR).
According to the above aspect, in step (f) the user describes the misrecognized character with a syllable phrase type.
According to the above aspect, the description recognition unit corresponding to the syllable phrase type is a syllable description recognition unit (Syllable Spelling Recognizer, SSR).
Another aspect of the present invention is to provide an automatic speech recognition input system for Chinese names, comprising: a speech input device through which a user inputs a voice describing a name to be recognized, the name comprising a plurality of characters; a full-name recognition network unit (Name Net Recognizer) that recognizes the voice to obtain a name recognition result; a character confirmation unit (Character Confirmation Unit) that confirms whether each character of the name recognition result is correct; a character description recognition unit (Character Description Recognizer, CDR) that recognizes a character when the user describes it with a character phrase type; a syllable description recognition unit (Syllable Spelling Recognizer, SSR) that recognizes a character when the user describes it with a syllable phrase type; and an output unit that outputs the confirmed name recognition result.
According to the above aspect, the full-name recognition network unit further comprises a full-name recognition network engine and a name character-string language model.
According to the above aspect, the name character-string language model is a language model trained on a basic vocabulary and known name data.
According to the above aspect, the basic vocabulary consists of 408 syllables.
According to the above aspect, the basic vocabulary consists of 1300 toned syllables.
According to the above aspect, the basic vocabulary consists of 408 syllables and 1300 toned syllables.
According to the above aspect, the character description recognition unit further comprises a character description recognition engine and a character description language model.
According to the above aspect, the character description language model is a language model trained on a basic vocabulary and on phrase data used to describe characters.
According to the above aspect, the basic vocabulary consists of 408 syllables.
According to the above aspect, the basic vocabulary consists of 1300 toned syllables.
According to the above aspect, the basic vocabulary consists of 408 syllables and 1300 toned syllables.
According to the above aspect, the syllable description recognition unit further comprises a syllable description recognition engine, a syllable description language model and a syllable-to-character table.
According to the above aspect, the syllable description language model is a language model trained on a basic vocabulary and on phrase data used to describe syllables.
According to the above aspect, the basic vocabulary consists of 408 syllables.
According to the above aspect, the basic vocabulary consists of 1300 toned syllables.
According to the above aspect, the basic vocabulary consists of 408 syllables and 1300 toned syllables.
Another aspect of the present invention is to provide a full-name recognition network unit (Name Net Recognizer) for use in an automatic speech recognition input system. When a user inputs a voice describing a Chinese name, the full-name recognition network unit recognizes the name. The unit comprises a name recognition network engine and a name character-string language model, wherein the name character-string language model is a language model trained on a basic vocabulary and known name data; the basic vocabulary consists of 408 syllables, or 1300 toned syllables, or 408 syllables and 1300 toned syllables; and the name recognition network engine refers to the name character-string language model to recognize the voice.
Another aspect of the present invention is to provide a character description recognition unit (Character Description Recognizer, CDR) for use in an automatic speech recognition input system. When a user inputs a voice describing a character with a character phrase type, the character description recognition unit recognizes the character. The unit comprises a character description recognition engine and a character description language model, wherein the character description language model is a language model trained on a basic vocabulary and on phrase data used to describe characters; the basic vocabulary consists of 408 syllables, or 1300 toned syllables, or 408 syllables and 1300 toned syllables; and the character description recognition engine refers to the character description language model to recognize the voice.
Another aspect of the present invention is to provide a syllable description recognition unit (Syllable Spelling Recognizer, SSR) for use in an automatic speech recognition input system. When a user inputs a voice describing a character with a syllable phrase type, the syllable description recognition unit recognizes the character. The unit comprises a syllable description recognition engine, a syllable description language model and a syllable-to-character table, wherein the syllable description language model is a language model trained on a basic vocabulary and on phrase data used to describe syllables; the basic vocabulary consists of 408 syllables, or 1300 toned syllables, or 408 syllables and 1300 toned syllables; and the syllable description recognition engine refers to the syllable description language model to recognize the voice.
Another aspect of the present invention is to provide a name inquiry system that comprises the above automatic speech recognition input system for Chinese names.
According to the above aspect, the name inquiry system is used for a directory-assistance service.
According to the above aspect, the name inquiry system is used for an auto-attendant dialogue system.
According to the above aspect, the name inquiry system is used for a voice portal website.
The present invention may be understood more fully from the following drawings and embodiments.
Description of drawings
Fig. 1 is a flow chart of the automatic speech recognition input method for Chinese names of the present invention.
Fig. 2 is a schematic diagram of the full-name recognition network unit of the present invention as applied in the automatic speech recognition input system for Chinese names.
Fig. 3 is a schematic diagram of the character description recognition unit of the present invention as applied in the automatic speech recognition input system for Chinese names.
Fig. 4 is a schematic diagram of the syllable description recognition unit of the present invention as applied in the automatic speech recognition input system for Chinese names.
The reference numerals are as follows:
1: automatic speech recognition input system for Chinese names
11: full-name recognition network unit (Name Net Recognizer)
12: character confirmation unit (Character Confirmation Unit)
13: character description recognition unit (Character Description Recognizer, CDR)
14: syllable description recognition unit (Syllable Spelling Recognizer, SSR)
Embodiment
The present invention can be fully understood from the following embodiments, which enable those skilled in the art to carry it out; the implementation of the invention is not, however, limited to the forms of the following embodiments.
Refer to Fig. 1, a flow chart of the automatic speech recognition input method for Chinese names of the present invention. The method is performed by the Chinese-name automatic speech recognition input system 1, which comprises a speech input device (not shown), a full-name recognition network unit (Name Net Recognizer) 11, a character confirmation unit (Character Confirmation Unit) 12, a character description recognition unit (Character Description Recognizer, CDR) 13, a syllable description recognition unit (Syllable Spelling Recognizer, SSR) 14 and an output unit (not shown).
First, the user speaks the Chinese name to be recognized into the speech input device. The input voice is then sent to the full-name recognition network unit 11 for name recognition. The recognition result of unit 11 is a set of possible character strings for the characters of the name; it is segmented into a plurality of single-character groups, which are sent to the character confirmation unit 12. On receiving the single-character groups, the character confirmation unit 12 outputs the possible characters of each group one by one through the output unit, so that the user can confirm the name segment by segment and select the correct character. If every character is recognized correctly, the correct result is output through the output unit. If any character fails recognition, system 1 guides the user through a further recognition step.
If a character fails recognition, the user must describe the misrecognized character again by inputting a voice in another input type, which is passed to the description recognition unit corresponding to that type. If the user describes the misrecognized character with a character phrase type, the character description recognition unit 13 recognizes it; if the user describes it with a syllable phrase type, the syllable description recognition unit 14 recognizes it. Whichever type is used, the recognition result is a group of possible characters for that character. The character description recognition unit 13 or the syllable description recognition unit 14 then sends this character group to the character confirmation unit 12, which helps the user confirm further. If the character fails recognition again, the description step is repeated. However, to keep repeated misrecognition from frustrating the user, once the character confirmation unit 12 determines that recognition has failed more than a certain number of times (for example, 4 times), system 1 transfers the user to a human attendant, who provides the inquiry service the user needs.
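The confirm, re-describe and hand-off loop described above can be sketched as follows. This is only an illustration under assumptions: the function and parameter names are hypothetical, the recognizers are represented by plain callables, and failures are counted across the whole name, which the text does not specify.

```python
MAX_FAILURES = 4  # the text gives 4 only as an example threshold

def confirm_name(candidate_groups, pick, redescribe, max_failures=MAX_FAILURES):
    """Confirm a name one character at a time.

    candidate_groups: one list of candidate characters per name position.
    pick(position, candidates): the user's choice, or None if no candidate is right.
    redescribe(position): new candidates from a CDR- or SSR-like recognizer.
    Returns the confirmed name, or None to signal hand-off to a human attendant.
    """
    confirmed = []
    failures = 0
    for position, candidates in enumerate(candidate_groups):
        choice = pick(position, candidates)
        while choice is None:
            failures += 1
            if failures > max_failures:
                return None  # too many failures: transfer to an attendant
            candidates = redescribe(position)
            choice = pick(position, candidates)
        confirmed.append(choice)
    return "".join(confirmed)
```

Passing the recognizers in as callables keeps the control flow independent of how the character groups are actually produced.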
As for implementation details, the full-name recognition network unit 11, the character description recognition unit 13 and the syllable description recognition unit 14 must each first extract feature parameters from the input voice. The voice signal is first pre-processed with appropriate steps (for example: signal amplification, normalization, pre-emphasis, multiplication by a Hamming window, and passing through a low-pass or high-pass filter) and then enters feature extraction. Feature extraction operates frame by frame: for each frame, a fast Fourier transform (Fast Fourier Transform, FFT) first converts the voice signal into a spectrum, and the Mel-frequency cepstral coefficients (Mel-Frequency Cepstrum Coefficients, MFCC) are then computed from that spectrum. After each recognition unit (11, 13, 14) has obtained the feature parameters of the voice signal, it compares them against its own language model to find the most likely character group for each character, sorts the group by frequency of occurrence, and sends the result to the character confirmation unit 12.
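The per-frame path from spectrum to Mel cepstrum can be sketched as below. This is a hedged illustration, not the patent's implementation: a naive DFT stands in for the FFT (it yields the same spectrum, only more slowly), and the filter count, coefficient count, the common 2595/700 mel formula and all function names are assumptions.

```python
import cmath
import math

def power_spectrum(frame):
    """Power spectrum for bins 0..N/2 via a naive DFT (an FFT gives the same values)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(num_filters, num_bins, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    # num_filters + 2 edge points, expressed as fractional spectrum bin positions
    edges = [mel_to_hz(top * i / (num_filters + 1)) / nyquist * (num_bins - 1)
             for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(num_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising slope
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    """Log mel-filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(spec), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_e))
            for c in range(num_coeffs)]
```

The small constant added before the logarithm only guards against empty filters on very short frames; a production front end would normally use a real FFT routine and longer frames.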
Refer to Fig. 2, a schematic diagram of the full-name recognition network unit of the present invention as applied in the automatic speech recognition input system for Chinese names. The full-name recognition network unit comprises a full-name recognition network engine and a name character-string language model. The name character-string language model is a language model trained on a basic vocabulary and known name data, and the full-name recognition network engine refers to it for comparison. The basic vocabulary can consist of 408 syllables, or 1300 toned syllables, or 408 syllables and 1300 toned syllables. After a voice is input and its feature parameters are extracted as described above, the full-name recognition network engine refers to the name character-string language model and derives the most likely character group for each character.
Refer to Fig. 3, a schematic diagram of the character description recognition unit of the present invention as applied in the automatic speech recognition input system for Chinese names. The character description recognition unit comprises a character description recognition engine and a character description language model. The character description language model is a language model trained on a basic vocabulary and on phrase data used to describe characters, and the character description recognition engine refers to it for comparison. The basic vocabulary can consist of 408 syllables, or 1300 toned syllables, or 408 syllables and 1300 toned syllables. After a voice of the character description type is input and its feature parameters are extracted as described above, the character description recognition engine refers to the character description language model and derives the most likely character group for the character.
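To make the character phrase type concrete, the sketch below shows one way a recognized description phrase could be resolved to a character once the speech has already been turned into text. It is an assumption-laden illustration: the "WORD的CHAR" phrase pattern (as in "明天的明", "the 明 of 明天") and the function name are hypothetical, and the patent's recognizer works on speech with a trained language model rather than on text.

```python
def parse_description(phrase, marker="的"):
    """Split a 'WORD的CHAR' description into (word, char).

    Returns None when the phrase does not follow that pattern, for example
    when the described character does not occur in the describing word.
    """
    word, sep, char = phrase.rpartition(marker)
    if not sep or len(char) != 1 or char not in word:
        return None
    return word, char
```

Requiring the character to appear in the describing word is a cheap consistency check that rejects misrecognitions such as "明天的民".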
Refer to Fig. 4, a schematic diagram of the syllable description recognition unit of the present invention as applied in the automatic speech recognition input system for Chinese names. The syllable description recognition unit comprises a syllable description recognition engine and a syllable description language model. The syllable description language model is a language model trained on a basic vocabulary and on phrase data used to describe syllables, and the syllable description recognition engine refers to it for comparison. The basic vocabulary can consist of 408 syllables, or 1300 toned syllables, or 408 syllables and 1300 toned syllables. After a voice of the syllable description type is input and its feature parameters are extracted as described above, the syllable description recognition engine refers to the syllable description language model and picks out the possible results of the syllable description. It then consults the syllable-to-character table to find the group of possible characters corresponding to that syllable.
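The final table lookup can be sketched as a dictionary keyed by syllable and tone. The table contents, the counts and the function name below are invented for illustration; only the idea of mapping a recognized syllable (toneless or toned) to a group of candidate characters comes from the text.

```python
# Hypothetical excerpt of a syllable-to-character table; all counts are invented.
SYLLABLE_TABLE = {
    ("wang", 2): {"王": 990, "亡": 5},
    ("xiao", 3): {"小": 900, "晓": 700, "筱": 50},
    ("ming", 2): {"明": 950, "名": 400, "铭": 120},
    ("min", 2): {"民": 800, "珉": 30},
    ("min", 3): {"敏": 300},
}

def candidates_for(syllable, tone=None, table=SYLLABLE_TABLE):
    """Return candidate characters for a syllable, most frequent first.

    With a tone, only that toned syllable is consulted; without one, the
    candidates of every tone of the base syllable are merged.
    """
    merged = {}
    for (syl, t), chars in table.items():
        if syl == syllable and (tone is None or t == tone):
            for ch, count in chars.items():
                merged[ch] = merged.get(ch, 0) + count
    return sorted(merged, key=lambda ch: -merged[ch])
```

Supporting both toned and toneless lookups mirrors the two vocabulary options in the text (408 syllables versus 1300 toned syllables).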
For example, suppose the user wants to look up "Wang Xiaoming" (王小明). When the user speaks "Wang Xiaoming" (Wang Xiao Ming), the full-name recognition network unit segments the name into "Wang", "Xiao" and "Ming" and recognizes each part. The most likely characters pronounced "wang", "xiao" and "ming" are then passed to the character confirmation unit, which helps the user confirm them. If the output unit is a screen, the screen first shows the most likely characters pronounced "wang", for example: 1. 王 (king) 2. 亡 (die). After the user selects 1, the screen shows the most likely characters pronounced "xiao", for example: 1. 小 (little) 2. 晓 (dawn) 3. 筱 (slender bamboo), for the user to confirm, and so on. The most likely characters are displayed in descending order of frequency of occurrence so that the user can confirm more quickly, and the frequencies can also be updated dynamically at regular intervals according to how often names are queried. If the output unit is a voice channel (for example, telephone assistance), the most likely characters can be described with commonly used phrases, for example: "2. 晓 as in 春晓 (spring dawn)", and the user can also confirm by telephone key-press.
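The frequency-ordered menu and its periodic update can be sketched with a small class. The class name, method names and counts are hypothetical; the sketch assumes that selections are buffered and folded into the ranking only when a refresh is triggered, which is one possible reading of updating the frequencies at regular intervals.

```python
from collections import Counter

class CandidateRanker:
    """Orders candidate characters by frequency and updates counts in batches."""

    def __init__(self, base_counts):
        self.counts = Counter(base_counts)   # frequencies used for ranking
        self.pending = Counter()             # selections since the last refresh

    def ranked(self, candidates):
        # sorted() is stable, so equally frequent characters keep input order
        return sorted(candidates, key=lambda ch: -self.counts[ch])

    def menu(self, candidates):
        """Numbered menu text such as '1. 王 2. 亡'."""
        return " ".join(f"{i}. {ch}"
                        for i, ch in enumerate(self.ranked(candidates), start=1))

    def record_selection(self, ch):
        self.pending[ch] += 1

    def refresh(self):
        """Fold buffered selections into the ranking (called at each interval)."""
        self.counts.update(self.pending)
        self.pending.clear()
```

Buffering selections keeps the menu order stable between refreshes, so a user who repeats a query within one interval sees the same numbering.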
If the full-name recognition network unit recognizes the name without error and the user has selected the correct characters, the name confirmed by the user is output so that the requested service can proceed (for example, querying a database for the telephone number of "Wang Xiaoming"). If a character is misrecognized, the system guides the user to describe it again in another way. For example, if the full-name recognition network unit recognizes "明" (ming, bright) as "民" (min, people), the user has no correct character to select. The user can then describe "明" with a character phrase type, for example "明天的明" (the 明 of 明天, "tomorrow"); the character description recognition unit recognizes this and sends the most likely character group to the character confirmation unit for the user to confirm further. The user can also describe "明" with a syllable phrase type, for example "ㄇㄧㄥ 明, second tone" (M-i-n-g); the syllable description recognition unit then recognizes this and sends the most likely character group to the character confirmation unit for the user to confirm further. If character confirmation is attempted more than four times, a human attendant takes over and provides the service the user needs.
In summary, the present invention provides an automatic speech recognition input method and system for Chinese names. Because the method applies "divide and conquer" to Chinese names, the system is no longer limited to a fixed number of Chinese names and can recognize an unlimited number of names while maintaining recognition accuracy. This not only saves substantial labor costs but also allows human resources to be deployed more appropriately.
Although the present invention has been described in detail through the above embodiments, those skilled in the art may make various modifications without departing from the scope of protection sought.

Claims (16)

1. An input method for automatic speech recognition of Chinese names, comprising the steps of:
(a) a user inputs a first voice describing a name to be recognized, the name comprising a plurality of characters;
(b) recognizing the first voice with a full-name recognition network unit to obtain a name recognition result;
(c) sending the name recognition result to a character confirmation unit;
(d) confirming each character of the name recognition result individually with the character confirmation unit;
(e) if step (d) confirms that every character is recognized correctly, outputting the confirmed name recognition result;
(f) if step (d) finds that one of the characters is misrecognized, the user inputs a second voice, in an input type, describing the misrecognized character;
(g) recognizing the second voice with a description recognition unit corresponding to the input type, and sending the recognition result of the second voice to the character confirmation unit; and
(h) repeating steps (d) to (g).
2. The method according to claim 1, wherein step (b) further comprises the steps of:
(b1) obtaining a feature parameter of the first speech input; and
(b2) recognizing the first speech input according to the feature parameter, using the full-name recognition network unit, wherein:
step (b1) further comprises the steps of:
(b11) preprocessing the first speech input; and
(b12) obtaining the feature parameter of the first speech input, wherein:
step (b11) further comprises the steps of:
amplifying the first speech signal;
normalizing the first speech signal;
applying pre-emphasis to the first speech signal;
multiplying the first speech input by a Hamming window; and
passing the first speech input through a low-pass filter or a high-pass filter, and wherein:
step (b12) further comprises the steps of:
performing a fast Fourier transform on the first speech input; and
computing the Mel cepstrum parameters of the first speech input.
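The feature-extraction chain of claim 2 (normalization, pre-emphasis, Hamming window, fast Fourier transform, Mel cepstrum) can be illustrated for a single frame as follows. This is a minimal sketch, not the patent's implementation: the pre-emphasis coefficient 0.97, the 2595/700 Mel-scale constants, the 20 filters, and the 12 coefficients are common textbook choices assumed here, and the amplification and separate low-pass/high-pass filtering steps are omitted. All function names are hypothetical.

```python
import cmath
import math
from typing import List


def normalize(x: List[float]) -> List[float]:
    peak = max(abs(v) for v in x) or 1.0  # avoid dividing by zero on silence
    return [v / peak for v in x]


def pre_emphasis(x: List[float], alpha: float = 0.97) -> List[float]:
    # First-order high-pass: y[n] = x[n] - alpha * x[n-1]
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def hamming(frame: List[float]) -> List[float]:
    n_samples = len(frame)
    return [frame[n] * (0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1)))
            for n in range(n_samples)]


def fft(x: List[float]) -> List[complex]:
    # Recursive radix-2 FFT; the input length must be a power of two.
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def hz_to_mel(f: float) -> float:
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m: float) -> float:
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters: int, n_fft: int, sample_rate: int) -> List[List[float]]:
    # Triangular filters spaced evenly on the Mel scale from 0 Hz to Nyquist.
    top = hz_to_mel(sample_rate / 2)
    points = [mel_to_hz(i * top / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * hz / sample_rate) for hz in points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def mel_cepstrum(frame: List[float], sample_rate: int,
                 n_filters: int = 20, n_ceps: int = 12) -> List[float]:
    n_fft = len(frame)
    spectrum = fft(hamming(pre_emphasis(normalize(frame))))
    power = [abs(c) ** 2 / n_fft for c in spectrum[:n_fft // 2 + 1]]
    energies = [sum(w * p for w, p in zip(filt, power))
                for filt in mel_filterbank(n_filters, n_fft, sample_rate)]
    log_e = [math.log(max(e, 1e-10)) for e in energies]  # floor avoids log(0)
    # DCT-II of the log filterbank energies gives the cepstral coefficients.
    return [sum(log_e[j] * math.cos(math.pi * i * (j + 0.5) / n_filters)
                for j in range(n_filters))
            for i in range(n_ceps)]
```

A full front end would split the signal into overlapping frames and apply `mel_cepstrum` to each; the same chain would apply to the second speech input described in claim 4.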
3. The method according to claim 1, wherein step (d) further comprises the steps of:
outputting, one by one, a plurality of character hypotheses for each character; and
the user selecting the correct character from the plurality of character hypotheses, wherein:
the output is performed by playing, as speech, a descriptive phrase for each character hypothesis, or by displaying each character hypothesis on a screen; and/or
the user makes the selection by speech input or by key input.
4. The method according to claim 1, wherein step (g) further comprises the steps of:
(g1) obtaining a feature parameter of the second speech input; and
(g2) recognizing the second speech input according to the feature parameter, using the description recognition unit, wherein:
step (g1) further comprises the steps of:
(g11) preprocessing the second speech input; and
(g12) obtaining the feature parameter of the second speech input, wherein step (g11) further comprises the steps of:
amplifying the second speech signal;
normalizing the second speech signal;
applying pre-emphasis to the second speech signal;
multiplying the second speech input by a Hamming window; and
passing the second speech input through a low-pass filter or a high-pass filter, and wherein step (g12) further comprises the steps of:
performing a fast Fourier transform on the second speech input; and
computing the Mel cepstrum parameters of the second speech input.
5. The method according to claim 1, wherein in step (f) the user describes the misrecognized character in a character-phrase type, and the description recognition unit corresponding to the character-phrase type is a character description recognition unit.
6. The method according to claim 1, wherein in step (f) the user describes the misrecognized character in a syllable-phrase type, and the description recognition unit corresponding to the syllable-phrase type is a syllable description recognition unit.
7. An input system for automatic speech recognition of Chinese names, comprising:
a speech input device for a user to input a speech input, the speech input describing a person's name to be recognized, the name comprising a plurality of characters;
a full-name recognition network unit for recognizing the speech input to obtain a name recognition result;
a character confirmation unit for confirming whether each character of the name recognition result is correct;
a character description recognition unit for recognizing each character when the user describes that character in a character-phrase type;
a syllable description recognition unit for recognizing each character when the user describes that character in a syllable-phrase type; and
an output unit for outputting the confirmed name recognition result.
8. The system according to claim 7, wherein:
the full-name recognition network unit further comprises a full-name recognition network engine and a name character-string language model; and
the name character-string language model is a language model trained on a basic vocabulary and known personal-name data.
9. The system according to claim 8, wherein:
the basic vocabulary consists of 408 syllables, of 1300 toned syllables, or of 408 syllables together with 1300 toned syllables.
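Claim 8 states only that the name language model is trained on a basic vocabulary and known personal-name data. As one hedged illustration of what such training could look like, the sketch below builds a bigram model over syllable sequences with add-one smoothing. The bigram order, the smoothing, the class name `BigramNameModel`, and the toy pinyin syllables are all assumptions, not details from the patent.

```python
import math
from collections import Counter
from typing import Iterable, Sequence


class BigramNameModel:
    """Hypothetical bigram model over the units of a basic vocabulary."""

    START, END = "<s>", "</s>"

    def __init__(self, vocabulary: Iterable[str],
                 names: Iterable[Sequence[str]]) -> None:
        self.vocab = set(vocabulary) | {self.START, self.END}
        self.context_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        for name in names:
            units = [self.START, *name, self.END]
            for prev, cur in zip(units, units[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def log_prob(self, name: Sequence[str]) -> float:
        # Add-one smoothing keeps unseen syllable pairs from scoring -infinity.
        units = [self.START, *name, self.END]
        size = len(self.vocab)
        return sum(
            math.log((self.bigram_counts[(prev, cur)] + 1)
                     / (self.context_counts[prev] + size))
            for prev, cur in zip(units, units[1:])
        )
```

A recognition engine could combine such a score with acoustic scores to rank candidate names; sequences resembling the training names score higher than unseen orderings of the same syllables.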
10. The system according to claim 7, wherein:
the character description recognition unit further comprises a character description recognition engine and a character description language model; and
the character description language model is a language model trained on a basic vocabulary and phrase data that use character descriptions.
11. The system according to claim 10, wherein:
the basic vocabulary consists of 408 syllables, of 1300 toned syllables, or of 408 syllables together with 1300 toned syllables.
12. The system according to claim 7, wherein:
the syllable description recognition unit further comprises a syllable description recognition engine, a syllable description language model, and a syllable-to-character correspondence table; and
the syllable description language model is a language model trained on a basic vocabulary and phrase data that use syllable descriptions.
13. The system according to claim 12, wherein:
the basic vocabulary consists of 408 syllables, of 1300 toned syllables, or of 408 syllables together with 1300 toned syllables.
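The syllable-to-character correspondence table of claim 12 can be pictured as a mapping from a recognized syllable to the characters pronounced that way, which could then be offered to the user as character hypotheses. The sketch below is illustrative only: the table contents, the numeric tone suffix, and the function names are assumptions. A toned query returns the characters for that toned syllable, while a toneless query collects the characters across all tones.

```python
from typing import Dict, List

# Hypothetical toy table: toned syllable -> candidate characters.
SYLLABLE_TO_CHARACTERS: Dict[str, List[str]] = {
    "ming2": ["明", "名", "銘"],
    "ming4": ["命"],
    "wang2": ["王"],
}


def strip_tone(syllable: str) -> str:
    # Assumes tones are written as a trailing digit, e.g. "ming2".
    return syllable.rstrip("12345")


def candidate_characters(syllable: str,
                         table: Dict[str, List[str]] = SYLLABLE_TO_CHARACTERS
                         ) -> List[str]:
    # Exact match on a toned syllable.
    if syllable in table:
        return list(table[syllable])
    # Otherwise treat the query as toneless and gather every tone's characters.
    candidates: List[str] = []
    for key, characters in table.items():
        if strip_tone(key) == syllable:
            candidates.extend(characters)
    return candidates
```

For example, `candidate_characters("ming2")` yields the three characters listed under that toned syllable, and an unknown syllable yields an empty list.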
14. A full-name recognition network unit for use in an automatic speech recognition input system, wherein when a user inputs a speech input describing a Chinese name, the full-name recognition network unit recognizes the name, the full-name recognition network unit comprising:
a name recognition network engine; and
a name character-string language model,
wherein the name character-string language model is a language model trained on a basic vocabulary and known personal-name data,
wherein the basic vocabulary is composed of syllables,
wherein the name recognition network engine refers to the name character-string language model to recognize the speech input.
15. A character description recognition unit for use in an automatic speech recognition input system, wherein when a user inputs a speech input describing a character in a character-phrase type, the character description recognition unit recognizes the character, the character description recognition unit comprising:
a character description recognition engine; and
a character description language model,
wherein the character description language model is a language model trained on a basic vocabulary and phrase data that use character descriptions,
wherein the basic vocabulary is composed of syllables,
wherein the character description recognition engine refers to the character description language model to recognize the speech input.
16. A syllable description recognition unit for use in an automatic speech recognition input system, wherein when a user inputs a speech input describing a character in a syllable-phrase type, the syllable description recognition unit recognizes the character, the syllable description recognition unit comprising:
a syllable description recognition engine;
a syllable description language model; and
a syllable-to-character correspondence table,
wherein the syllable description language model is a language model trained on a basic vocabulary and phrase data that use syllable descriptions,
wherein the basic vocabulary is composed of syllables,
wherein the syllable description recognition engine refers to the syllable description language model to recognize the speech input.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2005100545841A CN1835077B (en) 2005-03-14 2005-03-14 Automatic speech recognizing input method and system for Chinese names

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2005100545841A CN1835077B (en) 2005-03-14 2005-03-14 Automatic speech recognizing input method and system for Chinese names

Publications (2)

Publication Number Publication Date
CN1835077A CN1835077A (en) 2006-09-20
CN1835077B true CN1835077B (en) 2011-05-11

Family

ID=37002792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2005100545841A Expired - Fee Related CN1835077B (en) 2005-03-14 2005-03-14 Automatic speech recognizing input method and system for Chinese names

Country Status (1)

Country Link
CN (1) CN1835077B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102722525A (en) * 2012-05-15 2012-10-10 北京百度网讯科技有限公司 Methods and systems for establishing language model of address book names and searching voice
CN105095180A (en) * 2014-05-14 2015-11-25 中兴通讯股份有限公司 Chinese name broadcasting method and device
CN107945802A (en) * 2017-10-23 2018-04-20 北京云知声信息技术有限公司 Voice recognition result processing method and processing device
CN108962232B (en) * 2018-07-16 2021-01-01 上海小蚁科技有限公司 Voice recognition method and device, storage medium and terminal

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1136356A (en) * 1993-11-29 1996-11-20 迈克尔·T·罗西德斯 Input system for text retrieval
JP2839426B2 (en) * 1992-04-30 1998-12-16 インダストリアル テクノロジー リサーチ インスティチュート Morphological analysis method and text processing system
US6272464B1 (en) * 2000-03-27 2001-08-07 Lucent Technologies Inc. Method and apparatus for assembling a prediction list of name pronunciation variations for use during speech recognition
US6311152B1 (en) * 1999-04-08 2001-10-30 Kent Ridge Digital Labs System for chinese tokenization and named entity recognition
CN1352450A (en) * 2000-11-15 2002-06-05 中国科学院自动化研究所 Voice recognition method for Chinese personal name place name and unit name
CN1464431A (en) * 2002-06-11 2003-12-31 富士施乐株式会社 System for distinguishing names in Asian language writing system

Also Published As

Publication number Publication date
CN1835077A (en) 2006-09-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110511

Termination date: 20180314
