WO2017041510A1 - Voice output method and device - Google Patents


Info

Publication number
WO2017041510A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice input
user
input content
voice
content
Prior art date
Application number
PCT/CN2016/082427
Other languages
French (fr)
Chinese (zh)
Inventor
王天一
刘升平
Original Assignee
北京云知声信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京云知声信息技术有限公司
Priority to CN201680002958.1A (patent CN107077845B)
Publication of WO2017041510A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • The present invention relates to the field of information processing technologies, and in particular, to a voice output method and apparatus.
  • Voice input is an input method that converts people's spoken content into text through voice recognition.
  • With the popularity of smart terminals in people's lives, more and more smart terminals provide voice services. For example, users can ask questions through voice input.
  • The voice software on the smart terminal analyzes the user's voice and answers the user's question by voice, thus providing a help service to the user.
  • This method brings great convenience to the user, who no longer needs to obtain an answer through a cumbersome online query.
  • However, current voice service software has only one answer mode: when different users ask the same question (that is, the main content of the question is the same), the same help information is output.
  • Different users have different technical levels or professional abilities, so they may need different help information or different ways of answering. The above method therefore cannot distinguish users' different technical needs when providing voice assistance, and is not targeted.
  • Embodiments of the present invention provide a voice output method and apparatus.
  • the technical solution is as follows:
  • a voice output method including the following steps:
  • the voice output content matching the recognition degree is acquired and output.
  • With this technical solution, voice output content that matches the user's awareness of the category to which the input voice input content belongs can be selected and output, so that the voice output content better meets the user's needs. This provides the user with a more personalized voice output function and improves the accuracy of voice output, enabling the user to obtain the maximum amount of information from the voice output content and improving the user experience.
  • the determining, according to the voice input content, the user's awareness of a category to which the voice input content belongs includes:
  • When the voice input content of the user is received for the first time, it is determined that the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
  • In this way, the matching voice output content is selected and output for the user, so that the voice output content better meets the user's needs. This provides the user with a more personalized voice output function and improves the accuracy of the voice output, so that the user can obtain the maximum amount of information from the voice output content, thereby improving the user experience.
  • the method further includes:
  • the duration of use being a duration between receipt of the voice input content and output of the voice output content.
  • In this way, the basis for determining the user's awareness is richer, so the user's awareness is determined more accurately and more accurate, personalized voice output content is output for the user.
  • the determining, according to the voice input content, the user's awareness of a category to which the voice input content belongs includes:
  • In this way, the basis for determining the user's awareness is richer, so the user's awareness is determined more accurately and more accurate, personalized voice output content is output for the user.
  • the determining, according to the voice input content, the user's awareness of a category to which the voice input content belongs includes:
  • history input record information corresponding to the user, where the history input record information includes at least one of historical accumulated use time, historical cumulative input times, and historical input frequency;
  • the user's recognition is determined according to the historical input record information corresponding to the user, so that the terminal can more accurately determine the user's recognition, thereby outputting more accurate and personalized voice output content for the user.
  • the determining, according to the voice input content, the user's awareness of a category to which the voice input content belongs includes:
  • The user's awareness is determined according to the degree of matching between the keywords in the voice input content and the preset keywords, so that the determination of the user's awareness is more accurate and personalized, thereby outputting more accurate and personalized voice output content for the user.
  • the determining, according to the voice input content, the user's awareness of a category to which the voice input content belongs includes:
  • the statement structure type including a professional statement structure type or a non-professional statement structure type
  • the user's recognition is determined according to the sentence structure type of the voice input content, so that the determination of the user's recognition is more accurate and personalized, thereby outputting a more accurate and personalized voice output content for the user.
  • the determining, according to the voice input content, the user's awareness of a category to which the voice input content belongs includes:
  • The user's awareness is determined according to the degree of association between the two adjacently received voice input contents of the same user, so that the determination of the user's awareness is more accurate and personalized, thereby outputting more accurate and personalized voice output content for the user.
  • the determining, according to the voice input content, the user's awareness of a category to which the voice input content belongs includes:
  • the voice input parameters comprise: the voiceprint information of the user, the time interval between two adjacent voice input contents of the same user, the historical input record information corresponding to the user, the degree of matching between the keywords in the voice input content and the preset keywords, the statement structure type of the voice input content, and the degree of association between two adjacent voice input contents of the same user;
  • the user's awareness of the category to which the voice input content belongs is calculated according to the weight of each of the preset voice input parameters.
  • The user's awareness of the category to which the voice input content belongs is calculated according to the different weights of multiple voice input parameters of the voice input content, so that the determination of the user's awareness is more accurate and personalized, thereby outputting more accurate and personalized voice output content for the user.
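The weighted calculation described above can be sketched as follows. This is only an illustration under stated assumptions: each voice input parameter is assumed to have already been converted to a score between 0 and 1, and the parameter names, the weight values, the renormalization step, and the function `weighted_awareness` are all hypothetical; the source does not specify them.

```python
# Hypothetical preset weights for the voice input parameters (illustrative values only).
PRESET_WEIGHTS = {
    "time_interval": 0.2,
    "history": 0.3,
    "keyword_match": 0.3,
    "statement_structure": 0.2,
}

MIN_AWARENESS = 0.0  # assumed value of the preset minimum awareness


def weighted_awareness(scores, weights=PRESET_WEIGHTS):
    """Combine per-parameter scores (each in [0, 1]) into one awareness value.

    Only parameters that could be determined (present in `scores`) are used.
    Their weights are renormalized to sum to 1, which is an assumption of
    this sketch. If no parameter is available, the minimum awareness is used.
    """
    used = {name: score for name, score in scores.items() if name in weights}
    if not used:
        return MIN_AWARENESS
    total_weight = sum(weights[name] for name in used)
    return sum(weights[name] * score for name, score in used.items()) / total_weight
```

Returning the minimum awareness when no parameter can be determined mirrors the fallback the source describes for that case.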
  • the determining, according to the voice input content, the user's awareness of a category to which the voice input content belongs includes:
  • In this way, the voice output content matching the voice input content is output, thereby providing the user with a more accurate and personalized voice output function, so that the user can get more useful information from the voice output content, improving the user experience.
  • the obtaining, and outputting, from the at least one voice output content corresponding to the voice input content, the voice output content that matches the recognition includes:
  • the voice output content is output.
  • In this way, the voice output content that matches the user's awareness is selected and output for the user, so that the voice output content better meets the user's needs and the accuracy of the voice output is improved. The user can thus obtain the maximum amount of information from the voice output content, thereby improving the user experience.
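A minimal sketch of selecting output by awareness level, assuming awareness is a number between 0 and 1 that is divided into three levels. The thresholds, level names, and the "beginner" and "expert" answers are hypothetical; only the question and the "intermediate" answer are adapted from the air conditioner example in the description.

```python
import bisect

# Hypothetical correspondence between awareness and awareness level.
LEVEL_THRESHOLDS = [0.33, 0.66]  # boundaries between the three levels
LEVEL_NAMES = ["beginner", "intermediate", "expert"]

# Hypothetical correspondence between awareness level and voice output content.
OUTPUTS = {
    "how to set the air conditioner temperature": {
        "beginner": "Press the mode button to enter the temperature adjustment mode, "
                    "then press up or down to change the temperature.",
        "intermediate": "First enter the temperature adjustment mode, then change the temperature.",
        "expert": "Change the temperature in the temperature adjustment mode.",
    }
}


def awareness_level(awareness):
    """Map an awareness value to its level using the threshold table."""
    return LEVEL_NAMES[bisect.bisect_right(LEVEL_THRESHOLDS, awareness)]


def select_output(voice_input, awareness):
    """Return the voice output content stored for the user's awareness level."""
    return OUTPUTS[voice_input][awareness_level(awareness)]
```

Keeping the two correspondences in separate tables means the level boundaries can be tuned without touching the stored answers.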
  • the method further includes:
  • the history input record information is updated according to an input time and a usage duration of the voice input content.
  • the user's recognition can be determined according to the accurate history input record, thereby outputting more accurate voice output content for the user.
  • the method further includes:
  • Determining the user's awareness of the category to which the voice input content belongs according to the voice input content including:
  • the user's voiceprint information is used to query the user's awareness of the category to which the voice input content belongs.
  • a voice output device including:
  • a receiving module configured to receive a voice input content input by a user
  • a determining module configured to determine, according to the voice input content, the user's awareness of a category to which the voice input content belongs, the awareness being the degree of the user's professional knowledge of the category;
  • an output module configured to acquire and output the voice output content that matches the recognition degree from the at least one voice output content corresponding to the voice input content.
  • the determining module comprises:
  • a first identification submodule configured to identify voiceprint information of the user
  • a first determining sub-module configured to determine, according to the voiceprint information, whether to receive the voice input content of the user for the first time
  • the second determining submodule is configured to determine, when the voice input content of the user is received for the first time, the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
  • the apparatus further includes:
  • a recording module configured to record an input time and a usage duration of the voice input content, where the usage duration is a duration between receiving the voice input content and outputting the voice output content.
  • the determining module comprises:
  • a second identification submodule configured to identify voiceprint information of the user
  • a second determining sub-module configured to determine, according to the voiceprint information of the user, whether the voice input content received twice adjacently is input by the same user;
  • a first calculation sub-module configured to, when the two adjacently received voice input contents are input by the same user, calculate the time interval between the two adjacently received voice input contents;
  • a third determining submodule configured to determine, according to the time interval, the user's awareness of the category to which the voice input content belongs; wherein the longer the time interval, the lower the awareness.
  • the determining module comprises:
  • a third identification submodule configured to identify voiceprint information of the user
  • a first obtaining sub-module configured to acquire historical input record information corresponding to the user according to the voiceprint information of the user, where the historical input record information includes at least one of a historical accumulated use time, a historical cumulative number of inputs, and a historical input frequency;
  • a fourth determining submodule configured to determine, according to the historical input record information, the user's awareness of a category to which the voice input content belongs; wherein the longer the historical accumulated use time, the higher the awareness; the greater the historical cumulative number of inputs, the higher the awareness; the higher the historical input frequency, the higher the awareness.
  • the determining module comprises:
  • a fifth determining submodule configured to determine a matching degree between the keyword in the voice input content and the preset keyword
  • a sixth determining submodule configured to determine, according to the degree of matching between the keywords in the voice input content and the preset keywords, the user's awareness of a category to which the voice input content belongs; wherein the higher the degree of matching between the keywords in the voice input content and the professional keywords among the preset keywords, the higher the awareness; the higher the degree of matching between the keywords in the voice input content and the non-professional keywords among the preset keywords, the lower the awareness.
  • the determining module comprises:
  • a seventh determining submodule configured to determine a statement structure type of the voice input content, where the statement structure type includes a professional statement structure type or a non-professional statement structure type;
  • an eighth determining submodule configured to determine, according to the statement structure type of the voice input content, the user's awareness of the category to which the voice input content belongs; wherein the user's awareness of the category of voice input content of the professional statement structure type is higher than the user's awareness of the category of voice input content of the non-professional statement structure type.
  • the determining module comprises:
  • a ninth determining sub-module configured to, when the two adjacently received voice input contents are input by the same user, determine the degree of association between the two adjacently received voice input contents according to the keywords in them;
  • a tenth determining submodule configured to determine, according to the degree of association between the two adjacently received voice input contents, the user's awareness of the category to which the voice input content belongs; wherein the higher the degree of association, the lower the awareness.
  • the determining module comprises:
  • an eleventh determining submodule configured to determine, according to the voice input content, at least two voice input parameters of the voice input content, where the voice input parameters comprise: the voiceprint information of the user, the time interval between two adjacent voice input contents of the same user, the historical input record information corresponding to the user, the degree of matching between the keywords in the voice input content and the preset keywords, the statement structure type of the voice input content, and the degree of association between two adjacent voice input contents of the same user;
  • a calculation submodule configured to calculate, according to a preset weight of each of the voice input parameters, the user's awareness of the category to which the voice input content belongs.
  • the determining module comprises:
  • the twelfth determining submodule is configured to determine, when the voice input parameter of the voice input content cannot be determined, the recognition of the category of the voice input content by the user as a preset minimum awareness.
  • the output module comprises:
  • a thirteenth determining submodule configured to determine the awareness level corresponding to the awareness according to a correspondence between awareness and awareness levels;
  • a second obtaining submodule configured to acquire, according to a correspondence between awareness levels and voice output content, the voice output content corresponding to the awareness level;
  • an output submodule configured to output the voice output content.
  • the apparatus further includes:
  • an update module configured to update the historical input record information according to an input time and a usage duration of the voice input content.
  • the apparatus further includes:
  • a storage module configured to store the user's awareness of the category to which the voice input content belongs;
  • the determining module includes:
  • a fourth identification submodule configured to identify voiceprint information of the user
  • the query sub-module is configured to query, according to the voiceprint information of the user, the user's awareness of the category to which the voice input content belongs.
  • The device can output voice output content that matches the user's awareness of the category to which the input voice input content belongs, so that the voice output content better meets the user's needs. This provides the user with a more personalized voice output function and improves the accuracy of the voice output, enabling the user to obtain the maximum amount of information from the voice output content, thereby improving the user experience.
  • a voice output device comprising:
  • a memory for storing the processor executable instructions
  • processor is configured to:
  • the voice output content matching the recognition degree is acquired and output.
  • the above processor is also configured to:
  • When the voice input content of the user is received for the first time, it is determined that the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
  • the above processor is also configured to:
  • the duration of use being a duration between receipt of the voice input content and output of the voice output content.
  • the above processor is also configured to:
  • the above processor is also configured to:
  • history input record information corresponding to the user, where the history input record information includes at least one of historical accumulated use time, historical cumulative input times, and historical input frequency;
  • the above processor is also configured to:
  • the above processor is also configured to:
  • the statement structure type including a professional statement structure type or a non-professional statement structure type
  • the above processor is also configured to:
  • the above processor is also configured to:
  • the voice input parameters comprise: the voiceprint information of the user, the time interval between two adjacent voice input contents of the same user, the historical input record information corresponding to the user, the degree of matching between the keywords in the voice input content and the preset keywords, the statement structure type of the voice input content, and the degree of association between two adjacent voice input contents of the same user;
  • the above processor is also configured to:
  • the above processor is also configured to:
  • the voice output content is output.
  • the above processor is also configured to:
  • the history input record information is updated according to an input time and a usage duration of the voice input content.
  • the above processor is also configured to:
  • Determining the user's awareness of the category to which the voice input content belongs according to the voice input content including:
  • the user's voiceprint information is used to query the user's awareness of the category to which the voice input content belongs.
  • a non-transitory computer readable recording medium having recorded thereon a computer program, the program comprising instructions for performing the method of the first aspect of the embodiment of the present invention.
  • a computer program comprising: instructions for performing the method of the first aspect of the embodiment of the invention when the program is executed by a computer.
  • FIG. 1 is a flowchart of a voice output method according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of step S12 in a voice output method according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of step S12 in a voice output method according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of step S12 in a voice output method according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of step S12 in a voice output method according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of step S13 in a voice output method according to an embodiment of the present invention.
  • FIG. 7 is a block diagram of a voice output apparatus according to an embodiment of the present invention.
  • FIG. 8 is a block diagram of a determining module in a voice output device according to an embodiment of the present invention.
  • FIG. 9 is a block diagram of a determining module in a voice output device according to an embodiment of the present invention.
  • FIG. 10 is a block diagram of a determining module in a voice output device according to an embodiment of the present invention.
  • FIG. 11 is a block diagram of a determining module in a voice output device according to an embodiment of the present invention.
  • FIG. 12 is a block diagram of an output module in a voice output device according to an embodiment of the present invention.
  • FIG. 13 is a block diagram of a voice output apparatus according to an embodiment of the present invention.
  • FIG. 14 is a block diagram of an apparatus for performing a voice output method according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a voice output method according to an embodiment of the present invention.
  • The method is used in a terminal, which may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like. The method includes the following steps S11-S13:
  • Step S11: receive the voice input content input by the user.
  • The user enters the voice input content by speaking.
  • Step S12: determine, according to the voice input content, the user's awareness of the category to which the voice input content belongs; the awareness is the degree of the user's professional knowledge of that category.
  • For example, if the voice input content belongs to the air conditioning category, the user's awareness of the category is the degree of the user's professional knowledge of air conditioning; if the user inputs the voice input content "What kind of medicine is aspirin", the user's awareness of the category is the degree of the user's professional knowledge of medicine.
  • The terminal can determine the category to which the voice input content belongs by extracting keywords in the voice input content.
  • Step S13: acquire and output, from the at least one voice output content corresponding to the voice input content, the voice output content that matches the awareness.
  • In this embodiment, the voice output content matching the user's awareness is selected and output, so that the voice output content better meets the user's needs. This provides the user with a more personalized voice output function and improves the accuracy of voice output, enabling the user to obtain the maximum amount of information from the voice output content, thereby improving the user experience.
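The keyword-based category determination used in step S12 can be sketched as follows. This is a minimal illustration, assuming the voice input has already been converted to text; the keyword table and the function `determine_category` are hypothetical, and a real terminal would likely use a more robust keyword extractor than substring matching.

```python
# Hypothetical table mapping each category to keywords that indicate it.
CATEGORY_KEYWORDS = {
    "air conditioning": ["air conditioner", "temperature"],
    "medicine": ["aspirin", "medicine"],
}


def determine_category(text):
    """Return the category with the most keyword hits in the text, or None."""
    text = text.lower()
    best_category, best_hits = None, 0
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = sum(1 for keyword in keywords if keyword in text)
        if hits > best_hits:
            best_category, best_hits = category, hits
    return best_category
```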
  • step S12 the user's awareness of the category to which the voice input content belongs may be determined in various ways. Firstly, according to the voice input content, the voice input parameter of the voice input content is determined, and then the user's recognition of the category of the voice input content is determined according to the voice input parameter.
  • the method for determining the degree of recognition may be different according to different voice input parameters, and the voice input parameter may include the voiceprint information of the user, the time interval between the voice input contents of the two adjacent inputs of the same user, and the user.
  • The implementation of step S12 is described below through different embodiments.
  • step S12 may be implemented as the following steps S21-S23:
  • Step S21: identify the voiceprint information of the user.
  • Step S22: determine, based on the voiceprint information, whether the voice input content of the user is received for the first time.
  • Step S23: when the voice input content of the user is received for the first time, determine that the user's awareness of the category to which the voice input content belongs is the preset minimum awareness.
  • The voiceprint information corresponding to different users is stored in the terminal.
  • If the terminal can find the user's voiceprint information in the pre-stored voiceprint information, this is not the first time the user's voice input content has been received; if the terminal fails to find the user's voiceprint information in the pre-stored voiceprint information, the voice input content of the user is received for the first time.
  • When it is not the first time, the terminal continues to determine other voice input parameters according to the voice input content, and performs step S12 according to those parameters.
  • A correspondence between awareness and voice output content is stored in advance, and it includes the voice output content corresponding to the preset minimum awareness.
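Steps S21-S23 can be sketched as a lookup keyed by voiceprint. This is a sketch under stated assumptions: voiceprint recognition itself is out of scope, so each user's voiceprint is assumed to have been reduced to an identifier string; the dictionary `stored_awareness`, the function `check_first_time`, and the value of `MIN_AWARENESS` are hypothetical.

```python
MIN_AWARENESS = 0.0  # assumed value of the preset minimum awareness


def check_first_time(voiceprint_id, stored_awareness):
    """Return (is_first_time, awareness) for the given voiceprint identifier.

    `stored_awareness` maps known voiceprint identifiers to the awareness
    stored for that user. An unknown voiceprint means the user's voice input
    is received for the first time, so the minimum awareness is assigned
    and stored for later queries.
    """
    if voiceprint_id not in stored_awareness:
        stored_awareness[voiceprint_id] = MIN_AWARENESS
        return True, MIN_AWARENESS
    return False, stored_awareness[voiceprint_id]
```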
  • step S12 can be implemented as the following steps S31-S34:
  • Step S31: identify the voiceprint information of the user.
  • Step S32: determine, according to the voiceprint information of the user, whether the two adjacently received voice input contents are input by the same user.
  • Step S33: when the two adjacently received voice input contents are input by the same user, calculate the time interval between them according to the input time and the usage duration of the two adjacently received voice input contents.
  • Step S34: determine, according to the time interval, the user's awareness of the category to which the voice input content belongs; the longer the time interval, the lower the awareness.
  • The time interval between two adjacently received voice input contents can reflect the user's response time to the previous voice output content output by the terminal. This response time can also be characterized by the time interval between the previous output of the voice output content and the receipt of the current voice input content.
  • For example, the voice input content received by the terminal last time is "how to set the air conditioner temperature", and for this voice input content the terminal outputs the corresponding voice output content "first enter the temperature adjustment mode, then change the temperature";
  • the voice input content received this time is "how to enter the temperature adjustment mode".
  • In this case, the time interval between receiving the voice input content "how to set the air conditioner temperature" and receiving the voice input content "how to enter the temperature adjustment mode" can be used to characterize the user's response time to the previous voice output content "first enter the temperature adjustment mode, then change the temperature", thereby determining the user's awareness of the category to which the voice input content belongs.
  • Alternatively, the time interval between outputting the voice output content "first enter the temperature adjustment mode, then change the temperature" and receiving the voice input content "how to enter the temperature adjustment mode" can be used to characterize this response time. The longer the time interval, the longer the user's response time to the previous voice output content, and the lower the awareness.
  • In addition, a time interval may be preset. When the two adjacently received voice input contents are input by the same user and the time interval between them exceeds the preset time interval, the terminal may directly determine that the user's awareness of the category to which the voice input content belongs is the preset minimum awareness, and acquire the voice output content that matches the preset minimum awareness for output.
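The interval-based determination in steps S33-S34 can be sketched as below. The source only states that a longer interval means lower awareness and that exceeding a preset interval yields the minimum awareness; the linear mapping, the 60-second preset, and the reading of the interval as "current input time minus the end of the previous interaction" are assumptions of this sketch.

```python
PRESET_INTERVAL = 60.0  # seconds; hypothetical preset time interval
MIN_AWARENESS = 0.0     # assumed preset minimum awareness
MAX_AWARENESS = 1.0     # assumed upper bound of awareness


def response_interval(prev_input_time, prev_usage_duration, current_input_time):
    """Time between the previous voice output and the current voice input.

    The previous output is assumed to finish at
    prev_input_time + prev_usage_duration (all values in seconds).
    """
    return max(0.0, current_input_time - (prev_input_time + prev_usage_duration))


def awareness_from_interval(interval):
    """Map an interval to awareness: the longer the interval, the lower the awareness."""
    if interval >= PRESET_INTERVAL:
        return MIN_AWARENESS
    span = MAX_AWARENESS - MIN_AWARENESS
    return MAX_AWARENESS - span * interval / PRESET_INTERVAL
```

Any monotonically decreasing function would satisfy the stated relationship; the linear form is chosen here only for simplicity.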
  • step S12 can be implemented as the following steps S41-S43:
  • Step S41: identify the voiceprint information of the user.
  • Step S42: acquire the historical input record information corresponding to the user according to the voiceprint information of the user; the historical input record information includes at least one of the historical accumulated use time, the historical cumulative number of inputs, and the historical input frequency.
  • Step S43: determine the user's awareness of the category to which the voice input content belongs according to the historical input record information; the longer the historical accumulated use time, the higher the awareness; the greater the historical cumulative number of inputs, the higher the awareness; the higher the historical input frequency, the higher the awareness.
  • each time the terminal receives the voice input content input by the user the input time and the usage duration of the voice input content are recorded, and the usage duration is the length of time between receiving the voice input content and outputting the voice output content.
  • the terminal can count the historical input record information corresponding to the user according to the recorded input time and the duration of use, wherein the historical accumulated use time is the sum of the used durations recorded each time.
  • the above method further includes the step of: updating the history input record information according to the input time and the usage duration of the voice input content.
  • Because the terminal determines the user's awareness of the category to which the voice input content belongs according to the historical input record information corresponding to the user, and this information is rich and accurate, more accurate and personalized voice output content can be selected and output for the user.
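The record keeping and the history-based determination in steps S41-S43 can be sketched as follows. The class `HistoryRecord`, the definition of input frequency as inputs per day, the saturation caps, and the equal averaging of the three scores are hypothetical; the source only requires that awareness grows with accumulated use time, number of inputs, and input frequency.

```python
from dataclasses import dataclass, field


@dataclass
class HistoryRecord:
    """Per-user history; each entry is (input_time, usage_duration) in seconds."""
    entries: list = field(default_factory=list)

    def update(self, input_time, usage_duration):
        """Record one voice input and the duration until its output."""
        self.entries.append((input_time, usage_duration))

    @property
    def accumulated_use_time(self):
        """Sum of the usage durations recorded each time."""
        return sum(duration for _, duration in self.entries)

    @property
    def input_count(self):
        return len(self.entries)

    def input_frequency(self, now):
        """Inputs per day since the first recorded input (assumed definition)."""
        if not self.entries:
            return 0.0
        elapsed_days = max((now - self.entries[0][0]) / 86400.0, 1.0)
        return self.input_count / elapsed_days


def awareness_from_history(record, now):
    """Average three saturating scores; the caps below are illustrative only."""
    time_score = min(record.accumulated_use_time / 3600.0, 1.0)
    count_score = min(record.input_count / 50.0, 1.0)
    frequency_score = min(record.input_frequency(now) / 10.0, 1.0)
    return (time_score + count_score + frequency_score) / 3.0
```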
  • step S12 may be implemented as the following steps S51-S53:
  • step S51 keywords in the voice input content are extracted.
  • Step S52 determining a degree of matching between the keyword in the voice input content and the preset keyword.
  • Step S53 determining, according to the matching degree of the keyword in the voice input content and the preset keyword, the user's recognition of the category of the voice input content; wherein, the keyword in the voice input content and the professional in the preset keyword The higher the matching degree of the keyword, the higher the recognition degree; the higher the matching degree between the keyword in the voice input content and the non-professional keyword in the preset keyword, the lower the recognition degree.
  • the preset keywords pre-stored in the terminal include two types: professional keywords and non-professional keywords. The terminal determines both the matching degree with the professional keywords and the matching degree with the non-professional keywords.
  • For example, the professional keywords include “set path”, and the non-professional keywords include “how to use”. If the voice input content received by the terminal is “set path of ...”, it can be determined that the keywords in the voice input content have a high matching degree with the professional keywords, so the user's recognition of the category to which the voice input content belongs is high.
  • If the voice input content received by the terminal is “how to use ...”, it can be determined that the keywords in the voice input content have a high matching degree with the non-professional keywords, so the user's recognition of the category to which the voice input content belongs is low.
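Steps S51-S53 can be sketched as below. The keyword lists reuse the two examples from the text; the matching measure (fraction of preset keywords found as substrings), the neutral `base` value, and the scoring formula are assumptions made for illustration only.

```python
# Preset keywords taken from the examples in the text.
PROFESSIONAL_KEYWORDS = ["set path"]
NON_PROFESSIONAL_KEYWORDS = ["how to use"]


def match_degree(text: str, keywords: list) -> float:
    """Fraction of preset keywords that occur in the text (assumed measure)."""
    if not keywords:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for kw in keywords if kw in lowered)
    return hits / len(keywords)


def recognition_from_keywords(text: str, base: float = 0.5) -> float:
    """Raise the score for professional matches, lower it for non-professional ones."""
    pro = match_degree(text, PROFESSIONAL_KEYWORDS)
    non_pro = match_degree(text, NON_PROFESSIONAL_KEYWORDS)
    score = base + 0.5 * pro - 0.5 * non_pro
    # Clamp to [0, 1] so the result can be compared with other scores.
    return max(0.0, min(1.0, score))
```

With these assumed lists, an input containing “set path” scores at the top of the range, an input containing “how to use” scores at the bottom, and an input matching neither stays at the neutral base value.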
  • step S12 can be implemented as the following steps A1-A2:
  • step A1 the statement structure type of the voice input content is determined, and the statement structure type includes a professional statement structure type or a non-professional statement structure type.
  • Step A2: determine, according to the statement structure type of the voice input content, the user's recognition of the category to which the voice input content belongs; wherein the user's recognition of the category is higher for voice input content of the professional statement structure type than for voice input content of the non-professional statement structure type.
  • the statement structure type is pre-stored in the terminal, and the statement structure type can be embodied by a regular expression.
  • for example, a regular expression for the professional statement structure type is: adjective + noun + verb; a regular expression for the non-professional statement structure type is: pronoun + verb.
  • the expression of the statement structure type is not limited to regular expressions, but can also be embodied in other ways that can reflect the structure of the statement.
  • for example, the voice input content received by the terminal is “what is the step of booting up”. By analyzing the voice input content, the terminal determines that its statement structure is “adjective + noun + verb + pronoun”, so the statement structure type of the voice input content is determined to be the professional statement structure type, and the user has a high recognition of the category to which the voice input content belongs.
  • as another example, the voice input content received by the terminal is “how to use this thing”. By analyzing the voice input content, the terminal determines that its statement structure is “pronoun + verb”, so the statement structure type of the voice input content is determined to be the non-professional statement structure type, and the user has a low recognition of the category to which the voice input content belongs.
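Steps A1-A2 can be sketched with regular expressions over a sequence of part-of-speech tags. The sketch assumes that tagging has already been done upstream and that tags are the strings `ADJ`, `NOUN`, `VERB`, `PRON`; the tag names, the `unknown` fallback, and the numeric recognition values are hypothetical.

```python
import re

# Patterns mirror the two structure types named in the text.
PROFESSIONAL_PATTERNS = [re.compile(r"\bADJ NOUN VERB\b")]
NON_PROFESSIONAL_PATTERNS = [re.compile(r"\bPRON VERB\b")]

# Hypothetical recognition values; the text only requires professional > non-professional.
RECOGNITION_BY_TYPE = {"professional": 0.8, "non-professional": 0.2}


def classify_structure(pos_tags: list) -> str:
    """Return the statement structure type for a list of part-of-speech tags."""
    sequence = " ".join(pos_tags)
    if any(p.search(sequence) for p in PROFESSIONAL_PATTERNS):
        return "professional"
    if any(p.search(sequence) for p in NON_PROFESSIONAL_PATTERNS):
        return "non-professional"
    return "unknown"
```

Professional patterns are checked first, so a statement tagged “adjective + noun + verb + pronoun” is classified as professional, consistent with the example in the text.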
  • step S12 can be implemented as the following steps B1-B2:
  • Step B1: when it is determined that the two adjacently received voice input contents were input by the same user, determine the degree of association between the two voice input contents according to the keywords in them.
  • step B2 the user's recognition of the category of the voice input content is determined according to the degree of association between the two received voice input contents; wherein the higher the degree of association, the lower the degree of recognition.
  • the degree of association between two adjacently received voice input contents can reflect how well the user understood the previous voice output content. The higher the degree of association, the lower the user's understanding of the previous voice output content, and the lower the user's recognition of the category to which the voice input content belongs; the lower the degree of association, the higher the user's understanding of the previous voice output content, and the higher the user's recognition of the category to which the voice input content belongs.
  • for example, the voice input content received by the terminal last time is “How to set the air conditioner temperature”, and the voice input content received this time is “How to enter the temperature adjustment mode”. When the terminal determines that the two adjacently received voice input contents were input by the same user, it can extract the keywords from them, such as “air conditioner temperature” and “temperature adjustment mode”, and determine the degree of association between the two voice input contents from the degree of association between these keywords. Because both keywords relate to temperature, the degree of association between the two voice input contents is high.
  • as another example, the voice input content received by the terminal last time is “How to set the air conditioner temperature”, and the voice input content received this time is “What is the step of powering on”. When the terminal determines that the two adjacently received voice input contents were input by the same user, it extracts the keywords “air conditioner temperature” and “power on”. Because these are two unrelated types of keywords, their degree of association is almost zero, which means that the degree of association between the two voice input contents is very low and that the user understood the previous voice output content well, indicating that the user has a high recognition of the category to which the voice input content belongs.
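Steps B1-B2 can be sketched as below. The text does not say how the degree of association is computed; the sketch substitutes Jaccard similarity over the words of the extracted keywords, and the linear mapping `1 - association` is likewise an assumption that merely satisfies "the higher the association, the lower the recognition".

```python
def association_degree(prev_keywords: list, curr_keywords: list) -> float:
    """Jaccard similarity of the word sets of two keyword lists (assumed measure)."""
    prev_words = {w for kw in prev_keywords for w in kw.lower().split()}
    curr_words = {w for kw in curr_keywords for w in kw.lower().split()}
    union = prev_words | curr_words
    if not union:
        return 0.0
    return len(prev_words & curr_words) / len(union)


def recognition_from_association(association: float) -> float:
    """Higher association with the previous input means lower recognition."""
    return 1.0 - association
```

With the examples from the text, “air conditioner temperature” and “temperature adjustment mode” share the word “temperature” and get a non-zero association, while “air conditioner temperature” and “power on” share nothing and get zero. A production system would more plausibly use a semantic similarity measure than word overlap.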
  • step S12 may be further implemented as: determining, according to the voice input content, at least two voice input parameters of the voice input content, wherein the voice input parameters comprise: the voice tone information of the user, the time interval between two adjacent voice input contents input by the same user, the historical input record information corresponding to the user, the matching degree between the keywords in the voice input content and the preset keywords, the statement structure type of the voice input content, and the degree of association between two adjacent voice input contents input by the same user; and calculating the user's recognition of the category to which the voice input content belongs according to the preset weight of each voice input parameter.
  • the method further includes the step of determining that the user's recognition of the category of the voice input content is a preset minimum awareness when the voice input parameter of the voice input content cannot be determined.
  • the terminal may directly determine that the user's recognition of the category to which the voice input content belongs is the preset minimum recognition. Therefore, even when the voice input parameters of the voice input content cannot be determined, the user can still obtain voice output content that matches the user, thereby improving the user experience.
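The weighted combination and the minimum-recognition fallback described above can be sketched as follows. The weight values, the parameter names, the assumption that each parameter has already been converted to a score in [0, 1], and the renormalization over the parameters that are available are all illustrative choices not specified in the text.

```python
PRESET_MIN_RECOGNITION = 0.0  # assumed value of the preset minimum

# Hypothetical preset weights, one per voice input parameter.
WEIGHTS = {
    "tone": 0.1,
    "time_interval": 0.2,
    "history": 0.3,
    "keyword_match": 0.2,
    "structure": 0.1,
    "association": 0.1,
}


def combined_recognition(param_scores: dict) -> float:
    """Weighted average of the parameter scores that could be determined.

    Each score is assumed to lie in [0, 1]; None marks an undetermined parameter.
    """
    usable = {k: v for k, v in param_scores.items()
              if k in WEIGHTS and v is not None}
    if not usable:
        # No parameter could be determined: fall back to the preset minimum.
        return PRESET_MIN_RECOGNITION
    total_weight = sum(WEIGHTS[k] for k in usable)
    return sum(WEIGHTS[k] * v for k, v in usable.items()) / total_weight
```

Renormalizing by the weights actually used keeps the result in [0, 1] when only some parameters are available.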
  • the above method further comprises the step of storing the user's awareness of the category to which the voice input content belongs.
  • step S12 may be implemented as the following steps: identifying the voiceprint information of the user; and querying the user's awareness of the category of the voice input content according to the voiceprint information of the user. In this embodiment, by querying the user's awareness, it is more convenient and quick to determine the user's recognition of the category of the voice input content, thereby outputting the matched voice output content for the user more accurately and quickly.
  • step S13 can be implemented as the following steps S61-S63:
  • step S61 the cognitive level corresponding to the cognition is determined according to the correspondence between the cognition and the cognition level.
  • Step S62 Acquire a voice output content corresponding to the cognitive level according to the correspondence between the cognitive level and the voice output content.
  • step S63 the voice output content is output.
  • the terminal pre-stores a correspondence between the recognition and the cognitive level, and a correspondence between the cognitive level and the voice output content.
  • the cognitive level may be divided as needed, for example into three levels: a low cognitive level, a middle cognitive level, and a high cognitive level.
  • for example, a recognition between 0% and 30% corresponds to the low cognitive level, a recognition between 31% and 70% corresponds to the middle cognitive level, and a recognition between 71% and 100% corresponds to the high cognitive level.
  • the voice output content corresponding to the low cognitive level is the detailed version of the voice output content
  • the voice output content corresponding to the middle cognitive level is the standard version of the voice output content
  • the voice output content corresponding to the high cognitive level is the compact version of the voice output content.
  • the terminal For each voice input content, the terminal stores the three voice output contents corresponding to the detailed version, the compact version, and the standard version.
  • the corresponding voice output content includes: the detailed version, “Click the mode button in the middle of the first row twice to enter the temperature adjustment mode; click the '+/-' button on the left of the second row to change the temperature; each click changes the temperature by '+/-' 1 degree”; the standard version, “Click the mode button to enter the temperature adjustment mode, then click the '+/-' button to change the temperature”; and the compact version, “First enter the temperature adjustment mode, then change the temperature”.
  • the cognitive level corresponding to the preset minimum recognition may be a low cognitive level.
  • the terminal may directly output the detailed version of the voice output content.
  • it can be seen that, with the technical solution of this embodiment, when the terminal outputs voice output content for the user, it can analyze the user's current needs by determining the user's recognition of the category to which the voice input content belongs, and output the voice output content that matches those needs, so that the user obtains more, and more accurate, information from the voice output content.
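Steps S61-S63 can be sketched as a two-stage lookup: recognition to cognitive level, then cognitive level to the stored version of the answer. The thresholds and the three versions come from the examples in the text; the dictionary layout and the lookup key `"how to set the air conditioner temperature"` are hypothetical.

```python
def cognitive_level(recognition_percent: float) -> str:
    """Map a recognition percentage to a level using the example ranges.

    The text gives integer ranges (0-30, 31-70, 71-100); values between
    the integer bounds are assigned to the next level up here by assumption.
    """
    if recognition_percent <= 30:
        return "low"
    if recognition_percent <= 70:
        return "middle"
    return "high"


# Level-to-version correspondence as described in the text.
LEVEL_TO_VERSION = {"low": "detailed", "middle": "standard", "high": "compact"}

# Hypothetical store: three versions per voice input content.
OUTPUTS = {
    "how to set the air conditioner temperature": {
        "detailed": ("Click the mode button in the middle of the first row twice "
                     "to enter the temperature adjustment mode; click the '+/-' "
                     "button on the left of the second row to change the temperature."),
        "standard": ("Click the mode button to enter the temperature adjustment "
                     "mode, then click the '+/-' button to change the temperature."),
        "compact": "First enter the temperature adjustment mode, then change the temperature.",
    }
}


def select_output(voice_input: str, recognition_percent: float) -> str:
    """Return the stored answer version that matches the user's recognition."""
    version = LEVEL_TO_VERSION[cognitive_level(recognition_percent)]
    return OUTPUTS[voice_input][version]
```

A first-time user assigned the preset minimum recognition falls into the low level and therefore receives the detailed version, matching the behavior described above.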
  • the embodiment of the present invention further provides a voice output device, which is used to perform the above method.
  • FIG. 7 is a block diagram of a voice output device according to an embodiment of the present invention. As shown in Figure 7, the device includes:
  • the receiving module 71 is configured to receive voice input content input by the user
  • the determining module 72 is configured to determine, according to the voice input content, the user's recognition of the category to which the voice input content belongs, where the recognition is the user's degree of professional knowledge of the category;
  • the output module 73 is configured to acquire and output the voice output content that matches the recognition degree from the at least one voice output content corresponding to the voice input content.
  • the determining module 72 includes:
  • a first identification sub-module 721, configured to identify voiceprint information of the user
  • the first determining sub-module 722 is configured to determine, according to the voiceprint information, whether the voice input content of the user is received for the first time;
  • the second determining sub-module 723 is configured to determine, when the voice input content of the user is received for the first time, the user's recognition of the category to which the voice input content belongs is a preset minimum awareness.
  • the above apparatus further includes:
  • the recording module is configured to record an input time and a usage duration of the voice input content, and the usage duration is a duration between receiving the voice input content and outputting the voice output content.
  • the determining module 72 includes:
  • a second identification sub-module 724 configured to identify voiceprint information of the user
  • the second determining sub-module 725 is configured to determine, according to the voiceprint information of the user, whether the voice input content received twice adjacently is input by the same user;
  • the first calculation sub-module 726 is configured to, when the two adjacently received voice input contents are input by the same user, calculate the time interval between the two adjacently received voice input contents according to their input times and usage durations;
  • the third determining sub-module 727 is configured to determine, according to the time interval, the user's awareness of the category to which the voice input content belongs; wherein the longer the time interval, the lower the awareness.
  • the determining module 72 includes:
  • a third identification sub-module 728 configured to identify voiceprint information of the user
  • the first obtaining sub-module 729 is configured to acquire historical input record information corresponding to the user according to the voiceprint information of the user, where the historical input record information includes at least one of historical accumulated use time, historical cumulative input times, and historical input frequency;
  • the fourth determining sub-module 7210 is configured to determine, according to the historical input record information, the user's recognition of the category of the voice input content; wherein, the longer the historical accumulated usage time, the higher the recognition degree; the more the historical cumulative input times, The higher the awareness; the higher the historical input frequency, the higher the recognition.
  • the determining module 72 includes:
  • a fifth determining sub-module 7212 configured to determine a matching degree between the keyword in the voice input content and the preset keyword
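  • the sixth determining sub-module 7213 is configured to determine, according to the matching degree between the keywords in the voice input content and the preset keywords, the user's recognition of the category to which the voice input content belongs; wherein the higher the matching degree between the keywords in the voice input content and the professional keywords among the preset keywords, the higher the recognition; and the higher the matching degree with the non-professional keywords among the preset keywords, the lower the recognition.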
  • the determining module 72 includes:
  • a seventh determining submodule configured to determine a statement structure type of the voice input content, the statement structure type including a professional statement structure type or a non-professional statement structure type;
  • the eighth determining submodule is configured to determine, according to the sentence structure type of the voice input content, the user's recognition of the category of the voice input content; wherein the user has higher recognition of the category of the voice input content of the professional sentence structure type Awareness of the category of the voice input content of the non-professional statement structure type.
  • the determining module 72 includes:
  • a ninth determining sub-module configured to determine, when the voice input content received twice in the adjacent two times is input by the same user, determining that the two adjacent ones are received according to the keywords in the voice input content received two times adjacent to each other The degree of association between voice input content;
  • a tenth determining sub-module configured to determine, according to the degree of association between the two received voice input contents, the user's recognition of the category of the voice input content; wherein the higher the degree of association, the lower the degree of recognition .
  • the determining module 72 includes:
  • the eleventh determining sub-module is configured to determine at least two voice input parameters of the voice input content according to the voice input content, where the voice input parameter comprises: voice tone information of the user, and voice input content of two adjacent inputs of the same user The time interval between the time, the historical input record information corresponding to the user, the matching degree of the keyword in the voice input content with the preset keyword, the sentence structure type of the voice input content, and the voice input content input twice by the same user Degree of association
  • the calculation sub-module is configured to calculate the user's recognition of the category of the voice input content according to the weight of each preset voice input parameter.
  • the determining module 72 includes:
  • the twelfth determining submodule is configured to determine, when the voice input parameter of the voice input content cannot be determined, that the user's recognition of the category to which the voice input content belongs is the preset minimum recognition.
  • the output module 73 includes:
  • a thirteenth determining sub-module 731 configured to determine a cognitive level corresponding to the cognition according to a correspondence between the cognition and the cognition level;
  • the second obtaining sub-module 732 is configured to acquire, according to a correspondence between the cognitive level and the voice output content, the voice output content corresponding to the cognitive level;
  • the output sub-module 733 is configured to output the voice output content.
  • the foregoing apparatus further includes:
  • the updating module 74 is configured to update the historical input record information according to the input time and the usage duration of the voice input content.
  • the storage module 75 is configured to store the user's awareness of the category to which the voice input content belongs.
  • the determining module 72 includes:
  • a fourth identification sub-module configured to identify voiceprint information of the user
  • the query sub-module is configured to query the user's recognition of the category of the voice input content according to the user's voiceprint information.
  • the voice output content matching the recognition degree is selected for the user to output, so that the voice output content is more in line with the user's needs. Therefore, the user is provided with a more personalized voice output function, and the accuracy of the voice output is improved, so that the user can obtain the maximum amount of information from the voice output content, thereby improving the user experience.
  • FIG. 14 is a block diagram of an apparatus for performing a voice output method, according to an exemplary embodiment.
  • device 1600 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • device 1600 can include one or more of the following components: processor 1601, memory 1602, and communication component 1603.
  • the processor 1601 typically controls the overall operation of the device 1600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • the processor 1601 can execute instructions to perform all or part of the steps of the above method.
  • Memory 1602 is configured to store various types of data to support operation at device 1600. Examples of such data include instructions for any application or method operating on device 1600, contact data, phone book data, messages, pictures, videos, and the like.
  • the memory 1602 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
  • Communication component 1603 is configured to facilitate wired or wireless communication between device 1600 and other devices.
  • the device 1600 can access a wireless network based on a communication standard, such as Wi-Fi, 2G or 3G, or a combination thereof.
  • the communication component 1603 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel.
  • communication component 1603 further includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • device 1600 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the voice output method described above.
  • a non-transitory computer-readable storage medium comprising instructions is also provided, such as the memory 1602 comprising instructions executable by the processor 1601 of the device 1600 to perform the voice output method described above.
  • the non-transitory computer-readable storage medium can be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
  • the present invention also provides a non-transitory computer readable recording medium having recorded thereon a computer program including instructions for executing the voice output method according to the above-described embodiment of the present invention.
  • the present invention also provides a computer program comprising: instructions for executing a voice output method according to the above-described embodiment of the present invention when the program is executed by a computer.
  • embodiments of the present invention can be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) including computer usable program code.
  • the computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device, the instruction device implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Abstract

A voice output method and device. The method comprises: receiving a voice input content input by a user (S11); determining a cognition degree of the user on a category, to which the voice input content belongs, according to the voice input content, wherein the cognition degree is a professional knowledge cognition degree of the user on the category (S12); and acquiring and outputting a voice output content matching the cognition degree from at least one voice output content corresponding to the voice input content (S13). By means of the technical solution, a voice output content matching a cognition degree of a user can be selected for the user according to the cognition degree of the user on a category, to which an input voice input content belongs, and can be output, so that the voice output content better conforms to the requirements of the user; a personalized voice output function is provided for the user; the voice output accuracy is improved, so that the user can acquire the maximum information quantity from the voice output content; and the user experience is improved.

Description

Voice output method and device
This application is based on, and claims priority to, Chinese invention patent application No. CN201510568430.8, filed on September 8, 2015 and entitled “Voice output method and device”, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of information processing technologies, and in particular to a voice output method and device.
Background
At present, with the development of electronic technology, voice input is increasingly favored. Voice input is an input method that converts spoken content into text through speech recognition. As smart terminals become widespread in daily life, more and more of them provide voice services. For example, a user can ask a question by voice input, and the voice software on the smart terminal analyzes the user's voice and answers the question, also by voice, thereby providing a help service. Although this approach is very convenient and spares the user from searching online for an answer, current voice service software has only one answer mode: when different users ask the same question (that is, questions with the same main content), the same help information is output. However, different users differ in technical level and professional ability, and may need different help information or different ways of answering. The above approach therefore cannot distinguish between different technical needs when providing voice help, and is not targeted.
Summary of the invention
Embodiments of the present invention provide a voice output method and device. The technical solution is as follows:
In a first aspect, a voice output method is provided, including the following steps:
receiving voice input content input by a user;
determining, according to the voice input content, the user's recognition of the category to which the voice input content belongs, the recognition being the user's degree of professional knowledge of the category;
acquiring and outputting, from at least one voice output content corresponding to the voice input content, the voice output content that matches the recognition.
Some beneficial effects of embodiments of the present invention may include:
With the above technical solution, voice output content that matches the user's recognition of the category to which the voice input content belongs can be selected and output for the user, so that the voice output content better meets the user's needs. This provides the user with a more personalized voice output function and improves the accuracy of the voice output, enabling the user to obtain the maximum amount of information from the voice output content and improving the user experience.
In one embodiment, determining, according to the voice input content, the user's recognition of the category to which the voice input content belongs includes:
identifying voiceprint information of the user;
determining, according to the voiceprint information, whether voice input content of the user is being received for the first time;
when voice input content of the user is being received for the first time, determining that the user's recognition of the category to which the voice input content belongs is a preset minimum recognition.
In this embodiment, matching voice output content is selected and output for the user according to whether the user's voice input content is being received for the first time, so that the voice output content better meets the user's needs. This provides the user with a more personalized voice output function and improves the accuracy of the voice output, enabling the user to obtain the maximum amount of information from the voice output content and improving the user experience.
In one embodiment, the method further includes:
recording an input time and a usage duration of the voice input content, the usage duration being the length of time between receiving the voice input content and outputting the voice output content.
In this embodiment, recording the input time and the usage duration of the voice input content provides a richer basis for determining the user's recognition when voice output content is subsequently output for the user, so that the recognition is determined more accurately and more accurate, personalized voice output content is output for the user.
In one embodiment, determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
Identifying voiceprint information of the user;
Determining, according to the voiceprint information of the user, whether two adjacently received voice input contents were input by the same user;
When the two adjacently received voice input contents were input by the same user, calculating the time interval between them according to their input times and usage durations;
Determining, according to the time interval, the user's awareness of the category to which the voice input content belongs, where the longer the time interval, the lower the awareness.
In this embodiment, calculating the time interval between two adjacently received voice input contents of the same user provides a richer basis for determining the user's awareness, so that the awareness is determined more accurately and more accurate, personalized voice output content is output for the user.
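One way to read the interval step is sketched below. The text does not give a formula, so two assumptions are made here: the interval is measured from the end of the previous interaction (its input time plus its usage duration) to the next input time, and the awareness is any decreasing function of that interval. The function names and the `scale_seconds` parameter are hypothetical.

```python
from datetime import datetime, timedelta


def interval_between(prev_input: datetime, prev_duration: timedelta,
                     next_input: datetime) -> timedelta:
    """Gap from the end of the previous interaction to the next input (assumed definition)."""
    return next_input - (prev_input + prev_duration)


def awareness_from_interval(gap: timedelta, scale_seconds: float = 3600.0) -> float:
    """Map a gap to (0, 1]; a longer gap gives a lower awareness (illustrative mapping)."""
    seconds = max(gap.total_seconds(), 0.0)
    return 1.0 / (1.0 + seconds / scale_seconds)


gap = interval_between(datetime(2016, 5, 17, 9, 0, 0), timedelta(seconds=30),
                       datetime(2016, 5, 17, 9, 10, 30))
print(gap, awareness_from_interval(gap))
```

Only the direction of the relationship comes from the text; the reciprocal shape and the one-hour scale are placeholders.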
In one embodiment, determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
Identifying voiceprint information of the user;
Acquiring, according to the voiceprint information of the user, historical input record information corresponding to the user, the historical input record information including at least one of a historical cumulative usage time, a historical cumulative number of inputs, and a historical input frequency;
Determining, according to the historical input record information, the user's awareness of the category to which the voice input content belongs, where the longer the historical cumulative usage time, the higher the awareness; the greater the historical cumulative number of inputs, the higher the awareness; and the higher the historical input frequency, the higher the awareness.
In this embodiment, the user's awareness is determined according to the historical input record information corresponding to the user, so that the terminal can determine the awareness more accurately and output more accurate, personalized voice output content for the user.
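The monotonic relationships above can be sketched as follows. The saturating function, the half-point constants, and the averaging of the available items are all assumptions chosen for illustration; the text only states that each history item raises the awareness as it grows and that at least one item is used.

```python
from typing import Optional


def saturate(value: float, half_point: float) -> float:
    """Map a non-negative value to [0, 1); the result is 0.5 at half_point."""
    return value / (value + half_point)


def awareness_from_history(total_hours: Optional[float] = None,
                           input_count: Optional[int] = None,
                           inputs_per_day: Optional[float] = None) -> float:
    """Average the scores of whichever history items are available (assumed scheme)."""
    scores = []
    if total_hours is not None:
        scores.append(saturate(total_hours, 10.0))    # cumulative usage time
    if input_count is not None:
        scores.append(saturate(input_count, 50.0))    # cumulative number of inputs
    if inputs_per_day is not None:
        scores.append(saturate(inputs_per_day, 5.0))  # input frequency
    if not scores:
        raise ValueError("at least one history item is required")
    return sum(scores) / len(scores)


print(awareness_from_history(total_hours=10.0))
```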
In one embodiment, determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
Extracting keywords from the voice input content;
Determining a degree of matching between the keywords in the voice input content and preset keywords;
Determining, according to that degree of matching, the user's awareness of the category to which the voice input content belongs, where the higher the degree of matching between the keywords in the voice input content and the professional keywords among the preset keywords, the higher the awareness; and the higher the degree of matching with the non-professional keywords among the preset keywords, the lower the awareness.
In this embodiment, the user's awareness is determined according to the degree of matching between the keywords in the voice input content and the preset keywords, which makes the determination more accurate and personalized, so that more accurate, personalized voice output content is output for the user.
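A minimal sketch of the keyword-matching idea follows. It assumes the keywords have already been extracted, defines the degree of matching as the fraction of extracted keywords found in a preset set, and combines the two degrees linearly; the keyword sets, function names, and formula are illustrative assumptions rather than anything specified in the text.

```python
from typing import Iterable, Set


def match_degree(keywords: Iterable[str], preset: Set[str]) -> float:
    """Fraction of the extracted keywords that appear in the preset set."""
    words = list(keywords)
    if not words:
        return 0.0
    return sum(1 for word in words if word in preset) / len(words)


def awareness_from_keywords(keywords: Iterable[str],
                            professional: Set[str],
                            non_professional: Set[str]) -> float:
    """Professional matches raise the score, non-professional matches lower it."""
    words = list(keywords)
    pro = match_degree(words, professional)
    non = match_degree(words, non_professional)
    return min(1.0, max(0.0, 0.5 + 0.5 * (pro - non)))


# Hypothetical preset keyword sets for a medical category.
PROFESSIONAL = {"hypertension", "systolic", "diastolic"}
NON_PROFESSIONAL = {"dizzy", "headache"}

print(awareness_from_keywords(["systolic", "hypertension"], PROFESSIONAL, NON_PROFESSIONAL))
print(awareness_from_keywords(["dizzy", "headache"], PROFESSIONAL, NON_PROFESSIONAL))
```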
In one embodiment, determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
Determining a sentence structure type of the voice input content, the sentence structure type being either a professional sentence structure type or a non-professional sentence structure type;
Determining, according to the sentence structure type of the voice input content, the user's awareness of the category to which the voice input content belongs, where the user's awareness of the category of a voice input content having the professional sentence structure type is higher than the user's awareness of the category of a voice input content having the non-professional sentence structure type.
In this embodiment, the user's awareness is determined according to the sentence structure type of the voice input content, which makes the determination more accurate and personalized, so that more accurate, personalized voice output content is output for the user.
In one embodiment, determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
When it is determined that two adjacently received voice input contents were input by the same user, determining a degree of association between the two voice input contents according to the keywords in them;
Determining, according to the degree of association between the two adjacently received voice input contents, the user's awareness of the category to which the voice input content belongs, where the higher the degree of association, the lower the awareness.
In this embodiment, the user's awareness is determined according to the degree of association between two adjacently received voice input contents of the same user, which makes the determination more accurate and personalized, so that more accurate, personalized voice output content is output for the user.
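The degree of association is not defined further in the text. As one possible reading, the sketch below uses the Jaccard similarity of the two keyword sets, a substituted technique chosen for illustration, and maps a higher association to a lower awareness. The function names are hypothetical.

```python
from typing import Iterable


def association_degree(first: Iterable[str], second: Iterable[str]) -> float:
    """Jaccard similarity of two keyword sets (an assumed measure of association)."""
    a, b = set(first), set(second)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def awareness_from_association(degree: float) -> float:
    """The higher the association between adjacent inputs, the lower the awareness."""
    return 1.0 - degree


degree = association_degree(["fever", "cough"], ["fever", "rash"])
print(degree, awareness_from_association(degree))
```

Repeated questions on the same topic from the same user suggest the earlier answer was not understood, which is one plausible rationale for the inverse relationship stated in the text.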
In one embodiment, determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
Determining, according to the voice input content, at least two voice input parameters of the voice input content, the voice input parameters including: the voiceprint information of the user, the time interval between two adjacent voice input contents of the same user, the historical input record information corresponding to the user, the degree of matching between the keywords in the voice input content and the preset keywords, the sentence structure type of the voice input content, and the degree of association between two adjacent voice input contents of the same user;
Calculating, according to a preset weight for each voice input parameter, the user's awareness of the category to which the voice input content belongs.
In this embodiment, the user's awareness of the category to which the voice input content belongs is calculated from several voice input parameters, each with its own weight, which makes the determination more accurate and personalized, so that more accurate, personalized voice output content is output for the user.
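A weighted combination could look like the sketch below. It assumes each voice input parameter has already been converted to a score in [0, 1] and that the weights of the available parameters are renormalized; the parameter names, the weight values, and the renormalization are hypothetical choices, since the text only says that preset per-parameter weights are used.

```python
from typing import Dict

# Hypothetical preset weights, one per voice input parameter.
WEIGHTS: Dict[str, float] = {
    "interval": 2.0,
    "history": 3.0,
    "keywords": 3.0,
    "structure": 1.0,
    "association": 1.0,
}


def combined_awareness(scores: Dict[str, float],
                       weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of the per-parameter scores that could be determined."""
    if len(scores) < 2:
        raise ValueError("at least two voice input parameters are required")
    total_weight = sum(weights[name] for name in scores)
    weighted_sum = sum(weights[name] * value for name, value in scores.items())
    return weighted_sum / total_weight


print(combined_awareness({"history": 0.5, "keywords": 1.0}))
```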
In one embodiment, determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
When no voice input parameter of the voice input content can be determined, determining that the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
In this embodiment, for a voice input content whose voice input parameters cannot be determined, voice output content matching that voice input content is output, which provides the user with a more accurate and personalized voice output function, lets the user obtain more useful information from the voice output content, and improves the user experience.
In one embodiment, acquiring, from at least one voice output content corresponding to the voice input content, the voice output content that matches the awareness, and outputting it, includes:
Determining the cognitive level corresponding to the awareness according to a correspondence between awareness and cognitive levels;
Acquiring the voice output content corresponding to the cognitive level according to a correspondence between cognitive levels and voice output contents;
Outputting the voice output content.
In this embodiment, the voice output content is selected and output according to the correspondence between cognitive levels and voice output contents, so that the output matches the user's awareness and better fits the user's needs. This improves the accuracy of the voice output, lets the user obtain the maximum amount of information from the voice output content, and improves the user experience.
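The two correspondences can be sketched as two lookup tables. The thresholds, level names, and reply texts below are invented for illustration; the text specifies only that awareness maps to a cognitive level and that each level maps to a voice output content.

```python
from bisect import bisect_right

# Hypothetical correspondence between awareness and cognitive levels.
LEVEL_BOUNDS = [0.33, 0.66]
LEVELS = ["beginner", "intermediate", "expert"]

# Hypothetical correspondence between cognitive levels and voice output contents
# for a single voice input content.
OUTPUTS = {
    "beginner": "Your blood pressure is higher than it should be.",
    "intermediate": "Your reading suggests mild hypertension.",
    "expert": "Systolic 145 mmHg, diastolic 92 mmHg: stage 1 hypertension range.",
}


def level_for(awareness: float) -> str:
    """Look up the cognitive level whose range contains the awareness value."""
    return LEVELS[bisect_right(LEVEL_BOUNDS, awareness)]


def output_for(awareness: float) -> str:
    """Select the voice output content that corresponds to the cognitive level."""
    return OUTPUTS[level_for(awareness)]


print(level_for(0.5), "->", output_for(0.5))
```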
In one embodiment, the method further includes:
Updating the historical input record information according to the input time and usage duration of the voice input content.
In this embodiment, updating the historical input record information means that, the next time voice output content is output for the user, the user's awareness can be determined from an accurate historical input record, so that more accurate voice output content is output for the user.
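An update of the history record might be sketched as below. The `History` class, its field names, and the way frequency is derived (inputs per day since the first input, with a one-day floor) are assumptions; the text only states that the record is updated from the input time and usage duration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class History:
    """Per-user historical input record (hypothetical structure)."""
    total_usage: timedelta = timedelta(0)   # historical cumulative usage time
    input_count: int = 0                    # historical cumulative number of inputs
    first_input: Optional[datetime] = None
    last_input: Optional[datetime] = None

    def update(self, input_time: datetime, usage_duration: timedelta) -> None:
        if self.first_input is None:
            self.first_input = input_time
        self.last_input = input_time
        self.total_usage += usage_duration
        self.input_count += 1

    def inputs_per_day(self) -> float:
        """Historical input frequency, using a one-day minimum window."""
        if self.first_input is None or self.last_input is None:
            return 0.0
        days = (self.last_input - self.first_input).total_seconds() / 86400.0
        return self.input_count / max(days, 1.0)


history = History()
history.update(datetime(2016, 1, 1, 9, 0), timedelta(seconds=10))
history.update(datetime(2016, 1, 3, 9, 0), timedelta(seconds=20))
print(history.total_usage, history.input_count, history.inputs_per_day())
```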
In one embodiment, the method further includes:
Storing the user's awareness of the category to which the voice input content belongs;
Determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
Identifying voiceprint information of the user;
Querying, according to the voiceprint information of the user, the user's awareness of the category to which the voice input content belongs.
In this embodiment, querying the stored awareness makes it quicker and more convenient to determine the user's awareness of the category to which the voice input content belongs, so that matching voice output content is selected and output for the user more accurately and quickly.
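Storing and querying the awareness can be sketched with a keyed lookup. The sketch assumes that voiceprint recognition has already produced a stable speaker identifier, which is outside the scope of this example, and that an unknown speaker falls back to the preset minimum awareness. The class and method names are hypothetical.

```python
from typing import Dict, Tuple


class AwarenessStore:
    """Stores awareness per (speaker, category); names are illustrative only."""

    def __init__(self, minimum: float = 0.0) -> None:
        self._minimum = minimum
        self._values: Dict[Tuple[str, str], float] = {}

    def save(self, speaker_id: str, category: str, awareness: float) -> None:
        self._values[(speaker_id, category)] = awareness

    def lookup(self, speaker_id: str, category: str) -> float:
        # A speaker or category with no stored value gets the preset minimum.
        return self._values.get((speaker_id, category), self._minimum)


store = AwarenessStore(minimum=0.0)
store.save("speaker-001", "medicine", 0.8)
print(store.lookup("speaker-001", "medicine"), store.lookup("speaker-002", "medicine"))
```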
In a second aspect, a voice output device is provided, including:
A receiving module, configured to receive voice input content input by a user;
A determining module, configured to determine, according to the voice input content, the user's awareness of the category to which the voice input content belongs, the awareness being the user's degree of professional knowledge of the category;
An output module, configured to acquire, from at least one voice output content corresponding to the voice input content, the voice output content that matches the awareness, and to output it.
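The three modules can be pictured as a simple pipeline. In the sketch below, text stands in for recognized speech, and the class name, constructor parameters, and example replies are hypothetical; it shows only how the receiving, determining, and output responsibilities chain together.

```python
from typing import Callable, Dict


class VoiceOutputDevice:
    """Chains the receiving, determining, and output modules (illustrative only)."""

    def __init__(self,
                 determine: Callable[[str], float],
                 candidates: Callable[[str], Dict[str, str]],
                 select: Callable[[float], str]) -> None:
        self._determine = determine      # determining module: input -> awareness
        self._candidates = candidates    # outputs that correspond to an input
        self._select = select            # awareness -> key of the matching output

    def handle(self, voice_input: str) -> str:
        awareness = self._determine(voice_input)      # determine awareness
        options = self._candidates(voice_input)       # candidate voice outputs
        return options[self._select(awareness)]       # output the matching one


device = VoiceOutputDevice(
    determine=lambda text: 0.9 if "systolic" in text else 0.1,
    candidates=lambda text: {
        "plain": "Your blood pressure is a bit high.",
        "technical": "Systolic pressure exceeds 140 mmHg.",
    },
    select=lambda awareness: "technical" if awareness >= 0.5 else "plain",
)
print(device.handle("what does my systolic reading mean"))
```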
In one embodiment, the determining module includes:
A first identification submodule, configured to identify voiceprint information of the user;
A first judging submodule, configured to determine, according to the voiceprint information, whether the voice input content of the user is being received for the first time;
A second determining submodule, configured to determine, when the voice input content of the user is received for the first time, that the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
In one embodiment, the device further includes:
A recording module, configured to record an input time and a usage duration of the voice input content, where the usage duration is the time elapsed between receiving the voice input content and outputting the voice output content.
In one embodiment, the determining module includes:
A second identification submodule, configured to identify voiceprint information of the user;
A second judging submodule, configured to determine, according to the voiceprint information of the user, whether two adjacently received voice input contents were input by the same user;
A first calculation submodule, configured to calculate, when the two adjacently received voice input contents were input by the same user, the time interval between them according to their input times and usage durations;
A third determining submodule, configured to determine, according to the time interval, the user's awareness of the category to which the voice input content belongs, where the longer the time interval, the lower the awareness.
In one embodiment, the determining module includes:
A third identification submodule, configured to identify voiceprint information of the user;
A first acquisition submodule, configured to acquire, according to the voiceprint information of the user, historical input record information corresponding to the user, the historical input record information including at least one of a historical cumulative usage time, a historical cumulative number of inputs, and a historical input frequency;
A fourth determining submodule, configured to determine, according to the historical input record information, the user's awareness of the category to which the voice input content belongs, where the longer the historical cumulative usage time, the higher the awareness; the greater the historical cumulative number of inputs, the higher the awareness; and the higher the historical input frequency, the higher the awareness.
In one embodiment, the determining module includes:
An extraction submodule, configured to extract keywords from the voice input content;
A fifth determining submodule, configured to determine a degree of matching between the keywords in the voice input content and preset keywords;
A sixth determining submodule, configured to determine, according to that degree of matching, the user's awareness of the category to which the voice input content belongs, where the higher the degree of matching between the keywords in the voice input content and the professional keywords among the preset keywords, the higher the awareness; and the higher the degree of matching with the non-professional keywords among the preset keywords, the lower the awareness.
In one embodiment, the determining module includes:
A seventh determining submodule, configured to determine a sentence structure type of the voice input content, the sentence structure type being either a professional sentence structure type or a non-professional sentence structure type;
An eighth determining submodule, configured to determine, according to the sentence structure type of the voice input content, the user's awareness of the category to which the voice input content belongs, where the user's awareness of the category of a voice input content having the professional sentence structure type is higher than the user's awareness of the category of a voice input content having the non-professional sentence structure type.
In one embodiment, the determining module includes:
A ninth determining submodule, configured to determine, when it is determined that two adjacently received voice input contents were input by the same user, a degree of association between the two voice input contents according to the keywords in them;
A tenth determining submodule, configured to determine, according to the degree of association between the two adjacently received voice input contents, the user's awareness of the category to which the voice input content belongs, where the higher the degree of association, the lower the awareness.
In one embodiment, the determining module includes:
An eleventh determining submodule, configured to determine, according to the voice input content, at least two voice input parameters of the voice input content, the voice input parameters including: the voiceprint information of the user, the time interval between two adjacent voice input contents of the same user, the historical input record information corresponding to the user, the degree of matching between the keywords in the voice input content and the preset keywords, the sentence structure type of the voice input content, and the degree of association between two adjacent voice input contents of the same user;
A calculation submodule, configured to calculate, according to a preset weight for each voice input parameter, the user's awareness of the category to which the voice input content belongs.
In one embodiment, the determining module includes:
A twelfth determining submodule, configured to determine, when no voice input parameter of the voice input content can be determined, that the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
In one embodiment, the output module includes:
A thirteenth determining submodule, configured to determine the cognitive level corresponding to the awareness according to a correspondence between awareness and cognitive levels;
A second acquisition submodule, configured to acquire the voice output content corresponding to the cognitive level according to a correspondence between cognitive levels and voice output contents;
An output submodule, configured to output the voice output content.
In one embodiment, the device further includes:
An update module, configured to update the historical input record information according to the input time and usage duration of the voice input content.
In one embodiment, the device further includes:
A storage module, configured to store the user's awareness of the category to which the voice input content belongs;
The determining module includes:
A fourth identification submodule, configured to identify voiceprint information of the user;
A query submodule, configured to query, according to the voiceprint information of the user, the user's awareness of the category to which the voice input content belongs.
Some beneficial effects of the embodiments of the present invention may include:
The above device can select and output, according to the user's awareness of the category to which the input voice input content belongs, the voice output content that matches that awareness, so that the output better fits the user's needs. This provides a more personalized voice output function, improves the accuracy of the voice output, lets the user obtain the maximum amount of information from the voice output content, and improves the user experience.
In a third aspect, a voice output device is provided, the device including:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to:
Receive voice input content input by a user;
Determine, according to the voice input content, the user's awareness of the category to which the voice input content belongs, the awareness being the user's degree of professional knowledge of the category;
Acquire, from at least one voice output content corresponding to the voice input content, the voice output content that matches the awareness, and output it.
The processor is further configured to:
Identify voiceprint information of the user;
Determine, according to the voiceprint information, whether the voice input content of the user is being received for the first time;
When the voice input content of the user is received for the first time, determine that the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
The processor is further configured to:
Record an input time and a usage duration of the voice input content, where the usage duration is the time elapsed between receiving the voice input content and outputting the voice output content.
The processor is further configured to:
Identify voiceprint information of the user;
Determine, according to the voiceprint information of the user, whether two adjacently received voice input contents were input by the same user;
When the two adjacently received voice input contents were input by the same user, calculate the time interval between them according to their input times and usage durations;
Determine, according to the time interval, the user's awareness of the category to which the voice input content belongs, where the longer the time interval, the lower the awareness.
The processor is further configured to:
Identify voiceprint information of the user;
Acquire, according to the voiceprint information of the user, historical input record information corresponding to the user, the historical input record information including at least one of a historical cumulative usage time, a historical cumulative number of inputs, and a historical input frequency;
Determine, according to the historical input record information, the user's awareness of the category to which the voice input content belongs, where the longer the historical cumulative usage time, the higher the awareness; the greater the historical cumulative number of inputs, the higher the awareness; and the higher the historical input frequency, the higher the awareness.
The processor is further configured to:
Extract keywords from the voice input content;
Determine a degree of matching between the keywords in the voice input content and preset keywords;
Determine, according to that degree of matching, the user's awareness of the category to which the voice input content belongs, where the higher the degree of matching between the keywords in the voice input content and the professional keywords among the preset keywords, the higher the awareness; and the higher the degree of matching with the non-professional keywords among the preset keywords, the lower the awareness.
The processor is further configured to:
Determine a sentence structure type of the voice input content, the sentence structure type being either a professional sentence structure type or a non-professional sentence structure type;
Determine, according to the sentence structure type of the voice input content, the user's awareness of the category to which the voice input content belongs, where the user's awareness of the category of a voice input content having the professional sentence structure type is higher than the user's awareness of the category of a voice input content having the non-professional sentence structure type.
The processor is further configured to:
When it is determined that two adjacently received voice input contents were input by the same user, determine a degree of association between the two voice input contents according to the keywords in them;
Determine, according to the degree of association between the two adjacently received voice input contents, the user's awareness of the category to which the voice input content belongs, where the higher the degree of association, the lower the awareness.
The processor is further configured to:
Determine, according to the voice input content, at least two voice input parameters of the voice input content, the voice input parameters including: the voiceprint information of the user, the time interval between two adjacent voice input contents of the same user, the historical input record information corresponding to the user, the degree of matching between the keywords in the voice input content and the preset keywords, the sentence structure type of the voice input content, and the degree of association between two adjacent voice input contents of the same user;
Calculate, according to a preset weight for each voice input parameter, the user's awareness of the category to which the voice input content belongs.
The processor is further configured to:
When no voice input parameter of the voice input content can be determined, determine that the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
The processor is further configured to:
Determine the cognitive level corresponding to the awareness according to a correspondence between awareness and cognitive levels;
Acquire the voice output content corresponding to the cognitive level according to a correspondence between cognitive levels and voice output contents;
Output the voice output content.
The processor is further configured to:
Update the historical input record information according to the input time and usage duration of the voice input content.
The processor is further configured to:
Store the user's awareness of the category to which the voice input content belongs;
Wherein determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs includes:
Identifying voiceprint information of the user;
Querying, according to the voiceprint information of the user, the user's awareness of the category to which the voice input content belongs.
In a fourth aspect, a non-transitory computer-readable recording medium is provided, on which a computer program is recorded, the program including instructions for performing the method according to the first aspect of the embodiments of the present invention.
In a fifth aspect, a computer program is provided, the program including instructions for performing the method according to the first aspect of the embodiments of the present invention when the program is executed by a computer.
Other features and advantages of the present invention will be set forth in the description that follows, and in part will become apparent from the description or be learned by practicing the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description, the claims, and the accompanying drawings.
The technical solutions of the present invention are described in further detail below with reference to the accompanying drawings and embodiments.
Brief Description of the Drawings
The drawings provide a further understanding of the invention and constitute a part of the specification. Together with the embodiments of the invention, they serve to explain the invention and do not limit it. In the drawings:
FIG. 1 is a flowchart of a voice output method according to an embodiment of the present invention;
FIG. 2 is a flowchart of step S12 in a voice output method according to an embodiment of the present invention;
FIG. 3 is a flowchart of step S12 in a voice output method according to an embodiment of the present invention;
FIG. 4 is a flowchart of step S12 in a voice output method according to an embodiment of the present invention;
FIG. 5 is a flowchart of step S12 in a voice output method according to an embodiment of the present invention;
FIG. 6 is a flowchart of step S13 in a voice output method according to an embodiment of the present invention;
FIG. 7 is a block diagram of a voice output device according to an embodiment of the present invention;
FIG. 8 is a block diagram of a determining module in a voice output device according to an embodiment of the present invention;
FIG. 9 is a block diagram of a determining module in a voice output device according to an embodiment of the present invention;
FIG. 10 is a block diagram of a determining module in a voice output device according to an embodiment of the present invention;
FIG. 11 is a block diagram of a determining module in a voice output device according to an embodiment of the present invention;
FIG. 12 is a block diagram of an output module in a voice output device according to an embodiment of the present invention;
FIG. 13 is a block diagram of a voice output device according to an embodiment of the present invention;
FIG. 14 is a block diagram of an apparatus capable of performing a voice output method according to an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
FIG. 1 is a flowchart of a voice output method according to an embodiment of the present invention. As shown in FIG. 1, the method is used in a terminal, which may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like, and includes the following steps S11-S13:
Step S11: receive voice input content entered by a user.
In this step, the user may enter the voice input content by speaking.
Step S12: determine, according to the voice input content, the user's awareness of the category to which the voice input content belongs; the awareness is the degree of the user's professional knowledge of that category.
For example, if the user enters the voice input content "How do I set the air conditioner temperature", the user's awareness of the category to which the voice input content belongs is the user's degree of professional knowledge about air conditioners; if the user enters the voice input content "What kind of medicine is aspirin", the awareness is the user's degree of professional knowledge about medicine. The terminal may determine the category to which the voice input content belongs by extracting keywords from the voice input content.
Step S13: from at least one voice output content corresponding to the voice input content, acquire and output the voice output content that matches the awareness.
With the technical solution provided by the embodiments of the present invention, voice output content that matches the user's awareness of the category of the voice input content can be selected and output. The voice output content therefore better fits the user's needs, which provides a more personalized voice output function, improves the accuracy of voice output, lets the user obtain as much information as possible from the voice output content, and improves the user experience.
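The keyword-based category determination mentioned for step S12 can be sketched as follows. This is only an illustrative sketch: the keyword table, the function name `classify_category`, and the substring matching are assumptions introduced here, not details given in the specification.

```python
# Hypothetical keyword table; the specification does not define one.
CATEGORY_KEYWORDS = {
    "air conditioner": ["air conditioner", "temperature"],
    "medicine": ["aspirin", "medicine"],
}


def classify_category(text):
    """Return the first category with a keyword found in the text, else None."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return None
```

A real terminal would run this on the transcript produced by speech recognition; plain substring matching is used here only to keep the sketch short.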
In step S12, the user's awareness of the category to which the voice input content belongs may be determined in several ways. The terminal may first determine voice input parameters of the voice input content and then determine the awareness according to those parameters. The way the awareness is determined depends on the voice input parameter used. The voice input parameters may include the user's voiceprint information, the time interval between two consecutive voice input contents from the same user, the historical input record information corresponding to the user, the degree of matching between keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, the degree of association between two consecutive voice input contents from the same user, and so on. Implementations of step S12 are described below through different embodiments.
In one embodiment, as shown in FIG. 2, step S12 may be implemented as the following steps S21-S23:
Step S21: identify the voiceprint information of the user.
Step S22: determine, according to the voiceprint information, whether voice input content is being received from the user for the first time.
Step S23: when voice input content is being received from the user for the first time, determine that the user's awareness of the category to which the voice input content belongs is a preset minimum awareness.
In this embodiment, the terminal stores the voiceprint information of different users. When a user enters voice input content, if the terminal finds the user's voiceprint information among the pre-stored voiceprint information, this is not the first time voice input content has been received from that user; if the terminal does not find it, the terminal is receiving voice input content from that user for the first time. When this is not the first time, the terminal goes on to determine other voice input parameters from the voice input content and performs step S12 according to those parameters. The terminal stores in advance a correspondence between awareness and voice input content, which includes the voice input content corresponding to the preset minimum awareness.
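Steps S21-S23 can be sketched as a lookup against the stored voiceprints. The identifiers below, the value `0.0` chosen for the preset minimum awareness, and the decision to register a new speaker on first contact are assumptions for illustration; voiceprint extraction itself is out of scope.

```python
MIN_AWARENESS = 0.0  # assumed numeric stand-in for the preset minimum awareness


def awareness_for_voiceprint(voiceprint_id, known_voiceprints):
    """Return MIN_AWARENESS for a first-time speaker, otherwise None.

    None signals that other voice input parameters should be used instead.
    """
    if voiceprint_id not in known_voiceprints:
        # Assumption: remember the speaker so later inputs are not "first time".
        known_voiceprints.add(voiceprint_id)
        return MIN_AWARENESS
    return None
```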
In one embodiment, the method further includes the following step: recording the input time and the usage duration of the voice input content, where the usage duration is the time between receiving the voice input content and outputting the voice output content. Accordingly, as shown in FIG. 3, step S12 may be implemented as the following steps S31-S34:
Step S31: identify the voiceprint information of the user.
Step S32: determine, according to the voiceprint information of the user, whether two consecutively received voice input contents were entered by the same user.
Step S33: when the two consecutively received voice input contents were entered by the same user, calculate the time interval between them according to their input times and usage durations.
Step S34: determine, according to the time interval, the user's awareness of the category to which the voice input content belongs, where the longer the time interval, the lower the awareness.
In this embodiment, when two consecutively received voice input contents were entered by the same user, the time interval between them reflects how long the user took to react to the previous voice output content output by the terminal. This reaction time can also be characterized by the interval between the previous output of voice output content and the current receipt of voice input content. For example, suppose the terminal previously received the voice input content "How do I set the air conditioner temperature" and output the corresponding voice output content "First enter the temperature adjustment mode, then change the temperature", and the terminal now receives the voice input content "How do I enter the temperature adjustment mode". When the terminal determines that both voice input contents were entered by the same user, it may use the interval between receiving "How do I set the air conditioner temperature" and receiving "How do I enter the temperature adjustment mode" to characterize the user's reaction time to the previous voice output content, and from that determine the user's awareness of the category to which the voice input content belongs. Alternatively, it may use the interval between outputting "First enter the temperature adjustment mode, then change the temperature" and receiving "How do I enter the temperature adjustment mode" for the same purpose. The longer the interval, the longer the user's reaction time to the previous voice output content, and the lower the awareness.
In addition, a preset time interval may be set in advance. When two consecutively received voice input contents were entered by the same user and the interval between them exceeds the preset time interval, the terminal may directly determine that the user's awareness of the category to which the voice input content belongs is the preset minimum awareness, and acquire and output the voice output content that matches the preset minimum awareness.
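The interval-based rule can be sketched as a function that decreases as the interval grows and falls to the preset minimum once the preset time interval is exceeded. The specification only requires "longer interval, lower awareness"; the linear shape, the 60-second threshold, and the 0-to-1 scale below are assumptions.

```python
MIN_AWARENESS = 0.0        # assumed stand-in for the preset minimum awareness
PRESET_INTERVAL_S = 60.0   # hypothetical preset time interval, in seconds


def awareness_from_interval(prev_time_s, curr_time_s,
                            preset_interval_s=PRESET_INTERVAL_S):
    """Map the gap between two consecutive inputs to an awareness in [0, 1]."""
    interval = curr_time_s - prev_time_s
    if interval < 0:
        raise ValueError("current input must not precede the previous one")
    if interval > preset_interval_s:
        # Past the preset interval: use the preset minimum directly.
        return MIN_AWARENESS
    # Assumed linear decay: a longer interval yields a lower awareness.
    return 1.0 - interval / preset_interval_s
```

Here `prev_time_s` may be either the time the previous input was received or the time the previous output was produced, matching the two alternatives described above.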
In one embodiment, as shown in FIG. 4, step S12 may be implemented as the following steps S41-S43:
Step S41: identify the voiceprint information of the user.
Step S42: acquire, according to the voiceprint information of the user, the historical input record information corresponding to the user; the historical input record information includes at least one of the historical cumulative usage time, the historical cumulative number of inputs, and the historical input frequency.
Step S43: determine, according to the historical input record information, the user's awareness of the category to which the voice input content belongs, where the longer the historical cumulative usage time, the higher the awareness; the greater the historical cumulative number of inputs, the higher the awareness; and the higher the historical input frequency, the higher the awareness.
In this embodiment, each time the terminal receives voice input content from the user, it records the input time and the usage duration of the voice input content, where the usage duration is the time between receiving the voice input content and outputting the voice output content. From the recorded input times and usage durations, the terminal can compile the historical input record information corresponding to the user; the historical cumulative usage time is the sum of all recorded usage durations. The method further includes the following step: updating the historical input record information according to the input time and the usage duration of the voice input content. As a result, when the terminal determines the user's awareness from the historical input record information, that information is richer and more accurate, so the terminal can select more accurate and more personalized voice output content for the user.
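The bookkeeping for the historical input record information can be sketched as a small record type updated on every input. The class name, field names, and the definition of input frequency as inputs per second over the observed span are assumptions; the specification names the three quantities but not how the frequency is computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HistoryRecord:
    """Per-user historical input record (illustrative field names)."""
    first_input_time_s: Optional[float] = None
    last_input_time_s: Optional[float] = None
    total_usage_s: float = 0.0   # historical cumulative usage time
    input_count: int = 0         # historical cumulative number of inputs

    def update(self, input_time_s, usage_duration_s):
        """Fold one voice input into the record."""
        if self.first_input_time_s is None:
            self.first_input_time_s = input_time_s
        self.last_input_time_s = input_time_s
        self.total_usage_s += usage_duration_s
        self.input_count += 1

    def input_frequency(self):
        """Inputs per second over the observed span; 0.0 if the span is empty."""
        if self.first_input_time_s is None:
            return 0.0
        span = self.last_input_time_s - self.first_input_time_s
        return self.input_count / span if span > 0 else 0.0
```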
In one embodiment, as shown in FIG. 5, step S12 may be implemented as the following steps S51-S53:
Step S51: extract keywords from the voice input content.
Step S52: determine the degree of matching between the keywords in the voice input content and preset keywords.
Step S53: determine, according to the degree of matching, the user's awareness of the category to which the voice input content belongs, where the higher the degree of matching with the professional keywords among the preset keywords, the higher the awareness, and the higher the degree of matching with the non-professional keywords among the preset keywords, the lower the awareness.
In this embodiment, the preset keywords stored in the terminal are of two types, professional keywords and non-professional keywords. When performing step S52, the terminal determines separately the degree of matching between the keywords in the voice input content and the professional keywords, and the degree of matching with the non-professional keywords. For example, suppose the professional keywords include "setting path" and the non-professional keywords include "how to use". If the terminal receives the voice input content "the setting path of ...", it can determine that the keywords match the professional keywords more closely, so the user's awareness of the category to which the voice input content belongs is higher. If the terminal receives the voice input content "how to use ...", it can determine that the keywords match the non-professional keywords more closely, so the user's awareness is lower.
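Steps S51-S53 can be sketched with two keyword sets and a score that rises with professional matches and falls with non-professional ones. The degree of matching used here (the fraction of preset keywords found in the text) and the neutral starting value of 0.5 are assumptions; the specification fixes only the direction of each effect.

```python
# Example keyword sets taken from the embodiment's illustration.
PROFESSIONAL_KEYWORDS = {"setting path"}
NON_PROFESSIONAL_KEYWORDS = {"how to use"}


def match_degree(text, keywords):
    """Fraction of the preset keywords that occur in the text (assumed metric)."""
    lowered = text.lower()
    return sum(keyword in lowered for keyword in keywords) / len(keywords)


def awareness_from_keywords(text):
    """Start neutral; professional matches raise the score, others lower it."""
    professional = match_degree(text, PROFESSIONAL_KEYWORDS)
    non_professional = match_degree(text, NON_PROFESSIONAL_KEYWORDS)
    score = 0.5 + 0.5 * professional - 0.5 * non_professional
    return min(1.0, max(0.0, score))
```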
In one embodiment, step S12 may be implemented as the following steps A1-A2:
Step A1: determine the sentence structure type of the voice input content; the sentence structure type is either a professional sentence structure type or a non-professional sentence structure type.
Step A2: determine, according to the sentence structure type of the voice input content, the user's awareness of the category to which the voice input content belongs, where the awareness for voice input content of the professional sentence structure type is higher than the awareness for voice input content of the non-professional sentence structure type.
In this embodiment, sentence structure types are stored in the terminal in advance and may be expressed as regular expressions. A regular expression for the professional sentence structure type is, for example, adjective + noun + verb; a regular expression for the non-professional sentence structure type is, for example, pronoun + verb. Sentence structure types need not be expressed as regular expressions; any other representation that captures sentence structure may be used. For example, suppose the terminal receives the voice input content "What are the steps for powering on". By analyzing it, the terminal determines that its sentence structure is "adjective + noun + verb + pronoun", so the sentence structure type is the professional sentence structure type and the user's awareness of the category to which the voice input content belongs is relatively high. As another example, suppose the terminal receives the voice input content "How do I use this thing". By analyzing it, the terminal determines that its sentence structure is "pronoun + verb", so the sentence structure type is the non-professional sentence structure type and the user's awareness is relatively low.
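Steps A1-A2 can be sketched by matching regular expressions against a sequence of part-of-speech tags. The tag names, the two patterns, and the assumption that a tagger has already produced the tag sequence are illustrative; the specification gives only the informal patterns "adjective + noun + verb" and "pronoun + verb".

```python
import re

# Patterns over a space-separated tag sequence (assumed tag names).
PROFESSIONAL_PATTERN = re.compile(r"^ADJ NOUN VERB( \w+)*$")
NON_PROFESSIONAL_PATTERN = re.compile(r"^PRON VERB$")


def structure_type(pos_tags):
    """Classify a tag sequence as professional, non-professional, or None."""
    sequence = " ".join(pos_tags)
    if PROFESSIONAL_PATTERN.match(sequence):
        return "professional"
    if NON_PROFESSIONAL_PATTERN.match(sequence):
        return "non-professional"
    return None
```

A `None` result corresponds to input whose structure type cannot be determined, which other embodiments handle by falling back to the preset minimum awareness.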
In one embodiment, step S12 may be implemented as the following steps B1-B2:
Step B1: when it is determined that two consecutively received voice input contents were entered by the same user, determine the degree of association between them according to the keywords in the two voice input contents.
Step B2: determine, according to the degree of association between the two consecutively received voice input contents, the user's awareness of the category to which the voice input content belongs, where the higher the degree of association, the lower the awareness.
In this embodiment, when two consecutively received voice input contents were entered by the same user, the degree of association between them reflects how well the user understood the previous voice output content. The higher the degree of association, the less well the user understood the previous voice output content, and the lower the user's awareness of the category to which the voice input content belongs; the lower the degree of association, the better the user understood the previous voice output content, and the higher the awareness. For example, suppose the terminal previously received the voice input content "How do I set the air conditioner temperature" and now receives "How do I enter the temperature adjustment mode". When the terminal determines that both were entered by the same user, it may extract keywords from the two voice input contents, such as "air conditioner temperature" and "temperature adjustment mode", and use the degree of association between these keywords to determine the degree of association between the two voice input contents. Because "air conditioner temperature" and "temperature adjustment mode" are both temperature-related keywords, the degree of association between them is high. As another example, suppose the terminal previously received "How do I set the air conditioner temperature" and now receives "What are the steps for powering on". When the terminal determines that both were entered by the same user, it extracts the keywords "air conditioner temperature" and "powering on". Because these two keywords are of unrelated types, the degree of association between them is almost zero. The degree of association between the two voice input contents is therefore very low, which indicates that the user understood the previous voice output content well and that the user's awareness of the category to which the voice input content belongs is relatively high.
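Steps B1-B2 can be sketched by measuring the word overlap between the keywords of two consecutive inputs. The Jaccard overlap used below is a substitute chosen for illustration, since the specification does not say how the degree of association is computed; the mapping `1 - association` is likewise an assumption that only preserves the stated direction.

```python
def keyword_tokens(keywords):
    """Split keyword phrases into a set of lowercase words."""
    tokens = set()
    for keyword in keywords:
        tokens.update(keyword.lower().split())
    return tokens


def association_degree(prev_keywords, curr_keywords):
    """Jaccard overlap of the words in two keyword lists, in [0, 1]."""
    prev_tokens = keyword_tokens(prev_keywords)
    curr_tokens = keyword_tokens(curr_keywords)
    union = prev_tokens | curr_tokens
    if not union:
        return 0.0
    return len(prev_tokens & curr_tokens) / len(union)


def awareness_from_association(prev_keywords, curr_keywords):
    """Higher association between consecutive inputs means lower awareness."""
    return 1.0 - association_degree(prev_keywords, curr_keywords)
```

With the embodiment's examples, "air conditioner temperature" and "temperature adjustment mode" share the word "temperature" and so have a nonzero association, while "air conditioner temperature" and "power on" share nothing.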
In one embodiment, the ways of performing step S12 described in the above embodiments may be combined: several voice input parameters are used together, and the user's awareness of the category to which the voice input content belongs is calculated according to preset weights. Step S12 may therefore also be implemented as the following steps: determining, according to the voice input content, at least two voice input parameters of the voice input content, where the voice input parameters include the user's voiceprint information, the time interval between two consecutive voice input contents from the same user, the historical input record information corresponding to the user, the degree of matching between keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, and the degree of association between two consecutive voice input contents from the same user; and calculating the user's awareness of the category to which the voice input content belongs according to the preset weight of each voice input parameter.
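The weighted combination can be sketched as a weighted average of per-parameter awareness scores. Normalizing by the sum of the weights, the 0-to-1 score scale, and the parameter names in the dictionaries are assumptions; the specification states only that each parameter has a preset weight.

```python
def combined_awareness(scores, weights):
    """Weighted average of per-parameter awareness scores.

    scores:  mapping from parameter name to a score in [0, 1]
    weights: mapping from parameter name to its preset weight
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("the weights of the supplied parameters must sum to > 0")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight
```

For example, with scores `{"interval": 0.5, "keywords": 1.0}` and weights `{"interval": 1.0, "keywords": 3.0}`, the keyword score dominates the result.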
In one embodiment, the method further includes the following step: when the voice input parameters of the voice input content cannot be determined, determining that the user's awareness of the category to which the voice input content belongs is the preset minimum awareness. In this embodiment, when voice input content is received whose voice input parameters cannot be determined, the terminal may directly determine that the user's awareness of the category to which that voice input content belongs is the preset minimum awareness. Thus, even for voice input content whose voice input parameters cannot be determined, the user still receives matching voice output content, which improves the user experience.
In one embodiment, the method further includes the following step: storing the user's awareness of the category to which the voice input content belongs. In this case, step S12 may be implemented as the following steps: identifying the voiceprint information of the user; and querying, according to the voiceprint information of the user, the user's awareness of the category to which the voice input content belongs. In this embodiment, looking up the stored awareness makes it quicker and more convenient to determine the user's awareness of the category to which the voice input content belongs, so matching voice output content can be selected and output for the user more accurately and quickly.
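Storing and querying the awareness can be sketched as a small in-memory store keyed by the speaker and the category. The class name, the use of a voiceprint identifier as the key, and the `None` result for a missing entry are assumptions made for illustration.

```python
class AwarenessStore:
    """In-memory store of awareness values keyed by (voiceprint, category)."""

    def __init__(self):
        self._values = {}

    def save(self, voiceprint_id, category, awareness):
        """Store or overwrite the awareness for one speaker and category."""
        self._values[(voiceprint_id, category)] = awareness

    def lookup(self, voiceprint_id, category):
        """Return the stored awareness, or None if nothing was stored."""
        return self._values.get((voiceprint_id, category))
```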
In one embodiment, as shown in FIG. 6, step S13 may be implemented as the following steps S61-S63:
Step S61: determine the cognitive level corresponding to the awareness according to a correspondence between awareness and cognitive levels.
Step S62: acquire the voice output content corresponding to the cognitive level according to a correspondence between cognitive levels and voice output content.
Step S63: output the voice output content.
In this embodiment, the terminal stores in advance the correspondence between awareness and cognitive levels and the correspondence between cognitive levels and voice output content. For example, the cognitive levels may be divided as needed into a low, a medium, and a high cognitive level: an awareness of 0% to 30% corresponds to the low cognitive level, an awareness of 31% to 70% corresponds to the medium cognitive level, and an awareness of 71% to 100% corresponds to the high cognitive level. The voice output content for the low cognitive level is a detailed version, that for the medium cognitive level is a standard version, and that for the high cognitive level is a concise version. For each voice input content, the terminal stores the corresponding detailed, standard, and concise versions of the voice output content. For example, for the voice input content "How do I set the air conditioner temperature", the corresponding voice output content includes: the detailed version, "Press the mode button in the middle of the first row twice to enter the temperature adjustment mode, then press the '+/-' button on the left of the second row to change the temperature; each press changes the temperature by 1 degree"; the standard version, "Press the mode button to enter the temperature adjustment mode, then press the '+/-' button to change the temperature"; and the concise version, "First enter the temperature adjustment mode, then change the temperature". In addition, the cognitive level corresponding to the preset minimum awareness may be the low cognitive level. Therefore, for voice input content whose voice input parameters cannot be determined, or when voice input content is received from a user for the first time, the terminal may directly output the detailed version. With the technical solution of this embodiment, when outputting voice output content, the terminal can infer the user's current needs from the user's awareness of the category to which the voice input content belongs and output matching voice output content, so the user obtains more, and more accurate, information from it.
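Steps S61-S63 can be sketched as two table lookups using the example thresholds above. Treating any value above 30 and up to 70 as the medium level (the text lists 31% to 70%) is an assumption for non-integer percentages, and the table layout and function names are illustrative.

```python
def cognitive_level(awareness_pct):
    """Map an awareness percentage to a low, medium, or high cognitive level."""
    if not 0 <= awareness_pct <= 100:
        raise ValueError("awareness must be between 0 and 100")
    if awareness_pct <= 30:
        return "low"
    if awareness_pct <= 70:
        return "medium"
    return "high"


# Voice output versions per cognitive level, from the embodiment's example.
OUTPUTS = {
    "How do I set the air conditioner temperature": {
        "low": ("Press the mode button in the middle of the first row twice "
                "to enter the temperature adjustment mode, then press the "
                "'+/-' button on the left of the second row to change the "
                "temperature; each press changes the temperature by 1 degree"),
        "medium": ("Press the mode button to enter the temperature adjustment "
                   "mode, then press the '+/-' button to change the temperature"),
        "high": "First enter the temperature adjustment mode, then change the temperature",
    },
}


def select_output(voice_input, awareness_pct):
    """Return the stored output version that matches the user's level."""
    return OUTPUTS[voice_input][cognitive_level(awareness_pct)]
```

A first-time user, assigned the preset minimum awareness, would receive the detailed version under this mapping.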
对应于上述一种语音输出方法,本发明实施例还提供一种语音输出装置,该装置用以执行上述方法。Corresponding to the above-mentioned voice output method, the embodiment of the present invention further provides a voice output device, which is used to perform the above method.
图7为本发明实施例中一种语音输出装置的框图。如图7所示,该装置包括:FIG. 7 is a block diagram of a voice output device according to an embodiment of the present invention. As shown in Figure 7, the device includes:
a receiving module 71, configured to receive voice input content input by a user;
a determining module 72, configured to determine, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs, the cognition degree being the extent of the user's professional knowledge of the category;
an output module 73, configured to acquire and output, from at least one voice output content corresponding to the voice input content, the voice output content that matches the cognition degree.
In one embodiment, as shown in FIG. 8, the determining module 72 includes:
a first identification sub-module 721, configured to identify voiceprint information of the user;
a first judging sub-module 722, configured to judge, according to the voiceprint information, whether voice input content of the user is being received for the first time;
a second determining sub-module 723, configured to determine, when voice input content of the user is being received for the first time, that the user's cognition degree for the category to which the voice input content belongs is a preset minimum cognition degree.
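The first-time check performed by these sub-modules can be sketched as a small store keyed by voiceprint. This is only an illustration under stated assumptions: voiceprint recognition itself is out of scope, so a plain string ID stands in for the identified voiceprint, and `CognitionStore`, `PRESET_MIN_DEGREE`, and the value 0.0 are hypothetical.

```python
from typing import Dict, Set, Tuple

# Preset minimum cognition degree; the value 0.0 is an assumption.
PRESET_MIN_DEGREE = 0.0


class CognitionStore:
    """Tracks which voiceprints have been heard and their stored degrees."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self._degrees: Dict[Tuple[str, str], float] = {}

    def degree_for(self, voiceprint_id: str, category: str) -> float:
        """Return the stored degree, or the preset minimum for a new user."""
        if voiceprint_id not in self._seen:
            # First voice input from this user: fall back to the minimum.
            self._seen.add(voiceprint_id)
            return PRESET_MIN_DEGREE
        return self._degrees.get((voiceprint_id, category), PRESET_MIN_DEGREE)

    def store(self, voiceprint_id: str, category: str, degree: float) -> None:
        """Save the user's cognition degree for a category."""
        self._seen.add(voiceprint_id)
        self._degrees[(voiceprint_id, category)] = degree
```

The same store also covers the later embodiment in which a previously saved cognition degree is queried by voiceprint.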
In one embodiment, the apparatus further includes:
a recording module, configured to record the input time and the usage duration of the voice input content, the usage duration being the time elapsed between receiving the voice input content and outputting the voice output content.
In one embodiment, as shown in FIG. 9, the determining module 72 includes:
a second identification sub-module 724, configured to identify voiceprint information of the user;
a second judging sub-module 725, configured to judge, according to the voiceprint information of the user, whether two consecutively received voice input contents were input by the same user;
a first calculation sub-module 726, configured to calculate, when the two consecutively received voice input contents were input by the same user, the time interval between them according to their input times and usage durations;
a third determining sub-module 727, configured to determine, according to the time interval, the user's cognition degree for the category to which the voice input content belongs, wherein the longer the time interval, the lower the cognition degree.
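The interval calculation above might look like the following sketch. The text does not give a formula, so two things here are assumptions: the interval is taken as the gap between the end of the previous interaction (its input time plus its usage duration) and the start of the current one, and the interval is mapped to a cognition degree by a linear decrease capped at `max_interval`. The names `InputRecord`, `interval_between`, and `degree_from_interval` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputRecord:
    voiceprint_id: str     # stand-in for the identified voiceprint
    input_time: float      # seconds since an arbitrary epoch
    usage_duration: float  # seconds from receiving input to outputting the reply


def interval_between(prev: InputRecord, curr: InputRecord) -> Optional[float]:
    """Gap in seconds between two consecutive inputs from the same user.

    Returns None when the inputs come from different users.
    """
    if prev.voiceprint_id != curr.voiceprint_id:
        return None
    gap = curr.input_time - (prev.input_time + prev.usage_duration)
    return max(0.0, gap)


def degree_from_interval(interval: float, max_interval: float = 3600.0) -> float:
    """Longer interval gives a lower cognition degree (linear, assumed)."""
    clipped = min(max(interval, 0.0), max_interval)
    return 100.0 * (1.0 - clipped / max_interval)
```

Any monotonically decreasing mapping would satisfy the stated relationship; the linear form is chosen only for readability.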
In one embodiment, as shown in FIG. 10, the determining module 72 includes:
a third identification sub-module 728, configured to identify voiceprint information of the user;
a first acquiring sub-module 729, configured to acquire, according to the voiceprint information of the user, historical input record information corresponding to the user, the historical input record information including at least one of historical cumulative usage time, historical cumulative number of inputs, and historical input frequency;
a fourth determining sub-module 7210, configured to determine, according to the historical input record information, the user's cognition degree for the category to which the voice input content belongs, wherein the longer the historical cumulative usage time, the higher the cognition degree; the greater the historical cumulative number of inputs, the higher the cognition degree; and the higher the historical input frequency, the higher the cognition degree.
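One way to realise the three monotonic relationships above is to scale each available history item against a cap and average the results. This is a sketch only: the caps, the averaging, and the names `degree_from_history` and `_saturate` are assumptions, since the text specifies the direction of each relationship but not a formula.

```python
from typing import Optional


def _saturate(value: float, cap: float) -> float:
    """Scale value into [0, 1], saturating at cap."""
    return min(max(value, 0.0), cap) / cap


def degree_from_history(
    total_usage_seconds: Optional[float] = None,
    total_inputs: Optional[int] = None,
    inputs_per_day: Optional[float] = None,
    usage_cap: float = 36000.0,   # assumed: 10 hours counts as fully familiar
    inputs_cap: float = 100.0,    # assumed cap on cumulative inputs
    frequency_cap: float = 10.0,  # assumed cap on inputs per day
) -> Optional[float]:
    """Cognition degree (0-100) from whichever history items are available.

    Each item raises the degree as it grows; returns None when no
    history item is available at all.
    """
    parts = []
    if total_usage_seconds is not None:
        parts.append(_saturate(total_usage_seconds, usage_cap))
    if total_inputs is not None:
        parts.append(_saturate(float(total_inputs), inputs_cap))
    if inputs_per_day is not None:
        parts.append(_saturate(inputs_per_day, frequency_cap))
    if not parts:
        return None
    return 100.0 * sum(parts) / len(parts)
```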
In one embodiment, as shown in FIG. 11, the determining module 72 includes:
an extraction sub-module 7211, configured to extract keywords from the voice input content;
a fifth determining sub-module 7212, configured to determine the degree of matching between the keywords in the voice input content and preset keywords;
a sixth determining sub-module 7213, configured to determine, according to the degree of matching between the keywords in the voice input content and the preset keywords, the user's cognition degree for the category to which the voice input content belongs, wherein the higher the degree of matching with professional keywords among the preset keywords, the higher the cognition degree, and the higher the degree of matching with non-professional keywords among the preset keywords, the lower the cognition degree.
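The keyword-matching embodiment can be sketched as follows. Everything concrete here is an assumption made for illustration: the keyword lists are invented, keyword extraction is reduced to lowercased whitespace tokenisation, the degree of matching is the fraction of extracted keywords found in a preset list, and the two matching degrees are combined around a neutral midpoint of 50.

```python
from typing import Iterable, List, Set

# Hypothetical preset keyword lists for an "air conditioner" category.
PROFESSIONAL: Set[str] = {"thermostat", "compressor", "dehumidify"}
NON_PROFESSIONAL: Set[str] = {"thing", "button", "colder"}


def extract_keywords(text: str) -> List[str]:
    """Naive stand-in for keyword extraction: lowercase word tokens."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def match_degree(keywords: Iterable[str], preset: Set[str]) -> float:
    """Fraction of extracted keywords that appear in the preset list."""
    keywords = list(keywords)
    if not keywords:
        return 0.0
    return sum(1 for k in keywords if k in preset) / len(keywords)


def degree_from_keywords(text: str) -> float:
    """Professional matches raise the degree; non-professional ones lower it."""
    keywords = extract_keywords(text)
    pro = match_degree(keywords, PROFESSIONAL)
    lay = match_degree(keywords, NON_PROFESSIONAL)
    return 50.0 + 50.0 * (pro - lay)
```

A real system would more likely run keyword extraction on the recognised text with a proper tokenizer; the combination rule only needs to preserve the two stated directions.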
In one embodiment, the determining module 72 includes:
a seventh determining sub-module, configured to determine the sentence structure type of the voice input content, the sentence structure type being either a professional sentence structure type or a non-professional sentence structure type;
an eighth determining sub-module, configured to determine, according to the sentence structure type of the voice input content, the user's cognition degree for the category to which the voice input content belongs, wherein the user's cognition degree for the category of a voice input content of the professional sentence structure type is higher than that for the category of a voice input content of the non-professional sentence structure type.
In one embodiment, the determining module 72 includes:
a ninth determining sub-module, configured to determine, when two consecutively received voice input contents are judged to have been input by the same user, the degree of association between them according to the keywords they contain;
a tenth determining sub-module, configured to determine, according to the degree of association between the two consecutively received voice input contents, the user's cognition degree for the category to which the voice input content belongs, wherein the higher the degree of association, the lower the cognition degree.
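The degree of association between two consecutive inputs could be measured in many ways; the sketch below uses the Jaccard similarity of their keyword sets, which is a substitute chosen for illustration and not a measure named in the text. The linear mapping to a cognition degree and the names `association` and `degree_from_association` are likewise assumptions.

```python
from typing import Set


def association(prev_keywords: Set[str], curr_keywords: Set[str]) -> float:
    """Jaccard similarity of two keyword sets, in [0, 1]."""
    union = prev_keywords | curr_keywords
    if not union:
        return 0.0
    return len(prev_keywords & curr_keywords) / len(union)


def degree_from_association(prev_keywords: Set[str], curr_keywords: Set[str]) -> float:
    """Higher association between consecutive inputs gives a lower degree."""
    return 100.0 * (1.0 - association(prev_keywords, curr_keywords))
```

The intuition matching the text is that a user who immediately asks a closely related follow-up question probably did not fully understand the previous answer.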
In one embodiment, the determining module 72 includes:
an eleventh determining sub-module, configured to determine, according to the voice input content, at least two voice input parameters of the voice input content, the voice input parameters including: the voiceprint information of the user, the time interval between two consecutive voice input contents input by the same user, the historical input record information corresponding to the user, the degree of matching between the keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, and the degree of association between two consecutive voice input contents input by the same user;
a calculation sub-module, configured to calculate the user's cognition degree for the category to which the voice input content belongs according to a preset weight for each voice input parameter.
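The weighted calculation can be sketched as a weighted average over whichever parameters could be determined. The weights below are invented for illustration, each parameter is assumed to have already been converted to a partial cognition degree on a 0-100 scale, and renormalising over the available parameters is an assumption; the text says only that each parameter has a preset weight. The fallback to a preset minimum when nothing can be determined follows the embodiment described in the text; this sketch does not enforce the "at least two parameters" condition.

```python
from typing import Dict, Mapping, Optional

# Preset minimum cognition degree; the value 0.0 is an assumption.
PRESET_MIN_DEGREE = 0.0

# Hypothetical preset weights, one per voice input parameter.
WEIGHTS: Dict[str, float] = {
    "interval": 0.2,
    "history": 0.3,
    "keywords": 0.3,
    "structure": 0.1,
    "association": 0.1,
}


def combine(
    partial_degrees: Mapping[str, Optional[float]],
    weights: Mapping[str, float] = WEIGHTS,
) -> float:
    """Weighted average of the partial degrees that could be determined.

    Parameters whose value is None (undetermined) or that have no preset
    weight are skipped; if none remain, the preset minimum is returned.
    """
    available = {
        name: value
        for name, value in partial_degrees.items()
        if value is not None and name in weights
    }
    if not available:
        return PRESET_MIN_DEGREE
    total_weight = sum(weights[name] for name in available)
    weighted_sum = sum(weights[name] * value for name, value in available.items())
    return weighted_sum / total_weight
```

For instance, with only the history and keyword parameters available at 80 and 40, the equal weights of 0.3 each yield a combined degree of about 60.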
In one embodiment, the determining module 72 includes:
a twelfth determining sub-module, configured to determine, when the voice input parameters of the voice input content cannot be determined, that the user's cognition degree for the category to which the voice input content belongs is the preset minimum cognition degree.
In one embodiment, as shown in FIG. 12, the output module 73 includes:
a thirteenth determining sub-module 731, configured to determine the cognition level corresponding to the cognition degree according to the correspondence between cognition degrees and cognition levels;
a second acquiring sub-module 732, configured to acquire the voice output content corresponding to the cognition level according to the correspondence between cognition levels and voice output content;
an output sub-module 733, configured to output the voice output content.
In one embodiment, as shown in FIG. 13, the apparatus further includes:
an updating module 74, configured to update the historical input record information according to the input time and the usage duration of the voice input content;
a storage module 75, configured to store the user's cognition degree for the category to which the voice input content belongs.
In one embodiment, the determining module 72 includes:
a fourth identification sub-module, configured to identify voiceprint information of the user;
a query sub-module, configured to query, according to the voiceprint information of the user, the user's cognition degree for the category to which the voice input content belongs.
With the apparatus provided by the embodiments of the present invention, voice output content matching the user's cognition degree for the category of the input voice content can be selected and output, so that the voice output better meets the user's needs. This gives the user a more personalized voice output function, improves the accuracy of the voice output, lets the user obtain the greatest amount of information from the voice output content, and improves the user experience.
FIG. 14 is a block diagram of an apparatus capable of executing the voice output method, according to an exemplary embodiment. For example, the apparatus 1600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to FIG. 14, the apparatus 1600 may include one or more of the following components: a processor 1601, a memory 1602, and a communication component 1603.
The processor 1601 generally controls the overall operation of the apparatus 1600, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processor 1601 can execute instructions to perform all or some of the steps of the method described above.
The memory 1602 is configured to store various types of data to support operation of the apparatus 1600. Examples of such data include instructions for any application or method operating on the apparatus 1600, contact data, phone book data, messages, pictures, videos, and the like. The memory 1602 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The communication component 1603 is configured to facilitate wired or wireless communication between the apparatus 1600 and other devices. The apparatus 1600 can access a wireless network based on a communication standard, such as Wi-Fi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1603 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1603 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 1600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the voice output method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 1602 including instructions, which are executable by the processor 1601 of the apparatus 1600 to perform the voice output method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
The present invention further provides a non-transitory computer-readable recording medium on which a computer program is recorded, the program including instructions for executing the voice output method according to the above embodiments of the present invention.
The present invention further provides a computer program including instructions for executing, when the program is run by a computer, the voice output method according to the above embodiments of the present invention.
Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (20)

  1. A voice output method, comprising:
    receiving voice input content input by a user;
    determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs, the cognition degree being the extent of the user's professional knowledge of the category;
    acquiring and outputting, from at least one voice output content corresponding to the voice input content, the voice output content that matches the cognition degree.
  2. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    identifying voiceprint information of the user;
    judging, according to the voiceprint information, whether voice input content of the user is being received for the first time;
    when voice input content of the user is being received for the first time, determining that the user's cognition degree for the category to which the voice input content belongs is a preset minimum cognition degree.
  3. The method according to claim 1, further comprising:
    recording the input time and the usage duration of the voice input content, the usage duration being the time elapsed between receiving the voice input content and outputting the voice output content.
  4. The method according to claim 3, wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    identifying voiceprint information of the user;
    judging, according to the voiceprint information of the user, whether two consecutively received voice input contents were input by the same user;
    when the two consecutively received voice input contents were input by the same user, calculating the time interval between them according to their input times and usage durations;
    determining, according to the time interval, the user's cognition degree for the category to which the voice input content belongs, wherein the longer the time interval, the lower the cognition degree.
  5. The method according to claim 3, wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    identifying voiceprint information of the user;
    acquiring, according to the voiceprint information of the user, historical input record information corresponding to the user, the historical input record information including at least one of historical cumulative usage time, historical cumulative number of inputs, and historical input frequency;
    determining, according to the historical input record information, the user's cognition degree for the category to which the voice input content belongs, wherein the longer the historical cumulative usage time, the higher the cognition degree; the greater the historical cumulative number of inputs, the higher the cognition degree; and the higher the historical input frequency, the higher the cognition degree.
  6. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    extracting keywords from the voice input content;
    determining the degree of matching between the keywords in the voice input content and preset keywords;
    determining, according to the degree of matching between the keywords in the voice input content and the preset keywords, the user's cognition degree for the category to which the voice input content belongs, wherein the higher the degree of matching with professional keywords among the preset keywords, the higher the cognition degree, and the higher the degree of matching with non-professional keywords among the preset keywords, the lower the cognition degree.
  7. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    determining the sentence structure type of the voice input content, the sentence structure type being either a professional sentence structure type or a non-professional sentence structure type;
    determining, according to the sentence structure type of the voice input content, the user's cognition degree for the category to which the voice input content belongs, wherein the user's cognition degree for the category of a voice input content of the professional sentence structure type is higher than that for the category of a voice input content of the non-professional sentence structure type.
  8. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    when two consecutively received voice input contents are judged to have been input by the same user, determining the degree of association between them according to the keywords they contain;
    determining, according to the degree of association between the two consecutively received voice input contents, the user's cognition degree for the category to which the voice input content belongs, wherein the higher the degree of association, the lower the cognition degree.
  9. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    determining, according to the voice input content, at least two voice input parameters of the voice input content, the voice input parameters including: the voiceprint information of the user, the time interval between two consecutive voice input contents input by the same user, the historical input record information corresponding to the user, the degree of matching between the keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, and the degree of association between two consecutive voice input contents input by the same user;
    calculating the user's cognition degree for the category to which the voice input content belongs according to a preset weight for each voice input parameter.
  10. The method according to claim 9, wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    when the voice input parameters of the voice input content cannot be determined, determining that the user's cognition degree for the category to which the voice input content belongs is the preset minimum cognition degree.
  11. The method according to claim 1, wherein acquiring and outputting, from at least one voice output content corresponding to the voice input content, the voice output content that matches the cognition degree comprises:
    determining the cognition level corresponding to the cognition degree according to the correspondence between cognition degrees and cognition levels;
    acquiring the voice output content corresponding to the cognition level according to the correspondence between cognition levels and voice output content;
    outputting the voice output content.
  12. The method according to claim 5, further comprising:
    updating the historical input record information according to the input time and the usage duration of the voice input content.
  13. The method according to claim 1, further comprising:
    storing the user's cognition degree for the category to which the voice input content belongs;
    wherein determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs comprises:
    identifying voiceprint information of the user;
    querying, according to the voiceprint information of the user, the user's cognition degree for the category to which the voice input content belongs.
  14. A voice output apparatus, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute a voice output method, the method comprising:
    receiving voice input content input by a user;
    determining, according to the voice input content, the user's cognition degree for the category to which the voice input content belongs, the cognition degree being the extent of the user's professional knowledge of the category;
    acquiring and outputting, from at least one voice output content corresponding to the voice input content, the voice output content that matches the cognition degree.
  15. The apparatus according to claim 14, wherein the processor is further configured to:
    identify voiceprint information of the user;
    judge, according to the voiceprint information, whether voice input content of the user is being received for the first time;
    when voice input content of the user is being received for the first time, determine that the user's cognition degree for the category to which the voice input content belongs is a preset minimum cognition degree.
  16. The apparatus according to claim 14, wherein the processor is further configured to:
    record the input time and the usage duration of the voice input content, the usage duration being the time elapsed between receiving the voice input content and outputting the voice output content.
  17. The apparatus according to claim 16, wherein the processor is further configured to:
    identify voiceprint information of the user;
    judge, according to the voiceprint information of the user, whether two consecutively received voice input contents were input by the same user;
    when the two consecutively received voice input contents were input by the same user, calculate the time interval between them according to their input times and usage durations;
    determine, according to the time interval, the user's cognition degree for the category to which the voice input content belongs, wherein the longer the time interval, the lower the cognition degree.
  18. The apparatus of claim 16, wherein the processor is further configured to:
    identify voiceprint information of the user;
    acquire, according to the voiceprint information of the user, historical input record information corresponding to the user, the historical input record information including at least one of a historical accumulated use time, a historical accumulated number of inputs, and a historical input frequency;
    determine, according to the historical input record information, the user's awareness of the category to which the voice input content belongs; wherein the longer the historical accumulated use time, the higher the awareness; the greater the historical accumulated number of inputs, the higher the awareness; and the higher the historical input frequency, the higher the awareness.
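The history-based rule of claim 18 only requires that awareness rise with each of the three signals. A minimal sketch under that constraint follows; the saturating map, the averaging, and the scale constants (one hour, 50 inputs, 5 inputs per day) are all assumptions chosen for illustration and are not taken from the patent.

```python
from typing import Optional


def _saturate(value: float, scale: float) -> float:
    """Monotonically increasing map from [0, inf) to [0, 1)."""
    return value / (value + scale)


def awareness_from_history(
    total_use_seconds: Optional[float] = None,
    total_inputs: Optional[int] = None,
    inputs_per_day: Optional[float] = None,
) -> float:
    """Average the available history signals; each raises awareness as it grows."""
    scores = []
    if total_use_seconds is not None:
        scores.append(_saturate(total_use_seconds, 3600.0))  # assumed scale: 1 hour
    if total_inputs is not None:
        scores.append(_saturate(total_inputs, 50.0))  # assumed scale: 50 inputs
    if inputs_per_day is not None:
        scores.append(_saturate(inputs_per_day, 5.0))  # assumed scale: 5 per day
    if not scores:
        raise ValueError("at least one history signal is required")
    return sum(scores) / len(scores)
```

Making every argument optional mirrors the "at least one of" wording: the function works with whichever history fields have been recorded for the user.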
  19. The apparatus of claim 14, wherein the processor is further configured to:
    extract keywords from the voice input content;
    determine a degree of matching between the keywords in the voice input content and preset keywords;
    determine, according to the degree of matching between the keywords in the voice input content and the preset keywords, the user's awareness of the category to which the voice input content belongs; wherein the higher the degree of matching between the keywords in the voice input content and professional keywords among the preset keywords, the higher the awareness; and the higher the degree of matching between the keywords in the voice input content and non-professional keywords among the preset keywords, the lower the awareness.
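The keyword rule of claim 19 can be illustrated with a small sketch. It assumes keywords have already been extracted as a list of strings, defines the degree of matching as the fraction of input keywords found in a preset set, and combines the two degrees into an awareness (recognition degree) in [0, 1]. The function names and the combining formula are hypothetical; the claim fixes only the direction of each effect.

```python
from typing import Iterable, Set


def match_degree(keywords: Iterable[str], preset: Set[str]) -> float:
    """Fraction of the input keywords that appear in the preset set."""
    words = list(keywords)
    if not words:
        return 0.0
    return sum(1 for w in words if w in preset) / len(words)


def awareness_from_keywords(
    keywords: Iterable[str], professional: Set[str], lay: Set[str]
) -> float:
    """Rise with professional matches, fall with lay matches; result is in [0, 1]."""
    words = list(keywords)
    pro = match_degree(words, professional)
    non_pro = match_degree(words, lay)
    return 0.5 + 0.5 * (pro - non_pro)
```

With a hypothetical professional set `{"myocardial", "infarction"}` and lay set `{"heart", "attack"}`, an input made up only of the professional terms scores 1.0, one made up only of the lay terms scores 0.0, and an input matching neither stays at the neutral 0.5.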
  20. A non-transitory computer-readable recording medium having a computer program recorded thereon, the program comprising instructions for executing a voice output method, the method comprising: receiving voice input content input by a user;
    determining, according to the voice input content, the user's awareness of the category to which the voice input content belongs, the awareness being the degree of the user's professional knowledge of the category;
    acquiring and outputting, from at least one voice output content corresponding to the voice input content, the voice output content that matches the awareness.
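The final step, choosing among the candidate outputs the one that matches the user's awareness (recognition degree), might look like the sketch below. It assumes each candidate reply is stored under the awareness level it targets and that "matching" means nearest level; both the data layout and the name `select_output` are hypothetical.

```python
from typing import Dict


def select_output(candidates: Dict[float, str], awareness: float) -> str:
    """Pick the candidate whose target awareness is closest to the user's."""
    if not candidates:
        raise ValueError("at least one candidate output is required")
    best_level = min(candidates, key=lambda level: abs(level - awareness))
    return candidates[best_level]
```

For example, with candidates keyed at 0.0, 0.5, and 1.0, a user whose awareness is estimated at 0.9 receives the reply stored under 1.0. When only one candidate exists it is always returned, which is consistent with the "at least one" wording.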
PCT/CN2016/082427 2015-09-08 2016-05-18 Voice output method and device WO2017041510A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201680002958.1A CN107077845B (en) 2015-09-08 2016-05-18 Voice output method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510568430.8A CN105304082B (en) 2015-09-08 2015-09-08 A kind of speech output method and device
CN201510568430.8 2015-09-08

Publications (1)

Publication Number Publication Date
WO2017041510A1 (en) 2017-03-16

Family

ID=55201255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/082427 WO2017041510A1 (en) 2015-09-08 2016-05-18 Voice output method and device

Country Status (2)

Country Link
CN (2) CN105304082B (en)
WO (1) WO2017041510A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105304082B (en) * 2015-09-08 2018-12-28 北京云知声信息技术有限公司 A kind of speech output method and device
CN106251862A (en) * 2016-07-19 2016-12-21 东莞市优陌儿智护电子科技有限公司 The implementation method of complete semantic intelligence intercommunication and system thereof
CN106649698B (en) * 2016-12-19 2020-12-22 宇龙计算机通信科技(深圳)有限公司 Information processing method and information processing device
CN107767869B (en) * 2017-09-26 2021-03-12 百度在线网络技术(北京)有限公司 Method and apparatus for providing voice service
CN107863108B (en) * 2017-11-16 2021-03-23 百度在线网络技术(北京)有限公司 Information output method and device
CN110018843B (en) * 2018-01-09 2022-08-30 北京小度互娱科技有限公司 Method and device for testing application program operation strategy
CN110619870B (en) * 2018-06-04 2022-05-06 佛山市顺德区美的电热电器制造有限公司 Man-machine conversation method and device, household appliance and computer storage medium
CN109035896B (en) * 2018-08-13 2021-11-05 广东小天才科技有限公司 Oral training method and learning equipment
CN109036386B (en) * 2018-09-14 2021-03-16 北京网众共创科技有限公司 Voice processing method and device
CN109766411A (en) * 2019-01-14 2019-05-17 广东小天才科技有限公司 A kind of method and system of the parsing of search problem
CN111782782B (en) * 2020-06-09 2023-04-18 苏宁金融科技(南京)有限公司 Consultation reply method and device for intelligent customer service, computer equipment and storage medium
CN114398514B (en) * 2021-12-24 2022-11-22 北京达佳互联信息技术有限公司 Video display method and device and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005215689A (en) * 2004-02-02 2005-08-11 Fuji Xerox Co Ltd Method and system for recognizing information from information source
US20090216757A1 (en) * 2008-02-27 2009-08-27 Robi Sen System and Method for Performing Frictionless Collaboration for Criteria Search
CN101616221A (en) * 2008-06-25 2009-12-30 富士通株式会社 Guidance information display unit and guidance information display packing
CN103000173A (en) * 2012-12-11 2013-03-27 优视科技有限公司 Voice interaction method and device
CN103578469A (en) * 2012-08-08 2014-02-12 百度在线网络技术(北京)有限公司 Method and device for showing voice recognition result
CN105304082A (en) * 2015-09-08 2016-02-03 北京云知声信息技术有限公司 Voice output method and voice output device

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1553381A (en) * 2003-05-26 2004-12-08 杨宏惠 Multi-language correspondent list style language database and synchronous computer inter-transtation and communication
JP2006208483A (en) * 2005-01-25 2006-08-10 Sony Corp Device, method, and program for assisting survey of interesting matter of listener, and recording medium
CN1838159B (en) * 2006-02-14 2010-08-11 北京未名博思生物智能科技开发有限公司 Cognition logic machine and its information processing method
US7725308B2 (en) * 2006-06-07 2010-05-25 Motorola, Inc. Interactive tool for semi-automatic generation of a natural language grammar from a device descriptor
CN101304457A (en) * 2007-05-10 2008-11-12 许罗迈 Method and apparatus for implementing automatic spoken language training based on voice telephone
US9418661B2 (en) * 2011-05-12 2016-08-16 Johnson Controls Technology Company Vehicle voice recognition systems and methods
KR101307578B1 (en) * 2012-07-18 2013-09-12 티더블유모바일 주식회사 System for supplying a representative phone number information with a search function
CN103680222B (en) * 2012-09-19 2017-10-24 镇江诺尼基智能技术有限公司 Children stories question and answer exchange method
US9269354B2 (en) * 2013-03-11 2016-02-23 Nuance Communications, Inc. Semantic re-ranking of NLU results in conversational dialogue applications
CN103594086B (en) * 2013-10-25 2016-08-17 海菲曼(天津)科技有限公司 Speech processing system, device and method
CN104637007A (en) * 2013-11-07 2015-05-20 大连东方之星信息技术有限公司 Statistical analysis system employing degree-of-cognition system
CN104408099B (en) * 2014-11-18 2019-03-12 百度在线网络技术(北京)有限公司 Searching method and device
CN104574251A (en) * 2015-01-06 2015-04-29 熊国顺 Intelligent public safety information system and application method

Also Published As

Publication number Publication date
CN105304082B (en) 2018-12-28
CN107077845B (en) 2020-07-17
CN105304082A (en) 2016-02-03
CN107077845A (en) 2017-08-18

Similar Documents

Publication Publication Date Title
WO2017041510A1 (en) Voice output method and device
WO2017143672A1 (en) Information processing method and device based on voice input
TWI511124B (en) Selection method based on speech recognition and mobile terminal device and information system using the same
WO2021164619A1 (en) Group display method and device
WO2019154153A1 (en) Message processing method, unread message display method and computer terminal
US11264021B2 (en) Method for intent-based interactive response and electronic device thereof
CN104951335B (en) The processing method and processing device of application program installation kit
KR20200017249A (en) Apparatus and method for providing feedback for confirming intent of a user in an electronic device
CN104035995B (en) Group's label generating method and device
WO2017092122A1 (en) Similarity determination method, device, and terminal
TW201426359A (en) Characteristics database, method for returning answer, natural language dialog method and system thereof
TW201426358A (en) Method for correcting speech response and natural language dialog system
WO2016082513A1 (en) Method and device for prompting call request
CN104394137B (en) A kind of method and device of prompting voice call
WO2017032084A1 (en) Information output method and apparatus
EP3767488A1 (en) Method and device for processing untagged data, and storage medium
CN105139848B (en) Data transfer device and device
CN111984180B (en) Terminal screen reading method, device, equipment and computer readable storage medium
CN111428032B (en) Content quality evaluation method and device, electronic equipment and storage medium
CN106777016A (en) The method and device of information recommendation is carried out based on instant messaging
CN105187622A (en) Information prompt method and information prompt device
CN105357388B (en) A kind of method and electronic equipment of information recommendation
CN103970831B (en) Recommend the method and apparatus of icon
CN109068005B (en) Method and device for creating timing reminding event
CN104301488B (en) A kind of dialing record generation method, equipment and mobile terminal

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16843448

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16843448

Country of ref document: EP

Kind code of ref document: A1