CN101286161B - Intelligent Chinese request-answering system based on concept - Google Patents

Publication number: CN101286161B
Authority: CN (China)
Prior art keywords: module, similarity, question, sentence, speech
Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN2008100478554A
Other languages: Chinese (zh)
Other versions: CN101286161A (en)
Inventors: 张茂元, 邹春燕, 杨付全, 卢正鼎, 赵冰心, 余毅, 刘明
Current Assignee: Huazhong University of Science and Technology (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Huazhong University of Science and Technology
Application filed by: Huazhong University of Science and Technology
Priority: CN2008100478554A
Publications: CN101286161A (application), CN101286161B (grant)

Abstract

The invention discloses a concept-based Chinese question answering system, which mainly comprises a data server, a question preprocessing module, a candidate question set extraction module and a question similarity calculation module. The invention aims to provide a concept-based question answering system that applies synonym expansion to the keywords extracted from the user's question, so that the question is understood better before retrieval and the recall of the system improves. The system also uses a concept-based Chinese sentence similarity calculation method that considers three aspects, namely word form, word order and sentence length, which improves retrieval precision. In addition, the system adopts an efficient retrieval technique to extract the candidate question set rapidly, calculates the similarity of each candidate question, sorts the candidates quickly and returns the sorted questions and answers to the user. The system thus understands the user's question more precisely at the concept level and finds accurate answers. Experiments show that the system achieves high recall and precision.

Description

A concept-based intelligent Chinese question answering system
Technical field
The invention belongs to the field of information retrieval technology and specifically relates to a concept-based question answering retrieval system. A question answering system improves on an information retrieval system and is an advanced form of information retrieval: it answers questions that the user poses in natural language, using accurate and concise language.
Background
In the 21st century, people have formally entered the information age, and the demand for online information grows daily. However, the Web's inherent large scale, heterogeneity, distribution and dynamism, together with the large amount of unorganized and invalid data on it, reduce the efficiency with which people can use this wealth of information, and the phenomenon of "information overload" has appeared. In recent years, the rapid development of networks and information technology, together with people's wish to obtain information more quickly, has driven the development of automatic question answering technology. A growing number of companies and research institutions take part in automatic question answering research; notable among them are Microsoft, IBM, MIT and the University of Zurich. The well-known Text REtrieval Conference (TREC) in the United States established its QA Track in 1999, providing an evaluation platform for question answering systems. Some relatively mature question answering systems have been developed abroad. In China, some universities and research institutions also study automatic question answering systems, including the Institute of Computing Technology of the Chinese Academy of Sciences, Harbin Institute of Technology, Fudan University, Beijing Institute of Technology and the Hong Kong University of Science and Technology. In general, however, few research institutions take part in Chinese automatic question answering research, and there is essentially no mature Chinese natural language question answering system.
A question answering system is a computer program that can answer questions which a computer user enters in natural language. It integrates natural language processing, information retrieval and knowledge representation, and is increasingly a focus of research worldwide. It lets the user ask in natural language and returns a concise, accurate answer rather than a list of related web pages. Compared with a traditional search engine that relies on keyword matching, a question answering system therefore satisfies the user's search needs better and finds the required answer more accurately, and it is convenient, fast and efficient.
The human-machine interface, accuracy and real-time performance are the three main research and development goals of a Chinese natural language question answering system, and accuracy is the primary one. To reach this goal, the user's question must be processed correctly: word segmentation and part-of-speech tagging, synonym expansion, named entity tagging, syntactic analysis, answer type tagging and so on. For a question answering system based on a frequently asked question base, the calculation of the similarity between the user's question and the questions in the question base is the core of the system, and the accuracy and efficiency of the calculation method determine the accuracy and efficiency of the whole system.
Summary of the invention
The object of the present invention is to provide a concept-based intelligent Chinese question answering system that has high recall and precision.
The concept-based intelligent Chinese question answering system provided by the invention comprises a data server, an input module and a display module, and is characterized in that it further comprises a question preprocessing module, a candidate question set extraction module and a question similarity calculation module;
The data server stores the corpus, the index database, the XML document and the question base;
The input module receives the question entered by the user, checks that the input question is well formed, and submits correctly formatted questions to the question preprocessing module;
The question preprocessing module receives the question passed on by the input module, calls the knowledge base and the rule base in the data server to preprocess it, and passes the result to the candidate question set extraction module and to the question similarity calculation module;
The candidate question set extraction module rapidly extracts the candidate question set from the preprocessing result provided by the question preprocessing module, providing the question similarity calculation module with the objects to be compared;
The question similarity calculation module computes the similarity between the retrieval question and each question in the candidate question set. The Chinese sentence similarity calculation first applies synonym expansion to the keyword string of the retrieval question. Using the expansion result, it calls the word-form similarity calculation method, then the word order similarity calculation method and the length similarity calculation method, to obtain the word-form similarity, the word order similarity and the length similarity respectively. The three values are then weighted to give the final similarity of the question;
The word-form similarity calculation method computes the word-form similarity Simword according to formula (I):
$$\mathrm{Simword}(S_1,S_2)=\frac{2\,\bigl(\lambda_1\,\mathrm{SameWord}(S_1,S_2)+\lambda_2\,\mathrm{SimWord}(S_1,S_2)\bigr)}{\mathrm{Len}(S_1)+\mathrm{Len}(S_2)} \tag{I}$$
In the formula, S1 and S2 are two sentences, SameWord(S1, S2) is the number of identical words contained in S1 and S2, SimWord(S1, S2) is the number of synonyms contained in S1 and S2, and λ1 and λ2 express the importance of SameWord(S1, S2) and SimWord(S1, S2) respectively;
The word order similarity calculation method computes the word order similarity Simord according to formula (II). Writing $m=\lvert\lambda_1\,\mathrm{OnceSameWord}(S_1,S_2)+\lambda_2\,\mathrm{OnceSimWord}(S_1,S_2)\rvert$:
$$\mathrm{Simord}(S_1,S_2)=\begin{cases}1-\dfrac{\mathrm{RevOrd}(S_1,S_2)}{m-1} & \text{in all other cases}\\[4pt] 1 & \text{if } m=1\\ 0 & \text{if } m=0\end{cases} \tag{II}$$
In the formula, S1 and S2 are two sentences; OnceSameWord(S1, S2) is the set of identical words that occur exactly once in both S1 and S2; OnceSimWord(S1, S2) is the set of synonyms that occur exactly once in S1 and S2; Pfirst(S1, S2) is the vector formed by the position numbers in S1 of the words in OnceSameWord(S1, S2) and OnceSimWord(S1, S2); Psecond(S1, S2) is the vector obtained by arranging the components of Pfirst(S1, S2) in the order in which the corresponding words appear in S2; and RevOrd(S1, S2) is the number of inversions between adjacent components of Psecond(S1, S2);
The length similarity calculation method computes the sentence length similarity Simlen according to formula (III):
$$\mathrm{Simlen}(S_1,S_2)=1-\frac{\lvert\mathrm{Len}(S_1)-\mathrm{Len}(S_2)\rvert}{\mathrm{Len}(S_1)+\mathrm{Len}(S_2)} \tag{III}$$
Len(S1) and Len(S2) denote the lengths of sentence S1 and sentence S2 respectively, and the vertical bars denote the absolute value;
According to the result of the question similarity calculation module, the display module returns the answers of the corresponding questions in the question base, together with related information, to the user who submitted the retrieval question.
The system of the invention understands the user's Chinese question at the concept level and applies synonym expansion to the keywords in the question. It thereby supports retrieval with questions described in natural language and improves the recall of the question answering system. The system also considers three aspects of a question together, namely word form, word order and sentence length, which improves the precision of question retrieval. Furthermore, the system uses an efficient retrieval technique to extract the candidate question set rapidly from the question base, calculates the similarity between each candidate question and the user's question, sorts the candidate questions quickly by similarity, and returns the sorted questions and their answers to the user. These measures ensure that a concise, accurate answer is returned quickly. To meet the requirements of accuracy and real-time performance, the system was developed and implemented with precision, retrieval efficiency and recall as its main indicators. Experimental results show that the expected effect is achieved. Specifically, the invention has the following advantages:
(1) High precision: using natural language processing techniques, the system processes the keywords of the retrieval question at the concept level. It exploits the fact that synonyms express the same concept in a sentence, applies synonym expansion to the keyword string of the retrieval question and calculates the word-form similarity, then combines it with the word order similarity and the length similarity into an overall question similarity. This achieves highly accurate matching between the original retrieval question and the questions in the candidate question set, so that accurate results are retrieved quickly and the user's retrieval requirements are met.
(2) High retrieval efficiency: the system adopts an efficient information retrieval technique to extract the candidate question set rapidly and therefore executes efficiently. It uses the keyword string of a question as the index terms and builds a small index database. The index uses an inverted list structure, which greatly improves retrieval efficiency. The retrieval module can therefore extract the candidate question set quickly, which improves the efficiency of the system.
(3) High recall: the system understands the user's Chinese question at the concept level and applies synonym expansion to the keywords in the question, which enriches the semantic information of the submitted retrieval question. It supports retrieval with questions described in natural language and makes the candidate question set more accurate. This improves the recall of the candidate question set and hence the recall of the question answering system, so that the user obtains correct results.
Description of drawings
Fig. 1 is the architecture diagram of the concept-based intelligent Chinese question answering system of the invention.
Fig. 2 is a schematic diagram of the module structure of the concept-based Chinese question answering system of the invention.
Fig. 3 is the flow chart of the question preprocessing module.
Fig. 4 is the flow chart of the retrieval module.
Fig. 5 is the flow chart of the candidate question set module.
Fig. 6 is the flow chart of the sentence similarity calculation.
Fig. 7 is the flow chart of the display module.
Embodiment
The invention is explained in further detail below with reference to the drawings and examples.
As shown in Fig. 1, the concept-based intelligent Chinese question answering system provided by the invention comprises data server 100, input module 200, question preprocessing module 300, candidate question set extraction module 400, question similarity calculation module 500 and display module 600.
Data server 100 stores the corpus, the index database, the XML document and the question base. It provides knowledge and rule support to question preprocessing module 300, and provides the index and the retrieval objects to candidate question set extraction module 400.
Input module 200 receives the question entered by the user, checks that the input question is well formed, and ensures that correctly formatted questions are submitted to question preprocessing module 300.
Question similarity calculation module 500 uses the concept-based Chinese sentence similarity algorithm designed for the system to compute the similarity between the retrieval question and each question in the candidate question set. The calculation first applies synonym expansion to the keyword string of the retrieval question. Using the expansion result, it calls the word-form similarity calculation method, then the word order similarity calculation method and the length similarity calculation method, to obtain the word-form similarity, the word order similarity and the length similarity respectively. The three values are then weighted to give the final similarity of the question.
Question preprocessing module 300 receives the question passed on by input module 200 and calls the knowledge base and the rule base in data server 100 to preprocess it. Preprocessing includes Chinese word segmentation, part-of-speech tagging and keyword extraction. The result is passed to candidate question set extraction module 400 and to question similarity calculation module 500.
The candidate question set extraction module comprises an index module, a retrieval module and a candidate question set module. It rapidly extracts the candidate question set (the set of questions related to the retrieval question), providing the question similarity calculation module with the objects to be compared.
According to the result of question similarity calculation module 500, display module 600 returns the answers of the corresponding questions in the question base, together with related information, to the user who submitted the retrieval question.
Data server 100, question preprocessing module 300, candidate question set extraction module 400 and question similarity calculation module 500 are each described in further detail below by way of example.
As shown in Fig. 2 (schematic diagram of the module structure of the concept-based Chinese question answering system):
Data server 100 stores the corpus, which comprises knowledge base 110 and rule base 120, as well as index database 130, XML document 140 and question base 150. It provides knowledge and rule support to question preprocessing module 300, provides the index source to index module 410, and provides the retrieval objects to candidate question set module 430.
The corpus holds basic resources that carry linguistic knowledge in electronic form: language data that actually occurred in real language use, obtained after processing (analysis and handling).
The knowledge base comprises the concept synonym expansion knowledge base and the dictionary knowledge base. The rule base comprises the part-of-speech rule base and the sentence constituent rule base.
Question preprocessing module 300 receives the question passed on by input module 200 and calls knowledge base 110 and rule base 120 to preprocess it. Preprocessing includes Chinese word segmentation of the question, part-of-speech tagging and keyword extraction. The result is passed to candidate question set extraction module 400 and to question similarity calculation module 500.
As shown in Fig. 3, question preprocessing module 300 first performs lexical analysis on the user's retrieval question, using Chinese word segmentation module 310 and part-of-speech tagging module 320. Keyword extraction module 330 then extracts keywords according to rules on the importance of each part of speech in a sentence (nouns, verbs, pronouns and adjectives are usually the most important) and filters out stop words with a stop-word list. The extracted keywords are then expanded through the concept expansion knowledge base 110 (built from the shared synonym thesaurus Tongyici Cilin, the "Synonym Word Forest"). Preprocessing module 300 thus produces a set of intermediate results that meet the requirements;
The processing flow of question preprocessing module 300 is: (1) input the question; (2) check the format of the question; if the format is incorrect, return to (1); (3) perform Chinese word segmentation and part-of-speech tagging on the question; (4) call the stop-word list and apply the sentence constituent importance rules to analyze the question for keyword extraction; (5) extract the keywords of the question; (6) output the keyword string.
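The synonym expansion step can be pictured with a small sketch. It assumes the concept expansion knowledge base can be read as a mapping from a word to its set of synonyms; the `SYNONYMS` table, the function name `expand_keywords` and the sample words are hypothetical and are not taken from the patent.

```python
# Hypothetical stand-in for the concept expansion knowledge base:
# each keyword maps to the set of words that express the same concept.
SYNONYMS = {
    "电脑": {"计算机"},
    "购买": {"买", "选购"},
}


def expand_keywords(keywords, synonyms=SYNONYMS):
    """Return, for every keyword, the keyword itself plus its known synonyms."""
    return {word: {word} | synonyms.get(word, set()) for word in keywords}


expanded = expand_keywords(["电脑", "价格"])
# "电脑" gains its synonym; "价格" has no entry and stays as it is.
```

Keeping the original keyword inside its own expansion set means later matching can treat identical words and synonyms uniformly, while still being able to tell them apart when the two are weighted differently.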
Chinese word segmentation module 310 segments text with the reverse maximum matching method, using the dictionary knowledge base as its language resource. Suppose the longest entry in the dictionary contains i Chinese characters. The first i characters of the current string of the text being processed are taken as the matching field and looked up in the dictionary. If the dictionary contains such an i-character word, the match succeeds and the matching field is cut out as a word. If no such word is found, the match fails, the last character is removed from the matching field, and the remaining characters are matched again as the new matching field; this continues until a match succeeds.
Let the longest word in the dictionary consist of MaxNum characters, and let the sentence length, i.e. the number of characters in the sentence, be Len. The array S[N-1] stores a sentence of length N; i, j, k and position are variables; wik denotes the segmentation unit formed by S[i] to S[wik+i]; dik is the attribute of the segmentation unit denoted by wik, such as its position in the dictionary or its part of speech; the function match(S[i], S[i+j]) determines whether the string S[i]~S[i+j] is a word in the dictionary.
The flow of Chinese word segmentation module 310 is as follows: 1) input the sentence, call the dictionary knowledge base, and start dictionary matching from the end of the sentence; if matching is finished, go to 3). 2) Determine whether the string S[i]~S[i+j] runs past the sentence boundary and whether it is a word in the dictionary; if it is a word, cut the matching field out as a word; if no such word is found in the dictionary, the match fails, the last character is removed from the matching field, the remaining characters are matched again as the new matching field, and control returns to 1). 3) Output the segmentation result.
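A minimal sketch of dictionary-based maximum matching follows. The text names the method "reverse" and starts from the sentence end, while its step description shortens the field by dropping the last character; this sketch takes the reverse reading, so it scans from the end of the sentence and shortens a failed field from its left side. The function name, the toy dictionary and the single-character fallback are assumptions made for illustration.

```python
def backward_max_match(sentence, dictionary, max_len=None):
    """Segment `sentence` by matching the longest dictionary word ending at the cursor."""
    if max_len is None:
        # Length of the longest dictionary entry (MaxNum in the text).
        max_len = max((len(word) for word in dictionary), default=1)
    words = []
    end = len(sentence)
    while end > 0:
        # Try the longest field first, then shrink it one character at a time.
        for size in range(min(max_len, end), 0, -1):
            field = sentence[end - size:end]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or field in dictionary:
                words.append(field)
                end -= size
                break
    words.reverse()  # words were collected from the sentence end backwards
    return words


toy_dictionary = {"研究", "研究生", "生命", "起源"}
segments = backward_max_match("研究生命的起源", toy_dictionary)
```

With this toy dictionary a forward scan would first cut "研究生"; scanning from the end yields "研究", "生命", "的", "起源" instead.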
Part-of-speech tagging module 320 takes the result of Chinese word segmentation module 310, calls the part-of-speech rule base, and tags each segmented word with its part of speech. For each word in the sentence it determines the most suitable part-of-speech tag according to the context within the sentence.
The flow is as follows: 1) Take a word string Span from the segmentation result. For each word in the string, look it up in the part-of-speech rule base. If it is found, take out all part-of-speech tags of the word and record them in the array Tags[i][j], where i is the index of the word and j is the index of the tag, and record the number of occurrences of the word with that tag in the array Freqs[i][j]. If it is not found, assign the open-class tag to the word, record it in Tags[i][j], and set Freqs[i][j] to 1. 2) For each possible part-of-speech tag of each word in the string, (1) calculate the accumulated value of the tag and (2) record the best predecessor tag of the tag. Once the tag of the last word in the string has been decided, take out the best predecessor tag of each word in turn, which gives the tagging result. Reinitialize the word string working data in preparation for tagging the next word string, and return to 1).
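The accumulated-value and best-predecessor bookkeeping described above resembles a Viterbi-style dynamic program. The sketch below is one possible reading, not the patent's stated scoring: the source does not say how the accumulated value is computed, so the multiplication of tag frequency by a tag-pair weight, the `transitions` table, its default weight of 0.1 and the tag letters are all assumptions.

```python
def tag_words(words, lexicon, transitions, open_tag="n", default_weight=0.1):
    """lexicon: word -> {tag: frequency}; transitions: (prev_tag, tag) -> weight."""
    if not words:
        return []
    # Unknown words receive the open-class tag with frequency 1, as in step 1).
    candidates = [lexicon.get(word, {open_tag: 1}) for word in words]
    scores = [dict(candidates[0])]  # accumulated value of each tag of the first word
    back = [{}]                     # best predecessor tag per word and tag
    for i in range(1, len(words)):
        scores.append({})
        back.append({})
        for tag, freq in candidates[i].items():
            def via(prev):
                return scores[i - 1][prev] * transitions.get((prev, tag), default_weight)
            best_prev = max(scores[i - 1], key=via)
            scores[i][tag] = via(best_prev) * freq
            back[i][tag] = best_prev
    # Decide the last word's tag, then follow the best predecessors backwards.
    tag = max(scores[-1], key=scores[-1].get)
    result = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = back[i][tag]
        result.append(tag)
    result.reverse()
    return result


toy_lexicon = {"我": {"r": 10}, "研究": {"v": 6, "n": 4}, "问题": {"n": 9}}
toy_transitions = {("r", "v"): 0.8, ("r", "n"): 0.2, ("v", "n"): 0.7, ("n", "n"): 0.3}
tags = tag_words(["我", "研究", "问题"], toy_lexicon, toy_transitions)
```

In the toy data "研究" can be a verb or a noun; the pronoun before it and the noun after it make the verb reading score higher.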
Keyword extraction module 330 extracts keywords according to rules on the importance of each part of speech in a sentence (nouns, verbs, pronouns and adjectives are usually the most important) and filters out stop words with a stop-word list. Let S be a sentence, w any word in S, and S' the keyword sequence of S. The flow is as follows: 1) take a word w from S and look it up in the stop-word list; if w is a stop word, skip it and take the next word, otherwise go to 2); if all words have been taken, go to 4); 2) call the sentence constituent rule base and determine whether w is a noun, pronoun, verb or adjective; if so, extract w; 3) read in the next word and return to 1); 4) form the keyword sequence S' from all keywords extracted from S and return S'.
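The extraction rule, drop stop words and keep only nouns, pronouns, verbs and adjectives, can be sketched in a few lines. The tag letters (`n`, `r`, `v`, `a`), the function name and the sample data are assumptions for illustration; the patent does not specify a tag set.

```python
# Assumed tag letters: n = noun, r = pronoun, v = verb, a = adjective.
CONTENT_TAGS = {"n", "r", "v", "a"}


def extract_keywords(tagged_words, stop_words, content_tags=CONTENT_TAGS):
    """Keep words that are not stop words and whose tag is a content-word tag."""
    keywords = []
    for word, tag in tagged_words:
        if word in stop_words:
            continue  # stop words are filtered out before the part-of-speech check
        if tag in content_tags:
            keywords.append(word)
    return keywords


tagged = [("怎样", "r"), ("购买", "v"), ("便宜", "a"), ("的", "u"), ("电脑", "n")]
keyword_string = extract_keywords(tagged, stop_words={"怎样", "的"})
```

Here "怎样" is removed by the stop-word list even though its tag would qualify, and "的" fails both checks.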
The candidate question set extraction module comprises index module 410, retrieval module 420 and candidate question set module 430. It extracts the candidate question set rapidly and provides the question similarity calculation module with the objects to be compared.
The purpose of candidate question retrieval is to confine complex subsequent processing, such as the similarity calculation, to the comparatively small candidate question set, so retrieval must be efficient. The candidate question set is a roughly related but comparatively small subset that is taken quickly from the large question collection, so this part can be implemented with information retrieval techniques. On the one hand, an efficient retrieval technique can be chosen so that retrieval is fast; on the other hand, the module is easy to improve and upgrade and is portable.
Because it uses efficient retrieval to locate similar questions in the question base quickly and supplies the question set to question similarity calculation module 500, candidate question set extraction module 400 plays a very important role.
Index module 410 builds index database 130 from the question base content provided by data server 100 (stored as XML). The keyword string entries in the XML are used as index terms, and index database 130 is built from the index terms and the associated document information. As question base 150 is updated, the index is built incrementally and index database 130 is updated.
Retrieval module 420 works on the data exported from question base 150 and stored in XML document 140, and uses index database 130 to retrieve from XML document 140 quickly.
As shown in Fig. 4, the processing flow of retrieval module 420 is: (1) input the keyword string of the retrieval question and use it as the search terms; (2) call the index database to retrieve; (3) determine whether the keyword string is empty; if it is empty, return to (1), otherwise go to (4); (4) retrieve and return the ID numbers of the questions related to the keyword string; (5) output the question ID numbers.
Candidate question set module 430 submits the intermediate result provided by question preprocessing module 300 to retrieval module 420 as the search term string. It calls retrieval module 420 to retrieve from XML document 140, parses XML document 140, and obtains the ID numbers of the corresponding questions in question base 150.
As shown in Fig. 5, the processing flow of this module is: (1) input the ID numbers of the retrieved questions; (2) look up the corresponding questions in the question base; (3) determine whether the question corresponding to each ID exists; if not, return to (2); (4) output the keyword strings of the questions in the question set.
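The inverted-list index and the retrieval flow can be sketched together. This is an in-memory illustration only: the patent stores the question base as XML and does not name a retrieval library, so the dictionary-of-sets index, the function names and the sample IDs are assumptions. An empty keyword string returns an empty result here instead of looping back to the input step.

```python
from collections import defaultdict


def build_index(question_keywords):
    """question_keywords: question ID -> keyword list. Returns keyword -> set of IDs."""
    index = defaultdict(set)
    for question_id, keywords in question_keywords.items():
        for keyword in keywords:
            index[keyword].add(question_id)
    return index


def retrieve(index, query_keywords):
    """Return the IDs of all questions that share at least one keyword with the query."""
    if not query_keywords:
        return set()  # empty keyword string: nothing to search for
    ids = set()
    for keyword in query_keywords:
        ids |= index.get(keyword, set())
    return ids


question_base = {
    1: ["电脑", "价格"],
    2: ["电脑", "维修"],
    3: ["手机", "价格"],
}
index = build_index(question_base)
candidate_ids = retrieve(index, ["电脑", "保修"])
```

Taking the union of the posting sets keeps the candidate set loose, which suits a stage whose job is only to narrow the question base before the more expensive similarity calculation.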
Question similarity calculation module 500 calculates the similarity between the retrieval question and each question in the candidate question set, which directly affects the retrieval result. It is a core module of this question answering system.
As shown in Fig. 6, this module uses the concept-based Chinese sentence similarity calculation method designed for the system to compute the similarity between the retrieval question and each question in the candidate question set. The keyword string of the retrieval question is expanded by keyword string synonym expansion module 510. Using the expansion result, the module calls word-form similarity calculation module 530, then word order similarity calculation module 520 and length similarity calculation module 540, to obtain the word-form similarity, the word order similarity and the length similarity respectively. It then calls sentence similarity calculation submodule 550, which weights the three values to obtain the similarity of the question.
The processing flow is: (1) input the keyword string of the retrieval question and the keyword strings of the questions in the candidate question set (obtained from candidate question set extraction module 400); (2) call the concept expansion knowledge base, apply synonym concept expansion to the keyword string of the retrieval question, and calculate the word-form similarity; (3) count the identical words in the two keyword strings and calculate the length similarity; (4) determine the order in which the keywords of the retrieval question appear among the matching keywords of the candidate question, and calculate the word order similarity; (5) weight the similarity results of (2), (3) and (4), calculate the question similarity, and output it.
Each internal module of question similarity calculation module 500 is explained in detail below.
As shown in Fig. 2, question similarity calculation module 500 comprises keyword string synonym expansion module 510, word-form similarity calculation module 530, word order similarity calculation module 520, length similarity calculation module 540 and sentence similarity calculation submodule 550.
Before the function and implementation steps of each module are described, the relevant background is introduced as follows:
Related concepts:
(1) Definition 1: word-form similarity reflects how similar two sentences are in form, and is measured by the number of identical words or synonyms contained in the two sentences. Let S1 and S2 be two sentences; the word-form similarity of S1 and S2 is:
$$\mathrm{Simword}(S_1,S_2)=\frac{2\,\bigl(\lambda_1\,\mathrm{SameWord}(S_1,S_2)+\lambda_2\,\mathrm{SimWord}(S_1,S_2)\bigr)}{\mathrm{Len}(S_1)+\mathrm{Len}(S_2)} \tag{1.1}$$
In the formula, SameWord(S1, S2) is the number of identical words contained in S1 and S2, SimWord(S1, S2) is the number of synonyms contained in S1 and S2, and λ1 and λ2 express the importance of SameWord(S1, S2) and SimWord(S1, S2) respectively. When a word occurs a different number of times in S1 and in S2, the smaller number of occurrences is counted. Len(S) is the number of words contained in sentence S. Meaning: the more identical words or synonyms two sentences share, the more similar they are;
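Formula (1.1) can be sketched as follows, with each sentence given as a list of words. The default weights `lam1=1.0` and `lam2=0.8`, the greedy one-to-one pairing of synonyms and the shape of the synonym table are assumptions; the patent gives no values for λ1 and λ2 in this formula and does not say how synonym pairs are counted.

```python
from collections import Counter


def word_form_similarity(s1, s2, synonyms, lam1=1.0, lam2=0.8):
    """Weighted count of identical words and synonym pairs, normalised by total length."""
    total = len(s1) + len(s2)
    if total == 0:
        return 0.0
    c1, c2 = Counter(s1), Counter(s2)
    # Identical words: a word occurring a different number of times counts the smaller number.
    same = sum((c1 & c2).values())
    # Words left over after removing the identical ones are candidates for synonym pairs.
    rest1 = list((c1 - c2).elements())
    rest2 = list((c2 - c1).elements())
    sim = 0
    for w in rest1:
        for v in rest2:
            if v in synonyms.get(w, ()) or w in synonyms.get(v, ()):
                sim += 1
                rest2.remove(v)  # each word takes part in at most one pair
                break
    return 2 * (lam1 * same + lam2 * sim) / total


toy_synonyms = {"电脑": {"计算机"}}
score = word_form_similarity(["电脑", "价格"], ["计算机", "价格", "多少"], toy_synonyms)
```

In the example there is one identical word and one synonym pair over a total length of five words, so the score is 2 × (1.0 + 0.8) / 5 = 0.72 under the assumed weights.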
(2) Definition 2: word order similarity. It reflects how similar the identical words or synonyms contained in two sentences are in their positional relationship, and is measured by the number of adjacent-order inversions of the identical words or synonyms contained in the two sentences. Let S1 and S2 be two sentences. OnceSameWord(S1, S2) is the set of identical words that occur exactly once in S1 and S2, and OnceSimWord(S1, S2) is the set of synonyms that occur exactly once in S1 and S2. Pfirst(S1, S2) is the vector formed by the position numbers in S1 of the words in OnceSameWord(S1, S2) and OnceSimWord(S1, S2). Psecond(S1, S2) is the vector generated by sorting the components of Pfirst(S1, S2) according to the order of the corresponding words in S2. RevOrd(S1, S2) is the inversion count of the adjacent components of Psecond(S1, S2), that is, the total number of inversions relative to the natural order. The word order similarity of S1 and S2 is then:
$$\mathrm{Simord}(S_1,S_2)=\begin{cases}1-\dfrac{\mathrm{RevOrd}(S_1,S_2)}{\left|\lambda_1\,\mathrm{OnceSameWord}(S_1,S_2)+\lambda_2\,\mathrm{OnceSimWord}(S_1,S_2)\right|-1}, & \text{otherwise}\\[6pt] 1, & \left|\lambda_1\,\mathrm{OnceSameWord}(S_1,S_2)+\lambda_2\,\mathrm{OnceSimWord}(S_1,S_2)\right|=1\\[6pt] 0, & \left|\lambda_1\,\mathrm{OnceSameWord}(S_1,S_2)+\lambda_2\,\mathrm{OnceSimWord}(S_1,S_2)\right|=0\end{cases} \tag{1.2}$$
The advantage of defining word order similarity in this way is that after a clause or phrase is moved as a whole over a long distance, the sentence is still judged very similar to the original sentence. It is also fast to compute: the algorithm complexity is O(m), where m = |OnceWord(S1, S2)|;
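A minimal Python sketch of formula (1.2) follows. It is a simplification under stated assumptions: only identical words are considered (synonyms are ignored), the weights λ1 and λ2 are dropped so that the denominator is simply the number of shared once-only words minus one, and RevOrd is read as the number of adjacent pairs in Psecond that are out of order. The function name is hypothetical.

```python
from collections import Counter


def word_order_similarity(s1, s2):
    """Formula (1.2), restricted to identical words (synonyms ignored)."""
    c1, c2 = Counter(s1), Counter(s2)
    # Pfirst: 1-based position in s1 of every word that occurs exactly once
    # in s1 and exactly once in s2.
    pos_in_s1 = {
        w: i for i, w in enumerate(s1, start=1) if c1[w] == 1 and c2[w] == 1
    }
    m = len(pos_in_s1)
    if m == 0:
        return 0.0
    if m == 1:
        return 1.0
    # Psecond: the same position numbers, ordered by where the words occur in s2.
    p_second = [pos_in_s1[w] for w in s2 if w in pos_in_s1]
    # RevOrd: number of adjacent pairs of Psecond that are out of order.
    rev_ord = sum(1 for a, b in zip(p_second, p_second[1:]) if a > b)
    return 1.0 - rev_ord / (m - 1)


original = ["a", "b", "c", "d"]
moved_block = ["c", "d", "a", "b"]   # the block "c d" moved to the front
reversed_order = ["d", "c", "b", "a"]
```

For `moved_block`, Psecond is [3, 4, 1, 2] with a single adjacent inversion, so the similarity is 1 - 1/3. This is the behaviour the text describes: moving a whole block costs only one inversion, whereas the fully reversed sentence has three and scores 0.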
(3) Definition 3: sentence length similarity. Len(S1) and Len(S2) represent the lengths of sentence S1 and sentence S2, respectively, that is, the number of words in each sentence. The sentence length similarity Simlen(S1, S2) is determined by formula (1.3):
$$\mathrm{Simlen}(S_1,S_2)=1-\frac{\left|\mathrm{Len}(S_1)-\mathrm{Len}(S_2)\right|}{\mathrm{Len}(S_1)+\mathrm{Len}(S_2)} \tag{1.3}$$
It follows directly that Simlen(S1, S2) ∈ [0, 1]. Meaning: the closer the lengths of two sentences are, the more similar the two sentences are. For example, with Len(S1) = 11 and Len(S2) = 8, Simlen(S1, S2) ≈ 0.84;
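The arithmetic of formula (1.3) can be checked with a short Python sketch. The function name and the handling of two empty sentences are assumptions; the denominator is taken as the sum of both lengths, which is the reading that reproduces the value 0.84 given in the example.

```python
def length_similarity(len1, len2):
    """Formula (1.3): 1 - |len1 - len2| / (len1 + len2)."""
    if len1 + len2 == 0:
        return 0.0  # assumption: two empty sentences are treated as dissimilar
    return 1.0 - abs(len1 - len2) / (len1 + len2)


# Example from the text: 1 - 3/19
print(round(length_similarity(11, 8), 2))  # 0.84
```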
(4) Definition 4: sentence similarity. It reflects the degree of similarity between two sentences and is generally a value between 0 and 1, where 0 means dissimilar, 1 means completely similar, and a larger value means the two sentences are more similar. The final similarity Sim(S1, S2) of sentences S1 and S2 is determined by formula (1.4):
$$\mathrm{Sim}(S_1,S_2)=\lambda_1\,\mathrm{Simword}(S_1,S_2)+\lambda_2\,\mathrm{Simord}(S_1,S_2)+\lambda_3\,\mathrm{Simlen}(S_1,S_2) \tag{1.4}$$
where λ1, λ2 and λ3 are constants satisfying λ1 + λ2 + λ3 = 1, so that Sim(S1, S2) ∈ [0, 1]. Word form similarity plays the main role in sentence similarity, while sentence length similarity and word order similarity play a secondary role, so the values should satisfy λ1 ≫ λ2, λ3. In the formula, Simword(S1, S2) is the word form similarity of S1 and S2, Simord(S1, S2) is their word order similarity, and Simlen(S1, S2) is their sentence length similarity. Based on experiments, λ1 = 0.9, λ2 = 0.05 and λ3 = 0.05 are used.
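The weighted combination in formula (1.4) can be sketched in Python as follows. The function name and the check that the weights sum to 1 are assumptions added for illustration; the default weights are the experimental values 0.9, 0.05 and 0.05 stated above.

```python
def sentence_similarity(sim_word, sim_ord, sim_len, weights=(0.9, 0.05, 0.05)):
    """Formula (1.4): weighted sum of the three partial similarities."""
    lam1, lam2, lam3 = weights
    # The text requires lam1 + lam2 + lam3 = 1 so the result stays in [0, 1].
    if abs(lam1 + lam2 + lam3 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return lam1 * sim_word + lam2 * sim_ord + lam3 * sim_len


# Word form similarity dominates: 0.9*0.8 + 0.05*1.0 + 0.05*0.84
example = sentence_similarity(0.8, 1.0, 0.84)
```

Because λ1 is much larger than the other two weights, two candidates with the same word form similarity are separated only slightly by word order and length, which then act mainly as tie-breakers when the candidate questions are sorted.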
The functions and implementation steps of the keyword string synonym expansion module 510, the word form similarity calculation module 530, the word order similarity calculation module 520, the word length similarity calculation module 540 and the sentence similarity calculation submodule 550 are as follows:
The keyword string synonym expansion module 510 mainly performs synonym expansion on the input keyword string. The specific implementation steps are as follows: 1) input the keyword string keywords1 of the query question and the keyword string keywords2 of a question in the candidate question set; 2) call the concept expansion knowledge base to perform synonym concept expansion on keywords1, and store the result of expanding keywords1 in the string extendkeywords, which completes the synonym expansion.
The word form similarity calculation module 530 mainly calculates the word form similarity of two sentences, which reflects how similar the two sentences are in word form and is measured by the number of identical words or synonyms that the two sentences contain. The specific implementation steps are as follows: 1) the keyword string synonym expansion module 510 passes in the keyword string keywords1 of the query question, the keyword string keywords2 of the candidate question, and the string extendkeywords obtained by expanding keywords1; 2) count the keywords in keywords1 as wordsNum1 and the keywords in keywords2 as wordsNum2; 3) count the number samenum of keywords that extendkeywords and keywords2 have in common; 4) substitute into the formula 2.0*samenum/(wordsNum1+wordsNum2) to calculate the word form similarity simword;
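The steps of modules 510 and 530 can be sketched together in Python. This is an illustration under assumptions: keyword strings are represented as lists of keywords, `concept_kb` is a hypothetical dictionary standing in for the concept expansion knowledge base, and both function names are invented for the example. The variable names keywords1, keywords2, extendkeywords and samenum follow the text.

```python
def expand_keywords(keywords1, concept_kb):
    """Module 510: append the synonyms of every query keyword."""
    extendkeywords = list(keywords1)
    for word in keywords1:
        # sorted() only makes the output order reproducible.
        for synonym in sorted(concept_kb.get(word, ())):
            if synonym not in extendkeywords:
                extendkeywords.append(synonym)
    return extendkeywords


def simword_from_keywords(keywords1, keywords2, concept_kb):
    """Module 530: 2.0 * samenum / (wordsNum1 + wordsNum2)."""
    words_num1, words_num2 = len(keywords1), len(keywords2)
    if words_num1 + words_num2 == 0:
        return 0.0  # assumption: no keywords on either side means no similarity
    extendkeywords = set(expand_keywords(keywords1, concept_kb))
    # samenum: candidate keywords that also appear in the expanded query.
    samenum = sum(1 for word in keywords2 if word in extendkeywords)
    return 2.0 * samenum / (words_num1 + words_num2)


kb = {"computer": {"pc", "machine"}}
keywords1 = ["computer", "price"]
keywords2 = ["pc", "price", "cost"]
expanded = expand_keywords(keywords1, kb)
simword = simword_from_keywords(keywords1, keywords2, kb)
```

Here "pc" matches through the expansion and "price" matches directly, so samenum is 2 and simword is 2.0 * 2 / (2 + 3) = 0.8. Without the expansion only "price" would match. One caveat of this simple counting: a candidate that contains several synonyms of the same query keyword would be counted once per synonym, a case the text does not address.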
The word order similarity calculation module 520 mainly calculates the word order similarity of two sentences, which reflects how similar the identical words or synonyms contained in the two sentences are in their positional relationship and is measured by the number of adjacent-order inversions of those words. The specific implementation steps are as follows: 1) the keyword string synonym expansion module 510 passes in the keyword string keywords1 of the query question and the keyword string keywords2 of the candidate question; 2) find the non-repeated keywords that keywords1 and keywords2 have in common and store them in the array oncesimwords; 3) calculate Pfirst(keywords1, keywords2), the vector formed by the position numbers in keywords1 of the words in oncesimwords; 4) calculate Psecond(keywords1, keywords2), the vector generated by sorting the components of Pfirst(keywords1, keywords2) according to the order of the corresponding words in keywords2; 5) calculate revord, the inversion count of the adjacent components of Psecond(keywords1, keywords2), that is, the total number of inversions relative to the natural order; 6) substitute into the formula 1-1.0*revord/(samenum-1) to calculate the word order similarity simorder;
The word length similarity calculation module 540 mainly calculates the word length similarity of two sentences, which reflects how similar the two sentences are in the number of words they contain and is measured by comparing those numbers. The specific implementation steps are as follows: 1) the keyword string synonym expansion module 510 passes in the keyword string keywords1 of the query question and the keyword string keywords2 of the candidate question; 2) count the keywords in keywords1 and store the result in the integer variable wordsNum1; count the keywords in keywords2 and store the result in the integer variable wordsNum2; 3) calculate the absolute difference distince between the keyword counts of keywords1 and keywords2; 4) substitute into the formula 1.0-1.0*distince/(wordsNum1+wordsNum2), in accordance with formula (1.3), to calculate the word length similarity simlen;
The sentence similarity calculation submodule 550 combines the word form similarity, word order similarity and word length similarity according to their importance to sentence similarity. Word form similarity is the most closely related to the semantics of a sentence and has the highest importance. Suitable importance coefficients are obtained by experiment, and the similarity of the questions is obtained by weighting the word form similarity, word order similarity and word length similarity with these coefficients. The specific implementation steps are as follows: 1) the word form similarity calculation module 530, the word order similarity calculation module 520 and the word length similarity calculation module 540 pass in the word form similarity, word order similarity and word length similarity, respectively; 2) substitute into the formula λ1*simword+λ2*simorder+λ3*simlen to calculate the sentence similarity similary; 3) output the sentence similarity similary.

Claims (4)

1. An intelligent Chinese question answering system based on concepts, comprising a data server (100), an input module (200) and a display module (600), characterized in that it further comprises a question preprocessing module (300), a candidate question set extraction module (400) and a question similarity calculation module (500);
the data server (100) is used to store the corpus, the index database, the XML documents and the question base;
the input module (200) is used to receive the question entered by the user, check that the input question is well formed, and submit correctly formatted questions to the question preprocessing module (300);
the question preprocessing module (300) is used to receive the question passed from the input module (200), call the knowledge base and rule base in the data server (100) to preprocess it, and pass the processed result to the candidate question set extraction module (400) and the question similarity calculation module (500), respectively;
the candidate question set extraction module (400) is used to rapidly extract a candidate question set from the preprocessing result provided by the question preprocessing module (300), providing the calculation objects for the question similarity calculation module (500);
the question similarity calculation module (500) is used to compute the similarity between the query question and the questions in the candidate question set; the Chinese sentence similarity calculation performs synonym expansion on the keyword string of the query question and, using the expansion result, calls the word form similarity calculation method, and then calls the word order similarity calculation method and the word length similarity calculation method, so as to calculate the word form similarity, word order similarity and word length similarity, respectively; the three are then weighted to calculate the final similarity of the questions;
the word form similarity calculation method calculates the word form similarity Simword according to formula (I):
$$\mathrm{Simword}(S_1,S_2)=\frac{2\,\bigl(\lambda_1\,\mathrm{SameWord}(S_1,S_2)+\lambda_2\,\mathrm{SimWord}(S_1,S_2)\bigr)}{\mathrm{Len}(S_1)+\mathrm{Len}(S_2)} \tag{I}$$
in the formula, S1 and S2 are two sentences, SameWord(S1, S2) is the number of identical words contained in S1 and S2, SimWord(S1, S2) is the number of synonyms contained in S1 and S2, and λ1 and λ2 represent the importance of SameWord(S1, S2) and SimWord(S1, S2), respectively;
the word order similarity calculation method calculates the word order similarity Simord according to formula (II):
$$\mathrm{Simord}(S_1,S_2)=\begin{cases}1-\dfrac{\mathrm{RevOrd}(S_1,S_2)}{\left|\lambda_1\,\mathrm{OnceSameWord}(S_1,S_2)+\lambda_2\,\mathrm{OnceSimWord}(S_1,S_2)\right|-1}, & \text{otherwise}\\[6pt] 1, & \left|\lambda_1\,\mathrm{OnceSameWord}(S_1,S_2)+\lambda_2\,\mathrm{OnceSimWord}(S_1,S_2)\right|=1\\[6pt] 0, & \left|\lambda_1\,\mathrm{OnceSameWord}(S_1,S_2)+\lambda_2\,\mathrm{OnceSimWord}(S_1,S_2)\right|=0\end{cases} \tag{II}$$
in the formula, S1 and S2 are two sentences, OnceSameWord(S1, S2) is the set of identical words that occur exactly once in S1 and S2, OnceSimWord(S1, S2) is the set of synonyms that occur exactly once in S1 and S2, Pfirst(S1, S2) is the vector formed by the position numbers in S1 of the words in OnceSameWord(S1, S2) and OnceSimWord(S1, S2), Psecond(S1, S2) is the vector generated by sorting the components of Pfirst(S1, S2) according to the order of the corresponding words in S2, and RevOrd(S1, S2) is the inversion count of the adjacent components of Psecond(S1, S2);
the word length similarity calculation method calculates the sentence length similarity Simlen according to formula (III):
$$\mathrm{Simlen}(S_1,S_2)=1-\frac{\left|\mathrm{Len}(S_1)-\mathrm{Len}(S_2)\right|}{\mathrm{Len}(S_1)+\mathrm{Len}(S_2)} \tag{III}$$
Len(S1) and Len(S2) represent the lengths of sentence S1 and sentence S2, respectively, and the vertical bars denote the absolute value;
the display module (600), according to the result of the question similarity calculation module (500), returns the corresponding answers in the question base and related information to the user who submitted the query question.
2. The intelligent Chinese question answering system based on concepts according to claim 1, characterized in that the question preprocessing module (300) comprises a Chinese word segmentation module (310), a part-of-speech tagging module (320) and a keyword extraction module (330);
the Chinese word segmentation module (310) adopts the reverse maximum matching method, uses the dictionary knowledge base as the supporting corpus, and matches the text being processed against the entries in the dictionary to obtain the Chinese word segmentation;
the part-of-speech tagging module (320), based on the result of the Chinese word segmentation module (310), calls the part-of-speech rule base to tag the segmented words, determining the most suitable part-of-speech tag for each word in the sentence according to the contextual information in the sentence;
the keyword extraction module (330) extracts keywords according to rules on the importance of each part of speech in the sentence and filters out stop words using a stop word list, obtaining the keyword string.
3. The intelligent Chinese question answering system based on concepts according to claim 1, characterized in that the candidate question set extraction module (400) comprises an index module (410), a retrieval module (420) and a candidate question set module (430);
the index module (410) is used to build and update the index database from the question base content provided by the data server (100);
the retrieval module (420) uses the index database (130) to retrieve the XML documents quickly;
the candidate question set module (430) submits the intermediate processing result provided by the question preprocessing module (300) to the retrieval module (420) as the search term string, calls the retrieval module (420) to retrieve the XML documents, and parses the XML files to obtain the ID numbers of the corresponding questions in the question base.
4. The intelligent Chinese question answering system based on concepts according to claim 1, 2 or 3, characterized in that the sentence similarity calculation module (500) comprises a keyword string synonym expansion module (510), a word order similarity calculation module (520), a word form similarity calculation module (530), a word length similarity calculation module (540) and a sentence similarity calculation submodule (550);
the keyword string synonym expansion module (510) is used to perform synonym expansion on the input keyword string and pass the result to the word form similarity calculation module (530);
the word form similarity calculation module (530) performs the word form similarity calculation on the expanded keyword string it receives, obtains the degree of similarity of the two sentences in word form from the number of identical words or synonyms contained in the two sentences, and passes it to the sentence similarity calculation submodule (550);
the word order similarity calculation module (520) receives the keyword string provided by the question preprocessing module (300), calculates the word order similarity of the two sentences from the positional relationship of the identical words or synonyms contained in the two sentences and from the number of adjacent-order inversions of those words, and passes it to the sentence similarity calculation submodule (550);
the word length similarity calculation module (540) receives the keyword string provided by the question preprocessing module (300), calculates the word length similarity of the two sentences from the number of words contained in each of the two sentences, and passes it to the sentence similarity calculation submodule (550);
the sentence similarity calculation submodule (550) performs a weighted calculation on the word form similarity, word order similarity and word length similarity obtained, yielding the similarity of the questions.
CN2008100478554A 2008-05-28 2008-05-28 Intelligent Chinese request-answering system based on concept Expired - Fee Related CN101286161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100478554A CN101286161B (en) 2008-05-28 2008-05-28 Intelligent Chinese request-answering system based on concept

Publications (2)

Publication Number Publication Date
CN101286161A CN101286161A (en) 2008-10-15
CN101286161B true CN101286161B (en) 2010-10-06

Family

ID=40058372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100478554A Expired - Fee Related CN101286161B (en) 2008-05-28 2008-05-28 Intelligent Chinese request-answering system based on concept

Country Status (1)

Country Link
CN (1) CN101286161B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6766320B1 (en) * 2000-08-24 2004-07-20 Microsoft Corporation Search engine with natural language-based robust parsing for user query and relevance feedback learning
CN1821991A (en) * 2005-02-18 2006-08-23 上海赢思软件技术有限公司 Knowledge question-and-answer quick processing system based on artificial intelligence
CN1928864A (en) * 2006-09-22 2007-03-14 浙江大学 FAQ based Chinese natural language ask and answer method
CN101030267A (en) * 2006-02-28 2007-09-05 腾讯科技(深圳)有限公司 Automatic question-answering method and system
CN101097573A (en) * 2006-06-28 2008-01-02 腾讯科技(深圳)有限公司 Automatically request-answering system and method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CN 101097573 A, full text.
Zhou Faguo, Yang Bingru. A new method for sentence similarity computation and its application in question answering systems. Computer Engineering and Applications, 2008, 44(1): 165-167. *

Similar Documents

Publication Publication Date Title
CN101286161B (en) Intelligent Chinese request-answering system based on concept
CN104331449B (en) Query statement and determination method, device, terminal and the server of webpage similarity
CN104361127B (en) The multilingual quick constructive method of question and answer interface based on domain body and template logic
CN109271505A (en) A kind of question answering system implementation method based on problem answers pair
CN101398814A (en) Method and system for simultaneously abstracting document summarization and key words
CN101539907A (en) Part-of-speech tagging model training device and part-of-speech tagging system and method thereof
CN104484380A (en) Personalized search method and personalized search device
CN109447266A (en) A kind of agricultural science and technology service intelligent sorting method based on big data
CN100511214C (en) Method and system for abstracting batch single document for document set
CN106407113A (en) Bug positioning method based on Stack Overflow and commit libraries
CN113239148B (en) Scientific and technological resource retrieval method based on machine reading understanding
Wang et al. Neural related work summarization with a joint context-driven attention mechanism
CN112036178A (en) Distribution network entity related semantic search method
CN114090861A (en) Education field search engine construction method based on knowledge graph
Li et al. The mixture of textrank and lexrank techniques of single document automatic summarization research in Tibetan
Piryani et al. Sentiment analysis in Nepali: exploring machine learning and lexicon-based approaches
CN113946686A (en) Electric power marketing knowledge map construction method and system
CN113434767A (en) UGC text content mining method, system, device and storage medium
Mohnot et al. Hybrid approach for Part of Speech Tagger for Hindi language
CN111428031A (en) Graph model filtering method fusing shallow semantic information
CN116108175A (en) Language conversion method and system based on semantic analysis and data construction
CN112989811A (en) BilSTM-CRF-based historical book reading auxiliary system and control method thereof
Fu et al. Domain ontology learning for question answering system in network education
CN111062218A (en) Semantic similarity calculation method combining dependency relationship and synonym forest
Li et al. Extracting answers to natural language questions from large-scale corpus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20101006

Termination date: 20140528