CN101169780A - Semantic ontology retrieval system and method - Google Patents
- Publication number
- CN101169780A (also listed as CNA2006101498039A, CN200610149803A)
- Authority
- CN
- China
- Prior art keywords
- semantic
- index
- text
- file
- semantic body
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
An embodiment of the invention discloses a retrieval system based on a semantic ontology. The system comprises a semantic ontology index database and a semantic ontology search processing unit. The semantic ontology search processing unit obtains a text hit file list and matches it against the semantic ontology index in the semantic ontology index database to obtain a file semantic classification table. The system can thus identify the semantic information of the retrieved files and present the semantic classification in the search results. The invention also discloses a retrieval method based on a semantic ontology: a semantic ontology index is built for each file that already has a text index, and at search time the text matching result is matched against the semantic ontology index. The final output therefore adds a semantic classification to the conventional text matching result, which makes searching easier for the user.
Description
Technical field
The present invention relates to information retrieval technology, and in particular to a retrieval system and method based on a semantic ontology.
Background technology
With the rapid development of retrieval technology, text-based information retrieval has gradually matured. It has produced a complete methodology and well-developed algorithms, and it is widely used in search engines such as Google, AltaVista, Lycos, and Yahoo.
Fig. 1 is a structural block diagram of an existing text search engine. As shown in Fig. 1, the existing text search engine comprises: spider control module 101, uniform resource locator (URL) database 102, web spider 103, URL extraction module 104, web page database 105, link information extraction module 106, text index module 107, link database 108, index database 109, page ranking module 110, and query server 111.
Web spider 103 fetches web pages from the Internet and stores them in web page database 105. URL extraction module 104 extracts URLs from the pages fetched by web spider 103 and stores them in URL database 102. Spider control module 101 obtains page URLs from URL database 102 and directs web spider 103 to fetch further pages. These steps repeat until all pages have been fetched.
The system obtains text information from web page database 105 and passes it to text index module 107, which builds an index and stores it in index database 109. At the same time, link information extraction module 106 obtains link information from web page database 105 and stores it in link database 108. The link information in link database 108 is the basis on which page ranking module 110 ranks pages.
When a user submits a query request through query server 111, query server 111 looks up the pages relevant to the request in index database 109. Page ranking module 110 evaluates the relevance of the search results by combining the query request with the link information in link database 108. Query server 111 then sorts the search results by relevance and assembles the final page that is returned to the user.
Existing text retrieval technology can find the files that contain the user's text query, but it cannot identify the content and meaning of those files. This is because it relies on text string matching. When different words express the same meaning, or one word has different meanings in different contexts, string matching limits both precision and recall, and the results fall far short of what the user needs. For example, when the user searches for the keyword "paradise", the system cannot tell whether a matching file is about "paradise recreation" or "paradise music". The emergence of the Semantic Web offers an opportunity to solve these problems.
The Semantic Web is a network made up of web pages whose content computers can process and recognize automatically. It builds on the existing Internet by extending web pages with machine-readable data and adding documents intended specifically for machine use. That is, pages are annotated in an ontology language that makes their semantics explicit, so that the information on a page can be understood by people and also processed and recognized automatically by computers. Semantically annotated pages generally mark up data with Extensible Markup Language (XML) or Hypertext Markup Language (HTML), use the Resource Description Framework (RDF) as the data description model, and combine these with a semantic ontology so that the annotated data has explicit semantics. Ontology is a concept that originates in philosophy, where it refers to the theory of existence and its nature and laws. It was later adopted by the field of artificial intelligence, where it denotes an explicit specification of a conceptualization. An ontology can express the concepts of a domain and their relationships explicitly and formally, which makes the semantics of terms explicit, and it therefore plays an important role in semantic querying. Here, a semantic ontology defines the basic terms that make up the concepts of a subject area and the relationships among them, and it specifies the rules for combining these terms and relationships to define extensions to the vocabulary.
The purpose of semantic retrieval is to enhance and improve traditional search results using data obtained from the Semantic Web. Fig. 2 is a structural block diagram of an existing semantic search system. As shown in Fig. 2, the existing semantic search system comprises: query interface 201, query preprocessing module 202, semantic ontology inference engine 203, annotation ontology library 204, traditional search module 205, and result return interface 206.
Query preprocessing module 202 analyzes the user's query, splits it into query keywords using word segmentation, and sends the keywords to semantic ontology inference engine 203.
Semantic ontology inference engine 203 uses the ontology concept vocabulary defined in annotation ontology library 204, together with the relationships between concepts, to match and infer the ontology concept terms that correspond to the query keywords, and returns them to query preprocessing module 202.
Query preprocessing module 202 sends the ontology concept terms returned by semantic ontology inference engine 203 to traditional search module 205 and instructs it to search by semantics. Searching by semantics means that, for web pages that carry semantic annotations, string matching is performed against the semantic concepts annotated on the page rather than directly against the page content itself.
As can be seen, this semantic search system matches the user's query keywords against the semantic concept vocabulary annotated on web pages.
In summary, existing text retrieval technology can find files that contain the query keywords but cannot identify the semantic information of those files. Existing semantic retrieval technology no longer performs keyword retrieval, so its results include too many files that do not match the user's query, and the efficiency of matching query keywords against the semantic concept vocabulary is also unsatisfactory. The search accuracy of existing retrieval technology is therefore low.
Summary of the invention
In view of this, the main purpose of the embodiments of the invention is to provide a retrieval system based on a semantic ontology that improves search accuracy.
Another purpose of the embodiments of the invention is to provide a retrieval method based on a semantic ontology that improves search accuracy.
To achieve the above purposes, the technical solution of the invention is implemented as follows:
An embodiment of the invention discloses a retrieval system based on a semantic ontology. The system comprises:
a semantic ontology index database, which stores the semantic ontology index; and
a semantic ontology search processing unit, which obtains a text hit file list and matches it against the semantic ontology index in the semantic ontology index database to obtain a file semantic classification table.
An embodiment of the invention also discloses a retrieval method based on a semantic ontology. The method comprises the following steps:
A. obtaining files for which a text index has been built, and building a semantic ontology index for the obtained files;
B. obtaining a text hit file list, and performing semantic ontology index matching on the text hit file list to obtain a file semantic classification table.
The retrieval system and method based on a semantic ontology provided by the embodiments of the invention therefore have the following advantage. A semantic ontology index is first built for the files that already have a text index. At search time, the text query entered by the user is matched against the text index to obtain a text hit file list, and the text hit file list is then matched against the semantic ontology index to obtain a file semantic classification table. The text retrieval results thus carry semantic classification information, which improves search accuracy.
Description of drawings
Fig. 1 is a structural block diagram of an existing text search engine;
Fig. 2 is a structural block diagram of an existing semantic search system;
Fig. 3 is a structural block diagram of a retrieval system based on a semantic ontology according to an embodiment of the invention;
Fig. 4 is a flowchart of how the semantic ontology index processing unit builds the semantic ontology index in an embodiment of the invention;
Fig. 5 is a flowchart of the search procedure that the retrieval system of Fig. 3 carries out for a user;
Fig. 6 is a schematic diagram of two resource descriptions defined in an embodiment of the invention;
Fig. 7 is a schematic diagram of the result inferred from Fig. 6;
Fig. 8 is a diagram of the relationships that the annotation ontology library establishes for the semantic ontology terms in an embodiment of the invention;
Fig. 9 is a diagram of the relationships among the semantic ontology terms of Fig. 8 after inference.
Embodiment
To make the purposes, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 3 is a structural block diagram of a retrieval system based on a semantic ontology according to an embodiment of the invention. As shown in Fig. 3, the system comprises: search interface module 301, file semantic classification rule engine 302, search processing module 303, semantic ontology inference engine 304, annotation ontology library 305, index database 306, index processing module 307, file database 308, and network file crawling module 309. Search processing module 303 comprises text search processing unit 310, semantic ontology search processing unit 311, and ranking processing unit 312. Index database 306 comprises text index 315 and semantic ontology index 316. Index processing module 307 comprises text index processing unit 313 and semantic ontology index processing unit 314.
Network file crawling module 309 is mainly responsible for fetching web pages from the Internet and saving them to file database 308. It typically uses a page-fetching program, such as a "web robot" or "web spider", to traverse the page space: it scans the websites within a given range of Internet Protocol (IP) addresses and follows links from one page to another and from one website to another, collecting network files.
Text index processing unit 313 is a conventional unit for building a text index. It analyzes file content, extracts keywords and the file's identification information, and builds the text index. Because the conventional procedure for building a text index is mature prior art, it is not described further here.
Semantic ontology index processing unit 314 builds a semantic ontology index for files that already have a text index. It first analyzes each such file to determine whether it contains semantic annotation information. If a file does, the unit extracts the semantic annotation information and the file identification information and builds the semantic ontology index for that file.
Text search processing unit 310 matches the text query entered by the user against text index 315 and finds the identifiers of the text hit files that satisfy the query.
Semantic ontology search processing unit 311 matches the text hit file identifiers produced by text search processing unit 310 against semantic ontology index 316, classifies them semantically, and obtains the file semantic classification table.
Annotation ontology library 305 and semantic ontology inference engine 304 perform semantic reasoning over the set of ontology concept terms in the file semantic classification table produced by semantic ontology search processing unit 311, yielding an expanded set of semantic ontology terms. Annotation ontology library 305 stores the defined semantic ontology concept vocabulary and the relationships between the concepts; semantic ontology inference engine 304 defines the inference rules and carries out the reasoning.
File semantic classification rule engine 302 triggers the semantic classification rules it defines according to what semantic ontology inference engine 304 has inferred, and expands and consolidates the file semantic classification table.
Text index 315 stored in index database 306 comprises a text forward index and a text inverted index. Table 1 is a text forward index table and Table 2 is a text inverted index table:
Table 1
File identifier (DocID) | Keywords |
1 | Paradise, music ... |
2 | Application, software ... |
3 | Application ... |
4 | Paradise, recreation ... |
... | ... |
Table 2
Keyword | File identifier sequence (DocID) |
Paradise | 1, 4, ... |
Application | 2, 3, ... |
... | ... |
As the two tables show, the text forward index is keyed by file identifier and maps each file identifier to its keywords, whereas the text inverted index is keyed by keyword and maps each keyword to the identifiers of the files in which it occurs.
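The relationship between the two tables can be sketched in Python. This is a minimal illustration and not the patent's implementation: the function name `invert` and the dictionary layout are assumptions, only the four visible rows of Table 1 are used, and "application" stands for the keyword shown for files 2 and 3.

```python
from collections import defaultdict

# Forward index, following the visible rows of Table 1:
# file identifier -> keywords found in that file.
forward_index = {
    1: ["paradise", "music"],
    2: ["application", "software"],
    3: ["application"],
    4: ["paradise", "recreation"],
}


def invert(forward):
    """Build keyword -> list of file identifiers from a forward index."""
    inverted = defaultdict(list)
    for doc_id in sorted(forward):          # keep identifiers in ascending order
        for keyword in forward[doc_id]:
            inverted[keyword].append(doc_id)
    return dict(inverted)


inverted_index = invert(forward_index)
print(inverted_index["paradise"])
print(inverted_index["application"])
```

The same inversion applies to any forward index keyed by file identifier, which is why the forward and inverted tables always carry the same information.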
Similarly, semantic ontology index 316 stored in index database 306 comprises a semantic ontology forward index and a semantic ontology inverted index. Table 3 is a semantic ontology forward index table and Table 4 is a semantic ontology inverted index table:
Table 3
File identifier (DocID) | Semantic identifier |
1 | Pop music |
2 | Classical music |
3 | Novel |
4 | Computer game |
5 | Pop music |
... | ... |
Table 4
Semantic identifier | File identifier sequence (DocID) |
Pop music | 1, 5, ... |
Classical music | 2, ... |
Novel | 3, ... |
Computer game | 4, ... |
... | ... |
The semantic ontology forward index is keyed by file identifier and maps each file identifier to its semantic identifiers, whereas the semantic ontology inverted index is keyed by semantic identifier and maps each semantic identifier to the identifiers of the files annotated with it.
Fig. 4 is a flowchart of how semantic ontology index processing unit 314 builds semantic ontology index 316 in an embodiment of the invention. The semantic ontology index is built on top of the text index produced by the text index processing unit; the procedure is triggered when text index processing unit 313 has built a text index for a file. Referring to Fig. 4, the procedure for building the semantic ontology index comprises the following steps:
The difference between a semantically annotated file and one that is not annotated is that the annotated file carries ontology concept mapping information. For example, suppose a file has identifier 9 and the URL
http://grids.ucs.indiana.edu/ptliupages/publications/index.html. The page mainly describes matters to attend to when doing research, so it can be annotated with the concept "Research". Some existing semantic annotations are embedded in the page as comments, and others as XML data. This example gives semantic annotation information in comment form, produced with OntoMat, a text annotation tool of Stanford University:
<html>
<head>
<!--<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:daml="http://www.daml.org/2001/03/daml+oil#"
xmlns="http://annotation.semanticweb.org/iswc/iswc.daml#">
<Research rdf:about="http://grids.ucs.indiana.edu/ptliupages/publications/index.html"/>
</rdf:RDF>
-->
<title>Community Grids Publications</title>
This example indicates that the content of the page http://grids.ucs.indiana.edu/ptliupages/publications/index.html is mainly about "Research". For pages annotated with the OntoMat tool, the semantic annotation information is placed in a comment in the HTML head, beginning with <rdf:RDF and ending with </rdf:RDF>. Therefore, when semantic ontology index processing unit 314 detects semantic annotation information that begins with <rdf:RDF and ends with </rdf:RDF>, it determines that the page has been semantically annotated.
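As a rough illustration of this detection step, the following Python sketch looks for an RDF block wrapped in an HTML comment. The regular expression, the function names `extract_annotation` and `is_annotated`, and the shortened sample page are assumptions made for illustration; the embodiment does not specify how the detection is implemented, and the sketch searches the whole page rather than only the head.

```python
import re

# Matches an HTML comment that wraps an <rdf:RDF ...> ... </rdf:RDF> block.
RDF_COMMENT = re.compile(r"<!--\s*(<rdf:RDF\b.*?</rdf:RDF>)\s*-->", re.DOTALL)


def extract_annotation(html):
    """Return the RDF annotation text found in the page, or None if absent."""
    match = RDF_COMMENT.search(html)
    return match.group(1) if match else None


def is_annotated(html):
    """A page counts as semantically annotated if an RDF comment is present."""
    return extract_annotation(html) is not None


# Shortened, hypothetical version of the annotated page from the example.
page = """<html><head>
<!--<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://annotation.semanticweb.org/iswc/iswc.daml#">
<Research rdf:about="http://grids.ucs.indiana.edu/ptliupages/publications/index.html"/>
</rdf:RDF>
-->
<title>Community Grids Publications</title></head></html>"""

plain_page = "<html><head><title>No annotation</title></head></html>"

print(is_annotated(page), is_annotated(plain_page))
```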
In this embodiment, semantic ontology index processing unit 314 reads the semantic annotation information of the page with file identifier 9, that is, the comment in the HTML head. Table 5 shows the format of the extracted semantic annotation information:
Table 5
File identifier (DocID) | Semantic annotation information |
9 | <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:daml="http://www.daml.org/2001/03/daml+oil#" xmlns="http://annotation.semanticweb.org/iswc/iswc.daml#"> <Research rdf:about="http://grids.ucs.indiana.edu/ptliupages/publications/index.html"/> </rdf:RDF> |
… | … |
In this embodiment, semantic ontology index processing unit 314 calls an application programming interface (API) for processing RDF documents, extracts the semantic ontology concept term "Research" from the semantic annotation information, builds the semantic ontology forward index for page 9, and at the same time converts it into a semantic ontology inverted index. Table 6 is the semantic ontology forward index of page 9 and Table 7 is its semantic ontology inverted index:
Table 6
File identifier (DocID) | Semantic identifier |
9 | Research |
Table 7
Semantic identifier | File identifier (DocID) |
Research | 9 |
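One way to picture the extraction is the sketch below, which uses Python's standard `xml.etree.ElementTree` module in place of the unspecified RDF processing API. It assumes that the annotation is well-formed XML and that each concept appears as a typed node directly under `rdf:RDF`; `concepts_of` and `index_file` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Annotation for file 9, written out as well-formed XML (an assumption).
annotation = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:daml="http://www.daml.org/2001/03/daml+oil#"
  xmlns="http://annotation.semanticweb.org/iswc/iswc.daml#">
  <Research rdf:about="http://grids.ucs.indiana.edu/ptliupages/publications/index.html"/>
</rdf:RDF>"""


def concepts_of(rdf_text):
    """Return the local names of the typed nodes directly under rdf:RDF."""
    root = ET.fromstring(rdf_text)
    # ElementTree tags look like "{namespace}LocalName"; keep the local name.
    return [child.tag.split("}", 1)[-1] for child in root]


def index_file(doc_id, rdf_text, forward, inverted):
    """Record the file's concepts in both the forward and the inverted index."""
    concepts = concepts_of(rdf_text)
    forward[doc_id] = concepts
    for concept in concepts:
        inverted.setdefault(concept, []).append(doc_id)


forward, inverted = {}, {}
index_file(9, annotation, forward, inverted)
print(forward)   # corresponds to Table 6
print(inverted)  # corresponds to Table 7
```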
Files pass through text index processing unit 313 before the semantic ontology index is built because, at search time, the files matching the user's text query are found first and only then matched against the semantic ontology index. Processing by text index processing unit 313 first guarantees that every file that has a text index and carries semantic information has a corresponding entry in semantic ontology index 316. It also avoids the situation in which a file has a semantic ontology index but no text index, which could arise if files were read directly from file database 308 for semantic ontology indexing.
Fig. 5 is a flowchart of the search procedure that the retrieval system of Fig. 3 carries out for a user. As shown in Fig. 5, the procedure comprises the following steps:
The details of segmentation are described in the existing literature on search engines and are not repeated here. In this embodiment, segmenting the text query "paradise" yields the keyword "paradise".
After text search processing unit 310 receives the query keyword, it sends index database 306 a request to read the text inverted index, and index database 306 returns the text inverted index from text index 315. Text search processing unit 310 matches the query keyword "paradise" against the text inverted index to obtain the identifiers of the web page files that contain the keyword, that is, the text hit file list, and sends the text hit file list to semantic ontology search processing unit 311 for processing.
For simplicity, this embodiment assumes that only 20 files have been indexed. Table 8 is the text inverted index table that index database 306 returns to text search processing unit 310:
Table 8
Keyword | File identifier sequence |
… | … |
Application | 01011100100011011010 |
Paradise | 11011011111110001011 |
… | … |
In Table 8, each row holds a keyword and the file identifier sequence for the files in which it occurs. The sequence has 20 binary digits, equal to the total number of indexed files, and each binary digit represents one file. The position of a digit equals the file's identifier: the first digit represents file 1, the second digit file 2, and so on. A digit of 0 means the keyword does not occur in the corresponding file; a digit of 1 means it does.
Text search processing unit 310 matches the query keyword "paradise" to the "paradise" entry in Table 8, takes the file identifier sequence that follows it, 11011011111110001011, as the text hit file list, and sends it to semantic ontology search processing unit 311. The files whose binary digit is 1 in the text hit file list are the hit files.
Similarly, if the user enters the text query "paradise application", segmentation yields the keywords "paradise" and "application". The unit looks up both keywords in the text inverted index and performs a bitwise AND on their file identifier sequences, giving 01011000100010001010. A digit of 1 in the result means that both "paradise" and "application" occur in the corresponding file.
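The bit-sequence matching described above can be sketched in Python by treating each file identifier sequence as an integer bitmap. The names `search` and `hit_ids` are hypothetical, and the dictionary holds only the two rows of Table 8 that the text spells out.

```python
TOTAL_FILES = 20

# Text inverted index rows from Table 8: keyword -> file identifier sequence.
text_inverted = {
    "application": "01011100100011011010",
    "paradise": "11011011111110001011",
}


def search(keywords):
    """AND together the sequences of all keywords; return the text hit file list."""
    result = (1 << TOTAL_FILES) - 1            # start with every file selected
    for keyword in keywords:
        result &= int(text_inverted[keyword], 2)
    return format(result, f"0{TOTAL_FILES}b")  # back to a 20-digit string


def hit_ids(hit_list):
    """File identifiers (1-based, left to right) whose digit is 1."""
    return [i for i, bit in enumerate(hit_list, start=1) if bit == "1"]


both = search(["paradise", "application"])
print(both)
print(hit_ids(both))
```

Storing the sequences as integers makes the intersection a single `&` operation per keyword, which is the property the embodiment relies on.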
Step 504: After semantic ontology search processing unit 311 obtains the text hit file list, it first decides whether to perform semantic ontology inverted index matching.
The decision is based on the number of text hit files. If the number of hit files is greater than a threshold, the unit performs semantic ontology inverted index matching and executes step 505; otherwise it performs semantic ontology forward index matching and executes step 506. The threshold may be a preset value stored in semantic ontology search processing unit 311, or a value that the retrieval system adjusts dynamically according to statistics or other conditions.
After semantic ontology search processing unit 311 receives the text hit file list 11011011111110001011, it counts the 1s in the binary sequence and obtains 14, so the number of text hit files is 14. If the threshold is 10, then because 14 is greater than 10 the unit performs semantic ontology inverted index matching. If the threshold is 15, then because 14 is less than 15 the unit performs semantic ontology forward index matching.
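The decision in step 504 amounts to a population count followed by a comparison. A minimal Python sketch, in which the function name `choose_matching` and the returned labels are assumptions:

```python
def choose_matching(hit_list, threshold):
    """Pick the matching strategy from the number of hit files.

    Returns "inverted" when the hit count exceeds the threshold,
    otherwise "forward".
    """
    hit_count = hit_list.count("1")   # number of 1 digits = number of hit files
    return "inverted" if hit_count > threshold else "forward"


hits = "11011011111110001011"
print(hits.count("1"))
print(choose_matching(hits, 10), choose_matching(hits, 15))
```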
Step 505: First, semantic ontology search processing unit 311 sends index database 306 a request to read the semantic ontology inverted index, and index database 306 returns it. The unit then reads each record in the semantic ontology inverted index in turn and intersects the record's file identifier sequence with the text hit file list, that is, it performs a bitwise AND on the two binary sequences and overwrites the file identifier sequence in the semantic ontology inverted index table with the result. Finally, it filters out the records whose intersection is empty. The original semantic ontology inverted index table thereby becomes the file semantic classification table. Step 507 is then executed.
Table 9 is the semantic ontology inverted index table that index database 306 returns to semantic ontology search processing unit 311 in this embodiment:
Table 9
Semantic identifier | File identifier sequence |
Pop music | 01011010110001100000 |
Computer game | 10100101000100001011 |
Classical music | 00010000001011000001 |
Novel | 10000000000000000110 |
Sports star | 00000000000000010000 |
Table 9 assumes that the 20 indexed files involve only five semantic ontology concepts, so five semantic identifiers occur across all files. The file identifier sequence after each semantic identifier records in which of the 20 files the concept occurs. It uses the same representation as the file identifier sequences of the text inverted index: each binary digit represents one file, and the position of a digit equals the file's identifier. A digit of 0 means the file is not annotated with the concept; a digit of 1 means it is. For example, the file identifier sequence for pop music is 01011010110001100000, which means that files 2, 4, 5, 7, 9, 10, 14, and 15 are annotated with the concept pop music, that is, their content relates to pop music.
Semantic ontology search processing unit 311 reads each file identifier sequence in the semantic ontology inverted index of Table 9, performs a bitwise AND with the text hit file list 11011011111110001011, and writes the result over the original file identifier sequence in Table 9. Finally, it filters out the semantic identifiers whose intersection is empty, that is, whose result is all zeros, which produces the file semantic classification table. Table 10 is the resulting file semantic classification table:
Table 10
Semantic identifier | File identifier sequence |
Pop music | 01011010110000000000 |
Computer game | 10000001000100001011 |
Classical music | 00010000001010000001 |
Novel | 10000000000000000010 |
In this way, the text hit file list 11011011111110001011 has been classified by semantics.
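The inverted index matching just described can be sketched as follows, using the rows of Table 9 and the text hit file list from the example. The function name `classify_by_inverted` is an assumption, and the sketch builds a new table instead of overwriting the index in place.

```python
# Semantic ontology inverted index from Table 9.
semantic_inverted = {
    "pop music": "01011010110001100000",
    "computer game": "10100101000100001011",
    "classical music": "00010000001011000001",
    "novel": "10000000000000000110",
    "sports star": "00000000000000010000",
}


def classify_by_inverted(hit_list, inverted):
    """Intersect every concept's sequence with the hit list; drop empty results."""
    width = len(hit_list)
    hits = int(hit_list, 2)
    table = {}
    for concept, sequence in inverted.items():
        common = int(sequence, 2) & hits       # bitwise AND = set intersection
        if common:                             # skip all-zero intersections
            table[concept] = format(common, f"0{width}b")
    return table


classification = classify_by_inverted("11011011111110001011", semantic_inverted)
for concept, sequence in classification.items():
    print(concept, sequence)
```

Every record of the inverted index is visited once, regardless of how many files were hit.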
Step 506: First, semantic ontology search processing unit 311 sends index database 306 a request to read the semantic ontology forward index. Table 11 is the semantic ontology forward index table that index database 306 returns in response to the request:
Table 11
File identifier | Semantic identifiers |
1 | Computer game, novel |
2 | Pop music |
3 | Computer game |
4 | Pop music, classical music |
5 | Pop music |
6 | Computer game |
7 | Pop music |
8 | Computer game |
9 | Pop music |
10 | Pop music |
11 | Classical music |
12 | Computer game |
13 | Classical music |
14 | Pop music, classical music |
15 | Pop music |
16 | Sports star |
17 | Computer game |
18 | Novel |
19 | Computer game, novel |
20 | Computer game, classical music |
Semantic ontology search processing unit 311 converts the text hit file list 11011011111110001011 into concrete file identifiers: 1, 2, 4, 5, 7, 8, 9, 10, 11, 12, 13, 17, 19, and 20. Using each file identifier as the query condition, it looks up the corresponding record in the semantic ontology forward index and obtains a semantic ontology forward index that contains only these file identifiers. Table 12 is the resulting semantic ontology forward index table:
Table 12
File identifier | Semantic identifiers |
1 | Computer game, novel |
2 | Pop music |
4 | Pop music, classical music |
5 | Pop music |
7 | Pop music |
8 | Computer game |
9 | Pop music |
10 | Pop music |
11 | Classical music |
12 | Computer game |
13 | Classical music |
17 | Computer game |
19 | Computer game, novel |
20 | Computer game, classical music |
Finally, taking each semantic ontology concept that occurs in Table 12 as a key, the unit collects the file identifiers in which that key occurs. This completes the conversion from forward index to inverted index and produces the file semantic classification table. Table 13 is the resulting file semantic classification table:
Table 13
Semantic identifier | File identifier sequence |
Pop music | 01011010110000000000 |
Computer game | 10000001000100001011 |
Classical music | 00010000001010000001 |
Novel | 10000000000000000010 |
Step 507 is then executed.
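Forward index matching can be sketched in the same style. The data below reproduces Table 11; the function name `classify_by_forward` is an assumption, and the sketch merges the filtering step (Table 12) and the inversion step (Table 13) into one pass over the hit files.

```python
# Semantic ontology forward index from Table 11: file identifier -> concepts.
semantic_forward = {
    1: ["computer game", "novel"], 2: ["pop music"], 3: ["computer game"],
    4: ["pop music", "classical music"], 5: ["pop music"], 6: ["computer game"],
    7: ["pop music"], 8: ["computer game"], 9: ["pop music"], 10: ["pop music"],
    11: ["classical music"], 12: ["computer game"], 13: ["classical music"],
    14: ["pop music", "classical music"], 15: ["pop music"], 16: ["sports star"],
    17: ["computer game"], 18: ["novel"], 19: ["computer game", "novel"],
    20: ["computer game", "classical music"],
}


def classify_by_forward(hit_list, forward):
    """Look up only the hit files and regroup them by concept."""
    width = len(hit_list)
    table = {}
    for doc_id, bit in enumerate(hit_list, start=1):
        if bit != "1":
            continue                               # not a hit file
        for concept in forward.get(doc_id, []):
            digits = table.setdefault(concept, ["0"] * width)
            digits[doc_id - 1] = "1"               # mark this file for the concept
    return {concept: "".join(digits) for concept, digits in table.items()}


classification = classify_by_forward("11011011111110001011", semantic_forward)
for concept, sequence in classification.items():
    print(concept, sequence)
```

Here the work grows with the number of hit files rather than with the size of the inverted index.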
Why being divided into semantic body inverted index matching treatment and semantic body forward index matching treatment, is to consider efficiency.Because in the process of carrying out semantic body inverted index matching treatment, need mate each bar record in the semantic body inverted index successively with text hit file tabulation, and do intersection operation, the process of the semantic body inverted index of this full table scan, its calculated amount expense is very big.Therefore, when the number of text hit file seldom the time, carry out semantic body forward index matching treatment and can reduce calculated amount.But no matter use which kind of matching process, the document semantic sorted table of Chan Shenging all is identical at last, and promptly table 13 is identical with table 10.
After semantic body search processing 311 executes the matching operation of semantic body index, at first the semantic Ontological concept vocabulary in the document semantic sorted table is sent to semantic ontology inference engine 3 04 and carry out semantic reasoning.Semantic ontology inference engine 3 04 produces the RDF document that concerns between the semantic body vocabulary of expression according to the semantic Ontological concept of definition in the body mark storehouse 305 and the inference rule of relation and self definition thereof, returns to semantic body search processing 311.Then, semantic body search processing 311 is mated the trigger condition in the semantic classification rule of definition in this RDF document and the document semantic classifying rules engine 3 02, judge which semantic classification rule needs to trigger, and trigger corresponding rule, produce document semantic sorted table through the reasoning expansion.At last, the semantic document classification table after the expansion is sent to ordering processing unit 312.
In the present embodiment, the semantic ontology search processing unit 311 sends the four semantic ontology concept terms in Table 10 or Table 13 (pop music, computer game, classical music, and novel) to the semantic ontology inference engine 304 for reasoning. The engine reasons over resources expressed as RDF triples by applying the defined inference rules. An RDF triple has the form (subject, predicate, object). For example, two resource descriptions are defined as shown in Figure 6: Shenzhen 601 belongs to Guangdong 602, and Guangdong 602 belongs to China 603. An inference rule is also defined: (?a, belongs to, ?b), (?b, belongs to, ?c) → (?a, belongs to, ?c). This rule means that if a belongs to b and b belongs to c, then a belongs to c. The result shown in Figure 7 can therefore be inferred from the relations shown in Figure 6: Shenzhen 601 belongs to China 603.
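The Shenzhen/Guangdong/China example can be reproduced with a small forward-chaining loop. This is only a sketch of the single transitivity rule quoted above; the function name `infer_transitive` and the tuple representation of triples are assumptions for illustration, not the inference engine's actual interface.

```python
def infer_transitive(triples, predicate):
    """Apply (?a, p, ?b), (?b, p, ?c) -> (?a, p, ?c) until nothing new appears."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for a, p1, b in list(known):
            for b2, p2, c in list(known):
                if p1 == predicate and p2 == predicate and b == b2:
                    new_triple = (a, predicate, c)
                    if new_triple not in known:
                        known.add(new_triple)
                        changed = True
    return known

# The two resource descriptions of Figure 6 as (subject, predicate, object).
facts = {
    ("Shenzhen", "belongs to", "Guangdong"),
    ("Guangdong", "belongs to", "China"),
}
inferred = infer_transitive(facts, "belongs to") - facts
print(inferred)
```

The only newly derived triple is the one shown in Figure 7: Shenzhen belongs to China.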
Suppose the annotation ontology library 305 defines the relations shown in Figure 8 for the four ontology concepts of the present embodiment: the parent of pop music 801 is popular music 802, and the parent of both popular music 802 and classical music 803 is music 804; the parent of novel 805 is literature 806; and the parent of computer game 807 is recreation 808. After the inference rules are applied, the RDF relations of the four ontology concepts are as shown in Figure 9: the parent of pop music 801 and of classical music 803 is music 804; the parent of novel 805 is literature 806; and the parent of computer game 807 is recreation 808. The RDF triples are output in the following form:
(pop music, parent, music)
(classical music, parent, music)
(novel, parent, literature)
(computer game, parent, recreation)
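One way to read the step from Figure 8 to Figure 9 is that each of the four concepts is resolved to its topmost ancestor. The sketch below follows that reading; it is an assumption about how the listed triples could be derived, and the names `parent_of` and `top_ancestor` are hypothetical.

```python
# Parent relations of Figure 8 (child -> parent).
parent_of = {
    "pop music": "popular music",
    "popular music": "music",
    "classical music": "music",
    "novel": "literature",
    "computer game": "recreation",
}

def top_ancestor(concept, parent_of):
    """Follow parent links until a concept with no parent is reached."""
    seen = {concept}
    while concept in parent_of:
        concept = parent_of[concept]
        if concept in seen:
            raise ValueError("cycle in parent relation")
        seen.add(concept)
    return concept

concepts = ["pop music", "classical music", "novel", "computer game"]
triples = [(c, "parent", top_ancestor(c, parent_of)) for c in concepts]
for triple in triples:
    print(triple)
```

Under this reading, pop music is attached directly to music because the intermediate concept popular music is skipped once the parent relation is followed to its end.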
The file semantic classification rule engine 302 defines the following semantic classification rule: if several triples share a common object and their predicate is "parent", a new file class is added to the file semantic classification table. The class name is the name of that object, and its file identifier sequence is the union of the file identifier sequences corresponding to the subject terms of those triples, that is, the result of a bitwise OR operation. Table 14 shows this semantic classification rule:
Table 14
Trigger condition | Operation performed |
Multiple triples (?X1, parent, ?Y), (?X2, parent, ?Y), ... exist | Add a record to the file semantic classification table. The semantic identifier of the record is ?Y, and its file identifier sequence is the union of the file identifier sequences corresponding to ?X1, ?X2, ... |
The file semantic classification table is then expanded and integrated according to the result of the semantic reasoning and the semantic classification rule. Table 15 shows the expanded file semantic classification table:
Table 15
Semantic identifier | File identifier sequence |
Pop music | 01011010110000000000 |
Computer game | 10000001000100001011 |
Classical music | 00010000001010000001 |
Novel | 10000000000000000010 |
Music | 01011010111010000001 |
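The rule of Table 14 and the resulting "Music" row of Table 15 can be checked with a short sketch. The bit sequences are taken from the table above; the function name `expand_classification` and the dictionary representation are assumptions for illustration only.

```python
from collections import defaultdict

def expand_classification(table, triples):
    """Add one class per parent shared by several subjects (bitwise OR of their files)."""
    children_of = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == "parent" and subject in table:
            children_of[obj].append(subject)

    expanded = dict(table)
    for parent, children in children_of.items():
        if len(children) > 1:  # the rule fires only for a shared parent
            width = len(table[children[0]])
            bits = 0
            for child in children:
                bits |= int(table[child], 2)
            expanded[parent] = format(bits, f"0{width}b")
    return expanded

# File identifier sequences before expansion (first four rows of Table 15).
table = {
    "pop music": "01011010110000000000",
    "computer game": "10000001000100001011",
    "classical music": "00010000001010000001",
    "novel": "10000000000000000010",
}
triples = [
    ("pop music", "parent", "music"),
    ("classical music", "parent", "music"),
    ("novel", "parent", "literature"),
    ("computer game", "parent", "recreation"),
]
expanded = expand_classification(table, triples)
print(expanded["music"])
```

Only "music" is added, because it is the only parent shared by more than one subject; "literature" and "recreation" each have a single child and do not trigger the rule.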
The above describes only preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention.
Claims (24)
1. A retrieval system based on a semantic ontology, characterized in that the system comprises:
a semantic ontology index database, configured to store a semantic ontology index;
a semantic ontology search processing unit, configured to obtain a text hit file list and to match the text hit file list against the semantic ontology index in the semantic ontology index database to obtain a file semantic classification table.
2. The system as claimed in claim 1, characterized in that the system further comprises a semantic ontology index processing unit, configured to obtain files for which a text index has been built and to build a semantic ontology index for the obtained files.
3. The system as claimed in claim 2, characterized in that the system further comprises:
a text index processing unit, configured to build a text index for files;
a text index database, configured to store the text index;
a text search processing unit, configured to match a user's text query information against the text index in the text index database to obtain the text hit file list.
4. The system as claimed in claim 1, 2 or 3, characterized in that the system further comprises:
a semantic ontology inference engine, configured to perform semantic reasoning on the semantic ontology vocabulary in the file semantic classification table, according to the semantic ontology vocabulary and the relations between semantic ontology terms in an annotation ontology library, to obtain an expanded semantic ontology vocabulary;
the annotation ontology library, configured to store the semantic ontology vocabulary and the relations between semantic ontology terms;
a file semantic classification rule engine, configured to store semantic classification rules, to match and trigger the corresponding semantic classification rule according to the expanded semantic ontology vocabulary inferred by the semantic ontology inference engine, and to expand and integrate the file semantic classification table to obtain an expanded file semantic classification table.
5. The system as claimed in claim 4, characterized in that the system further comprises a sorting processing unit, configured to sort the files in the expanded file semantic classification table.
6. The system as claimed in claim 5, characterized in that the system further comprises a search interface module, configured to send the user's text query information to the text search processing unit and to return the sorting result of the sorting processing unit to the user.
7. The system as claimed in claim 3, characterized in that the system further comprises a file database, configured to store files for use by the semantic ontology index processing unit in building the semantic ontology index and by the text index processing unit in building the text index.
8. The system as claimed in claim 7, characterized in that the system further comprises a network file crawling module, configured to crawl network files from the Internet and save them in the file database.
9. The system as claimed in claim 3, characterized in that the text index comprises a text forward index and a text inverted index, and the semantic ontology index comprises a semantic ontology forward index and a semantic ontology inverted index.
10. The system as claimed in claim 1, characterized in that the semantic ontology search processing unit performs semantic ontology forward-index matching or semantic ontology inverted-index matching on the text hit file list.
11. The system as claimed in claim 10, characterized in that the semantic ontology search processing unit performs semantic ontology inverted-index matching when the number of text hit files in the text hit file list is greater than a threshold, and otherwise performs semantic ontology forward-index matching.
12. The system as claimed in claim 11, characterized in that the threshold is a preset fixed value or a dynamically adjustable value.
13. The system as claimed in claim 3, characterized in that the text search processing unit and the semantic ontology search processing unit are integrated in a search processing module; the text index processing unit and the semantic ontology index processing unit are integrated in an index processing module; and the text index database and the semantic ontology index database are integrated into a single index database.
14. The system as claimed in claim 5, characterized in that the text search processing unit, the semantic ontology search processing unit, and the sorting processing unit are integrated in a search processing module.
15. A retrieval method based on a semantic ontology, characterized in that the method comprises the following steps:
A. obtaining files for which a text index has been built, and building a semantic ontology index for the obtained files;
B. obtaining a text hit file list, and performing semantic ontology index matching on the text hit file list to obtain a file semantic classification table.
16. The method as claimed in claim 15, characterized in that,
before step A, the method further comprises a step of building a text index for the files in a file database;
before step B, the method further comprises a step of performing text index matching on a user's text query information to obtain the text hit file list.
17. The method as claimed in claim 15 or 16, characterized in that the method further comprises the following steps:
C. performing semantic reasoning on the semantic ontology vocabulary in the file semantic classification table to obtain an expanded semantic ontology vocabulary;
D. expanding and integrating the file semantic classification table according to the inferred expanded semantic ontology vocabulary to obtain an expanded file semantic classification table.
18. The method as claimed in claim 17, characterized in that the method further comprises a step of sorting the files in the expanded file semantic classification table.
19. The method as claimed in claim 15, characterized in that building the semantic ontology index in step A comprises building a semantic ontology forward index and building a semantic ontology inverted index; and performing semantic ontology index matching on the text hit file list in step B comprises performing semantic ontology inverted-index matching or performing semantic ontology forward-index matching.
20. The method as claimed in claim 15, characterized in that, before step B, the method further comprises a step of determining whether semantic ontology inverted-index matching or semantic ontology forward-index matching is to be performed on the text hit file list in step B.
21. The method as claimed in claim 20, characterized in that the determining step is: when the number of text hit files in the text hit file list is greater than a threshold, semantic ontology inverted-index matching is performed in step B; otherwise, semantic ontology forward-index matching is performed in step B.
22. The method as claimed in claim 21, characterized in that the threshold is a preset fixed value or a dynamically adjustable value.
23. The method as claimed in claim 16, characterized in that building the text index comprises building a text forward index and building a text inverted index; and performing text index matching on the user's text query information comprises performing text inverted-index matching or performing text forward-index matching.
24. The method as claimed in claim 16, characterized in that, before the text index is built, the method further comprises a step of building the file database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2006101498039A CN101169780A (en) | 2006-10-25 | 2006-10-25 | Semantic ontology retrieval system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101169780A true CN101169780A (en) | 2008-04-30 |
Family
ID=39390409
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2006101498039A Pending CN101169780A (en) | 2006-10-25 | 2006-10-25 | Semantic ontology retrieval system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101169780A (en) |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C12 | Rejection of a patent application after its publication | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20080430 |