CN103593410A - System for search recommendation by means of replacing conceptual terms - Google Patents

System for search recommendation by means of replacing conceptual terms

Info

Publication number
CN103593410A
Authority
CN
China
Prior art keywords
concept
conceptual
query
search
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310501114.XA
Other languages
Chinese (zh)
Other versions
CN103593410B (en)
Inventor
朱其立
孙伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201310501114.XA
Publication of CN103593410A
Application granted
Publication of CN103593410B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

The invention provides a system for search recommendation by replacing conceptual terms. The system comprises an offline system and an online system. The offline system parses each historical record in the search engine log, identifies the entity keywords it contains, and builds an index over the historical records according to the categories of those entity keywords. The online system receives and parses search engine queries submitted by users, identifies the conceptual keywords in them, and, according to weights, finds the historical queries that are closest to the given query and contain entity keywords carrying the meaning of the conceptual keywords; the queries found are then sorted and returned to the user for a secondary query. The system is simple and direct and makes use of the massive data available from search engines; users who cannot provide accurate search terms can use abstract conceptual keywords instead; and because the system provides recommended search terms directly, the user experience is improved.

Description

System for search recommendation by replacing conceptual terms
Technical field
The present invention relates to the fields of natural language processing and search engines, and in particular to a control method and corresponding control device of XX.
Background art
A search of the prior art found the following related results:
Related search result 1:
Application (patent) number: 200580042218.2, title: Suggesting search engine keywords
Abstract: A search engine receives a search query containing one or more keywords. Documents in the result set of the search query are analyzed to identify one or more additional keywords that further partition or separate the initial result set. These additional keywords are presented to the user, who then chooses whether to include or exclude the documents that match them. In this way, the number of documents in the initial result set is reduced relatively quickly and easily.
This patent document analyzes the result set of the search query entered by the user, extracts keywords that can be used to partition the results, and recommends the extracted keywords to the user, who decides whether to keep or exclude the documents those keywords point to. Although the process appears concise, in the current era of big data users find it difficult to provide an accurate initial search query. In that case the method cannot guarantee that the initial result set contains the documents the user actually needs, and therefore cannot guarantee its effectiveness.
Comparison of technical points:
1. That document recommends keywords based on the result set of the search engine query entered by the user, and the user decides whether to include (or exclude) the results matching the recommended keywords; the present invention instead uses the search engine's historical records together with the query entered by the user to recommend whole queries.
2. That document recommends keywords directly from the result set of the user's query; the present invention attempts to understand the user's query at a deeper, semantic level and then uses that semantic understanding to recommend queries.
Related search result 2:
Application (patent) number: 201010618555.4, title: Method and apparatus for recommending search keywords
Abstract: The application discloses a method and apparatus for recommending search keywords, intended to solve the prior-art problem that recommendations made to users without a clear search intent are poor, which wastes search engine server resources. The method comprises: receiving an input search keyword; comparing the received search keyword with the sample words in a preset non-intent word set and the sample words in a preset intent word set; and, when the comparison shows that the received search keyword contains a sample word from the non-intent word set and no sample word from the intent word set, determining the recommended search keywords under a strategy in which a first predetermined recommendation mode is the primary mode and the other recommendation modes are supplementary, wherein the first predetermined recommendation mode is a knowledge-base-based recommendation mode and/or a session-relevance-based recommendation mode.
This patent document uses an intent word set and a non-intent word set to judge whether the search query entered by the user is an intent query or a non-intent query, and then chooses its main recommendation strategy according to that judgment. When the query is judged to be an intent query, the patent relies mainly on a session-based recommendation strategy. When the query is judged to be a non-intent query, it relies mainly on a knowledge-base-based recommendation strategy. However, the intent and non-intent word sets are themselves very limited and need continual maintenance and updating, which is costly; moreover, the knowledge base it uses is built mainly on Alibaba's e-commerce categories.
Comparison of technical points:
1. That document adopts a compound recommendation strategy in which recommendations for fuzzy queries are also assisted by a knowledge base, but its knowledge base is drawn mainly from Alibaba's e-commerce categories, whereas the knowledge base of the present invention is Probase or any probabilistic hierarchical database.
Related search result 3:
Application (patent) number: 201310165048.3, title: Method for recommending search candidate words, and search engine
Abstract: The invention proposes a method for recommending search candidate words and a search engine. The method comprises: a search engine server receives the input entered by the user and obtains its prefix; using the prefix as an index, it obtains a plurality of search candidate words and the weight of each; it judges whether at least two of the search candidate words belong to the same topic; if so, it keeps the weight of one of those candidate words unchanged and reduces the weights of the others; and it sorts the search candidate words by weight and provides the sorted candidates to the user. According to the method of that embodiment, the diversity and accuracy of search candidate words are improved and the user's search needs can be met; the algorithm is simple and easy to implement, and improves the user experience.
This patent document recommends search queries mainly from the prefix of the query entered by the user, which is essentially an auto-complete function. In practice such an auto-complete function has a fairly serious defect: a sudden hot event can raise the weights of many queries about that event at once, so that the search engine recommends many queries with very similar meanings. Although this patent optimizes for that particular scenario, it still depends on accurate keywords.
Comparison of technical points:
1. That document mainly aggregates similar recommended keywords and reduces their weights, whereas the present invention focuses on understanding keywords whose meaning is fuzzy and making recommendations for them.
2. That document recommends search keywords mainly from the prefix, whereas the present invention mainly rewrites the fuzzy part of the search keywords and recommends the rewritten queries.
Summary of the invention
In view of the defects in the prior art, the object of the present invention is to provide a system for search recommendation by replacing conceptual terms. The technical problems addressed by the present invention are the following:
1) A hierarchical knowledge base is introduced so that the search query entered by the user can be understood conceptually, that is, the conceptual (ambiguous) keywords in it can be identified.
2) Because search engines perform well on keyword-based queries, the present invention replaces the identified conceptual keywords with more specific entity (concrete) keywords, thereby obtaining better search results.
3) Search engine logs are used for recommendation. The logs of a search engine record a massive number of user search queries, from which high-quality queries with accurate result sets can be selected and then recommended to users who cannot provide accurate keywords. This approach is direct and also improves the experience of search engine users.
The system for search recommendation by replacing conceptual terms provided by the present invention comprises an offline system and an online system, wherein:
The offline system parses each historical record in the search engine log, identifies the entity keywords it contains, and then builds an index over these historical records according to the categories to which the entity keywords belong, for use by the online system;
The online system receives and parses the search engine query submitted by the user, identifies the conceptual keywords in it, and then, according to weights, finds the historical queries that are closest to the given search query and contain entity keywords carrying the meaning of the conceptual keywords; it then sorts the queries found and returns a sorted recommendation list to the user, who selects the query considered closest to what was intended and performs a secondary query.
Preferably, the offline system comprises an entity abstraction module and a concept aggregation module, wherein:
The entity abstraction module identifies the entity keywords contained in each historical query, abstracts the identified entity keywords to their corresponding conceptual keywords, and passes the result to the concept aggregation module;
The concept aggregation module aggregates historical queries that contain the same concept and builds the index. For each historical query, the entity abstraction module identifies the entity keywords it contains and their corresponding concepts; the concept aggregation module then groups together, by concept, the historical queries that contain the same concept, and builds an index keyed by concept for use by the online system.
Preferably, the online system comprises a concept analysis module, an index search module, and a scoring and sorting module, wherein:
The concept analysis module identifies the conceptual keywords in the search query submitted by the user;
The index search module takes the conceptual keywords identified by the concept analysis module, traverses the index generated by the offline system, and finds all historical queries that contain entity keywords matching the identified conceptual keywords; these historical queries serve as candidate recommendation queries;
The scoring and sorting module scores all candidate recommendation queries found by the index search module, sorts them, and returns part of the sorted candidate list to the user to choose from.
Preferably, the score is defined as a distance made up of three parts: the semantic distance, the literal distance, and the quality of the historical query.
Preferably, the semantic distance describes how typical the replacement entity keyword is of the conceptual keyword in the user's original query, and typicality is defined by the following formula:
$$Typicality(instance, concept) = \frac{Freq(instance, concept)}{Freq(concept) + 2000}$$
where Typicality(instance, concept) denotes how typical the entity is of the given concept, Freq(instance, concept) denotes the frequency with which the entity and the concept occur together, Freq(concept) denotes the frequency of the given concept in the corpus, instance denotes an entity, and concept denotes a concept;
and it is converted with the following formula:
$$SemDist(typ) = -\frac{2}{1 + e^{typ}} + 2$$
where SemDist(typ) denotes the semantic distance, typ denotes a typicality value calculated by the Typicality(instance, concept) formula, and e is the base of the natural logarithm.
Preferably, the literal distance describes the similarity between the user's query and the candidate query in the parts other than the conceptual keyword.
Preferably, the quality of the historical query describes whether a historical query is genuinely useful to users, and is defined by the following formula:
$$Quality(query) = \frac{ClickTime(query)}{Freq(query)}$$
where Quality(query) denotes the quality of the given historical query, ClickTime(query) denotes the number of clicks by users in the result set of the given historical query, Freq(query) denotes the frequency with which the given historical query was searched, and query denotes a given historical query;
and the query quality is transformed as follows:
$$QualityDist(quality) = e^{-quality}$$
where QualityDist(quality) denotes the quality distance, and quality denotes a quality value calculated by the Quality(query) formula.
Preferably, the final overall score TotalScore is defined by the following formula:
$$TotalScore = [SemDist + WordDist] + QualityDist$$
where SemDist denotes the semantic distance, WordDist denotes the literal distance, and QualityDist denotes the quality distance.
Compared with the prior art, the present invention has the following beneficial effects:
1) Historical queries in the search engine log are used as the source of recommendations, which is simple and direct and makes use of the massive data available from search engines.
2) When users cannot provide accurate search terms, they can use abstract conceptual keywords instead, so that they can still carry out a search rather than being left without options.
3) Recommended search terms are provided directly, so that users can click a recommended term to perform a secondary query; this is simple and convenient and improves the user experience.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the drawings:
Fig. 1 is a schematic diagram of the offline system.
Fig. 2 is a schematic diagram of the online system.
Fig. 3 is a schematic diagram of the index structure.
Detailed description
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to understand the present invention further, but do not limit it in any way. It should be noted that those skilled in the art can make various changes and improvements without departing from the concept of the invention, and these all fall within the scope of protection of the present invention.
The main function of the system for search recommendation by replacing conceptual terms provided by the present invention is to take the historical records stored in the search engine log, build an index over them, and then match them against the search query entered by the user, so that historical records likely to interest the user can be recommended.
The search recommendation system comprises two subsystems, an offline system and an online system. The offline system parses each historical record in the search engine log, identifies the entity keywords it contains (such as "Obama", "Nixon", "Watergate"), and then builds an index over these historical records according to the categories to which the entity keywords belong, for use by the online system. The online system receives and parses the search engine query submitted by the user, identifies the conceptual keywords in it (such as "president", "scandal"), and then, according to weights, finds the historical queries that are closest to the given search query and contain entity keywords carrying the meaning of the conceptual keywords; it then sorts the queries found and returns a sorted recommendation list to the user, who selects the query considered closest to what was intended and performs a secondary query.
The offline system mainly comprises two modules, the entity abstraction module and the concept aggregation module, as shown in Fig. 1.
The entity abstraction module identifies the entity keywords contained in each historical query (such as "Obama", "Nixon", "Watergate"), abstracts the identified entity keywords to their corresponding conceptual keywords (such as "president", "scandal"), and passes the result to the concept aggregation module. This step relies on a hierarchical knowledge base, which provides a set of entity keywords, a set of conceptual keywords, and a set of entity-concept correspondences.
The concept aggregation module aggregates historical queries that contain the same concept and builds the index. For each historical query, the entity abstraction module identifies the entity keywords it contains and their corresponding concepts, and the concept aggregation module groups together, by concept, the historical queries that contain the same concept. The result is an index keyed by concept, which is handed to the online system.
The online system comprises three modules: the concept analysis module, the index search module, and the scoring and sorting module, as shown in Fig. 2.
The concept analysis module identifies the conceptual keywords in the search query submitted by the user. For example, when the user enters "president involved in scandals", this module identifies the two conceptual keywords in the query, "president" and "scandal", and passes this information to the following modules.
The index search module takes the identified conceptual keywords, traverses the index generated by the offline system, and finds all historical queries that contain entity keywords matching the identified conceptual keywords. These historical queries become the candidate recommendation queries and are passed to the next module for scoring and sorting.
The scoring and sorting module scores all candidate recommendation queries found, sorts them, and returns part of the sorted candidate list to the user to choose from. Three factors are considered in the scoring stage. The first is the semantic factor, that is, the degree of association between the conceptual keyword in the original query and the entity keyword in the recommended historical query. The second is the literal similarity between the original query and the recommended query. The third is the quality of the recommended historical query, which the present invention measures mainly by query frequency (the number of times the query was issued) and average clicks (the average number of clicks by users in the result set per query). In the list-return stage, a threshold (X) and a maximum number of returned queries (N) are determined dynamically: the score of every returned recommendation must not exceed X, and no more than N recommendations are returned.
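The list-return rule at the end of this paragraph can be sketched as follows. This is a minimal illustration, not the patented implementation: the text says X and N are determined dynamically but does not say how, so the sketch simply takes them as the parameters `max_distance` and `max_results`. The function name and the sample scores are hypothetical.

```python
def select_recommendations(scored, max_distance, max_results):
    """Keep candidates whose distance is at most max_distance (X in the text),
    sort them closest-first, and return at most max_results (N in the text).

    scored: list of (query, distance) pairs; a smaller distance is a better match.
    """
    kept = [(q, d) for q, d in scored if d <= max_distance]
    kept.sort(key=lambda pair: pair[1])  # closest first
    return kept[:max_results]


# Hypothetical candidates and distances, for illustration only.
candidates = [("obama scandal", 1.4), ("nixon watergate", 0.9), ("bill news", 3.2)]
top = select_recommendations(candidates, max_distance=2.0, max_results=5)
```

With these made-up values, "bill news" is dropped because its distance exceeds the threshold, and the remaining two queries are returned closest-first.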
Implementation example 1: building the historical query index
Given a conceptual keyword, the online system needs to find quickly the set of historical queries related to that concept, which serves as the candidate set of recommendation queries. The historical query index should therefore be organized with the conceptual keyword as the index key. As shown in Fig. 3, all historical queries related to a particular concept are gathered into the query set keyed by that concept; for example, historical queries containing presidents such as "Obama" and "Nixon" are all gathered into the query set keyed by "president". The index is the collection of such query sets over all concepts. In particular, in a concrete implementation, all concepts and historical queries are assigned unique identifiers (UIDs) so that information can be accessed quickly.
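A concept-keyed index of this kind might be built as in the sketch below. Several things here are assumptions made for illustration and are not taken from the patent: the small `ENTITY_TO_CONCEPTS` dictionary stands in for the hierarchical knowledge base, entities are single lowercase words found by whitespace tokenization, and UIDs are assigned as consecutive integers.

```python
from collections import defaultdict

# Hypothetical entity -> concepts mapping, standing in for the knowledge base.
ENTITY_TO_CONCEPTS = {
    "obama": ["president"],
    "nixon": ["president"],
    "watergate": ["scandal"],
}


def build_concept_index(history):
    """Return (index, query_uid): index maps each concept to the set of UIDs of
    historical queries that contain an entity of that concept."""
    query_uid = {}            # query text -> unique integer ID
    index = defaultdict(set)  # concept -> set of query UIDs
    for query in history:
        uid = query_uid.setdefault(query, len(query_uid))
        for token in query.lower().split():
            for concept in ENTITY_TO_CONCEPTS.get(token, []):
                index[concept].add(uid)
    return dict(index), query_uid


history = ["Nixon Watergate", "Obama speech", "weather today"]
index, query_uid = build_concept_index(history)
```

Looking up `index["president"]` then yields the UIDs of both the "Nixon Watergate" and "Obama speech" queries, which is the candidate set the online system would score.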
Implementation example 2: query analysis
Query analysis mainly means identifying the conceptual keywords or entity keywords in a query. The entity abstraction module of the offline system mainly identifies the entity keywords in historical queries, while the concept analysis module of the online system mainly identifies the conceptual keywords in the query entered by the user. The two use the same method and differ only in the data set used for identification (the entity keyword set for the former, the conceptual keyword set for the latter). The following description takes the online part as an example.
The conceptual keyword set used in the implementation comes from the hierarchical knowledge base, which abstracts the world through a series of tree structures. Consider, for example, the chain "market" → "emerging market" → "China, India, ...". Here "China" and "India" are entity keywords; abstracting them by one level yields the concept "emerging market", and abstracting further yields the concept "market". The information for this knowledge base can be captured from the web using network information acquisition techniques.
Once the conceptual keyword set is available, the conceptual keywords in the user's query can be identified. The present invention uses the longest-match principle: when "emerging market" is identified, "market" is no longer identified. The main reason is that a longer concept is usually a more precise concept and therefore covers a more precise set of entity keywords, which in turn yields more precise recommendations.
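The longest-match principle can be illustrated with a short greedy scan. This is only a sketch under stated assumptions: queries are tokenized on whitespace, matching is exact on lowercase text with no stemming (so "scandals" would not match "scandal"), and the `CONCEPTS` set is a hypothetical stand-in for the conceptual keyword set from the knowledge base.

```python
def find_concepts(query, concepts):
    """Greedy longest-match scan over whitespace tokens.
    Returns the matched concepts in order of appearance."""
    tokens = query.lower().split()
    max_len = max((len(c.split()) for c in concepts), default=0)
    found, i = [], 0
    while i < len(tokens):
        # Try the longest possible span starting at token i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in concepts:
                found.append(phrase)
                i += n  # skip past the match so shorter sub-concepts are not reported
                break
        else:
            i += 1  # no concept starts at this token
    return found


# Hypothetical concept set, for illustration only.
CONCEPTS = {"market", "emerging market", "president", "scandal"}
matches = find_concepts("growth in emerging market", CONCEPTS)
```

Because the two-word span is tried before the one-word span, "emerging market" is reported and the shorter "market" inside it is not.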
Implementation example 3: scoring and sorting
Analysis shows that a fuzzy query submitted by a user may contain more than one conceptual keyword. By traversing the index, all historical queries containing entities that correspond to the identified conceptual keywords can be found; these historical queries are the candidate recommendation queries to be scored and sorted. In the present invention the score is defined as a distance (the smaller the better) made up of three parts: the semantic distance, the literal distance, and the quality of the historical query.
The semantic distance describes how typical the replacement entity keyword is of the conceptual keyword in the user's original query. For example, replacing "president" with "Barack Obama" is more typical than replacing it with "Bill", because, as is commonly known, "Barack Obama" refers almost exclusively to the incumbent US President, whereas "Bill" is a very common American given name. Specifically, typicality can be defined by the following formula:
$$Typicality(instance, concept) = \frac{Freq(instance, concept)}{Freq(concept) + 2000}$$
where Typicality(instance, concept) denotes how typical the entity is of the given concept, Freq(instance, concept) denotes the frequency with which the entity and the concept occur together, Freq(concept) denotes the frequency of the given concept in the corpus, instance denotes an entity, and concept denotes a concept.
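As a worked illustration of the typicality formula, the sketch below evaluates it for two made-up sets of counts. The counts are hypothetical and chosen only to make the arithmetic easy to follow; the parameter name `smoothing` for the constant 2000 is also an assumption, since the text does not name or explain that constant.

```python
def typicality(freq_instance_concept, freq_concept, smoothing=2000):
    """Typicality(instance, concept) = Freq(instance, concept) / (Freq(concept) + 2000)."""
    return freq_instance_concept / (freq_concept + smoothing)


# Hypothetical co-occurrence counts for the concept "president".
t_obama = typicality(freq_instance_concept=6000, freq_concept=8000)  # 6000 / 10000
t_bill = typicality(freq_instance_concept=50, freq_concept=8000)     # 50 / 10000

# A rare concept seen only 5 times, always with the same entity.
t_rare = typicality(freq_instance_concept=5, freq_concept=5)         # 5 / 2005
```

With these numbers the "Obama" replacement scores far higher than the "Bill" replacement. The last case shows one arithmetic effect of the constant in the denominator: an entity that always co-occurs with a very rare concept still receives a small typicality value.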
According to this formula, the more typical an entity is of a concept, the larger the value. To make the value fit the notion of a "distance", it is converted with the following formula:
$$SemDist(typ) = -\frac{2}{1 + e^{typ}} + 2$$
where SemDist(typ) denotes the semantic distance, and typ denotes a typicality value calculated by the Typicality(instance, concept) formula.
The literal distance (WordDist) describes the similarity between the user's query and the candidate query in the parts other than the conceptual keyword. The present invention uses the edit distance between strings, which is commonly used in computer science, as the literal distance.
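One way to compute such a literal distance is sketched below. The text only says "edit distance between strings"; the choice of the Levenshtein variant (insertions, deletions and substitutions each costing 1), the character-level comparison, and the way the concept and its replacement entity are stripped out by substring removal are all assumptions made for this illustration. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (unit-cost insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def word_dist(user_query, concept, candidate, entity):
    """Edit distance between the two queries after removing the conceptual keyword
    from the user's query and the replacement entity from the candidate."""
    rest_user = " ".join(user_query.replace(concept, "", 1).split())
    rest_cand = " ".join(candidate.replace(entity, "", 1).split())
    return edit_distance(rest_user, rest_cand)


same_rest = word_dist("president speech", "president", "obama speech", "obama")
diff_rest = word_dist("president speech", "president", "obama speeches", "obama")
```

In the first call the remainders are both "speech", so the distance is zero; in the second the remainders differ by the two appended characters "es".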
The quality of a historical query describes whether that query is genuinely useful to users. Based on the assumption that the results of a useful query are more likely to be clicked many times by users, the present invention defines the quality of a historical query by the following formula:
$$Quality(query) = \frac{ClickTime(query)}{Freq(query)}$$
where Quality(query) denotes the quality of the given historical query, ClickTime(query) denotes the number of clicks by users in the result set of the given historical query, Freq(query) denotes the frequency with which the given historical query was searched, and query denotes a given historical query.
As with the semantic distance, the query quality must be transformed as follows to fit the definition of a "distance":
$$QualityDist(quality) = e^{-quality}$$
where QualityDist(quality) denotes the quality distance, and quality denotes a quality value calculated by the Quality(query) formula.
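The two quality formulas translate directly into code. The sketch below uses hypothetical click and query counts; the function names are illustrative, and it assumes the query frequency is nonzero (a query in the log has been issued at least once).

```python
import math


def quality(click_count, query_freq):
    """Quality(query) = ClickTime(query) / Freq(query): average clicks per issue of the query."""
    return click_count / query_freq


def quality_dist(q):
    """QualityDist(quality) = e^(-quality): higher quality gives a smaller distance."""
    return math.exp(-q)


# Hypothetical log statistics, for illustration only.
q_good = quality(click_count=30, query_freq=10)  # 3 clicks per issue on average
q_poor = quality(click_count=5, query_freq=10)   # 0.5 clicks per issue on average
d_good = quality_dist(q_good)
d_poor = quality_dist(q_poor)
```

Since the exponential of a non-positive number lies in (0, 1], the quality distance is 1 for a query that was never clicked and approaches 0 as average clicks grow.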
The final overall score TotalScore is defined by the following formula:
$$TotalScore = [SemDist + WordDist] + QualityDist$$
where SemDist denotes the semantic distance, WordDist denotes the literal distance, and QualityDist denotes the quality distance.
That is, the linear sum of the semantic distance and the literal distance is rounded to an integer (the square brackets), and the quality distance is then added. Computed this way, the distance guarantees, at a certain granularity (through rounding), the relevance between the original query and the recommended query, while also ensuring that the recommended historical query is of sufficient quality.
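The combination can be sketched as follows, taking the three component distances as inputs. The rounding rule is an assumption: the text only indicates that the sum is rounded, so this sketch rounds to the nearest integer with halves going up. The input values are made up for illustration.

```python
import math


def total_score(sem_dist, word_dist, quality_dist):
    """TotalScore = [SemDist + WordDist] + QualityDist.
    The bracket is read here as round-to-nearest (assumption: halves round up)."""
    rounded = math.floor(sem_dist + word_dist + 0.5)
    return rounded + quality_dist


# Two hypothetical candidates whose relevance parts round to the same integer.
score_a = total_score(sem_dist=1.2, word_dist=2, quality_dist=0.05)  # [3.2] + 0.05
score_b = total_score(sem_dist=1.4, word_dist=2, quality_dist=0.60)  # [3.4] + 0.60
```

Both candidates fall into the same relevance bucket (3), so the quality distance decides their order and the first one ranks ahead. Because e^(-quality) never exceeds 1 for non-negative quality, a candidate in a lower bucket always ranks ahead of one in a higher bucket under this reading.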
Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to these specific embodiments; those skilled in the art can make various changes or modifications within the scope of the claims without affecting the substance of the present invention.

Claims (8)

1. A system for search recommendation by replacing conceptual terms, characterized in that it comprises an offline system and an online system, wherein:
the offline system is configured to parse each historical record in the search engine log, identify the entity keywords it contains, and then build an index over these historical records according to the categories to which the entity keywords belong, for use by the online system;
the online system is configured to receive and parse the search engine query submitted by the user, identify the conceptual keywords in it, and then, according to weights, find the historical queries that are closest to the given search query and contain entity keywords carrying the meaning of the conceptual keywords, then sort the queries found and return a sorted recommendation list to the user, who selects the query considered closest to what was intended and performs a secondary query.
2. The system for search recommendation by replacing conceptual terms according to claim 1, characterized in that the offline system comprises an entity abstraction module and a concept aggregation module, wherein:
the entity abstraction module is configured to identify the entity keywords contained in each historical query, abstract the identified entity keywords to their corresponding conceptual keywords, and pass the result to the concept aggregation module;
the concept aggregation module is configured to aggregate historical queries that contain the same concept and build the index; for each historical query, the entity abstraction module identifies the entity keywords it contains and their corresponding concepts, and the concept aggregation module groups together, by concept, the historical queries that contain the same concept, building an index keyed by concept for use by the online system.
3. The system for search recommendation by replacing conceptual terms according to claim 1, characterized in that the online system comprises a concept analysis module, an index search module, and a scoring and sorting module, wherein:
the concept analysis module is configured to identify the conceptual keywords in the search query submitted by the user;
the index search module is configured to take the conceptual keywords identified by the concept analysis module, traverse the index generated by the offline system, and find all historical queries that contain entity keywords matching the identified conceptual keywords, these historical queries serving as candidate recommendation queries;
the scoring and sorting module is configured to score all candidate recommendation queries found by the index search module, sort them, and return part of the sorted candidate list to the user to choose from.
4. The system for search recommendation by replacing conceptual terms according to claim 3, characterized in that the score is defined as a distance made up of three parts: the semantic distance, the literal distance, and the quality of the historical query.
5. The system for search recommendation by means of replacing conceptual terms according to claim 4, characterized in that the semantic distance describes the typicality between the original conceptual keyword in the user's query and the entity keyword that replaces it, the typicality being defined by the following formula:
$$\mathrm{Typicality}(instance, concept) = \frac{\mathrm{Freq}(instance, concept)}{\mathrm{Freq}(concept) + 2000}$$
wherein Typicality(instance, concept) denotes, for a given concept, how typical the entity is of that concept; Freq(instance, concept) denotes the frequency with which the entity and the concept co-occur; Freq(concept) denotes the frequency of the given concept in the corpus; instance denotes an entity; and concept denotes a concept;
and the typicality is transformed by the following formula:
$$\mathrm{SemDist}(typ) = -\frac{2}{1 + e^{typ}} + 2$$
wherein SemDist(typ) denotes the semantic distance, typ denotes a typicality value calculated by the Typicality(instance, concept) formula, and e is the base of the natural logarithm.
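As a rough numerical illustration, the two formulas of this claim can be written in Python. This sketch assumes the typicality formula reads Freq(instance, concept) / (Freq(concept) + 2000) and the transformation reads -2 / (1 + e^typ) + 2; the function and parameter names are hypothetical, and the frequencies in the usage lines are made up.

```python
import math

SMOOTHING = 2000  # constant added to the denominator in the claim


def typicality(freq_instance_concept, freq_concept):
    """Typicality of an entity for a concept, per the assumed reading."""
    return freq_instance_concept / (freq_concept + SMOOTHING)


def sem_dist(typ):
    """Semantic distance; assumed reading: -2 / (1 + e^typ) + 2."""
    return -2.0 / (1.0 + math.exp(typ)) + 2.0


# Made-up counts: the pair co-occurs 500 times, the concept 3000 times.
typ = typicality(500, 3000)
dist = sem_dist(typ)
```

The constant in the denominator keeps the typicality of rarely seen concepts low, since a small Freq(concept) can no longer produce a ratio near 1.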
6. The system for search recommendation by means of replacing conceptual terms according to claim 4, characterized in that the literal distance describes the similarity between the user's query and the candidate query in the parts other than the conceptual keyword.
7. The system for search recommendation by means of replacing conceptual terms according to claim 4, characterized in that the quality of the historical query describes whether a historical query is actually useful to users, the quality of a historical query being defined by the following formula:
$$\mathrm{Quality}(query) = \frac{\mathrm{ClickTime}(query)}{\mathrm{Freq}(query)}$$
wherein Quality(query) denotes the quality of the given historical query; ClickTime(query) denotes the number of clicks by users in the result set of the given historical query; Freq(query) denotes the frequency with which the given historical query has been searched; and query denotes a given historical query;
and the query quality is transformed as follows:
$$\mathrm{QualityDist}(quality) = e^{-quality}$$
wherein QualityDist(quality) denotes the quality distance, and quality denotes a quality value calculated by the Quality(query) formula.
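The quality formulas of this claim translate directly into a short Python sketch, assuming they read ClickTime(query) / Freq(query) and e^(-quality). The function names and the guard against a zero search count are additions of this sketch, not part of the claim.

```python
import math


def quality(click_count, search_count):
    """Clicks in the result set divided by how often the query was searched."""
    if search_count <= 0:
        # Guard added for this sketch; the claim does not cover this case.
        raise ValueError("search_count must be positive")
    return click_count / search_count


def quality_dist(q):
    """Quality distance e^(-quality): higher quality gives a smaller distance."""
    return math.exp(-q)


# Made-up log counts: 30 clicks over 60 searches.
q = quality(30, 60)
d = quality_dist(q)
```

Because the exponential is decreasing, a historical query whose results are clicked more often per search ends up with a smaller quality distance.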
8. The system for search recommendation by means of replacing conceptual terms according to claim 4, characterized in that the final combined score TotalScore is defined by the following formula:
$$\mathrm{TotalScore} = [\mathrm{SemDist} + \mathrm{WordDist}] + \mathrm{QualityDist}$$
wherein SemDist denotes the semantic distance, WordDist denotes the literal distance, and QualityDist denotes the quality distance.
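A minimal sketch of how the combined score might be used to order candidates follows. It takes the formula as written, with the three parts added together, and assumes candidates are sorted in ascending order because each part is a distance. The candidate queries and their distance values are invented for illustration.

```python
def total_score(sem_dist, word_dist, quality_dist):
    """TotalScore = [SemDist + WordDist] + QualityDist, as the claim states."""
    return (sem_dist + word_dist) + quality_dist


# Hypothetical candidates: query -> (SemDist, WordDist, QualityDist).
candidates = {
    "galaxy s4 battery life": (1.25, 0.25, 0.5),
    "iphone battery life": (1.0, 0.0, 0.5),
    "nokia battery replacement cost": (1.5, 0.5, 1.0),
}

# Ascending sort (assumption): the smallest total distance is listed first.
ranked = sorted(candidates, key=lambda q: total_score(*candidates[q]))
```

Only the ordering matters for the recommendation list, so any monotone rescaling of the total would return the same candidates.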
CN201310501114.XA 2013-10-22 2013-10-22 System for search recommendation by means of replacing conceptual terms Expired - Fee Related CN103593410B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310501114.XA CN103593410B (en) 2013-10-22 2013-10-22 System for search recommendation by means of replacing conceptual terms

Publications (2)

Publication Number Publication Date
CN103593410A (en) 2014-02-19
CN103593410B (en) 2017-04-12

Family

ID=50083551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310501114.XA Expired - Fee Related CN103593410B (en) 2013-10-22 2013-10-22 System for search recommendation by means of replacing conceptual terms

Country Status (1)

Country Link
CN (1) CN103593410B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100057716A1 (en) * 2008-08-28 2010-03-04 Stefik Mark J System And Method For Providing A Topic-Directed Search
CN102087699A (en) * 2011-03-03 2011-06-08 北京天地融科技有限公司 Information transmission method and system, bar code display device and reading device
CN102591969A (en) * 2011-12-31 2012-07-18 北京百度网讯科技有限公司 Method for providing search results based on historical behaviors of user and server therefor
CN103218362A (en) * 2012-01-19 2013-07-24 中兴通讯股份有限公司 Method and system for constructing domain ontology

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104063418A (en) * 2014-03-17 2014-09-24 百度在线网络技术(北京)有限公司 Search recommendation method and device
CN104199810A (en) * 2014-08-29 2014-12-10 科大讯飞股份有限公司 Intelligent service method and system based on natural language interaction
CN104503975A (en) * 2014-11-20 2015-04-08 百度在线网络技术(北京)有限公司 Method and device for customizing recommended card
CN104484339B (en) * 2014-11-21 2018-01-26 百度在线网络技术(北京)有限公司 Method and system for recommending relevant entities
CN104484339A (en) * 2014-11-21 2015-04-01 百度在线网络技术(北京)有限公司 Method and system for recommending relevant entities
CN104933100A (en) * 2015-05-28 2015-09-23 北京奇艺世纪科技有限公司 Keyword recommendation method and device
CN104933100B (en) * 2015-05-28 2018-05-04 北京奇艺世纪科技有限公司 Keyword recommendation method and device
CN106257452A (en) * 2015-06-19 2016-12-28 联想(新加坡)私人有限公司 Revising search results based on contextual features
CN104991955B (en) * 2015-07-17 2018-06-12 安徽科大讯飞医疗信息技术有限公司 Method and system for automatically constructing template library
CN104991955A (en) * 2015-07-17 2015-10-21 科大讯飞股份有限公司 Method and system for automatically constructing template library
WO2017101398A1 (en) * 2015-12-15 2017-06-22 乐视控股(北京)有限公司 Data query control method and device
CN105868249A (en) * 2015-12-15 2016-08-17 乐视网信息技术(北京)股份有限公司 Data query control method and device
WO2017107457A1 (en) * 2015-12-25 2017-06-29 乐视控股(北京)有限公司 Query recommendation method and apparatus
WO2018112696A1 (en) * 2016-12-19 2018-06-28 深圳大学 Content pushing method and content pushing system
CN106649750A (en) * 2016-12-26 2017-05-10 北京奇虎科技有限公司 Search method and device for multi-sense entry
CN106649750B (en) * 2016-12-26 2021-02-05 三六零科技集团有限公司 Searching method and device for multi-meaning term entry
CN108460039A (en) * 2017-02-20 2018-08-28 微软技术许可有限责任公司 Providing recommendations
CN107679186A (en) * 2017-09-30 2018-02-09 北京奇虎科技有限公司 Method and device for searching entity based on entity library
CN107679186B (en) * 2017-09-30 2021-12-21 北京奇虎科技有限公司 Method and device for searching entity based on entity library
CN110807094A (en) * 2018-07-20 2020-02-18 林威伶 Big data analysis, prediction and data visualization system and device for legal document
CN109558479A (en) * 2018-11-29 2019-04-02 北京羽扇智信息科技有限公司 Rule matching method, device, equipment and storage medium
CN109558479B (en) * 2018-11-29 2022-12-02 出门问问创新科技有限公司 Rule matching method, device, equipment and storage medium
CN110442696A (en) * 2019-08-05 2019-11-12 北京百度网讯科技有限公司 Inquiry processing method and device
CN110442696B (en) * 2019-08-05 2022-07-08 北京百度网讯科技有限公司 Query processing method and device
WO2021136009A1 (en) * 2019-12-31 2021-07-08 阿里巴巴集团控股有限公司 Search information processing method and apparatus, and electronic device
CN111475725A (en) * 2020-04-01 2020-07-31 百度在线网络技术(北京)有限公司 Method, apparatus, device, and computer-readable storage medium for searching for content
CN111475725B (en) * 2020-04-01 2023-11-07 百度在线网络技术(北京)有限公司 Method, apparatus, device and computer readable storage medium for searching content
CN111782935A (en) * 2020-05-12 2020-10-16 北京三快在线科技有限公司 Information recommendation method and device, electronic equipment and storage medium
CN112784171A (en) * 2021-01-21 2021-05-11 重庆邮电大学 Movie recommendation method based on context typicality

Also Published As

Publication number Publication date
CN103593410B (en) 2017-04-12

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170412

Termination date: 20191022