CN101295319A - Method and device for expanding query, search engine system - Google Patents

Method and device for expanding query, search engine system

Info

Publication number
CN101295319A
Authority
CN
China
Prior art keywords
speech
word
query
classification
existing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2008101154707A
Other languages
Chinese (zh)
Other versions
CN101295319B (en)
Inventor
张智敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN2008101154707A
Publication of CN101295319A
Application granted
Publication of CN101295319B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a query expansion method, a device, and a search engine system containing the device. It addresses a problem with the related query words offered by existing search engines: they may all be of the same nature, so the results they retrieve are similar and it is uncertain whether they lead to more information over a wider range. The method comprises the following steps: counting the words that co-occur with a query word; classifying all of the co-occurring words; selecting a feature word for each class; and using the feature words of the classes as the related query words of the query word. Compared with the prior art, the invention offers the user queries of several classes whose natures differ, so the user can reach more information over a wider range. The method guides the user toward better words for retrieval and thus toward better retrieval results; in essence, guiding the user means inferring the user's query intent and then subdividing it.

Description

A method and device for expanding a query, and a search engine system
Technical field
The present invention relates to the field of search queries, and in particular to a query expansion method, a device, and a search engine system comprising the device.
Background
The development of search engine technology has brought great convenience to network users, who can use a search engine to find the information they want with little effort. The user enters a query word, and the search engine returns web pages that contain it. For a search engine user the query word is therefore critical: only an appropriate query word will find the pages that are needed.
At present, to help users find appropriate query words and further improve search quality, search engines all provide a "related search" function: when a user queries a word, the search engine suggests related query words that other users have used. For example, after a user enters "computer" in Google and clicks the search button, the bottom of the returned page lists, in addition to the search results, related searches such as "Pacific Ocean computer net", "notebook computer", "Pacific Ocean computer", "Legend computer", and "notebook computer quotation".
The existing way of providing related query words is mainly to compare the similarity between query words, that is, to count how many characters or words two query words share. Related query words obtained this way have the following problems: they are all of the same nature, so the results they retrieve are similar; and because they are of the same nature, it is uncertain whether they can reach more information over a wider range.
Summary of the invention
The technical problem to be solved by the present invention is to provide a query expansion method, a device, and a search engine system comprising the device, so as to address the problem that the related query words offered by current search engines may all be of the same nature, so that the results they retrieve are similar and it is uncertain whether more information over a wider range can be found.
To solve the above technical problem, according to the specific embodiments provided herein, the invention discloses the following technical solutions:
A method for expanding a query, comprising:
counting the words that co-occur with a query word;
classifying all of the co-occurring words;
selecting a feature word for each class;
using the feature word of each class as a related query word of the query word.
Here, a co-occurring word is a word that appears in the same web page as the query word.
Preferably, counting the words that co-occur with the query word specifically comprises: building an index keyed by all query words, where the index content for each key is the words that co-occur with that query word.
The index is an inverted index.
Preferably, the method further comprises: sorting the co-occurring words by frequency of occurrence, from high to low.
Preferably, classifying all of the co-occurring words specifically comprises: representing each co-occurring word by a set whose content is the words that co-occur with that word, together with their word frequencies; and comparing the similarity between sets and, if the similarity meets a preset condition, merging the co-occurring words that correspond to those sets into one class.
Here, comparing the similarity between sets means comparing the number of words the sets have in common.
Preferably, selecting a feature word for each class specifically comprises: selecting one word from each class and its corresponding set as the feature word, where the word occurs more frequently in that class than in other classes.
Preferably, the method further comprises: when the user enters a query word, providing the user with the related query words corresponding to that query word, where the related query words span a plurality of classes.
Preferably, providing the user with the related query words specifically comprises: sorting the related query words by query frequency according to the search log; and providing the user with the related query words whose query frequency meets a preset condition.
A device for expanding a query, comprising:
a data statistics unit, configured to count the words that co-occur with a query word;
a word classification unit, configured to classify all of the co-occurring words;
a class naming unit, configured to select a feature word for each class;
a query expansion unit, configured to use the feature word of each class as a related query word of the query word.
Here, a co-occurring word is a word that appears in the same web page as the query word.
Preferably, the data statistics unit further comprises: an index building unit, configured to build an index keyed by all query words, where the index content for each key is the words that co-occur with that query word.
The index is an inverted index.
Preferably, the data statistics unit further comprises: a sorting unit, configured to sort the co-occurring words by frequency of occurrence, from high to low.
Preferably, the word classification unit further comprises: a set building unit, configured to represent each co-occurring word by a set whose content is the words that co-occur with that word, together with their word frequencies; and a merging unit, configured to compare the similarity between sets and, if the similarity meets a preset condition, merge the co-occurring words that correspond to those sets into one class.
Here, comparing the similarity between sets means comparing the number of words the sets have in common.
Preferably, the class naming unit selects the feature word for each class as follows: it selects one word from each class and its corresponding set as the feature word, where the word occurs more frequently in that class than in other classes.
Preferably, the device further comprises: an application unit, configured to provide the user, when the user enters a query word, with the related query words corresponding to that query word, where the related query words span a plurality of classes.
Preferably, the application unit further comprises: a sorting unit, configured to sort the related query words by query frequency according to the search log; and a class screening unit, configured to provide the user with the related query words whose query frequency meets a preset condition.
A search engine system, comprising the query expansion device described above.
According to the specific embodiments provided herein, the invention has the following technical effects:
When providing related query words to the user, the invention classifies the words that co-occur with the query word and then offers the feature word of each class as a related query word. Compared with the prior art, the invention offers the user queries of several classes whose natures differ, so the user can reach more information over a wider range. The related queries offered by the prior art, by contrast, do not necessarily span several classes and may well all be of the same nature, because the existing approach of comparing similarity between query words has difficulty telling them apart.
The invention guides the user toward better words for retrieval and thus toward better retrieval results. In essence, guiding the user means inferring the user's search intent and then subdividing it. In short, by classifying query words, the invention ensures the diversity of the expanded queries.
Description of drawings
Fig. 1 is a flowchart of a first embodiment of the query expansion method of the present invention;
Fig. 2 is a schematic diagram of the index in embodiment one;
Fig. 3 is a schematic diagram of intersecting two sets in embodiment one;
Fig. 4 is a flowchart of a second embodiment of the query expansion method of the present invention;
Fig. 5 is a structural diagram of a first embodiment of the query expansion device of the present invention;
Fig. 6 is a structural diagram of a second embodiment of the query expansion device of the present invention.
Detailed description
To make the above objects, features, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and specific embodiments.
The invention provides a query expansion method that expands the user's query and offers the user better query words, so that better query results are obtained.
Embodiment one:
Referring to Fig. 1, a flowchart of the first embodiment of the query expansion method.
S101: count the words that co-occur with the query word.
Counting the words that co-occur with a query word means determining, for each word, which other words appear together with it in the same web page (or article). In practice, a preferred way to do this is to build an index keyed by the query words that have occurred, where the index content for each key is the words that co-occur with that query word.
Referring to Fig. 2, a schematic diagram of the index. The index is an inverted index structure: each key is a query word, and the index content for each key is the words that co-occur with that query word. These co-occurring words may come from several web pages. For example, suppose a query word has co-occurring words A, B, C, and D, where A and B appear with the query word in one web page and C and D appear with it in another. The index content for that query word is then all four words.
Preferably, the co-occurring words in the index are also sorted by frequency of occurrence, from high to low, to simplify later processing. A word that co-occurs with the query word in many web pages has a high frequency and is placed near the front. For example, suppose a query word has co-occurring words A, B, C, and D, where all four appear with the query word in web page X and D also appears with it in web page Y. The frequency of D is then higher than that of A, B, and C.
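Step S101 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function name `build_cooccurrence_index`, the assumption that pages arrive as pre-tokenized word lists, and the choice to count each word once per page are all illustrative assumptions. The sample data mirrors the page X / page Y example above.

```python
from collections import Counter, defaultdict

def build_cooccurrence_index(pages, query_words):
    """Map each query word to its co-occurring words, most frequent first.

    pages: iterable of token lists, one per web page (assumed pre-tokenized).
    query_words: set of words used as index keys.
    """
    counts = defaultdict(Counter)
    for tokens in pages:
        vocab = set(tokens)                 # count each word once per page
        for q in vocab & query_words:
            for w in vocab - {q}:
                counts[q][w] += 1           # pages in which w appears with q
    # Sort by descending page count; ties are broken alphabetically.
    return {
        q: sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))
        for q, c in counts.items()
    }

pages = [
    ["q", "A", "B", "C", "D"],   # page X
    ["q", "D"],                  # page Y
]
index = build_cooccurrence_index(pages, {"q"})
print(index["q"])
```

In this sketch D is listed first because it appears with the query word in two pages, while A, B, and C each appear in one.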
S102: classify all of the co-occurring words.
The words found in S101 to co-occur with a query word may be very numerous, and they cannot all be offered as related searches. The co-occurring words therefore need to be divided into classes.
The classification method preferred in this embodiment is a maximum partition method, as follows:
First, each co-occurring word is represented by a set whose content is the words that co-occur with that word, together with their word frequencies. Each co-occurring word is thus represented by a list of words and frequencies.
Then, the similarity between sets is compared; if the similarity meets a preset condition, the co-occurring words corresponding to those sets are merged into one class.
Once each co-occurring word is represented by such a set, the similarity between any two words can be compared, and similar co-occurring words can be merged into one class, yielding a number of classes. In detail: the sets are intersected pairwise, that is, the number of words two sets have in common is counted. If the intersection of two sets is large, the two words are considered similar and the two sets can be merged into one; if the intersection is small, the two words are considered to belong to different classes. The merging can be controlled by a threshold: an intersection threshold is set, and two sets are merged only when their intersection meets it.
Referring to Fig. 3, a schematic diagram of intersecting two sets. In the figure, the set for word 1 contains word 11, word 12, word 13, word 01, and word 02, and the set for word 2 contains word 21, word 22, word 23, word 01, and word 02. Both sets contain word 01 and word 02, so they have an intersection; if the intersection meets the threshold, word 1 and word 2 can be merged into one class. Comparing all pairs in this way yields a number of classes.
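The threshold-controlled merge can be sketched as follows. This is one possible reading, not the patent's algorithm: the function name `cluster_by_overlap`, the parameter `min_shared`, the single greedy pass, and the pooling of context sets after a merge are assumptions, and word frequencies are ignored for simplicity. The sets for word 1 and word 2 follow Fig. 3; word 3 is a hypothetical addition to show a word that stays in its own class.

```python
def cluster_by_overlap(contexts, min_shared=2):
    """Group words whose context sets share at least `min_shared` words.

    contexts: dict mapping each co-occurring word to the set of words that
    co-occur with it. Single greedy pass; merged classes pool their sets.
    """
    clusters = []  # each entry: (member words, pooled context set)
    for word, ctx in contexts.items():
        members, pooled = {word}, set(ctx)
        remaining = []
        for c_members, c_ctx in clusters:
            if len(pooled & c_ctx) >= min_shared:
                members |= c_members        # intersection meets the threshold
                pooled |= c_ctx
            else:
                remaining.append((c_members, c_ctx))
        remaining.append((members, pooled))
        clusters = remaining
    return [sorted(m) for m, _ in clusters]

contexts = {
    "word1": {"w11", "w12", "w13", "w01", "w02"},
    "word2": {"w21", "w22", "w23", "w01", "w02"},
    "word3": {"w31", "w32"},                # hypothetical, not in Fig. 3
}
classes = cluster_by_overlap(contexts, min_shared=2)
print(classes)
```

With a threshold of two shared words, word 1 and word 2 fall into one class and word 3 remains alone; raising `min_shared` to 3 would keep all three apart.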
As an example, suppose the query word is apple, and the words that co-occur with apple include ipod, iphone, mobile phone, mp3, mac, and so on. Taking the first four words as an example, each is represented by a set, as follows:
Words co-occurring with ipod: apple, player, mp3, song, music, iTunes ...
Words co-occurring with mp3: player, song, music ...
Words co-occurring with iphone: apple, mobile phone, apple ...
Words co-occurring with mobile phone: quotation, number ...
Under the classification method above, two sets whose words are mostly the same are considered one class. Therefore ipod and mp3 form one class, and iphone and mobile phone form another.
Note that in this embodiment the set for each word consists of the words that co-occur with it and their word frequencies. Alternatively, the set can be built from binary or ternary relations extracted for the word, that is, the two-word or three-word combinations the word forms with the words immediately before and after it.
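The binary and ternary relations can be read as the bigrams and trigrams that contain the word. The sketch below follows that reading, which is an interpretation rather than something the text spells out; the function name `ngram_contexts` and the sample token sequence are hypothetical.

```python
def ngram_contexts(tokens, target, n=2):
    """Return the set of n-grams (as tuples) in `tokens` that contain `target`."""
    grams = zip(*(tokens[i:] for i in range(n)))   # sliding window of width n
    return {g for g in grams if target in g}

tokens = ["new", "apple", "ipod", "player"]        # hypothetical page text
bigrams = ngram_contexts(tokens, "ipod", n=2)
trigrams = ngram_contexts(tokens, "ipod", n=3)
print(sorted(bigrams))
print(sorted(trigrams))
```

Sets built this way can be intersected and merged in the same manner as the co-occurrence sets.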
S103: select a feature word for each class.
Once the co-occurring words have been classified, a feature word is needed to stand for each class; figuratively speaking, the class is given a name. This embodiment preferably picks the feature word directly from the class, that is, from the class and its corresponding sets, which ensures that every feature word found is one that returns results when queried in the search engine. The selection criteria are:
First, the word occurs with high frequency in this class;
Second, the word occurs with low frequency in other classes.
Continuing the example with query word apple, ipod and mp3 form one class, and iphone and mobile phone form another. The most representative word is then selected from each class according to word frequency. In addition, because the co-occurring words of both ipod and iphone include apple, while neither of the two words appears among the other's co-occurring words, ipod and iphone are taken as the names of two classes of apple.
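The two selection criteria can be combined into a single score, for instance the frequency inside the class minus the highest frequency in any other class. The text does not fix a formula, so this scoring rule is an assumption, as are the function name `pick_feature_words`, the class labels, and the frequency values, which are invented purely for illustration.

```python
def pick_feature_words(class_freqs):
    """Choose one feature word per class.

    class_freqs: dict class_id -> dict of word -> frequency within that class.
    Score = in-class frequency minus the highest frequency in any other class
    (an assumed way of combining the two criteria).
    """
    chosen = {}
    for cid, freqs in class_freqs.items():
        def score(word):
            outside = max(
                (f.get(word, 0) for other, f in class_freqs.items() if other != cid),
                default=0,
            )
            return freqs[word] - outside
        # sorted() makes tie-breaking deterministic (alphabetically first wins)
        chosen[cid] = max(sorted(freqs), key=score)
    return chosen

# Hypothetical frequencies for two classes of the query word "apple".
class_freqs = {
    "player": {"music": 10, "ipod": 9, "mp3": 7},
    "phone": {"music": 9, "iphone": 8, "mobile phone": 6},
}
names = pick_feature_words(class_freqs)
print(names)
```

Here "music" is the most frequent word in both classes, but because it is also frequent elsewhere its score is low, so "ipod" and "iphone" are chosen instead.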
S104: use the feature word of each class as a related query word of the query word.
In this way, ipod and iphone serve as related query words of apple: when a user queries apple, ipod and iphone are recommended. Of course, the related query words are not limited to the feature words of the classes; other words in each class can be used as well.
A preferred embodiment of the invention is described below.
Embodiment two:
Referring to Fig. 4, a flowchart of the second embodiment of the query expansion method. Steps S401 to S404 are the same as S101 to S104 of embodiment one and are not described in detail here.
S401: count all words that co-occur with the query word;
In a search engine system this step requires a very large data collection. In web search, the collection is the set of all web pages a user can retrieve, so the step places heavy demands on computing power. To address this, this embodiment uses distributed computing: the computation task is distributed across a cluster of machines, which improves processing efficiency.
S402: classify all of the co-occurring words;
S403: in each word class, select the most representative word as its name;
S404: use the most representative word of each class as a related query word of the query word;
Of course, the related query words are not limited to the feature words of the classes; other words in each class can be used as well;
S405: when the user enters a query word, provide the user with the related query words corresponding to that query word, where the related query words span a plurality of classes.
In a search engine application, if there are many classes, suitable ones have to be chosen for recommendation. The choice is made from users' search logs by picking words with high query frequency; because users query these words often, they are evidently words that other users are interested in.
Continuing the apple example, the final classes might include ipod, iphone, mac, notebook, stock, and so on. When there are too many classes, only a few can be shown to the user, and they are chosen by users' query frequency. For example, if many people search for apple iphone, then apple iphone is considered a word users are more interested in and is selected first.
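The selection in S405 can be sketched as a frequency count over the search log followed by a cut-off. The function name `select_suggestions`, the `top_k` cut-off (standing in for the unspecified preset condition), and the sample log are assumptions made for illustration.

```python
from collections import Counter

def select_suggestions(candidates, query_log, top_k=3):
    """Rank candidate related queries by how often they appear in the search
    log and keep the `top_k` most frequent (ties keep their original order)."""
    log_counts = Counter(query_log)
    ranked = sorted(candidates, key=lambda q: -log_counts[q])
    return ranked[:top_k]

candidates = ["apple ipod", "apple iphone", "apple mac",
              "apple notebook", "apple stock"]
# Hypothetical search log: one entry per query issued by some user.
query_log = (["apple iphone"] * 5 + ["apple ipod"] * 3
             + ["apple mac"] * 2 + ["apple stock"])
suggestions = select_suggestions(candidates, query_log, top_k=3)
print(suggestions)
```

A minimum-count threshold could replace `top_k` if the preset condition is meant as an absolute frequency rather than a rank.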
It follows that the invention offers the user queries of several classes, and because the related query words differ in nature, the user can reach more information over a wider range. The related queries offered by the prior art, by contrast, do not necessarily span several classes and may well all be of the same nature, because the existing approach of comparing similarity between query words has difficulty telling them apart.
For example, when apple is queried in Google, the related recommendations are:
Apple iphone, The apple mobile phone, Apple ipod, Apple uk, Apple hk,
Power apple, Apple computer, Apple tv, The apple notebook, Apple mp3
In Google's results, iphone and mobile phone, ipod and mp3, and computer and notebook are each essentially of the same nature.
With the present invention, the related recommendations are instead:
Apple ipod, apple iphone, apple notebook, apple os x, apple tv,
apple Leopad, apple tiger, apple store, apple quicktime, Apple Developer
The recommendations all belong to different classes, which widens the query scope.
In summary, the invention guides the user toward better words for retrieval and thus toward better retrieval results. In essence, guiding the user means inferring the user's search intent and then subdividing it. In short, by classifying query words, the invention ensures the diversity of the expanded queries.
In correspondence with the above method, the invention also provides device embodiments for query expansion. Referring to Fig. 5, a structural diagram of the first embodiment of the query expansion device. The device mainly comprises a data statistics unit U51, a word classification unit U52, a class naming unit U53, and a query expansion unit U54, where:
the data statistics unit U51 is configured to count the words that co-occur with a query word;
the word classification unit U52 is configured to classify all of the co-occurring words;
the class naming unit U53 is configured to select a feature word for each class;
the query expansion unit U54 is configured to use the feature word of each class as a related query word of the query word.
Preferably, the data statistics unit U51 further comprises: an index building unit, configured to build an index keyed by all query words, where the index content for each key is the words that co-occur with that query word. The index is an inverted index.
Preferably, the data statistics unit U51 further comprises: a sorting unit, configured to sort the co-occurring words by frequency of occurrence, from high to low.
Preferably, the word classification unit U52 further comprises: a set building unit, configured to represent each co-occurring word by a set whose content is the words that co-occur with that word, together with their word frequencies; and a merging unit, configured to compare the similarity between sets and, if the similarity meets a preset condition, merge the co-occurring words that correspond to those sets into one class.
Preferably, the class naming unit U53 selects the feature word for each class as follows: it selects one feature word from each class and its corresponding set, where the word occurs more frequently in that class than in other classes.
Referring to Fig. 6, a structural diagram of the second embodiment of the query expansion device. In addition to a data statistics unit U61, a word classification unit U62, a class naming unit U63, and a query expansion unit U64, the device comprises an application unit U65.
The functions of the data statistics unit U61, word classification unit U62, class naming unit U63, and query expansion unit U64, and the data processing relations among them, are the same as those of the data statistics unit U51, word classification unit U52, class naming unit U53, and query expansion unit U54 in the device shown in Fig. 5, and are not described in detail here.
The application unit U65 is configured to provide the user, when the user enters a query word, with the related query words corresponding to that query word, where the related query words span a plurality of classes.
Preferably, the application unit U65 further comprises: a sorting unit, configured to sort the related query words by query frequency according to the search log; and a class screening unit, configured to provide the user with the related query words whose query frequency is high.
The invention also provides a search engine system comprising the device of Fig. 5 or Fig. 6. Once the system has classified query words by means of the device of Fig. 5 or Fig. 6, then when a user searches with a given query word the system can expand the search and offer queries of several classes. Because these related query words differ in nature, the scope of the query is, to a certain extent, widened.
For brevity, parts of the devices shown in Figs. 5 and 6 that are not described in detail can be found in the corresponding parts of the methods shown in Figs. 1 to 4.
The query expansion method, the device, and the search engine system comprising the device provided by the invention have been described in detail above. Specific examples have been used to set out the principle and embodiments of the invention, and the description of the embodiments is intended only to help in understanding the method and its core idea. A person of ordinary skill in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, this description should not be construed as limiting the invention.

Claims (21)

1. A method for expanding a query, characterized by comprising:
counting the words that co-occur with a query word;
classifying all of the co-occurring words;
selecting a feature word for each class;
using the feature word of each class as a related query word of the query word.
2. The method according to claim 1, characterized in that a co-occurring word is a word that appears in the same web page as the query word.
3. The method according to claim 1, characterized in that counting the words that co-occur with the query word specifically comprises:
building an index keyed by all query words, where the index content for each key is the words that co-occur with that query word.
4. The method according to claim 3, characterized in that the index is an inverted index.
5. The method according to claim 3, characterized by further comprising: sorting the co-occurring words by frequency of occurrence, from high to low.
6. The method according to claim 1, characterized in that classifying all of the co-occurring words specifically comprises:
representing each co-occurring word by a set whose content is the words that co-occur with that word, together with their word frequencies;
comparing the similarity between sets and, if the similarity meets a preset condition, merging the co-occurring words that correspond to those sets into one class.
7. The method according to claim 6, characterized in that comparing the similarity between sets means comparing the number of words the sets have in common.
8. The method according to claim 1, characterized in that selecting a feature word for each class specifically comprises:
selecting one word from each class and its corresponding set as the feature word, where the word occurs more frequently in that class than in other classes.
9. The method according to claim 1, characterized by further comprising:
providing the user with the related query words corresponding to the query word, where the related query words span a plurality of classes.
10. The method according to claim 9, characterized in that providing the user with the related query words corresponding to the query word specifically comprises:
sorting the related query words by query frequency according to the search log;
providing the user with the related query words whose query frequency meets a preset condition.
11, a kind of device of expanding query is characterized in that, comprising:
The data statistics unit, be used to add up with query word with existing word;
The word's kinds unit is used for all are classified with existing word;
Classification name unit is used to each classification to select the feature speech;
The expanding query unit is used for the feature speech of each class relevant inquiring speech as this query word.
12, device according to claim 11 is characterized in that: describedly refer to and query word occurring words simultaneously in a webpage with existing word.
13, device according to claim 11 is characterized in that, described data statistics unit further comprises:
Set up indexing units, being used for all query words is that index set up in keyword, index content be with query word with existing word.
14, device according to claim 13 is characterized in that: described index is an inverted index.
15, device according to claim 13 is characterized in that, described data statistics unit also comprises:
Sequencing unit is used for and will sorts from high to low according to the frequency of occurrences with existing word.
16, device according to claim 11 is characterized in that, described word's kinds unit further comprises:
Set up aggregation units, be used for all using a set to represent with existing word each, the content of set be with this speech with existing word and word frequency;
Merge cells, the similarity between being used for relatively gathering if similarity meets prerequisite, then will be gathered corresponding same existing word and merge to a class.
17, device according to claim 16 is characterized in that: the described relatively similarity between the set is the number of identical word in relatively gathering.
18. The device according to claim 11, characterized in that the class naming unit selects the feature word for each class in the following manner:
A word is selected as the feature word from each class and its corresponding sets, the frequency with which this word occurs in this class being higher than the frequency with which it occurs in the other classes.
19. The device according to claim 11, characterized in that the device further comprises:
An application unit, configured to provide the user, when the user inputs a query word, with the relevant query words corresponding to that query word; wherein the relevant query words cover a plurality of classes.
20. The device according to claim 19, characterized in that the application unit further comprises:
A sorting unit, configured to sort the relevant query words by query frequency according to the search log;
A class screening unit, configured to provide the user with the relevant query words whose query frequency meets a preset condition.
21. A search engine system, characterized in that the search engine system comprises the query expansion device according to any one of claims 11 to 20.
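The units recited in the device claims can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each web page is already tokenized into a list of words, that every word is indexed as a potential query word, that sets are merged with a simple union-find, and that the feature word is the one with the largest margin between its in-class frequency and its highest frequency in any other class. All function names and the parameters `limit`, `min_shared` and `min_queries` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_index(pages):
    """Data statistics unit: inverted index word -> Counter of words on the same page."""
    index = defaultdict(Counter)
    for page in pages:
        words = set(page)
        for w in words:
            index[w].update(words - {w})
    return index


def top_cooccurring(index, query, limit):
    """Sorting unit: co-occurrence words of the query, from high to low frequency."""
    return [w for w, _ in index[query].most_common(limit)]


def classify(index, query, words, min_shared):
    """Word classification unit: merge words whose sets share enough identical words."""
    # Each word is represented by the set of its own co-occurrence words.
    # Assumption: the query itself is left out, since every set contains it.
    sets = {w: set(index[w]) - {query} for w in words}
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for a, b in combinations(words, 2):
        if len(sets[a] & sets[b]) >= min_shared:
            parent[find(a)] = find(b)
    classes = defaultdict(list)
    for w in words:
        classes[find(w)].append(w)
    return list(classes.values())


def class_frequencies(index, query, members):
    """Frequencies of the class members and of the words in their sets."""
    freq = Counter({m: index[query][m] for m in members})
    for m in members:
        freq.update(index[m])
    freq.pop(query, None)
    return freq


def pick_feature_words(index, query, classes):
    """Class naming unit: one word per class that is more frequent there than elsewhere."""
    freqs = [class_frequencies(index, query, c) for c in classes]
    features = []
    for i, freq in enumerate(freqs):
        def margin(w):
            others = max((f[w] for j, f in enumerate(freqs) if j != i), default=0)
            return freq[w] - others
        features.append(max(freq, key=margin))
    return features


def expand_query(pages, query, limit=20, min_shared=1):
    """Expanded query unit: the feature word of each class is a relevant query word."""
    index = build_cooccurrence_index(pages)
    words = top_cooccurring(index, query, limit)
    classes = classify(index, query, words, min_shared)
    return pick_feature_words(index, query, classes)


def select_for_user(relevant_words, query_log, min_queries):
    """Application unit: rank by search-log frequency and keep words above a threshold."""
    freq = Counter(query_log)
    ranked = sorted(relevant_words, key=lambda w: freq[w], reverse=True)
    return [w for w in ranked if freq[w] >= min_queries]


if __name__ == "__main__":
    pages = [
        ["apple", "iphone", "ios", "mac"],
        ["apple", "iphone", "ios"],
        ["apple", "iphone", "mac"],
        ["apple", "fruit", "juice", "pie"],
        ["apple", "fruit", "juice"],
        ["apple", "fruit", "pie"],
    ]
    print(sorted(expand_query(pages, "apple")))  # ['fruit', 'iphone']
```

In this toy corpus the query "apple" yields two classes, one about the electronics brand and one about the fruit, and each class contributes one relevant query word. The thresholds are illustrative only; the claims leave the preset conditions open.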
CN2008101154707A 2008-06-24 2008-06-24 Method and device for expanding query, search engine system Active CN101295319B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101154707A CN101295319B (en) 2008-06-24 2008-06-24 Method and device for expanding query, search engine system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008101154707A CN101295319B (en) 2008-06-24 2008-06-24 Method and device for expanding query, search engine system

Publications (2)

Publication Number Publication Date
CN101295319A (en) 2008-10-29
CN101295319B (en) 2010-06-02

Family

ID=40065603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101154707A Active CN101295319B (en) 2008-06-24 2008-06-24 Method and device for expanding query, search engine system

Country Status (1)

Country Link
CN (1) CN101295319B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5675819A (en) * 1994-06-16 1997-10-07 Xerox Corporation Document information retrieval using global word co-occurrence patterns
CN100437585C (en) * 2006-09-04 2008-11-26 北京航空航天大学 Method for carrying out retrieval hint based on inverted list

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8874604B2 (en) 2009-08-31 2014-10-28 International Business Machines Corporation Method and system for searching an electronic map
WO2011079414A1 (en) * 2009-12-30 2011-07-07 Google Inc. Custom search query suggestion tools
CN102955821A (en) * 2011-08-30 2013-03-06 北京百度网讯科技有限公司 Method and device for carrying out expansion processing on query sequence
CN102375885A (en) * 2011-10-21 2012-03-14 北京百度网讯科技有限公司 Method and device for providing search suggestions corresponding to query sequence
CN103092858A (en) * 2011-11-01 2013-05-08 阿里巴巴集团控股有限公司 Search method and search device
CN103092858B (en) * 2011-11-01 2016-12-14 阿里巴巴集团控股有限公司 A kind of searching method and device thereof
US9799001B2 (en) 2012-01-24 2017-10-24 International Business Machines Corporation Business-to-business social network
CN102722526A (en) * 2012-05-16 2012-10-10 成都信息工程学院 Part-of-speech classification statistics-based duplicate webpage and approximate webpage identification method
CN102722526B (en) * 2012-05-16 2014-04-30 成都信息工程学院 Part-of-speech classification statistics-based duplicate webpage and approximate webpage identification method
CN102831185A (en) * 2012-08-01 2012-12-19 北京百度网讯科技有限公司 Entry recommending method and device
CN102799689A (en) * 2012-08-09 2012-11-28 昆山宏凌电子有限公司 Search software
WO2014063595A1 (en) * 2012-10-23 2014-05-01 International Business Machines Corporation Incorporating related searches by other users in a social network in a search request
CN103853771B (en) * 2012-12-03 2018-12-14 百度在线网络技术(北京)有限公司 A kind of method for pushing and system of search result
CN103853771A (en) * 2012-12-03 2014-06-11 百度在线网络技术(北京)有限公司 Search result pushing method and search result pushing system
CN103150409B (en) * 2013-04-08 2017-04-12 深圳市宜搜科技发展有限公司 Method and system for recommending user search word
CN103150409A (en) * 2013-04-08 2013-06-12 深圳市宜搜科技发展有限公司 Method and system for recommending user search word
CN103258025B (en) * 2013-05-08 2016-08-31 百度在线网络技术(北京)有限公司 Generate the method for co-occurrence keyword, the method that association search word is provided and system
CN103258025A (en) * 2013-05-08 2013-08-21 百度在线网络技术(北京)有限公司 Method for generating co-occurrence key words and method and system for providing associated search terms
CN103401918A (en) * 2013-07-30 2013-11-20 东北石油大学 Electronic map-based business information releasing system
CN103401918B (en) * 2013-07-30 2016-04-06 东北石油大学 A kind of business information delivery system based on electronic chart
CN103744956B (en) * 2014-01-06 2017-01-04 同济大学 A kind of diversified expanding method of key word
CN103853831B (en) * 2014-03-10 2017-02-01 中国电子科技集团公司第二十八研究所 Personalized searching realization method based on user interest
CN104598630A (en) * 2015-02-05 2015-05-06 北京航空航天大学 Event indexing and retrieval method and device
CN106033445B (en) * 2015-03-16 2019-10-25 北京国双科技有限公司 The method and apparatus for obtaining article degree of association data
CN106033445A (en) * 2015-03-16 2016-10-19 北京国双科技有限公司 Method and device for obtaining article association degree data
CN107203543A (en) * 2016-03-18 2017-09-26 温浩 The information retrieval method that a kind of user search word association is recommended
CN108304417B (en) * 2017-01-13 2021-09-17 北京京东尚科信息技术有限公司 Information processing method and information processing apparatus
CN108304417A (en) * 2017-01-13 2018-07-20 北京京东尚科信息技术有限公司 Information processing method and information processing unit
US11205046B2 (en) 2017-04-07 2021-12-21 Ping An Technology (Shenzhen) Co., Ltd. Topic monitoring for early warning with extended keyword similarity
CN107168943A (en) * 2017-04-07 2017-09-15 平安科技(深圳)有限公司 The method and apparatus of topic early warning
CN107169045A (en) * 2017-04-19 2017-09-15 中国人民解放军国防科学技术大学 A kind of query word method for automatically completing and device based on temporal signatures
CN108170664A (en) * 2017-11-29 2018-06-15 有米科技股份有限公司 Keyword expanding method and device based on emphasis keyword
CN108304444A (en) * 2017-11-30 2018-07-20 腾讯科技(深圳)有限公司 Information query method and device
CN108304444B (en) * 2017-11-30 2021-12-14 腾讯科技(深圳)有限公司 Information query method and device
CN112925967A (en) * 2021-02-07 2021-06-08 北京鼎诚世通科技有限公司 Method, device and equipment for generating expanded query words and storage medium

Also Published As

Publication number Publication date
CN101295319B (en) 2010-06-02

Similar Documents

Publication Publication Date Title
CN101295319B (en) Method and device for expanding query, search engine system
US8380697B2 (en) Search and retrieval methods and systems of short messages utilizing messaging context and keyword frequency
US8554854B2 (en) Systems and methods for identifying terms relevant to web pages using social network messages
CN102567408B (en) Method and device for recommending search keyword
CN101079064B (en) Web page sequencing method and device
CN109885773B (en) Personalized article recommendation method, system, medium and equipment
US20110264651A1 (en) Large scale entity-specific resource classification
CN104166651A (en) Data searching method and device based on integration of data objects in same classes
CN101673306B (en) Website information query method and system thereof
CN101261629A (en) Specific information searching method based on automatic classification technology
US9405803B2 (en) Ranking signals in mixed corpora environments
CN103294692A (en) Information recommendation method and system
Adamu et al. A survey on big data indexing strategies
CN103778206A (en) Method for providing network service resources
CN106294358A (en) The search method of a kind of information and system
CN103810210B (en) Search result display methods and device
CN108509449B (en) Information processing method and server
CN112784040B (en) Vertical industry text classification method based on corpus
CN112883143A (en) Elasticissearch-based digital exhibition searching method and system
Park et al. Topic word selection for blogs by topic richness using web search result clustering
US9646099B2 (en) Generating resources for support of online services
CN104102738A (en) Entity library expansion method and device
CN111694929B (en) Data map-based searching method, intelligent terminal and readable storage medium
Indra et al. A clustering technique using single pass clustering algorithm for search engine
Pranckaitis et al. Clustering of Lithuanian news articles

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant