CN104615621A

CN104615621A - Method and system for processing correlations in searches

Info

Publication number: CN104615621A
Application number: CN201410294419.2A
Authority: CN
Inventors: 贺海军; 李雅凡
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2014-06-25
Filing date: 2014-06-25
Publication date: 2015-05-13
Anticipated expiration: 2034-06-25
Also published as: CN104615621B

Abstract

The invention provides a method and system for relevance processing in search. The method includes: obtaining a query string and searching according to the query string to obtain a plurality of search results; performing feature extraction on each of the obtained search results according to a plurality of predefined features, so as to obtain the feature value corresponding to each feature in each search result; performing regression processing on the feature values of each search result to obtain a relevance score of that search result relative to the query string; and determining the search results most relevant to the query string according to the relevance scores and displaying them. The method and system can improve the accuracy of relevance processing of search results.

Description

Method and system for relevance processing in search

Technical field

The present invention relates to computer application technology, and in particular to a method and system for relevance processing in search.

Background

With the development of search technology, users increasingly rely on search engines to search for all kinds of query strings and obtain the corresponding search results. In a search engine, the search results obtained for a query string and shown on the search page are usually massive in number, so relevance processing must be performed on them to provide the user with the results most relevant to the query string.

However, traditional relevance processing in search is mostly based on a single attribute of the search results, for example the text coverage of a search result relative to the query string. As a result, relevance processing of search results is inaccurate in practical applications.

Summary of the invention

In view of this, it is necessary to provide a method for relevance processing in search that can improve the accuracy of relevance processing of search results.

It is also necessary to provide a system for relevance processing in search that can improve the accuracy of relevance processing of search results.

A method for relevance processing in search comprises the following steps:

obtaining a query string, and searching according to the query string to obtain a number of search results;

performing feature extraction on each of the obtained search results according to a plurality of predefined features, to obtain the feature value corresponding to each feature in each search result;

performing regression processing on the feature values corresponding to the features in each search result, to obtain a relevance score of the search result relative to the query string; and

determining the search result most relevant to the query string according to the relevance scores, and displaying that search result.

A system for relevance processing in search comprises:

a query string search module, configured to obtain a query string and to search according to the query string to obtain a number of search results;

a feature extraction module, configured to perform feature extraction on each of the obtained search results according to a plurality of predefined features, to obtain the feature value corresponding to each feature in each search result;

a processing module, configured to perform regression processing on the feature values corresponding to the features in each search result, to obtain a relevance score of the search result relative to the query string; and

a relevance determining module, configured to determine the search result most relevant to the query string according to the relevance scores and to display that search result.

In the above method and system for relevance processing in search, a query string is obtained and searched to obtain a number of search results. Feature extraction is performed on each of the obtained search results according to a plurality of predefined features, to obtain the feature value corresponding to each feature in each search result. Regression processing is then performed on the feature values of each search result to obtain its relevance score relative to the query string, and the search result most relevant to the query string is determined according to the relevance scores and displayed. Because the most relevant search result depends on a plurality of predefined features and is obtained through regression processing, the accuracy of relevance processing of search results is greatly improved.

Brief description of the drawings

Fig. 1 is a flowchart of a method for relevance processing in search in one embodiment;

Fig. 2 is a flowchart of the method in Fig. 1 for performing regression processing on the feature values corresponding to the features in each search result to obtain the relevance score of the search result relative to the query string;

Fig. 3 is a flowchart, in one embodiment, of the method for building a regression model in advance from a given set of precise search query strings and the plurality of features in the corresponding most-relevant-result data;

Fig. 4 is a flowchart, in one embodiment, of the method for obtaining the relevance label values of search results and the corresponding feature vectors and optimizing the regression model according to the relevance label values and feature vectors;

Fig. 5 is a schematic structural diagram of a system for relevance processing in search in one embodiment;

Fig. 6 is a schematic structural diagram of the processing module in Fig. 5;

Fig. 7 is a schematic structural diagram of a system for relevance processing in search in another embodiment;

Fig. 8 is a schematic structural diagram of the model building module in Fig. 7;

Fig. 9 is a schematic structural diagram of the optimization module in one embodiment;

Fig. 10 is a schematic structural diagram of a server provided by an embodiment of the present invention.

Detailed description

To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.

As shown in Fig. 1, in one embodiment, a method for relevance processing in search comprises the following steps:

Step 110: obtain a query string, and search according to the query string to obtain a number of search results.

In this embodiment, the query string that the user enters on the search page is obtained, and the search engine searches according to it to obtain a number of search results related to the query string.

For example, the search performed by the user may be a map search. The search results obtained by searching the map according to the query string are then point of interest (POI) data, and each POI record contains information such as name, category, longitude, latitude and importance (POIRank).
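As an illustration of the POI record just described, a minimal sketch follows. The class name, field names and sample values are assumptions made for this example; the source only names the kinds of information a record carries.

```python
from dataclasses import dataclass


@dataclass
class PointOfInterest:
    """One map search result. All field names are hypothetical."""
    name: str
    category: str
    longitude: float
    latitude: float
    poi_rank: float  # importance (POIRank); the scale is an assumption


# Placeholder values for illustration only, not real map data.
poi = PointOfInterest("Peking University", "university", 116.31, 39.99, 0.95)
```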

Step 130: perform feature extraction on each of the obtained search results according to a plurality of predefined features, to obtain the feature value corresponding to each feature in each search result.

In this embodiment, a plurality of features is defined in advance so that the various attributes contained in each search result can be expressed through them. Feature extraction is performed on every search result according to the predefined features, so that each search result is expressed as feature values under those features; that is, each search result corresponds to the plurality of features, and each feature has one corresponding feature value.

The predefined features differ between search scenarios, and the feature value corresponding to a feature is used to measure the degree of relevance between the search result and the query string.

For example, the predefined features in map search may include: the position of the current result; the total text score of the current result; the importance of the current result; the confidence of the current result; the authority of the current result; the name text score of the current result; the aggregated-alias text score of the current result; the name coverage of the current result; the aggregated-alias coverage of the current result; the differences from the first result in total text score, name text score and importance; the differences from the previous result in total text score, name text score and importance; the differences from the next result in total text score, name text score and importance; the difference between the total text score of the current result and the average total text score of the Top N results; the difference between the name text score of the current result and the average name text score of the Top N results; and the difference between the importance of the current result and the average importance of the Top N results. Here, the average total text score of the Top N results is the mean total text score of the N search results with the highest total text scores; the average name text score of the Top N results is the mean name text score of the N search results with the highest name text scores; the average importance of the Top N results is the mean importance of the N search results with the highest importance; and N can be chosen flexibly as needed.

Specifically, the predefined features in map search and the expression of each feature (that is, how its feature value is obtained) are shown in the following table:

Here, a result is a POI record obtained by map search.
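The difference features listed above (against the first, previous and next results and against the Top N average) can be sketched for a single score type as follows. This is only an illustration: the function name, the sign convention (current minus other) and the choice of a zero difference when no previous or next result exists are assumptions, not details given in the source.

```python
def difference_features(scores, index, top_n):
    """Context features for scores[index] within a ranked list of scores."""
    current = scores[index]
    # Assumption: a missing neighbour contributes a difference of zero.
    prev_score = scores[index - 1] if index > 0 else current
    next_score = scores[index + 1] if index + 1 < len(scores) else current
    # The Top N results are the N results with the highest value of this score.
    top = sorted(scores, reverse=True)[:top_n]
    return {
        "diff_first": current - scores[0],
        "diff_prev": current - prev_score,
        "diff_next": current - next_score,
        "diff_top_n_avg": current - sum(top) / len(top),
    }


feats = difference_features([9.0, 7.0, 4.0], index=1, top_n=2)
```

The same helper could be applied in turn to the total text score, the name text score and the importance of each result.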

A single point of interest on the map may appear in multiple data sources, and its name, address, telephone number and so on may differ slightly between sources. The POI data from different data sources are therefore aggregated into a single POI record: the name from one data source is chosen as the name of the POI, and the names from the other data sources serve as its aggregated aliases.

In addition, word segmentation splits the query string into multiple tokens, and a search result that exists as a text field is likewise split into multiple tokens. The proportion of the tokens of the search result that appear among the tokens of the query string is the text coverage; name text coverage, aggregated-alias text coverage and so on are defined accordingly.
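The text coverage described above can be sketched as a small function. The sketch assumes that word segmentation has already produced the token lists, and it reads the definition as the fraction of the result's tokens that also occur in the query; the function name and the handling of an empty result are assumptions.

```python
def text_coverage(query_tokens, result_tokens):
    """Fraction of result tokens that also appear among the query tokens."""
    if not result_tokens:
        return 0.0  # assumption: an empty text field has zero coverage
    query_set = set(query_tokens)
    hits = sum(1 for token in result_tokens if token in query_set)
    return hits / len(result_tokens)


coverage = text_coverage(
    ["peking", "university"],
    ["peking", "university", "hospital"],
)
```

Here two of the three result tokens occur in the query, so the coverage is 2/3. Name coverage and aggregated-alias coverage would apply the same function to the name field and to each alias.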

Step 150: perform regression processing on the feature values corresponding to the features in each search result, to obtain the relevance score of the search result relative to the query string.

In this embodiment, logistic regression is applied to the feature values corresponding to the features in each search result, to obtain the relevance score of that search result relative to the query string.

The higher the relevance score, the more relevant the corresponding search result is to the query string.

Step 170: determine the search result most relevant to the query string according to the relevance scores, and display it.

In this embodiment, once every search result returned by the search has its relevance score, a preset number of search results with the highest relevance scores can be selected according to the score values; these are the search results most relevant to the query string and are displayed on the search page.

Further, on the search page obtained after the user enters the query string, only the search results determined to be most relevant to the query string are displayed directly. The other search results are not displayed directly; for example, they are folded and shown in full only when the user clicks a button such as "View all results".

For example, when the query string "Peking University" is entered, only the POI whose name is "Peking University" is shown; the other POIs are folded and become visible only when the user selects "View all results".
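The selection and folding behaviour of step 170 can be sketched as follows. The function name and the `shown_count` parameter (standing in for the preset number) are assumptions made for this example.

```python
def split_results(results, scores, shown_count=1):
    """Return (shown, folded), both ordered by descending relevance score."""
    ranked = sorted(zip(scores, results), key=lambda pair: pair[0], reverse=True)
    ordered = [result for _, result in ranked]
    # The first `shown_count` results are displayed; the rest stay folded
    # until the user asks to view all results.
    return ordered[:shown_count], ordered[shown_count:]


shown, folded = split_results(["A", "B", "C"], [0.2, 0.9, 0.5], shown_count=1)
```

With these scores, "B" is displayed and "C" and "A" are folded in that order.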

As shown in Fig. 2, in one embodiment, the above step 150 comprises:

Step 151: in each search result, form a feature vector from the feature values corresponding to the plurality of features.

Step 153: take the feature vector as input, and obtain the relevance score of the search result relative to the query string according to a regression model built in advance.

In this embodiment, the regression model built in advance is obtained, that is, the parameter set w (β_0, β_1, β_2, ..., β_n) and the corresponding relevance score formula. The feature vector and the parameter set w are then substituted into the following formula to calculate the relevance score of the corresponding search result relative to the query string:

p(y_i = +1 \mid x_i, w) = \sigma(w^{T} x_i) = \frac{1}{1 + \exp(-w^{T} x_i)}

where y_i ∈ {-1, +1} indicates whether the corresponding search result is a positive sample (+1) or a negative sample (-1), and x_i ∈ R^n is an n-dimensional vector representing the values of the i-th search result on the n features.
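To make the scoring formula concrete, a minimal sketch follows. It assumes the parameter set is stored as a list `[β_0, β_1, ..., β_n]` with the intercept first; the function names and the numerically stable form of the logistic function are choices made for this example.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def relevance_score(weights, features):
    """Compute sigma(w^T x) for weights = [beta_0, beta_1, ..., beta_n]."""
    if len(weights) != len(features) + 1:
        raise ValueError("expected one weight per feature plus an intercept")
    z = weights[0] + sum(b * x for b, x in zip(weights[1:], features))
    return sigmoid(z)


score = relevance_score([0.0, 1.0, -1.0], [2.0, 2.0])
```

In this example the weighted sum is 0 + 2 - 2 = 0, so the score is 0.5, the midpoint between a negative and a positive sample.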

In one embodiment, before the above step 153, the method further comprises the following step:

building a regression model in advance from a given set of precise search query strings and the plurality of features in the corresponding most-relevant-result data.

In this embodiment, the given set of precise search query strings contains multiple precise query strings, each of which is used to perform a precise search. For example, in map search, the set may contain the precise query string "Beijing Peking University", which specifies a search for "Peking University" within the POI data of Beijing.

The given set of precise search query strings can be collected from search logs or in other ways. The corresponding most-relevant-result data can likewise be obtained from search logs, which record, for example, query strings and their most relevant results; it can also be obtained by running searches on the given set of precise search query strings. The possibilities are not enumerated here.

Machine learning is performed on the given set of precise search query strings and the plurality of features in the corresponding most-relevant-result data to build the regression model. Applicable machine learning methods include, but are not limited to, decision trees, support vector machines, artificial neural networks and gradient boosting decision trees.

Building the regression model from a large-scale set of precise search query strings and multiple features greatly improves the accuracy with which the regression model identifies the most relevant search result during search.

As shown in Fig. 3, in one embodiment, the above step of building a regression model in advance from a given set of precise search query strings and the plurality of features in the corresponding most-relevant-result data comprises:

Step 301: obtain the given set of precise search query strings and the most-relevant-result data corresponding to the query strings in that set.

Step 303: perform feature extraction on the most-relevant-result data, to obtain the feature vectors corresponding to the most-relevant-result data.

In this embodiment, feature extraction is performed on the most-relevant-result data according to the predefined features, so that the most-relevant-result data is expressed in terms of those features and the corresponding feature vectors are obtained.

Specifically, because the given set of precise search query strings contains multiple precise query strings, the corresponding most-relevant-result data contains a number of search results for each precise query string.

The feature values of each search result are therefore extracted according to the predefined features, so that each search result can be expressed as an N×1 feature vector, where N is the number of predefined features. The entire most-relevant-result data can then be expressed as an M×N feature matrix, where M is the number of search results in the most-relevant-result data.
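Assembling the per-result feature vectors into the M×N representation can be sketched as below. The sketch assumes each search result is available as a dictionary of already extracted feature values; the feature names used are hypothetical.

```python
def build_feature_matrix(results, feature_names):
    """Return an M x N matrix (list of rows), one row per search result."""
    return [[float(result[name]) for name in feature_names] for result in results]


matrix = build_feature_matrix(
    [
        {"text_score": 3, "importance": 0.5},
        {"text_score": 1, "importance": 0.9},
    ],
    ["text_score", "importance"],
)
```

Fixing the order of `feature_names` keeps every row aligned with the same N features, which the regression model relies on.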

Step 305: perform regression learning on the feature vectors corresponding to the most-relevant-result data, to build the regression model.

In this embodiment, machine learning is performed on the feature vectors corresponding to the most-relevant-result data to build a regression model for identifying the most relevant search result.

Specifically, M training samples (x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_M, y_M) are given, where x_i ∈ R^n is an n-dimensional vector representing the i-th sample, that is, the values of the i-th search result in the most-relevant-result data on the n predefined features, and y_i ∈ {-1, +1} indicates whether the sample is a positive sample (+1) or a negative sample (-1). The regression model uses the logistic function to link the feature vector x_i of the i-th sample to the probability that the sample is positive, that is:

p(y_i = +1 \mid x_i, w) = \sigma(w^{T} x_i) = \frac{1}{1 + \exp(-w^{T} x_i)}

where w^T x_i = β_0 + β_1 x_{i1} + β_2 x_{i2} + ... + β_n x_{in} and the parameter w has the form β_0, β_1, β_2, ..., β_n. The parameter w weights each of the n dimensions of x_i to compute w^T x_i, which the S-shaped logistic function then maps into the interval from 0 to 1, giving the probability that the sample is positive.

The goal of machine learning is to find a suitable w such that the relevance scores p of the positive samples are all large while the relevance scores of the negative samples are all small.
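One way to search for such a w is sketched below. The source does not state which optimizer is used; batch gradient ascent on the log-likelihood, the learning rate, the epoch count and the toy data are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(samples, labels, learning_rate=0.1, epochs=200):
    """Fit [beta_0, ..., beta_n] by maximizing sum(log sigma(y_i * w^T x_i)).

    Labels must be -1 or +1, as in the sample definition above.
    """
    n = len(samples[0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        grad = [0.0] * (n + 1)
        for x, y in zip(samples, labels):
            xb = [1.0] + list(x)  # prepend 1 so that w[0] acts as beta_0
            margin = y * sum(wj * xj for wj, xj in zip(w, xb))
            coeff = (1.0 - sigmoid(margin)) * y
            for j, xj in enumerate(xb):
                grad[j] += coeff * xj
        w = [wj + learning_rate * gj for wj, gj in zip(w, grad)]
    return w


# Toy one-feature data: positive samples have larger feature values.
weights = train([[2.0], [1.5], [-1.0], [-2.0]], [+1, +1, -1, -1])
```

After training, positive samples receive scores above 0.5 and negative samples scores below 0.5, which is the behaviour the learning goal asks for.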

In another embodiment, after the above step of building a regression model in advance from a given set of precise search query strings and the plurality of features in the corresponding most-relevant-result data, the method further comprises:

obtaining the relevance label values of search results and the feature vectors corresponding to the search results, and optimizing the regression model according to the relevance label values and the feature vectors.

In this embodiment, the regression model can also be optimized continually according to the relevance label values of search results and the corresponding feature vectors, to obtain a more suitable regression model.

The relevance label value of a search result is obtained by labeling according to a preset rule, and this preset rule is related to the predefined features. The relevance label value takes one of two values, 0 and 1: the relevance label value of the most relevant search result is 1, and that of every other search result is 0.

The relevance score of a search result is obtained from its feature vector using the regression model currently in use. The error between the relevance score and the relevance label value is then used to adjust the regression model, so as to optimize the model currently in use and continually improve the accuracy of relevance processing in search.

As shown in Fig. 4, in one embodiment, the above step of obtaining the relevance label values of search results and the corresponding feature vectors and optimizing the regression model according to the relevance label values and the feature vectors comprises:

Step 401: obtain the relevance label values of the search results and the feature vectors corresponding to the search results.

Step 403: obtain the relevance score of each search result from its feature vector and the regression model.

In this embodiment, the feature vector corresponding to the search result is substituted into the formula

p(y_i = +1 \mid x_i, w) = \sigma(w^{T} x_i) = \frac{1}{1 + \exp(-w^{T} x_i)}

to calculate the relevance score of the search result, where the parameter set w used is obtained by machine learning.

Step 405: optimize the regression model according to the relevance label values and relevance scores of the search results.

In this embodiment, the relevance label value and the relevance score of a search result are compared to obtain the error between them. The shortcomings of the regression model are then identified from this error and the model is optimized to obtain a better prediction model.
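A single error-driven adjustment of this kind can be sketched as follows. The source does not specify the update rule; the stochastic gradient step used here, the learning rate and the function name are assumptions. The label follows the 0/1 convention for relevance label values described earlier.

```python
import math


def update_weights(weights, features, label, learning_rate=0.1):
    """One adjustment step: move the weights in proportion to (label - score).

    `weights` is [beta_0, beta_1, ..., beta_n]; `label` is 0 or 1.
    Returns the new weights and the error that drove the update.
    """
    xb = [1.0] + list(features)  # prepend 1 for the intercept beta_0
    z = sum(w * x for w, x in zip(weights, xb))
    score = 1.0 / (1.0 + math.exp(-z))  # fine for moderate z in this sketch
    error = label - score
    new_weights = [w + learning_rate * error * x for w, x in zip(weights, xb)]
    return new_weights, error


new_weights, error = update_weights([0.0, 0.0], [2.0], label=1)
```

Starting from all-zero weights the score is 0.5, so a most relevant result (label 1) yields an error of 0.5 and pushes both weights upward.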

As shown in Fig. 5, in one embodiment, a system for relevance processing in search comprises a query string search module 510, a feature extraction module 530, a processing module 550 and a relevance determining module 570.

The query string search module 510 is configured to obtain a query string and to search according to the query string to obtain a number of search results.

In this embodiment, the query string search module 510 obtains the query string that the user enters on the search page, and the search engine searches according to it to obtain a number of search results related to the query string.

For example, the search performed by the user may be a map search. The search results that the query string search module 510 obtains by searching the map according to the query string are then POI data, and each POI record contains information such as name, category, longitude, latitude and importance.

The feature extraction module 530 is configured to perform feature extraction on each of the obtained search results according to a plurality of predefined features, to obtain the feature value corresponding to each feature in each search result.

In this embodiment, a plurality of features is defined in advance so that the various attributes contained in each search result can be expressed through them. The feature extraction module 530 performs feature extraction on every search result according to the predefined features, so that each search result is expressed as feature values under those features; that is, each search result corresponds to the plurality of features, and each feature has one corresponding feature value.

The processing module 550 is configured to perform regression processing on the feature values corresponding to the features in each search result, to obtain the relevance score of the search result relative to the query string.

In this embodiment, the processing module 550 applies logistic regression to the feature values corresponding to the features in each search result, to obtain the relevance score of that search result relative to the query string.

The relevance determining module 570 is configured to determine the search result most relevant to the query string according to the relevance scores and to display that search result.

In this embodiment, once every search result returned by the search has its relevance score, the relevance determining module 570 can select a preset number of search results with the highest relevance scores according to the score values; these are the search results most relevant to the query string and are displayed on the search page.
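How the four modules hand data to one another can be sketched as a single function. This is a schematic only: the function and parameter names are hypothetical, and the stand-in callables in the usage lines replace the real search engine, feature extractor and regression model.

```python
def process_query(query, search, extract_features, score, shown_count=1):
    """Chain the four modules; each callable stands in for one module."""
    results = search(query)                                    # module 510
    vectors = [extract_features(query, r) for r in results]    # module 530
    scores = [score(v) for v in vectors]                       # module 550
    ranked = sorted(zip(scores, results), key=lambda p: p[0], reverse=True)
    return [result for _, result in ranked[:shown_count]]      # module 570


# Toy stand-ins, for illustration only.
top = process_query(
    "peking university",
    search=lambda q: ["Peking University Hospital", "Peking University"],
    extract_features=lambda q, r: [1.0 if r.lower() == q else 0.0],
    score=lambda v: v[0],
)
```

Passing the modules in as callables keeps the sketch independent of any particular search backend or model.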

As shown in Fig. 6, in one embodiment, the above processing module 550 comprises a vector forming unit 551 and a model input unit 553.

The vector forming unit 551 is configured to form, in each search result, a feature vector from the feature values corresponding to the plurality of features.

The model input unit 553 is configured to take the feature vector as input and to obtain the relevance score of the search result relative to the query string according to a regression model built in advance.

In this embodiment, the model input unit 553 obtains the regression model built in advance, that is, the parameter set w (β_0, β_1, β_2, ..., β_n) and the corresponding relevance score formula. The feature vector and the parameter set w are then substituted into the following formula to calculate the relevance score of the corresponding search result relative to the query string:

p(y_i = +1 \mid x_i, w) = \sigma(w^{T} x_i) = \frac{1}{1 + \exp(-w^{T} x_i)}

As shown in Fig. 7, in one embodiment, the system further comprises a model building module 710.

The model building module 710 is configured to build a regression model in advance from a given set of precise search query strings and the plurality of features in the corresponding most-relevant-result data.

The model building module 710 performs machine learning on the given set of precise search query strings and the plurality of features in the corresponding most-relevant-result data to build the regression model. Applicable machine learning methods include, but are not limited to, decision trees, support vector machines, artificial neural networks and gradient boosting decision trees.

By building the regression model from a large-scale set of precise search query strings and multiple features, the model building module 710 greatly improves the accuracy with which the regression model identifies the most relevant search result during search.

As shown in Fig. 8, in one embodiment, the above model building module 710 comprises an acquiring unit 711, a feature processing unit 713 and a learning unit 715.

The acquiring unit 711 is configured to obtain the given set of precise search query strings and the most-relevant-result data corresponding to the query strings in that set.

The feature processing unit 713 is configured to perform feature extraction on the most-relevant-result data, to obtain the feature vectors corresponding to the most-relevant-result data.

In this embodiment, the feature processing unit 713 performs feature extraction on the most-relevant-result data according to the predefined features, so that the most-relevant-result data is expressed in terms of those features and the corresponding feature vectors are obtained.

The feature processing unit 713 therefore extracts the feature values of each search result according to the predefined features, so that each search result can be expressed as an N×1 feature vector, where N is the number of predefined features. The entire most-relevant-result data can then be expressed as an M×N feature matrix, where M is the number of search results in the most-relevant-result data.

The learning unit 715 is configured to perform regression learning on the feature vectors corresponding to the most-relevant-result data, to build the regression model.

In this embodiment, the learning unit 715 performs machine learning on the feature vectors corresponding to the most-relevant-result data to build a regression model for identifying the most relevant search result, in which the probability that the i-th sample is a positive sample is:

p(y_i = +1 \mid x_i, w) = \sigma(w^{T} x_i) = \frac{1}{1 + \exp(-w^{T} x_i)}

The goal of the machine learning performed by the learning unit 715 is to find a suitable w such that the relevance scores p of the positive samples are all large while the relevance scores of the negative samples are all small.

In another embodiment, the system further comprises an optimization module. The optimization module is configured to obtain the relevance label values of search results and the feature vectors corresponding to the search results, and to optimize the regression model according to the relevance label values and the feature vectors.

In this embodiment, the optimization module can also optimize the regression model continually according to the relevance label values of search results and the corresponding feature vectors, to obtain a more suitable regression model.

The optimization module obtains the relevance score of a search result from its feature vector using the regression model currently in use. The error between the relevance score and the relevance label value is then used to adjust the regression model, so as to optimize the model currently in use and continually improve the accuracy of relevance processing in search.

As shown in Fig. 9, in one embodiment, the above optimization module comprises a value acquiring unit 901, a relevance computing unit 903 and a model optimization unit 905.

The value acquiring unit 901 is configured to obtain the relevance label values of the search results and the feature vectors corresponding to the search results.

The relevance computing unit 903 is configured to obtain the relevance score of each search result from its feature vector and the regression model.

In this embodiment, the relevance computing unit 903 substitutes the feature vector corresponding to the search result into the formula

p(y_i = +1 \mid x_i, w) = \sigma(w^{T} x_i) = \frac{1}{1 + \exp(-w^{T} x_i)}

to calculate the relevance score of the search result.

The model optimization unit 905 is configured to optimize the regression model according to the relevance label values and relevance scores of the search results.

In this embodiment, the model optimization unit 905 compares the relevance label value and the relevance score of a search result to obtain the error between them. The shortcomings of the regression model are then identified from this error and the model is optimized to obtain a better prediction model.

Fig. 10 is a schematic structural diagram of a server provided by an embodiment of the present invention. The server 1000 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1022 (for example, one or more processors), memory 1032, and one or more storage media 1030 (for example, one or more mass storage devices) storing application programs 1042 or data 1044. The memory 1032 and the storage medium 1030 may provide transient or persistent storage. The program stored in the storage medium 1030 may include one or more modules (not shown in the figure), for example the query string search module 510, feature extraction module 530, processing module 550 and relevance determining module 570 of Fig. 5, and each module may comprise a series of instruction operations on the server. Further, the central processing unit 1022 may be configured to communicate with the storage medium 1030 and to execute, on the server 1000, the series of instruction operations in the storage medium 1030. The server 1000 may also include one or more power supplies 1026, one or more wired or wireless network interfaces 1050, one or more input/output interfaces 1058, and/or one or more operating systems 1041, such as Windows Server™, Mac OS X™, Unix™, Linux™ or FreeBSD™. The steps performed by the server in the embodiments shown in Fig. 1 to Fig. 4 may be based on the server structure shown in Fig. 10.

A person of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

A person of ordinary skill in the art will appreciate that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium; in the embodiments of the present invention, for example, the program may be stored in a storage medium of a computer system and executed by at least one processor of that computer system to carry out the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.

Claims

1. A relevance processing method in search, comprising the following steps:

2. The method according to claim 1, characterized in that the step of performing regression problem processing on the feature label values corresponding to the features in each search result to obtain the relevance score of the search result relative to the query string comprises:

forming a feature vector from the feature label values corresponding to the multiple features in each search result;

taking the feature vector as input, and obtaining the relevance score of the search result relative to the query string according to a pre-built regression model.

3. The method according to claim 2, characterized in that, before the step of taking the feature vector as input and obtaining the relevance score of the search result relative to the query string through the pre-built regression model, the method further comprises:

4. The method according to claim 3, characterized in that the step of building the regression model in advance according to multiple features in a given precise search query string set and the corresponding result data comprises:

obtaining the given precise search query string set and the most relevant result data corresponding to the query strings in the precise search query string set;

performing feature extraction on the most relevant result data to obtain the feature vectors corresponding to the most relevant result data;

performing regression learning according to the feature vectors corresponding to the most relevant result data to build the regression model.

5. The method according to claim 3, characterized in that, after the step of building the regression model in advance according to multiple features in the given precise search query string set and the corresponding result data, the method further comprises:

obtaining the relevance label value of a search result and the feature vector corresponding to the search result, and optimizing the regression model according to the relevance label value and the feature vector.

6. The method according to claim 5, characterized in that the step of obtaining the relevance label value of the search result and the feature vector corresponding to the search result, and optimizing the regression model according to the relevance label value and the feature vector, comprises:

obtaining the feature label value of the search result and the feature vector corresponding to the search result;

obtaining the relevance score of the search result from the feature vector corresponding to the search result and the regression model;

optimizing the regression model according to the relevance label value and the relevance score of the search result.

7. A relevance processing system in search, characterized by comprising:

8. The system according to claim 7, characterized in that the processing module comprises:

a vector forming unit, configured to form a feature vector from the feature label values corresponding to the multiple features in each search result;

a model input unit, configured to take the feature vector as input and obtain the relevance score of the search result relative to the query string according to a pre-built regression model.

9. The system according to claim 8, characterized in that the system further comprises:

a model construction module, configured to build the regression model in advance according to multiple features in a given precise search query string set and the corresponding most relevant result data.

10. The system according to claim 9, characterized in that the model construction module comprises:

an acquiring unit, configured to obtain the given precise search query string set and the most relevant result data corresponding to the query strings in the precise search query string set;

a feature processing unit, configured to perform feature extraction on the most relevant result data to obtain the feature vectors corresponding to the most relevant result data;

a learning unit, configured to perform regression learning according to the feature vectors corresponding to the most relevant result data to build the regression model.

11. The system according to claim 9, characterized in that the system further comprises:

an optimization module, configured to obtain the relevance label value of a search result and the feature vector corresponding to the search result, and to optimize the regression model according to the relevance label value and the feature vector.

12. The system according to claim 11, characterized in that the optimization module comprises:

a value acquiring unit, configured to obtain the feature label value of the search result and the feature vector corresponding to the search result;

a relevance computing unit, configured to obtain the relevance score of the search result from the feature vector corresponding to the search result and the regression model;

a model optimization unit, configured to optimize the regression model according to the relevance label value and the relevance score of the search result.