CN102622417A - Method and device for ordering information records - Google Patents


Info

Publication number
CN102622417A
Authority
CN
China
Prior art keywords
information
classification
query string
environmental information
under
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100389932A
Other languages
Chinese (zh)
Other versions
CN102622417B (en
Inventor
江会星
苏雪峰
佟子健
张超旭
王潇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Beijing Sogou Information Service Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Beijing Sogou Information Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd, Beijing Sogou Information Service Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN201210038993.2A priority Critical patent/CN102622417B/en
Publication of CN102622417A publication Critical patent/CN102622417A/en
Application granted granted Critical
Publication of CN102622417B publication Critical patent/CN102622417B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention provides a method and a device for ordering information records. The method includes: acquiring the environmental information corresponding to a query string; acquiring the information records of each intention category according to the query string; ordering the intention categories according to their distribution under the environmental information corresponding to the query string; and reordering the information records according to the resulting order of the intention categories, wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistically analyzing a user log that records environmental information. With the method and the device, the intention categories can be ordered according to the environmental information, so that the intention categories the current user is interested in are ranked first. In addition, because the user's individual preferences are taken into account, the reordered information records are more likely to be the information the user actually needs.

Description

Method and device for ordering information records
Technical field
The application relates to the technical field of data processing, and in particular to a method and a device for ordering information records, an information search server, and an information search client.
Background art
At present, searching for information in network data has become one of the most important applications of the Internet. For example, when an information search is performed, a search engine queries a database for information records in web page form according to the query string entered by the user; alternatively, a browser constructs a query string from the web page the user is currently browsing and queries a database for information records in web page form according to the constructed query string; and so on.
To better meet user needs, the search engine or browser does not present the retrieved information records immediately. Instead, it orders the information records from high to low by their relevance to the query string and then presents the ordered records. Ordering that uses the relevance to the query string as its criterion is referred to here as ordering by basic weight.
Information records ordered by basic weight reflect the relevance between each record and the query string, and to some extent help the user find records quickly. However, ordering by basic weight captures only the relevance between record and query string and considers no other factor, while the content of information records in real network data is highly varied. Ordering by basic weight alone is therefore too simplistic: because of the factors it ignores, records ranked near the top are not necessarily what the user needs, and records ranked further down may be exactly what the user needs. The existing methods for ordering information records thus fail to reflect the user's real information needs. The user must spend considerable time finding the most interesting information among the records returned for the query string, and excessive system resources are consumed as well.
In short, a technical problem that those skilled in the art urgently need to solve is how to provide information records that are closer to the user's real information needs, so that the user can quickly find the information of greatest interest.
Summary of the invention
The technical problem to be solved by the application is to provide a method and a device for ordering information records that order search results effectively with respect to environmental information, so that the ordered information records are closer to the user's real information needs.
Correspondingly, the application also provides an information search server and an information search client that can provide information records closer to the user's real information needs, so that the user can quickly find the information of greatest interest.
To solve the above problem, the application discloses a method for ordering information records, the method comprising:
collecting the environmental information corresponding to a query string;
obtaining the information records of each intention category according to the query string; and
ordering the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and adjusting the order of the information records according to the resulting order of the intention categories; wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of a user log that records environmental information.
In another aspect, the application also discloses a device for ordering information records, the device comprising:
a collection module, configured to collect the environmental information corresponding to a query string;
an information record acquisition module, configured to obtain the information records of each intention category according to the query string; and
an inter-category ordering module, configured to order the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the resulting order of the intention categories; wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of a user log that records environmental information.
In another aspect, the application also discloses an information search server, comprising:
a receiving module, configured to receive a query string and the environmental information corresponding to the query string from an information search client;
an information search module, configured to search the network data according to the query string to obtain the information records of each intention category;
an inter-category ordering module, configured to order the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the resulting order of the intention categories; wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of a user log that records environmental information; and
a returning module, configured to return the information records output by the inter-category ordering module.
In another aspect, the application also discloses an information search client, comprising:
a query receiving module, configured to receive the query string entered by the user;
an environment collection module, configured to collect the environmental information corresponding to the query string;
a sending module, configured to send the query string and the environmental information corresponding to the query string to an information search server; and
a presentation module, configured to present the information records returned by the information search server.
Preferably, the information search client further comprises:
a query log recording module, configured to record user identification information, the query string, and the corresponding web page operation history and environmental information in a query log, where the web page operation history is the record of the web pages the user operated on among the information records corresponding to the query string.
Compared with the prior art, the application has the following advantages:
First, the application orders the intention categories according to their distribution under the environmental information corresponding to the query string, and adjusts the order of the information records according to the resulting order of the intention categories. Users have different information needs under different environmental information, and intention categories correspond directly to information needs and reflect the user's needs for different categories of information. The ordering therefore ranks first the intention categories that best reflect the information needs under the environmental information corresponding to the query string (hereinafter, the current environmental information), so that the ordered information records better satisfy the user's real information needs.
Second, the ordering of the information records can also take into account the current user's interest in each intention category. Each user has different interests in different intention categories. Ordering by the distribution of the current user's intention categories under the environmental information corresponding to the query string, obtained by statistical analysis of a user log that records environmental information and user identification information, ranks first the intention categories the current user is more interested in. For the same query string, the prior art provides identical information records to all users across the network and does not consider individual needs; the application brings the ordered information records closer to the personalized real information needs that reflect the user's degree of interest.
Furthermore, when the information records of each intention category are ordered to adjust the overall order of the records, the application can also order the records within each intention category according to the current environmental information, so that within each intention category the web pages that best reflect the information needs under the current environmental information are ranked first. The ordered information records are thus closer to the user's real information needs.
The technical solution of the application can be applied to search engine services, browser services and similar applications, and can provide information records closer to the user's real information needs, so that the user can quickly view the information of greatest interest.
Brief description of the drawings
Fig. 1 is a flowchart of an embodiment of a method for ordering information records according to the application;
Fig. 2 is a flowchart of an embodiment of a search-engine-based information search method according to the application;
Fig. 3 is a flowchart of an embodiment of a browser-based information recommendation method according to the application;
Fig. 4 is an example of a presentation area in the embodiment shown in Fig. 3;
Fig. 5 is a structural diagram of an embodiment of a device for ordering information records according to the application;
Fig. 6 is a structural diagram of an embodiment of an information search server according to the application;
Fig. 7 is a structural diagram of an embodiment of an information search client according to the application.
Detailed description of the embodiments
To make the above objects, features and advantages of the application clearer and easier to understand, the application is described in further detail below with reference to the drawings and specific embodiments.
The embodiments of the application order the information records with respect to environmental information. Because this captures the user's different information needs under different environmental information, the ordered information records are closer to the user's real information needs.
In the embodiments of the application, environmental information mainly refers to information about the user's surroundings, and may specifically include time environmental information, location environmental information, temperature environmental information, hardware environmental information, and so on.
Under different environmental information, the user's information needs often differ. Taking time environmental information as an example: morning is the start of a new day, so the user has a demand for news in the morning; during working hours, work comes first and web browsing is secondary, so the user has a demand for web page and picture information; evening is the time for relaxation and entertainment, so the user has a demand for music and video in the evening; and so on.
Taking the geographical environment as an example: Internet cafes and homes are places for relaxation and entertainment, so users there usually have a demand for video, games, music and similar information; the office is a workplace where extensive entertainment is inappropriate, so news, pictures and similar information are sufficient for the user; airports, stations, hotels and similar places have high mobility, and users there usually pay attention to travel, weather and similar information. Even when the user has an explicit information need for video, the fact that extensive entertainment is inappropriate in the office while Internet cafes and homes are suited to entertainment suggests that a user in an office environment wants short video clips, whereas a user in an Internet cafe or at home wants complete high-definition videos.
In summary, those skilled in the art can adopt one or more of the above kinds of environmental information according to actual needs, and can subdivide each kind that is adopted. For example, time environmental information can be partitioned into daytime and night, or into morning, working hours and evening; location environmental information can be classified into Internet cafe, office, home, airport, station, hotel, and so on. The application does not limit the specific way of subdividing.
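The subdivision of time environmental information described above can be sketched as a simple bucketing function. The bucket names follow the text (morning, working hours, evening); the cut-off hours and the function name `time_bucket` are assumptions made for illustration, since the application deliberately leaves the subdivision open.

```python
from datetime import datetime

# Hypothetical cut-off hours: the source names the buckets but does not
# specify where one bucket ends and the next begins.
MORNING_START = 6
WORK_START = 9
WORK_END = 18


def time_bucket(moment: datetime) -> str:
    """Map a timestamp to a coarse time-environment label."""
    hour = moment.hour
    if MORNING_START <= hour < WORK_START:
        return "morning"
    if WORK_START <= hour < WORK_END:
        return "working"
    # Everything else (late evening and night) is treated as "evening" here.
    return "evening"
```

A location label (Internet cafe, office, home, and so on) could be attached in the same way, for instance from an IP-to-place lookup, which is outside the scope of this sketch.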
To associate the user's various information needs with the information records in network data, the application adopts a classification-based approach and adds intention category labels to the information records, so that different intention categories correspond to different information needs. In this way, ordering the retrieved information records based on environmental information becomes ordering the intention categories based on environmental information.
To order the intention categories based on environmental information, the application uses probability theory and mathematical statistics to compute the distribution of the intention categories under the environmental information corresponding to the query string. Specifically, offline, the user log is statistically analyzed to obtain the distribution of the intention categories under the environmental information corresponding to the query string; online, the information records of the intention categories are ordered according to that distribution.
Because probability symbols are used in the embodiments of the application, Table 1 explains the name, meaning and acquisition method of each symbol for ease of understanding.
Table 1
[Table 1: names, meanings and acquisition methods of the probability symbols; provided as an image in the original publication]
Referring to Fig. 1, a flowchart of an embodiment of a method for ordering information records according to the application is shown. The method may specifically include:
Step 101: collect the environmental information corresponding to the query string.
In the embodiments of the application, the information records of the intention categories are ordered according to the distribution of the intention categories under the environmental information corresponding to the query string. Users have different information needs under different environmental information, and intention categories are tied directly to information needs and reflect the user's needs for different intention categories. The ordering therefore ranks first the information records of the intention categories that best meet the information needs under the environmental information corresponding to the query string (hereinafter, the current environmental information). As a result, the ordered information records are closer to the user's real information needs and more convenient for the user.
Environmental information mainly refers to information about the user's surroundings. Even for the same user, the surroundings are likely to change; time environmental information is a typical example. Therefore, whether the query string is entered by the user or constructed from the user's input or from the web page the user is currently browsing, the environmental information corresponding to the query string is time-sensitive, and the application collects it for each query string.
For a query string, the time at which it is received or at which its construction is completed is the corresponding time environmental information; the location information obtained from the IP (Internet Protocol) address is the corresponding location environmental information; the temperature corresponding to the time and location environmental information is the temperature environmental information; and so on. The application does not limit the specific method of collecting the environmental information corresponding to the query string.
Step 102: obtain the information records of each intention category according to the query string.
In a preferred embodiment of the application, the step of obtaining the information records of each intention category according to the query string may specifically include:
first searching the network data according to the query string to obtain the corresponding information records, and then classifying the information records according to preset intention categories to obtain the information records of each intention category, where the intention categories are preset according to the labels that users across the whole network assign to the web pages corresponding to the information records;
and/or searching separately, according to the query string, in network data that carries intention category labels, to obtain the information records of each intention category. That is, the query string is submitted to the search engines that correspond to the intention categories across the whole network, and the search results carrying intention category labels returned by each search engine form the information records of each intention category. The categories of the search engines across the network exist objectively: for example, mp3.baidu.com is a search engine for the music category, news.sogou.com is a search engine for the news category, and video.baidu.com is a search engine for the video category. Information records of the corresponding intention category can be obtained directly from these search engines, so the intention categories in the application are objectively existing attributes of the network data.
In the embodiments of the application, the intention categories are mainly used to distinguish the different information needs that the information records serve. In a preferred embodiment of the application, they may specifically include video, picture, information, resource, review, and price comparison categories. In practice, those skilled in the art can divide the information records into other intention categories according to actual needs in order to distinguish different information needs; the application does not limit the specific method of classifying the information records.
Step 103: order the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and adjust the order of the information records according to the resulting order of the intention categories; wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of a user log that records environmental information.
In practice, user logs such as browser logs or search engine query logs can be chosen for the statistical analysis according to the needs of the application. For example, a search engine generally keeps a query log and a browser client generally keeps a browser log; the application adds environmental information to the existing query log or browser log.
In a preferred embodiment of the application, the user log includes a browser log and/or a query log. The browser log records user identification information, the web page browsing history, and the corresponding environmental information. The query log records user identification information, the query string, and the corresponding web page operation history and environmental information, where the web page operation history is the record of the web pages the user operated on among the information records corresponding to the query string.
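The reordering in step 103 can be illustrated with a minimal sketch. It assumes each information record is a `(url, intention_category)` pair that already arrives in relevance order, and that the category distribution is available as a dictionary; both shapes and the function name `reorder_by_category` are hypothetical and not taken from the application.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: (url, intention category).
Record = Tuple[str, str]


def reorder_by_category(records: List[Record],
                        category_prob: Dict[str, float]) -> List[Record]:
    """Place records of more probable intention categories first.

    Python's sorted() is stable, so records that share a category keep
    their incoming (relevance-based) order.
    """
    return sorted(records, key=lambda record: -category_prob.get(record[1], 0.0))


records = [("a", "news"), ("b", "video"), ("c", "news"), ("d", "picture")]
evening_distribution = {"video": 0.6, "news": 0.3, "picture": 0.1}
reordered = reorder_by_category(records, evening_distribution)
```

Relying on a stable sort is one simple way to change the order between categories without disturbing the order within a category.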
$P(I_c \mid T)$ denotes the distribution of intention category $I_c$ under the environmental information $T$ corresponding to the query string. According to probability theory and mathematical statistics, it can be derived as follows:
$$
\begin{aligned}
P(I_c \mid T) &= \sum_d p(I_c, d \mid T) \\
&= \sum_d \frac{p(I_c, d, T)}{p(T)} = \sum_d \frac{p(I_c \mid d, T)\, p(d, T)}{p(T)} = \sum_d \frac{p(I_c \mid d, T)\, p(d \mid T)\, p(T)}{p(T)} = \sum_d p(d \mid T)\, p(I_c \mid d, T)
\end{aligned}
\tag{1}
$$
where the first step, $\sum_d p(I_c, d \mid T)$, marginalizes the joint probability distribution $p(I_c, d \mid T)$ over the web pages $d$.
In a preferred embodiment of the application, the distribution of the intention categories under the environmental information corresponding to the query string can be obtained by statistical analysis of a user log that records environmental information, through the following steps:
Substep A1: under the environmental information corresponding to the query string, statistically analyze the web pages across the whole network according to the user log, to obtain the web page distribution $p(d \mid T)$ under that environmental information.
When the user log is used for the statistics, $p(d)$ can be computed under the environmental information corresponding to the query string. With a query log, the computation of $p(d)$ can be expressed as:
$$p(d) = \sum_x p(d \mid x)\, p(x) \tag{2}$$
where $x$ is a record in the query log.
An example of computing $p(d)$ from a browser log is as follows: count the number of times a web page $d$ appears in the browser log; in some cases, the number of times web page $d$ appears in the browser log can be divided by the number of times all web pages appear in the browser log.
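The browser-log estimate of p(d) amounts to a relative frequency count. A minimal sketch, assuming the log has already been reduced to a sequence of visited page identifiers (the function name `page_distribution` and this input shape are illustrative):

```python
from collections import Counter
from typing import Dict, Iterable


def page_distribution(visited_pages: Iterable[str]) -> Dict[str, float]:
    """Estimate p(d) as visits to page d divided by visits to all pages."""
    counts = Counter(visited_pages)
    total = sum(counts.values())
    # With an empty log the comprehension is empty, so no division occurs.
    return {page: n / total for page, n in counts.items()}
```

Restricting `visited_pages` to the log entries recorded under one environmental information value T gives the corresponding estimate of p(d|T).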
Substep A2: under the environmental information corresponding to the query string, statistically analyze the intention categories for a particular web page according to the user log, to obtain the intention category distribution $p(I_c \mid d, T)$ of that web page under the environmental information corresponding to the query string.
In a specific implementation, $p(I_c)$ can be computed first:
1. Taking the browser log as an example, suppose there are five bars representing five intention categories $I_c$. If a web page belongs to one or more intention categories, 1 is added to each corresponding bar. The resulting values of the bars give the probability distribution of the intention categories $I_c$.
2. With a query log, the computation of $p(I_c)$ can be expressed as:
$$p(I_c) = \sum_x p(I_c \mid x)\, p(x) \tag{3}$$
Computing $p(I_c)$ for a particular web page under the environmental information $T$ corresponding to the query string yields $p(I_c \mid d, T)$.
Substep A3: taking web pages as the statistical samples, sum the product of the web page distribution under the environmental information corresponding to the query string and the intention category distribution of each web page under that environmental information, to obtain the distribution of the intention categories under the environmental information corresponding to the query string:
$$P(I_c \mid T) = \sum_d p(d \mid T)\, p(I_c \mid d, T)$$
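Substeps A1 to A3 can be put together in a short sketch. It assumes a log reduced to `(environment, page, categories)` entries, where `categories` lists the intention categories the page belongs to; this entry shape and the function name `category_distribution` are assumptions for illustration. The per-page category counts play the role of the bars in substep A2.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical log entry: (environment label, page id, categories of the page).
LogEntry = Tuple[str, str, List[str]]


def category_distribution(log: Iterable[LogEntry], env: str) -> Dict[str, float]:
    """Estimate P(I_c | T) as the sum over d of p(d | T) * p(I_c | d, T)."""
    page_counts: Counter = Counter()
    bars = defaultdict(Counter)  # page -> category -> count
    for entry_env, page, categories in log:
        if entry_env != env:
            continue  # only entries recorded under environment T count
        page_counts[page] += 1
        for category in categories:
            bars[page][category] += 1

    total_visits = sum(page_counts.values())
    result: Dict[str, float] = defaultdict(float)
    for page, visits in page_counts.items():
        p_d_given_t = visits / total_visits            # substep A1
        bar_total = sum(bars[page].values())
        for category, count in bars[page].items():
            p_c_given_dt = count / bar_total           # substep A2
            result[category] += p_d_given_t * p_c_given_dt  # substep A3
    return dict(result)


sample_log = [
    ("evening", "v1", ["video"]),
    ("evening", "v1", ["video"]),
    ("evening", "n1", ["news"]),
    ("evening", "m1", ["video", "music"]),
    ("morning", "n1", ["news"]),
]
evening = category_distribution(sample_log, "evening")
```

In this toy log the evening distribution puts video first, which is the kind of signal the inter-category ordering then uses.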
For the same query string, the prior art presents identical information records to all users across the network and does not consider the user's individual needs.
To address this problem, in a preferred embodiment of the application, in addition to considering the current environmental information, the information records corresponding to each intention category can also be ordered according to the current user's interest in the intention categories. Correspondingly, the method may further include:
identifying the user identification information of the current user corresponding to the query string; and
ordering the intention categories according to the distribution of the current user's intention categories under the environmental information corresponding to the query string, and adjusting the order of the information records according to the resulting order of the intention categories, wherein the distribution of the current user's intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of a user log that records environmental information and user identification information.
Leaving environmental information aside, different users have different interests in different intention categories. For example, user A is fond of variety shows and every day uses a search engine and/or browser to obtain the desired variety shows in video form, whereas user B is fond of celebrity pictures and habitually obtains the desired celebrity pictures by searching and/or browsing.
This preferred embodiment uses probability theory and mathematical statistics to study the regularity of the user's interest in the intention categories, and combines it with the distribution of the intention categories under the environmental information corresponding to the query string. What this preferred embodiment finally computes is the distribution of the user's intention categories under the environmental information corresponding to the query string.
Different users have different interests in different intention categories. Ordering by the distribution of the current user's intention categories under the environmental information corresponding to the query string, obtained by statistical analysis of a user log that records environmental information and user identification information, ranks first the intention categories the current user is more interested in. The application therefore brings the information records closer to the personalized real information needs that reflect the user's degree of interest.
$P(I_c \mid T, u)$ denotes the distribution of the current user's intention categories under the environmental information corresponding to the query string. It can be obtained as a weighted average with the following formula:
$$P(I_c \mid T, u) \propto \lambda\, P(T \mid I_c)\, P(I_c) + (1 - \lambda)\, P(T \mid I_c, u)\, P(I_c \mid u) \tag{4}$$
Wherein, u representative of consumer sign (userid), because all can record user identifications in every user journal, like this, just can obtain the all-access record of each u, and then, add up p (Ic) to the Visitor Logs of u and can obtain P (I c| u), P (I c| u) can reflect refer to user u respectively be intended to category distribution; λ is a random factor.
Computing p(T) restricted to a specific intention category I_c gives p(T | I_c). p(T) can be calculated with the following formulas:
$$p(T) = \sum_{d} p(d, T) \tag{5}$$
where
$$p(d, T) = p(T \mid d)\, p(d) \tag{6}$$
Here p(T | d) can be calculated as the ratio of the number of occurrences of web page d that fall under the environmental information to the total number of occurrences of web page d. Computing p(T) restricted to a specific user u and a specific intention category I_c gives P(T | I_c, u). The random factor λ weights the distribution of all users' intention categories under the environmental information corresponding to the query string against the distribution of the current user's intention categories under that environmental information; its value can be determined according to actual requirements.
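The estimate of p(T) in formulas (5) and (6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log is taken to be a list of (web page, environment label) pairs, and the function name `env_probability` and the sample data are hypothetical, not part of the application.

```python
from collections import Counter

def env_probability(log, env):
    """Estimate p(T) = sum over d of p(T|d) * p(d) from (page, env) log entries."""
    page_total = Counter(page for page, _ in log)                # occurrences of each page d
    page_in_env = Counter(page for page, e in log if e == env)   # occurrences of d under T
    n = len(log)
    total = 0.0
    for page, count in page_total.items():
        p_d = count / n                          # p(d)
        p_t_given_d = page_in_env[page] / count  # p(T|d): share of d's occurrences under T
        total += p_t_given_d * p_d               # formula (6), accumulated as in formula (5)
    return total

# Hypothetical log: page "a" is seen twice at work and once at home, page "b" once at home.
sample_log = [("a", "work"), ("a", "work"), ("a", "home"), ("b", "home")]
p_work = env_probability(sample_log, "work")
```

Restricting `sample_log` to the entries of one intention category (or of one user and one category) before calling the function would give the corresponding conditional estimates in the same way.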
For example, the log information of users within T can be labeled manually, the labels being intention categories, and λ can be adjusted until the value that gives the best intention-description accuracy is found. Here T may be the same time period T on many days in the user log.
Specifically, a reference answer for the intention-category ranking is labeled manually, and λ is varied over {0.1, 0.2, ..., 0.9}. The right-hand side of formula (4) is evaluated for each λ, and the intention-category ranking computed by the formula is compared with the reference answer to obtain the accuracy of the formula for that λ; the λ with the highest accuracy is the value finally chosen. NDCG (Normalized Discounted Cumulative Gain) can be used for this purpose. NDCG is a measure of the effectiveness of a search engine or related program; the relevance score of the top k results is calculated as:
$$\mathrm{NDCG}(k) = G_{\max,i}^{-1}(k) \sum_{j:\,\pi_i(j) \le k} \frac{2^{y_{i,j}} - 1}{\log_2\bigl(1 + \pi_i(j)\bigr)}$$
where i denotes the i-th search, j denotes the j-th result, y_{i,j} is the relevance score of the j-th result on a five-level scale, and π_i(j) is the position of that result in the ranking.
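As an illustration, the NDCG formula can be evaluated as follows. The sketch assumes that the normalizer G_max is the value of the sum under the ideal (descending-relevance) ordering, as in the usual definition of NDCG; the function names are hypothetical.

```python
import math

def dcg_at_k(grades, k):
    """Sum of (2^y - 1) / log2(1 + position) over the first k ranked positions (1-based)."""
    return sum((2 ** y - 1) / math.log2(1 + pos)
               for pos, y in enumerate(grades[:k], start=1))

def ndcg_at_k(grades, k):
    """grades: relevance grades listed in ranked order; returns a value in [0, 1]."""
    ideal = dcg_at_k(sorted(grades, reverse=True), k)  # assumed meaning of G_max
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists results in descending order of relevance scores 1.0; any other ordering of the same grades scores lower.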
As another example, the value of λ can simply be set directly, for instance to 0.6 or 0.8; the application does not limit the specific value of λ.
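The tuning procedure described above, trying each λ in a grid and keeping the one whose ranking best matches the labeled reference answer, can be sketched as follows. Position-wise accuracy is used here as a simple stand-in for the accuracy measure (NDCG could be substituted); the function names and the toy scores are hypothetical.

```python
def position_accuracy(predicted, reference):
    """Fraction of positions where the predicted ranking matches the reference."""
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)

def tune_lambda(rank_for, reference, grid):
    """Return (best_lambda, best_accuracy); the first best value wins ties."""
    best_lam, best_acc = None, -1.0
    for lam in grid:
        acc = position_accuracy(rank_for(lam), reference)
        if acc > best_acc:
            best_lam, best_acc = lam, acc
    return best_lam, best_acc

# Hypothetical values of the two terms on the right-hand side of formula (4).
ALL_USERS = {"information": 0.6, "video": 0.4}
THIS_USER = {"information": 0.1, "video": 0.9}

def rank_for(lam):
    """Rank categories by the blended score for a given lambda."""
    blended = {c: lam * ALL_USERS[c] + (1 - lam) * THIS_USER[c] for c in ALL_USERS}
    return sorted(blended, key=blended.get, reverse=True)

grid = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
best_lam, best_acc = tune_lambda(rank_for, ["video", "information"], grid)
```

In practice the accuracy would be averaged over many labeled queries rather than computed on a single one.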
In a preferred embodiment of the application, the user's identity can be recognized through the following steps:
When the user browses while logged in with a registered account, the user's ID is used as the user identification information; when the user browses without logging in, the user identification information is recognized from the user's cookie (a small text file used to store the user's private information). In practice, for a website that requires registration and login, the unique user identifier can be chosen in the following order: the user ID when the user is logged in, and the user's cookie when the user browses without logging in.
Cookie-based user recognition is an existing, typical user recognition method. Obtaining the user's cookie through a custom Apache log format or through JavaScript is in fact a very effective means of user recognition. As long as it is not cleared, a cookie can be regarded as bound to a particular accessing client computer, so cookie-based user recognition is fairly accurate. For example, for a user who has registered on Taobao, cookie information is stored on the C drive of the user's computer. When the user visits Taobao again, the Taobao system looks for the cookie information at the specified path; if it is found, the login name can be obtained even though the user has not logged in, and if it is not found, new cookie information is created on the user's computer. At present most users do not clear their cookie information, so this technique can be used to obtain the user's identity.
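The identifier selection order described above (user ID when logged in, otherwise the cookie) amounts to a simple fallback. A minimal sketch, in which the function name and the key prefixes are hypothetical:

```python
from typing import Optional

def resolve_user_key(user_id: Optional[str], cookie_id: Optional[str]) -> Optional[str]:
    """Pick the identifier used to group a user's log entries."""
    if user_id:                        # logged-in user: the registered ID takes precedence
        return f"uid:{user_id}"
    if cookie_id:                      # not logged in: fall back to the cookie value
        return f"cookie:{cookie_id}"
    return None                        # neither available: the visit cannot be attributed
```

Prefixing the key keeps a cookie value from colliding with a user ID that happens to have the same text.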
In a preferred embodiment of the application, the distribution of the current user's intention categories under the environmental information corresponding to the query string can be obtained by statistical analysis of user logs that record environmental information, through the following steps:
Sub-step B1: analyze the user logs to obtain the distribution of intention categories and the distribution of each item of environmental information under a specific intention category, and from these compute the distribution of all users' intention categories under the environmental information corresponding to the query string:
$$P(I_c \mid T) = \frac{P(T, I_c)}{P(T)} = \frac{P(T \mid I_c)\,P(I_c)}{P(T)} \propto P(T \mid I_c)\,P(I_c)$$
where ∝ denotes proportionality; because P(T) is the same for every intention category, the two sides are equivalent for ranking purposes.
Sub-step B2: analyze the current user's log to obtain the distribution of the current user's intention categories and, for the current user, the distribution of each item of environmental information under a specific intention category, and from these compute the preliminary distribution of the current user's intention categories under the environmental information corresponding to the query string:
$$P(I_c \mid T, u) = \frac{P(T, I_c, u)}{P(T, u)} = \frac{P(T \mid I_c, u)\,P(I_c \mid u)\,P(u)}{P(T, u)} \propto P(T \mid I_c, u)\,P(I_c \mid u)$$
Sub-step B3: apply linear weighting to the distribution of all users' intention categories under the environmental information corresponding to the query string and the preliminary distribution of the current user's intention categories under that environmental information, to obtain the distribution of the current user's intention categories under the environmental information corresponding to the current query string:
$$P(I_c \mid T, u) \propto \lambda\, P(T \mid I_c)\, P(I_c) + (1-\lambda)\, P(T \mid I_c, u)\, P(I_c \mid u)$$
When there is no log for the current user, that is, when the user is browsing for the first time, λ = 1, and the distribution of the current user's intention categories under the environmental information corresponding to the query string is the distribution of all users' intention categories under the current environmental information.
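Sub-steps B1 to B3, including the λ = 1 fallback for a user with no log, can be sketched as below. This is an illustration only: the inputs are assumed to be dictionaries keyed by intention category, the result is normalized so that it sums to 1, and all names and sample values are hypothetical.

```python
def personalized_distribution(env_given_cat, cat_prior,
                              user_env_given_cat=None, user_cat_prior=None, lam=0.6):
    """Linear blend of the all-users term and the current-user term.

    env_given_cat[c]      ~ P(T | I_c)       cat_prior[c]      ~ P(I_c)
    user_env_given_cat[c] ~ P(T | I_c, u)    user_cat_prior[c] ~ P(I_c | u)
    """
    if not user_env_given_cat or not user_cat_prior:
        lam = 1.0                                  # no user log: use all users only
        user_env_given_cat, user_cat_prior = {}, {}
    raw = {
        c: lam * env_given_cat[c] * cat_prior[c]
           + (1 - lam) * user_env_given_cat.get(c, 0.0) * user_cat_prior.get(c, 0.0)
        for c in cat_prior
    }
    total = sum(raw.values())
    return {c: v / total for c, v in raw.items()} if total else raw

# Hypothetical statistics for two intention categories under one environment T.
everyone = personalized_distribution({"news": 0.5, "video": 0.2},
                                     {"news": 0.5, "video": 0.5})
fan = personalized_distribution({"news": 0.5, "video": 0.2},
                                {"news": 0.5, "video": 0.5},
                                {"news": 0.1, "video": 0.9},
                                {"news": 0.2, "video": 0.8}, lam=0.5)
```

With these numbers the first-time visitor sees "news" first, while the user whose own log is dominated by video sees "video" first.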
The above sorts the information records of the intention categories according to the distribution of intention categories under the current environmental information, or the distribution of the current user's intention categories under the current environmental information, so as to adjust the order of the records across intention categories. In a preferred embodiment of the application, the information records inside each intention category can also be sorted according to the current environmental information; correspondingly, the method can further comprise:
Sorting the information records inside each intention category according to the web page distribution of the specific intention category under the environmental information corresponding to the query string, where that web page distribution is obtained by statistical analysis of user logs that record environmental information.
Because this preferred embodiment also takes environmental information into account when sorting the information records inside each intention category, the web pages that better reflect the information need under the current environmental information are placed first among the records of each category, which brings the information records closer to the user's real information needs.
For example, the video intention category has many information records, including various film-clip video resources and various high-definition video resources. If the current environmental information is ignored and the high-definition video resources are simply placed first, the user may be put in an awkward position, because in an office the user should not indulge in excessive entertainment. This preferred embodiment takes the current environmental information into account and can therefore bring the information records closer to the user's real information needs.
p(d | I_c, T) denotes the web page distribution of intention category I_c under environmental information T. By probability theory and mathematical statistics it can be derived with the following formula:
$$p(d \mid I_c, T) = \frac{p(I_c, d, T)}{p(T, I_c)} = \frac{p(I_c, d, T)}{p(I_c \mid T)\, p(T)} \tag{7}$$
where p(I_c, d, T) is the joint distribution of the environmental information corresponding to the query string, the specific intention category and the web page, which can be obtained with the following formula:
$$p(I_c, d, T) = p(I_c \mid d, T)\, p(T \mid d)\, p(d) \tag{8}$$
where p(I_c | d, T) is the distribution of a particular web page d over the intentions I_c under the environmental information T corresponding to the query string, p(T | d) is the distribution of web page d under the environmental information T corresponding to the query string, and p(d) is the web page distribution; all three can be counted directly from the browser log.
p(T, I_c) is the joint distribution of the environmental information corresponding to the query string and the specific intention category, and can be expressed with the following formula:
$$p(T, I_c) = p(T \mid I_c)\, p(I_c) = p(I_c \mid T)\, p(T) \tag{9}$$
In a preferred embodiment of the application, the web page distribution of a specific intention category under the environmental information corresponding to the query string can be obtained by statistical analysis of user logs that record environmental information, through the following steps:
Sub-step C1: analyze the user logs to obtain the distribution of web pages across the whole network, the intention-category distribution of a particular web page under the environmental information corresponding to the query string, and the distribution of each web page under that environmental information;
Computing p(I_c) for a particular web page under the environmental information T corresponding to the query string gives p(I_c | d, T);
p(T | d) is the distribution of web page d under the environmental information T corresponding to the query string and is obtained by computing p(T) for web page d, where p(T) can be calculated using formula (5).
Sub-step C2: from the distribution of web pages across the whole network, the intention-category distribution of a particular web page under the environmental information corresponding to the query string, and the distribution of each web page under that environmental information, construct the joint distribution of the environmental information corresponding to the query string, the specific intention category and each web page in the whole network;
Sub-step C3: compute the web page distribution of the specific intention category under the environmental information corresponding to the query string as the ratio of the joint distribution of the environmental information, the specific intention category and each web page in the whole network to the joint distribution of the environmental information and the specific intention category.
In practice, the joint distribution of the environmental information corresponding to the query string and the specific intention category can be calculated as the product of the distribution of the specific intention category under that environmental information and the distribution of that environmental information, p(T, I_c) = p(I_c | T) p(T); alternatively, it can be calculated as the product of the distribution of that environmental information under the specific intention category and the distribution of the specific intention category, p(T, I_c) = p(T | I_c) p(I_c). The statistical methods for p(I_c), p(T) and p(I_c | T) have been described above, and computing p(T) restricted to a specific intention category I_c gives p(T | I_c).
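Sub-steps C1 to C3 can be sketched from raw counts as follows. The sketch assumes a log of (web page, intention category, environment label) triples, which is a simplification of the logs described in the application; the function name and the sample data are hypothetical.

```python
from collections import Counter

def page_distribution(log, category, env):
    """Estimate p(d | I_c, T) per formulas (7)-(9) from (page, category, env) entries."""
    n = len(log)
    n_d = Counter(d for d, _, _ in log)                        # occurrences of page d
    n_dt = Counter(d for d, _, t in log if t == env)           # ... under environment T
    n_cdt = Counter(d for d, c, t in log if c == category and t == env)
    joint_ct = sum(n_cdt.values()) / n                         # p(T, I_c), formula (9)
    if joint_ct == 0:
        return {}
    dist = {}
    for d, hits in n_cdt.items():
        # formula (8): p(I_c | d, T) * p(T | d) * p(d)
        joint = (hits / n_dt[d]) * (n_dt[d] / n_d[d]) * (n_d[d] / n)
        dist[d] = joint / joint_ct                             # formula (7)
    return dist

# Hypothetical log entries.
sample_log = [
    ("review", "video", "work"), ("review", "video", "work"),
    ("download", "video", "work"), ("download", "video", "home"),
    ("headline", "news", "work"),
]
video_at_work = page_distribution(sample_log, "video", "work")
```

With plain counts the three factors of formula (8) collapse to count(d, I_c, T) / N, so the result equals count(d, I_c, T) / count(I_c, T); the factors are kept separate here only to mirror the formulas.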
To help those skilled in the art better understand the method of sorting information records in the application, its practical application is described below through examples.
Example 1: sorting information records in the information search service of a search engine.
Referring to Fig. 2, a flowchart of an embodiment of a search-engine-based information search method of the application is shown, which can specifically comprise:
Step 201: the information search client receives the query string input by the user;
Step 202: the information search client collects the environmental information corresponding to the query string;
Step 203: the information search client sends the query string and the corresponding environmental information to the information search server;
Step 204: the information search server receives the query string and the corresponding environmental information from the information search client;
Step 205: the information search server searches network data according to the query string to obtain the information records of each intention category;
Step 206: the information search server sorts the intention categories according to the distribution of intention categories under the environmental information corresponding to the query string, and adjusts the order of the information records according to the ranking of the intention categories; the distribution of intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information;
Step 207: the information search server returns the sorted search results to the information search client;
Step 208: the information search client displays the search results returned by the information search server.
In existing information search services, the display of search results is not adjusted according to environmental information. This example instead analyzes the query logs under different environmental information and, according to the resulting distribution of intention categories under the current environmental information, sorts the search results of each intention category, so that search results are displayed in a way personalized by environmental information. Search results closer to the user's real information needs can thus be provided, making it easier for the user to quickly find the information of greatest interest.
This is illustrated below with a concrete example:
For ease of illustration, environmental information is divided only by time (working time T_1, non-working time T_2). After the client receives the query string "A Chinese Ghost Story", denoted x, it sends x and T_1 to the server. The server searches the database with x and obtains a set of web pages carrying intention-category labels (given in the original as formula image BDA0000136820820000151). It then uses the distribution P(I_c | T_1) of intention categories under the current environmental information T_1 to sort that set (formula image BDA0000136820820000152) by intention category. For example, after sorting, the order of intention categories in the search results for "A Chinese Ghost Story" under environmental information T_1 is "information, film and television, picture, game, ...".
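The category-level reordering in this example can be sketched as follows. The probabilities are made-up values chosen to reproduce the order given above, and the function name is hypothetical.

```python
def reorder_by_category(records, category_dist):
    """records: list of (title, category); category_dist[c] ~ P(I_c | T).

    Records of more probable categories come first; the sort is stable, so the
    original (for example relevance) order is kept inside each category.
    """
    return sorted(records, key=lambda r: category_dist.get(r[1], 0.0), reverse=True)

# Hypothetical P(I_c | T_1) for working time.
working_time = {"information": 0.4, "film and television": 0.3, "picture": 0.2, "game": 0.1}
results = [
    ("trailer", "film and television"),
    ("poster", "picture"),
    ("remake announced", "information"),
    ("game demo", "game"),
]
ordered = reorder_by_category(results, working_time)
```

Swapping in a distribution estimated for non-working time T_2 would change the order without touching the retrieval step.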
In short, whereas existing information search services ignore environmental information and provide uniform search results, the application makes the search results more targeted and more personalized, makes it easier for the user to quickly find the information of greatest interest, and reduces the system resources the user occupies during the search.
As a preferred embodiment, step 205 can specifically comprise: first searching network data according to the query string to obtain the corresponding information records, and then classifying those records by intention category to obtain the information records of each intention category; and/or searching separately, according to the query string, in network data carrying intention-category labels to obtain the information records of each intention category.
As a preferred embodiment, the information search method can further comprise:
Step D: sort the information records inside each intention category according to the web page distribution of the specific intention category under the environmental information corresponding to the query string, where that web page distribution is obtained by statistical analysis of user logs that record environmental information.
Step D can be performed before or after step 206; whichever of step D and step 206 is performed later outputs the ranking result to step 207. In the example above, records can be sorted inside each intention category according to P(d | I_c, T_1); for instance, under environmental information T_1, the "A Chinese Ghost Story film review" page is placed before the "A Chinese Ghost Story video download" page in the "film and television" intention category.
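One way to combine the two sorts is shown below: sort by the intra-category score first, then apply a stable sort by category, which keeps the intra-category order intact. This is a sketch under the assumption that distinct categories have distinct probabilities; the names and sample values are hypothetical.

```python
def two_level_sort(records, category_dist, page_dist):
    """records: list of (page, category).

    category_dist[c]       ~ P(I_c | T)      (inter-category order, step 206)
    page_dist[(page, c)]   ~ P(d | I_c, T)   (intra-category order, step D)
    """
    by_page = sorted(records, key=lambda r: page_dist.get(r, 0.0), reverse=True)
    # Python's sort is stable, so the page order survives inside each category.
    return sorted(by_page, key=lambda r: category_dist.get(r[1], 0.0), reverse=True)

# Hypothetical values for working time T_1.
category_dist = {"information": 0.5, "film and television": 0.3}
page_dist = {
    ("film review", "film and television"): 0.7,
    ("video download", "film and television"): 0.3,
    ("news item", "information"): 1.0,
}
records = [
    ("video download", "film and television"),
    ("news item", "information"),
    ("film review", "film and television"),
]
ranked = two_level_sort(records, category_dist, page_dist)
```

Sorting once with the tuple key `(category_dist[c], page_dist[(page, c)])` gives the same result and also behaves predictably when two categories tie.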
As a preferred embodiment, the information search method can further comprise:
Step E1: recognize the user identification information of the current user corresponding to the query string;
Step E2: sort the intention categories, and adjust the order of the information records according to the ranking of the intention categories.
Step E2 can replace step 206, in which case the result of step E2 is output to step 207.
As a preferred embodiment, the information search method can further comprise a relevance ranking step: according to the relevance between the search results and the query string, the information records output by the information search module are ranked in descending order of relevance. The relevance ranking can be performed before or after step 206, and the final result of the relevance ranking step and step 206 is output to step 208.
It should be noted that the information service client can also record the user ID, the query string, the accessed web pages and the environmental information correspondingly in the query log, where the accessed web pages are the web pages in the information records that the user accessed.
Example 2: sorting information records in an information recommendation service, where the information records take the form of recommendation results.
Referring to Fig. 3, a flowchart of an embodiment of a browser-based information recommendation method of the application is shown, which can specifically comprise:
Step 301: construct a query string from the user's input or the web page the user is currently browsing;
Step 302: collect the environmental information corresponding to the user's input or the currently browsed web page as the environmental information corresponding to the query string;
Step 303: search network data according to the query string to obtain the recommendation results of each intention category;
Step 304: sort the intention categories according to the distribution of intention categories under the environmental information corresponding to the query string, and adjust the order of the recommendation results according to the ranking of the intention categories; the distribution of intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information;
Step 305: display the sorted recommendation results of each intention category.
Existing information recommendation services do not consider environmental information when sorting recommendation results, whereas the application adjusts the display of recommendation results according to the current environmental information and can thus provide personalized browsing recommendations.
Corresponding examples:
Example 1: in the morning, news recommendations are ranked first; during working hours, web page and picture recommendations are ranked first; in the evening, video and music recommendations are ranked first.
Example 2: in an Internet cafe, recommendations of categories such as video, games and music are ranked first; in an office, recommendations of categories such as news and pictures are ranked first; at airports, stations, hotels and similar places, recommendations of categories such as travel and weather are ranked first, and so on.
Example 3: for the same video demand, film clips are ranked first in a working environment, while high-definition, complete video resources are ranked first in environments such as an Internet cafe or the home, and so on.
The whole flow is described below with an example:
While the user is browsing a web page related to "Wang Xiaochuan", the query string "Wang Xiaochuan" is constructed from the page title, URL and body text. Then "Wang Xiaochuan" is retrieved in intention categories such as "information, picture, film and television", and the search results under each intention category are returned. Finally, the intention categories are sorted according to P(I_c | T_1).
As a preferred embodiment, the browser-based information recommendation method can further comprise:
Step F: sort the browsing information inside each intention category according to the web page distribution of the specific intention category under the environmental information corresponding to the query string, where that web page distribution is obtained by statistical analysis of user logs that record environmental information.
Step F can be performed before or after step 304; whichever of step F and step 304 is performed later outputs the ranking result to step 305. In the example above, records can be sorted inside each intention category according to P(d | I_c, T_1); for instance, under environmental information T_1, the "Sogou browser leads Wang Xiaochuan to success" page is placed before the "Sogou CEO Wang Xiaochuan preaches the Internet" page in the "information" intention category.
As a preferred embodiment, the browser-based information recommendation method can further comprise:
Step G1: recognize the user identification information of the current user corresponding to the query string;
Step G2: sort the intention categories according to the distribution of the current user's intention categories under the environmental information corresponding to the query string, and adjust the order of the recommendation results according to the ranking of the intention categories, where the distribution of the current user's intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information and user identification information.
Step G2 can replace step 304, in which case the result of step G2 is output to step 305.
In short, this preferred embodiment can provide a personalized information recommendation service based on environmental information and user identification information, giving more accurate and more personalized recommendation results.
In a preferred embodiment of the application, the browser-based information recommendation method can also record the user ID, the currently browsed web page and the corresponding environmental information in the browser log; and/or record the user ID, the query string and the corresponding web page operation history and environmental information in the query log, where the web page operation history is the record of web pages, among the recommendation results corresponding to the query string, that the user has clicked or accessed.
In another preferred embodiment of the application, step 305 can specifically be: displaying, in preset display regions, the recommendation results of the intention categories output by the inter-class sorting module, where each display region shows the top few recommendation results of one intention category. Referring to Fig. 4, an example of multiple display regions of the application is shown, in which the "information", "picture" and "film and television" intention categories rank in the top three of the recommendation results and are displayed in the corresponding display regions.
In another preferred embodiment of the application, a learning-to-rank method can be used to construct the query string from the user's input or the web page the user is currently browsing, which can specifically comprise:
Step H1: extract candidate phrases from the currently browsed web page;
Here, candidate phrases can be extracted using steps such as Chinese word segmentation, named entity recognition, part-of-speech tagging and TF-IDF (term frequency/inverse document frequency).
Step H2: select candidate words from the candidate phrases as the query string.
Learning-to-rank methods can be roughly divided into three classes: learning to rank based on regression, based on classification, and based on ordinal regression. Learning-to-rank algorithms based on ordinal regression are the focus of current research. They include pointwise learning algorithms, represented by the ranking perceptron algorithm (PRank), the improved ranking perceptron algorithm (Large Margin PRank) and Support Vector Ordinal Regression, and pairwise learning algorithms, represented by the ranking support vector machine (Rank SVM), RankBoost and RankNet. The application can adopt any of the above methods to select, from the candidate phrases, the subset of intention phrases that can represent the current page.
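As a rough illustration of steps H1 and H2, candidate phrases can be scored by TF-IDF alone and the top-scoring ones taken as the query string. This deliberately replaces the learning-to-rank model with a plain TF-IDF ranking, so it is a simplified stand-in rather than any of the algorithms named above. The tokens are assumed to be already segmented, and the function name, the smoothed IDF variant and the sample data are assumptions made for the sketch.

```python
import math
from collections import Counter

def top_terms(page_tokens, corpus, n=1):
    """Rank the page's tokens by tf * idf; corpus is a list of token lists."""
    tf = Counter(page_tokens)
    n_docs = len(corpus)

    def idf(term):
        df = sum(term in doc for doc in corpus)        # documents containing the term
        return math.log((1 + n_docs) / (1 + df)) + 1   # smoothed inverse document frequency

    scored = {t: (c / len(page_tokens)) * idf(t) for t, c in tf.items()}
    return sorted(scored, key=scored.get, reverse=True)[:n]

# Hypothetical pre-segmented pages.
page = ["wang xiaochuan", "sogou", "wang xiaochuan", "browser"]
corpus = [page, ["sogou", "browser", "download"], ["browser", "news"]]
query_terms = top_terms(page, corpus, n=1)
```

A learned ranker would use such scores as one feature among several (position in the title, entity type, and so on) instead of as the final ordering.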
Corresponding to the foregoing method of sorting information records, the application also provides a device for sorting information records. Referring to Fig. 5, the device can specifically comprise:
An acquisition module 501, configured to collect the environmental information corresponding to the query string;
An information record acquisition module 502, configured to obtain the information records of each intention category according to the query string; and
An inter-class sorting module 503, configured to sort the intention categories according to the distribution of intention categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the ranking of the intention categories, where the distribution of intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information.
In a preferred embodiment of the application, the user logs preferably comprise browser logs and/or query logs. The browser log records user identification information, web page browsing history and the corresponding environmental information; the query log records user identification information, query strings and the corresponding web page operation history and environmental information, where the web page operation history is the record of web pages, in the information records corresponding to a query string, on which the user has performed operations.
In a preferred embodiment of the application, the environmental information can preferably comprise time environmental information, location environmental information, temperature environmental information or hardware environmental information. In a preferred embodiment of the application, the intention categories can preferably comprise video, picture, information, resource, comment or price-comparison categories.
In a preferred embodiment of the application, the device can further comprise:
A first statistical module, configured to obtain the distribution of intention categories under the environmental information corresponding to the query string by statistical analysis of user logs that record environmental information, comprising:
A first statistics sub-module, configured to analyze the web pages of the whole network under the environmental information corresponding to the query string according to the user logs, to obtain the distribution of each web page under that environmental information;
A second statistics sub-module, configured to analyze, for a particular web page and under the environmental information corresponding to the query string, each intention category according to the user logs, to obtain the intention-category distribution of the particular web page under that environmental information; and
A summation sub-module, configured to sum, with the web page as the variable, the product of the web page distribution under the environmental information corresponding to the query string and the intention-category distribution of the particular web page under that environmental information, to obtain the distribution of intention categories under the environmental information corresponding to the query string.
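The summation performed by the summation sub-module, P(I_c | T) = Σ_d p(d | T) · p(I_c | d, T), can be sketched as follows. The inputs are assumed to be the outputs of the first and second statistics sub-modules expressed as dictionaries; the function name and the sample values are hypothetical.

```python
def category_distribution(page_dist, page_category_dist):
    """page_dist[d] ~ p(d | T); page_category_dist[d][c] ~ p(I_c | d, T).

    Returns P(I_c | T) by summing the product over all web pages d.
    """
    result = {}
    for d, p_d in page_dist.items():
        for c, p_c in page_category_dist.get(d, {}).items():
            result[c] = result.get(c, 0.0) + p_d * p_c
    return result

# Hypothetical statistics for one environment T.
page_dist = {"a": 0.75, "b": 0.25}
page_category_dist = {"a": {"video": 1.0}, "b": {"video": 0.5, "news": 0.5}}
dist = category_distribution(page_dist, page_category_dist)
```

If both inputs are proper distributions, the result also sums to 1 and can be passed directly to the inter-class sorting module.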
In another preferred embodiment of the application, the device can further comprise:
An identification module, configured to recognize the user identification information of the current user corresponding to the query string;
An interest-degree inter-class sorting module, configured to sort the intention categories according to the distribution of the current user's intention categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the ranking of the intention categories, where the distribution of the user's intention categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information and user identification information.
In another preferred embodiment of the application, the device may further comprise:
A second statistical module, configured to obtain, by statistical analysis of a user log that records environment information, the user's distribution over the intent categories under the environment information corresponding to the query string, which may specifically comprise:
A third statistics submodule, configured to perform statistical analysis of the user log to obtain the distribution of intent categories and the distribution of each environment information under a specific intent category, and from these to obtain the distribution of intent categories for all users under the environment information corresponding to the query string;
A fourth statistics submodule, configured to perform statistical analysis of the current user's log to obtain the current user's distribution over the intent categories and the current user's distribution of each environment information under a specific intent category, and from these to obtain a preliminary distribution of the current user's intent categories under the environment information corresponding to the query string; and
A linear weighting submodule, configured to linearly weight the distribution of intent categories for all users under the environment information corresponding to the query string and the preliminary distribution of the current user's intent categories under that environment information, obtaining the distribution of the current user's intent categories under the environment information corresponding to the current query string.
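The linear weighting step can be sketched as a convex combination of the all-user distribution and the current user's preliminary distribution. The text does not state how the weights are chosen, so the `user_weight` parameter, its default value, and the function name below are assumptions made purely for illustration.

```python
def blend_distributions(global_dist, user_dist, user_weight=0.5):
    """Linearly weight the all-user and per-user intent distributions.

    global_dist: {category: P(category | env)} over all users
    user_dist:   {category: P(category | env)} for the current user
    user_weight: assumed mixing weight in [0, 1]; not given by the source
    """
    if not 0.0 <= user_weight <= 1.0:
        raise ValueError("user_weight must be in [0, 1]")
    categories = set(global_dist) | set(user_dist)
    return {
        c: (1.0 - user_weight) * global_dist.get(c, 0.0)
        + user_weight * user_dist.get(c, 0.0)
        for c in categories
    }

# Hypothetical inputs: the population prefers video, this user prefers news.
blended = blend_distributions(
    {"video": 0.75, "news": 0.25},
    {"news": 1.0},
    user_weight=0.5,
)
```

Because both inputs sum to one and the weights sum to one, the blended result is again a probability distribution. A small `user_weight` is a reasonable choice when the current user's log is sparse, although that policy is a design assumption rather than something the application specifies.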
In a preferred embodiment of the application, the device may further comprise:
An intra-category sorting module, configured to sort the information records within each intent category according to the web page distribution of the specific intent category under the environment information corresponding to the query string; wherein the web page distribution of the specific intent category under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information.
In another preferred embodiment of the application, the device may further comprise:
A third statistical module, configured to obtain, by statistical analysis of a user log that records environment information, the web page distribution of the specific intent category under the environment information corresponding to the query string, comprising:
A fifth statistics submodule, configured to perform statistical analysis of the user log to obtain the distribution of web pages across the entire web, the intent category distribution of a specific web page under the environment information corresponding to the query string, and the web page distribution under the environment information corresponding to the query string;
A sixth statistics submodule, configured to construct, from the distribution of web pages across the entire web, the intent category distribution of a specific web page under the environment information corresponding to the query string, and the web page distribution under that environment information, the joint distribution of the environment information corresponding to the query string, the specific intent category, and the web pages across the entire web; and
A seventh statistics submodule, configured to obtain the web page distribution of the specific intent category under the environment information corresponding to the query string according to the ratio of the joint distribution of the environment information, the specific intent category, and the web pages across the entire web to the joint distribution of the environment information and the specific intent category.
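The ratio computed by the seventh statistics submodule corresponds to P(w | c, e) = P(e, c, w) / P(e, c). The sketch below assumes that, for a fixed environment e, the joint term is proportional to P(w | e) · P(c | w, e), so that the constant P(e) cancels in the ratio. That factorization, the function names, and the example data are assumptions for illustration and are not stated in this form in the application.

```python
def page_distribution_in_category(page_dist, intent_given_page, category):
    """Estimate P(w | c, e) for one fixed environment e.

    Numerator:   P(w|e) * P(c|w,e), proportional to P(e, c, w)
    Denominator: the same quantity summed over all pages, i.e. P(e, c)
                 up to the same constant factor.
    """
    joint = {
        page: p * intent_given_page.get(page, {}).get(category, 0.0)
        for page, p in page_dist.items()
    }
    total = sum(joint.values())
    if total == 0.0:
        return {}  # category never observed under this environment
    return {page: value / total for page, value in joint.items()}

def sort_within_category(records, page_probs):
    """Order one category's records by descending P(w | c, e)."""
    return sorted(records, key=lambda url: page_probs.get(url, 0.0), reverse=True)

page_dist = {"a.example": 0.5, "b.example": 0.5}
intent_given_page = {
    "a.example": {"video": 1.0},
    "b.example": {"video": 0.5, "news": 0.5},
}
video_pages = page_distribution_in_category(page_dist, intent_given_page, "video")
ordered = sort_within_category(["b.example", "a.example"], video_pages)
```

Records whose pages were never seen in the log fall back to a probability of zero and therefore sort last. A real system would probably combine this score with query relevance, which this sketch leaves out.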
In the embodiments of the application, preferably, the information record acquisition module may be specifically configured to search network data according to the query string to obtain corresponding information records, and to classify the information records by intent category, obtaining the information records of each intent category; and/or to search separately, according to the query string, in network data carrying intent category labels, obtaining the information records of each intent category.
In a preferred embodiment of the application, the device may further comprise: a presentation module, configured to present the information records of each intent category output by the inter-category sorting module.
In the embodiments of the application, preferably, the presentation module may be specifically configured to present, in each preset presentation area, the information records of each intent category output by the inter-category sorting module.
In the embodiments of the application, preferably, the query string derives from user input or from the web page the user is currently browsing.
Because the device embodiment for ordering information records is substantially similar to the method embodiment for ordering information records, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
Referring to Fig. 6, a structural diagram of an embodiment of an information search server of the application is shown, which may specifically comprise:
A receiving module 601, configured to receive, from an information search client, a query string and the environment information corresponding to the query string;
An information search module 602, configured to search network data according to the query string, obtaining the information records of each intent category;
An inter-category sorting module 603, configured to sort the intent categories according to the distribution of intent categories under the environment information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of intent categories under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information; and
A return module 604, configured to return the information records output by the inter-category sorting module to the information search client.
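The work of the inter-category sorting module 603 can be sketched as two steps: rank the intent categories by their probability under the current environment, then reorder the records so that records of higher-ranked categories come first. The record layout (a list of `(category, item)` pairs already in relevance order) and the function name are assumptions made for illustration.

```python
def reorder_by_intent(records, intent_dist):
    """Reorder records so that more probable intent categories come first.

    records:     list of (category, item) pairs, already in relevance order
    intent_dist: {category: P(category | env)}
    """
    ranked = sorted(intent_dist, key=intent_dist.get, reverse=True)
    rank = {category: i for i, category in enumerate(ranked)}
    # sorted() is stable, so the incoming order is preserved inside each
    # category; categories missing from intent_dist are placed last.
    return sorted(records, key=lambda r: rank.get(r[0], len(rank)))

records = [("news", "n1"), ("video", "v1"), ("news", "n2"), ("video", "v2")]
result = reorder_by_intent(records, {"video": 0.75, "news": 0.25})
```

Relying on a stable sort means this step can run after a relevance ranking without disturbing the order that ranking produced within each category.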
In a preferred embodiment of the application, the information search module 602 may be specifically configured to: when the user uses a search engine, search network data according to the query string to obtain corresponding information records, and classify the information records by intent category, obtaining the information records of each intent category; and/or, when the user uses a browser to browse information, search separately, according to the query string corresponding to the currently browsed web page, in network data carrying intent category labels, obtaining the information records of each intent category.
In a preferred embodiment of the application, the information search server may further comprise:
A first relevance ranking module, configured to perform a first relevance ranking on the information records output by the information search module according to the relevance between the information records and the query string, and to output the information records after the first relevance ranking to the inter-category sorting module; or
A second relevance ranking module, configured to perform a second relevance ranking on the information records output by the inter-category sorting module according to the relevance between the information records and the query string, and to output the information records after the second relevance ranking to the return module.
In a preferred embodiment of the application, the information search server may further comprise:
An identification module, configured to identify the user identification information of the current user corresponding to the query string;
An interest-based inter-category sorting module, configured to sort the information records of each intent category according to the current user's distribution over the intent categories under the environment information corresponding to the query string; wherein the current user's distribution over the intent categories under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information and user identification information;
The return module is further configured to return the information records output by the interest-based inter-category sorting module to the information search client; it may also return to the information search client the information records output after combined processing by the inter-category sorting module and the interest-based inter-category sorting module.
In a preferred embodiment of the application, the information search server may further comprise:
An intra-category sorting module, used before or after the inter-category sorting module and configured to sort the information records within each intent category according to the web page distribution of the specific intent category under the environment information corresponding to the query string; wherein the web page distribution of the specific intent category under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information;
The return module is further configured to return the information records output by the intra-category sorting module to the information search client. It may also return to the information search client the information records output by the inter-category sorting module, the information records output after combined processing by the inter-category sorting module and the intra-category sorting module, or the information records output after combined processing by the inter-category sorting module, the interest-based inter-category sorting module, and the intra-category sorting module.
Referring to Fig. 7, a structural diagram of an embodiment of an information search client of the application is shown, which may specifically comprise:
A receiving module 701, configured to receive the query string input by the user;
An environment collection module 702, configured to collect the environment information corresponding to the query string;
A sending module 703, configured to send the query string and the environment information corresponding to the query string to an information search server; and
A presentation module 704, configured to present the information records returned by the information search server.
In a preferred embodiment of the application, the information search client may further comprise:
A query log recording module, configured to record user identification information, the query string, the corresponding web page operation history, and the environment information in a query log, where the web page operation history is the record of the web pages the user operated on among the information records corresponding to the query string.
Because the embodiments of the information search server and client are substantially similar to the method embodiment for ordering information records, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
The method and device for ordering information records, the information search server, and the information search client provided by the application have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the application, and the description of the above embodiments is intended only to help in understanding the method of the application and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the application.
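A query log entry of the kind described above can be sketched as a small record type serialized one entry per line. The field names, the environment keys such as `time_slot`, and the JSON encoding are assumptions for illustration; the text only states which kinds of information are recorded.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class QueryLogEntry:
    """One query-log record; all field names are hypothetical."""
    user_id: str                                        # user identification information
    query: str                                          # the query string
    clicked_pages: list = field(default_factory=list)   # web page operation history
    environment: dict = field(default_factory=dict)     # environment information

    def to_json(self):
        # sort_keys gives a deterministic line that is easy to diff and parse.
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)

entry = QueryLogEntry(
    user_id="u1",
    query="apple",
    clicked_pages=["a.example"],
    environment={"time_slot": "evening"},
)
line = entry.to_json()
```

Keeping the clicked pages and the environment in the same record is what allows the later statistics to be conditioned on the environment when the log is analyzed.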

Claims (23)

1. A method for ordering information records, characterized in that the method comprises:
Collecting the environment information corresponding to a query string;
Obtaining the information records of each intent category according to the query string;
Sorting the intent categories according to the distribution of the intent categories under the environment information corresponding to the query string, and adjusting the order of the information records according to the sorting result of the intent categories; wherein the distribution of the intent categories under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information.
2. The method of claim 1, characterized in that the distribution of the intent categories under the environment information corresponding to the query string is obtained through the following steps:
Under the environment information corresponding to the query string, performing statistical analysis of the web pages across the entire web according to the user log, obtaining the web page distribution under the environment information;
Under the environment information corresponding to the query string, performing statistical analysis of the intent categories of each specific web page according to the user log, obtaining the intent category distribution of a specific web page under the environment information;
Taking each web page as a statistical sample, accumulating the web page distribution under the environment information and the intent category distribution of the specific web page under the environment information, obtaining the distribution of the intent categories under the environment information corresponding to the query string.
3. The method of claim 1, characterized in that it further comprises:
Identifying the user identification information of the current user corresponding to the query string;
Sorting the intent categories according to the current user's distribution over the intent categories under the environment information corresponding to the query string, and adjusting the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution over the intent categories under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information and user identification information.
4. The method of claim 3, characterized in that the current user's distribution over the intent categories under the environment information corresponding to the query string is obtained through the following steps:
Performing statistical analysis of the user log to obtain the distribution of the intent categories and the distribution of each environment information under a specific intent category, and from these obtaining the distribution of the intent categories for all users under the environment information;
Performing statistical analysis of the current user's log to obtain the current user's distribution over the intent categories and the current user's distribution of each environment information under a specific intent category, and from these obtaining a preliminary distribution of the current user's intent categories under the environment information;
Weighting the distribution of the intent categories for all users under the environment information and the preliminary distribution of the current user's intent categories under the environment information, obtaining the current user's distribution over the intent categories under the environment information corresponding to the query string.
5. The method of any one of claims 1 to 4, characterized in that it further comprises:
Sorting the information records of each intent category according to the web page distribution of the specific intent category under the environment information corresponding to the query string; wherein the web page distribution of the specific intent category under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information.
6. The method of claim 5, characterized in that the web page distribution of the specific intent category under the environment information corresponding to the query string is obtained through the following steps:
Performing statistical analysis of the user log to obtain the distribution of web pages across the entire web, the intent category distribution of a specific web page under the environment information corresponding to the query string, and the web page distribution under the environment information corresponding to the query string;
Constructing, from the distribution of web pages across the entire web, the intent category distribution of a specific web page under the environment information corresponding to the query string, and the web page distribution under that environment information, the joint distribution of the environment information corresponding to the query string, the specific intent category, and the web pages across the entire web;
Obtaining the web page distribution of the specific intent category under the environment information corresponding to the query string according to the ratio of the joint distribution of the environment information, the specific intent category, and the web pages across the entire web to the joint distribution of the environment information and the specific intent category.
7. The method of any one of claims 1 to 4, characterized in that obtaining the information records of each intent category according to the query string comprises:
Searching network data according to the query string to obtain corresponding information records, and classifying the information records by intent category, obtaining the information records of each intent category;
And/or, searching separately, according to the query string, in network data carrying intent category labels, obtaining the information records of each intent category.
8. The method of any one of claims 1 to 4, characterized in that the user log comprises a browser log and/or a query log; the browser log records user identification information, web page browsing history, and the corresponding environment information; the query log records user identification information, the query string, the corresponding web page operation history, and environment information, where the web page operation history is the record of the web pages the user operated on among the information records corresponding to the query string.
9. The method of any one of claims 1 to 4, characterized in that it further comprises:
Presenting the sorted information records of each intent category.
10. The method of claim 9, characterized in that it further comprises: presenting the recommendation results of each intent category in its respective preset presentation area.
11. The method of any one of claims 1 to 4, characterized in that the query string derives from user input or from the web page the user is currently browsing.
12. A device for ordering information records, characterized in that the device comprises:
A collection module, configured to collect the environment information corresponding to a query string;
An information record acquisition module, configured to obtain the information records of each intent category according to the query string; and
An inter-category sorting module, configured to sort the intent categories according to the distribution of intent categories under the environment information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of intent categories under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information.
13. The device of claim 12, characterized in that it further comprises:
A first statistical module, configured to obtain the distribution of intent categories under the environment information corresponding to the query string, comprising:
A first statistics submodule, configured to perform, under the environment information corresponding to the query string, statistical analysis of the web pages across the entire web according to the user log, obtaining the web page distribution under the environment information;
A second statistics submodule, configured to perform, under the environment information corresponding to the query string, statistical analysis of the intent categories of each specific web page according to the user log, obtaining the intent category distribution of a specific web page under the environment information; and
A summation submodule, configured to take the web page as the variable and accumulate the web page distribution under the environment information and the intent category distribution of the specific web page under the environment information, obtaining the distribution of intent categories under the environment information corresponding to the query string.
14. The device of claim 12, characterized in that it further comprises:
An identification module, configured to identify the user identification information of the current user corresponding to the query string;
An interest-based inter-category sorting module, configured to sort the intent categories according to the current user's distribution over the intent categories under the environment information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution over the intent categories under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information and user identification information.
15. The device of claim 14, characterized in that it further comprises:
A second statistical module, configured to obtain the user's distribution over the intent categories under the environment information corresponding to the query string, comprising:
A third statistics submodule, configured to perform statistical analysis of the user log to obtain the distribution of intent categories and the distribution of each environment information under a specific intent category, and from these to obtain the distribution of intent categories for all users under the environment information;
A fourth statistics submodule, configured to perform statistical analysis of the current user's log to obtain the distribution of the current user's intent categories and the current user's distribution of each environment information under a specific intent category, and from these to obtain a preliminary distribution of the current user's intent categories under the environment information; and
A linear weighting submodule, configured to weight the distribution of intent categories for all users under the environment information and the preliminary distribution of the current user's intent categories under the environment information, obtaining the distribution of the current user's intent categories under the environment information corresponding to the query string.
16. The device of any one of claims 12 to 15, characterized in that it further comprises:
An intra-category sorting module, configured to sort the information records within each intent category according to the web page distribution of the specific intent category under the environment information corresponding to the query string; wherein the web page distribution of the specific intent category under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information.
17. An information search server, characterized in that it comprises:
A receiving module, configured to receive, from an information search client, a query string and the environment information corresponding to the query string;
An information search module, configured to search network data according to the query string, obtaining the information records of each intent category;
An inter-category sorting module, configured to sort the intent categories according to the distribution of intent categories under the environment information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of intent categories under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information; and
A return module, configured to return the information records output by the inter-category sorting module.
18. The information search server of claim 17, characterized in that the information search module is specifically configured to search network data according to the query string to obtain corresponding information records, and to classify the information records by intent category, obtaining the information records of each intent category; and/or to search separately, according to the query string, in network data carrying intent category labels, obtaining the information records of each intent category.
19. The information search server of claim 17, characterized in that it further comprises:
A first relevance ranking module, configured to perform a first relevance ranking on the information records output by the information search module according to the relevance between the information records and the query string, and to output the information records after the first relevance ranking to the inter-category sorting module; or
A second relevance ranking module, configured to perform a second relevance ranking on the information records output by the inter-category sorting module according to the relevance between the information records and the query string, and to output the information records after the second relevance ranking to the return module.
20. The information search server of claim 17, characterized in that it further comprises:
An identification module, configured to identify the user identification information of the current user corresponding to the query string;
An interest-based inter-category sorting module, configured to sort the intent categories according to the current user's distribution over the intent categories under the environment information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution over the intent categories under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information and user identification information;
The return module is further configured to return the information records output by the interest-based inter-category sorting module to the information search client.
21. The information search server of any one of claims 17 to 20, characterized in that it further comprises:
An intra-category sorting module, configured to sort the information records within each intent category according to the web page distribution of the specific intent category under the environment information corresponding to the query string; wherein the web page distribution of the specific intent category under the environment information corresponding to the query string is obtained by statistical analysis of a user log that records environment information;
The return module is further configured to return the information records output by the intra-category sorting module to the information search client.
22. An information search client, characterized in that it comprises:
A query receiving module, configured to receive the query string input by the user;
An environment collection module, configured to collect the environment information corresponding to the query string;
A sending module, configured to send the query string and the environment information corresponding to the query string to an information search server; and
A presentation module, configured to present the information records returned by the information search server.
23. The information search client of claim 22, characterized in that it further comprises:
A query log recording module, configured to record user identification information, the query string, the corresponding web page operation history, and the environment information in a query log, where the web page operation history is the record of the web pages the user operated on among the information records corresponding to the query string.
CN201210038993.2A 2012-02-20 2012-02-20 Method and device for ordering information records Active CN102622417B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210038993.2A CN102622417B (en) 2012-02-20 2012-02-20 Method and device for ordering information records

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210038993.2A CN102622417B (en) 2012-02-20 2012-02-20 The method and apparatus that information record is ranked up

Publications (2)

Publication Number Publication Date
CN102622417A (en) 2012-08-01
CN102622417B (en) 2016-08-31

Family

ID=46562336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210038993.2A Active CN102622417B (en) 2012-02-20 2012-02-20 Method and device for ordering information records

Country Status (1)

Country Link
CN (1) CN102622417B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1050830A2 (en) * 1999-05-05 2000-11-08 Xerox Corporation System and method for collaborative ranking of search results employing user and group profiles
US20030036848A1 (en) * 2001-08-16 2003-02-20 Sheha Michael A. Point of interest spatial rating search method and system
CN1758248A (en) * 2004-10-05 2006-04-12 微软公司 Systems, methods, and interfaces for providing personalized search and information access
CN101019118A (en) * 2004-07-13 2007-08-15 谷歌股份有限公司 Personalization of placed content ordering in search results

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593353B (en) * 2012-08-15 2018-11-13 阿里巴巴集团控股有限公司 Information search method, displaying information sorting weighted value determine method and its device
CN103810210A (en) * 2012-11-14 2014-05-21 腾讯科技(深圳)有限公司 Search result display method and device
CN103810210B (en) * 2012-11-14 2018-10-19 腾讯科技(深圳)有限公司 Search result display methods and device
CN103838754A (en) * 2012-11-23 2014-06-04 腾讯科技(深圳)有限公司 Information searching device and method
CN103838754B (en) * 2012-11-23 2017-12-22 腾讯科技(深圳)有限公司 Information retrieval device and method
WO2014094481A1 (en) * 2012-12-21 2014-06-26 腾讯科技(深圳)有限公司 Method and device for pushing information
US9589026B2 (en) 2012-12-21 2017-03-07 Tencent Technology (Shenzhen) Company Limited Method and device for pushing information
CN104112235A (en) * 2013-04-22 2014-10-22 中广核工程有限公司 Method and system for nuclear power project experience feedback information searching
CN104657397B (en) * 2013-11-25 2020-03-03 腾讯科技(深圳)有限公司 Information processing method and terminal
CN104657397A (en) * 2013-11-25 2015-05-27 腾讯科技(深圳)有限公司 Information processing method and terminal
CN104699725A (en) * 2013-12-10 2015-06-10 阿里巴巴集团控股有限公司 Data searching processing method and system
CN104699725B (en) * 2013-12-10 2018-10-09 阿里巴巴集团控股有限公司 data search processing method and system
US10666735B2 (en) 2014-05-19 2020-05-26 Auerbach Michael Harrison Tretter Dynamic computer systems and uses thereof
CN106663082B (en) * 2014-05-19 2020-08-07 M·H·T·奥尔巴赫 Dynamic computer system and use thereof
CN106663082A (en) * 2014-05-19 2017-05-10 迈克尔哈里森特雷特奥尔巴克信托公司 Dynamic computer systems and uses thereof
WO2016107276A1 (en) * 2014-12-29 2016-07-07 北京奇虎科技有限公司 Search method and device
CN104715011A (en) * 2014-12-31 2015-06-17 上海孩子国科教设备有限公司 Method and system for conducting data retrieval
CN105302903A (en) * 2015-10-27 2016-02-03 广州神马移动信息科技有限公司 Search method, apparatus and system and search result sequence adjustment basis determination method
WO2017071578A1 (en) * 2015-10-27 2017-05-04 广州神马移动信息科技有限公司 Searching method, apparatus and system, and method for determining search result order adjustment basis
CN105893427A (en) * 2015-12-07 2016-08-24 乐视网信息技术(北京)股份有限公司 Resource searching method and server
WO2017096896A1 (en) * 2015-12-07 2017-06-15 乐视控股(北京)有限公司 Resource search method and server
CN106874413A (en) * 2017-01-22 2017-06-20 斑马信息科技有限公司 Search system and its method for processing search results
CN107515857B (en) * 2017-08-31 2020-08-18 科大讯飞股份有限公司 Semantic understanding method and system based on customization technology
CN107515857A (en) * 2017-08-31 2017-12-26 科大讯飞股份有限公司 Semantic understanding method and system based on customization technical ability
CN107832432A (en) * 2017-11-15 2018-03-23 北京百度网讯科技有限公司 A kind of search result ordering method, device, server and storage medium
CN108897785A (en) * 2018-06-08 2018-11-27 Oppo(重庆)智能科技有限公司 Search for content recommendation method, device, terminal device and storage medium
CN108763579A (en) * 2018-06-08 2018-11-06 Oppo(重庆)智能科技有限公司 Search for content recommendation method, device, terminal device and storage medium
CN110162535A (en) * 2019-03-26 2019-08-23 腾讯科技(深圳)有限公司 For executing personalized searching method, device, equipment and storage medium
CN110162535B (en) * 2019-03-26 2023-11-07 腾讯科技(深圳)有限公司 Search method, apparatus, device and storage medium for performing personalization
CN110990598A (en) * 2019-11-18 2020-04-10 北京声智科技有限公司 Resource retrieval method and device, electronic equipment and computer-readable storage medium
CN110990598B (en) * 2019-11-18 2020-11-27 北京声智科技有限公司 Resource retrieval method and device, electronic equipment and computer-readable storage medium
CN113254513A (en) * 2021-07-05 2021-08-13 北京达佳互联信息技术有限公司 Sequencing model generation method, sequencing device and electronic equipment
CN113792225A (en) * 2021-08-25 2021-12-14 北京库睿科技有限公司 Multi-data type hierarchical sequencing method and device
CN113792225B (en) * 2021-08-25 2023-08-18 北京库睿科技有限公司 Multi-data type hierarchical ordering method and device

Also Published As

Publication number Publication date
CN102622417B (en) 2016-08-31

Similar Documents

Publication Publication Date Title
CN102622417A (en) Method and device for ordering information records
US8346782B2 (en) Method and system of information matching in electronic commerce website
CN103886487B (en) Based on personalized recommendation method and the system of distributed B2B platform
CN107862553A (en) Advertisement real-time recommendation method, device, terminal device and storage medium
CN102054003B (en) Methods and systems for recommending network information and creating network resource index
CN107451861B (en) Method for identifying user internet access characteristics under big data
CN102037464A (en) Search results with most clicked next objects
CN101329674A (en) System and method for providing personalized searching
KR20100094021A (en) Customized and intellectual symbol, icon internet information searching system utilizing a mobile communication terminal and ip-based information terminal
CN102799662A (en) Method, device and system for recommending website
CN102053983A (en) Method, system and device for querying vertical search
CN102708174A (en) Method and device for displaying rich media information in browser
US20150220641A1 (en) Search engine optimization at scale
CN103839169A (en) Personalized commodity recommendation method based on frequency matrix and text similarity
US9158851B2 (en) Location aware commenting widget for creation and consumption of relevant comments
CN103034680A (en) Data interaction method and device for terminal device
Dias et al. Automating the extraction of static content and dynamic behaviour from e-commerce websites
Gupta et al. A review on search engine optimization: Basics
CN104598604A (en) Browsing method of website navigation applied in various browsers
JP4504878B2 (en) Document processing device
CN105590234A (en) Method and system for recommending commodities to target users
Bhujbal et al. News aggregation using web scraping news portals
CN110347923B (en) Traceable fast fission type user portrait construction method
US20150248491A1 (en) Data processing device and data processing method
US20210004790A1 (en) Systems, Methods and Devices for Providing Automated Adaptive Web-Based News Feeds

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant