CN102622417B - Method and apparatus for ranking information records - Google Patents

Method and apparatus for ranking information records

Info

Publication number
CN102622417B
CN102622417B CN201210038993.2A CN201210038993A CN102622417B CN 102622417 B CN102622417 B CN 102622417B CN 201210038993 A CN201210038993 A CN 201210038993A CN 102622417 B CN102622417 B CN 102622417B
Authority
CN
China
Prior art keywords
information
classification
query string
intended
environmental information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210038993.2A
Other languages
Chinese (zh)
Other versions
CN102622417A (en)
Inventor
江会星
苏雪峰
佟子健
张超旭
王潇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Beijing Sogou Information Service Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Beijing Sogou Information Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd and Beijing Sogou Information Service Co Ltd
Priority to CN201210038993.2A
Publication of CN102622417A
Application granted
Publication of CN102622417B
Legal status: active
Anticipated expiration

Abstract

The present application provides a method and an apparatus for ranking information records. The method specifically includes: collecting the environmental information corresponding to a query string; obtaining the information records of each intention category according to the query string; ranking the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and adjusting the order of the information records according to the ranking result of the intention categories. The distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistically analyzing user logs in which environmental information is recorded. The application can rank intention categories according to environmental information, place the intention categories that interest the current user first, and take the user's personal factors into account, so that the ranked information records are closer to the user's real information needs.

Description

Method and apparatus for ranking information records
Technical field
The present application relates to the technical field of data processing, and in particular to a method and an apparatus for ranking information records, an information search server, and an information search client.
Background
At present, searching network data for information has become one of the most important applications on the Internet. For example, a search engine queries a database for information records in web page form according to the query string entered by the user; or a browser constructs a query string from the web page the user is currently browsing and queries the database for information records in web page form according to the constructed query string; and so on.
To better meet user needs, the search engine or browser does not display the retrieved information records immediately. Instead, it ranks them in descending order of their relevance to the query string and displays the ranked records. Ranking by relevance to the query string in this way is referred to as ranking according to the base weight.
Information records ranked according to the base weight reflect the relevance between the records and the query string, which helps the user find information among them quickly. However, ranking by base weight captures only that relevance and ignores other factors, while the content of information records in real network data is highly varied, so ranking by base weight alone is too simplistic. Because of these other factors, the records ranked first are not necessarily the ones the user needs, and records ranked lower may be exactly what the user needs. Existing ranking methods therefore fail to reflect the user's real information needs: the user has to spend a great deal of time finding the most interesting information among the records for the query string, and more system resources are consumed.
In short, the technical problem that those skilled in the art urgently need to solve is how to provide information records that are closer to the user's real information needs, so that the user can quickly find the most interesting information among them.
Summary of the invention
The technical problem to be solved by the present application is to provide a method and an apparatus for ranking information records that rank search results effectively with respect to environmental information, so that the ranked information records are closer to the user's real information needs.
Correspondingly, the present application also provides an information search server and an information search client that can provide information records closer to the user's real information needs, so that the user can quickly find the most interesting information among them.
To solve the above problems, the present application discloses a method for ranking information records, the method comprising:
collecting the environmental information corresponding to a query string;
obtaining the information records of each intention category according to the query string; and
ranking the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and adjusting the order of the information records according to the ranking result of the intention categories; wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistically analyzing user logs in which environmental information is recorded.
In another aspect, the present application also discloses an apparatus for ranking information records, the apparatus comprising:
a collection module, configured to collect the environmental information corresponding to a query string;
an information record acquisition module, configured to obtain the information records of each intention category according to the query string; and
an inter-category ranking module, configured to rank the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the ranking result of the intention categories; wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistically analyzing user logs in which environmental information is recorded.
In another aspect, the present application also discloses an information search server, comprising:
a receiving module, configured to receive a query string and the environmental information corresponding to the query string from an information search client;
an information search module, configured to search network data according to the query string to obtain the information records of each intention category;
an inter-category ranking module, configured to rank the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the ranking result of the intention categories; wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistically analyzing user logs in which environmental information is recorded; and
a return module, configured to return the information records output by the inter-category ranking module.
In another aspect, the present application also discloses an information search client, comprising:
a query receiving module, configured to receive a query string entered by the user;
an environment collection module, configured to collect the environmental information corresponding to the query string;
a sending module, configured to send the query string and the environmental information corresponding to the query string to an information search server; and
a display module, configured to display the information records returned by the information search server.
Preferably, the information search client further comprises:
a query log recording module, configured to record user identification information, the query string, and the corresponding web page operation history and environmental information in a query log, the web page operation history being a record of the web pages that the user operated on among the information records corresponding to the query string.
Compared with the prior art, the present application has the following advantages:
First, the present application ranks the intention categories according to their distribution under the environmental information corresponding to the query string, and adjusts the order of the information records according to the ranking result of the intention categories. Users have different information needs under different environmental information, and intention categories correspond directly to information needs and reflect the different kinds of information the user needs. The ranking can therefore place first the intention categories that best reflect the information needs under the environmental information corresponding to the query string (hereinafter referred to as the current environmental information), so that the ranked information records are closer to the user's real information needs.
Second, when ranking the information records, the present application can also take into account the current user's interest in each intention category. Each user has a different level of interest in different intention categories. Ranking according to the current user's distribution of intention categories under the environmental information corresponding to the query string, which is obtained by statistically analyzing user logs in which environmental information and user identification information are recorded, places first the intention categories that interest the current user more. For the same query string, the prior art provides identical information records to all users without considering individual needs, whereas the present application can bring the ranked information records closer to the personalized real information needs that reflect the user's interests.
Furthermore, while ranking the intention categories and adjusting the order of their information records, the present application can also rank the information records within each intention category according to the current environmental information. Within each category, the web pages that best reflect the information needs under the current environmental information are placed first, so that the ranked information records are closer to the user's real information needs.
The technical solution of the present application can be applied to search engine services, browser services, and similar applications. It can provide information records closer to the user's real information needs, so that the user can quickly find the most interesting information.
Brief description of the drawings
Fig. 1 is a flowchart of an embodiment of a method for ranking information records according to the present application;
Fig. 2 is a flowchart of an embodiment of a search-engine-based information search method according to the present application;
Fig. 3 is a flowchart of an embodiment of a browser-based information recommendation method according to the present application;
Fig. 4 is an example diagram of multiple display regions in the embodiment shown in Fig. 3;
Fig. 5 is a structural diagram of an embodiment of an apparatus for ranking information records according to the present application;
Fig. 6 is a structural diagram of an embodiment of an information search server according to the present application;
Fig. 7 is a structural diagram of an embodiment of an information search client according to the present application.
Detailed description
To make the above objects, features, and advantages of the present application clearer and easier to understand, the application is described in further detail below with reference to the accompanying drawings and specific embodiments.
The embodiments of the present application rank information records with respect to environmental information. Because this captures the different information needs that users have under different environmental information, the information records can be brought closer to the user's real information needs.
In the embodiments of the present application, environmental information mainly refers to information about the user's surroundings, and may specifically include time environment information, location environment information, temperature environment information, hardware environment information, and so on.
Under different environmental information, the user's information needs are often different. Taking time environment information as an example: the morning is the start of a new day, so in the morning the user has a need for news; during working hours, work comes first and Internet use is secondary, so at work the user has a need for web pages and pictures; the evening is a time for relaxation and entertainment, so in the evening the user has a need for music and video; and so on.
Taking the geographic environment as an example: Internet cafés and homes are places for relaxation and entertainment, so in an Internet café or at home the user generally has a need for video, games, music, and similar information. An office is not suited to excessive entertainment, so news, pictures, and similar information are enough for the user there. Airports, stations, hotels, and similar places have a highly mobile population, and users there generally care about travel, weather, and similar information. Even if the user has explicitly expressed a need for video, considering that offices are unsuited to excessive entertainment while Internet cafés and homes are suited to it, it can be assumed that in a work environment the user wants to watch film clips, whereas in an Internet café or at home the user wants to watch complete high-definition videos.
In summary, those skilled in the art can use one or more of the above kinds of environmental information according to actual needs, and can subdivide the kinds of environmental information they use. For example, time environment information can be subdivided into daytime and night, or into early morning, working hours, and evening; location environment information can be subdivided into Internet café, office, home, airport, station, hotel, and so on. The present application does not limit the specific manner of subdivision.
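The subdivision of time environment information described above can be sketched as a simple mapping from a timestamp to a coarse label. This is only an illustration: the bucket boundaries, the label names, and the function name `time_environment` are assumptions, since the application leaves the segmentation open.

```python
from datetime import datetime

# Hypothetical bucket boundaries (hour ranges); the application does not fix them.
TIME_BUCKETS = [
    (5, 9, "early_morning"),
    (9, 18, "working_hours"),
    (18, 24, "evening"),
]


def time_environment(ts: datetime) -> str:
    """Map a timestamp to a coarse time-environment label."""
    for start, end, label in TIME_BUCKETS:
        if start <= ts.hour < end:
            return label
    return "night"  # hours 0-4 fall outside the buckets above
```

A location label (Internet café, office, home, and so on) could be attached in the same way once a location has been resolved, for instance from the IP address.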
To associate the user's various information needs with the information records in network data, the present application can use a classification-based approach and add intention category labels to information records, so that different intention categories correspond to different information needs. Ranking the retrieved information records according to environmental information is thereby converted into ranking the intention categories according to environmental information.
To rank intention categories according to environmental information, the present application uses methods from probability theory and mathematical statistics to compute the regularities in the distribution of the intention categories under the environmental information corresponding to the query string. Specifically, offline, user logs are statistically analyzed to obtain the distribution of the intention categories under the environmental information corresponding to the query string; online, at ranking time, the information records of each intention category are ranked according to that distribution.
Because the embodiments of the present application use a number of probability symbols, Table 1 explains the name, meaning, and method of obtaining each symbol for ease of understanding.
Table 1
Referring to Fig. 1, a flowchart of an embodiment of a method for ranking information records according to the present application is shown. The method may specifically include:
Step 101: collect the environmental information corresponding to the query string;
In the embodiments of the present application, the information records of each intention category are ranked according to the distribution of the intention categories under the environmental information corresponding to the query string. Users have different information needs under different environmental information, and intention categories are directly linked to information needs and reflect the user's needs for different categories of information. The ranking therefore places first the information records of the intention categories that better meet the information needs under the environmental information corresponding to the query string (hereinafter referred to as the current environmental information). The ranked information records are thus closer to the user's real information needs and more convenient for the user.
Environmental information mainly refers to information about the user's surroundings. Even for the same user, the surroundings may change; time environment information is a typical example. For this reason, whether the query string is entered by the user or constructed from the web page the user is currently browsing, the environmental information corresponding to it is real-time in nature, and the present application therefore collects the environmental information corresponding to each query string.
For a query string, whether entered by the user or constructed, the time at which it is received or constructed is the corresponding time environment information; the location obtained from the IP (Internet Protocol) address is the corresponding location environment information; the temperature corresponding to the time environment information and the location environment information is the temperature environment information; and so on. The present application does not limit the specific method of collecting the environmental information corresponding to the query string.
Step 102: obtain the information records of each intention category according to the query string;
In a preferred embodiment of the present application, the step of obtaining the information records of each intention category according to the query string may specifically include:
first searching network data according to the query string to obtain the corresponding information records, and then classifying the information records according to preset intention categories to obtain the information records of each intention category, the intention categories being preset according to the labels that users across the network have applied to the web pages corresponding to the information records;
and/or searching, according to the query string, network data that carries intention category labels, to obtain the information records of each intention category. That is, the query string is submitted to the search engine corresponding to each intention category, and the search results returned by each engine, which carry that engine's intention category label, form the information records of each intention category. The categories of the search engines on the network are objective facts: for example, mp3.baidu.com is a search engine for the music category, news.sogou.com is a search engine for the news category, and video.baidu.com is a search engine for the video category. The information records of the corresponding intention category can be obtained directly from these search engines, so the intention categories of the present application are objectively existing attributes of the network data.
In the embodiments of the present application, the intention categories are mainly used to distinguish the different information needs among the information records. In a preferred embodiment of the present application, they may specifically include categories such as video, picture, information, resource, review, or price comparison. In practice, those skilled in the art can also divide the information records into other intention categories according to actual needs in order to distinguish different information needs; the present application does not limit the specific method of classifying information records.
Step 103: rank the intention categories according to the distribution of the intention categories under the environmental information corresponding to the query string, and adjust the order of the information records according to the ranking result of the intention categories; wherein the distribution of the intention categories under the environmental information corresponding to the query string is obtained by statistically analyzing user logs in which environmental information is recorded.
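Step 103 can be sketched as a stable sort over labelled records. This is a minimal illustration, assuming the records arrive in base-weight order as (intention category, record) pairs and that the distribution P(Ic | T) has already been computed offline; the function name `rerank_by_intent` and the data layout are hypothetical.

```python
def rerank_by_intent(records, intent_dist):
    """Reorder records so that categories with higher P(Ic | T) come first.

    records: list of (intent_category, record) pairs in base-weight order.
    intent_dist: dict mapping intent category -> P(Ic | T).
    Categories missing from intent_dist are treated as probability 0.
    """
    # sorted() is stable, so records within one category keep their
    # original base-weight order.
    return sorted(records, key=lambda pair: -intent_dist.get(pair[0], 0.0))
```

Because the sort is stable, this sketch changes only the order between categories; ranking within a category would need a separate key.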
In practice, user logs such as browser logs or search engine query logs can be chosen for statistical analysis according to the needs of the application. For example, a search engine generally maintains a query log and a browser client generally maintains a browser log; the present application adds environmental information to the existing query log or browser log.
In a preferred embodiment of the present application, the user logs include browser logs and/or query logs. A browser log records user identification information, the history of browsed web pages, and the corresponding environmental information. A query log records user identification information, query strings, and the corresponding web page operation history and environmental information, the web page operation history being the record of the web pages that the user operated on among the information records corresponding to a query string.
P(Ic | T) denotes the distribution of intention category Ic under the environmental information T corresponding to the query string. Using probability theory and mathematical statistics, it can be derived as follows:
$$P(I_c \mid T) = \sum_d p(I_c, d \mid T) = \sum_d \frac{p(I_c, d, T)}{p(T)} = \sum_d \frac{p(I_c \mid d, T)\, p(d, T)}{p(T)} = \sum_d \frac{p(I_c \mid d, T)\, p(d \mid T)\, p(T)}{p(T)} = \sum_d p(d \mid T)\, p(I_c \mid d, T) \qquad (1)$$
where the sum over d marginalizes the joint distribution p(Ic, d | T) over the web pages d.
In a preferred embodiment of the present application, the user logs in which environmental information is recorded can be statistically analyzed as follows to obtain the distribution of the intention categories under the environmental information corresponding to the query string:
Sub-step A1: under the environmental information corresponding to the query string, statistically analyze the web pages across the network according to the user logs, to obtain the web page distribution p(d | T) under that environmental information;
When user logs are used for the statistics, p(d) can be computed under the environmental information corresponding to the query string. With a query log, p(d) can be computed by the following formula:
$$p(d) = \sum_x p(d \mid x)\, p(x) \qquad (2)$$
where x denotes a record in the query log.
With a browser log, p(d) can be computed, for example, as follows: count the number of times a web page d appears in the browser log; in some cases, this count can be divided by the number of times all web pages appear in the browser log.
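The browser-log counting just described can be sketched as follows. The sketch assumes the log has been reduced to (page, environment label) pairs and normalizes the counts, which the text presents as optional; the function name `page_distribution` and the log layout are hypothetical.

```python
from collections import Counter


def page_distribution(log, env):
    """Estimate p(d | T) as the share of visits under `env` that went to each page.

    log: iterable of (page, environment) pairs taken from a browser log.
    env: the environment label T to condition on.
    """
    counts = Counter(page for page, e in log if e == env)
    total = sum(counts.values())
    # With no visits under this environment there is nothing to estimate.
    return {page: n / total for page, n in counts.items()} if total else {}
```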
Sub-step A2: under the environmental information corresponding to the query string, for a particular web page, statistically analyze the intention categories according to the user logs, to obtain the intention category distribution p(Ic | d, T) of that web page under the environmental information;
In a specific implementation, p(Ic) can be computed first:
1. Taking a browser log as an example, suppose five bars represent five intention categories Ic. If a web page belongs to one or more intention categories, each corresponding bar is increased by 1. The resulting values of the bars give the probability distribution over the intention categories Ic;
2. With a query log, p(Ic) can be computed by the following formula:
$$p(I_c) = \sum_x p(I_c \mid x)\, p(x) \qquad (3)$$
Computing p(Ic) for a particular web page under the environmental information T corresponding to the query string then yields p(Ic | d, T).
Sub-step A3: taking web pages as the statistical samples, sum over the web pages the product of the web page distribution under the environmental information corresponding to the query string and the intention category distribution of each web page under that environmental information, to obtain the distribution of each intention category under the environmental information corresponding to the query string, that is, P(Ic | T) = Σ_d p(d | T) p(Ic | d, T).
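Sub-step A3 amounts to a weighted sum over web pages. The following sketch assumes the two input distributions from sub-steps A1 and A2 are available as dictionaries; the function name `intent_distribution` and the dictionary layout are hypothetical.

```python
from collections import defaultdict


def intent_distribution(p_d_given_t, p_i_given_dt):
    """Compute P(Ic | T) = sum over d of p(d | T) * p(Ic | d, T).

    p_d_given_t: dict page -> p(d | T)                      (sub-step A1)
    p_i_given_dt: dict page -> {intent: p(Ic | d, T)}       (sub-step A2)
    """
    dist = defaultdict(float)
    for page, p_d in p_d_given_t.items():
        # Pages without an intent distribution contribute nothing.
        for intent, p_i in p_i_given_dt.get(page, {}).items():
            dist[intent] += p_d * p_i
    return dict(dist)
```

For instance, with p(a | T) = 0.75 and p(b | T) = 0.25, where page a is purely video and page b is half video and half news, the video category receives 0.75 + 0.125 = 0.875 and news receives 0.125.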
For the same query string, the prior art displays the same information records to all users, without considering the individual needs of each user.
To address this problem, in a preferred embodiment of the present application, in addition to the current environmental information, the information records of each intention category can also be ranked according to the current user's interest in the intention categories. Correspondingly, the method may further include:
identifying the user identification information of the current user corresponding to the query string;
ranking the intention categories according to the current user's distribution of intention categories under the environmental information corresponding to the query string, and adjusting the order of the information records according to the ranking result of the intention categories; wherein the current user's distribution of intention categories under the environmental information corresponding to the query string is obtained by statistically analyzing user logs in which environmental information and user identification information are recorded.
Setting environmental information aside, different users have different levels of interest in different intention categories. For example, user A is fond of variety shows and every day uses a search engine and/or browser to watch the desired variety shows in video form, while user B is fond of celebrity pictures and habitually obtains the desired celebrity pictures by searching and/or browsing.
This preferred embodiment uses methods from probability theory and mathematical statistics to study the regularities in users' interest in intention categories, and combines them with the regularities in the distribution of intention categories under the environmental information corresponding to the query string. What this preferred embodiment ultimately computes is the user's distribution of intention categories under the environmental information corresponding to the query string.
Different users have different levels of interest in different intention categories. Ranking according to the current user's distribution of intention categories under the environmental information corresponding to the query string, which is obtained by statistically analyzing user logs in which environmental information and user identification information are recorded, places first the intention categories that interest the current user. The present application can therefore bring the information records closer to the personalized real information needs that reflect the user's interests.
P(Ic| T, u) can be used for representing that active user is respectively intended under the environmental information that described query string is corresponding The distribution of classification, its available following formula is weight averaged statistics and obtains:
P(Ic|T,u)∝λP(T|Ic)P(Ic)+(1-λ)P(T|Ic,u)P(Ic|u) (4)
Wherein, u represents ID (userid), owing to all can record user's mark in every user journal Know, so, just can obtain all access records of each u, and then, the access record for u is added up P (Ic) i.e. can get P (Ic| u), P (Ic| u) can reflect that refer to user u is respectively intended to categorical distribution;λ For random factor.
Performing the statistical operation for $p(T)$ restricted to a specific intent category $I_c$ yields $p(T \mid I_c)$. $p(T)$ can be calculated as follows:
$$p(T) = \sum_{d} p(d, T) \qquad (5)$$
where
$$p(d, T) = p(T \mid d)\, p(d) \qquad (6)$$
Here, $p(T \mid d)$ can be calculated as the ratio of the number of occurrences of webpage $d$ that fall under the environmental information to the total number of occurrences of webpage $d$. Performing the statistical operation for $p(T)$ restricted to a specific user $u$ and a specific intent category $I_c$ yields $P(T \mid I_c, u)$. The random factor $\lambda$ weights the distribution of the intent categories for all users under the environmental information corresponding to the query string against the distribution of the intent categories for the current user under the same environmental information; the value of $\lambda$ can be determined according to actual requirements.
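Formulas (5) and (6) can be illustrated with a minimal Python sketch. The log layout (a list of `(page, env)` pairs) and the function name `env_probability` are assumptions made for illustration only and are not prescribed by the present application.

```python
from collections import Counter

def env_probability(log, env):
    """Estimate p(T) = sum over d of p(T|d) * p(d) from (page, env) log rows."""
    page_total = Counter(page for page, _ in log)                # occurrences of each page d
    page_in_env = Counter(page for page, e in log if e == env)   # occurrences of d under T
    n = len(log)
    total = 0.0
    for page, count in page_total.items():
        p_d = count / n                          # p(d)
        p_t_given_d = page_in_env[page] / count  # p(T|d): ratio described in the text
        total += p_t_given_d * p_d               # p(d, T), formula (6)
    return total                                 # formula (5)

# Hypothetical log: page "a" seen once at work and once at home, page "b" twice at work.
log = [("a", "work"), ("a", "home"), ("b", "work"), ("b", "work")]
print(env_probability(log, "work"))
```

Under the same assumptions, restricting `log` to the rows of one intent category before calling the function gives an estimate of $p(T \mid I_c)$, and additionally restricting it to one user gives $P(T \mid I_c, u)$.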
For example, the user log information within $T$ can be labeled manually, the labeled content being the intent category, and $\lambda$ can then be adjusted; the value of $\lambda$ that gives the best intent description accuracy is selected. Here, $T$ can be the same time period $T$ in the user logs of multiple days.
Specifically, a reference answer for the intent category ranking is labeled manually, and $\lambda$ is varied over {0.1, 0.2, ..., 0.9}. For each $\lambda$, the right-hand side of formula (4) is evaluated, and the intent category ranking computed by the formula is compared against the reference answer, which gives the accuracy of the formula for that particular $\lambda$. The $\lambda$ value with the highest accuracy is the value finally chosen. The accuracy can be measured with NDCG (Normalized Discounted Cumulative Gain), a measure of the effectiveness of a search engine or related program; the relevance score of the top $k$ results is computed as:
$$NDCG(k) = G_{\max,i}^{-1}(k) \sum_{j:\, \pi_i(j) \le k} \frac{2^{y_{i,j}} - 1}{\log_2\left(1 + \pi_i(j)\right)}$$
where $i$ denotes the $i$-th search; $j$ denotes the $j$-th result; $y_{i,j}$ denotes the relevance label of the $j$-th result, on a 5-level scale; and $\pi_i(j)$ denotes the position of that result in the ranking.
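The NDCG computation can be sketched as follows. This is a minimal illustration that assumes $G_{\max,i}(k)$ is the discounted cumulative gain of the ideal (descending-label) ordering, as in the usual NDCG definition; the function names are hypothetical.

```python
import math

def dcg_at_k(grades, k):
    """Discounted cumulative gain; grades[j] is the label of the result at rank j + 1."""
    return sum((2 ** y - 1) / math.log2(1 + pos)
               for pos, y in enumerate(grades[:k], start=1))

def ndcg_at_k(grades, k):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0

# Labels of the results in ranked order for one search (hypothetical values).
print(ndcg_at_k([1, 3, 0, 2], k=3))
```

To choose $\lambda$, one could evaluate this score for the ranking produced under each candidate value in {0.1, ..., 0.9} and keep the value with the highest score.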
As another example, the value of $\lambda$ can also be set directly, for example to 0.6 or 0.8; the present application places no limitation on the specific value of $\lambda$.
In a preferred embodiment of the present application, the identity of the user can be identified as follows:
When the user is registered and logged in, the user ID is used as the user identification information of the user; when the user browses without being logged in, the user identification information is identified from the user's cookie (a small text file used to store private information). In practical applications, for a website that requires registration and login with a user ID, the unique user identifier can be selected in the following order: the user ID prevails when the user is logged in, and the user's cookie prevails when the user browses without being logged in.
Cookie-based user identification is a typical existing user identification method. Obtaining the user cookie through a custom Apache log format or through JavaScript already provides a very effective means of user identification. As long as it has not been cleared, a cookie can be regarded as bound to a particular accessing client computer, so cookie-based user identification is highly accurate. For example, for a user registered on Taobao, cookie information is stored on the C drive of the user's computer. When the user visits Taobao again, the Taobao system reads the cookie information from the designated path; if it is found, the login name can be obtained even if the user has not logged in, and if it is not found, new cookie information is created on the user's computer. At present, most users do not clear their cookie information, so this technique can be used to obtain the user's identity.
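The identifier selection order described above can be sketched in a few lines. The function name and the `uid:` / `cookie:` prefixes are hypothetical and serve only to keep the two identifier spaces apart in this illustration.

```python
from typing import Optional

def user_identifier(login_id: Optional[str], cookie_id: Optional[str]) -> Optional[str]:
    """Prefer the login ID; fall back to the cookie when the user is not logged in."""
    if login_id:
        return f"uid:{login_id}"
    if cookie_id:
        return f"cookie:{cookie_id}"
    return None  # neither is available, so the user cannot be identified

print(user_identifier(None, "abc123"))
```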
In a preferred embodiment of the present application, the distribution of the intent categories for the current user under the environmental information corresponding to the query string can be obtained by statistical analysis of user logs that record environmental information, as follows:
Sub-step B1: statistically analyze the user logs to obtain the distribution of each intent category and the distribution of each piece of environmental information under a specific intent category, and from these obtain the distribution of the intent categories for all users under the environmental information corresponding to the query string: $P(I_c \mid T) \propto P(T \mid I_c)\, P(I_c)$,
where $\propto$ denotes equivalence up to a normalizing constant;
Sub-step B2: statistically analyze the logs of the current user to obtain the distribution of the intent categories for the current user, $P(I_c \mid u)$, and the distribution of each piece of environmental information for the current user under a specific intent category, $P(T \mid I_c, u)$, and from these obtain the preliminary distribution of the intent categories for the current user under the environmental information corresponding to the query string: $P(T \mid I_c, u)\, P(I_c \mid u)$;
Sub-step B3: apply linear weighting to the distribution of the intent categories for all users under the environmental information corresponding to the query string and the preliminary distribution of the intent categories for the current user under the same environmental information, to obtain the distribution of the intent categories for the current user under the environmental information corresponding to the current query string: $P(I_c \mid T, u) \propto \lambda\, P(T \mid I_c)\, P(I_c) + (1-\lambda)\, P(T \mid I_c, u)\, P(I_c \mid u)$.
When there is no log for the current user, that is, when the user is browsing for the first time, $\lambda = 1$, and the distribution of the intent categories for the current user under the environmental information corresponding to the query string is the distribution of the intent categories for all users under the current environmental information.
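The linear weighting of sub-step B3, including the fallback to $\lambda = 1$ for a first-time user, can be sketched as follows. The dictionary layout and the final normalization are assumptions of this sketch; the formula itself only states proportionality.

```python
def blend_intent_distribution(global_scores, user_scores, lam):
    """Combine all-user and current-user scores per intent category, formula (4).

    global_scores[c] stands for P(T|Ic) * P(Ic); user_scores[c] for P(T|Ic,u) * P(Ic|u).
    """
    if not user_scores:   # no log for the current user: use lambda = 1
        lam = 1.0
        user_scores = {}
    raw = {c: lam * g + (1 - lam) * user_scores.get(c, 0.0)
           for c, g in global_scores.items()}
    z = sum(raw.values())
    # Normalizing is an assumption; ranking only needs the relative order.
    return {c: v / z for c, v in raw.items()} if z > 0 else raw

# Hypothetical scores for two intent categories.
print(blend_intent_distribution({"news": 0.3, "video": 0.1},
                                {"news": 0.0, "video": 0.4}, lam=0.5))
```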
In the foregoing, the information records of the intent categories are ranked according to the distribution of the intent categories under the current environmental information, or according to the distribution of the intent categories for the current user under the current environmental information, so as to adjust the order of the information records across intent categories. In a preferred embodiment of the present application, the information records within each intent category can also be ranked according to the current environmental information; accordingly, the method may further include:
ranking the information records within each intent category according to the webpage distribution of a specific intent category under the environmental information corresponding to the query string; wherein the webpage distribution of the specific intent category under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information.
This preferred embodiment also takes environmental information into account when ranking the information records within each intent category, so that within the information records of each intent category, the webpages that better reflect the information needs under the current environmental information are placed first, bringing the information records closer to the user's real information needs.
For example, the video intent category may have many information records, including various film clip video resources and various high-definition video resources. If the current environmental information is ignored and the high-definition video resources are simply placed first, the user may be embarrassed, because a user in an office setting should not indulge in entertainment. Because this preferred embodiment takes the current environmental information into account, it can bring the information records closer to the user's real information needs.
$p(d \mid I_c, T)$ denotes the webpage distribution of intent category $I_c$ under environmental information $T$. According to probability theory and mathematical statistics, it can be derived as follows:
$$p(d \mid I_c, T) = \frac{p(I_c, d, T)}{p(T, I_c)} = \frac{p(I_c, d, T)}{p(I_c \mid T)\, p(T)} \qquad (7)$$
where $p(I_c, d, T)$ is the joint distribution of the environmental information corresponding to the query string, the specific intent category, and the webpage, which can be obtained from:
$$p(I_c, d, T) = p(I_c \mid d, T)\, p(T \mid d)\, p(d) \qquad (8)$$
where $p(I_c \mid d, T)$ is the distribution of a particular webpage $d$ over the intent categories $I_c$ under the environmental information $T$ corresponding to the query string, $p(T \mid d)$ is the distribution of webpage $d$ under the environmental information $T$ corresponding to the query string, and $p(d)$ is the webpage distribution; all of these can be obtained directly by statistics from the browser log;
$p(T, I_c)$ is the joint distribution of the environmental information corresponding to the query string and the specific intent category, which can be expressed as:
$$p(T, I_c) = p(T \mid I_c)\, p(I_c) = p(I_c \mid T)\, p(T) \qquad (9)$$
In a preferred embodiment of the present application, the webpage distribution of a specific intent category under the environmental information corresponding to the query string can be obtained by statistical analysis of user logs that record environmental information, as follows:
Sub-step C1: statistically analyze the user logs to obtain the distribution of the webpages in the whole network, the intent category distribution of a particular webpage under the environmental information corresponding to the query string, and the distribution of the webpages under the environmental information corresponding to the query string;
Computing the statistic $p(I_c)$ for a particular webpage under the environmental information $T$ corresponding to the query string yields $p(I_c \mid d, T)$;
$p(T \mid d)$ is the distribution of webpage $d$ under the environmental information $T$ corresponding to the query string, and is obtained by performing the statistical operation for $p(T)$ restricted to webpage $d$, where $p(T)$ can be calculated using formula (5).
Sub-step C2: construct the joint distribution of the environmental information corresponding to the query string, the specific intent category, and the webpages in the whole network, from the distribution of the webpages in the whole network, the intent category distribution of a particular webpage under the environmental information corresponding to the query string, and the distribution of the webpages under the environmental information corresponding to the query string;
Sub-step C3: obtain the webpage distribution of the specific intent category under the environmental information corresponding to the query string as the ratio of the joint distribution of the environmental information corresponding to the query string, the specific intent category, and the webpages in the whole network to the joint distribution of the environmental information corresponding to the query string and the specific intent category.
In practical applications, the joint distribution of the environmental information corresponding to the query string and the specific intent category can be calculated as the product of the distribution of the specific intent category under the environmental information corresponding to the query string and the distribution of that environmental information, $p(T, I_c) = p(I_c \mid T)\, p(T)$; alternatively, it can be calculated as the product of the distribution of the environmental information corresponding to the query string under the specific intent category and the distribution of the specific intent category, $p(T, I_c) = p(T \mid I_c)\, p(I_c)$. The statistical methods for $p(I_c)$, $p(T)$ and $p(I_c \mid T)$ have been described above, and performing the statistical operation for $p(T)$ restricted to a specific intent category $I_c$ yields $p(T \mid I_c)$.
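Sub-steps C1 to C3 can be sketched from raw counts as follows. The log layout (rows of `(page, category, env)`) and the function name are assumptions for illustration; in this sketch the denominator $p(T, I_c)$ is obtained by summing the joint distribution of formula (8) over all webpages, which is one consistent way to evaluate formula (9).

```python
from collections import Counter

def page_distribution(log, category, env):
    """Estimate p(d | Ic, T) from (page, category, env) log rows via formulas (7)-(8)."""
    n = len(log)
    n_d = Counter(d for d, _, _ in log)                      # occurrences of page d
    n_dt = Counter(d for d, _, t in log if t == env)         # occurrences of d under T
    n_dct = Counter(d for d, c, t in log if c == category and t == env)
    joint = {}
    for d, cnt in n_dct.items():
        p_c_given_dt = cnt / n_dt[d]       # p(Ic | d, T)
        p_t_given_d = n_dt[d] / n_d[d]     # p(T | d)
        p_d = n_d[d] / n                   # p(d)
        joint[d] = p_c_given_dt * p_t_given_d * p_d          # formula (8)
    p_tc = sum(joint.values())             # p(T, Ic), marginalized over pages
    return {d: v / p_tc for d, v in joint.items()} if p_tc > 0 else {}

# Hypothetical log rows.
log = [("review", "video", "work"), ("review", "video", "work"),
       ("download", "video", "work"), ("download", "video", "home"),
       ("still", "picture", "work")]
print(page_distribution(log, "video", "work"))
```

Sorting the pages of one intent category by the returned values in descending order gives the within-category order described above.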
To help those skilled in the art better understand the method of the present application for ranking information records, its practical application is described below by way of examples.
Example 1: ranking of information records in the information search service of a search engine.
Referring to Fig. 2, which shows a flowchart of an embodiment of a search-engine-based information search method of the present application, the method may specifically include:
Step 201: the information search client receives a query string input by the user;
Step 202: the information search client collects the environmental information corresponding to the query string;
Step 203: the information search client sends the query string and the environmental information corresponding to the query string to the information search server;
Step 204: the information search server receives the query string and the environmental information corresponding to the query string from the information search client;
Step 205: the information search server searches the network data according to the query string to obtain the information records of each intent category;
Step 206: the information search server ranks the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, and adjusts the order of the information records according to the ranking result of the intent categories; wherein the distribution of the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information;
Step 207: the information search server returns the ranked search results to the information search client;
Step 208: the information search client presents the search results returned by the information search server.
In existing information search services, the presentation of search results is not adjusted according to environmental information. In this example, query logs under different environmental information are statistically analyzed, and the search results of the intent categories are ranked according to the resulting distribution of the intent categories under the current environmental information, thereby realizing personalized presentation of search results based on environmental information. Search results closer to the user's real information needs can thus be provided, making it easier for the user to find the information of greatest interest.
A specific example follows:
For ease of description, the environmental information is divided by time only (working time $T_1$ and non-working time $T_2$). After receiving the query string "A Chinese Ghost Story", denoted $x$, the client sends $x$ and $T_1$ to the server. The server searches the database according to $x$ and obtains a set of webpages carrying intent category labels, and then ranks this set by intent category using the distribution $P(I_c \mid T_1)$ of the intent categories under the current environmental information $T_1$. For example, after ranking, the intent category order of the search results for "A Chinese Ghost Story" under environmental information $T_1$ is "information, film and television, picture, game, ...".
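The between-category ranking in this example can be sketched as a stable sort keyed by $P(I_c \mid T_1)$. The record layout, the function name, and the probability values below are hypothetical.

```python
def rank_records_by_category(records, category_dist):
    """Order (title, category) records by the probability of their intent category.

    Python's sort is stable, so records of the same category keep their relative order.
    """
    return sorted(records,
                  key=lambda rec: category_dist.get(rec[1], 0.0),
                  reverse=True)

# Hypothetical P(Ic | T1) for working time and a few labeled search results.
dist_t1 = {"information": 0.4, "film and television": 0.3, "picture": 0.2, "game": 0.1}
records = [("trailer", "film and television"), ("wallpaper", "picture"),
           ("remake announced", "information"), ("film review", "film and television")]
for title, category in rank_records_by_category(records, dist_t1):
    print(category, "|", title)
```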
In summary, whereas existing information search services provide uniform search results without considering environmental information, the present application makes the search results more targeted and more personalized, helps the user quickly find the information of greatest interest, and reduces the system resources the user consumes during the search process.
As a preferred embodiment, step 205 may specifically include: first searching the network data according to the query string to obtain the corresponding information records, and then classifying the information records by intent category to obtain the information records of each intent category; and/or searching, according to the query string, the network data carrying each intent category label, to obtain the information records of each intent category.
As a preferred embodiment, the information search method may further include:
Step D: rank the information records within each intent category according to the webpage distribution of a specific intent category under the environmental information corresponding to the query string; wherein the webpage distribution of the specific intent category under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information.
Step D can be performed before or after step 206; whichever of step D and step 206 is performed later outputs the ranking result to step 207. In the example above, the records under each intent category can be ranked according to $p(d \mid I_c, T_1)$; for example, under environmental information $T_1$, within the "film and television" intent category, the "A Chinese Ghost Story film review" page is ranked before the "A Chinese Ghost Story video download" page.
As a preferred embodiment, the information search method may further include:
Step E1: identify the user identification information of the current user corresponding to the query string;
Step E2: rank the intent categories, and adjust the order of the information records according to the ranking result of the intent categories.
Step E2 replaces step 206, so that the result of step E2 is output to step 207.
As a preferred embodiment, the information search method may further include a relevance ranking step: according to the relevance between the search results and the query string, rank the information records output by the information search module in descending order of relevance. The relevance ranking can be performed before or after step 206, and the final result of the relevance ranking step and step 206 is output to step 208.
It should be noted that the information service client can also record the user ID, the query string, the corresponding accessed webpages, and the environmental information to the query log, where the accessed webpages are the webpages in the information records that the user has accessed.
Example 2: ranking of information records in an information recommendation service, where the information records take the form of recommendation results.
Referring to Fig. 3, which shows a flowchart of an embodiment of a browser-based information recommendation method of the present application, the method may specifically include:
Step 301: construct a query string according to the user's input or the webpage the user is currently browsing;
Step 302: collect the environmental information corresponding to the user's input or to the webpage the user is currently browsing, as the environmental information corresponding to the query string;
Step 303: search the network data according to the query string to obtain the recommendation results of each intent category;
Step 304: rank the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, and adjust the order of the recommendation results according to the ranking result of the intent categories; wherein the distribution of the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information;
Step 305: present the ranked recommendation results of the intent categories.
Existing information recommendation services do not consider environmental information when ranking recommendation results, whereas the present application adjusts the recommendation results according to the current environmental information before displaying them, and can thus realize personalized browsing recommendations.
Corresponding examples:
Example 1. In the morning, recommendations of the news category are ranked first; during working hours, recommendations of the web and picture categories are ranked first; in the evening, recommendations of the video and music categories are ranked first.
Example 2. In an Internet cafe, recommendations of categories such as video, game, and music are ranked first; in an office, recommendations of categories such as news and picture are ranked first; at airports, stations, hotels, and similar places, recommendations of categories such as travel and weather are ranked first; and so on.
Example 3. For the same video query need, in a working environment, film clips are ranked first; in environments such as an Internet cafe or the home, high-definition, complete video resources are ranked first; and so on.
The whole flow is described below with an example:
When the user browses a webpage related to "Wang Xiaochuan", the query string "Wang Xiaochuan" is constructed from the webpage title, URL, and body text; then "Wang Xiaochuan" is searched in intent categories such as "information, picture, film and television", and the search results under each intent category are returned; then the intent categories are ranked according to $P(I_c \mid T_1)$.
As a preferred embodiment, the browser-based information recommendation method may further include:
Step F: rank the browsing information within each intent category according to the webpage distribution of a specific intent category under the environmental information corresponding to the query string; wherein the webpage distribution of the specific intent category under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information.
Step F can be performed before or after step 304; whichever of step F and step 304 is performed later outputs the ranking result to step 305. In the example above, the results under each intent category can be ranked according to $p(d \mid I_c, T_1)$; for example, under environmental information $T_1$, within the "information" intent category, the "Sogou Browser leads Wang Xiaochuan to success" page is ranked before the "Sogou CEO Wang Xiaochuan evangelizes the Internet" page.
As a preferred embodiment, the browser-based information recommendation method may further include:
Step G1: identify the user identification information of the current user corresponding to the query string;
Step G2: rank the intent categories according to the distribution of the intent categories for the current user under the environmental information corresponding to the query string, and adjust the order of the recommendation results according to the ranking result of the intent categories; wherein the distribution of the intent categories for the current user under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information and user identification information.
Step G2 replaces step 304, so that the result of step G2 is output to step 305.
In summary, this preferred embodiment can realize a personalized information recommendation service based on environmental information and user identification information, and recommends more accurate and more personalized results.
In a preferred embodiment of the present application, the browser-based information recommendation method can also record the user ID, the currently browsed webpage, and the corresponding environmental information to the browser log; and/or record the user ID, the query string, and the corresponding webpage operation history and environmental information to the query log, where the webpage operation history is the record of the webpages that the user clicked, accessed, or otherwise operated on among the recommendation results corresponding to the query string.
In another preferred embodiment of the present application, step 305 may specifically be: presenting, in preset display regions, the recommendation results of the intent categories output by the between-class ranking module, where each display region presents the top several recommendation results of one intent category. Referring to Fig. 4, which shows an example of display regions of the present application, the "information", "picture", and "film and television" intent categories rank in the top three of the recommendation results and are displayed in their respective display regions.
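The region-based presentation can be sketched as follows. The number of regions, the number of results per region, and all names are hypothetical choices for illustration; the present application only states that each preset region shows the top several results of one intent category.

```python
def build_display_regions(ranked_categories, results_by_category,
                          num_regions=3, per_region=2):
    """Assign the top-ranked intent categories to display regions, one per region."""
    regions = []
    for category in ranked_categories[:num_regions]:
        top_results = results_by_category.get(category, [])[:per_region]
        regions.append((category, top_results))
    return regions

# Hypothetical ranked categories and per-category results (already sorted).
ranked = ["information", "picture", "film and television", "game"]
results = {"information": ["n1", "n2", "n3"], "picture": ["p1"],
           "film and television": ["f1", "f2"], "game": ["g1"]}
for category, items in build_display_regions(ranked, results):
    print(category, items)
```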
In yet another preferred embodiment of the present application, a learning-to-rank method can be used to construct the query string according to the user's input or the webpage the user is currently browsing, which may specifically include:
Step H1: extract candidate phrases from the currently browsed webpage;
Here, candidate phrases can be extracted using steps such as Chinese word segmentation, named entity recognition, part-of-speech tagging, and TF-IDF (term frequency/inverse document frequency).
Step H2: select candidate words from the candidate phrases as the query string.
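The TF-IDF scoring that step H1 may rely on can be sketched as follows. The smoothed inverse document frequency used here is one common variant and is an assumption of this sketch, as are the token lists; word segmentation and named entity recognition are assumed to have been done beforehand.

```python
import math
from collections import Counter

def tfidf_scores(page_tokens, corpus):
    """Score each term of a page by term frequency times smoothed inverse document frequency.

    corpus is a list of sets, one set of terms per reference document.
    """
    tf = Counter(page_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, freq in tf.items():
        df = sum(1 for doc in corpus if term in doc)       # document frequency
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0      # smoothed idf (assumed variant)
        scores[term] = (freq / len(page_tokens)) * idf
    return scores

# Hypothetical segmented page and reference corpus.
page = ["sogou", "browser", "the"]
corpus = [{"the", "news"}, {"the", "sogou"}, {"the", "video"}]
scores = tfidf_scores(page, corpus)
print(sorted(scores, key=scores.get, reverse=True))
```

Terms with the highest scores would then serve as candidate phrases from which step H2 selects the query string.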
Learning-to-rank methods can be roughly divided into three classes: regression-based, classification-based, and ordinal-regression-based learning to rank. Among these, ordinal-regression-based algorithms are the focus of current learning-to-rank research. They include point-wise algorithms, represented by the perceptron ranking algorithm (PRank), the improved perceptron ranking algorithm (Large Margin PRank), and Support Vector Ordinal Regression, and pair-wise algorithms, represented by the ranking support vector machine (Rank SVM), RankBoost, and RankNet. The present application can use any of the above learning-to-rank methods to select, from the candidate phrases, the subset of intent phrases that can represent the current page.
Corresponding to the foregoing method for ranking information records, the present application further provides a device for ranking information records. Referring to Fig. 5, the device may specifically include:
an acquisition module 501, configured to collect the environmental information corresponding to the query string;
an information record acquisition module 502, configured to obtain the information records of each intent category according to the query string; and
a between-class ranking module 503, configured to rank the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the ranking result of the intent categories; wherein the distribution of the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information.
In a preferred embodiment of the present application, the user logs include browser logs and/or query logs. The browser log records user identification information, the history of browsed webpages, and the corresponding environmental information. The query log records user identification information, query strings, and the corresponding webpage operation history and environmental information, where the webpage operation history is the record of the webpages the user operated on among the information records corresponding to the query string.
In a preferred embodiment of the present application, the environmental information may specifically include time environment information, location environment information, temperature environment information, or hardware environment information. In a preferred embodiment of the present application, the intent categories may specifically include the video, picture, information, resource, comment, or price comparison category.
In a preferred embodiment of the present application, the device may further include:
a first statistical module, configured to obtain the distribution of the intent categories under the environmental information corresponding to the query string by statistical analysis of user logs that record environmental information, including:
a first statistical submodule, configured to statistically analyze, according to the user logs, the webpages in the whole network under the environmental information corresponding to the query string, to obtain the distribution of the webpages under the environmental information corresponding to the query string;
a second statistical submodule, configured to statistically analyze, according to the user logs, the intent categories of a particular webpage under the environmental information corresponding to the query string, to obtain the intent category distribution of the particular webpage under the environmental information corresponding to the query string; and
a summation submodule, configured to sum, with the webpage as the variable, the product of the webpage distribution under the environmental information corresponding to the query string and the intent category distribution of the particular webpage under the environmental information corresponding to the query string, to obtain the distribution of the intent categories under the environmental information corresponding to the query string.
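The summation performed by the summation submodule, $P(I_c \mid T) = \sum_d P(d \mid T)\, P(I_c \mid d, T)$, can be sketched as follows. The dictionary layout and the numbers are hypothetical; the two inputs stand for the outputs of the first and second statistical submodules.

```python
def intent_distribution(page_dist, intent_given_page):
    """Sum over pages d of P(d|T) * P(Ic|d,T) to obtain P(Ic|T).

    page_dist[d] is P(d|T); intent_given_page[d][c] is P(Ic = c | d, T).
    """
    result = {}
    for page, p_page in page_dist.items():
        for category, p_cat in intent_given_page.get(page, {}).items():
            result[category] = result.get(category, 0.0) + p_page * p_cat
    return result

# Hypothetical statistics for two pages under one environment T.
page_dist = {"a": 0.6, "b": 0.4}
intent_given_page = {"a": {"video": 1.0}, "b": {"video": 0.5, "information": 0.5}}
print(intent_distribution(page_dist, intent_given_page))
```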
In another preferred embodiment of the present application, the device may further include:
An identification module, configured to identify the user identification information of the current user corresponding to the query string;
An interest-degree inter-class sorting module, configured to sort the intent categories according to the current user's distribution of intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the user's distribution of intent categories under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information and user identification information.
In yet another preferred embodiment of the present application, the device may further include:
A second statistics module, configured to analyze user logs that record environmental information and compile statistics to obtain the user's distribution of intent categories under the environmental information corresponding to the query string, which may specifically include:
A third statistics submodule, configured to analyze the user logs and compile statistics to obtain the distribution of intent categories and the distribution of each item of environmental information under a particular intent category, and from these to obtain the distribution of intent categories for all users under the environmental information corresponding to the query string;
A fourth statistics submodule, configured to analyze the current user's logs and compile statistics to obtain the current user's distribution of intent categories and the distribution of each item of environmental information under a particular intent category of the current user, and from these to obtain the current user's preliminary distribution of intent categories under the environmental information corresponding to the query string; and
A linear weighting submodule, configured to apply linear weighting to the distribution of intent categories for all users under the environmental information corresponding to the query string and the user's preliminary distribution of intent categories under that environmental information, obtaining the current user's distribution of intent categories under the environmental information corresponding to the current query string.
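The linear weighting step can be sketched as follows. This is only an illustration under stated assumptions: the application does not say how the weights are chosen, so the single `user_weight` parameter, the function name, and the example figures are all hypothetical.

```python
def blend_distributions(all_users, current_user, user_weight=0.5):
    """Linearly weight the all-users distribution and the current user's
    preliminary distribution of intent categories.

    user_weight is a hypothetical tuning parameter in [0, 1]; the source
    does not specify how the weights are set.
    """
    if not 0.0 <= user_weight <= 1.0:
        raise ValueError("user_weight must lie in [0, 1]")
    intents = set(all_users) | set(current_user)
    return {
        intent: (1.0 - user_weight) * all_users.get(intent, 0.0)
        + user_weight * current_user.get(intent, 0.0)
        for intent in intents
    }


# Hypothetical distributions under one environment.
all_users = {"video": 0.75, "news": 0.25}
current_user = {"video": 0.25, "news": 0.75}
blended = blend_distributions(all_users, current_user, user_weight=0.25)
```

Because the two weights sum to one, the result is again a probability distribution whenever both inputs are. A small `user_weight` keeps a user with sparse logs close to the population-wide behavior.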
In a preferred embodiment of the present application, the device may further include:
An in-class sorting module, configured to sort the information records within each intent category according to the webpage distribution of the particular intent category under the environmental information corresponding to the query string; wherein the webpage distribution of the particular intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information.
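A minimal sketch of in-class sorting is given here, assuming each information record is identified by its webpage and that the per-category webpage distribution is already available. The function name and data layout are hypothetical.

```python
def sort_within_categories(records_by_intent, page_dist_by_intent):
    """Order the records inside each intent category by the probability
    of their webpage under that category (highest first).

    records_by_intent:    {intent: [page, ...]}
    page_dist_by_intent:  {intent: {page: P(page | env, intent)}}
    Records whose webpage is missing from the distribution score 0.0.
    """
    sorted_records = {}
    for intent, records in records_by_intent.items():
        page_dist = page_dist_by_intent.get(intent, {})
        sorted_records[intent] = sorted(
            records, key=lambda page: page_dist.get(page, 0.0), reverse=True
        )
    return sorted_records


records_by_intent = {"video": ["v1", "v2", "v3"]}
page_dist_by_intent = {"video": {"v2": 0.6, "v1": 0.3}}
ranked = sort_within_categories(records_by_intent, page_dist_by_intent)
```

Python's `sorted` is stable, so records with equal probability keep the order in which they arrived, for example an earlier relevance order.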
In another preferred embodiment of the present application, the device may further include:
A third statistics module, configured to analyze user logs that record environmental information and compile statistics to obtain the webpage distribution of a particular intent category under the environmental information corresponding to the query string, including:
A fifth statistics submodule, configured to analyze the user logs and compile statistics to obtain the distribution of each webpage in the whole network, the intent category distribution of a particular webpage under the environmental information corresponding to the query string, and the distribution of each webpage under the environmental information corresponding to the query string;
A sixth statistics submodule, configured to construct, from the distribution of each webpage in the whole network, the intent category distribution of a particular webpage under the environmental information corresponding to the query string, and the distribution of each webpage under the environmental information corresponding to the query string, the joint distribution of the environmental information corresponding to the query string, the particular intent category, and each webpage in the whole network; and
A seventh statistics submodule, configured to obtain the webpage distribution of the particular intent category under the environmental information corresponding to the query string as the ratio of the joint distribution of the environmental information corresponding to the query string, the particular intent category, and each webpage in the whole network to the joint distribution of the environmental information corresponding to the query string and the particular intent category.
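The ratio computed by the seventh statistics submodule can be illustrated with the following sketch. Note the simplification: it estimates the joint distribution directly from counts of (environment, intent category, webpage) log entries rather than constructing it from the three distributions named for the sixth submodule. The log layout and all names are hypothetical.

```python
from collections import Counter


def page_distribution(log_entries, env, intent):
    """Estimate P(page | env, intent) as P(env, intent, page) / P(env, intent).

    log_entries is a list of (env, intent, page) tuples; the joint
    probabilities are simple relative frequencies (an assumption).
    """
    joint_counts = Counter(log_entries)
    total = sum(joint_counts.values())
    # Joint probability P(env, intent, page) for the requested env/intent.
    joint = {
        page: count / total
        for (e, i, page), count in joint_counts.items()
        if e == env and i == intent
    }
    # Marginalizing over pages gives P(env, intent).
    p_env_intent = sum(joint.values())
    if p_env_intent == 0:
        return {}
    return {page: p / p_env_intent for page, p in joint.items()}


log_entries = (
    [("evening", "video", "a.example")] * 3
    + [("evening", "video", "b.example")]
    + [("morning", "news", "c.example")] * 4
)
dist = page_distribution(log_entries, "evening", "video")
```

Here P(evening, video) = 4/8 = 0.5, so a.example receives 0.375 / 0.5 = 0.75 and b.example receives 0.125 / 0.5 = 0.25.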
In the embodiments of the present application, preferably, the information record acquisition module may be specifically configured to search the network data according to the query string to obtain the corresponding information records and classify those information records by intent category, obtaining the information records of each intent category; and/or to search, according to the query string, separately in the network data carrying each intent category label, obtaining the information records of each intent category.
In a preferred embodiment of the present application, the device may further include a presentation module, configured to present the information records of each intent category output by the inter-class sorting module.
In the embodiments of the present application, preferably, the presentation module may be specifically configured to present, in each preset presentation region, the information records of each intent category output by the inter-class sorting module.
In the embodiments of the present application, preferably, the query string is derived from user input or from the webpage the user is currently browsing.
Because the device embodiment for sorting information records is substantially similar to the method embodiment for sorting information records, it is described relatively briefly; for the relevant details, refer to the description of the method embodiment.
Referring to Fig. 6, a structural diagram of an embodiment of an information search server of the present application is shown, which may specifically include:
A receiving module 601, configured to receive the query string from the information search client and the environmental information corresponding to the query string;
An information search module 602, configured to search the network data according to the query string, obtaining the information records of each intent category;
An inter-class sorting module 603, configured to sort the intent categories according to the distribution of intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of intent categories under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information; and
A return module 604, configured to return the information records output by the inter-class sorting module to the information search client.
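The behavior of the inter-class sorting module 603 can be sketched as below, assuming the information search module has already grouped the records by intent category and that the intent category distribution for the received environmental information is available. The function name and data layout are hypothetical.

```python
def sort_between_categories(records_by_intent, intent_dist):
    """Order intent categories by their probability under the current
    environment, then emit the records category by category.

    records_by_intent: {intent: [record, ...]}
    intent_dist:       {intent: P(intent | env)}
    Categories missing from the distribution score 0.0.
    """
    ordered_intents = sorted(
        records_by_intent,
        key=lambda intent: intent_dist.get(intent, 0.0),
        reverse=True,
    )
    # The order of records inside each category is left untouched.
    return [
        (intent, record)
        for intent in ordered_intents
        for record in records_by_intent[intent]
    ]


records_by_intent = {"news": ["n1", "n2"], "video": ["v1"]}
intent_dist = {"video": 0.75, "news": 0.25}
ordered = sort_between_categories(records_by_intent, intent_dist)
```

Keeping the records of one category together, rather than interleaving them, matches the description of adjusting record order according to the sorting result of the categories.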
In a preferred embodiment of the present application, the information search module 602 may be specifically configured to: when the user uses a search engine, search the network data according to the query string to obtain the corresponding information records and classify those information records by intent category, obtaining the information records of each intent category; and/or, when the user uses a browser to browse information, search separately in the network data carrying each intent category label according to the query string corresponding to the currently browsed webpage, obtaining the information records of each intent category.
In a preferred embodiment of the present application, the information search server may further include:
A first relevance ranking module, configured to perform a first relevance ranking on the information records output by the information search module according to the relevance between the information records and the query string, and to output the information records after the first relevance ranking to the inter-class sorting module; or
A second relevance ranking module, configured to perform a second relevance ranking on the information records output by the inter-class sorting module according to the relevance between the information records and the query string, and to output the information records after the second relevance ranking to the return module.
In a preferred embodiment of the present application, the information search server may further include:
An identification module, configured to identify the user identification information of the current user corresponding to the query string;
An interest inter-class sorting module, configured to sort the information records of each intent category according to the current user's distribution of intent categories under the environmental information corresponding to the query string; wherein the current user's distribution of intent categories under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information and user identification information;
The return module is further configured to return the information records output by the interest inter-class sorting module to the information search client; it may also return to the information search client the information records output after combined processing by the inter-class sorting module and the interest inter-class sorting module.
In a preferred embodiment of the present application, the information search server may further include:
An in-class sorting module, configured to sort, before or after the inter-class sorting module operates, the information records within each intent category according to the webpage distribution of the particular intent category under the environmental information corresponding to the query string; wherein the webpage distribution of the particular intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information;
The return module is further configured to return the information records output by the in-class sorting module to the information search client; it may also return to the information search client the information records output by the inter-class sorting module, or the information records output after combined processing by the inter-class sorting module and the in-class sorting module, or the information records output after combined processing by the inter-class sorting module, the interest inter-class sorting module, and the in-class sorting module.
Referring to Fig. 7, a structural diagram of an embodiment of an information search client of the present application is shown, which may specifically include:
A receiving module 701, configured to receive the query string input by the user;
An environment acquisition module 702, configured to collect the environmental information corresponding to the query string;
A sending module 703, configured to send the query string and the environmental information corresponding to the query string to the information search server; and
A presentation module 704, configured to present the information records returned by the information search server.
In a preferred embodiment of the present application, the information search client may further include:
A query log recording module, configured to record the user identification information, the query string, and the corresponding webpage operation history and environmental information in a query log, the webpage operation history being the record of the webpages operated on by the user among the information records corresponding to the query string.
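One possible shape for a query log entry is sketched below. The application lists what the entry contains (user identification information, query string, webpage operation history, and environmental information) but not its format, so the field names, the JSON-lines serialization, and the example environment keys are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class QueryLogEntry:
    """Hypothetical query log entry; field names are illustrative only."""

    user_id: str
    query: str
    # Webpages among the returned records that the user operated on.
    operated_pages: list = field(default_factory=list)
    # Environmental information collected with the query (keys assumed).
    environment: dict = field(default_factory=dict)

    def to_json_line(self):
        """Serialize the entry as one line of JSON for an append-only log."""
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)


entry = QueryLogEntry(
    user_id="u-001",
    query="jaguar",
    operated_pages=["a.example/cars"],
    environment={"time_of_day": "evening"},
)
line = entry.to_json_line()
parsed = json.loads(line)
```

Storing one entry per line keeps the log easy to append to on the client and easy to stream through when compiling the statistics described for the server side.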
Because the embodiments of the information search server and the information search client are substantially similar to the method embodiment for sorting information records, they are described relatively briefly; for the relevant details, refer to the description of the method embodiment.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the other embodiments, and for the parts that are identical or similar among the embodiments, the embodiments may be referred to one another.
The method and device for sorting information records, the information search server, and the information search client provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the present application.

Claims (20)

1. A method for sorting information records, characterized in that the method comprises:
collecting, in real time, the environmental information corresponding to a query string; wherein the environmental information comprises information about the surrounding environment of the user corresponding to the query string;
obtaining the information records of each intent category according to the query string; wherein the intent categories are used to distinguish different information needs among the information records;
sorting the intent categories according to the distribution of each intent category under the environmental information corresponding to the query string, and adjusting the order of the information records according to the sorting result of the intent categories; wherein the distribution of each intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information;
wherein the method further comprises:
sorting the information records of each intent category according to the webpage distribution of a particular intent category under the environmental information corresponding to the query string; wherein the webpage distribution of the particular intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information.
2. The method according to claim 1, characterized in that the distribution of each intent category under the environmental information corresponding to the query string is obtained as follows:
analyzing user logs and compiling statistics on the webpages of the whole network under the environmental information corresponding to the query string, to obtain the distribution of each webpage under the environmental information;
for a particular webpage under the environmental information corresponding to the query string, analyzing user logs and compiling statistics on each intent category, to obtain the intent category distribution of the particular webpage under the environmental information;
taking each webpage as a statistical sample, compiling statistics over the distribution of each webpage under the environmental information and the intent category distribution of the particular webpage under the environmental information, to obtain the distribution of each intent category under the environmental information corresponding to the query string.
3. The method according to claim 1, characterized by further comprising:
identifying the user identification information of the current user corresponding to the query string;
sorting the intent categories according to the current user's distribution of each intent category under the environmental information corresponding to the query string, and adjusting the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution of each intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information and user identification information.
4. The method according to claim 3, characterized in that the current user's distribution of each intent category under the environmental information corresponding to the query string is obtained as follows:
analyzing user logs and compiling statistics to obtain the distribution of each intent category and the distribution of each item of environmental information under a particular intent category, and from these obtaining the distribution of each intent category for all users under the environmental information;
analyzing the current user's logs and compiling statistics to obtain the current user's distribution of each intent category and the distribution of each item of environmental information under a particular intent category of the current user, and from these obtaining the current user's preliminary distribution of each intent category under the environmental information;
applying weighting to the distribution of each intent category for all users under the environmental information and the current user's preliminary distribution of each intent category under the environmental information, to obtain the current user's distribution of each intent category under the environmental information corresponding to the query string.
5. The method according to claim 1, characterized in that the webpage distribution of a particular intent category under the environmental information corresponding to the query string is obtained as follows:
analyzing user logs and compiling statistics to obtain the distribution of each webpage in the whole network, the intent category distribution of a particular webpage under the environmental information corresponding to the query string, and the distribution of each webpage under the environmental information corresponding to the query string;
constructing, from the distribution of each webpage in the whole network, the intent category distribution of a particular webpage under the environmental information corresponding to the query string, and the distribution of each webpage under the environmental information corresponding to the query string, the joint distribution of the environmental information corresponding to the query string, the particular intent category, and each webpage in the whole network;
obtaining the webpage distribution of the particular intent category under the environmental information corresponding to the query string as the ratio of the joint distribution of the environmental information corresponding to the query string, the particular intent category, and each webpage in the whole network to the joint distribution of the environmental information corresponding to the query string and the particular intent category.
6. The method according to any one of claims 1 to 4, characterized in that obtaining the information records of each intent category according to the query string comprises:
searching the network data according to the query string to obtain the corresponding information records, and classifying the information records by intent category, to obtain the information records of each intent category;
and/or, searching, according to the query string, separately in the network data carrying each intent category label, to obtain the information records of each intent category.
7. The method according to any one of claims 1 to 4, characterized in that the user logs comprise browser logs and/or query logs; the browser logs record user identification information, webpage browsing history, and the corresponding environmental information; the query logs record user identification information, the query string, and the corresponding webpage operation history and environmental information, the webpage operation history being the record of the webpages operated on by the user among the information records corresponding to the query string.
8. The method according to any one of claims 1 to 4, characterized by further comprising:
presenting the sorted information records of each intent category.
9. The method according to claim 8, characterized by further comprising: presenting the recommendation results of each intent category in each preset presentation region.
10. The method according to any one of claims 1 to 4, characterized in that the query string is derived from user input or from the webpage the user is currently browsing.
11. A device for sorting information records, characterized in that the device comprises:
an acquisition module, configured to collect, in real time, the environmental information corresponding to a query string; wherein the environmental information comprises information about the surrounding environment of the user corresponding to the query string;
an information record acquisition module, configured to obtain the information records of each intent category according to the query string; wherein the intent categories are used to distinguish different information needs among the information records; and
an inter-class sorting module, configured to sort the intent categories according to the distribution of intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of intent categories under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information;
wherein the device further comprises:
an in-class sorting module, configured to sort the information records within each intent category according to the webpage distribution of a particular intent category under the environmental information corresponding to the query string; wherein the webpage distribution of the particular intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information.
12. The device according to claim 11, characterized by further comprising:
a first statistics module, configured to obtain the distribution of intent categories under the environmental information corresponding to the query string, including:
a first statistics submodule, configured to analyze user logs and compile statistics on the webpages of the whole network under the environmental information corresponding to the query string, to obtain the distribution of each webpage under the environmental information;
a second statistics submodule, configured to analyze user logs and compile statistics on each intent category for a particular webpage under the environmental information corresponding to the query string, to obtain the intent category distribution of the particular webpage under the environmental information; and
a summation submodule, configured to take the webpage as the variable and compile statistics over the webpage distribution under the environmental information and the intent category distribution of the particular webpage under the environmental information, to obtain the distribution of intent categories under the environmental information corresponding to the query string.
13. The device according to claim 11, characterized by further comprising:
an identification module, configured to identify the user identification information of the current user corresponding to the query string;
an interest-degree inter-class sorting module, configured to sort the intent categories according to the current user's distribution of each intent category under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution of each intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information and user identification information.
14. The device according to claim 13, characterized by further comprising:
a second statistics module, configured to obtain the user's distribution of each intent category under the environmental information corresponding to the query string, including:
a third statistics submodule, configured to analyze user logs and compile statistics to obtain the distribution of intent categories and the distribution of each item of environmental information under a particular intent category, and from these to obtain the distribution of intent categories for all users under the environmental information;
a fourth statistics submodule, configured to analyze the current user's logs and compile statistics to obtain the current user's distribution of intent categories and the distribution of each item of environmental information under a particular intent category of the current user, and from these to obtain the current user's preliminary distribution of intent categories under the environmental information; and
a linear weighting submodule, configured to apply weighting to the distribution of intent categories for all users under the environmental information and the current user's preliminary distribution of intent categories under the environmental information, to obtain the current user's distribution of intent categories under the environmental information corresponding to the query string.
15. An information search server, characterized by comprising:
a receiving module, configured to receive a query string from an information search client and the environmental information corresponding to the query string; wherein the environmental information comprises information, collected in real time by the information search client, about the surrounding environment of the user corresponding to the query string;
an information search module, configured to search the network data according to the query string, to obtain the information records of each intent category; wherein the intent categories are used to distinguish different information needs among the information records;
an inter-class sorting module, configured to sort the intent categories according to the distribution of intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of intent categories under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information; and
a return module, configured to return the information records output by the inter-class sorting module;
wherein the information search server further comprises:
an in-class sorting module, configured to sort the information records within each intent category according to the webpage distribution of a particular intent category under the environmental information corresponding to the query string; wherein the webpage distribution of the particular intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information;
the return module being further configured to return the information records output by the in-class sorting module to the information search client.
16. The information search server according to claim 15, characterized in that the information search module is specifically configured to search the network data according to the query string to obtain the corresponding information records, and to classify the information records by intent category, obtaining the information records of each intent category; and/or to search, according to the query string, separately in the network data carrying each intent category label, obtaining the information records of each intent category.
17. The information search server according to claim 15, characterized by further comprising:
a first relevance ranking module, configured to perform a first relevance ranking on the information records output by the information search module according to the relevance between the information records and the query string, and to output the information records after the first relevance ranking to the inter-class sorting module; or
a second relevance ranking module, configured to perform a second relevance ranking on the information records output by the inter-class sorting module according to the relevance between the information records and the query string, and to output the information records after the second relevance ranking to the return module.
18. The information search server according to claim 15, characterized by further comprising:
an identification module, configured to identify the user identification information of the current user corresponding to the query string;
an interest inter-class sorting module, configured to sort the intent categories according to the current user's distribution of each intent category under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution of each intent category under the environmental information corresponding to the query string is obtained by analyzing, and compiling statistics on, user logs that record environmental information and user identification information;
the return module being further configured to return the information records output by the interest inter-class sorting module to the information search client.
19. An information search client, characterized by comprising:
a query receiving module, configured to receive a query string input by a user;
an environment acquisition module, configured to collect in real time the environmental information corresponding to the query string, wherein the environmental information includes information about the surroundings in which the user corresponding to the query string is located;
a sending module, configured to send the query string and its corresponding environmental information to an information search server; and
a presentation module, configured to present the information records returned by the information search server, wherein the information records are obtained by ranking the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, adjusting the order of the information records according to the ranking result of the intent categories, and ranking the information records within each intent category according to the web page distribution of the specific intent category under the environmental information corresponding to the query string; the intent categories are used to distinguish the different information needs among the information records.
20. The information search client as claimed in claim 19, characterized in that it further comprises:
a query log recording module, configured to record the user identification information, the query string, and the corresponding web page operation history and environmental information to a query log, the web page operation history being the web pages, among the information records corresponding to the query string, that the user has operated on.
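A minimal sketch of the kind of query log entry claim 20 describes, assuming a JSON-lines log format. The field names, the environment keys, and the use of JSON are illustrative assumptions; the claim only names the four pieces of information to record.

```python
import io
import json
from dataclasses import asdict, dataclass, field


@dataclass
class QueryLogEntry:
    """One query log entry; all field names are hypothetical."""
    user_id: str
    query: str
    environment: dict
    operated_pages: list = field(default_factory=list)  # pages the user acted on


def append_entry(log_file, entry):
    """Write one entry as a single JSON line to an open text file object."""
    log_file.write(json.dumps(asdict(entry), ensure_ascii=False) + "\n")


# An in-memory buffer stands in for a real log file in this example.
log_file = io.StringIO()
append_entry(
    log_file,
    QueryLogEntry(
        user_id="u1",
        query="coffee shop",
        environment={"time": "evening", "location": "office"},
        operated_pages=["https://example.com/cafe"],
    ),
)
logged = json.loads(log_file.getvalue().splitlines()[0])
```

Entries of this shape carry both the user identifier and the environment, which is what the server-side per-user statistics in claim 18 would need.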
CN201210038993.2A 2012-02-20 2012-02-20 The method and apparatus that information record is ranked up Active CN102622417B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210038993.2A CN102622417B (en) 2012-02-20 2012-02-20 The method and apparatus that information record is ranked up

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210038993.2A CN102622417B (en) 2012-02-20 2012-02-20 The method and apparatus that information record is ranked up

Publications (2)

Publication Number Publication Date
CN102622417A CN102622417A (en) 2012-08-01
CN102622417B true CN102622417B (en) 2016-08-31

Family

ID=46562336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210038993.2A Active CN102622417B (en) 2012-02-20 2012-02-20 The method and apparatus that information record is ranked up

Country Status (1)

Country Link
CN (1) CN102622417B (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593353B (en) * 2012-08-15 2018-11-13 阿里巴巴集团控股有限公司 Information search method, displaying information sorting weighted value determine method and its device
CN103810210B (en) * 2012-11-14 2018-10-19 腾讯科技(深圳)有限公司 Search result display methods and device
CN103838754B (en) * 2012-11-23 2017-12-22 腾讯科技(深圳)有限公司 Information retrieval device and method
CN103885979B (en) 2012-12-21 2018-06-05 深圳市世纪光速信息技术有限公司 The method and apparatus of pushed information
CN104112235B (en) * 2013-04-22 2018-05-29 中广核工程有限公司 The method and system of nuclear power projects Experience Feedback information search
CN104657397B (en) * 2013-11-25 2020-03-03 腾讯科技(深圳)有限公司 Information processing method and terminal
CN104699725B (en) * 2013-12-10 2018-10-09 阿里巴巴集团控股有限公司 data search processing method and system
US10666735B2 (en) 2014-05-19 2020-05-26 Auerbach Michael Harrison Tretter Dynamic computer systems and uses thereof
US9742853B2 (en) * 2014-05-19 2017-08-22 The Michael Harrison Tretter Auerbach Trust Dynamic computer systems and uses thereof
CN104572960B (en) * 2014-12-29 2018-07-06 北京奇虎科技有限公司 A kind of method and device of search
CN104715011A (en) * 2014-12-31 2015-06-17 上海孩子国科教设备有限公司 Method and system for conducting data retrieval
CN105302903B (en) * 2015-10-27 2018-12-14 广州神马移动信息科技有限公司 Searching method, device, system and search result sequencing foundation determination method
CN105893427A (en) * 2015-12-07 2016-08-24 乐视网信息技术(北京)股份有限公司 Resource searching method and server
CN106874413A (en) * 2017-01-22 2017-06-20 斑马信息科技有限公司 Search system and its method for processing search results
CN107515857B (en) * 2017-08-31 2020-08-18 科大讯飞股份有限公司 Semantic understanding method and system based on customization technology
CN107832432A (en) * 2017-11-15 2018-03-23 北京百度网讯科技有限公司 A kind of search result ordering method, device, server and storage medium
CN108897785A (en) * 2018-06-08 2018-11-27 Oppo(重庆)智能科技有限公司 Search for content recommendation method, device, terminal device and storage medium
CN108763579B (en) * 2018-06-08 2020-12-22 Oppo(重庆)智能科技有限公司 Search content recommendation method and device, terminal device and storage medium
CN110162535B (en) * 2019-03-26 2023-11-07 腾讯科技(深圳)有限公司 Search method, apparatus, device and storage medium for performing personalization
CN110990598B (en) * 2019-11-18 2020-11-27 北京声智科技有限公司 Resource retrieval method and device, electronic equipment and computer-readable storage medium
CN113254513B (en) * 2021-07-05 2021-09-28 北京达佳互联信息技术有限公司 Sequencing model generation method, sequencing device and electronic equipment
CN113792225B (en) * 2021-08-25 2023-08-18 北京库睿科技有限公司 Multi-data type hierarchical ordering method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1050830A2 (en) * 1999-05-05 2000-11-08 Xerox Corporation System and method for collaborative ranking of search results employing user and group profiles
CN1758248A (en) * 2004-10-05 2006-04-12 微软公司 Systems, methods, and interfaces for providing personalized search and information access

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7082365B2 (en) * 2001-08-16 2006-07-25 Networks In Motion, Inc. Point of interest spatial rating search method and system
US7693827B2 (en) * 2003-09-30 2010-04-06 Google Inc. Personalization of placed content ordering in search results

Also Published As

Publication number Publication date
CN102622417A (en) 2012-08-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant