CN102622417B - Method and apparatus for sorting information records - Google Patents
- Publication number
- CN102622417B (application CN201210038993.2A)
- Authority
- CN
- China
- Prior art keywords
- information
- classification
- query string
- intended
- environmental information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
This application provides a method and apparatus for sorting information records. The method comprises: collecting the environmental information corresponding to a query string; obtaining the information records of each intent category according to the query string; sorting the intent categories according to the distribution of each intent category under the environmental information corresponding to the query string, and adjusting the order of the information records according to the sorting result of the intent categories; wherein the distribution of each intent category under that environmental information is obtained by statistical analysis of user logs in which environmental information is recorded. The application can sort intent categories according to environmental information so that the intent categories the current user is interested in come first, and can also take the user's personal factors into account, so that the sorted information records are closer to the user's real information needs.
Description
Technical field
The application relates to the technical field of data processing, and in particular to a method and
apparatus for sorting information records, an information search server, and an information
search client.
Background art
At present, searching network data for information has become one of the most important
applications of the Internet. For example, a search engine queries a database for information
records in web page form according to the query string entered by the user; or a browser
constructs a query string from the web page the user is currently browsing and queries the
database for information records in web page form according to the constructed query string.
To better meet user needs, a search engine or browser does not present the retrieved
information records immediately. Instead, it sorts them in descending order of their relevance
to the query string and presents the sorted records. Sorting by relevance to the query string
in this way is referred to as sorting by base weight.
Information records sorted by base weight reflect the relevance between the records and the
query string, which helps the user look through them quickly. However, sorting by base weight
reflects only that relevance and considers no other factors, while the content of information
records in real network data is highly varied. Sorting by base weight alone is therefore too
simplistic: under the influence of other factors, the records placed first are not necessarily
what the user needs, and records placed later may be exactly what the user needs. Existing
sorting methods therefore fail to reflect the user's real information needs. The user must
spend a great deal of time finding the most interesting information among the records
corresponding to the query string, which also consumes more system resources.
In short, a technical problem that those skilled in the art urgently need to solve is how to
provide information records closer to the user's real information needs, so that the user can
quickly find the most interesting information among them.
Summary of the invention
The technical problem to be solved by the application is to provide a method and apparatus
for sorting information records that sort search results effectively according to
environmental information, so that the sorted information records are closer to the user's
real information needs.
Accordingly, the application also provides an information search server and an information
search client that provide information records closer to the user's real information needs,
so that the user can quickly find the most interesting information among them.
To solve the above problem, the application discloses a method for sorting information
records, the method including:
collecting the environmental information corresponding to a query string;
obtaining the information records of each intent category according to the query string;
sorting the intent categories according to the distribution of each intent category under the
environmental information corresponding to the query string, and adjusting the order of the
information records according to the sorting result of the intent categories; wherein the
distribution of each intent category under the environmental information corresponding to the
query string is obtained by statistical analysis of user logs in which environmental
information is recorded.
In another aspect, the application also discloses an apparatus for sorting information
records, the apparatus including:
a collection module, for collecting the environmental information corresponding to the query
string;
an information record acquisition module, for obtaining the information records of each
intent category according to the query string; and
an inter-category sorting module, for sorting the intent categories according to the
distribution of intent categories under the environmental information corresponding to the
query string, and adjusting the order of the information records according to the sorting
result of the intent categories; wherein the distribution of intent categories under the
environmental information corresponding to the query string is obtained by statistical
analysis of user logs in which environmental information is recorded.
In another aspect, the application also discloses an information search server, including:
a receiving module, for receiving from an information search client a query string and the
environmental information corresponding to the query string;
an information search module, for searching network data according to the query string to
obtain the information records of each intent category;
an inter-category sorting module, for sorting the intent categories according to the
distribution of intent categories under the environmental information corresponding to the
query string, and adjusting the order of the information records according to the sorting
result of the intent categories; wherein the distribution of intent categories under the
environmental information corresponding to the query string is obtained by statistical
analysis of user logs in which environmental information is recorded; and
a return module, for returning the information records output by the inter-category sorting
module.
In another aspect, the application also discloses an information search client, including:
a query receiving module, for receiving the query string entered by the user;
an environment collection module, for collecting the environmental information corresponding
to the query string;
a sending module, for sending the query string and its corresponding environmental
information to an information search server; and
a display module, for displaying the information records returned by the information search
server.
Preferably, the information search client also includes:
a query log recording module, for recording user identification information, the query
string, and the corresponding web page operation history and environmental information to a
query log, where the web page operation history is the record of the web pages the user
operated on among the information records corresponding to the query string.
Compared with the prior art, the application has the following advantages:
First, the application sorts the intent categories according to the distribution of intent
categories under the environmental information corresponding to the query string, and adjusts
the order of the information records according to the sorting result of the intent
categories. A user has different information needs under different environmental
information, and intent categories correspond directly to information needs, so they reflect
the user's needs for different categories of information. This sorting therefore places first
the intent categories that better reflect the information needs under the environmental
information corresponding to the query string (hereinafter, the current environmental
information), so that the sorted information records better satisfy the user's real
information needs.
Second, when sorting information records the application can also take into account the
current user's interest in each intent category. Each user has a different degree of interest
in different intent categories. Sorting by the current user's distribution of intent
categories under the environmental information corresponding to the query string, obtained by
statistical analysis of user logs that record both environmental information and user
identification information, places first the intent categories the current user is more
interested in. For the same query string, the prior art presents identical information
records to all users across the network without considering individual needs, whereas the
application brings the sorted information records closer to the personalized real information
needs that reflect the user's degree of interest.
Furthermore, while sorting the intent categories to adjust the order of their information
records, the application can also sort the information records within each intent category
according to the current environmental information. Within each intent category, the web
pages that better reflect the information needs under the current environmental information
are placed first, so that the sorted information records are closer to the user's real
information needs.
The technical solution of the application can be applied to search engine services, browser
services and similar applications. It provides information records closer to the user's real
information needs, so that the user can quickly find the most interesting information.
Brief description of the drawings
Fig. 1 is a flow chart of an embodiment of a method for sorting information records of the application;
Fig. 2 is a flow chart of an embodiment of a search-engine-based information search method of the application;
Fig. 3 is a flow chart of an embodiment of a browser-based information recommendation method of the application;
Fig. 4 is an example diagram of the "more" display area in the embodiment shown in Fig. 3 of the application;
Fig. 5 is a structure diagram of an embodiment of an apparatus for sorting information records of the application;
Fig. 6 is a structure diagram of an embodiment of an information search server of the application;
Fig. 7 is a structure diagram of an embodiment of an information search client of the application.
Detailed description
To make the above objects, features and advantages of the application more apparent, the
application is described in further detail below with reference to the drawings and specific
embodiments.
The embodiments of the application sort information records according to environmental
information. Because this reflects the user's different information needs under different
environmental information, the information records come closer to the user's real
information needs.
In the embodiments of the application, environmental information mainly refers to information
about the user's surroundings, and may specifically include time environment information,
location environment information, temperature environment information, hardware environment
information, and so on.
Under different environmental information the user's information needs often differ. Taking
time environment information as an example: morning is the start of a new day, so in the
morning users want news; during working hours the Internet is used mainly to assist work, so
there is demand for web pages and pictures; evening is a time for relaxation and
entertainment, so in the evening there is demand for music and video; and so on.
Taking the geographical environment as an example: Internet cafes and homes are places for
relaxation and entertainment, so users there typically want video, games, music and similar
information; an office is a workplace where excessive entertainment is inappropriate, so
news, pictures and similar information suffice; airports, stations, hotels and the like are
places with high mobility, where users typically care about travel, weather and similar
information. Even when a user has made clear a need for video, a workplace is unsuited to
excessive entertainment while Internet cafes and homes are suited to it, so it can be assumed
that at work the user wants clips of a film, whereas in an Internet cafe or at home the user
wants the complete high-definition video.
In summary, those skilled in the art can use one or more of the above kinds of environmental
information as required, and can further subdivide the kinds used. For example, time
environment information can be subdivided into day and night, or into morning, working hours
and evening; location environment information can be subdivided into Internet cafe, office,
home, airport, station, hotel, and so on. The application does not limit the specific
subdivision.
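The subdivision of time environment information described above can be sketched as a small bucketing function. This is only an illustration: the function name `time_environment` and the hour boundaries are assumptions, since the application leaves the subdivision open.

```python
from datetime import datetime


def time_environment(ts: datetime) -> str:
    """Map a timestamp to a coarse time-environment label.

    The boundaries (06:00, 09:00, 18:00) are hypothetical; the patent only
    names the buckets "morning", "working hours" and "evening".
    """
    hour = ts.hour
    if 6 <= hour < 9:
        return "morning"
    if 9 <= hour < 18:
        return "working_hours"
    return "evening"
```

A coarser split (day and night) would only change the boundaries, not the shape of the function.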
To associate the user's various information needs with the information records in network
data, the application can use a classification-based approach and add intent category labels
to information records, so that different intent categories correspond to different
information needs. Sorting the retrieved information records according to environmental
information is thereby converted into sorting the intent categories according to
environmental information.
To sort intent categories according to environmental information, the application uses the
methods of probability theory and mathematical statistics to compute the regularities of the
distribution of each intent category under the environmental information corresponding to the
query string. Specifically, offline, user logs are statistically analyzed to obtain the
distribution of each intent category under the environmental information corresponding to the
query string; online, at sorting time, the information records of each intent category are
sorted according to that distribution.
Because the embodiments of the application use probability symbols, Table 1 explains the
name, meaning and method of obtaining each symbol for ease of understanding.
Table 1
Referring to Fig. 1, a flow chart of an embodiment of a method for sorting information
records of the application is shown, which may specifically include:
Step 101: collect the environmental information corresponding to the query string;
In the embodiments of the application, the information records of each intent category are
sorted according to the distribution of each intent category under the environmental
information corresponding to the query string. A user has different information needs under
different environmental information, and intent categories are directly linked to information
needs, so they reflect the user's needs for different intent categories. This sorting
therefore places first the information records of the intent categories that better match the
information needs under the environmental information corresponding to the query string
(hereinafter, the current environmental information). The application thus brings the sorted
information records closer to the user's real information needs, which is convenient for the
user.
Environmental information mainly refers to information about the user's surroundings. Even
for the same user the surroundings may change, time environment information being a typical
example. Whether the query string is entered by the user or constructed from the web page the
user is currently browsing, the environmental information corresponding to the query string
is therefore real-time in nature, and the application collects the environmental information
corresponding to the query string.
For a query string, the time at which it is received or constructed is the corresponding time
environment information; the location obtained from its IP (Internet Protocol) address is the
corresponding location environment information; the temperature corresponding to the time and
location environment information is the temperature environment information; and so on. The
application does not limit the specific method of collecting the environmental information
corresponding to the query string.
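A minimal sketch of step 101, assuming the environment is reduced to a time label and a place label. The `Environment` type, the `collect_environment` function and the `IP_PREFIX_TO_PLACE` table are hypothetical; a real system would consult an IP-geolocation service rather than a prefix table.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical prefix table standing in for a real IP-to-location lookup.
IP_PREFIX_TO_PLACE = {"10.1.": "office", "10.2.": "internet_cafe"}


@dataclass(frozen=True)
class Environment:
    time_of_day: str  # derived from the time the query string was received
    place: str        # derived from the client's IP address


def collect_environment(received_at: datetime, client_ip: str) -> Environment:
    """Build the environmental information for one query string."""
    time_of_day = "day" if 6 <= received_at.hour < 18 else "night"
    place = next(
        (p for prefix, p in IP_PREFIX_TO_PLACE.items() if client_ip.startswith(prefix)),
        "unknown",
    )
    return Environment(time_of_day, place)
```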
Step 102: obtain the information records of each intent category according to the query
string;
In a preferred embodiment of the application, the step of obtaining the information records
of each intent category according to the query string may specifically include:
first searching the network data according to the query string to obtain the corresponding
information records, and then classifying those records into the preset intent categories to
obtain the information records of each intent category, the intent categories being preset
according to the labels that users across the network have given to the web pages
corresponding to the information records;
and/or searching, according to the query string, the network data bearing each intent
category label to obtain the information records of each intent category. That is, the query
string is submitted to the search engine corresponding to each intent category on the
network, and the search results returned by each search engine, which carry the respective
intent category labels, form the information records of each intent category. The categories
of the search engines on the network exist objectively: for example, mp3.baidu.com is a music
search engine, news.sogou.com is a news search engine, and video.baidu.com is a video search
engine. The information records of the corresponding intent category can be obtained
directly from these search engines, so the intent categories of the application are
objectively existing attributes of the network data.
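The second option above, in which results inherit the intent category of the vertical search engine that returned them, might be sketched as follows. The three hostnames come from the text; the `VERTICAL_TO_INTENT` table, the category names and the `group_by_intent` function are illustrative assumptions.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hostnames are the examples named in the text; the mapping itself is illustrative.
VERTICAL_TO_INTENT = {
    "mp3.baidu.com": "music",
    "news.sogou.com": "news",
    "video.baidu.com": "video",
}


def group_by_intent(result_urls):
    """Group result URLs by the intent category of the engine that served them."""
    groups = defaultdict(list)
    for url in result_urls:
        host = urlparse(url).hostname
        groups[VERTICAL_TO_INTENT.get(host, "unknown")].append(url)
    return dict(groups)
```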
In the embodiments of the application, intent categories are mainly used to distinguish the
different information needs among the information records. In a preferred embodiment of the
application they may specifically include video, picture, information (news), resource,
review or price-comparison categories, and so on. In practice, those skilled in the art can
also divide information records into other intent categories as needed to distinguish
different information needs; the application does not limit the specific method of
classifying information records.
Step 103: sort the intent categories according to the distribution of each intent category
under the environmental information corresponding to the query string, and adjust the order
of the information records according to the sorting result of the intent categories;
wherein the distribution of each intent category under the environmental information
corresponding to the query string is obtained by statistical analysis of user logs in which
environmental information is recorded.
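Step 103 can be illustrated with a stable sort: categories are ordered by their probability under the current environment, and records keep their base-weight order inside each category. The function name `reorder_by_intent` and the `(intent, title)` record shape are assumptions made for this sketch.

```python
def reorder_by_intent(records, intent_distribution):
    """Reorder records so that more probable intent categories come first.

    records: list of (intent_category, title) pairs, already in base-weight order.
    intent_distribution: mapping intent_category -> P(Ic | T).
    Python's sort is stable, so the base-weight order is kept within a category.
    """
    return sorted(records, key=lambda rec: -intent_distribution.get(rec[0], 0.0))
```

For example, with the distribution `{"video": 0.6, "news": 0.3, "picture": 0.1}`, all video records move ahead of news records, which in turn precede picture records.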
In practice, browser logs, search engine query logs or other user logs can be chosen for
statistical analysis according to the needs of the application. For example, search engines
generally keep query logs and browser clients generally keep browser logs; the application
adds environmental information to the existing query log or browser log.
In a preferred embodiment of the application, the user logs include browser logs and/or query
logs. A browser log records user identification information, the browsed-page history and the
corresponding environmental information. A query log records user identification
information, the query string, and the corresponding web page operation history and
environmental information, where the web page operation history is the record of the web
pages the user operated on among the information records corresponding to the query string.
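The two log types can be pictured as simple records. The field names below are hypothetical; the text only lists which pieces of information each log carries.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BrowserLogEntry:
    user_id: str       # user identification information
    visited_url: str   # one entry of the browsed-page history
    environment: str   # environmental information recorded with the visit


@dataclass
class QueryLogEntry:
    user_id: str       # user identification information
    query: str         # the query string
    environment: str   # environmental information recorded with the query
    # Web page operation history: result pages the user operated on.
    operated_urls: List[str] = field(default_factory=list)
```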
P(Ic|T) denotes the distribution of intent category Ic under the environmental information T
corresponding to the query string. By the methods of probability theory and mathematical
statistics, it can be derived as follows:
P(Ic|T) = Σ_d p(Ic,d|T) = Σ_d p(Ic|dT) · p(d|T)
where Σ_d marginalizes the joint probability distribution p(Ic,d|T) over the web pages d.
In a preferred embodiment of the application, user logs that record environmental information
can be statistically analyzed as follows to obtain the distribution of intent categories
under the environmental information corresponding to the query string:
Sub-step A1: statistically analyze, from the user logs, the web pages across the network
under the environmental information corresponding to the query string, to obtain the web page
distribution p(d|T) under that environmental information;
When user logs are used for the statistics, p(d) can be counted under the environmental
information corresponding to the query string. The statistics of p(d) from the query log can
be expressed by the following formula:
where x denotes a record in the query log.
An example of computing p(d) from the browser log is as follows: count the number of times a
web page d occurs in the browser log; in some cases, this count can be divided by the number
of occurrences of all web pages in the browser log.
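The browser-log counting just described, restricted to one environment, could look like the following sketch. The log is assumed to be an iterable of `(user_id, url, environment)` tuples, which is an illustrative layout rather than one given in the text.

```python
from collections import Counter


def page_distribution(browser_log, env):
    """Estimate p(d | T) as relative visit frequency under environment `env`.

    browser_log: iterable of (user_id, url, environment) tuples (assumed layout).
    """
    counts = Counter(url for _, url, e in browser_log if e == env)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {url: n / total for url, n in counts.items()}
```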
Sub-step A2: for a particular web page under the environmental information corresponding to
the query string, statistically analyze each intent category from the user logs to obtain the
intent category distribution p(Ic|dT) of that web page under that environmental information;
In a specific implementation, p(Ic) can be counted first:
1. Taking the browser log as an example, suppose five bars represent five intent categories
Ic. If a web page belongs to one (or more) of the intent categories, the corresponding bar or
bars are incremented by 1; the values on the bars then give the probability distribution of
the intent categories Ic;
2. The statistics of p(Ic) from the query log can be expressed by the following formula:
Counting p(Ic) for a particular web page under the environmental information T corresponding
to the query string then yields p(Ic|dT).
Sub-step A3: taking web pages as the statistical sample, sum over web pages the product of the
web page distribution under the environmental information corresponding to the query string
and the intent category distribution of each particular web page under that environmental
information, to obtain the distribution of each intent category under the environmental
information corresponding to the query string:
P(Ic|T) = Σ_d p(Ic|dT) · p(d|T)
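The summation in sub-step A3 can be sketched directly from the two distributions of sub-steps A1 and A2. The dictionary layouts are assumptions made for the illustration.

```python
from collections import defaultdict


def intent_distribution(p_d_given_T, p_intent_given_dT):
    """Compute P(Ic | T) = sum over d of p(Ic | d, T) * p(d | T).

    p_d_given_T: mapping page -> p(d | T)                      (sub-step A1)
    p_intent_given_dT: mapping page -> {intent: p(Ic | d, T)}  (sub-step A2)
    """
    result = defaultdict(float)
    for page, p_d in p_d_given_T.items():
        for intent, p_i in p_intent_given_dT.get(page, {}).items():
            result[intent] += p_i * p_d
    return dict(result)
```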
For the same query string, the prior art presents the same information records to all users
across the network, without considering individual user needs.
To address this, in a preferred embodiment of the application, in addition to considering the
current environmental information, the information records of each intent category can also
be sorted according to the current user's interest in the intent categories. Accordingly,
the method can also include:
identifying the user identification information of the current user corresponding to the
query string;
sorting the intent categories according to the current user's distribution of intent
categories under the environmental information corresponding to the query string, and
adjusting the order of the information records according to the sorting result of the intent
categories, wherein the current user's distribution of intent categories under the
environmental information corresponding to the query string is obtained by statistical
analysis of user logs that record environmental information and user identification
information.
Setting environmental information aside, different users have different degrees of interest
in different intent categories. For example, user A is fond of variety shows and every day
uses a search engine and/or browser to watch videos of the desired variety shows, while user B
is fond of celebrity pictures and habitually obtains the desired celebrity pictures by
searching and/or browsing.
This preferred embodiment uses the methods of probability theory and mathematical statistics
to study the regularities of users' interest in intent categories, and combines them with the
regularities of the distribution of intent categories under the environmental information
corresponding to the query string. What this preferred embodiment finally computes is the
user's distribution of intent categories under the environmental information corresponding to
the query string.
Different users have different degrees of interest in different intent categories. Sorting by
the current user's distribution of intent categories under the environmental information
corresponding to the query string, obtained by statistical analysis of user logs that record
environmental information and user identification information, places first the intent
categories the current user is interested in. The application therefore brings the
information records closer to the personalized real information needs that reflect the user's
degree of interest.
P(Ic|T,u) denotes the current user's distribution of intent categories under the
environmental information corresponding to the query string. It can be obtained as a weighted
average by the following formula:
P(Ic|T,u) ∝ λP(T|Ic)P(Ic) + (1-λ)P(T|Ic,u)P(Ic|u) (4)
Here u denotes the user ID (userid). Because every user log entry records the user
identifier, all access records of each u can be obtained; counting p(Ic) over the access
records of u gives P(Ic|u), which reflects the intent category distribution of user u. λ is
the random factor, which weights the two terms of formula (4).
Performing the p(T) statistics for a specific intent category Ic gives p(T|Ic). p(T) can be
calculated by the following formula:
p(T) = Σ_d p(dT) (5)
where p(dT) = p(T|d)p(d) (6)
Here p(T|d) can be computed as the ratio of the number of times web page d occurs under the
environmental information to the total number of occurrences of web page d. Performing the
p(T) statistics for a specific user u and a specific intent category Ic gives P(T|Ic,u). The
random factor λ balances the distribution of intent categories of all users under the
environmental information corresponding to the query string against the distribution of
intent categories of the current user under that environmental information; its value can be
determined according to actual needs.
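Formula (4) blends a population-wide term with a per-user term. The sketch below evaluates its right-hand side for each intent category and then normalizes, which is permissible because the formula is stated only up to proportionality. The function name and the dictionary arguments are assumptions for illustration.

```python
def personalized_scores(intents, p_T_given_I, p_I, p_T_given_I_u, p_I_given_u, lam):
    """Evaluate lam*P(T|Ic)P(Ic) + (1-lam)*P(T|Ic,u)P(Ic|u) per category, normalized.

    All probability arguments are mappings intent -> value for a fixed T and u.
    lam is the random factor of formula (4), between 0 and 1.
    """
    raw = {
        i: lam * p_T_given_I[i] * p_I[i]
        + (1.0 - lam) * p_T_given_I_u[i] * p_I_given_u[i]
        for i in intents
    }
    total = sum(raw.values())
    if total == 0:
        return raw
    return {i: score / total for i, score in raw.items()}
```

With `lam` close to 1 the ordering follows all users; with `lam` close to 0 it follows the current user's own history.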
For example, the log information of users within T can be labeled manually, the label content
being the intent category, and λ is then adjusted to the value that gives the best intent
description accuracy. Here T can be the same time period T in the user logs of many days.
Specifically, manual labeling provides the reference answer for the intent category ranking.
λ is varied over {0.1, 0.2, ..., 0.9}, and the right-hand side of formula (4) is computed for
each λ. Comparing the reference answer with the intent category ranking computed by the
formula gives the accuracy of the formula for each specific λ; the λ with the highest accuracy
is the λ finally chosen. NDCG (Normalized Discounted Cumulative Gain) can be used for this.
NDCG is a measure of the effectiveness of a search engine or related program; its formula for
the relevance score of the top k results is:
Here i denotes the i-th search; j denotes the j-th result; y_{i,j} denotes the relevance label
score of the j-th result, on a 5-grade scale; π_i(j) denotes the position of that result in
the ranking.
As another example, the value of λ can also be set directly, for example to 0.6 or 0.8;
the application does not limit the specific value of λ.
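The grid search over λ can be sketched as follows. This is only an illustration: the NDCG variant used here is the common gain (2^y - 1) with a log2(1 + position) discount, which may differ from the exact formula of the application, and the names dcg, ndcg_at_k, pick_lambda and score_fn are hypothetical.

```python
import math

def dcg(labels):
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    return sum((2 ** y - 1) / math.log2(pos + 1)
               for pos, y in enumerate(labels, start=1))

def ndcg_at_k(ranked_labels, k):
    """NDCG of the top k results; 0.0 when no result is relevant."""
    ideal = dcg(sorted(ranked_labels, reverse=True)[:k])
    return dcg(ranked_labels[:k]) / ideal if ideal > 0 else 0.0

def pick_lambda(score_fn, labelled, k=3, grid=None):
    """Return (lambda, ndcg) for the first grid value with the highest NDCG.

    score_fn(lam, category) is assumed to return the right-hand side of
    formula (4); labelled maps each category to its manual relevance label.
    """
    grid = grid or [i / 10 for i in range(1, 10)]
    best = None
    for lam in grid:
        ranking = sorted(labelled, key=lambda c: score_fn(lam, c), reverse=True)
        quality = ndcg_at_k([labelled[c] for c in ranking], k)
        if best is None or quality > best[1]:
            best = (lam, quality)
    return best

# Toy inputs: global and per-user scores for two intent categories.
toy_global = {"news": 0.6, "video": 0.4}
toy_user = {"news": 0.2, "video": 0.8}
toy_labels = {"news": 1, "video": 4}

def toy_score(lam, category):
    return lam * toy_global[category] + (1 - lam) * toy_user[category]
```

Ties are broken in favour of the smallest λ on the grid; another tie-breaking rule could be substituted without changing the procedure.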
In a preferred embodiment of the application, the identity of the user can be identified
as follows: when the user is registered and logged in, the user's ID is used as the user
identification information; when the user browses without logging in, the user
identification information is derived from the user's cookie (a small text file used to
store private information). In practice, for websites that require registration and login
with a user ID, the choice of the unique user identifier can follow this order: when the
user is logged in, the user ID is used; when the user browses without logging in, the
user's cookie is used.
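The priority order described above can be written as a small Python sketch; the function name user_identifier and the returned (source, value) tuple are hypothetical choices made for illustration.

```python
def user_identifier(login_id=None, cookie_id=None):
    """Pick the user identifier: the login ID first, then the cookie.

    Returns a (source, value) tuple, or None when neither is available.
    """
    if login_id:
        return ("id", login_id)
    if cookie_id:
        return ("cookie", cookie_id)
    return None
```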
Cookie-based user identification is an existing, typical user identification method.
Obtaining the user's cookie through a custom Apache log format or through JavaScript is
already a very effective means of identifying the user. As long as it has not been
cleared, a cookie can be regarded as bound to a particular client computer, so
cookie-based user identification is highly accurate. For example, for a user registered
on Taobao, cookie information is stored on the C drive of the user's computer. When the
user visits Taobao again, the Taobao system reads the cookie information from the
specified path; if it is found, the login name can be obtained even if the user is not
logged in, and if it is not found, new cookie information is created on the user's
computer. At present, most users do not clear their own cookie information, so this
technique can be used to obtain the user's identity.
In a preferred embodiment of the application, the user logs recording environmental
information can be analyzed statistically as follows to obtain the distribution of the
current user's intent categories under the environmental information corresponding to the
query string:
Sub-step B1, analyze the user logs statistically to obtain the distribution of each
intent category and the distribution of each environmental information under a specific
intent category, and from these obtain the distribution of intent categories of all users
under the environmental information corresponding to the query string:
∝ denotes equivalence (proportionality);
Sub-step B2, analyze the current user's logs statistically to obtain the distribution of
the current user's intent categories and the distribution of each environmental
information under a specific intent category for the current user, and from these obtain
the preliminary distribution of the current user's intent categories under the
environmental information corresponding to the query string:
Sub-step B3, apply linear weighting to the distribution of intent categories of all users
under the environmental information corresponding to the query string and the preliminary
distribution of the current user's intent categories under that environmental information,
to obtain the distribution of the current user's intent categories under the environmental
information corresponding to the current query string:
P(Ic|T,u) ∝ λP(T|Ic)P(Ic) + (1-λ)P(T|Ic,u)P(Ic|u).
When there are no logs for the current user, i.e. the user is browsing for the first
time, λ = 1, and the distribution of the current user's intent categories under the
environmental information corresponding to the query string is the distribution of intent
categories of all users under the current environmental information.
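The linear weighting of sub-step B3, including the λ = 1 fallback for a first-time user, can be sketched in Python as follows. The function name blend_intent_distribution and the dictionary inputs are hypothetical; the final normalization is an added step that turns the proportional scores of formula (4) into a distribution.

```python
def blend_intent_distribution(global_scores, user_scores, lam):
    """Combine all-user and per-user scores as in formula (4).

    global_scores[c] stands for P(T|Ic)P(Ic); user_scores[c] stands for
    P(T|Ic,u)P(Ic|u). With no user logs, lam falls back to 1.
    """
    user_scores = user_scores or {}
    if not user_scores:
        lam = 1.0  # first-time user: rely on the all-user statistics only
    raw = {c: lam * g + (1 - lam) * user_scores.get(c, 0.0)
           for c, g in global_scores.items()}
    total = sum(raw.values())
    # Normalize so the scores sum to 1; formula (4) is only a proportionality.
    return {c: v / total for c, v in raw.items()} if total > 0 else raw
```

Sorting the intent categories by the returned values in descending order gives the category ranking used to reorder the information records.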
In the above, the intent categories are ranked according to the distribution of intent
categories under the current environmental information, or according to the distribution
of the current user's intent categories under the current environmental information, in
order to adjust the order of the information records of the intent categories. In a
preferred embodiment of the application, the information records inside each intent
category can also be ranked according to the current environmental information.
Correspondingly, the method can further include:
ranking the information records inside each intent category according to the webpage
distribution of a specific intent category under the environmental information
corresponding to the query string, where this webpage distribution is obtained by
statistical analysis of the user logs recording environmental information.
This preferred embodiment also takes the environmental information into account when
ranking the information records inside each intent category. Within each intent category,
the webpages that better reflect the information demand under the current environmental
information are ranked first, so that the information records come closer to the user's
real information demand.
For example, the video intent category has many information records, including various
film clip video resources and various HD video resources. If the current environmental
information is ignored and the HD video resources are simply ranked first, the user may
be embarrassed, because a user in an office setting should not indulge in entertainment.
This preferred embodiment takes the current environmental information into account and
can therefore bring the information records closer to the user's real information demand.
P(d|Ic,T) denotes the webpage distribution of intent category Ic under environmental
information T. Using probability theory and mathematical statistics, it can be derived
with the following formula:
p(d|Ic,T) = p(Ic,d,T)/p(TIc) = p(Ic,d,T)/(p(Ic|T)·p(T)) (7)
where p(Ic,d,T) is the joint distribution of the environmental information corresponding
to the query string, the specific intent category and the webpage, which can be obtained
with the following formula:
p(Ic,d,T) = p(Ic|d,T)·p(T|d)·p(d) (8)
where p(Ic|d,T) is the distribution of a particular webpage d over intent category Ic
under environmental information T corresponding to the query string, p(T|d) is the
distribution of webpage d under environmental information T corresponding to the query
string, and p(d) is the webpage distribution; all of these can be obtained directly by
statistics from the browser logs;
p(TIc) is the joint distribution of the environmental information corresponding to the
query string and the specific intent category, which can be expressed with the following
formula:
p(TIc) = p(T|Ic)p(Ic) = p(Ic|T)p(T) (9)
In a preferred embodiment of the application, the user logs recording environmental
information can be analyzed statistically as follows to obtain the webpage distribution
of a specific intent category under the environmental information corresponding to the
query string:
Sub-step C1, analyze the user logs statistically to obtain the distribution of each
webpage across the whole network, the intent category distribution of a particular
webpage under the environmental information corresponding to the query string, and the
distribution of each webpage under the environmental information corresponding to the
query string;
Computing the p(Ic) statistic for a particular webpage under environmental information T
corresponding to the query string yields p(Ic|d,T);
p(T|d) is the distribution of webpage d under environmental information T corresponding
to the query string, and is obtained by performing the p(T) statistic for webpage d,
where p(T) can be calculated with formula (5).
Sub-step C2, from the distribution of each webpage across the whole network, the intent
category distribution of a particular webpage under the environmental information
corresponding to the query string, and the distribution of each webpage under the
environmental information corresponding to the query string, construct the joint
distribution of the environmental information corresponding to the query string, the
specific intent category and each webpage in the whole network;
Sub-step C3, take the ratio of the joint distribution of the environmental information
corresponding to the query string, the specific intent category and each webpage in the
whole network to the joint distribution of the environmental information corresponding to
the query string and the specific intent category, to obtain the webpage distribution of
the specific intent category under the environmental information corresponding to the
query string.
In practice, the joint distribution of the environmental information corresponding to the
query string and the specific intent category can be calculated as the product of the
distribution of the specific intent category under that environmental information and the
distribution of the environmental information, p(TIc) = p(Ic|T)p(T); alternatively, it can
be calculated as the product of the distribution of the environmental information under
the specific intent category and the distribution of the specific intent category,
p(TIc) = p(T|Ic)p(Ic). The statistical methods for p(Ic), p(T) and p(Ic|T) have been
described above; performing the p(T) statistic for a specific intent category Ic yields
p(T|Ic).
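Sub-steps C1 to C3 can be illustrated with a short Python sketch for one fixed intent category Ic and one fixed environmental information T. It assumes the three per-page statistics are already available as dictionaries, and it obtains the denominator p(TIc) by summing the joint p(Ic,d,T) over webpages, which is a marginalization rather than the product form of formula (9). The name page_distribution is hypothetical.

```python
def page_distribution(p_intent_given_page_env, p_env_given_page, p_page):
    """Compute p(d|Ic,T) for every page d, following formulas (7) and (8).

    p_intent_given_page_env[d] = p(Ic|d,T), p_env_given_page[d] = p(T|d),
    p_page[d] = p(d); all three dictionaries share the same page keys.
    """
    # Formula (8): joint p(Ic,d,T) = p(Ic|d,T) * p(T|d) * p(d)
    joint = {d: p_intent_given_page_env[d] * p_env_given_page[d] * p_page[d]
             for d in p_page}
    # Denominator of formula (7): p(T Ic), here the sum of the joint over pages
    denom = sum(joint.values())
    return {d: (v / denom if denom > 0 else 0.0) for d, v in joint.items()}
```

The resulting values sum to 1 over the webpages and can be used directly as sort keys inside an intent category.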
To help those skilled in the art better understand the method of ranking information
records of the application, its practical use is described below through examples.
Example 1: ranking information records in the information search service of a search
engine.
Referring to Fig. 2, a flowchart of an embodiment of a search-engine-based information
search method of the application is shown, which may specifically include:
Step 201, the information search client receives the query string entered by the user;
Step 202, the information search client collects the environmental information
corresponding to the query string;
Step 203, the information search client sends the query string and its corresponding
environmental information to the information search server;
Step 204, the information search server receives the query string and its corresponding
environmental information from the information search client;
Step 205, the information search server searches the network data according to the query
string and obtains the information records of each intent category;
Step 206, the information search server ranks the intent categories according to the
distribution of intent categories under the environmental information corresponding to
the query string, and adjusts the order of the information records according to the
ranking of the intent categories, where the distribution of intent categories under the
environmental information corresponding to the query string is obtained by statistical
analysis of the user logs recording environmental information;
Step 207, the information search server returns the ranked search results to the
information search client;
Step 208, the information search client displays the search results returned by the
information search server.
Existing information search services do not adjust the display of search results
according to environmental information. In this example, the query logs under different
environmental information are analyzed statistically, and the search results of the
intent categories are ranked according to the resulting distribution of intent categories
under the current environmental information. This realizes a personalized,
environment-based display of search results and provides search results closer to the
user's real information demand, making it easier for the user to find the information of
greatest interest.
A concrete example follows.
For convenience, the environmental information is divided by time only (working time T1,
non-working time T2). After the client receives the query string "A Chinese Ghost Story"
x, it sends x and T1 to the server. The server searches the database according to x and
obtains a set of webpages with intent category labels, then ranks this set by intent
category using the distribution P(Ic|T1) of intent categories under the current
environmental information T1. For example, after ranking, the intent category order of
the search results for "A Chinese Ghost Story" under environmental information T1 is
"information, film and television, picture, game, ...".
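The category-level ordering in this example can be sketched in Python. The probabilities below are made-up illustrative values rather than statistics from the application, and the function name order_by_intent and the (title, category) record layout are hypothetical.

```python
def order_by_intent(records, p_intent_given_env):
    """Order (title, category) records by P(Ic|T), highest category first.

    sorted() is stable, so records of the same category keep their
    original relative order (for example, a prior relevance order).
    """
    return sorted(records, key=lambda r: -p_intent_given_env.get(r[1], 0.0))

# Illustrative values for working time T1 (assumed, not measured).
p_intent_t1 = {"information": 0.4, "film and television": 0.3,
               "picture": 0.2, "game": 0.1}
toy_records = [("game page", "game"),
               ("news page", "information"),
               ("still image", "picture"),
               ("film review", "film and television"),
               ("second news page", "information")]
```

With these values, the two "information" records come first, followed by the film and television, picture and game records.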
In short, whereas existing information search services ignore environmental information
and provide uniform search results, the application makes the search results more
targeted and more personalized, helps the user quickly find the information of greatest
interest, and reduces the system resources the user occupies during the search.
As a preferred embodiment, step 205 may specifically include: first searching the network
data according to the query string to obtain the corresponding information records, and
then classifying these information records by intent category to obtain the information
records of each intent category; and/or searching, according to the query string,
separately in the network data carrying each intent category label, to obtain the
information records of each intent category.
As a preferred embodiment, the information search method can further include:
Step D, rank the information records inside each intent category according to the webpage
distribution of a specific intent category under the environmental information
corresponding to the query string, where this webpage distribution is obtained by
statistical analysis of the user logs recording environmental information.
Step D can be performed before or after step 206; whichever of step D and step 206 comes
later outputs the ranking result to step 207. In the example above, the records under
each intent category can be ranked by P(d|Ic,T1); for example, under environmental
information T1, the "A Chinese Ghost Story film review" page in the "film and television"
intent category is ranked before the "A Chinese Ghost Story video download" page.
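Combining step 206 and step D amounts to a two-level sort: first by intent category, then by webpage inside each category. A minimal Python sketch, with assumed probability values and the hypothetical function name two_level_order, could look like this.

```python
def two_level_order(records, p_intent, p_page_in_intent):
    """Sort (title, category) records by P(Ic|T), then by P(d|Ic,T).

    p_page_in_intent is keyed by (category, title); missing keys score 0.
    """
    return sorted(
        records,
        key=lambda r: (-p_intent.get(r[1], 0.0),
                       -p_page_in_intent.get((r[1], r[0]), 0.0)),
    )

# Illustrative values for working time T1 (assumed, not measured).
p_intent_example = {"information": 0.5, "film and television": 0.3}
p_page_example = {
    ("film and television", "film review"): 0.7,
    ("film and television", "video download"): 0.3,
    ("information", "news"): 1.0,
}
example_records = [("video download", "film and television"),
                   ("film review", "film and television"),
                   ("news", "information")]
```

Because both levels are expressed in one sort key, the result does not depend on whether the category-level or the page-level ranking is conceptually performed first.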
As a preferred embodiment, the information search method can further include:
Step E1, identify the user identification information of the current user corresponding to
the query string;
Step E2, rank the intent categories, and adjust the order of the information records
according to the ranking of the intent categories.
Step E2 can replace step 206, in which case the result of step E2 is output to step 207.
As a preferred embodiment, the information search method can further include a relevance
ranking step: according to the relevance between the search results and the query string,
rank the information records output by the information search module in descending order
of relevance. The relevance ranking can be performed before or after step 206, and the
final result of the relevance ranking step and step 206 is output to step 208.
It should be noted that the information service client can also record the user ID, the
query string, the corresponding accessed webpages and the environmental information to
the query log, where the accessed webpages are the webpages in the information records
that the user accessed.
Example 2: ranking information records in an information recommendation service, where
the information records take the form of recommendation results.
Referring to Fig. 3, a flowchart of an embodiment of a browser-based information
recommendation method of the application is shown, which may specifically include:
Step 301, construct a query string from the user's input or from the webpage the user is
currently browsing;
Step 302, collect the environmental information corresponding to the user's input or to
the webpage the user is currently browsing, as the environmental information corresponding
to the query string;
Step 303, search the network data according to the query string and obtain the
recommendation results of each intent category;
Step 304, rank the intent categories according to the distribution of intent categories
under the environmental information corresponding to the query string, and adjust the
order of the recommendation results according to the ranking of the intent categories,
where the distribution of intent categories under the environmental information
corresponding to the query string is obtained by statistical analysis of the user logs
recording environmental information;
Step 305, display the ranked recommendation results of each intent category.
Existing information recommendation services do not consider environmental information
when ranking recommendation results, whereas the application adjusts and displays the
recommendation results according to the current environmental information, and can thus
realize personalized browsing recommendation.
Corresponding examples:
Example 1. In the morning, recommendations in the news category are ranked first; during
working hours, recommendations in the web and picture categories are ranked first; in the
evening, recommendations in the video and music categories are ranked first.
Example 2. In an Internet cafe, recommendations in categories such as video, games and
music are ranked first; in an office, recommendations in categories such as news and
pictures are ranked first; at airports, stations, hotels and similar places,
recommendations in categories such as travel and weather are ranked first, and so on.
Example 3. For the same video demand: in a working environment, film clips are ranked
first; in environments such as an Internet cafe or at home, high-definition, complete
video resources are ranked first, and so on.
The whole flow is described below with an example:
When the user browses a webpage related to "Wang Xiaochuan", the query string
"Wang Xiaochuan" is constructed from the webpage title, URL and body text; then
"Wang Xiaochuan" is searched in intent categories such as "information, picture, film and
television", and the search results under each intent category are returned; then the
intent categories are ranked according to P(Ic|T1).
As a preferred embodiment, the browser-based information recommendation method can
further include:
Step F, rank the browsing information inside each intent category according to the
webpage distribution of a specific intent category under the environmental information
corresponding to the query string, where this webpage distribution is obtained by
statistical analysis of the user logs recording environmental information.
Step F can be performed before or after step 304; whichever of step F and step 304 comes
later outputs the ranking result to step 305. In the example above, the results under
each intent category can be ranked by P(d|Ic,T1); for example, under environmental
information T1, in the "information" intent category, the "Sogou Browser leads Wang
Xiaochuan to success" page is ranked before the "Sogou CEO Wang Xiaochuan evangelizes the
Internet" page.
As a preferred embodiment, the browser-based information recommendation method can
further include:
Step G1, identify the user identification information of the current user corresponding to
the query string;
Step G2, rank the intent categories according to the distribution of the current user's
intent categories under the environmental information corresponding to the query string,
and adjust the order of the recommendation results according to the ranking of the intent
categories, where the distribution of the current user's intent categories under the
environmental information corresponding to the query string is obtained by statistical
analysis of the user logs recording environmental information and user identification
information.
Step G2 can replace step 304, in which case the result of step G2 is output to step 305.
In short, this preferred embodiment can realize a personalized information recommendation
service based on environmental information and user identification information, and
recommends more accurate and more personalized results.
In a preferred embodiment of the application, the browser-based information
recommendation method can also record the user ID, the currently browsed webpage and the
corresponding environmental information to the browser log; and/or record the user ID,
the query string and the corresponding webpage operation history and environmental
information to the query log, where the webpage operation history is the record of the
webpages that the user clicked, accessed or operated on among the recommendation results
corresponding to the query string.
In another preferred embodiment of the application, step 305 can specifically be:
displaying, in preset display regions, the recommendation results of the intent categories
output by the inter-class sorting module, where each display region shows the top several
recommendation results of one intent category. Referring to Fig. 4, an example of
multiple display regions of the application is shown, in which the "information",
"picture" and "film and television" intent categories are the top three among the
recommendation results and are shown in their respective display regions.
In yet another preferred embodiment of the application, a learning-to-rank method can be
used to construct the query string from the user's input or from the webpage the user is
currently browsing, which may specifically include:
Step H1, extract candidate phrases from the currently browsed webpage;
Here, steps such as Chinese word segmentation, named entity recognition, part-of-speech
tagging and TF-IDF (term frequency / inverse document frequency) can be used to extract
the candidate phrases.
Step H2, select candidate words from the candidate phrases as the query string.
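The TF-IDF scoring mentioned in step H1 can be sketched in Python. The sketch assumes the page has already been segmented into tokens, uses one common smoothed IDF variant, and covers only the scoring of candidates, not the learning-to-rank selection of step H2; the name tfidf_scores and the corpus layout (a list of token sets) are hypothetical.

```python
import math
from collections import Counter

def tfidf_scores(page_tokens, corpus):
    """Score each token of the current page by TF-IDF.

    page_tokens is the token list of the current page; corpus is a list of
    token sets for reference documents. IDF is smoothed (an assumed variant).
    """
    tf = Counter(page_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for doc in corpus if term in doc)     # document frequency
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0    # smoothed inverse document frequency
        scores[term] = (count / len(page_tokens)) * idf
    return scores

# Toy data: "sogou" is frequent on the page and rare in the corpus.
toy_page = ["sogou", "browser", "sogou"]
toy_corpus = [{"browser"}, {"browser", "news"}, {"sogou"}]
```

The highest-scoring tokens would serve as the candidate phrases passed on to step H2.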
Learning-to-rank methods can be broadly divided into three classes: ranking learning
based on regression, ranking learning based on classification, and ranking learning based
on ordinal regression. Ranking learning algorithms based on ordinal regression are the
focus of current learning-to-rank research. They specifically include pointwise
(data-point-based) ranking algorithms, represented by the ranking perceptron algorithm
(PRank), the improved ranking perceptron algorithm (Large Margin PRank) and the support
vector ordinal regression algorithm (Support Vector Ordinal Regression), and pairwise
(ordered-pair-based) ranking algorithms, represented by the ranking support vector
machine algorithm (Rank SVM), the RankBoost algorithm and the RankNet algorithm. The
application can use any of the above learning-to-rank methods to select, from the
candidate phrases, a subset of intent phrases that can represent the current page.
Corresponding to the foregoing method of ranking information records, the application
also provides a device for ranking information records. Referring to Fig. 5, the device
may specifically include:
Acquisition module 501, for collecting the environmental information corresponding to the
query string;
Information record acquisition module 502, for obtaining the information records of each
intent category according to the query string; and
Inter-class sorting module 503, for ranking the intent categories according to the
distribution of intent categories under the environmental information corresponding to
the query string, and adjusting the order of the information records according to the
ranking of the intent categories, where the distribution of intent categories under the
environmental information corresponding to the query string is obtained by statistical
analysis of the user logs recording environmental information.
In a preferred embodiment of the application, the user logs preferably include browser
logs and/or query logs. The browser log records user identification information, the
history of browsed webpages and the corresponding environmental information. The query
log records user identification information, the query string, and the corresponding
webpage operation history and environmental information, where the webpage operation
history is the record of the webpages that the user operated on among the information
records corresponding to the query string.
In a preferred embodiment of the application, the environmental information preferably
includes time environment information, location environment information, temperature
environment information or hardware environment information. In a preferred embodiment of
the application, the intent categories preferably include video, picture, information,
resource, comment or price comparison categories.
In a preferred embodiment of the application, the device can further include:
First statistical module, for obtaining, by statistical analysis of the user logs
recording environmental information, the distribution of intent categories under the
environmental information corresponding to the query string, including:
First statistics submodule, for statistically analyzing the webpages in the whole network
according to the user logs under the environmental information corresponding to the query
string, to obtain the distribution of each webpage under that environmental information;
Second statistics submodule, for statistically analyzing each intent category according
to the user logs for a particular webpage under the environmental information
corresponding to the query string, to obtain the intent category distribution of the
particular webpage under that environmental information; and
Summation submodule, for taking the webpage as the variable and summing the product of the
webpage distribution under the environmental information corresponding to the query
string and the intent category distribution of the particular webpage under that
environmental information, to obtain the distribution of intent categories under the
environmental information corresponding to the query string.
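The summation submodule computes P(Ic|T) as the sum over webpages d of P(Ic|d,T)·P(d|T). A minimal Python sketch, assuming the two input distributions are given as dictionaries and using the hypothetical name intent_distribution, is shown below.

```python
def intent_distribution(p_page_given_env, p_intent_given_page_env):
    """Compute P(Ic|T) = sum over pages d of P(Ic|d,T) * P(d|T).

    p_page_given_env[d] = P(d|T); p_intent_given_page_env[d] is a dict
    mapping each intent category to P(Ic|d,T) for that page.
    """
    totals = {}
    for page, p_d in p_page_given_env.items():
        for intent, p_i in p_intent_given_page_env.get(page, {}).items():
            totals[intent] = totals.get(intent, 0.0) + p_i * p_d
    return totals
```

For instance, with two equally likely pages, one purely "news" and one split evenly between "news" and "video", the result assigns 0.75 to "news" and 0.25 to "video".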
In another preferred embodiment of the application, the device can further include:
Identification module, for identifying the user identification information of the current
user corresponding to the query string;
Interest-based inter-class sorting module, for ranking the intent categories according to
the distribution of the current user's intent categories under the environmental
information corresponding to the query string, and adjusting the order of the information
records according to the ranking of the intent categories, where the distribution of the
user's intent categories under the environmental information corresponding to the query
string is obtained by statistical analysis of the user logs recording environmental
information and user identification information.
In yet another preferred embodiment of the present application, the device may further include:
A second statistical module, configured to obtain, by statistical analysis of user logs that record environmental information, the user's distribution over the intent categories under the environmental information corresponding to the query string, which may specifically include:
A third statistics submodule, configured to statistically analyze the user logs to obtain the distribution of the intent categories and the distribution of each item of environmental information under a specific intent category, and from these to derive the distribution of all users over the intent categories under the environmental information corresponding to the query string;
A fourth statistics submodule, configured to statistically analyze the current user's logs to obtain the current user's distribution over the intent categories and the current user's distribution of each item of environmental information under a specific intent category, and from these to derive the current user's preliminary distribution over the intent categories under the environmental information corresponding to the query string; and
A linear weighting submodule, configured to linearly weight the distribution of all users over the intent categories under the environmental information corresponding to the query string together with the current user's preliminary distribution over the intent categories under that environmental information, to obtain the current user's distribution over the intent categories under the environmental information corresponding to the current query string.
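The two statistics submodules and the linear weighting submodule can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not state: the step from "distribution of intent categories" plus "distribution of environmental information under a specific intent category" to "distribution of intent categories under the environmental information" is read here as a Bayes inversion, and the function names, the weight `user_weight`, and all numbers are hypothetical.

```python
def intent_given_environment(intent_prior, environment_given_intent):
    """P(intent | env) proportional to P(intent) * P(env | intent), for one fixed environment."""
    unnormalized = {
        intent: prior * environment_given_intent.get(intent, 0.0)
        for intent, prior in intent_prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        return dict.fromkeys(unnormalized, 0.0)
    return {intent: value / total for intent, value in unnormalized.items()}


def blend(all_users, current_user, user_weight):
    """Linear weighting: (1 - w) * all-users distribution + w * current-user distribution."""
    if not 0.0 <= user_weight <= 1.0:
        raise ValueError("user_weight must lie in [0, 1]")
    intents = set(all_users) | set(current_user)
    return {
        intent: (1.0 - user_weight) * all_users.get(intent, 0.0)
        + user_weight * current_user.get(intent, 0.0)
        for intent in intents
    }


# Hypothetical statistics for one environment, e.g. "weekday evening, mobile".
all_users = intent_given_environment(
    {"shopping": 0.5, "news": 0.3, "maps": 0.2},   # P(intent), all users
    {"shopping": 0.2, "news": 0.1, "maps": 0.6},   # P(env | intent), all users
)
current_user = intent_given_environment(
    {"shopping": 0.7, "news": 0.2, "maps": 0.1},   # P(intent), current user
    {"shopping": 0.5, "news": 0.1, "maps": 0.3},   # P(env | intent), current user
)
personalized = blend(all_users, current_user, user_weight=0.5)
```

With these made-up numbers, all users lean toward "maps" in this environment, while the blended result for this particular user leans toward "shopping". How the weight is chosen is not specified in the text; a fixed value is used here only for illustration.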
In a preferred embodiment of the present application, the device may further include:
An intra-class sorting module, configured to sort the information records within each intent category according to the distribution of webpages of a specific intent category under the environmental information corresponding to the query string; wherein the distribution of webpages of the specific intent category under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information.
In another preferred embodiment of the present application, the device may further include:
A third statistical module, configured to obtain, by statistical analysis of user logs that record environmental information, the distribution of webpages of a specific intent category under the environmental information corresponding to the query string, including:
A fifth statistics submodule, configured to statistically analyze the user logs to obtain the distribution of each webpage over the entire network, the distribution of a specific webpage over the intent categories under the environmental information corresponding to the query string, and the distribution of each webpage under the environmental information corresponding to the query string;
A sixth statistics submodule, configured to construct the joint distribution of the environmental information corresponding to the query string, the specific intent category, and each webpage in the entire network, from the distribution of each webpage over the entire network, the distribution of a specific webpage over the intent categories under that environmental information, and the distribution of each webpage under that environmental information; and
A seventh statistics submodule, configured to obtain the distribution of webpages of the specific intent category under the environmental information corresponding to the query string as the ratio of the joint distribution of the environmental information, the specific intent category, and each webpage in the entire network to the joint distribution of the environmental information and the specific intent category.
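The ratio computed by the seventh statistics submodule is a conditional probability, P(page | environment, intent) = P(environment, intent, page) / P(environment, intent). A minimal Python sketch follows. It is a simplification under stated assumptions: both joint distributions are estimated directly from raw co-occurrence counts rather than assembled from the three component distributions named above, and the log layout `(environment, intent, page)` and the example URLs are hypothetical.

```python
from collections import Counter


def page_distribution_within_intent(log_rows, environment, intent):
    """Estimate P(page | environment, intent) as a ratio of two joint counts."""
    joint_three = Counter()  # counts of (environment, intent, page)
    joint_two = Counter()    # counts of (environment, intent)
    for env, category, page in log_rows:
        joint_three[(env, category, page)] += 1
        joint_two[(env, category)] += 1

    denominator = joint_two[(environment, intent)]
    if denominator == 0:
        return {}  # no evidence for this (environment, intent) pair
    # The common normalizer (total number of rows) cancels in the ratio,
    # so dividing counts is equivalent to dividing the joint probabilities.
    return {
        page: count / denominator
        for (env, category, page), count in joint_three.items()
        if env == environment and category == intent
    }


# Hypothetical log rows: (environment, intent category, page that was operated).
rows = [
    ("evening", "dining", "a.example/menu"),
    ("evening", "dining", "a.example/menu"),
    ("evening", "dining", "b.example/reviews"),
    ("morning", "dining", "b.example/reviews"),
    ("evening", "news", "c.example/today"),
]
dist = page_distribution_within_intent(rows, "evening", "dining")
```

Here `dist` assigns two thirds of the probability mass to `a.example/menu` and one third to `b.example/reviews`; rows from other environments or intent categories do not contribute.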
In the embodiments of the present application, preferably, the information record acquisition module may be specifically configured to search network data according to the query string to obtain corresponding information records, and to classify those information records by intent category to obtain the information records of each intent category; and/or to search, according to the query string, separately in network data carrying each intent category label, to obtain the information records of each intent category.
In a preferred embodiment of the present application, the device may further include: a presentation module, configured to display the information records of each intent category output by the inter-class sorting module.
In the embodiments of the present application, preferably, the presentation module may be specifically configured to display, in each preset display region, the information records of each intent category output by the inter-class sorting module.
In the embodiments of the present application, preferably, the query string is derived from user input or from the webpage the user is currently browsing.
Since the embodiment of the device for sorting information records is substantially similar to the embodiment of the method for sorting information records, it is described relatively briefly; for relevant details, refer to the description of the method embodiment.
Referring to Fig. 6, a structural diagram of an embodiment of an information search server of the present application is shown, which may specifically include:
A receiving module 601, configured to receive a query string from an information search client and the environmental information corresponding to the query string;
An information search module 602, configured to search network data according to the query string to obtain the information records of each intent category;
An inter-class sorting module 603, configured to sort the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information; and
A return module 604, configured to return the information records output by the inter-class sorting module to the information search client.
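The behaviour of the inter-class sorting module 603 can be sketched in a few lines of Python. This is an illustration only: the function name, the dictionary layout of the search results, and the probabilities are assumptions, and the text does not prescribe how ties or missing categories are handled (here a category absent from the distribution is treated as having probability 0).

```python
def sort_between_classes(records_by_intent, intent_distribution):
    """Order intent categories by probability, then emit their records category by category."""
    ordered_intents = sorted(
        records_by_intent,
        key=lambda intent: intent_distribution.get(intent, 0.0),
        reverse=True,
    )
    # Records keep their existing order inside each category.
    return [
        (intent, record)
        for intent in ordered_intents
        for record in records_by_intent[intent]
    ]


# Hypothetical output of the information search module, grouped by intent category.
records_by_intent = {
    "news": ["n1", "n2"],
    "maps": ["m1"],
    "shopping": ["s1", "s2"],
}
# Hypothetical P(intent | environment) obtained from log statistics.
intent_distribution = {"maps": 0.45, "shopping": 0.40, "news": 0.15}

ranked = sort_between_classes(records_by_intent, intent_distribution)
```

The resulting order places the "maps" record first, followed by the "shopping" records and then the "news" records, which is the list a return module would send back to the client.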
In a preferred embodiment of the present application, the information search module 602 may be specifically configured to, when the user uses a search engine, search network data according to the query string to obtain corresponding information records, and classify those information records by intent category to obtain the information records of each intent category; and/or, when the user browses information with a browser, search separately in network data carrying each intent category label according to the query string corresponding to the currently browsed webpage, to obtain the information records of each intent category.
In a preferred embodiment of the present application, the information search server may further include:
A first relevance sorting module, configured to perform a first relevance sort on the information records output by the information search module according to the relevance between each information record and the query string, and to output the information records after the first relevance sort to the inter-class sorting module; or
A second relevance sorting module, configured to perform a second relevance sort on the information records output by the inter-class sorting module according to the relevance between each information record and the query string, and to output the information records after the second relevance sort to the return module.
In a preferred embodiment of the present application, the information search server may further include:
An identification module, configured to identify the user identification information of the current user corresponding to the query string;
An interest-based inter-class sorting module, configured to sort the information records of each intent category according to the current user's distribution over the intent categories under the environmental information corresponding to the query string; wherein the current user's distribution over the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information and user identification information;
The return module is further configured to return the information records output by the interest-based inter-class sorting module to the information search client; it may also return to the information search client the information records output after combined processing by the inter-class sorting module and the interest-based inter-class sorting module.
In a preferred embodiment of the present application, the information search server may further include:
An intra-class sorting module, applied before or after the inter-class sorting module and configured to sort the information records within each intent category according to the distribution of webpages of a specific intent category under the environmental information corresponding to the query string; wherein the distribution of webpages of the specific intent category under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information;
The return module is further configured to return the information records output by the intra-class sorting module to the information search client. It may also return to the information search client the information records output by the inter-class sorting module, or the information records output after combined processing by the inter-class sorting module and the intra-class sorting module, or the information records output after combined processing by the inter-class sorting module, the interest-based inter-class sorting module, and the intra-class sorting module.
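The combined processing of inter-class and intra-class sorting described above can be sketched as follows. This is a minimal illustration under assumptions: information records are represented by their URLs, the two distributions are supplied as plain dictionaries, and the function name and all values are hypothetical. Because the two sorts act on different levels (category order versus order inside a category), applying the intra-class step before or after the inter-class step yields the same list in this sketch.

```python
def rank_records(records_by_intent, intent_distribution, page_distribution_by_intent):
    """Sort pages inside each intent category, then order the categories themselves."""
    ordered_intents = sorted(
        records_by_intent,
        key=lambda intent: intent_distribution.get(intent, 0.0),
        reverse=True,
    )
    ranked = []
    for intent in ordered_intents:
        page_distribution = page_distribution_by_intent.get(intent, {})
        # Intra-class step: pages with higher P(page | environment, intent) come first.
        in_class = sorted(
            records_by_intent[intent],
            key=lambda url: page_distribution.get(url, 0.0),
            reverse=True,
        )
        ranked.extend(in_class)
    return ranked


# Hypothetical search results grouped by intent category.
records_by_intent = {
    "news": ["c.example/today"],
    "dining": ["b.example/reviews", "a.example/menu"],
}
intent_distribution = {"dining": 0.7, "news": 0.3}
page_distribution_by_intent = {
    "dining": {"a.example/menu": 2 / 3, "b.example/reviews": 1 / 3},
    "news": {"c.example/today": 1.0},
}

result = rank_records(records_by_intent, intent_distribution, page_distribution_by_intent)
```

In this example the "dining" category moves ahead of "news", and inside "dining" the menu page moves ahead of the reviews page.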
Referring to Fig. 7, a structural diagram of an embodiment of an information search client of the present application is shown, which may specifically include:
A receiving module 701, configured to receive a query string input by the user;
An environment collection module 702, configured to collect the environmental information corresponding to the query string;
A sending module 703, configured to send the query string and the environmental information corresponding to the query string to an information search server; and
A presentation module 704, configured to display the information records returned by the information search server.
In a preferred embodiment of the present application, the information search client may further include:
A query log recording module, configured to record the user identification information, the query string, and the corresponding webpage operation history and environmental information in a query log, where the webpage operation history is the record of the webpages operated by the user among the information records corresponding to the query string.
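One possible shape for a query log entry is sketched below in Python. The four fields mirror the items the query log recording module is said to record; everything else is an assumption made for illustration, including the field names, the JSON-lines serialization, and the keys used for the environmental information.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class QueryLogEntry:
    user_id: str                                         # user identification information
    query: str                                           # the query string
    operated_pages: list = field(default_factory=list)   # webpage operation history
    environment: dict = field(default_factory=dict)      # environmental information


def append_entry(log_lines, entry):
    """Serialize one entry as a JSON line and append it to an in-memory log."""
    log_lines.append(json.dumps(asdict(entry), ensure_ascii=False))
    return log_lines


# Hypothetical entry: the user searched for "coffee" and opened one result.
log = append_entry(
    [],
    QueryLogEntry(
        user_id="u-001",
        query="coffee",
        operated_pages=["a.example/menu"],
        environment={"time": "evening", "device": "mobile"},
    ),
)
restored = json.loads(log[0])
```

Entries of this kind carry what the statistics modules need: the environment, the pages the user operated, and the user identifier for per-user distributions.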
Since the embodiments of the information search server and the information search client are substantially similar to the embodiment of the method for sorting information records, they are described relatively briefly; for relevant details, refer to the description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
The method and device for sorting information records, the information search server, and the information search client provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the application, and the description of the above embodiments is intended only to help in understanding the method of the application and its core idea. Meanwhile, for those of ordinary skill in the art, changes may be made to the specific implementations and the scope of application in accordance with the idea of the application. In summary, the content of this specification should not be construed as limiting the application.
Claims (20)
1. A method for sorting information records, characterized in that the method comprises:
Collecting, in real time, the environmental information corresponding to a query string; wherein the environmental information includes: information about the surrounding environment of the user to whom the query string corresponds;
Obtaining the information records of each intent category according to the query string; wherein the intent categories are used to distinguish different information needs among the information records;
Sorting the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, and adjusting the order of the information records according to the sorting result of the intent categories; wherein the distribution of the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information;
Wherein the method further comprises:
Sorting the information records of each intent category according to the distribution of webpages of a specific intent category under the environmental information corresponding to the query string; wherein the distribution of webpages of the specific intent category under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information.
2. The method of claim 1, characterized in that the distribution of the intent categories under the environmental information corresponding to the query string is obtained by the following steps:
Statistically analyzing the webpages in the entire network according to the user logs under the environmental information corresponding to the query string, to obtain the distribution of each webpage under the environmental information;
For a specific webpage, statistically analyzing the intent categories according to the user logs under the environmental information corresponding to the query string, to obtain the intent category distribution of the specific webpage under the environmental information;
Taking each webpage as a statistical sample, accumulating the distribution of each webpage under the environmental information and the intent category distribution of the specific webpage under the environmental information, to obtain the distribution of the intent categories under the environmental information corresponding to the query string.
3. The method of claim 1, characterized by further comprising:
Identifying the user identification information of the current user corresponding to the query string;
Sorting the intent categories according to the current user's distribution over the intent categories under the environmental information corresponding to the query string, and adjusting the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution over the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information and user identification information.
4. The method of claim 3, characterized in that the current user's distribution over the intent categories under the environmental information corresponding to the query string is obtained by the following steps:
Statistically analyzing the user logs to obtain the distribution of the intent categories and the distribution of each item of environmental information under a specific intent category, and from these deriving the distribution of all users over the intent categories under the environmental information;
Statistically analyzing the current user's logs to obtain the current user's distribution over the intent categories and the current user's distribution of each item of environmental information under a specific intent category, and from these deriving the current user's preliminary distribution over the intent categories under the environmental information;
Weighting the distribution of all users over the intent categories under the environmental information together with the current user's preliminary distribution over the intent categories under the environmental information, to obtain the current user's distribution over the intent categories under the environmental information corresponding to the query string.
5. The method of claim 1, characterized in that the distribution of webpages of the specific intent category under the environmental information corresponding to the query string is obtained by the following steps:
Statistically analyzing the user logs to obtain the distribution of each webpage over the entire network, the distribution of a specific webpage over the intent categories under the environmental information corresponding to the query string, and the distribution of each webpage under the environmental information corresponding to the query string;
Constructing the joint distribution of the environmental information corresponding to the query string, the specific intent category, and each webpage in the entire network, from the distribution of each webpage over the entire network, the distribution of a specific webpage over the intent categories under that environmental information, and the distribution of each webpage under that environmental information;
Obtaining the distribution of webpages of the specific intent category under the environmental information corresponding to the query string as the ratio of the joint distribution of the environmental information, the specific intent category, and each webpage in the entire network to the joint distribution of the environmental information and the specific intent category.
6. The method of any one of claims 1 to 4, characterized in that obtaining the information records of each intent category according to the query string comprises:
Searching network data according to the query string to obtain corresponding information records, and classifying those information records by intent category to obtain the information records of each intent category;
And/or, searching, according to the query string, separately in network data carrying each intent category label, to obtain the information records of each intent category.
7. The method of any one of claims 1 to 4, characterized in that the user logs include browser logs and/or query logs; the browser logs record user identification information, webpage browsing history, and corresponding environmental information; the query logs record user identification information, query strings, and corresponding webpage operation history and environmental information, where the webpage operation history is the record of the webpages operated by the user among the information records corresponding to the query string.
8. The method of any one of claims 1 to 4, characterized by further comprising:
Displaying the sorted information records of each intent category.
9. The method of claim 8, characterized by further comprising: displaying the recommendation results of each intent category in each preset display region.
10. The method of any one of claims 1 to 4, characterized in that the query string is derived from user input or from the webpage the user is currently browsing.
11. A device for sorting information records, characterized in that the device comprises:
A collection module, configured to collect, in real time, the environmental information corresponding to a query string; wherein the environmental information includes: information about the surrounding environment of the user to whom the query string corresponds;
An information record acquisition module, configured to obtain the information records of each intent category according to the query string; wherein the intent categories are used to distinguish different information needs among the information records; and
An inter-class sorting module, configured to sort the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information;
Wherein the device further comprises:
An intra-class sorting module, configured to sort the information records within each intent category according to the distribution of webpages of a specific intent category under the environmental information corresponding to the query string; wherein the distribution of webpages of the specific intent category under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information.
12. The device of claim 11, characterized by further comprising:
A first statistical module, configured to obtain the distribution of the intent categories under the environmental information corresponding to the query string, including:
A first statistics submodule, configured to statistically analyze the webpages in the entire network according to the user logs under the environmental information corresponding to the query string, to obtain the distribution of each webpage under the environmental information;
A second statistics submodule, configured to, for a specific webpage, statistically analyze the intent categories according to the user logs under the environmental information corresponding to the query string, to obtain the intent category distribution of the specific webpage under the environmental information; and
A summation submodule, configured to accumulate, with the webpage as the variable, the distribution of webpages under the environmental information and the intent category distribution of the specific webpage under the environmental information, to obtain the distribution of the intent categories under the environmental information corresponding to the query string.
13. The device of claim 11, characterized by further comprising:
An identification module, configured to identify the user identification information of the current user corresponding to the query string;
An interest-based inter-class sorting module, configured to sort the intent categories according to the current user's distribution over the intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution over the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information and user identification information.
14. The device of claim 13, characterized by further comprising:
A second statistical module, configured to obtain the user's distribution over the intent categories under the environmental information corresponding to the query string, including:
A third statistics submodule, configured to statistically analyze the user logs to obtain the distribution of the intent categories and the distribution of each item of environmental information under a specific intent category, and from these to derive the distribution of all users over the intent categories under the environmental information;
A fourth statistics submodule, configured to statistically analyze the current user's logs to obtain the current user's distribution over the intent categories and the current user's distribution of each item of environmental information under a specific intent category, and from these to derive the current user's preliminary distribution over the intent categories under the environmental information; and
A linear weighting submodule, configured to weight the distribution of all users over the intent categories under the environmental information together with the current user's preliminary distribution over the intent categories under the environmental information, to obtain the current user's distribution over the intent categories under the environmental information corresponding to the query string.
15. An information search server, characterized by comprising:
A receiving module, configured to receive a query string from an information search client and the environmental information corresponding to the query string; wherein the environmental information includes: information, collected in real time by the information search client, about the surrounding environment of the user to whom the query string corresponds;
An information search module, configured to search network data according to the query string to obtain the information records of each intent category; wherein the intent categories are used to distinguish different information needs among the information records;
An inter-class sorting module, configured to sort the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the distribution of the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information; and
A return module, configured to return the information records output by the inter-class sorting module;
Wherein the information search server further comprises:
An intra-class sorting module, configured to sort the information records within each intent category according to the distribution of webpages of a specific intent category under the environmental information corresponding to the query string; wherein the distribution of webpages of the specific intent category under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information;
The return module is further configured to return the information records output by the intra-class sorting module to the information search client.
16. The information search server of claim 15, characterized in that the information search module is specifically configured to search network data according to the query string to obtain corresponding information records, and to classify those information records by intent category to obtain the information records of each intent category; and/or to search, according to the query string, separately in network data carrying each intent category label, to obtain the information records of each intent category.
17. The information search server of claim 15, characterized by further comprising:
A first relevance sorting module, configured to perform a first relevance sort on the information records output by the information search module according to the relevance between each information record and the query string, and to output the information records after the first relevance sort to the inter-class sorting module; or
A second relevance sorting module, configured to perform a second relevance sort on the information records output by the inter-class sorting module according to the relevance between each information record and the query string, and to output the information records after the second relevance sort to the return module.
18. The information search server of claim 15, characterized by further comprising:
An identification module, configured to identify the user identification information of the current user corresponding to the query string;
An interest-based inter-class sorting module, configured to sort the intent categories according to the current user's distribution over the intent categories under the environmental information corresponding to the query string, and to adjust the order of the information records according to the sorting result of the intent categories; wherein the current user's distribution over the intent categories under the environmental information corresponding to the query string is obtained by statistical analysis of user logs that record environmental information and user identification information;
The return module is further configured to return the information records output by the interest-based inter-class sorting module to the information search client.
19. An information search client, characterized by comprising:
A query receiving module, configured to receive a query string input by the user;
An environment collection module, configured to collect, in real time, the environmental information corresponding to the query string; wherein the environmental information includes: information about the surrounding environment of the user to whom the query string corresponds;
A sending module, configured to send the query string and the environmental information corresponding to the query string to an information search server; and
A presentation module, configured to display the information records returned by the information search server; wherein the information records are obtained by sorting the intent categories according to the distribution of the intent categories under the environmental information corresponding to the query string, adjusting the order of the information records according to the sorting result of the intent categories, and sorting the information records within each intent category according to the distribution of webpages of a specific intent category under the environmental information corresponding to the query string; the intent categories are used to distinguish different information needs among the information records.
20. The information search client as claimed in claim 19, characterized by further comprising:
A query log recording module, configured to record user identification information, the query string, and the corresponding web page operation history and environmental information to a query log, wherein the web page operation history is the record of the web pages operated on by the user among the information records corresponding to the query string.
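A query log entry of the kind described in claim 20 could be represented as in the sketch below. The class name, the field names, and the JSON-lines storage format are all assumptions made for illustration; the claim only lists the four kinds of information that are recorded.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class QueryLogEntry:
    """One hypothetical query log record."""
    user_id: str          # user identification information
    query: str            # the query string
    environment: dict     # environmental information collected by the client
    clicked_urls: list = field(default_factory=list)  # web page operation history

def append_entry(log_lines, entry):
    """Serialise the entry as one JSON line and append it to the log."""
    log_lines.append(json.dumps(asdict(entry), ensure_ascii=False))
    return log_lines

# Hypothetical sample entry.
log = append_entry(
    [],
    QueryLogEntry("u1", "avatar", {"place": "cinema"}, ["c.example/times"]),
)
```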
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210038993.2A CN102622417B (en) | 2012-02-20 | 2012-02-20 | The method and apparatus that information record is ranked up |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102622417A (en) | 2012-08-01 |
CN102622417B (en) | 2016-08-31 |
Family
ID=46562336
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210038993.2A Active CN102622417B (en) | 2012-02-20 | 2012-02-20 | The method and apparatus that information record is ranked up |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102622417B (en) |
Families Citing this family (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103593353B (en) * | 2012-08-15 | 2018-11-13 | 阿里巴巴集团控股有限公司 | Information search method, method for determining ranking weight values of displayed information, and devices thereof |
CN103810210B (en) * | 2012-11-14 | 2018-10-19 | 腾讯科技(深圳)有限公司 | Search result display methods and device |
CN103838754B (en) * | 2012-11-23 | 2017-12-22 | 腾讯科技(深圳)有限公司 | Information retrieval device and method |
CN103885979B (en) | 2012-12-21 | 2018-06-05 | 深圳市世纪光速信息技术有限公司 | Method and apparatus for pushing information |
CN104112235B (en) * | 2013-04-22 | 2018-05-29 | 中广核工程有限公司 | Method and system for searching nuclear power project experience feedback information |
CN104657397B (en) * | 2013-11-25 | 2020-03-03 | 腾讯科技(深圳)有限公司 | Information processing method and terminal |
CN104699725B (en) * | 2013-12-10 | 2018-10-09 | 阿里巴巴集团控股有限公司 | data search processing method and system |
US10666735B2 (en) | 2014-05-19 | 2020-05-26 | Auerbach Michael Harrison Tretter | Dynamic computer systems and uses thereof |
US9742853B2 (en) * | 2014-05-19 | 2017-08-22 | The Michael Harrison Tretter Auerbach Trust | Dynamic computer systems and uses thereof |
CN104572960B (en) * | 2014-12-29 | 2018-07-06 | 北京奇虎科技有限公司 | Search method and device |
CN104715011A (en) * | 2014-12-31 | 2015-06-17 | 上海孩子国科教设备有限公司 | Method and system for conducting data retrieval |
CN105302903B (en) * | 2015-10-27 | 2018-12-14 | 广州神马移动信息科技有限公司 | Search method, device, and system, and method for determining the basis for ranking search results |
CN105893427A (en) * | 2015-12-07 | 2016-08-24 | 乐视网信息技术(北京)股份有限公司 | Resource searching method and server |
CN106874413A (en) * | 2017-01-22 | 2017-06-20 | 斑马信息科技有限公司 | Search system and its method for processing search results |
CN107515857B (en) * | 2017-08-31 | 2020-08-18 | 科大讯飞股份有限公司 | Semantic understanding method and system based on customization technology |
CN107832432A (en) * | 2017-11-15 | 2018-03-23 | 北京百度网讯科技有限公司 | Search result ranking method, device, server and storage medium |
CN108897785A (en) * | 2018-06-08 | 2018-11-27 | Oppo(重庆)智能科技有限公司 | Search for content recommendation method, device, terminal device and storage medium |
CN108763579B (en) * | 2018-06-08 | 2020-12-22 | Oppo(重庆)智能科技有限公司 | Search content recommendation method and device, terminal device and storage medium |
CN110162535B (en) * | 2019-03-26 | 2023-11-07 | 腾讯科技(深圳)有限公司 | Search method, apparatus, device and storage medium for performing personalization |
CN110990598B (en) * | 2019-11-18 | 2020-11-27 | 北京声智科技有限公司 | Resource retrieval method and device, electronic equipment and computer-readable storage medium |
CN113254513B (en) * | 2021-07-05 | 2021-09-28 | 北京达佳互联信息技术有限公司 | Sequencing model generation method, sequencing device and electronic equipment |
CN113792225B (en) * | 2021-08-25 | 2023-08-18 | 北京库睿科技有限公司 | Multi-data type hierarchical ordering method and device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1050830A2 (en) * | 1999-05-05 | 2000-11-08 | Xerox Corporation | System and method for collaborative ranking of search results employing user and group profiles |
CN1758248A (en) * | 2004-10-05 | 2006-04-12 | 微软公司 | Systems, methods, and interfaces for providing personalized search and information access |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7082365B2 (en) * | 2001-08-16 | 2006-07-25 | Networks In Motion, Inc. | Point of interest spatial rating search method and system |
US7693827B2 (en) * | 2003-09-30 | 2010-04-06 | Google Inc. | Personalization of placed content ordering in search results |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102622417B (en) | The method and apparatus that information record is ranked up | |
Ortiz‐Cordova et al. | Classifying web search queries to identify high revenue generating customers | |
JP5941075B2 (en) | SEARCH SYSTEM, METHOD, AND COMPUTER-READABLE MEDIUM WITH INTEGRATED USER JUDGMENT INCLUDING A AUTHORITY NETWORK | |
CN101551806B (en) | Personalized website navigation method and system | |
CN107862553A (en) | Advertisement real-time recommendation method, device, terminal device and storage medium | |
JP4418135B2 (en) | Group forming system, group forming method, and group forming apparatus | |
US9996630B2 (en) | System and/or method for linking network content | |
US20060064411A1 (en) | Search engine using user intent | |
US20200294071A1 (en) | Determining user intents related to websites based on site search user behavior | |
CN102037464A (en) | Search results with most clicked next objects | |
CN103646092A (en) | SE (search engine) ordering method based on user participation | |
CN101283353A (en) | Systems for and methods of finding relevant documents by analyzing tags | |
US20120041936A1 (en) | Search engine optimization at scale | |
CN104598604A (en) | Browsing method of website navigation applied in various browsers | |
KR101559719B1 (en) | Auto-learning system and method for derive effective marketing | |
CN105159898B (en) | A kind of method and apparatus of search | |
Chen et al. | The best answers? think twice: online detection of commercial campaigns in the CQA forums | |
CN102930009B (en) | Individual website navigation system | |
Raju et al. | A novel approaches in web mining techniques in case of web personalization | |
KR20010108877A (en) | Method For Evaluating A Web Site | |
Ohmukai et al. | Personal knowledge publishing suite with Weblog | |
Chen et al. | The best answers? Think twice: identifying commercial campagins in the CQA forums | |
CN110321487A (en) | A kind of accurate label recommendations system and its workflow | |
Maheswari et al. | Algorithm for Tracing Visitors' On-Line Behaviors for Effective Web Usage Mining | |
Gudla et al. | Enhanced service recommender and ranking system using browsing patterns of users |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |