CN103780625B - User interest discovery method and apparatus - Google Patents


Info

Publication number
CN103780625B
Authority
CN
China
Prior art keywords
network access
behavioral data
user
field
access behavioral
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410038066.XA
Other languages
Chinese (zh)
Other versions
CN103780625A (en)
Inventor
汤传喜
郭奇
崔华
居胜峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201410038066.XA
Publication of CN103780625A
Application granted
Publication of CN103780625B
Legal status: Active


Abstract

The invention discloses a user interest discovery method and apparatus. The method mainly includes: collecting a user's network access behavior data; determining the field to which the network access behavior data belongs according to the entity words contained in the data and the multiple entity words preset for each field; calculating a weight value for the network access behavior data according to its attribute information in multiple dimensions; determining the user's degree of attention to the field to which the data belongs according to the weight values of the user's network access behavior data; and identifying the user's interests according to the user's degree of attention to that field and the preset interest threshold of the field, where the interest threshold of a field is set according to the network access behavior data produced by multiple users in the network when accessing that field. The technical solution provided by the invention can determine user interests more accurately.

Description

User interest discovery method and apparatus
Technical field
The present invention relates to the field of network access technology, and in particular to a user interest discovery method and a corresponding user interest discovery apparatus.
Background
Personalized information recommendation technology enables the network side to deliver information that matches a user's interests, and can therefore effectively increase the click volume and reading volume of network resources. For this reason, personalized information recommendation is increasingly applied in network access.
In personalized information recommendation, discovering user interests accurately and promptly is an extremely important step.
Existing approaches to user interest discovery fall into two main types. In the first, the user is guided to actively inform the network side of his or her interests. In the second, user interests are discovered automatically, that is, from the user's behavior information (the user's network access behavior data). This behavior information can include information on web pages the user has browsed, keywords the user has searched for, information on microblog posts the user has published, information on blog posts the user has published, the goods the user has purchased, and so on.
At present, discovering user interests from behavior information is usually implemented as follows: when the user reads a document, a web page, or similar content, the field to which that content belongs is determined, and that field is taken as an interest of the user. The fields the user has touched on can also be compared with one another, and the one or two fields the user visits most are taken as the user's interests.
In the course of making the present invention, the inventors found that existing implementations of user interest discovery are prone to misjudgment. Two specific examples follow.
First example: a user's reading of some content is sometimes driven by interfering factors, and interests discovered from such factors are probably not the user's real interests. For instance, if a field is a popular one, the user tends to have more opportunities to read content in it, but this does not mean that the user is really interested in the field. Likewise, a pop-up push or a misleading title can trigger the user to browse related content, but such content does not represent the user's real interests.
Second example: a user's reading may reflect only a shallow, temporary interest, and identifying that interest as a real one on the basis of the content read is a misjudgment. For instance, while watching a TV series, a user sometimes searches for an actor in it and reads some introductory material about that actor. Such reading behavior generally involves neither a high reading volume nor sustained recurrence. Concluding from it that the user is interested in the actor, and pushing information about the actor to the user, is clearly inappropriate.
Summary of the invention
An object of the present invention is to overcome the technical problems of existing user interest discovery approaches by providing a user interest discovery method and a corresponding user interest discovery apparatus. The technical problem to be solved is to determine user interests more accurately.
The object of the invention is achieved, and its technical problem solved, by the following technical solutions.
According to the user interest discovery method proposed by the present invention, the method includes: collecting a user's network access behavior data; determining the field to which the network access behavior data belongs according to the entity words contained in the data and the multiple entity words preset for each field; calculating a weight value for the network access behavior data according to its attribute information in multiple dimensions; determining the user's degree of attention to the field to which the network access behavior data belongs according to the weight values of the user's network access behavior data; and identifying the user's interests according to the user's degree of attention to that field and the preset interest threshold of the field, where the interest threshold of the field is set according to the network access behavior data produced by multiple users in the network when accessing the field.
According to the user interest discovery apparatus provided by an embodiment of the present invention, the apparatus includes: an acquisition module for collecting a user's network access behavior data; a field determination module for determining the field to which the network access behavior data belongs according to the entity words contained in the data and the multiple entity words preset for each field; a weight value module for calculating a weight value for the network access behavior data according to its attribute information in multiple dimensions; an attention module for determining the user's degree of attention to the field to which the network access behavior data belongs according to the weight values of the user's network access behavior data; and an interest identification module for identifying the user's interests according to the user's degree of attention to that field and the preset interest threshold of the field, where the interest threshold of the field is set according to the network access behavior data produced by multiple users in the network when accessing the field.
With the above technical solutions, the user interest discovery method and apparatus provided by the present invention have at least the following advantages and beneficial effects. The embodiments of the invention set the interest threshold of a field from the network access behavior data produced by multiple users in the network when accessing that field, so the threshold is built on the distribution of network accesses that those users make to the field and is therefore a reasonable one. Measuring a single user's degree of attention to a field against this threshold avoids, as far as possible, the misjudgment that arises when a user's interests are determined only by comparing that user's own different network access behaviors with one another. As a result, the invention can determine a user's interest in a field more accurately and deliver content that the user is really interested in.
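The relationship between per-record weight values, per-field attention, and a population-derived interest threshold can be illustrated with a minimal Python sketch. The patent does not give formulas for these steps, so everything concrete here is an assumption made for illustration: attention is taken as the sum of a user's record weights per field, the threshold is taken as a nearest-rank percentile (80 by default) of the attention values of many users, and the function names are hypothetical.

```python
from collections import defaultdict


def attention_by_field(records):
    """Sum the weight values of one user's records per field (assumed aggregation)."""
    totals = defaultdict(float)
    for field, weight in records:
        totals[field] += weight
    return dict(totals)


def field_threshold(population_attention, percent=80):
    """Assumed rule: nearest-rank percentile of many users' attention to one field."""
    ranked = sorted(population_attention)
    rank = (percent * len(ranked) + 99) // 100  # integer ceiling
    return ranked[max(rank, 1) - 1]


def interests(user_attention, thresholds):
    """Fields whose attention reaches the field's interest threshold."""
    return {
        field
        for field, attention in user_attention.items()
        if attention >= thresholds.get(field, float("inf"))
    }
```

Because the threshold comes from the behavior of many users rather than from a comparison among one user's own fields, a popular field that everyone visits often ends up with a correspondingly high threshold in this sketch.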
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention are more apparent, preferred embodiments are described in detail below.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the user interest discovery method provided by an embodiment of the present invention;
Fig. 2 is a schematic framework diagram of the user interest discovery method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the user interest discovery apparatus provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described in this specification are only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the invention without creative effort fall within the scope of protection of the invention.
Embodiment one: a user interest discovery method. The flow of the method is shown in Fig. 1 and Fig. 2.
In Fig. 1, S100: collect the user's network access behavior data.
Specifically, the user's network access behavior data in this embodiment includes: information on web pages the user has browsed, keywords the user has searched for, information on microblog posts the user has published (such as at least one keyword extracted from a microblog post), information on blog posts the user has published (such as at least one keyword extracted from a blog post), information on goods the user has purchased, and so on. The network access behavior data can also include time information about the user's network access behavior, such as the time at which the user launches the browser client, closes the browser client, logs in to the network, browses a web page, searches for a keyword, publishes a microblog post, publishes a blog post, or purchases goods. This time information can be used later to calculate the visit frequency, the access interval, and so on.
This embodiment can collect the user's network access behavior data using the browser client in the user's network terminal device. As a specific example, the browser client in the user's network terminal device can easily obtain information about the network access operations that the user performs, that is, the user's network access behavior data. The browser client can then transmit the collected data to a network device (such as the network device hosting the browser server side, or another device) according to a network device address preset inside the client, so that the network device can conveniently collect the user's network access behavior data. Note in particular that, when transmitting the user's network access behavior data, the browser client should also send its own identification information to the network device together with the data, so that the network device can determine from the browser client's identification information which user the received data corresponds to. In other words, in this embodiment a user can be represented by the identification information of a browser client.
The browser client can transmit the network access behavior data it collects to the network device in real time, or it can transmit the data on a schedule or at irregular intervals. For example, every hour on the hour, the browser client transmits to the network device the locally stored data produced by the user's network access operations during the previous hour and, after the transmission succeeds, deletes the locally stored data that has been successfully transmitted. As another example, when the network access behavior data that the browser client has collected and stored locally reaches a predetermined quantity (such as when the storage space it occupies reaches a predetermined size), the browser client transmits all of the locally stored data to the network device and, after the transmission succeeds, deletes the locally stored data that has been successfully transmitted.
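The batched transmission described above can be sketched as a small client-side buffer. This is an illustrative sketch only: the class name `UploadBuffer`, the `send` callback, and the default limits are hypothetical, and the scheduled case is simplified to "at least one interval has elapsed since the last flush" rather than "every hour on the hour".

```python
import time


class UploadBuffer:
    """Buffers records locally and uploads them by size or elapsed time."""

    def __init__(self, send, max_bytes=64 * 1024, interval_s=3600.0, clock=time.time):
        self._send = send              # callable(list_of_records) -> bool (True on success)
        self._max_bytes = max_bytes    # predetermined storage size that triggers an upload
        self._interval_s = interval_s  # scheduled upload interval, in seconds
        self._clock = clock
        self._records = []
        self._size = 0
        self._last_flush = clock()

    def pending(self):
        """Number of records still stored locally."""
        return len(self._records)

    def add(self, record):
        self._records.append(record)
        self._size += len(record.encode("utf-8"))
        overdue = self._clock() - self._last_flush >= self._interval_s
        if self._size >= self._max_bytes or overdue:
            self.flush()

    def flush(self):
        if not self._records:
            return
        # Local copies are deleted only after the transmission succeeds.
        if self._send(list(self._records)):
            self._records.clear()
            self._size = 0
            self._last_flush = self._clock()
```

If `send` reports a failure, the records stay in the buffer and are retried on the next trigger, which matches the rule that only successfully transmitted data is deleted.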
This embodiment can also collect the user's network access behavior data from the network side using an API (Application Programming Interface). Collecting from the network side through an API makes more of the user's network access behavior data available. For example, the API can be used to obtain data that was produced by network accesses the user performed before the browser client began reporting to the network device and that is stored on the network side. That is, the data corresponding to network access operations performed before the browser client was configured to obtain the user's network access behavior data and send it to the network device can be collected through the API.
A first specific example of collecting the user's network access behavior data through an API is as follows. When the network device (such as the network device hosting the browser server side) receives information sent by a browser client through its network terminal device, it immediately checks whether the received information contains login information for a microblog, a blog, or the like. If it does, the network device obtains the login account information of the logged-in user from the login information and uses the API to obtain, from the corresponding server, the content that the logged-in user has published under that login account (such as blog or microblog content). The network device then processes the obtained content, for example by extracting keywords, and thereby collects the network access behavior data of the user (the user represented by the browser client identifier). The content obtained through the API is not limited to what the logged-in user publishes under the login account on this occasion; it can also include content published under that account during an earlier period (such as the month preceding the current time).
A second specific example of collecting the user's network access behavior data through an API is as follows. At a preset time (such as early every morning), the network device performs a centralized analysis of all the information that all browser clients have sent through their network terminal devices during a preset interval (such as 24 hours), in order to identify, among all the information received, the items that contain login information for a microblog, a blog, or the like. Using the login information contained in the identified items, the network device then uses the API to obtain, from the corresponding servers (such as the microblog or blog servers), the content that each logged-in user has published under the login account (such as blog or microblog content). The network device then processes the obtained content, for example by extracting keywords, and thereby collects the network access behavior data of the user (the user represented by the browser client identifier). The content obtained through the API is not limited to what the logged-in user publishes under the login account on this occasion; it can also include content published under that account during an earlier period (such as the month preceding the current time).
Note that, in the first and second specific examples above, if one network terminal device is shared by several people, the items of information from the browser client on that device may contain the login information of several different logged-in users. In this case, this embodiment can treat the keywords in the content corresponding to the login information of all of these users as the network access behavior data of a single user (the user represented by the browser client identifier), that is, without distinguishing between logged-in users. Alternatively, this embodiment can treat only the keywords in the content corresponding to the login information of one of these users as the network access behavior data of the user represented by the browser client, that is, it can distinguish between logged-in users. For example, the keywords in the content corresponding to the login information of the user who has logged in most often can be used as the network access behavior data of the user represented by the browser client, while the login information of the other users is not used to obtain content or extract keywords. The user with the most logins is thus associated with the user represented by the browser client identifier.
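Selecting the logged-in user with the most logins on a shared device can be sketched in a few lines of Python. The function name and the input shape (a list of login account names observed from one browser client) are assumptions made for illustration; how ties are resolved is not specified in the text, and here the account seen first wins.

```python
from collections import Counter


def primary_login(login_events):
    """Return the account with the most logins, or None if there are no logins."""
    counts = Counter(login_events)
    if not counts:
        return None
    # most_common keeps first-seen order among equal counts.
    return counts.most_common(1)[0][0]
```

Only content published under the returned account would then be fetched and reduced to keywords; the other accounts are skipped.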
This embodiment can also obtain the user's network access behavior data in ways other than the two listed above (collection by the browser client and collection through an API). In addition, the user's network terminal device in this embodiment can be any network terminal device capable of network access, such as the user's computer, smartphone, or tablet.
S110: determine the field to which the network access behavior data collected in the previous step belongs, according to the entity words contained in the data and the multiple entity words preset for each field.
Specifically, this embodiment can represent each field in advance as a vector composed of a series of entity words. For a piece of network access behavior data that it receives, the network device can first compute a vector from the entity words contained in the data (one or more entity words) using a predetermined algorithm, then use a predetermined distance function to measure the distance between that vector and the vector of each field, and finally determine the field to which the received data belongs from the measured distances (for example, by taking the nearest field as the field to which the data belongs).
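The nearest-field rule can be sketched as follows. The text leaves both the "predetermined algorithm" and the "predetermined distance function" open, so this sketch assumes a bag-of-words count vector and cosine distance; the field vectors in the usage note are invented examples, not data from the patent.

```python
import math
from collections import Counter


def to_vector(entity_words):
    """Assumed vectorization: count of each entity word."""
    return Counter(entity_words)


def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 when either vector is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def classify(entity_words, field_vectors):
    """Return the name of the field whose vector is nearest to the record's vector."""
    vec = to_vector(entity_words)
    return min(field_vectors, key=lambda name: cosine_distance(vec, field_vectors[name]))
```

For example, with `field_vectors = {"sports": Counter(["NBA", "football", "playoffs"]), "film": Counter(["actor", "director", "premiere"])}`, a record containing the entity words `["NBA", "playoffs"]` is assigned to `"sports"`.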
This embodiment can also determine the field to which the collected network access behavior data belongs in other ways, which are not enumerated here.
S120: calculate the weight value of the network access behavior data according to its attribute information in multiple dimensions.
Specifically, the network access behavior data in this embodiment corresponds to multiple dimensions (which can also be called statistical dimensions) and has corresponding attribute information in each dimension. This attribute information does not represent an attribute that the data inherently has in the corresponding dimension, but a temporary attribute that the data acquires in that dimension as a result of the user's access behavior.
As one specific example, the attribute information of the network access behavior data in multiple dimensions can include: the reach count of the field to which the data belongs, the visit frequency of the field to which the data belongs, the access mode that produced the data, and the information quality of the content resource corresponding to the data.
As another specific example, the attribute information of the network access behavior data in multiple dimensions can include: the reach count of the field to which the data belongs, the access interval of the field to which the data belongs, the access mode that produced the data, and the information quality of the content resource corresponding to the data.
As yet another specific example, the attribute information of the network access behavior data in multiple dimensions can include: the reach count of the field to which the data belongs, the visit frequency of the field to which the data belongs, the access interval of the field to which the data belongs, the access mode that produced the data, and the information quality of the content resource corresponding to the data.
The reach count of the field to which the network access behavior data belongs represents the number of times the user has reached that field. That is, within one field, if the user's reaches are counted sequentially across all the network access behavior data of the field, the sequence number corresponding to a given piece of data is its reach count. The reach count can be set by the network device.
The visit frequency of the field to which the network access behavior data belongs represents how frequently the user accesses that field. That is, within one field, if each piece of network access behavior data in the field is treated as one access by the user to the field, then the visit frequency value obtained when a given piece of data is brought into the real-time calculation of the field's visit frequency can serve as the visit frequency for that piece of data. The visit frequency can be calculated and set by the network device. The reach count and the visit frequency are related: the more reaches within a period of time, the higher the visit frequency. As a specific example, if a user often reads NBA news, the reach count for the entity word NBA will be high and, at the same time, the visit frequency that the entity word NBA shows along the time dimension will also be high.
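A minimal sketch of how these two attributes might be derived when a new record arrives is given below. The text does not say over what period the frequency is measured, so the 7-day sliding window, the unit (visits per day), and the function name are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def reach_count_and_frequency(previous_visits, now, window=timedelta(days=7)):
    """Return (reach count, visits per day within the window) for a new record.

    previous_visits: datetimes of the user's earlier accesses to the same field.
    """
    # The new record is the next one in sequence for this field.
    reach_count = len(previous_visits) + 1
    # Count the new access plus earlier accesses that fall inside the window.
    recent = 1 + sum(1 for t in previous_visits if now - t <= window)
    frequency = recent / (window.total_seconds() / 86400.0)
    return reach_count, frequency
```

In this sketch the reach count grows without bound, while the frequency falls back once the user stops visiting the field, which is one way to tell a sustained interest from a burst of reading.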
The access mode that produced the network access behavior data is the specific way in which the user performed the network access that produced the data. For example, the data may have been produced by an active access (such as opening the browser client and browsing a web page by entering its URL in the address bar, or browsing web pages found by actively searching for a keyword), or by the user clicking on a pushed pop-up or on content within a web page. The access mode can be set by the browser client and transmitted to the network device together with the network access behavior data.
The information quality of the content resource corresponding to the network access behavior data indicates, to some extent, how professional the content resource is. It can be determined from how at least one high-end user in the field to which the content resource belongs has accessed the content resource. A high-end user here can be a user who has already been determined to be interested in the field (the field to which the received network access behavior data belongs); such a user can also be called a senior user of the field. As a specific example, this embodiment can determine the specific information quality value of the content resource from information such as whether the content resource has been accessed by one or more high-end users of the field and/or the number of times it has been accessed by all high-end users of the field. The information quality can be set by the network device. A high-end user can also be a user who has not only been determined to be interested in the field but whose interest in the field reaches an enthusiast level. For example, when a user's degree of attention to the field to which the content resource belongs reaches not only the corresponding interest threshold but also a predetermined threshold that is higher than the interest threshold of that field, the user is determined to be a high-end user of the field. As another example, when a user's degree of attention to the field to which the content resource belongs reaches the corresponding interest threshold and the user has also accessed a predetermined website, the user can be determined to be a high-end user; the predetermined website is usually a highly professional one.
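The two high-end-user rules and an access-count-based quality value can be sketched as follows. The function names, the way the two rules are combined in one function, and the use of a raw access count as the quality value are assumptions made for illustration; the text leaves the exact quality formula open.

```python
def is_high_end(attention, interest_threshold, high_end_threshold,
                visited_professional_site=False):
    """A user must first reach the field's interest threshold, then satisfy
    either the higher predetermined threshold or the professional-site rule."""
    if attention < interest_threshold:
        return False
    return attention >= high_end_threshold or visited_professional_site


def information_quality(access_log, high_end_users):
    """Assumed quality value: number of accesses to the resource by high-end users.

    access_log: user identifiers, one entry per access to the content resource.
    """
    return sum(1 for user in access_log if user in high_end_users)
```

A resource that no high-end user of the field has accessed gets a quality value of 0 under this sketch, and repeated accesses by the same high-end user each count.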
The access interval of the field to which the network access behavior data belongs is the interval between the user's accesses to that field. That is, over the user's multiple online sessions, it is the number of online sessions between the user's previous access to the field and the next access to it. Online sessions can be counted in units of days (that is, the user going online several times within one day counts as one session), or in other units, such as the number of times the user opens the browser client. The access interval can be calculated and set by the network device. As a specific example, a user goes online on January 7 and accesses a content resource in the sports field. The user then does not go online at all until January 10, when the user goes online again and once more accesses a content resource in the sports field. The access interval of the corresponding network access behavior data can then be set to 1, rather than to the number of days between January 7 and January 10.
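The January 7 and January 10 example can be reproduced with a short sketch that counts the interval in online sessions rather than calendar days. It assumes sessions are counted per day and that both visit dates appear among the user's online days; the function name is hypothetical.

```python
from datetime import date


def access_interval(online_days, previous_visit, current_visit):
    """Number of online-day steps between two visits to the same field.

    Days on which the user was not online at all do not count. Raises
    ValueError if either visit date is not among the online days.
    """
    days = sorted(set(online_days))
    return days.index(current_visit) - days.index(previous_visit)
```

With online days of January 5, 7, and 10, a visit on January 7 followed by one on January 10 gives an interval of 1, because January 10 is the user's very next online session.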
This embodiment can set, in advance, a corresponding coefficient for each of the different attribute values in all or some of the dimensions. For example, the coefficient set for active access is higher than the coefficient set for passive access, and the coefficient set for the information quality of a content resource that has been accessed by high-end users is higher than the coefficient set for the information quality of a content resource that has not. After the attribute information of the network access behavior data in the multiple dimensions has been determined, the weight value of the data can then be calculated from each piece of attribute information and its corresponding coefficient. The calculation can use whatever method suits the actual situation; specific calculation methods are not described one by one here.
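Since the text leaves the calculation method open, the following is only one possible sketch of combining attribute information with preset coefficients. The coefficient values, the multiplicative form, and the `1 + visit_frequency` factor are all assumptions; the only properties taken from the text are that active access outranks passive access and that content accessed by high-end users outranks content that is not.

```python
# Hypothetical coefficients; only their ordering follows the description.
MODE_COEFFICIENT = {"active": 1.0, "passive": 0.3}
QUALITY_COEFFICIENT = {True: 1.5, False: 1.0}


def record_weight(access_mode, accessed_by_high_end, visit_frequency):
    """Assumed formula: product of the two coefficients and a frequency factor."""
    return (
        MODE_COEFFICIENT[access_mode]
        * QUALITY_COEFFICIENT[accessed_by_high_end]
        * (1.0 + visit_frequency)
    )
```

Under these assumed values, a page reached by an active search and also read by high-end users of the field weighs several times more than a page reached through a pushed pop-up, which is the effect the coefficients are meant to produce.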
When one piece of network access behavioral data is received, or several pieces are received at the same time, the present embodiment may immediately calculate the weighted value of the received data and store the calculated weighted value locally together with the network access behavioral data and its attribute information in each dimension. Of course, the present embodiment may also process the received network access behavioral data at regular or irregular intervals. For example, at the top of every hour, the network device calculates weighted values for all received and locally stored network access behavioral data for which no weighted value has yet been calculated and, once the calculation is complete, stores each weighted value together with the corresponding network access behavioral data and its attribute information in each dimension. As another example, when the locally stored network access behavioral data reaches a predetermined quantity (for example, when the storage space occupied by the received and locally stored data reaches a predetermined size), the network device calculates weighted values for all locally stored data for which no weighted value has yet been calculated and, once the calculation is complete, stores each weighted value together with the corresponding network access behavioral data and its attribute information in each dimension.
The user's network access behavioral data, the attribute information in the multiple dimensions corresponding to that data, the calculated weighted values, and the like may be stored together in the user's feature library (as shown in Fig. 2).
The present embodiment may calculate the weighted value of the network access behavioral data in a variety of ways; the specific implementation may be chosen according to the actual application and is not described in detail here.
S130: determine, according to the weighted value of the user's network access behavioral data, the user's attention rate for the field to which the network access behavioral data belongs.
Specifically, the present embodiment may calculate the user's attention rate for the field to which the network access behavioral data belongs in real time. That is, each time the network device receives one piece of network access behavioral data, or receives several pieces at the same time, it may immediately calculate the attention rate for that data and use the newly calculated attention rate to update the user's attention rate for the field to which the data belongs (see "online processing" in Fig. 2, whose results are used to update the information stored in the "feature library").
The present embodiment may also calculate the user's attention rate for the field to which the network access behavioral data belongs in a non-real-time (that is, offline) manner. For example, in the early morning of each day, the attention rate is calculated for the user's network access behavioral data received on the previous day; once the calculation is complete, the newly calculated attention rates are used to update the user's attention rate for the field to which each piece of data belongs (see "offline processing" in Fig. 2, whose results are used to update the information stored in the "feature library").
The present embodiment may use the weighted values of the user's network access behavioral data to calculate the user's attention rate for the corresponding field in a variety of ways; the specific implementation may be chosen according to the actual situation and is not described in detail here.
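Because the embodiment does not fix how the attention rate is derived from the weighted values, the following sketch simply assumes that the attention rate for a field is the running sum of the weighted values of the data belonging to that field. The class name `AttentionStore` and its methods are hypothetical stand-ins for the per-user feature library; the same accumulator serves both the online path (one record at a time) and the offline path (a batch, for example the previous day's records).

```python
from collections import defaultdict


class AttentionStore:
    """Hypothetical per-user store of attention rates, keyed by field."""

    def __init__(self):
        self.attention = defaultdict(float)

    def update(self, field, weighted_value):
        """Online path: fold one newly received record into the field's attention rate."""
        self.attention[field] += weighted_value
        return self.attention[field]

    def update_batch(self, records):
        """Offline path: fold in a batch of (field, weighted_value) pairs."""
        for field, weighted_value in records:
            self.update(field, weighted_value)


online_store = AttentionStore()
online_store.update("sports", 2.0)
online_store.update("sports", 1.0)

offline_store = AttentionStore()
offline_store.update_batch([("sports", 2.0), ("sports", 1.0)])
```

With a purely additive update, the online and offline paths arrive at the same attention rate; a scheme with time decay would need to carry timestamps as well.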
S140: identify the user's interest according to the user's attention rate for the field to which the network access behavioral data belongs and the preset interest threshold corresponding to that field.
Specifically, in the present embodiment the preset interest threshold corresponding to a field is set according to the network access behavioral data produced when multiple users in the network (such as all users of the network) access content resources belonging to that field.
How multiple users (such as all users of the network) access a field reflects how the degree of interest in that field differs from user to user. An interest threshold set from multiple users' access to the field can therefore accurately reflect how users who are interested in the field actually access it, so that judging whether a user is interested in the field by means of such a threshold makes the result more accurate.
As a specific example, two fields are defined: a first field that people encounter frequently (such as NBA) and a second field that people encounter infrequently (such as ornamental fish). User A's access count for the first field often far exceeds user A's access count for the second field; however, this does not accurately indicate that the first field is user A's interest. That is, if user A's interest were determined to be the first field by comparing user A's access counts for the two fields, the interest so determined would quite possibly not be user A's actual interest. In practice, multiple users (such as all users of the network) have many opportunities to encounter the first field and few opportunities to encounter the second; therefore, judging from how multiple users in the network access the two fields, the interest threshold set for the first field should be higher than that set for the second field.
As a more specific example, the sports news field has a large volume of content updates, and user A reads 10 sports news items per day on average, while the ornamental fish field has a small volume of content updates, and user A reads 2 ornamental fish items per day on average. Judging from the access behavior of all users of the network, only a user who reads 20 sports news items per day counts as interested in the sports news field, whereas a user who reads 2 ornamental fish items per day already counts as interested in the ornamental fish field.
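The example above can be checked with a short sketch that compares each field's attention rate against that field's own threshold instead of comparing fields against each other. Daily reading counts stand in for the attention rate here, which is a simplification, and the function name `interested_fields` is hypothetical.

```python
def interested_fields(attention, thresholds):
    """Fields whose attention rate reaches or exceeds that field's interest threshold."""
    return sorted(
        field
        for field, value in attention.items()
        if field in thresholds and value >= thresholds[field]
    )


# User A's average daily reading counts, and per-field thresholds
# derived from the access behavior of all users of the network.
user_a = {"sports news": 10, "ornamental fish": 2}
thresholds = {"sports news": 20, "ornamental fish": 2}

naive_choice = max(user_a, key=user_a.get)        # compares fields directly
identified = interested_fields(user_a, thresholds)  # compares against thresholds
```

The naive comparison selects "sports news" because 10 exceeds 2, whereas the threshold-based comparison identifies only "ornamental fish" as an interest of user A.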
The distribution of network access by different users across different fields, and the distribution of network access by different users within the same field, are shown in Table 1 and Table 2 below.
Table 1
In Table 1, "Total User 14560" indicates the number of users included in the statistics; Info "**" indicates the ** field; User_num indicates the number of users who have accessed content resources in the field; and User_prop indicates the proportion of the users included in the statistics who have accessed content resources in the field. As a specific example, for the "internet" field, User_prop = 13095/14560 = 0.899.
As Table 1 shows, for reasons such as differences in information volume (or information update volume) and whether a field is a popular one, users' access to different fields has different characteristics, so it is unreasonable to determine the fields a user is interested in by comparing the same user's access to different fields.
Table 2
Table 2 further breaks down the "internet" field in Table 1. User_num indicates the number of users who have accessed content resources in the field; User_prop indicates the proportion of the users included in the statistics who have accessed content resources in the field; Days indicates the number of days on which users accessed the "internet" field; pv indicates the users' reach count for the "internet" field; and entity_num indicates the number of entity words contained in the content resources that users accessed in the "internet" field.
The data in Table 2 shows that different users who have accessed the "internet" field differ in their reach count for the field, in the number of entity words reached, and in their access frequency for the field.
In the present embodiment, a specific example of presetting the interest threshold corresponding to a field is as follows. The network access behavioral data of multiple users in the network (such as all users of the network) is collected at regular or irregular intervals (that is, the data of the multiple users is obtained offline, as shown in the "offline processing" box in Fig. 2). For each piece of network access behavioral data obtained, the entity words it contains are determined, and the field to which it belongs is determined according to those entity words and the multiple entity words respectively corresponding to each preset field. Then, the weighted value of each piece of network access behavioral data is calculated according to the attribute information in the multiple dimensions corresponding to it (the attribute information and the calculation of the weighted value are as described in S120 above). Next, for each field, the interest threshold is set according to the distribution of the weighted values of all network access behavioral data in that field.
For example, for one field, the weighted values of all network access behavioral data belonging to the field may be plotted in a coordinate system, each weighted value being one point, and the points may be connected to form a polyline. The weighted values corresponding to users who are not interested in the field usually cluster in a relatively flat interval of the polyline, while the weighted values corresponding to users who are interested in the field usually cluster in another interval, which typically rises sharply relative to the former. The present embodiment can therefore determine the interest threshold corresponding to the field by finding the corresponding inflection point in the polyline, and may use the weighted value at the inflection point found as the interest threshold. A specific example of determining the inflection point is: on the condition that a certain proportion of the weighted values is covered, if the difference between the slopes of adjacent line segments reaches a certain threshold, the intersection of those adjacent segments may be determined as the inflection point. The inflection point selected may also be adjusted manually.
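The inflection point search can be sketched as follows, under several assumptions that the description does not state: the weighted values are sorted in ascending order before being plotted, the points are spaced one unit apart on the horizontal axis (so the slope of a segment equals the difference between neighboring values), and the parameters `min_coverage` and `slope_jump` are hypothetical names for "a certain proportion" and "a certain threshold".

```python
def find_interest_threshold(weighted_values, min_coverage=0.5, slope_jump=5.0):
    """Return the weighted value at the first qualifying inflection point, or None.

    A point qualifies once at least `min_coverage` of the values lie at or
    below it and the slope of the segment leaving it exceeds the slope of
    the segment entering it by at least `slope_jump`.
    """
    ys = sorted(weighted_values)
    n = len(ys)
    for i in range(1, n - 1):
        if (i + 1) / n < min_coverage:
            continue  # not enough of the distribution covered yet
        slope_in = ys[i] - ys[i - 1]
        slope_out = ys[i + 1] - ys[i]
        if slope_out - slope_in >= slope_jump:
            return ys[i]
    return None


# A flat run of low weighted values followed by a sharp rise.
field_weights = [1, 2, 3, 4, 5, 6, 20, 35, 50]
threshold = find_interest_threshold(field_weights)
```

For the sample data the search stops at the value 6, the last point of the flat interval; a manually adjusted value could be substituted afterward, as the embodiment allows.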
The interest threshold calculated for each field in the present embodiment may be stored in the field distribution library shown in Fig. 2.
When it is judged that the user's attention rate for the field to which the network access behavioral data belongs reaches or exceeds the preset interest threshold corresponding to that field, the present embodiment may treat the field as an interest of the user and recommend content resources matching that interest. That is, as shown in Fig. 2, the data stored in the "feature library" and the "field distribution library" serves as input to the "personalized engine", so that the "personalized engine" can output content resources that match the user's interest, and the present embodiment can then deliver to the user the content resources the user is interested in.
When a user has passive browsing habits, the user is usually accustomed to browsing various headline news and content pushed by real-time pop-ups; because of such passive browsing habits, the user may show a relatively large amount of network access in many fields. However, because these accesses are impromptu and random, the user's attention rate for the fields involved is unlikely to reach the interest threshold of the corresponding field. By using per-field interest thresholds set on the basis of multiple users, the present embodiment can therefore avoid identifying the fields of content that the user browsed impromptu and at random as fields the user is interested in.
With the technical solution provided by the present embodiment, the fields a user is interested in can be determined accurately. Further, the present embodiment can determine, at a finer granularity, the entity words the user is interested in. For example, the attribute information in the multiple dimensions corresponding to the network access behavioral data may also include: the reach count, within the field to which the data belongs, of the entity words contained in the data; the access frequency of those entity words within that field; and the access interval of those entity words within that field. These three items of attribute information relate to the entity words contained in the network access behavioral data in a field, not to the field itself. As a specific example, the feature library shown in Fig. 2 records not only multiple pieces of the user's network access behavioral data, but also, for each field, the attribute information in the reach count, access frequency, access interval, access mode, and information quality dimensions and, for the entity words in each field, the reach count, access frequency, and access interval.
Based on the above attribute information for entity words, when setting the interest threshold corresponding to a field, the present embodiment may further set an interest threshold for each entity word in the field. In this way, the interest thresholds of the entity words in a field can be used to identify the more specific content within a field of interest that the user is interested in; moreover, even within a field the user is not interested in, the content the user pays more attention to can be identified by comparison.
The interest threshold of an entity word is set in essentially the same way as the interest threshold corresponding to a field, described above, and the details are not repeated here. The interest thresholds set for the entity words in a field are stored in the field distribution library shown in Fig. 2.
It should be noted in particular that, where interest thresholds have been set for entity words in advance, the present embodiment, when setting the interest threshold corresponding to a field, need not consider only multiple users' attention rates for the field; it may also use the interest thresholds of the entity words in the field as a reference factor in determining the field's interest threshold. In addition, the access mode that produced the network access behavioral data and the information quality of the content resource corresponding to that data may be used when setting interest thresholds for entity words and when identifying the entity words the user is interested in. That is, the access mode that produced the network access behavioral data may serve as the access mode of the entity words contained in that data, and the information quality of the content resource corresponding to the data may serve as the information quality of the entity words contained in that data.
After determining the fields and entity words the user is interested in, the present embodiment, when pushing content resources of interest to the user, may refer to the entity words of interest within the user's fields of interest, so that content resources matching the user's finer-grained interests can be delivered.
When recommending content resources to a user, the present embodiment may also take into account the value, in the information quality dimension, of each content resource in the set of candidate content resources. For example, when recommending content resources in a field the user is interested in, content resources in that field with a higher value in the information quality dimension may be recommended. As a specific example, if a user is interested in the ornamental fish field (that is, the user is an advanced user of the ornamental fish field), then content resources about ornamental fish with a higher value in the information quality dimension should be recommended to that user, which avoids recommending content that does not meet the user's actual needs, such as basic knowledge of keeping ornamental fish.
The present embodiment may also deliver content resources to the user according to the user's current access scenario. As a specific example, after the network device receives a user's network access behavioral data collected and transmitted by the browser client, the network device extracts entity words from that data and uses them to determine the field to which the data belongs. Then, if the information stored in the user's feature library and the field distribution library indicates that the user is not interested in that field, the network device may look up in the feature library the attention rates corresponding to all entity words under that field, select the content resource (that is, the information source) corresponding to the entity word with the highest attention rate, and deliver that content resource to the user. Of course, when it is determined that the user is not interested in the field, the present embodiment may also recommend to the user content resources in that field with a lower value in the information quality dimension. As a specific example, if a user is not interested in the ornamental fish field, then when recommending ornamental fish content according to the user's current access scenario, content resources with a lower value in the information quality dimension should be recommended, such as basic knowledge of and introductory guidance on keeping ornamental fish.
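The quality-aware selection described in the two preceding paragraphs can be sketched as follows. The function name `pick_resources`, the record layout, the numeric quality scores, and the sample titles are all hypothetical; the only rule taken from the description is that a user interested in a field receives the resources with higher information quality values, and a user not interested in it receives those with lower values.

```python
def pick_resources(resources, field, interested, limit=2):
    """Select up to `limit` resources in `field`, ordered by information quality.

    Interested users get the highest-quality resources first; users who are
    not interested get the lowest-quality (introductory) resources first.
    """
    in_field = [r for r in resources if r["field"] == field]
    ranked = sorted(in_field, key=lambda r: r["quality"], reverse=interested)
    return [r["title"] for r in ranked[:limit]]


catalog = [
    {"field": "ornamental fish", "title": "Basics of keeping ornamental fish", "quality": 1},
    {"field": "ornamental fish", "title": "Advanced breeding techniques", "quality": 3},
    {"field": "sports", "title": "Match report", "quality": 2},
]

for_enthusiast = pick_resources(catalog, "ornamental fish", interested=True, limit=1)
for_newcomer = pick_resources(catalog, "ornamental fish", interested=False, limit=1)
```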
In addition, the present embodiment may also deliver corresponding content to the user according to the entity words contained in the user's network access behavioral data.
Embodiment 2: a user interest discovery apparatus, shown in Fig. 3.
In Fig. 3, the apparatus mainly includes: acquisition module 300, field determination module 310, weighted value module 320, attention rate module 330, and interest identification module 340. The apparatus may also include threshold setting module 350 and delivery module 360.
Acquisition module 300 is connected to field determination module 310 and to weighted value module 320. Acquisition module 300 is mainly used to collect the user's network access behavioral data.
Specifically, the user's network access behavioral data collected by acquisition module 300 includes: information on web pages the user has browsed, keywords the user has searched for, information on microblog posts the user has published (such as at least one keyword extracted from the posts), information on blog posts the user has published (such as at least one keyword extracted from the blog posts), information on goods the user has purchased, and the like. The data collected by acquisition module 300 may also include time information on when the user performed the network access behavior, which can be used in the subsequent calculation of access frequency, access interval, and the like.
Acquisition module 300 may collect the user's network access behavioral data by means of the browser client on the user's network terminal device, or by means of an API. When an API is used, acquisition module 300 can obtain more of the user's network access behavioral data. Acquisition module 300 may also obtain the user's network access behavioral data in ways other than the browser client and API approaches listed above. For details, see the description of the method embodiment above, which is not repeated here.
Field determination module 310 is also connected to attention rate module 330. Field determination module 310 is mainly used to determine the field to which the network access behavioral data belongs, according to the entity words contained in the data and the multiple entity words respectively corresponding to each preset field.
Specifically, field determination module 310 may represent each field in advance as a vector made up of a series of entity words. For network access behavioral data received by the network device, field determination module 310 may first compute a vector, by a predetermined algorithm, from the entity words contained in the data (one or more entity words); it then uses a predetermined distance function to measure the distance between the vector corresponding to the data and the vector corresponding to each field; finally, it determines the field to which the received data belongs according to the measured distances (for example, the nearest field is determined to be the field to which the data belongs).
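The description leaves both the "predetermined algorithm" and the "predetermined distance function" open. The sketch below substitutes a simple word-count vector and cosine distance for them, purely as an assumption; the function names and the sample entity words are likewise hypothetical.

```python
import math


def to_vector(entity_words, vocabulary):
    """Word-count vector over a fixed vocabulary (stand-in for the predetermined algorithm)."""
    return [entity_words.count(word) for word in vocabulary]


def cosine_distance(a, b):
    """1 minus cosine similarity (stand-in for the predetermined distance function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # no overlap can be measured against an empty vector
    return 1.0 - dot / (norm_a * norm_b)


def nearest_field(entity_words, field_words):
    """Return the field whose entity-word vector is closest to the record's vector."""
    vocabulary = sorted({w for words in field_words.values() for w in words})
    record_vector = to_vector(entity_words, vocabulary)
    return min(
        field_words,
        key=lambda f: cosine_distance(record_vector, to_vector(field_words[f], vocabulary)),
    )


fields = {
    "sports": ["NBA", "basketball", "playoffs"],
    "ornamental fish": ["goldfish", "aquarium", "koi"],
}
assigned = nearest_field(["NBA", "playoffs"], fields)
```

A record containing the entity words "NBA" and "playoffs" shares two words with the sports vector and none with the ornamental fish vector, so it is assigned to the sports field.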
Weighted value module 320 is also connected to attention rate module 330. Weighted value module 320 is mainly used to calculate the weighted value of the network access behavioral data according to the attribute information in the multiple dimensions corresponding to that data.
Specifically, the network access behavioral data in the present embodiment corresponds to multiple dimensions (which may also be called statistical dimensions), and has corresponding attribute information in each dimension. This attribute information does not represent an inherent attribute of the network access behavioral data in the corresponding dimension; rather, it is a temporary attribute that the data acquires in that dimension as a result of the user's access behavior.
For the specific parameters included in the attribute information in the multiple dimensions corresponding to the network access behavioral data in the present embodiment, and the meaning of each parameter, see the description of the method embodiment above.
Weighted value module 320 may set, in advance, a corresponding coefficient for each of the different attribute information in all or some of the dimensions. For example, the coefficient set for active access is higher than the coefficient set for passive access; as another example, the coefficient set for the information quality of a content resource accessed by high-end users is higher than the coefficient set for the information quality of a content resource not accessed by high-end users. In this way, after determining the attribute information in the multiple dimensions corresponding to the network access behavioral data, weighted value module 320 can calculate the weighted value of the data from each item of attribute information and its coefficient. Weighted value module 320 may use whatever calculation method suits the actual situation; the specific calculation methods are not described one by one here.
When acquisition module 300 receives one piece of network access behavioral data, or receives several pieces at the same time, weighted value module 320 may immediately calculate the weighted value of the received data and store the calculated weighted value locally together with the network access behavioral data and its attribute information in each dimension. Of course, weighted value module 320 may also process the received network access behavioral data at regular or irregular intervals. For example, at the top of every hour, weighted value module 320 calculates weighted values for all data received by acquisition module 300 and stored locally for which no weighted value has yet been calculated and, once the calculation is complete, stores each weighted value together with the corresponding network access behavioral data and its attribute information in each dimension. As another example, when the locally stored network access behavioral data reaches a predetermined quantity (for example, when the storage space occupied by the received and locally stored data reaches a predetermined size), weighted value module 320 calculates weighted values for all locally stored data for which no weighted value has yet been calculated and, once the calculation is complete, stores each weighted value together with the corresponding network access behavioral data and its attribute information in each dimension.
Weighted value module 320 may calculate the weighted value of the network access behavioral data in a variety of ways; the specific implementation may be chosen according to the actual application and is not described in detail here.
Attention rate module 330 is also connected to interest identification module 340. Attention rate module 330 is mainly used to determine, according to the weighted value of the user's network access behavioral data, the user's attention rate for the field to which the data belongs.
Specifically, attention rate module 330 may calculate the user's attention rate for the field to which the network access behavioral data belongs in real time. That is, each time acquisition module 300 receives one piece of network access behavioral data, or receives several pieces at the same time, attention rate module 330 may immediately calculate the attention rate for that data and use the newly calculated attention rate to update the user's attention rate for the field to which the data belongs.
Attention rate module 330 may also calculate the user's attention rate for the field to which the network access behavioral data belongs in a non-real-time (that is, offline) manner. For example, in the early morning of each day, attention rate module 330 calculates the attention rate for the user's network access behavioral data collected by acquisition module 300 on the previous day; once the calculation is complete, attention rate module 330 uses the newly calculated attention rates to update the user's attention rate for the field to which each piece of data belongs.
Attention rate module 330 may use the weighted values of the user's network access behavioral data to calculate the user's attention rate for the corresponding field in a variety of ways; the specific implementation may be chosen according to the actual situation and is not described in detail here.
Interest identification module 340 is also connected to threshold setting module 350 and to delivery module 360. Interest identification module 340 is mainly used to identify the user's interest according to the user's attention rate for the field to which the network access behavioral data belongs and the preset interest threshold corresponding to that field.
Specifically, when interest identification module 340 judges that the user's attention rate for the field to which the network access behavioral data belongs reaches or exceeds the preset interest threshold corresponding to that field, it may treat the field as an interest of the user and cause delivery module 360 to recommend to the user content resources matching that interest.
Threshold setting module 350 is mainly used to set the interest threshold corresponding to each field according to the distribution of the weighted values of the network access behavioral data in that field.
Specifically, in one example of how the threshold setting module 350 presets the interest threshold corresponding to a field, the acquisition module 300 collects, periodically or aperiodically, the network access behavioral data of multiple users in the network (for example, all users of the network); that is, the network access behavioral data of the multiple users is obtained offline. For each item of network access behavioral data obtained, the field determination module 310 determines the entity words that the item contains and then determines the field to which the item belongs, according to those entity words and the multiple entity words preset for each field. Next, the weight value module 320 calculates the weight value of each item of network access behavioral data according to the attribute information in the multiple dimensions corresponding to that item (the attribute information and the weight calculation are as described in S120 above). Then, for each field, the threshold setting module 350 sets the interest threshold corresponding to that field according to the distribution of the weight values of all the network access behavioral data in the field. For example, for one field, the threshold setting module 350 may plot the weight values of all network access behavioral data belonging to the field in a coordinate system, with each weight value as one point, and connect the points to form a polyline. The threshold setting module 350 may then determine the interest threshold of the field by locating the inflection point of the polyline, and may use the weight value at the located inflection point as the interest threshold corresponding to the field.
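The inflection-point search described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the inflection point of the polyline is located, so the sketch sorts the weight values in descending order and takes the point farthest from the straight line joining the first and last points (a common "knee" heuristic). The function name `knee_threshold` is hypothetical.

```python
from typing import Sequence


def knee_threshold(weights: Sequence[float]) -> float:
    """Return the weight value at the knee of the sorted-weight polyline.

    Assumption (not from the source): the knee is the point with the largest
    perpendicular distance to the chord joining the first and last points of
    the descending-sorted curve. The axes are not rescaled.
    """
    ordered = sorted(weights, reverse=True)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one weight value is required")
    if n < 3:
        return ordered[-1]  # too few points to form a bend

    # Chord from (0, largest weight) to (n - 1, smallest weight).
    x0, y0 = 0.0, ordered[0]
    dx, dy = float(n - 1) - x0, ordered[-1] - y0

    best_index, best_distance = 0, -1.0
    for i, y in enumerate(ordered):
        # Unnormalised distance from point (i, y) to the chord; the constant
        # normalising factor does not change which point is farthest.
        distance = abs(dy * (i - x0) - dx * (y - y0))
        if distance > best_distance:
            best_index, best_distance = i, distance
    return ordered[best_index]
```

For the hypothetical weight values `[100, 90, 80, 10, 8, 6, 5, 4, 3, 2]`, the curve bends where the values drop from 80 to 10, and the sketch returns 10 as the threshold for that field.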
The issuing module 360 may also be used to issue corresponding content to the user according to the attention rate corresponding to each entity word in the field to which the user's network access behavioral data, collected in real time, belongs.
Specifically, the interest identification module 340 can determine the entity words of interest to the user at a finer granularity. For example, the attribute information in the multiple dimensions corresponding to the network access behavioral data in this embodiment may further include: the reach count, within the field to which the network access behavioral data belongs, of the entity words contained in the network access behavioral data; the access frequency of those entity words within that field; and the access interval of those entity words within that field. These three items of attribute information are directed at the entity words contained in the network access behavioral data within the field, rather than at the field to which the network access behavioral data belongs.
Based on the above attribute information for entity words, when setting the interest threshold corresponding to a field, the threshold setting module 350 may further set an interest threshold for each entity word in the field. In this way, the interest identification module 340 can use the interest thresholds of the entity words in a field to identify more specific, finer-grained content within the field that interests the user. Moreover, even in a field that does not interest the user, the issuing module 360 can identify, by comparison, the content that the user pays more attention to.
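The entity-word-level comparison can be illustrated with a short sketch. It is a minimal example under assumptions: the text does not specify whether "reaching" a threshold is strict or inclusive (the sketch uses `>=`), and the function names and sample values are hypothetical.

```python
from typing import Dict, List


def interesting_entity_words(
    attention: Dict[str, float],
    thresholds: Dict[str, float],
) -> List[str]:
    """Entity words whose attention rate reaches their own interest threshold.

    Words without a preset threshold are skipped. The result is ordered from
    the highest attention rate to the lowest.
    """
    hits = [
        word
        for word, value in attention.items()
        if word in thresholds and value >= thresholds[word]
    ]
    return sorted(hits, key=lambda word: attention[word], reverse=True)


def most_attended_entity_words(
    attention: Dict[str, float], top_n: int = 1
) -> List[str]:
    """Fallback for a field the user is not interested in.

    No threshold is applied; the entity words are simply compared with one
    another and the ones with the highest attention rate are returned.
    """
    ranked = sorted(attention, key=lambda word: attention[word], reverse=True)
    return ranked[:top_n]
```

With hypothetical attention rates `{"goldfish": 0.9, "koi": 0.2, "aquarium": 0.6}` and thresholds `{"goldfish": 0.5, "koi": 0.5, "aquarium": 0.7}`, only `"goldfish"` passes its threshold, while the fallback ranking for `top_n=2` yields `["goldfish", "aquarium"]`.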
The threshold setting module 350 sets the interest threshold of an entity word in essentially the same way as it sets the interest threshold corresponding to a field, so the details are not repeated here.
After the fields and entity words that interest the user have been determined, the issuing module 360, when pushing content resources of interest to the user, may refer to the entity words that interest the user within the fields that interest the user, so that it can issue content resources that match the user's finer-grained interests.
When recommending content resources to the user, the issuing module 360 may also take into account the value, in the information quality dimension, of each content resource in the set of content resources to be recommended. For example, when recommending content resources in a field that interests the user, the issuing module 360 may recommend the content resources in that field that have higher values in the information quality dimension.
The issuing module 360 may also issue corresponding content resources to the user according to the user's current access scenario. In one specific example, after the acquisition module 300 receives the user's network access behavioral data collected and transmitted by the browser client, the field determination module 310 extracts entity words from the network access behavioral data and uses the extracted entity words to determine the field to which the data belongs. If the interest identification module 340 then determines, from the information stored in the feature library and the field distribution library, that the user is not interested in the field, the issuing module 360 may look up in the feature library the attention rates corresponding to all entity words under the field, select the content resource (i.e., information source) corresponding to the entity word with the highest attention rate, and issue that content resource to the user. Of course, when the interest identification module 340 determines that the user is not interested in a field, the issuing module 360 may also recommend to the user content resources in that field that have relatively low values in the information quality dimension. In one specific example, if the user is not interested in the ornamental fish field, then when recommending content resources on ornamental fish according to the user's current access scenario, the issuing module 360 should recommend content resources with relatively low values in the information quality dimension, such as basic knowledge of, and introductory guidance on, keeping ornamental fish. The issuing module 360 may also issue corresponding content to the user according to the entity words contained in the user's network access behavioral data.
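The issuing choices described above can be sketched as a single selection routine. This is an illustrative sketch only: the text presents "content for the entity word with the highest attention rate" and "content with a relatively low information-quality value" as separate options for an uninterested user, and combining them in one function is an assumption made here. The `ContentResource` type, the `choose_content` function and all sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ContentResource:
    title: str
    entity_word: str
    quality: float  # value in the information quality dimension


def choose_content(
    interested_in_field: bool,
    resources: List[ContentResource],
    entity_attention: Dict[str, float],
) -> Optional[ContentResource]:
    """Pick one content resource of the current field to issue to the user."""
    if not resources:
        return None
    if interested_in_field:
        # Interested users receive the resource with the highest quality value.
        return max(resources, key=lambda r: r.quality)

    # Uninterested users: try entity words from highest to lowest attention.
    ranked = sorted(
        entity_attention, key=lambda w: entity_attention[w], reverse=True
    )
    for word in ranked:
        matches = [r for r in resources if r.entity_word == word]
        if matches:
            # Among the matches, prefer introductory (low quality value) content.
            return min(matches, key=lambda r: r.quality)
    # No attention data matches: fall back to the most introductory resource.
    return min(resources, key=lambda r: r.quality)
```

In the ornamental fish example, a user who is not interested in the field but has some attention on the entity word "goldfish" would receive the goldfish resource with the lowest quality value, which corresponds to basic or introductory material.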
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device and system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts may be understood by referring to the description of the method embodiments. The device and system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The user interest discovery method and apparatus provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, those of ordinary skill in the art may, in accordance with the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (22)

1. A user interest discovery method, characterized by comprising:
collecting network access behavioral data of a user;
determining the field to which the network access behavioral data belongs, according to the entity words contained in the network access behavioral data and the multiple entity words preset for each field;
calculating a weight value of the network access behavioral data according to attribute information in multiple dimensions corresponding to the network access behavioral data;
determining, according to the weight value of the user's network access behavioral data, the user's attention rate for the field to which the network access behavioral data belongs;
identifying the user's interest according to the user's attention rate for the field to which the network access behavioral data belongs and a preset interest threshold corresponding to that field, wherein the interest threshold corresponding to the field is set according to the network access behavioral data generated by multiple users in the network accessing the field;
wherein determining the field to which the network access behavioral data belongs, according to the entity words contained in the network access behavioral data and the multiple entity words preset for each field, comprises:
representing each field in advance as a vector composed of a series of entity words;
computing a vector from the entity words contained in the network access behavioral data by a predetermined algorithm;
measuring, by a predetermined distance function, the distance between the vector corresponding to the network access behavioral data and the vector corresponding to each field;
determining the field to which the network access behavioral data belongs according to the measured distances.
2. The method of claim 1, characterized in that collecting the network access behavioral data of the user comprises:
receiving the user's network access behavioral data that is collected by a browser client and transmitted by the user's network terminal device; and/or
collecting the user's network access behavioral data from the network side through an application programming interface (API).
3. The method of claim 1, characterized in that:
the attribute information in the multiple dimensions corresponding to the network access behavioral data includes, but is not limited to: the reach count of the field to which the network access behavioral data belongs, the access frequency of the field to which the network access behavioral data belongs, the access mode that produced the network access behavioral data, and the information quality of the content resource corresponding to the network access behavioral data; or
the attribute information in the multiple dimensions corresponding to the network access behavioral data includes, but is not limited to: the reach count of the field to which the network access behavioral data belongs, the access interval of the field to which the network access behavioral data belongs, the access mode that produced the network access behavioral data, and the information quality of the content resource corresponding to the network access behavioral data; or
the attribute information in the multiple dimensions corresponding to the network access behavioral data includes, but is not limited to: the reach count of the field to which the network access behavioral data belongs, the access frequency of the field to which the network access behavioral data belongs, the access interval of the field to which the network access behavioral data belongs, the access mode that produced the network access behavioral data, and the information quality of the content resource corresponding to the network access behavioral data;
wherein the access mode includes: active access and push access.
4. The method of claim 3, characterized in that:
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs; or
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs and whose attention rate for the field to which the network resource belongs reaches a predetermined threshold, wherein the predetermined threshold is higher than the interest threshold corresponding to the field to which the network resource belongs; or
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs and who have accessed a predetermined website in the field to which the network resource belongs.
5. The method of claim 1, characterized in that the interest threshold corresponding to the field is set as follows:
collecting network access behavioral data of multiple users;
determining the field to which each item of network access behavioral data belongs, according to the entity words contained in the network access behavioral data of the multiple users and the multiple entity words preset for each field;
calculating the weight value of each item of network access behavioral data according to the attribute information in the multiple dimensions corresponding to the network access behavioral data;
setting the interest threshold corresponding to each field according to the distribution of the weight values of the network access behavioral data in that field.
6. The method of claim 5, characterized in that the step of setting the interest threshold corresponding to each field according to the distribution of the weight values of the network access behavioral data in that field further comprises:
for a given field, determining a weight value inflection point in the distribution of the weight values of all network access behavioral data belonging to the field, and setting the weight value inflection point as the interest threshold corresponding to the field.
7. The method of claim 3, characterized in that the attribute information in the multiple dimensions corresponding to the network access behavioral data further includes:
the reach count, within the field to which the network access behavioral data belongs, of the entity words contained in the network access behavioral data;
the access frequency, within the field to which the network access behavioral data belongs, of the entity words contained in the network access behavioral data;
the access interval, within the field to which the network access behavioral data belongs, of the entity words contained in the network access behavioral data.
8. The method of claim 7, characterized in that the method further comprises:
calculating the weight values of the entity words in the network access behavioral data according to the attribute information in the multiple dimensions corresponding to the network access behavioral data;
determining, according to the weight values of the entity words in the network access behavioral data, the user's attention rate for the entity words in the field to which the network access behavioral data belongs;
identifying the user's interest according to the user's attention rate for the entity words in the field to which the network access behavioral data belongs and preset interest thresholds corresponding to the entity words in that field.
9. The method of claim 8, characterized in that the method further comprises:
issuing corresponding content to the user according to the attention rate corresponding to each entity word in the field to which the user's network access behavioral data, collected in real time, belongs.
10. The method of any one of claims 1 to 9, characterized in that the method further comprises:
issuing corresponding content to the user according to the entity words contained in the user's network access behavioral data.
11. The method of any one of claims 1 to 9, characterized in that:
the user's attention rate for the field to which the network access behavioral data belongs and the user's interest are updated in real time according to the user's network access behavioral data collected in real time; or
the user's attention rate for the field to which the network access behavioral data belongs and the user's interest are updated periodically according to the collected network access behavioral data of the user.
12. A user interest discovery device, characterized in that the device comprises:
an acquisition module, configured to collect network access behavioral data of a user;
a field determination module, configured to determine the field to which the network access behavioral data belongs, according to the entity words contained in the network access behavioral data and the multiple entity words preset for each field;
a weight value module, configured to calculate a weight value of the network access behavioral data according to attribute information in multiple dimensions corresponding to the network access behavioral data;
an attention rate module, configured to determine, according to the weight value of the user's network access behavioral data, the user's attention rate for the field to which the network access behavioral data belongs;
an interest identification module, configured to identify the user's interest according to the user's attention rate for the field to which the network access behavioral data belongs and a preset interest threshold corresponding to that field;
wherein the interest threshold corresponding to the field is set according to the network access behavioral data generated by multiple users in the network accessing the field;
wherein the field determination module is configured to: represent each field in advance as a vector composed of a series of entity words; compute a vector from the entity words contained in the network access behavioral data by a predetermined algorithm; measure, by a predetermined distance function, the distance between the vector corresponding to the network access behavioral data and the vector corresponding to each field; and determine the field to which the network access behavioral data belongs according to the measured distances.
13. The device of claim 12, characterized in that collecting the network access behavioral data of the user comprises:
receiving the user's network access behavioral data that is collected by a browser client and transmitted by the user's network terminal device; and/or
collecting the user's network access behavioral data from the network side through an application programming interface (API).
14. The device of claim 12, characterized in that:
the attribute information in the multiple dimensions corresponding to the network access behavioral data includes, but is not limited to: the reach count of the field to which the network access behavioral data belongs, the access frequency of the field to which the network access behavioral data belongs, the access mode that produced the network access behavioral data, and the information quality of the content resource corresponding to the network access behavioral data; or
the attribute information in the multiple dimensions corresponding to the network access behavioral data includes, but is not limited to: the reach count of the field to which the network access behavioral data belongs, the access interval of the field to which the network access behavioral data belongs, the access mode that produced the network access behavioral data, and the information quality of the content resource corresponding to the network access behavioral data; or
the attribute information in the multiple dimensions corresponding to the network access behavioral data includes, but is not limited to: the reach count of the field to which the network access behavioral data belongs, the access frequency of the field to which the network access behavioral data belongs, the access interval of the field to which the network access behavioral data belongs, the access mode that produced the network access behavioral data, and the information quality of the content resource corresponding to the network access behavioral data;
wherein the access mode includes: active access and push access.
15. The device of claim 14, characterized in that:
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs; or
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs and whose attention rate for the field to which the network resource belongs reaches a predetermined threshold, wherein the predetermined threshold is higher than the interest threshold corresponding to the field to which the network resource belongs; or
the information quality of the content resource is determined according to accesses to the content resource by users who are interested in the field to which the content resource belongs and who have accessed a predetermined website in the field to which the network resource belongs.
16. The device of claim 12, characterized in that the interest threshold corresponding to the field is set as follows:
collecting network access behavioral data of multiple users;
determining the field to which each item of network access behavioral data belongs, according to the entity words contained in the network access behavioral data of the multiple users and the multiple entity words preset for each field;
calculating the weight value of each item of network access behavioral data according to the attribute information in the multiple dimensions corresponding to each item of network access behavioral data;
and the device further comprises:
a threshold setting module, configured to set the interest threshold corresponding to each field according to the distribution of the weight values of the network access behavioral data in that field.
17. The device of claim 16, characterized in that setting the interest threshold corresponding to each field according to the distribution of the weight values of the network access behavioral data in that field comprises:
for a given field, determining a weight value inflection point in the distribution of the weight values of all network access behavioral data belonging to the field, and setting the weight value inflection point as the interest threshold corresponding to the field.
18. The device of claim 14, characterized in that the attribute information in the multiple dimensions corresponding to the network access behavioral data further includes:
the reach count, within the field to which the network access behavioral data belongs, of the entity words contained in the network access behavioral data;
the access frequency, within the field to which the network access behavioral data belongs, of the entity words contained in the network access behavioral data;
the access interval, within the field to which the network access behavioral data belongs, of the entity words contained in the network access behavioral data.
19. The device of claim 18, characterized in that:
the weight value module is further configured to calculate the weight values of the entity words in the network access behavioral data according to the attribute information in the multiple dimensions corresponding to the network access behavioral data;
the attention rate module is further configured to determine, according to the weight values of the entity words in the network access behavioral data, the user's attention rate for the entity words in the field to which the network access behavioral data belongs;
the interest identification module is further configured to identify the user's interest according to the user's attention rate for the entity words in the field to which the network access behavioral data belongs and preset interest thresholds corresponding to the entity words in that field.
20. The device of claim 19, characterized in that the device further comprises:
an issuing module, configured to issue corresponding content to the user according to the attention rate corresponding to each entity word in the field to which the user's network access behavioral data, collected in real time, belongs.
21. The device of any one of claims 12 to 20, characterized in that the device further comprises:
an issuing module, configured to issue corresponding content to the user according to the entity words contained in the user's network access behavioral data.
22. The device of any one of claims 12 to 20, characterized in that:
the user's attention rate for the field to which the network access behavioral data belongs and the user's interest are updated in real time according to the user's network access behavioral data collected in real time; or
the user's attention rate for the field to which the network access behavioral data belongs and the user's interest are updated periodically according to the collected network access behavioral data of the user.
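
The field-determination steps recited in claims 1 and 12 (representing fields as entity-word vectors, computing a vector for the behavioral data, and comparing distances) can be illustrated with a minimal sketch. The claims leave both the "predetermined algorithm" and the "predetermined distance function" open; the sketch assumes simple word counts and cosine distance, and all names and sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List


def to_vector(entity_words: Iterable[str], vocabulary: List[str]) -> List[float]:
    """Count-based vector over a shared vocabulary (assumed algorithm)."""
    counts = Counter(entity_words)
    return [float(counts[word]) for word in vocabulary]


def cosine_distance(a: List[float], b: List[float]) -> float:
    """1 - cosine similarity; 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0
    return 1.0 - dot / norm


def nearest_field(
    entity_words: Iterable[str], field_words: Dict[str, List[str]]
) -> str:
    """Return the field whose entity-word vector is closest to the data."""
    vocabulary = sorted({w for words in field_words.values() for w in words})
    record_vector = to_vector(entity_words, vocabulary)
    distances = {
        field: cosine_distance(record_vector, to_vector(words, vocabulary))
        for field, words in field_words.items()
    }
    return min(distances, key=lambda field: distances[field])
```

For example, with the hypothetical fields `{"ornamental fish": ["goldfish", "koi", "aquarium"], "cars": ["engine", "sedan", "tire"]}`, behavioral data containing the entity words `["koi", "aquarium", "koi"]` is assigned to `"ornamental fish"`.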
CN201410038066.XA 2014-01-26 2014-01-26 User interest discovery method and apparatus Active CN103780625B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410038066.XA CN103780625B (en) 2014-01-26 2014-01-26 User interest discovery method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410038066.XA CN103780625B (en) 2014-01-26 2014-01-26 User interest discovery method and apparatus

Publications (2)

Publication Number Publication Date
CN103780625A CN103780625A (en) 2014-05-07
CN103780625B true CN103780625B (en) 2017-07-04

Family

ID=50572455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410038066.XA Active CN103780625B (en) 2014-01-26 2014-01-26 User interest discovery method and apparatus

Country Status (1)

Country Link
CN (1) CN103780625B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361063B (en) * 2014-11-04 2018-03-16 北京字节跳动网络技术有限公司 user interest discovery method and device
CN104991935B (en) * 2015-07-06 2019-03-12 无锡天脉聚源传媒科技有限公司 Method and apparatus for processing website attention rate
CN105893407A (en) * 2015-11-12 2016-08-24 乐视云计算有限公司 Individual user portraying method and system
CN106202502B (en) * 2016-07-20 2020-02-07 福州大学 User interest discovery method in music information network
CN108460050A (en) * 2017-02-21 2018-08-28 中兴通讯股份有限公司 History management method and device
CN107358447B (en) * 2017-06-29 2021-01-29 安徽大学 Personalized service recommendation method and system with service quality as center
CN108769809B (en) * 2018-05-28 2021-06-29 成都极米科技股份有限公司 Smart television-based home user behavior data acquisition method and device and computer-readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6385619B1 (en) * 1999-01-08 2002-05-07 International Business Machines Corporation Automatic user interest profile generation from structured document access information
CN101866341A (en) * 2009-04-17 2010-10-20 华为技术有限公司 Information push method, device and system
CN102402766A (en) * 2011-12-27 2012-04-04 纽海信息技术(上海)有限公司 User interest modeling method based on web page browsing

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7613664B2 (en) * 2005-03-31 2009-11-03 Palo Alto Research Center Incorporated Systems and methods for determining user interests
US8438170B2 (en) * 2006-03-29 2013-05-07 Yahoo! Inc. Behavioral targeting system that generates user profiles for target objectives

Also Published As

Publication number Publication date
CN103780625A (en) 2014-05-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant