Summary of the invention
The invention provides a method and a device for determining website weight, which can improve the accuracy of the data a search engine collects and the timeliness of its updates.
The invention provides the following solutions:
A method for determining website weight, comprising:
A browser side reports a user's web page access information to a search engine server; the web page access information comprises: the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing the link;
The search engine server compiles authority information for a website from the web page access information collected from multiple browser sides, the authority information comprising the number of web pages the website contains and the number of external links to the website, so that the search engine server determines the weight of the website according to the authority information.
The method further comprises:
Counting the visits to each web page under the same website, and adjusting the weight of the website according to the visit count of each web page under the website.
Adjusting the weight of the website according to the visit count of each web page under the same website comprises:
Weighting the website according to the number of web pages under the website whose visit count exceeds a first preset threshold;
Or,
Weighting the website according to the total visit count of the website.
The web page access information reported by the browser side further comprises information about the user who accessed the web page, and the method further comprises:
Counting the visiting users of each web page under the same website, and adjusting the weight of the website according to the number of visiting users of each web page under the website.
Adjusting the weight of the website according to the number of visiting users of each web page under the same website comprises:
Weighting the website according to the number of web pages under the website whose number of visiting users exceeds a second preset threshold;
Or,
Weighting the website according to the total number of visiting users of the website.
A device for determining website weight, comprising:
A browser-side processing unit, located at the browser side, for reporting a user's web page access information to a search engine server; the web page access information comprises: the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing the link;
A search engine processing unit, located at the search engine server side, for compiling authority information for a website from the web page access information collected from multiple browser sides, the authority information comprising the number of web pages the website contains and the number of external links to the website, so that the search engine server determines the weight of the website according to the authority information.
The device further comprises:
A visit-count adjustment unit, for counting the visits to each web page under the same website and adjusting the weight of the website according to the visit count of each web page under the website.
The visit-count adjustment unit comprises:
A first weighting subunit, for weighting the website according to the number of web pages under the website whose visit count exceeds a first preset threshold;
Or,
A second weighting subunit, for weighting the website according to the total visit count of the website.
The web page access information reported by the browser side further comprises information about the user who accessed the web page, and the device further comprises:
A visiting-user adjustment unit, for counting the visiting users of each web page under the same website and adjusting the weight of the website according to the number of visiting users of each web page under the website.
The visiting-user adjustment unit comprises:
A third weighting subunit, for weighting the website according to the number of web pages under the website whose number of visiting users exceeds a second preset threshold;
Or,
A fourth weighting subunit, for weighting the website according to the total number of visiting users of the website.
According to the specific embodiments provided by the invention, the invention achieves the following technical effects:
With the invention, the search engine server can use the web page access information reported by browser sides to count the number of web pages a website contains and the number of external links to the website, and can then determine the weight of each website in combination with other parameters (such as web page update frequency). When the URLs in a URL library need to be downloaded, the downloads can be scheduled according to the weight of the website each URL belongs to. The website weight can also be applied in other settings. For example, when related web pages are recommended to a user based on the page the user is currently visiting, the related pages can be ranked by the weight of their websites. Alternatively, the weight of designated websites can be used for application recommendation: if a candidate application comes from a designated website, the weight of that website is added to its original score, the candidates are then ranked on the combined score, and the several highest-scoring applications are output. Because the number of web pages and the number of external links are counted from the actual browsing activity reported by browser sides, the statistics are more accurate and more timely than those a search engine obtains by crawling web pages at a fixed frequency and then counting.
Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains on the basis of the embodiments of the present invention fall within the scope of protection of the invention.
Referring to Fig. 1, the method for determining website weight provided by an embodiment of the present invention comprises the following steps:
S101: the browser side reports the user's web page access information to the search engine server; the web page access information comprises: the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing the link;
In the embodiments of the present invention, the quality of the search results a search engine provides is improved by improving the accuracy and update timeliness of the data used to judge website authority. In a specific implementation, the browser is combined with the search engine. As the tool through which users access web pages, the browser can obtain the details of each access, including the unique identifier of the page the user accesses and, if that page is the target page of a link, the URL of the source page containing the link. The user may reach the target page by clicking the link directly in the source page, or by other means such as copying the link from the source page into the browser's address bar. In either case, the browser side can obtain the unique identifier of the source page from the operations the user performs. The unique identifier of a web page may be its URL (Uniform/Universal Resource Locator). To a certain extent, the MD5 value of the page title or page content can also serve as the unique identifier, so reporting it to the server is also acceptable. For ease of description, the URL is used as the example below.
It should be noted that, because users' computing environments, such as operating system and browser type, differ in practice, user web page access can be monitored in several ways. For example, a browser with a monitoring function can be used, which records the web page access information when the user accesses a page through it. The browser may be Internet Explorer (IE), which ships with the Windows operating system, or a third-party browser. A third-party browser usually refers to non-IE browser software running on Windows; such browsers typically offer rich distinctive features and personalized extensions, providing users with many convenient applications.
For a browser that supports plug-in extensions, the function can also be implemented by a plug-in program that starts with the browser. A plug-in is an application written to a given application programming interface specification that the main program can call to handle certain tasks. In the embodiments of the present invention, for a browser that cannot itself obtain and report the user's web page access information but does support plug-in extensions, implementing the function in a plug-in is also an effective approach.
Furthermore, the function can be performed by a program other than the browser and its plug-ins, such as a monitoring program or program monitoring component, which detects the access requests the user sends to target web pages and reports the user's web page access information.
It should also be noted that, in a specific implementation, the browser side may report the information for an access to the search engine server as soon as it finds that the user has accessed a web page. Alternatively, it may first record the user's access information locally and report it to the search engine server once a certain time interval has elapsed or the recorded data reaches a certain amount.
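The deferred reporting option can be illustrated with a minimal Python sketch. The names `VisitRecord` and `BatchReporter`, the `send` callback, and the default thresholds are all hypothetical and are not taken from the embodiment; the sketch only shows one way to buffer records and flush them on a count or age limit.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class VisitRecord:
    """One reported access: the visited URL and, if known, the source page URL."""
    url: str
    source_url: Optional[str] = None


class BatchReporter:
    """Buffers visit records and hands them to `send` in batches (hypothetical helper)."""

    def __init__(self, send: Callable[[List[VisitRecord]], None],
                 max_records: int = 100, max_age_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._send = send                # assumed to upload a batch to the server
        self._max_records = max_records  # flush when this many records are buffered
        self._max_age = max_age_seconds  # flush when the oldest record is this old
        self._clock = clock
        self._buffer: List[VisitRecord] = []
        self._first_at: Optional[float] = None

    def record(self, url: str, source_url: Optional[str] = None) -> None:
        if not self._buffer:
            self._first_at = self._clock()
        self._buffer.append(VisitRecord(url, source_url))
        # The age limit is only checked when a new record arrives; a real
        # client would probably also use a timer.
        too_many = len(self._buffer) >= self._max_records
        too_old = self._clock() - self._first_at >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()
            self._first_at = None
```

Setting `max_records=1` makes the same class behave like the immediate reporting option, in which every access is sent as soon as it is observed.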
S102: the search engine server compiles authority information for a website from the web page access information collected from multiple browser sides, the authority information comprising the number of web pages the website contains and the number of external links to the website, so that the search engine server determines the weight of the website according to the authority information.
Because a browser has a large user base, the search engine server can collect a large amount of web page access information from numerous browser sides and then compile the authority information of websites from it.
For ease of understanding, the difference between a website and a web page is briefly introduced here. Generally speaking, a website is a collection of content with its own domain name and its own storage space. The content may be web pages, programs, or other files, and need not include many web pages: as long as it has its own domain name and space, even a single page counts as a website. A web page is a component of a website and the platform that carries the website's applications. A web page is a single browsable page, such as a blog, a personal homepage hosted on someone else's site, an enterprise page in an enterprise site-building system, or a merchant page in a multi-user online mall. The URL of a web page, also called its web address, is the standard address of a resource on the Internet. Generally speaking, the complete common URI syntax, including the authority part, looks as follows:
protocol://username:password@subdomain.domain.tld:port/directory/filename.extension?parameter=value#fragment
As can be seen, the URL of a web page contains the domain name of the website it belongs to, so whether web pages belong to the same website can be determined by checking whether the domain names in their URLs are the same. After collecting the URLs of user-accessed web pages from browsers, the search engine server can therefore group the web pages whose URLs contain the same domain name and thus count the number of web pages in each website. Because the browser side reports the user's web page access information continuously, the search engine server can also keep updating the web page count of each website from the uploaded data.
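The grouping of reported URLs by domain name can be sketched in Python as follows. The function names `site_of` and `pages_per_site` are hypothetical, and the sketch makes a simplifying assumption: it treats the full host name of the URL as the website key, whereas deciding which subdomains belong to one website is left open here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return the host name of a URL, used here as the website key (assumption)."""
    host = urlsplit(url).hostname  # lower-cased, without port or user info
    if host is None:
        raise ValueError(f"URL has no host: {url!r}")
    return host


def pages_per_site(urls: Iterable[str]) -> Dict[str, int]:
    """Count the distinct page URLs reported for each website."""
    pages: Dict[str, Set[str]] = defaultdict(set)
    for url in urls:
        pages[site_of(url)].add(url)  # a set ignores repeated reports of a page
    return {site: len(found) for site, found in pages.items()}
```

Because the pages are kept in sets, a page that is reported many times is still counted once, and the count can be refreshed as new reports arrive.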
Meanwhile, because the browser side also uploads the link through which the user reached a web page, the search engine server can count the number of external links to a website accordingly. In a specific implementation, if the browser side finds that the user reached a web page by clicking a link, it records the URL of the source page containing the link along with the URL of the accessed page. For example, suppose the user clicks a link to web page B in the content of web page A and thereby browses web page B. The content of web page A then contains a link to web page B, and web page A is called the source page of the link to web page B. When the browser side reports the URL of web page B to the search engine server, it also reports the URL of web page A, thereby informing the server that the user accessed web page B and that web page A contains a link to web page B. With this information, the search engine server can compare the URLs of web pages A and B. If the domain names of the two URLs are the same, the link is an internal link of the website. If the domain names differ, the link is an external link, that is, an inbound link from the website of web page A to the website of web page B. In the same way, the URL of web page B can be compared with the URLs of the other source pages that link to it to count the external links to web page B. The external links to the other web pages under the website of web page B can be counted likewise, and merging the external link counts of all web pages under the same website gives the number of external links to the website.
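The comparison of source and target domain names can be sketched in Python as follows. The function name `external_links_per_site` and the report format (a pair of target URL and optional source URL) are hypothetical. The sketch also assumes that the same source-to-target link reported several times counts as one external link, and that the host name identifies the website; neither detail is fixed by the description above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple
from urllib.parse import urlsplit


def external_links_per_site(
    reports: Iterable[Tuple[str, Optional[str]]]
) -> Dict[str, int]:
    """Count distinct inbound links from other websites, per target website.

    Each report is (target_url, source_url); source_url is None when the
    page was not reached through a link.
    """
    links: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for target_url, source_url in reports:
        if source_url is None:
            continue
        target_site = urlsplit(target_url).hostname
        source_site = urlsplit(source_url).hostname
        if target_site is None or source_site is None:
            continue  # skip URLs that cannot be attributed to a website
        if target_site == source_site:
            continue  # same domain name: an internal link, not counted
        links[target_site].add((source_url, target_url))
    return {site: len(pairs) for site, pairs in links.items()}
```

Merging happens implicitly: every external link to any page of a website is stored under that website's key, so the final count is the total over all of its pages.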
As can be seen from the above, the search engine server can use the web page access information reported by browser sides to count the number of web pages a website contains and the number of external links to the website. It can then determine the weight of each website in combination with other parameters and, when providing search results to a user, rank the results according to the weight of the website each result belongs to before presenting them. Because the number of web pages and the number of external links are counted from the actual browsing activity reported by browser sides, the statistics are more accurate and more timely than those a search engine obtains by crawling web pages at a fixed frequency and then counting.
In addition, because the web page access information of users can be obtained from the browser side, the embodiments of the present invention can count not only the number of web pages a website contains and the number of external links to the website, but also information about user visits to the website. Since this visit information also reflects the authority of the website to some extent, it can be used to adjust the website's authority weight on top of the weight calculated in the conventional way.
In a specific implementation, the visit count of each web page under the same website is first compiled, and the weight of the website is then adjusted according to those visit counts. The visit counts can be compiled as follows. At the browser side, whenever the user is found to have accessed a web page, the information for that access is uploaded. At the search engine server side, each time a message from a browser side is received, the URL of the web page is parsed from it and the database is checked for that URL. If the URL is not present, the website the page belongs to is determined from the domain name of the URL, the page is added to the database, and its visit count is set to 1. If the page is already in the database, 1 is added to its current visit count. Other web pages are handled in the same way. In this way, for each website, the visit count of each of its web pages is obtained along with the set of web pages it contains.
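The server-side counting procedure can be sketched in Python with an in-memory dictionary standing in for the database. The class name `VisitCounter` and its methods are hypothetical, and the host name of the URL is assumed to identify the website.

```python
from typing import Dict
from urllib.parse import urlsplit


class VisitCounter:
    """In-memory stand-in for the server-side database (illustrative only)."""

    def __init__(self) -> None:
        # website -> {page URL -> visit count}
        self._visits: Dict[str, Dict[str, int]] = {}

    def record(self, url: str) -> None:
        """Process one reported access."""
        site = urlsplit(url).hostname
        if site is None:
            raise ValueError(f"URL has no host: {url!r}")
        pages = self._visits.setdefault(site, {})
        # A new page starts at 1; a known page is incremented by 1.
        pages[url] = pages.get(url, 0) + 1

    def page_visits(self, site: str) -> Dict[str, int]:
        """Return a copy of the per-page visit counts of one website."""
        return dict(self._visits.get(site, {}))

    def page_count(self, site: str) -> int:
        """Return the number of distinct pages seen for one website."""
        return len(self._visits.get(site, {}))
```

A single pass over the reports therefore yields both the page count of a website and the visit count of each of its pages.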
The weight of a website can be adjusted according to the visit counts of its web pages in several ways. In one approach, the website is weighted according to the number of web pages under it whose visit count exceeds a preset threshold: the more heavily visited pages a website has, or the larger the proportion of heavily visited pages, the larger the website's weight. In another approach, the website is weighted according to its total visit count. That is, after the visit count of each web page under the website has been compiled, the counts are added to obtain the total visit count of the website, and a website with a larger total visit count receives a higher weight.
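The two weighting approaches can be sketched in Python as follows. The description does not give a formula, so the multiplicative adjustment, the logarithmic damping, and the parameters `bonus_per_page` and `scale` are assumptions chosen only to show that the weight grows with the statistic in question.

```python
import math
from typing import Mapping


def weight_by_hot_pages(base_weight: float, page_visits: Mapping[str, int],
                        threshold: int, bonus_per_page: float = 0.01) -> float:
    """Raise the weight by a fixed fraction per page whose visits exceed the threshold."""
    hot_pages = sum(1 for visits in page_visits.values() if visits > threshold)
    return base_weight * (1.0 + bonus_per_page * hot_pages)


def weight_by_total_visits(base_weight: float, page_visits: Mapping[str, int],
                           scale: float = 0.1) -> float:
    """Raise the weight with the total visit count, damped logarithmically."""
    total = sum(page_visits.values())
    # log1p keeps the adjustment at zero for a website with no recorded visits
    return base_weight * (1.0 + scale * math.log1p(total))
```

Both functions leave the base weight unchanged for a website with no qualifying visits and increase monotonically otherwise, which is the only property the text requires.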
Besides weighting by visit count, a website can also be weighted according to its number of visiting users. Visit count and number of visiting users are different concepts. Visit count refers to the total number of accesses, and two websites with comparable total visit counts may have different numbers of visiting users. For example, the visits to one website may come from many users while the visits to the other come from only a few. In that case the first website has more visiting users than the second, and its authority is accordingly greater. In a specific implementation, so that the search engine server can count the visiting users of each website, the browser side can also include user information when reporting web page access information. The user information may be the account the user has registered and logged in with, the user's IP address, and so on. From the user information, the search engine server can count the visiting users of each web page, add the counts for the web pages under the same website to obtain the website's total number of visiting users, and weight the website accordingly: a website with a larger total number of visiting users receives a higher weight. When weighting by visiting users, factors other than the total number of visiting users can also be used. For example, as with visit count, the website can be weighted according to the number of web pages with many visiting users, or according to the proportion of such pages.
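Counting visiting users can be sketched in Python as follows. The function name `visitor_stats` and the report format (a pair of page URL and user identifier) are hypothetical. The sketch returns the per-page counts and the website total obtained by adding them, as described above. It also returns a count of distinct users per website, which is an added variant, not part of the description, that avoids counting a user twice when the user visits several pages of one website.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple
from urllib.parse import urlsplit


def visitor_stats(reports: Iterable[Tuple[str, str]]):
    """Compile visiting-user counts from (page_url, user_id) reports.

    Returns (per_page, summed_per_site, distinct_per_site).
    """
    users_by_page: Dict[str, Set[str]] = defaultdict(set)
    for url, user_id in reports:
        users_by_page[url].add(user_id)  # repeat visits by one user count once

    per_page = {url: len(users) for url, users in users_by_page.items()}

    summed: Dict[str, int] = defaultdict(int)
    distinct: Dict[str, Set[str]] = defaultdict(set)
    for url, users in users_by_page.items():
        site = urlsplit(url).hostname  # host name assumed to identify the website
        summed[site] += len(users)     # total as described: sum of per-page counts
        distinct[site] |= users        # variant: each user counted once per website
    return per_page, dict(summed), {s: len(u) for s, u in distinct.items()}
```

The user identifier can be an account name or an IP address; the sketch treats it as an opaque string.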
In summary, the search engine server can use the web page access information reported by browser sides to count the number of web pages a website contains and the number of external links to the website, and can then determine the weight of each website in combination with other parameters (such as web page update frequency). When the URLs in a URL library need to be downloaded, the downloads can be scheduled according to the weight of the website each URL belongs to. The website weight can also be applied in other settings. For example, when related web pages are recommended to a user based on the page the user is currently visiting, the related pages can be ranked by the weight of their websites. Alternatively, the weight of designated websites can be used for application recommendation: if a candidate application comes from a designated website, the weight of that website is added to its original score, the candidates are then ranked on the combined score, and the several highest-scoring applications are output. Because the number of web pages and the number of external links are counted from the actual browsing activity reported by browser sides, the statistics are more accurate and more timely than those a search engine obtains by crawling web pages at a fixed frequency and then counting.
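The application recommendation example can be sketched in Python as follows. The function name `recommend`, the candidate format (name, website, original score), and the sample data in the usage note are hypothetical; only the rule of adding the website weight to candidates from designated websites and outputting the highest-scoring ones comes from the text.

```python
from typing import Iterable, List, Mapping, Set, Tuple


def recommend(candidates: Iterable[Tuple[str, str, float]],
              site_weights: Mapping[str, float],
              designated_sites: Set[str],
              top_k: int = 3) -> List[str]:
    """Rank candidate applications and return the names of the top_k.

    Each candidate is (name, website, original_score). A candidate from a
    designated website has that website's weight added to its score.
    """
    scored = []
    for name, site, score in candidates:
        if site in designated_sites:
            score += site_weights.get(site, 0.0)
        scored.append((score, name))
    # Stable sort: candidates with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]
```

For instance, with candidates `("app1", "a.example", 1.0)` and `("app2", "b.example", 1.5)`, a weight of 1.0 for `a.example`, and `a.example` as the only designated website, `app1` is ranked ahead of `app2`.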
Corresponding to the method for determining website weight provided by the embodiments of the present invention, an embodiment of the present invention further provides a device for determining website weight. Referring to Fig. 2, the device may comprise:
A browser-side processing unit 201, located at the browser side, for reporting a user's web page access information to a search engine server; the web page access information comprises: the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing the link;
A search engine processing unit 202, located at the search engine server side, for compiling authority information for a website from the web page access information collected from multiple browser sides, the authority information comprising the number of web pages the website contains and the number of external links to the website, so that the search engine server determines the weight of the website according to the authority information.
In a specific implementation, the device may further comprise:
A visit-count adjustment unit, for counting the visits to each web page under the same website and adjusting the weight of the website according to the visit count of each web page under the website.
The visit-count adjustment unit comprises:
A first weighting subunit, for weighting the website according to the number of web pages under the website whose visit count exceeds a first preset threshold;
Or,
A second weighting subunit, for weighting the website according to the total visit count of the website.
In another embodiment, the web page access information reported by the browser side further comprises information about the user who accessed the web page, in which case the device may further comprise:
A visiting-user adjustment unit, for counting the visiting users of each web page under the same website and adjusting the weight of the website according to the number of visiting users of each web page under the website.
Specifically, the visiting-user adjustment unit may comprise:
A third weighting subunit, for weighting the website according to the number of web pages under the website whose number of visiting users exceeds a second preset threshold;
Or,
A fourth weighting subunit, for weighting the website according to the total number of visiting users of the website.
With this device, the search engine server can use the web page access information reported by browser sides to count the number of web pages a website contains and the number of external links to the website, and can then determine the weight of each website in combination with other parameters (such as web page update frequency). When the URLs in a URL library need to be downloaded, the downloads can be scheduled according to the weight of the website each URL belongs to. The website weight can also be applied in other settings. For example, when related web pages are recommended to a user based on the page the user is currently visiting, the related pages can be ranked by the weight of their websites. Alternatively, the weight of designated websites can be used for application recommendation: if a candidate application comes from a designated website, the weight of that website is added to its original score, the candidates are then ranked on the combined score, and the several highest-scoring applications are output. Because the number of web pages and the number of external links are counted from the actual browsing activity reported by browser sides, the statistics are more accurate and more timely than those a search engine obtains by crawling web pages at a fixed frequency and then counting.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware platform. On this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and comprises instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device and system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts of the method embodiments may be consulted. The device and system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. A person of ordinary skill in the art can understand and implement the embodiments without creative effort.
The method and device for determining website weight provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. A person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.