Summary of the invention
The invention provides a method and a device for determining website weight, which can improve both the accuracy of the data a search engine collects and the timeliness with which that data is updated.
The invention provides the following solutions:
A method for determining website weight, comprising:
A browser end reports a user's page-visit information to a search engine server; the page-visit information comprises: unique identification information of the visited web page and, when the user visits the target web page corresponding to a link, unique identification information of the source web page in which the link is located;
The search engine server compiles authority information of websites according to the page-visit information collected from a plurality of browser ends, the authority information of a website comprising the number of web pages the website contains and the number of external links of the website, so that the search engine server determines the weight of the website according to the authority information of the website.
Wherein, the method further comprises:
Counting the visit count of each web page under the same website, and adjusting the weight of the website according to the visit count of each web page under the same website.
Wherein, adjusting the weight of the website according to the visit count of each web page under the same website comprises:
Weighting the website according to the number of web pages under the same website whose visit count exceeds a first preset threshold;
Or,
Weighting the website according to the total visit count of the same website.
Wherein, the page-visit information reported by the browser end further comprises user information of the user visiting the web page, and the method further comprises:
Counting the number of visiting users of each web page under the same website, and adjusting the weight of the website according to the number of visiting users of each web page under the same website.
Wherein, adjusting the weight of the website according to the number of visiting users of each web page under the same website comprises:
Weighting the website according to the number of web pages under the same website whose number of visiting users exceeds a second preset threshold;
Or,
Weighting the website according to the total number of visiting users of the same website.
A device for determining website weight, comprising:
A browser-end processing unit, located at the browser end and configured to report a user's page-visit information to a search engine server; the page-visit information comprises: unique identification information of the visited web page and, when the user visits the target web page corresponding to a link, unique identification information of the source web page in which the link is located;
A search engine processing unit, located at the search engine server end and configured to compile authority information of websites according to the page-visit information collected from a plurality of browser ends, the authority information of a website comprising the number of web pages the website contains and the number of external links of the website, so that the search engine server determines the weight of the website according to the authority information of the website.
Wherein, the device further comprises:
A visit-count adjustment unit, configured to count the visit count of each web page under the same website and to adjust the weight of the website according to the visit count of each web page under the same website.
Wherein, the visit-count adjustment unit comprises:
A first weighting subunit, configured to weight the website according to the number of web pages under the same website whose visit count exceeds a first preset threshold;
Or,
A second weighting subunit, configured to weight the website according to the total visit count of the same website.
Wherein, the page-visit information reported by the browser end further comprises user information of the user visiting the web page, and the device further comprises:
A visiting-user-count adjustment unit, configured to count the number of visiting users of each web page under the same website and to adjust the weight of the website according to the number of visiting users of each web page under the same website.
Wherein, the visiting-user-count adjustment unit comprises:
A third weighting subunit, configured to weight the website according to the number of web pages under the same website whose number of visiting users exceeds a second preset threshold;
Or,
A fourth weighting subunit, configured to weight the website according to the total number of visiting users of the same website.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects:
With the invention, the search engine server can use the page-visit information reported by browser ends to count the number of web pages a website contains and the number of external links of the website; combined with other parameters (such as page update frequency), this allows the weight of each website to be determined. When the URLs in a URL library need to be downloaded, the downloads can then be scheduled according to the weight of the website to which each URL belongs. The website weight can of course also be applied in other scenarios. For example, when other related web pages are recommended to a user according to the web page the user is currently visiting, the related web pages can likewise be sorted according to the weight of the website to which each belongs. Alternatively, the weight of a designated website can be used for application recommendation: if a candidate application comes from the designated website, the weight of that website is added to the candidate's original score to raise its score, the candidates are then ranked on the combined score, and the several highest-scoring applications are output, and so on. Because the number of web pages a website contains and the number of its external links are counted from the page-visit information reported by browser ends, the statistics can be more accurate and can be updated more promptly than statistics compiled from web pages that the search engine crawls at a fixed frequency.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention fall within the scope of protection of the invention.
Referring to Fig. 1, the method for determining website weight provided by an embodiment of the invention comprises the following steps:
S101: the browser end reports the user's page-visit information to the search engine server; the page-visit information comprises: unique identification information of the visited web page and, when the user follows a link to its target web page, unique identification information of the source web page in which the link is located;
The embodiments of the invention improve the quality of the search results a search engine provides by improving the accuracy and the update timeliness of the data on which website authority is judged. In a specific implementation, the browser is combined with the search engine. As the tool with which the user visits web pages, the browser can obtain the details of each visit, including the unique identification information of the visited web page and, if the visited web page is the target of a link, the URL of the source web page in which that link is located, and so on. When the user visits the target web page of a link in a source web page, the user may click the link directly in the source web page, or may copy the link from the source web page into the browser's address bar, or use a similar approach; in any case, the browser end can capture the unique identifier of the source web page containing the link from the operations the user performs. The unique identifier of a web page may be its URL (Uniform/Universal Resource Locator). To a certain extent, the page title or the MD5 value of the page content can also serve as a unique identifier of the web page, so reporting these to the server is also possible. For ease of description, the following text uses the URL as the example throughout.
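As an illustration of the information reported in step S101, the following is a minimal sketch of one possible report record. The names `VisitReport`, `build_report`, and `content_fingerprint`, and the field names, are hypothetical and are not taken from the embodiment; the sketch only assumes that a report carries the identifier of the visited page plus, when a link was followed, the identifier of the source page.

```python
import hashlib
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class VisitReport:
    # Unique identifier of the visited page (the URL in this sketch).
    page_id: str
    # Identifier of the page containing the followed link; None when the
    # page was not reached through a link.
    source_id: Optional[str]


def content_fingerprint(text: str) -> str:
    """Alternative identifier: MD5 digest of a page title or page content."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_report(url: str, source_url: Optional[str] = None) -> dict:
    """Build the dictionary a browser end might send to the server."""
    return asdict(VisitReport(page_id=url, source_id=source_url))
```

A call such as `build_report("http://b.example/page", "http://a.example/index")` would describe a visit to a page reached from a link on another page.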
It should be noted that, because users' computing environments differ in practice, for example in operating system and browser type, the monitoring of web page visits can be implemented in several ways. For example, a browser with a built-in monitoring function may be used, which records the page-visit information whenever the user visits a web page with it. The browser may be Internet Explorer (IE for short), which ships with the Windows operating system, or a third-party browser. A third-party browser here usually means browser software other than IE that runs on Windows; such browsers typically offer users considerable convenience through rich, distinctive functional design and personalized extensions.
In addition, for browsers that support plug-in extensions, the function may also be implemented by a plug-in program that starts together with the browser. A plug-in is an application written to a given application programming interface specification that the host program can call to handle a particular task. In the embodiments of the invention, for a browser that lacks the function of obtaining and reporting page-visit information but supports browser plug-in extensions, implementing the function through a plug-in program is also an effective approach.
Furthermore, the function may be performed by a program that is neither a browser nor a browser plug-in, such as a monitoring program or a monitoring component of a program, which detects the access requests the user sends for target web pages and reports the page-visit information.
It should also be noted that, in a specific implementation, the browser end may report the information about a visit to the search engine server as soon as it detects that the user has visited a web page; alternatively, it may first record the user's visit information locally and report it to the search engine server once a certain time interval has elapsed or the recorded data reaches a certain amount, and so on.
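The deferred-reporting option can be sketched as a small buffer that flushes when either a record limit or an age limit is reached. The class name `ReportBuffer`, its parameters, and the default limits are hypothetical; the embodiment does not specify particular values. In this sketch the age limit is only checked when a new record arrives, which is a simplification.

```python
import time
from typing import Callable, List, Optional


class ReportBuffer:
    """Collects visit reports locally and sends them in batches."""

    def __init__(
        self,
        send: Callable[[List[dict]], None],
        max_records: int = 50,           # assumed limit, not from the source
        max_age_seconds: float = 300.0,  # assumed limit, not from the source
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._max_records = max_records
        self._max_age = max_age_seconds
        self._clock = clock
        self._pending: List[dict] = []
        self._oldest: Optional[float] = None

    def add(self, report: dict) -> None:
        # Remember when the first record of the current batch arrived.
        if not self._pending:
            self._oldest = self._clock()
        self._pending.append(report)
        too_many = len(self._pending) >= self._max_records
        too_old = self._clock() - self._oldest >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        # Send whatever is pending and start a new batch.
        if self._pending:
            batch, self._pending = self._pending, []
            self._oldest = None
            self._send(batch)
```

Setting `max_records=1` would reproduce the immediate-reporting option, since every record would then trigger a flush.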
S102: the search engine server compiles authority information of websites according to the page-visit information collected from a plurality of browser ends, the authority information of a website comprising the number of web pages the website contains and the number of external links of the website, so that the search engine server determines the weight of the website according to the authority information of the website.
Because a browser has a large user base, the search engine server can collect a large amount of page-visit information from numerous browser ends and can then compile the authority information of websites from this information.
For ease of understanding, the difference between a website and a web page is first briefly explained. Generally speaking, a website is a collection of content with its own domain name and its own storage space. The content may be web pages, programs, or other files, and there need not be many web pages: as long as there is an independent domain name and storage space, even a single page counts as a website. A web page is a component of a website and the platform that carries the website's applications. A web page is a single browsable page, for example a blog or personal homepage hosted on someone else's site, a company page within a site-building system, a merchant page in a multi-user online mall, and so on. The URL of a web page, also called the web page address, is the standard address of a resource on the Internet. A complete generic uniform resource identifier with an authority part generally has the following syntax:
scheme://username:password@subdomain.domain.top-level-domain:port/directory/filename.extension?parameter=value#fragment
As can be seen, the URL of a web page contains the domain name of the website to which it belongs, so whether web pages belong to the same website can be determined by checking whether the domain names in their URLs are the same. After collecting the URLs of user-visited web pages from browsers, the search engine server can therefore group the web pages whose URLs contain the same domain name, and in this way count the number of web pages contained in each website. Because the browser ends report users' page-visit information continuously, the search engine server can also keep updating the number of web pages contained in a website according to the data the browser ends upload.
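The grouping of reported URLs by domain name can be sketched as follows. The function names `site_of` and `pages_per_site` are hypothetical. As a simplifying assumption, the sketch treats the full host name as the website key; mapping subdomains to a single registered domain would need additional rules that the embodiment does not describe.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set
from urllib.parse import urlsplit


def site_of(url: str) -> Optional[str]:
    """Return the lower-cased host name of a URL, or None if it has none."""
    return urlsplit(url).hostname


def pages_per_site(urls: Iterable[str]) -> Dict[str, int]:
    """Count the distinct reported pages under each website."""
    pages: Dict[str, Set[str]] = defaultdict(set)
    for url in urls:
        site = site_of(url)
        if site is not None:
            pages[site].add(url)  # a set ignores repeated reports of a page
    return {site: len(found) for site, found in pages.items()}
```

Because the pages are held in sets, re-running the count as new reports arrive keeps the page total for each website up to date without double counting.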
Meanwhile, because the browser end also uploads the link through which the user reached a web page, the search engine server can count the number of external links of a website accordingly. In a specific implementation, if the browser end finds that the user visited a web page by clicking a link, it records the URL of the visited web page together with the URL of the source web page in which the link is located. For example, suppose the user clicks a link to web page B in the content of web page A in order to browse web page B. The content of web page A then contains a link to web page B, and web page A is accordingly called the source web page of the link to web page B. When the browser end reports the URL of web page B to the search engine server, it also reports the URL of web page A, thereby informing the search engine server that the user visited web page B and that web page A contains a link to web page B. With this information, the search engine server can compare the URLs of web page B and web page A: if the domain names of the two URLs are the same, the link is an internal link of the website; if the domain names differ, the link is an external link, that is, a link leading from the website of web page A into the website of web page B. In the same way, the URL of web page B can be compared with the URLs of the other source web pages that link to it to count the external links of web page B, and the external links of the other web pages under the same website as web page B can be counted as well. Merging the external link counts of all web pages under the same website gives the number of external links of that website.
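The comparison of source and target domain names can be sketched as below. The function name `external_link_counts` and the tuple layout of a report are hypothetical. The sketch also assumes that a repeated report of the same source-to-target link is counted only once; the embodiment does not state how repeats are handled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple
from urllib.parse import urlsplit


def site_of(url: str) -> Optional[str]:
    """Host name of a URL, used here as the website key (an assumption)."""
    return urlsplit(url).hostname


def external_link_counts(
    reports: Iterable[Tuple[str, Optional[str]]]
) -> Dict[str, int]:
    """Count external links per target website.

    Each report is (visited_url, source_url); source_url is None when the
    page was not reached through a link.
    """
    seen: Set[Tuple[str, str]] = set()
    counts: Dict[str, int] = defaultdict(int)
    for target, source in reports:
        if source is None:
            continue
        target_site, source_site = site_of(target), site_of(source)
        if target_site is None or source_site is None:
            continue
        if target_site == source_site:
            continue  # same domain name: internal link, not counted
        if (source, target) in seen:
            continue  # assumed rule: count each distinct link once
        seen.add((source, target))
        counts[target_site] += 1
    return dict(counts)
```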
As the foregoing shows, the search engine server can use the page-visit information reported by browser ends to count the number of web pages a website contains and the number of external links of the website. Combined with other parameters, this allows the weight of each website to be determined, so that when search results are provided to a user they can be sorted according to the weight of the website to which each result belongs and then presented to the user. Because the number of web pages a website contains and the number of its external links are counted from the page-visit information reported by browser ends, the statistics can be more accurate and can be updated more promptly than statistics compiled from web pages that the search engine crawls at a fixed frequency.
In addition, because the embodiments of the invention obtain users' page-visit information from the browser end, information about user visits to a website can be derived in addition to the number of web pages the website contains and the number of its external links. Because this visit-related information also reflects the authority of the website to a certain extent, the authority weight of the website can be further adjusted on top of the website weight calculated in the conventional way.
In a specific implementation, the visit count of each web page under the same website is counted first, and the weight of the website is then adjusted according to those visit counts. The visit count of each web page under the same website may be counted as follows. The browser end uploads the information about a visit whenever it finds that the user has visited a web page. Each time the search engine server receives information sent by a browser end, it parses the URL of the web page from that information and checks whether the URL already exists in the database. If it does not, the server determines the website to which the web page belongs according to the domain name of the URL, adds the web page to the database, and sets its visit count to 1. If the web page already exists in the database, its current visit count is incremented by 1. Other web pages are handled in the same way. For each website, the server thus obtains both the set of web pages the website contains and the number of user visits to each of those web pages.
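The insert-or-increment step described above can be sketched with an in-memory dictionary standing in for the database. The class name `VisitStore` and its layout are hypothetical, and the host name is again assumed to serve as the website key.

```python
from typing import Dict
from urllib.parse import urlsplit


class VisitStore:
    """In-memory stand-in for the server-side database of visit counts."""

    def __init__(self) -> None:
        # website -> {page URL -> visit count}
        self.visits: Dict[str, Dict[str, int]] = {}

    def record(self, url: str) -> None:
        site = urlsplit(url).hostname
        if site is None:
            return  # ignore reports whose URL has no host name
        pages = self.visits.setdefault(site, {})
        # A new page starts at 1; a known page is incremented by 1.
        pages[url] = pages.get(url, 0) + 1
```

After a series of `record` calls, `len(store.visits[site])` gives the number of known pages of a website and the inner dictionary gives the visit count of each page.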
There are several ways to weight a website according to the visit counts of the web pages under it. In one approach, the website weight is weighted according to the number of web pages under the same website whose visit count exceeds a preset threshold; that is, the more heavily visited web pages a website has, or the larger the proportion of heavily visited web pages, the greater the weight of the website. In another approach, the website is weighted according to its total visit count. That is, after the visit count of each web page under the same website has been counted, the visit counts of the individual web pages are added together to obtain the total visit count of the website; the larger the total visit count of a website, the higher its weight.
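The two weighting approaches can be sketched as follows. The embodiment only states that the weight grows with the number of heavily visited pages or with the total visit count; the linear and logarithmic formulas, the function names, and the parameters `boost_per_page` and `scale` are illustrative assumptions.

```python
import math
from typing import Dict


def weight_by_popular_pages(
    base_weight: float,
    page_visits: Dict[str, int],
    threshold: int,
    boost_per_page: float = 0.01,  # assumed tuning parameter
) -> float:
    """Raise the weight for each page whose visit count exceeds the threshold."""
    popular = sum(1 for visits in page_visits.values() if visits > threshold)
    return base_weight * (1.0 + boost_per_page * popular)


def weight_by_total_visits(
    base_weight: float,
    page_visits: Dict[str, int],
    scale: float = 0.1,  # assumed tuning parameter
) -> float:
    """Raise the weight with the total visit count, damped by a logarithm."""
    total = sum(page_visits.values())
    return base_weight * (1.0 + scale * math.log1p(total))
```

Both functions are monotonic in the quantity they measure, which is the only property the description requires; any other increasing function could be substituted.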
Besides weighting according to visit count, a website can also be weighted according to its number of visiting users. Visit count and number of visiting users are different concepts: visit count is the total number of visits, and two websites with comparable total visit counts may have different numbers of visiting users. For example, the visits to one website may come from many users while the visits to the other come from only a few; the first website then has more visiting users than the second and, correspondingly, greater authority. In a specific implementation, so that the search engine server can count the number of visiting users of each website, the browser end can also include user information when reporting page-visit information. The user information may be the account with which the user registered and logged in, or the user's IP address, and so on. From the user information, the search engine server can count the number of visiting users of each web page, add up the numbers of visiting users of the web pages under the same website to obtain the total number of visiting users of the website, and then weight the website according to that total; the larger the total number of visiting users of a website, the higher its weight. When weighting according to the number of visiting users, factors other than the website's total number of visiting users can also be used. For example, as with visit count, the website can be weighted according to the number of web pages in the website that have many visiting users, or according to the proportion of such web pages, and so on.
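Counting visiting users, as opposed to visits, can be sketched as follows. The function names and the `(url, user_id)` report layout are hypothetical; `user_id` stands for whatever user information is reported, such as an account name or an IP address. The site total is formed by adding the per-page figures, as described above, so a user who visits two pages of the same website contributes to both.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple
from urllib.parse import urlsplit


def visitors_per_page(reports: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct users per page from (url, user_id) reports."""
    users: Dict[str, Set[str]] = defaultdict(set)
    for url, user_id in reports:
        users[url].add(user_id)  # repeat visits by one user count once
    return {url: len(ids) for url, ids in users.items()}


def total_visitors_per_site(reports: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Add up the per-page visiting-user counts of each website."""
    totals: Dict[str, int] = defaultdict(int)
    for url, count in visitors_per_page(reports).items():
        site = urlsplit(url).hostname
        if site is not None:
            totals[site] += count
    return dict(totals)
```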
In short, the search engine server can use the page-visit information reported by browser ends to count the number of web pages a website contains and the number of external links of the website; combined with other parameters (such as page update frequency), this allows the weight of each website to be determined. When the URLs in a URL library need to be downloaded, the downloads can then be scheduled according to the weight of the website to which each URL belongs. The website weight can of course also be applied in other scenarios. For example, when other related web pages are recommended to a user according to the web page the user is currently visiting, the related web pages can likewise be sorted according to the weight of the website to which each belongs. Alternatively, the weight of a designated website can be used for application recommendation: if a candidate application comes from the designated website, the weight of that website is added to the candidate's original score to raise its score, the candidates are then ranked on the combined score, and the several highest-scoring applications are output, and so on. Because the number of web pages a website contains and the number of its external links are counted from the page-visit information reported by browser ends, the statistics can be more accurate and can be updated more promptly than statistics compiled from web pages that the search engine crawls at a fixed frequency.
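Two of the uses of the website weight mentioned above, download scheduling and application recommendation, can be sketched as follows. The function names, the candidate tuple layout, and the default weight of 0.0 for unknown websites are hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple
from urllib.parse import urlsplit


def schedule_downloads(urls: List[str], site_weights: Dict[str, float]) -> List[str]:
    """Order URLs so that those from higher-weight websites come first."""
    def weight(url: str) -> float:
        # Unknown websites get weight 0.0 (an assumption of this sketch).
        return site_weights.get(urlsplit(url).hostname, 0.0)
    return sorted(urls, key=weight, reverse=True)


def recommend_apps(
    candidates: List[Tuple[str, str, float]],
    site_weights: Dict[str, float],
    designated_sites: Set[str],
    top_n: int = 3,
) -> List[str]:
    """Rank (app_name, source_site, base_score) candidates.

    Candidates from a designated website have that website's weight added
    to their original score before ranking.
    """
    scored = []
    for name, site, score in candidates:
        if site in designated_sites:
            score += site_weights.get(site, 0.0)
        scored.append((score, name))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:top_n]]
```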
Corresponding to the method for determining website weight provided by the embodiments of the invention, an embodiment of the invention further provides a device for determining website weight. Referring to Fig. 2, the device may comprise:
A browser-end processing unit 201, located at the browser end and configured to report a user's page-visit information to a search engine server; the page-visit information comprises: unique identification information of the visited web page and, when the user visits the target web page corresponding to a link, unique identification information of the source web page in which the link is located;
A search engine processing unit 202, located at the search engine server end and configured to compile authority information of websites according to the page-visit information collected from a plurality of browser ends, the authority information of a website comprising the number of web pages the website contains and the number of external links of the website, so that the search engine server determines the weight of the website according to the authority information of the website.
In a specific implementation, the device may further comprise:
A visit-count adjustment unit, configured to count the visit count of each web page under the same website and to adjust the weight of the website according to the visit count of each web page under the same website.
Wherein, the visit-count adjustment unit comprises:
A first weighting subunit, configured to weight the website according to the number of web pages under the same website whose visit count exceeds a first preset threshold;
Or,
A second weighting subunit, configured to weight the website according to the total visit count of the same website.
In another embodiment, the page-visit information reported by the browser end further comprises user information of the user visiting the web page, in which case the device may further comprise:
A visiting-user-count adjustment unit, configured to count the number of visiting users of each web page under the same website and to adjust the weight of the website according to the number of visiting users of each web page under the same website.
Specifically, the visiting-user-count adjustment unit may comprise:
A third weighting subunit, configured to weight the website according to the number of web pages under the same website whose number of visiting users exceeds a second preset threshold;
Or,
A fourth weighting subunit, configured to weight the website according to the total number of visiting users of the same website.
The search engine server can use the page-visit information reported by browser ends to count the number of web pages a website contains and the number of external links of the website; combined with other parameters (such as page update frequency), this allows the weight of each website to be determined. When the URLs in a URL library need to be downloaded, the downloads can then be scheduled according to the weight of the website to which each URL belongs. The website weight can of course also be applied in other scenarios. For example, when other related web pages are recommended to a user according to the web page the user is currently visiting, the related web pages can likewise be sorted according to the weight of the website to which each belongs. Alternatively, the weight of a designated website can be used for application recommendation: if a candidate application comes from the designated website, the weight of that website is added to the candidate's original score to raise its score, the candidates are then ranked on the combined score, and the several highest-scoring applications are output, and so on. Because the number of web pages a website contains and the number of its external links are counted from the page-visit information reported by browser ends, the statistics can be more accurate and can be updated more promptly than statistics compiled from web pages that the search engine crawls at a fixed frequency.
From the description of the above embodiments, those skilled in the art can clearly understand that the invention can be implemented by software together with a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the invention or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the others. In particular, the device and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant parts, refer to the corresponding description of the method embodiments. The device and system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The method and device for determining website weight provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementations of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. For those of ordinary skill in the art, changes may be made to the specific implementations and the scope of application in accordance with the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.