CN102663054B - Method and device for determining website weight - Google Patents

Method and device for determining website weight

Info

Publication number
CN102663054B
Authority
CN
China
Prior art keywords
website
webpage
web page
search engine
accessed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210089527.7A
Other languages
Chinese (zh)
Other versions
CN102663054A (en)
Inventor
李铁钧
张绍瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
360 Science And Technology Co Ltd
Original Assignee
Tianjin Qi Si Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Qi Si Science And Technology Ltd filed Critical Tianjin Qi Si Science And Technology Ltd
Priority to CN201210089527.7A priority Critical patent/CN102663054B/en
Publication of CN102663054A publication Critical patent/CN102663054A/en
Application granted granted Critical
Publication of CN102663054B publication Critical patent/CN102663054B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a method and a device for determining the weight of a website. In the method, a browser end reports a user's web page access information to a search engine server. The web page access information comprises the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing that link. The search engine server compiles authority information for each website from the web page access information collected from multiple browser ends; the authority information comprises the number of web pages the website contains and the number of the website's external links. The search engine server then determines the weight of the website according to this authority information. The invention can improve the accuracy of the data a search engine collects and the timeliness of its updates.

Description

Method and device for determining website weight
Technical field
The present invention relates to the field of search engine technology, and in particular to a method and a device for determining the weight of a website.
Background art
With the spread of computers and the growth of the Internet, people use networks more and more, and computer networks have gradually become an indispensable tool in daily life. Search engines provide rich information services, supplying users with information and data on every subject; they are widely used and bring great convenience to daily life and work.
A search engine website is a type of website that specializes in providing retrieval services on the Internet. A user enters a search term (query) in the interface the search engine provides and obtains the search results the search engine returns for that term. Collecting the new pages and information that constantly appear on the Internet is a key part of running a search engine and is the basis on which a search engine website provides its service. The search engine server must constantly update its own URL library and download the web pages corresponding to the addresses in it, then process and consolidate the content of those pages and build an information database and an index database, so as to provide information retrieval and query services to users.
However, the number of web pages on the Internet is now enormous and growing very quickly, so downloading and analyzing every page within a short time is almost impossible. The pages behind the URLs that a search engine's crawler captures are only a part of the pages on the Internet, yet downloading even that part to the search engine server takes a large amount of resources. The usual approach is therefore for the search engine to assign priorities to the addresses in the URL library, generate and maintain a download queue, and schedule page downloads in order of the priority of the pages to be downloaded. The download priority of a page is set mainly according to factors such as the authority of the website it belongs to, so evaluating website authority accurately is a key step.
In the prior art, the authority of a website is determined mainly from the number of web pages the website contains, the update frequency of each of its pages, the number of its external links, the importance of the sources of those external links, and so on. (An external link is a link to a website that is published on another, external site such as a blog or forum; through it a visitor can be led from the other site to one's own.) To collect these parameters, however, a search engine generally depends on the web page data it has crawled or on users' clicks on search results, and differences in how pages are crawled introduce deviations of varying degree. For example, the number of pages a website contains depends on how much of the website the search engine has crawled: if a website contains many pages but the search engine has crawled only a small fraction of them, the page count known to the search engine is smaller than the actual page count. The update frequency of a page depends on how often the search engine crawls the website: if a page is updated very frequently but the search engine crawls the website rarely, the update frequency the search engine measures is lower than the actual one. The number of external links depends on link analysis over the massive number of pages on the Internet; if the analysis is not comprehensive, the data are again skewed. In addition, website developers and maintainers often use various means to distort these data so that their website receives a higher weight. In short, because of these factors, prior-art search engines suffer from defects such as inaccurate collected data and untimely data updates, which in turn lower the quality of the search results they finally provide.
Summary of the invention
The invention provides a method and a device for determining the weight of a website, which can improve the accuracy of the data a search engine collects and the timeliness of its updates.
The invention provides the following solutions:
A method for determining the weight of a website, comprising:
A browser end reports a user's web page access information to a search engine server; the web page access information comprises the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing the link;
The search engine server compiles authority information for a website from the web page access information collected from multiple browser ends, the authority information comprising the number of web pages the website contains and the number of the website's external links, so that the search engine server determines the weight of the website according to the authority information.
The method further comprises:
Counting the visits to each web page under the same website, and adjusting the weight of the website according to the visit count of each web page under the website.
Adjusting the weight of the website according to the visit count of each web page under the same website comprises:
Weighting the website according to the number of web pages under the website whose visit count exceeds a first preset threshold;
Or,
Weighting the website according to the total visit count of the website.
The web page access information reported by the browser end further comprises information on the user who accessed the web page, and the method further comprises:
Counting the visiting users of each web page under the same website, and adjusting the weight of the website according to the number of visiting users of each web page under the website.
Adjusting the weight of the website according to the number of visiting users of each web page under the same website comprises:
Weighting the website according to the number of web pages under the website whose number of visiting users exceeds a second preset threshold;
Or,
Weighting the website according to the total number of visiting users of the website.
A device for determining the weight of a website, comprising:
A browser end processing unit, located at the browser end, for reporting a user's web page access information to a search engine server; the web page access information comprises the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing the link;
A search engine processing unit, located at the search engine server end, for compiling authority information for a website from the web page access information collected from multiple browser ends, the authority information comprising the number of web pages the website contains and the number of the website's external links, so that the search engine server determines the weight of the website according to the authority information.
The device further comprises:
A visit-count adjustment unit, for counting the visits to each web page under the same website and adjusting the weight of the website according to the visit count of each web page under the website.
The visit-count adjustment unit comprises:
A first weighting subunit, for weighting the website according to the number of web pages under the website whose visit count exceeds a first preset threshold;
Or,
A second weighting subunit, for weighting the website according to the total visit count of the website.
The web page access information reported by the browser end further comprises information on the user who accessed the web page, and the device further comprises:
A visiting-user adjustment unit, for counting the visiting users of each web page under the same website and adjusting the weight of the website according to the number of visiting users of each web page under the website.
The visiting-user adjustment unit comprises:
A third weighting subunit, for weighting the website according to the number of web pages under the website whose number of visiting users exceeds a second preset threshold;
Or,
A fourth weighting subunit, for weighting the website according to the total number of visiting users of the website.
According to the specific embodiments provided by the invention, the invention achieves the following technical effects:
Through the present invention, from the page access information reported by browser ends, the search engine server can count the pages a website contains and the website's external links, and can then determine the weight of each website in combination with other parameters (such as page update frequency). When the addresses in the URL library need to be downloaded, downloads can be scheduled according to the weight of the website each address belongs to. The website weight can of course be applied elsewhere too. For example, when related pages are recommended to a user based on the page currently being accessed, the related pages can likewise be sorted by the weight of the websites they belong to. Or the weight of designated websites can be used for application recommendation: if a candidate application comes from a designated website, the weight of that website is added to the application's original score to raise its ranking value, a combined ranking is then performed, and the several applications with the highest scores are output. Because the page count and external link count are compiled from the page access actually reported by browser ends, they are more accurate, and are updated more promptly, than statistics compiled from pages that the search engine crawls at some fixed frequency.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the method provided by the embodiment of the present invention;
Fig. 2 is a schematic diagram of the device provided by the embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention fall within the scope of protection of the present invention.
Referring to Fig. 1, the method for determining the weight of a website provided by the embodiment of the present invention comprises the following steps:
S101: The browser end reports the user's web page access information to the search engine server; the web page access information comprises the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing the link;
In the embodiment of the present invention, the quality of the search results a search engine provides is improved by improving the accuracy and update timeliness of the data on which website authority is judged. In a specific implementation, the browser is combined with the search engine. As the tool with which the user accesses web pages, the browser can obtain the details of the user's access, including the unique identifier of the page the user accesses and, if that page is the target page of some link, the URL of the source page containing the link. When the user accesses the target page of a link in a source page, the user may click the link directly in the source page, or may for instance copy the link from the source page into the browser's address bar; either way, the browser end can capture the unique identifier of the source page containing the link from the operations the user performs. The unique identifier of a page can be its URL (Uniform/Universal Resource Locator); to a certain extent, the MD5 value of the page title or page content can also serve as the unique identifier, so reporting these to the server is also possible. For ease of description, the URL is used as the example below.
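The choice of unique identifier described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `page_identifier` and the fallback order (URL first, then an MD5 digest of the title, then of the content) are hypothetical and are not specified by the patent.

```python
import hashlib
from typing import Optional


def page_identifier(url: Optional[str] = None,
                    title: Optional[str] = None,
                    content: Optional[str] = None) -> str:
    """Return an identifier for a page (hypothetical helper).

    Assumption: prefer the URL; otherwise fall back to the MD5 digest
    of the title, and finally of the page content.
    """
    if url:
        return url
    text = title if title else content
    if not text:
        raise ValueError("need a URL, a title, or page content")
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```

A digest-based identifier is weaker than a URL, because two different pages can share a title; that is consistent with the text's "to a certain extent" qualification.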
It should be noted that, in practice, the computing environments people use, such as the operating system and browser type, differ, so the monitoring of the user's page access can be implemented in several ways. One is to use a browser with a monitoring function that records the page access information when the user accesses a page with it. The browser may be Internet Explorer (IE), the browser shipped with the Windows operating system, or another third-party browser. A third-party browser usually means non-IE browser software running on Windows; such browsers typically offer users a rich set of distinctive features and personalized extensions and thereby provide many convenient applications.
In addition, for a browser that supports plug-in extensions, the function can be implemented by a plug-in program that starts with the browser. A plug-in is a program written to a given application programming interface specification that the main program can call to handle certain tasks. In the embodiment of the present invention, for a browser that cannot itself obtain and report the user's page access information but supports browser plug-ins, implementing the function in a plug-in is also an effective approach.
Furthermore, the function can be performed by a program other than the browser and its plug-ins, for example a monitoring program or monitoring component, which detects the access requests the user sends for target pages and reports the user's page access information.
It should also be noted that, in a specific implementation, the browser end may report the information on a visit to the search engine server as soon as it finds that the user has accessed a page. Alternatively, it may first record the user's access information locally and report it to the search engine server once a certain time interval has elapsed or the recorded data reach a certain amount, and so on.
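The second reporting strategy, buffering locally and uploading once a count or time limit is reached, might look like the sketch below. All names (`VisitRecord`, `BatchingReporter`, `send`) and the default limits are assumptions made for illustration; the patent does not define a record layout or an upload interface.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class VisitRecord:
    url: str                          # unique identifier of the visited page
    source_url: Optional[str] = None  # page containing the followed link, if any


@dataclass
class BatchingReporter:
    send: Callable[[List[VisitRecord]], None]  # hypothetical upload function
    max_records: int = 50                      # assumed size limit
    max_interval_s: float = 60.0               # assumed time limit
    clock: Callable[[], float] = time.monotonic
    _buffer: List[VisitRecord] = field(default_factory=list, init=False)
    _last_flush: float = field(default=0.0, init=False)

    def __post_init__(self) -> None:
        self._last_flush = self.clock()

    def record(self, visit: VisitRecord) -> None:
        """Buffer one visit and upload when either limit is reached."""
        self._buffer.append(visit)
        too_many = len(self._buffer) >= self.max_records
        too_old = self.clock() - self._last_flush >= self.max_interval_s
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Upload buffered visits, if any, and restart the interval."""
        if self._buffer:
            self.send(self._buffer)
            self._buffer = []
        self._last_flush = self.clock()
```

Setting `max_records=1` reduces this to the first strategy, in which every visit is reported immediately.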
S102: The search engine server compiles authority information for a website from the web page access information collected from multiple browser ends, the authority information comprising the number of web pages the website contains and the number of the website's external links, so that the search engine server determines the weight of the website according to the authority information.
Because browsers have a large user base, the search engine server can collect a large amount of page access information from many browser ends and then compile the authority information of websites from it.
For ease of understanding, the difference between a website and a web page is briefly introduced first. Generally speaking, a website is a collection of content with its own domain name and its own storage space. The content may be web pages, programs, or other files; a website need not have many pages, and as long as it has its own domain name and space it is a website even with a single page. A web page is a component of a website and the platform that carries the website's applications. One web page is one browsable page, for example a blog, a personal home page hosted on someone else's site, a company page in a site-building system, or a merchant page in a multi-user online mall. The URL of a web page, also called the web page address, is the standard address of a resource on the Internet. In general, the complete generic uniform resource identifier syntax with an authority part looks as follows:
protocol://username:password@subdomain.domain.tld:port/directory/filename.suffix?parameter=value#fragment
As can be seen, the URL of a web page contains the domain name of the website it belongs to, so whether pages belong to the same website can be judged by whether the domain names in their URLs are the same. After collecting the URLs of the pages users access from the browsers, the search engine server can therefore group the pages whose URLs contain the same domain name and so count the number of pages in each website. And because the browser end reports users' page access information continuously, the search engine server can keep updating the page count of each website from the uploaded data.
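Grouping reported URLs by domain name to count the pages of each website can be sketched with Python's standard `urllib.parse`. The helper names `site_of` and `pages_per_site` are hypothetical, and treating the full host name as the website key is an assumption: the patent only says "domain name", and a real system might instead reduce hosts to a registrable domain.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Assumption: the lowercased host name identifies the website."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"URL has no host: {url!r}")
    return host


def pages_per_site(urls: Iterable[str]) -> Dict[str, int]:
    """Count the distinct page URLs reported for each website."""
    pages: Dict[str, Set[str]] = defaultdict(set)
    for url in urls:
        pages[site_of(url)].add(url)  # repeated reports count once
    return {site: len(found) for site, found in pages.items()}
```

Because the set of URLs per site only grows as new reports arrive, rerunning the count on the accumulated data gives the continuously updated page count the text describes.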
Meanwhile, because the browser end also uploads the link through which the user reached a page, the search engine server can count the external links of a website accordingly. In a specific implementation, if the browser end finds that the user reached a page by clicking a link, it records the URL of the source page containing that link together with the URL of the accessed page. For example, suppose the user clicks a link to page B in the content of page A and thereby views page B. The content of page A then contains a link to page B, and page A is called the source page of the link to page B. When reporting the URL of page B to the search engine server, the browser end also reports the URL of page A, telling the search engine server that the user accessed page B and that page A contains a link to page B. With this information the search engine server can compare the URLs of page B and page A: if their domain names are the same, the link is an internal link of the website; if their domain names differ, it is an external link that leads from the website of page A to the website of page B. In the same way, the URL of page B can be compared with the URLs of the other source pages that link to it to count the external links of page B, and the external links of the other pages under page B's website can be counted as well. Merging the external link counts of all pages under the same website gives the external link count of that website.
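The internal/external classification above can be sketched as follows. The function name and the `(visited_url, source_url)` report shape are hypothetical. Counting each distinct (source page, target page) pair once is also an assumption, since the patent does not say whether repeated reports of the same link should be counted again.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple
from urllib.parse import urlsplit


def _site(url: str) -> Optional[str]:
    # Assumption: the host name stands in for the website's domain name.
    return urlsplit(url).hostname


def external_links_per_site(
    reports: Iterable[Tuple[str, Optional[str]]]
) -> Dict[str, int]:
    """Count external links per target website.

    Each report is a (visited_url, source_url) pair; source_url is None
    when the page was not reached through a link.
    """
    links: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for target, source in reports:
        if source is None:
            continue
        target_site, source_site = _site(target), _site(source)
        if target_site is None or source_site is None:
            continue
        if target_site != source_site:           # different domains: external link
            links[target_site].add((source, target))
    return {site: len(pairs) for site, pairs in links.items()}
```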
As described above, from the page access information reported by browser ends, the search engine server can count the pages a website contains and the website's external links, and can then determine the weight of each website in combination with other parameters. When providing search results to a user, it can then sort the results by the weight of the website each result belongs to before presenting them. Because the page count and external link count are compiled from the page access actually reported by browser ends, they are more accurate, and are updated more promptly, than statistics compiled from pages that the search engine crawls at some fixed frequency.
In addition, in the embodiment of the present invention, because the user's page access information can be obtained from the browser end, information on user access to the website can be derived in addition to the page count and external link count. Since this user access information also reflects the authority of the website to some extent, it can be used to adjust the authority weight of the website on top of the weight calculated in the conventional way.
In a specific implementation, the visit count of each page under the same website is compiled first, and the weight of the website is then adjusted according to those visit counts. The visit counts can be compiled as follows. The browser end uploads the information on a visit whenever it finds that the user has accessed a page. Each time the search engine server receives a message from a browser end, it parses the page URL out of it and checks whether the URL already exists in the database. If not, it determines the website the page belongs to from the domain name in the URL, adds the page to the database, and sets its visit count to 1. If the page is already in the database, it adds 1 to the page's current visit count. Other pages are handled in the same way. Thus, for a given website, the number of user visits to each of its pages is counted along with the full set of pages it contains.
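The server-side counting procedure above maps onto a small in-memory store. This is a sketch only: `VisitStore` is a hypothetical name, a nested dictionary stands in for the database, and the host name is assumed to identify the website.

```python
from collections import defaultdict
from typing import Dict
from urllib.parse import urlsplit


class VisitStore:
    """In-memory stand-in for the database described in the text."""

    def __init__(self) -> None:
        # website -> {page URL -> visit count}
        self.visits: Dict[str, Dict[str, int]] = defaultdict(dict)

    def record(self, url: str) -> None:
        """Handle one reported visit."""
        site = urlsplit(url).hostname or ""  # assumption: host name = website
        pages = self.visits[site]
        if url not in pages:
            pages[url] = 1       # first report: add the page with count 1
        else:
            pages[url] += 1      # known page: increment its count
```

After any number of reports, `len(store.visits[site])` is the page count of the website and the inner dictionary holds the per-page visit counts.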
The weight of a website can be adjusted according to the visit counts of its pages in several ways. In one approach, the website is weighted according to the number of its pages whose visit count exceeds a preset threshold; that is, the more heavily visited pages a website has, or the larger the proportion of heavily visited pages, the greater the weight of the website. In another approach, the website is weighted according to its total visit count: once the visit count of each page under the website has been compiled, adding them up gives the total visit count of the website, and the larger the total visit count, the higher the corresponding weight.
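The two weighting statistics can be computed as below. The patent does not give a formula for turning either statistic into a weight, so `adjust_weight` is purely an assumed example: it adds a logarithmic bonus scaled by a hypothetical factor `alpha`, chosen only because it grows with the statistic, as the text requires.

```python
import math
from typing import Mapping


def busy_page_count(page_visits: Mapping[str, int], threshold: int) -> int:
    """Number of pages whose visit count exceeds the preset threshold."""
    return sum(1 for count in page_visits.values() if count > threshold)


def total_visits(page_visits: Mapping[str, int]) -> int:
    """Total visit count of the website: the sum over its pages."""
    return sum(page_visits.values())


def adjust_weight(base_weight: float, statistic: int, alpha: float = 0.1) -> float:
    """Assumed adjustment rule, not from the patent: a larger statistic
    gives a larger weight, with diminishing returns."""
    return base_weight + alpha * math.log1p(statistic)
```

Either `busy_page_count(...)` or `total_visits(...)` can be passed as `statistic`, matching the two alternatives in the text.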
Besides the visit count, the website can also be weighted according to its number of visiting users. Visit count and number of visiting users are different concepts. The visit count is the total number of visits, and two websites with similar total visit counts may have different numbers of visiting users: the visits to one may come from many users, while the visits to the other come from only a few. In that case the first website has more visiting users than the second, and accordingly its authority is also higher. In a specific implementation, so that the search engine server can count the visiting users of each website, the browser end can include user information when reporting page access information. The user information may be the account the user registered and logged in with, the user's IP address, and so on. From the user information the search engine server can count the visiting users of each page, add up the counts of all pages under the same website to obtain the website's total number of visiting users, and weight the website accordingly: the larger the total number of visiting users, the higher the weight. Of course, weighting by visiting users need not rely only on the website's total number of visiting users. As with the visit count, the website can also be weighted according to the number of its pages that have many visiting users, or the proportion of such pages, and so on.
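Counting visiting users differs from counting visits in that repeated visits by the same user to a page count once. A minimal sketch, assuming each report carries a `user_id` string (an account name or IP address, as the text suggests) and using hypothetical function names:

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple
from urllib.parse import urlsplit


def users_per_page(reports: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Map website -> {page URL -> number of distinct visiting users}.

    Each report is a (url, user_id) pair.
    """
    seen: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for url, user_id in reports:
        site = urlsplit(url).hostname or ""  # assumption: host name = website
        seen[site][url].add(user_id)
    return {
        site: {url: len(users) for url, users in pages.items()}
        for site, pages in seen.items()
    }


def total_users(per_page: Dict[str, int]) -> int:
    """Website total as described in the text: the sum of per-page counts."""
    return sum(per_page.values())
```

Note that summing per-page counts, as the text describes, counts a user once for every page visited; counting distinct users across the whole website would be a different statistic.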
In short, from the page access information reported by browser ends, the search engine server can count the pages a website contains and the website's external links, and can then determine the weight of each website in combination with other parameters (such as page update frequency). When the addresses in the URL library need to be downloaded, downloads can be scheduled according to the weight of the website each address belongs to. The website weight can of course be applied elsewhere too. For example, when related pages are recommended to a user based on the page currently being accessed, the related pages can likewise be sorted by the weight of the websites they belong to. Or the weight of designated websites can be used for application recommendation: if a candidate application comes from a designated website, the weight of that website is added to the application's original score to raise its ranking value, a combined ranking is then performed, and the several applications with the highest scores are output. Because the page count and external link count are compiled from the page access actually reported by browser ends, they are more accurate, and are updated more promptly, than statistics compiled from pages that the search engine crawls at some fixed frequency.
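Download scheduling by website weight can be sketched with a priority queue from Python's standard `heapq` module. The function name `download_order`, the `site_weight` mapping, and the default weight for unknown websites are assumptions for illustration; the patent does not describe the queue's data structure.

```python
import heapq
from typing import Dict, Iterable, List, Tuple
from urllib.parse import urlsplit


def download_order(urls: Iterable[str],
                   site_weight: Dict[str, float],
                   default_weight: float = 0.0) -> List[str]:
    """Return URLs ordered by the weight of their website, highest first.

    URLs from equally weighted websites keep their original order.
    """
    heap: List[Tuple[float, int, str]] = []
    for position, url in enumerate(urls):
        site = urlsplit(url).hostname or ""  # assumption: host name = website
        weight = site_weight.get(site, default_weight)
        # heapq is a min-heap, so push the negated weight
        heapq.heappush(heap, (-weight, position, url))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

The same ordering could be applied to related-page recommendation by passing the candidate pages instead of the URL library.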
Corresponding to the method for determining the weight of a website provided by the embodiment of the present invention, the embodiment of the present invention also provides a device for determining the weight of a website. Referring to Fig. 2, the device may comprise:
A browser end processing unit 201, located at the browser end, for reporting a user's web page access information to a search engine server; the web page access information comprises the unique identifier of the accessed web page and, when the user accesses the target web page of a link, the unique identifier of the source web page containing the link;
A search engine processing unit 202, located at the search engine server end, for compiling authority information for a website from the web page access information collected from multiple browser ends, the authority information comprising the number of web pages the website contains and the number of the website's external links, so that the search engine server determines the weight of the website according to the authority information.
In a specific implementation, the device may further comprise:
A visit-count adjustment unit, configured to count the visits to each web page under the same website and to adjust the weight of the website according to those per-page visit counts.
The visit-count adjustment unit may comprise:
A first weighting subunit, configured to weight the website according to the number of web pages under it whose visit count exceeds a first preset threshold;
Or,
A second weighting subunit, configured to weight the website according to the total visit count of the website.
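The two weighting subunits above can be sketched as follows. The text does not specify how the weighting is computed, so the linear bonuses and their default values here are assumptions chosen only for illustration, as are the function names.

```python
def weight_by_popular_pages(base_weight, page_visits, threshold, bonus_per_page=0.1):
    """First option: reward the site for each page visited more than `threshold` times.

    page_visits: {page_url: visit_count} for the pages of one website.
    """
    popular = sum(1 for count in page_visits.values() if count > threshold)
    return base_weight + bonus_per_page * popular


def weight_by_total_visits(base_weight, page_visits, bonus_per_visit=0.001):
    """Second option: reward the site in proportion to its total visit count."""
    return base_weight + bonus_per_visit * sum(page_visits.values())
```

For example, with per-page counts of 120, 5, and 300 and a threshold of 100, the first function counts two qualifying pages, while the second uses the total of 425 visits.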
In another embodiment, the page-visit information reported by the browser side further comprises information about the user who visited the page, in which case the device may further comprise:
A visitor-count adjustment unit, configured to count the users who visited each web page under the same website and to adjust the weight of the website according to those per-page visitor counts.
Specifically, the visitor-count adjustment unit may comprise:
A third weighting subunit, configured to weight the website according to the number of web pages under it whose visitor count exceeds a second preset threshold;
Or,
A fourth weighting subunit, configured to weight the website according to the total visitor count of the website.
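Counting visitors differs from counting visits in that each user is counted once per page. A minimal sketch, assuming each report is a (user identifier, visited URL) pair, that the total visitor count of a website means its distinct users, and that the bonuses are linear; none of these details are fixed by the text, and all names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def visitors_per_page(reports):
    """Group reports into {site: {page_url: set_of_user_ids}}.

    reports: iterable of (user_id, visited_url) pairs.
    """
    result = defaultdict(lambda: defaultdict(set))
    for user_id, url in reports:
        site = (urlsplit(url).hostname or "").lower()
        result[site][url].add(user_id)
    return result


def weight_by_widely_visited_pages(base_weight, page_users, threshold, bonus_per_page=0.1):
    """Third option: reward the site for each page with more than `threshold` visitors."""
    pages = sum(1 for users in page_users.values() if len(users) > threshold)
    return base_weight + bonus_per_page * pages


def weight_by_total_visitors(base_weight, page_users, bonus_per_user=0.01):
    """Fourth option: reward the site by its number of distinct visitors."""
    everyone = set().union(*page_users.values())
    return base_weight + bonus_per_user * len(everyone)
```

Using sets of user identifiers means that repeat visits by the same user to the same page do not raise the visitor count.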
With this device, the search engine server can count, from the page-visit information reported by browser clients, the number of web pages each website contains and the number of its external links, and can then determine each website's weight in combination with other parameters (such as how often its pages are updated). When the URLs in the URL library need to be downloaded, downloads can be scheduled according to the weight of the website to which each URL belongs. The website weight can also be applied elsewhere. For example, when related pages are recommended to a user based on the page currently being visited, the related pages can likewise be ranked by the weight of the websites hosting them. Alternatively, the weight of designated websites can be used for application recommendation: if a candidate application comes from a designated website, that website's weight is added to the application's original score to raise it, a combined ranking is then performed, and the several highest-scoring applications are output. Because the page counts and external-link counts are computed from the visits that browser clients actually report, the statistics are more accurate and more timely than those a search engine obtains by crawling pages at a fixed frequency and then counting.
From the above description of the embodiments, those skilled in the art will clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware platform. On this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, can be embodied as a software product. This computer software product can be stored in a storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The embodiments in this specification are described progressively: identical or similar parts of the embodiments can be cross-referenced, and each embodiment focuses on its differences from the others. In particular, because the device and system embodiments are substantially similar to the method embodiments, they are described only briefly; for the relevant details, see the description of the method embodiments. The device and system embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The method and device for determining website weight provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. Those of ordinary skill in the art may, in accordance with the idea of the invention, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (10)

1. A method for determining website weight, characterized by comprising:
when a user visits a web page with a browser, detecting, by the browser, a browser plug-in, a monitoring program, or a program monitoring component, the access request the user issues for a target web page; as soon as the browser side finds that the user has visited a web page, reporting to a search engine server the unique identification information of the web page visited this time and, when the user visits the target web page corresponding to a link, the unique identification information of the source web page containing the link; or, recording at the browser side the unique identification information of the visited web page and, when the user visits the target web page corresponding to a link, the unique identification information of the source web page containing the link, and reporting the recorded data to the search engine server when a certain time interval has elapsed or when the recorded data reach a certain amount; the unique identification information comprising: the URL of the web page, the title of the web page, and the MD5 value of the web page content;
the search engine server counting, from the page-visit information collected from multiple browser sides, the number of web pages a website contains and the number of the website's external links, which comprises: the search engine server grouping together the visited web pages whose URLs contain the same domain name, and counting the number of web pages contained in the same website; and the search engine server judging whether the domain name in the URL of the web page visited this time is the same as the domain name in the URL of the source web page and, if not, determining that the web page visited this time is an external link of the website of the source web page, and merging the external-link counts of all web pages under the same website to obtain the number of external links of that website; so that the search engine server determines the weight of the website from the number of web pages the website contains and the number of its external links; and, when the URLs in a URL library need to be downloaded, scheduling downloads according to the weight of the website to which each URL belongs, and ranking related web pages according to the weight of the websites hosting them.
2. The method according to claim 1, characterized by further comprising:
counting the visits to each web page under the same website, and adjusting the weight of the website according to the visit counts of the web pages under the same website.
3. The method according to claim 2, characterized in that adjusting the weight of the website according to the visit counts of the web pages under the same website comprises:
weighting the website according to the number of web pages under the same website whose visit count exceeds a first preset threshold;
or,
weighting the website according to the total visit count of the same website.
4. The method according to claim 1, characterized in that the page-visit information reported by the browser side further comprises information about the user who visited the web page, and the method further comprises:
counting the users who visited each web page under the same website, and adjusting the weight of the website according to the visitor counts of the web pages under the same website.
5. The method according to claim 4, characterized in that adjusting the weight of the website according to the visitor counts of the web pages under the same website comprises:
weighting the website according to the number of web pages under the same website whose visitor count exceeds a second preset threshold;
or,
weighting the website according to the total visitor count of the same website.
6. A device for determining website weight, characterized by comprising:
a browser-side processing unit, configured to: when a user visits a web page with a browser, detect, by the browser, a browser plug-in, a monitoring program, or a program monitoring component, the access request the user issues for a target web page; as soon as the browser side finds that the user has visited a web page, report to a search engine server the unique identification information of the visited web page and, when the user visits the target web page corresponding to a link, the unique identification information of the source web page containing the link; or, record at the browser side the unique identification information of the visited web page and, when the user visits the target web page corresponding to a link, the unique identification information of the source web page containing the link, and report the recorded data to the search engine server when a certain time interval has elapsed or when the recorded data reach a certain amount; the unique identification information comprising: the URL of the web page, the title of the web page, and the MD5 value of the web page content;
a search engine processing unit, located at the search engine server side, configured to count, from the page-visit information collected from multiple browser sides, the number of web pages a website contains and the number of the website's external links, which comprises: the search engine server grouping together the visited web pages whose URLs contain the same domain name, and counting the number of web pages contained in the same website; and the search engine server judging whether the domain name in the URL of the web page visited this time is the same as the domain name in the URL of the source web page and, if not, determining that the web page visited this time is an external link of the website of the source web page, and merging the external-link counts of all web pages under the same website to obtain the number of external links of that website; so that the search engine server determines the weight of the website from the number of web pages the website contains and the number of its external links; and, when the URLs in a URL library need to be downloaded, scheduling downloads according to the weight of the website to which each URL belongs, and ranking related web pages according to the weight of the websites hosting them.
7. The device according to claim 6, characterized by further comprising:
a visit-count adjustment unit, configured to count the visits to each web page under the same website and to adjust the weight of the website according to the visit counts of the web pages under the same website.
8. The device according to claim 7, characterized in that the visit-count adjustment unit comprises:
a first weighting subunit, configured to weight the website according to the number of web pages under the same website whose visit count exceeds a first preset threshold;
or,
a second weighting subunit, configured to weight the website according to the total visit count of the same website.
9. The device according to claim 6, characterized in that the page-visit information reported by the browser side further comprises information about the user who visited the web page, and the device further comprises:
a visitor-count adjustment unit, configured to count the users who visited each web page under the same website and to adjust the weight of the website according to the visitor counts of the web pages under the same website.
10. The device according to claim 9, characterized in that the visitor-count adjustment unit comprises:
a third weighting subunit, configured to weight the website according to the number of web pages under the same website whose visitor count exceeds a second preset threshold;
or,
a fourth weighting subunit, configured to weight the website according to the total visitor count of the same website.
CN201210089527.7A 2012-03-29 2012-03-29 A kind of method and device determining weight of website Active CN102663054B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210089527.7A CN102663054B (en) 2012-03-29 2012-03-29 A kind of method and device determining weight of website

Publications (2)

Publication Number Publication Date
CN102663054A CN102663054A (en) 2012-09-12
CN102663054B true CN102663054B (en) 2015-08-12

Family

ID=46772545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210089527.7A Active CN102663054B (en) 2012-03-29 2012-03-29 A kind of method and device determining weight of website

Country Status (1)

Country Link
CN (1) CN102663054B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104252348B (en) * 2013-06-27 2018-07-20 腾讯科技(深圳)有限公司 A kind of web page access statistical method and device based on browser
CN103678663B (en) * 2013-12-24 2018-02-27 北京奇虎科技有限公司 Web search method and client
CN103970850B (en) * 2014-05-04 2017-09-22 广州品唯软件有限公司 Site information recommends method and system
CN106709042B (en) * 2016-12-30 2020-09-25 北京小度互娱科技有限公司 Index updating method and equipment
CN108694197A (en) * 2017-04-10 2018-10-23 富士通株式会社 Hypertext grasping means and device
CN107329992A (en) * 2017-06-07 2017-11-07 上海斐讯数据通信技术有限公司 A kind of management method and management system of websites collection ranking
CN107742261A (en) * 2017-11-01 2018-02-27 赛尔网络有限公司 The method for obtaining group user access covering rate lifting weight
CN108063974B (en) * 2017-12-12 2021-08-06 深圳市雷鸟网络传媒有限公司 Television activity page data transmission method, television equipment, system and storage medium
CN108804540B (en) * 2018-05-08 2020-12-22 苏州闻道网络科技股份有限公司 Search engine link analysis system and analysis method
CN108600054B (en) * 2018-05-10 2020-11-20 中国互联网络信息中心 Method and system for judging number of websites based on domain name area files
CN109522494B (en) * 2018-11-08 2020-09-15 杭州安恒信息技术股份有限公司 Dark chain detection method, device, equipment and computer readable storage medium
CN111966948B (en) * 2020-09-25 2023-08-01 北京百度网讯科技有限公司 Information delivery method, device, equipment and storage medium
CN113806660A (en) * 2021-09-17 2021-12-17 北京百度网讯科技有限公司 Data evaluation method, training method, device, electronic device and storage medium
CN115577197B (en) * 2022-12-07 2023-10-27 杭州城市大数据运营有限公司 Component discovery method, system and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101473304A (en) * 2006-04-18 2009-07-01 双子星设计技术公司 Method for ranking webpages via circuit simulation
CN102236710A (en) * 2011-06-30 2011-11-09 百度在线网络技术(北京)有限公司 Method and equipment for displaying news information in query result
CN102347930A (en) * 2010-07-26 2012-02-08 中国电信股份有限公司 Method and system for obtaining webpage content

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101383730B (en) * 2008-10-30 2012-01-25 北京搜狗科技发展有限公司 Method and device for determining authoritative website

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
ASS Succession or assignment of patent right

Owner name: BEIJING QIHU TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: QIZHI SOFTWARE (BEIJING) CO., LTD.

Effective date: 20120926

Owner name: QIZHI SOFTWARE (BEIJING) CO., LTD.

Effective date: 20120926

C10 Entry into substantive examination
C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100016 CHAOYANG, BEIJING TO: 100088 XICHENG, BEIJING

SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20120926

Address after: 100088 Beijing city Xicheng District xinjiekouwai Street 28, block D room 112 (Desheng Park)

Applicant after: Beijing Qihu Technology Co., Ltd.

Applicant after: Qizhi Software (Beijing) Co., Ltd.

Address before: The 4 layer 100016 unit of Beijing city Chaoyang District Jiuxianqiao Road No. 14 Building C

Applicant before: Qizhi Software (Beijing) Co., Ltd.

ASS Succession or assignment of patent right

Owner name: TIANJIN QISI TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: BEIJING QIHU TECHNOLOGY CO., LTD.

Effective date: 20141203

Free format text: FORMER OWNER: QIZHI SOFTWARE (BEIJING) CO., LTD.

Effective date: 20141203

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100088 XICHENG, BEIJING TO: 300384 NANKAI, TIANJIN

TA01 Transfer of patent application right

Effective date of registration: 20141203

Address after: 300384 Tianjin hi New Technology Industrial Park Huayuan Industrial District No. 18 West North 2-102 industrial incubation -5

Applicant after: Tianjin Qi Si Science and Technology Ltd.

Address before: 100088 Beijing city Xicheng District xinjiekouwai Street 28, block D room 112 (Desheng Park)

Applicant before: Beijing Qihu Technology Co., Ltd.

Applicant before: Qizhi Software (Beijing) Co., Ltd.

C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 300000 Binhai high tech Zone, Tianjin Binhai hi tech Park Science and Technology Park, No. 39, No. six, No. 9-3-401

Patentee after: 360 Polytron Technologies Inc

Address before: 300384 Tianjin hi New Technology Industrial Park Huayuan Industrial District No. 18 West North 2-102 industrial incubation -5

Patentee before: Tianjin Qi Si Science and Technology Ltd.

CP03 Change of name, title or address
CP01 Change in the name or title of a patent holder

Address after: 300000 Binhai high tech Zone, Tianjin Binhai hi tech Park Science and Technology Park, No. 39, No. six, No. 9-3-401

Patentee after: 360 science and Technology Co., Ltd.

Address before: 300000 Binhai high tech Zone, Tianjin Binhai hi tech Park Science and Technology Park, No. 39, No. six, No. 9-3-401

Patentee before: 360 Polytron Technologies Inc

CP01 Change in the name or title of a patent holder