US20120089598A1 - Generating Website Profiles Based on Queries from Websites and User Activities on the Search Results - Google Patents
Generating Website Profiles Based on Queries from Websites and User Activities on the Search Results
- Publication number
- US20120089598A1 (application number US13/323,758)
- Authority
- US
- United States
- Prior art keywords
- website
- search
- profile
- user
- search results
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Definitions
- the present invention relates generally to the field of a search engine in a computer network system, in particular to system and method of generating a profile for a website and using the profile to customize rankings of search results in response to search queries submitted from the website.
- Search engines are a powerful tool for locating and retrieving documents from the Internet (or an intranet). Many websites include at least one search box on their webpages. The search box on a particular webpage typically enables users to submit search queries to search for documents at the website associated with the webpage, or to search for documents on the Internet. However, most websites do not have an exclusive, dedicated search engine system for processing these search queries. This is especially true if the search box enables searches of the entire Internet for relevant documents. Rather, the search queries are re-directed to and processed by a third-party search engine (e.g., www.google.com). The third-party search engine generates search results responsive to the search queries (e.g., by searching a database of documents) and returns the search results to the requesting users.
- the search results produced by the third-party search engine are independent of the website from which a search query is submitted.
- the search engine generates the same search result for the search query “apple” irrespective of whether the search query is from the website of an online retail electronics store frequented by Apple computer users or an online shopping website hosted by a grocery store.
- the search results returned for the search query “apple” are likely to include results of little interest to visitors to these respective websites.
- a sports news website may have one webpage covering domestic news and another one devoted to international news.
- a user entering the term “football” into the search box on the domestic news webpage is probably interested in news related to American football, while a user entering the same term “football” into the search box on the international news webpage is probably more interested in news about soccer (which is known as “football” outside the United States).
- Similar issues may arise if a sports news website has different webpages covering news for different sports, and search boxes in each of these pages. Thus, when a search engine ignores the webpages from which a search query is submitted, users do not receive search results best tailored to their distinct interests.
- it would be desirable to have a search engine that can customize its search results in accordance with the websites (or webpages) from which the corresponding search queries are submitted so as to highlight information items in the search results that are most likely to be of interest to the users who submit the search queries. Further, it would be desirable for such a system to operate without explicit input from a user with regard to the user's personal preferences and interests and therefore free the user from concerns over exposing private information.
- an information server receives multiple search queries from a website submitted by different users. Different search results responsive to the search queries are provided to the requesting users. The information server monitors activities of the users on the search results and generates a profile for the website using the search queries and the user activities.
- an information server receives a same query from two websites and identifies a plurality of information items associated with the search query.
- the information server uses profiles of the two websites to customize the information items into two different orders and serves the information items to the two websites in the two different orders.
- the two website profiles are related to the search histories of the two websites.
- the present invention, including website profile construction and search results re-ordering and/or scoring, can be implemented on either the client side or the server side of a client-server network environment.
- FIG. 1 is a block diagram of an exemplary distributed system that includes a plurality of websites and clients requesting information from an information server in accordance with some embodiments of the present invention.
- FIG. 2 is a flow diagram of a process for generating a website (or webpage) profile using search queries, search results and user activities associated with the website (or webpage) in accordance with some embodiments of the present invention.
- FIG. 3 is a block diagram of a process for updating a website (or webpage) profile by merging an incremental website (or webpage) profile into the website (or webpage) profile in accordance with some embodiments of the present invention.
- FIG. 4 is a prophetic example of a curve characterizing the popularity distribution of search queries submitted from a website (or webpage).
- FIG. 5 is a block diagram illustrating how the process of creating a website profile is divided into multiple sub-processes in accordance with some embodiments of the present invention.
- FIG. 6A is a block diagram of an exemplary category map that may be used for generating category-based website profiles in accordance with some embodiments of the present invention.
- FIG. 6B is a block diagram of an exemplary data structure that may be used for storing category-based website profiles in accordance with some embodiments of the present invention.
- FIG. 7 is a block diagram of an exemplary data structure that may be used for storing term-based website profiles in accordance with some embodiments of the present invention.
- FIG. 8 is a block diagram of an exemplary data structure that may be used for storing link-based website profiles in accordance with some embodiments of the present invention.
- FIG. 9 is a flow diagram of a process for generating website-dependent search results using website profiles in accordance with some embodiments of the present invention.
- FIG. 10 is a block diagram of exemplary data structures that may be used for storing category-based, term-based, and link-based boost factors for documents in the search results in accordance with some embodiments of the present invention.
- FIG. 11 is a flow diagram of another process for generating website-dependent search results using website profiles in accordance with some embodiments of the present invention.
- FIG. 12 is a block diagram of an exemplary information server in accordance with some embodiments of the present invention.
- the embodiments discussed below only include systems and methods that generate a website profile based on the search history associated with the website and then use the website profile to rank search results in response to search queries submitted from the website.
- the underlying principles discussed below can be easily extended to create webpage profiles and generate webpage-dependent search results using the webpage profiles.
- FIG. 1 is a block diagram of an exemplary environment 100 for implementing some embodiments of the present invention.
- One or more websites 102 and clients 103 can be connected to a communication network 104 .
- the communication network 104 can be connected to an information server 106 .
- the information server 106 may include a front end server 120 , a search engine 122 , a document profiler 125 , a website profiler 129 , a search result ranker 126 , a document profile database 123 , a content database 124 , a search history database 127 , and a website profile database 128 .
- the information server 106 contains a subset or superset of the elements illustrated in FIG. 1 .
- Although FIG. 1 shows the information server 106 as a number of discrete items, the figure is intended more as a functional description of the various features which may be present in the information server 106 rather than a structural schematic of the various embodiments.
- items shown separately could be combined and some items could be further separated, as would be recognized by one of ordinary skill in the art of designing such systems.
- the four different databases 123 , 124 , 127 , and 128 shown separately in the figure could be implemented by a single database server.
- the actual number of computers constituting the information server 106 and the allocation of features among the computers will vary from one implementation to another, and may depend in part on the amount of traffic that the information server 106 must handle during peak usage periods as well as during average usage periods.
- a website 102 is typically a collection of webpages associated with a domain name on the Internet. Each website (or webpage) has a universal resource locator (URL) that uniquely identifies the location of the website (or webpage) on the Internet. Any visitor can visit the website by entering its URL in a browser window.
- a website can be hosted by a web server exclusively owned by the owner of the domain name or by an Internet service provider wherein its web server manages multiple websites associated with different domain names.
- the website 102 includes two webpages 114 and 116 , each having an associated search box 115 and 117 , respectively.
- a visitor to the webpage 114 can search the website 102 or the entire Internet for relevant information by entering a search query into the search box.
- the term “website” as used in this document refers to a logical location (e.g., an Internet or intranet location) identified by a URL, or it refers to a web server hosting the website represented by the URL, or both.
- a client 103 can be any of a number of devices (e.g., a computer, an internet kiosk, a personal digital assistant, a cell phone, a gaming device, a desktop computer, or a laptop computer) and can include a client application 132 , a client assistant 134 , and/or client memory 136 .
- the client application 132 can be a software application that permits a user to interact with the client 103 and/or network resources to perform one or more tasks.
- the client application 132 can be a browser (e.g., Firefox) or other type of application that permits a user to search for, browse, and/or use resources (e.g., webpages and web services) at the website 102 from the client 103 and/or accessible via the communication network 104 .
- the client assistant 134 can be a software application that performs one or more tasks related to monitoring or assisting a user's activities with respect to the client application 132 and/or other applications. For instance, the client assistant 134 assists a user at the client 103 with browsing for resources (e.g., files) hosted by the website 102 ; processes information (e.g., search results) received from the information server 106 ; and monitors the user's activities on the search results.
- the client assistant 134 is part of the client application 132 , available as a plug-in or extension to the client application 132 (provided, for example, from various online sources), while in other embodiments the client assistant 134 is a stand-alone program separate from the client application 132 .
- the client assistant 134 is embedded in one or more webpages or other documents downloaded from one or more servers, such as the information server 106 .
- Client memory 136 can store information such as webpages, documents received from the information server 106 , system information, and/or information about a user, among other things.
- the communication network 104 can be any wired or wireless local area network (LAN) and/or wide area network (WAN), such as an intranet, an extranet, or the Internet. It is sufficient that the communication network 104 provide communication capability between the websites 102 , the clients 103 and the information server 106 .
- the communication network 104 uses the HyperText Transport Protocol (HTTP) to transport information using the Transmission Control Protocol/Internet Protocol (TCP/IP).
- HTTP permits client computers to access various resources available via the communication network 104 .
- the various embodiments of the invention are not limited to the use of any particular protocol.
- resource refers to any piece of information or service that is accessible via a URL and can be, for example, a webpage, a document, a database, an image, a computational object, a search engine, or other online information service.
- a user from a client 103 first sends to a website 102 a request for a webpage.
- the website responds by identifying the requested webpage and returns it to the requesting client 103 .
- the webpage may include a document of interest to the user (e.g., a newspaper article).
- the webpage may also include a search box (e.g., at or near the top of the webpage). While or after browsing the content of the webpage, the user may be interested in getting more information. To do so, the user can enter a search query into the search box and submit the search query to the website 102 .
- the search query may include one or more query terms.
- upon receipt of the search query, the website 102 generates and sends a search request to the information server 106 .
- the client 103 generates and sends the search request directly to the information server 106 without routing the request through the website 102 .
- the search request includes the search query and unique identifiers of the requesting website 102 and the requesting client 103 .
- the front end server 120 is configured to handle a variety of requests from the websites 102 and the clients 103 via their respective connections with the communication network 104 .
- the front end server 120 is connected to the search engine 122 and the search engine 122 is connected to the content database 124 , respectively.
- the content database 124 stores a large number of indexed documents retrieved from different websites. Alternately, or in addition, the content database 124 stores an index of documents stored at various websites.
- each indexed document is assigned a page rank according to the document's link structure. The page rank serves as a query-independent measure of the document's importance.
- the front end server 120 passes the search request onto the search engine 122 .
- the search engine 122 then communicates with the content database 124 to select a plurality of documents in response to the search request.
- the search engine 122 assigns a generic ranking score to each document based on the document's page rank, the text associated with the document, and the search query.
- the search engine 122 is also connected to the document profile database 123 .
- the document profile database 123 stores a document profile for each indexed document in the content database 124 .
- Both the document profile database 123 and the content database 124 are connected to the document profiler 125 .
- For each document in the content database 124 , the document profiler 125 generates a document profile by analyzing the content of the document and its link structure. The generation of document profiles is independent of the operation of the search engine 122 .
- the document profiler 125 is invoked to generate a document profile whenever the information server 106 identifies a new document or a new version of an existing document on the Internet.
- the document profiler 125 is invoked periodically to generate document profiles for all new files identified during a predetermined time period.
- the document profile database 123 and the content database 124 are merged together so that a document and its associated profile can be located by a single database query.
- the search engine 122 sends the identified documents and their associated document profiles to the search result ranker 126 .
- the search result ranker 126 has a connection to the website profile database 128 .
- the website profile database 128 stores a large number of website profiles including the profile of the requesting website 102 .
- the search result ranker 126 converts the generic ranking score of each identified document into a website-dependent ranking score. The documents are then re-ordered in accordance with their respective website-dependent ranking scores.
- the search result ranker 126 creates a search result in accordance with the updated order of the documents, the search result including multiple document links, one for each document.
- the search result, or a portion of the search result (e.g., information identifying the top 10, 15 or 20 results) is returned to the requesting client 103 and displayed to the user through the client application 132 .
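- The conversion from a generic ranking score to a website-dependent ranking score can be sketched as follows. This is a minimal illustration, not the method of the application: the category-to-weight dictionary used for profiles, the multiplicative boost, and the names `website_dependent_score` and `rerank` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: category name -> weight.
Profile = Dict[str, float]


def website_dependent_score(generic_score: float,
                            doc_profile: Profile,
                            site_profile: Profile) -> float:
    """Scale a generic score by the overlap of document and website profiles.

    The boost (1 + dot product of the two profiles) is an assumed formula
    chosen only to make the example concrete.
    """
    overlap = sum(weight * site_profile.get(category, 0.0)
                  for category, weight in doc_profile.items())
    return generic_score * (1.0 + overlap)


def rerank(results: List[Tuple[str, float, Profile]],
           site_profile: Profile,
           top_n: int = 10) -> List[str]:
    """Return the top_n document links ordered by website-dependent score.

    Each entry of results is (document link, generic score, document profile).
    """
    scored = [(website_dependent_score(score, doc_profile, site_profile), link)
              for link, score, doc_profile in results]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [link for _, link in scored[:top_n]]


# The same two documents, served to two websites with different profiles.
results = [
    ("electronics.example/laptop", 0.8, {"electronics": 1.0}),
    ("grocer.example/fruit", 0.9, {"produce": 1.0}),
]
electronics_order = rerank(results, {"electronics": 0.9})
grocery_order = rerank(results, {"produce": 0.9})
```

- With these made-up inputs, the electronics website profile places the laptop document first, while the grocery website profile places the fruit document first, even though the generic scores are identical in both cases.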
- the user after browsing the search result, may click one or more document links in the search result to download and view one or more documents identified by the search result.
- this particular division of tasks is exemplary, and other divisions may be used in other embodiments of the present invention.
- the website profile (of the website from which a search query is received) may be transmitted with the search query to the search engine 122 , and the search engine 122 may use that information to compute website specific document scores for ranking the search results. In effect, this would merge the search result ranker 126 into the search engine 122 .
- other divisions of tasks may be used.
- a website profile should reflect the interests of the users of the associated website, and in many embodiments the website profile will be unique to its associated website. For example, a consumer electronics website should have a website profile that boosts webpages related to electronic products while an on-line grocery store website should have a website profile that promotes webpages related to farm produce.
- a website profile is not static, because a static website profile is unlikely to result in the information server 106 serving the most relevant search results to users of the associated website. Instead, a website profile is updated from time to time (e.g., periodically) so as to re-align the website profile with the current interest of the users of the website. While some website profiles may remain virtually static for long periods of time (e.g., websites serving a small, static population of users who submit searches from the website on only a very narrow range of topics), many website profiles will vary over time as the users of the website change and as the interests of the website's users vary over time.
- a typical user profile is generated by analyzing an individual user's search history. This user profile is only used to modulate search results responsive to search queries submitted by the same user. For the same search query, two different users may receive different search results from the same search engine if they have different user profiles.
- a website profile is generated by analyzing the search history of multiple users while visiting the website so as to characterize the multiple users' interests.
- This website profile can be used to modulate search results responsive to search queries submitted by any user from the same website, including new users of the website who made no prior “contribution” to the website profile. Therefore, the same user submitting the same search query from two different websites may receive different search results if the two websites have different website profiles.
- the website profile also has an important advantage over the user profile in terms of protecting a user's privacy.
- a user profile is associated with an individual user. To create the user profile, the individual user, either explicitly or implicitly (e.g., by monitoring or logging search queries and other online activities of the user), needs to complete a survey of his or her personal preferences. This survey indicates what information items may be of interest to the user. Further, the user must have an account at a website or a search engine system and the user must log into his or her account to invoke the user profile to personalize the search results. In contrast, the creation and usage of the website profile does not require any personal information from any user.
- a website profile is associated with a website, not an individual user. Any individual user's activity at the website is attributed to all the users of the website. A user does not need to log into his or her account at the website in order to use the website profile. As long as a search query is submitted from the website, the information server automatically “personalizes” the corresponding search result in accordance with the website profile.
- the website profiler 129 is responsible for generating and updating website profiles.
- the website profiler 129 needs to have access to the users' search history at the website.
- the users' search history includes the search queries submitted by users while visiting the website, the search results responsive to the search queries, and the user activities on the search results (e.g., selection of a document link, sometimes called “clicking” on a search result, or mouse hovering time over a document link).
- when the front end server 120 receives a search query from a website, it submits a copy of the search query to the search engine 122 to solicit a search result. In addition, the front end server 120 sends another copy of the search query to the search history database 127 . The search history database 127 then generates a record, the record including at least the search query and an identifier of the website from which the search query was received.
- the search result ranker 126 prepares a search result responsive to the search query.
- the search result (i.e., information representing at least a portion of the search result) is sent back to the requesting client through the front end server 120 .
- a copy of the search result, or a portion of the search result, is also stored in the search history database 127 together with the search query record.
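- The record keeping described above can be sketched with a small in-memory store. The class and field names (`SearchHistoryRecord`, `SearchHistoryDatabase`, `log_query`, `attach_result`) are hypothetical, and a real search history database would be persistent; the sketch only shows that each record holds at least the query and the website identifier, with the search result stored alongside it later.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SearchHistoryRecord:
    website_id: str                 # identifier of the website the query came from
    query: str                      # the search query as submitted
    result_links: List[str] = field(default_factory=list)  # stored result (or a portion)


class SearchHistoryDatabase:
    """In-memory stand-in for the search history database (illustrative only)."""

    def __init__(self) -> None:
        self._records: List[SearchHistoryRecord] = []

    def log_query(self, website_id: str, query: str) -> int:
        """Create a record for a newly received query and return its id."""
        self._records.append(SearchHistoryRecord(website_id, query))
        return len(self._records) - 1

    def attach_result(self, record_id: int, links: List[str]) -> None:
        """Store a copy of the search result together with the query record."""
        self._records[record_id].result_links = list(links)

    def records_for(self, website_id: str) -> List[SearchHistoryRecord]:
        """Return every record whose query was submitted from the given website."""
        return [r for r in self._records if r.website_id == website_id]


db = SearchHistoryDatabase()
record_id = db.log_query("sports.example", "football")
db.attach_result(record_id, ["doc-a", "doc-b"])
db.log_query("grocer.example", "apple")
```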
- the client assistant 134 at the requesting client monitors the requesting user's activities on the search result, e.g., recording the user's selection(s) of the document links in the search result and/or the mouse hovering time on different document links.
- the client assistant 134 or the website profiler 129 determines the document “dwell time” for a document selected by the user, by determining the amount of time between user selection of the corresponding document link and the user exiting from the document.
- the client assistant 134 includes executable instructions, stored in the webpage(s) containing the search result, which monitor the user's actions with respect to the search results and transmit information about the monitored user actions back to the information server 106 .
- information about these user activities is transferred back to the information server 106 and stored in the search history database 127 for subsequent use.
- the website profiler 129 records the moment that a user submits a search query (t 0 ), the moment that the user clicks the first document link in the corresponding search result (t 1 ), and the moment that the user clicks the second document link in the search result (t 2 ), etc.
- the differences between two consecutive moments (e.g., t 1 -t 0 or t 2 -t 1 ) are reasonable approximations of the amount of time spent viewing the search result or the document whose link was selected by the user.
- the website profiler 129 has no information about the user's dwell time for the last document in the search result that the user selects for viewing.
- the website profiler 129 also receives click and timestamp information for user actions after the user finishes viewing documents from a search result.
- the website profiler 129 further records the moment that the user submits a second query (t 3 ), the moment the user selects a document from the second search results (t 4 ), and so on.
- the website profiler 129 may record the moment (t 5 ) when the user either closes the browser window that was being used to view search results and documents listed in the search results or navigates away from the website from which the query was received.
- This additional information enables the website profiler 129 to determine the user dwell time for all search result documents (i.e., documents listed in search results) viewed by a user, which in turn enables the website profiler 129 to generate a more accurate website profile for a website.
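- The timestamp arithmetic above can be sketched as follows. The event tuple layout and the kind labels ("query", "click", "exit") are assumptions made for the example; the approximation itself, the difference between a click and the next recorded moment, follows the text. A click with no later event has an unknown dwell time and is reported as None.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical event layout: (timestamp in seconds, kind, query text or document link).
Event = Tuple[float, str, str]


def dwell_times(events: List[Event]) -> Dict[str, Optional[float]]:
    """Approximate the dwell time of each clicked document.

    The dwell time of a clicked document is the gap between the click and the
    next recorded event (another click, a new query, or an exit).
    """
    ordered = sorted(events, key=lambda event: event[0])
    dwell: Dict[str, Optional[float]] = {}
    for index, (timestamp, kind, target) in enumerate(ordered):
        if kind != "click":
            continue
        if index + 1 < len(ordered):
            dwell[target] = ordered[index + 1][0] - timestamp
        else:
            dwell[target] = None  # no later event, so the dwell time is unknown
    return dwell


session = [
    (0.0, "query", "football"),    # t0: first query submitted
    (5.0, "click", "doc-a"),       # t1: first document link clicked
    (65.0, "click", "doc-b"),      # t2: second document link clicked
    (185.0, "query", "soccer"),    # t3: second query submitted
    (190.0, "click", "doc-c"),     # t4: document selected from second results
    (250.0, "exit", ""),           # t5: window closed or navigation away
]
with_exit = dwell_times(session)
without_exit = dwell_times(session[:-1])
```

- Dropping the final exit event leaves the last clicked document without a dwell time, which is the gap that the additional click and timestamp information is meant to close.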
- FIG. 2 is a flow diagram of a process for generating a website profile using the website's search history in accordance with some embodiments of the present invention.
- the website profiler 129 identifies search queries submitted from the website ( 210 ). While in most cases, this will include all search queries submitted from the website, in the case of very popular or busy websites, the identified search queries may comprise a subset or sampling of the submitted search queries. Search queries submitted from a website during a predetermined time period presumably represent the general interest of users using the website. The search queries are especially relevant to capture dynamic user interests that vary by time.
- the website profiler 129 identifies the corresponding search results ( 215 ).
- the search results are served to the requesting users with an embedded client assistant 134 that sends information about the user activities on the search results to the website profiler 129 .
- the website profiler identifies user activities on the search results ( 230 ). Identified user activities may include user clicks on document links in search results.
- identified user activities may include mouse hovering time on the document links. Generally speaking, a user clicks a document link if the user is interested in the document's content. Similarly, the fact that the mouse moves onto a particular document link and stays there for a substantial amount of time indicates that this document is relevant to the user's interest. In some embodiments, information about the mouse hovering time may be unavailable.
- the website profiler 129 can identify documents selected by the website users. In some embodiments, the website profiler 129 visits the content database 124 to retrieve the profiles of the corresponding documents ( 235 ). As noted above, each identified document may have a profile (e.g., a category profile) that was previously generated. If any of the identified documents do not yet have profiles, those documents can be ignored, or the website profiler may call upon the document profiler 125 to produce document profiles for those documents. A website profile is then generated from the retrieved document profiles ( 240 ).
- the website profile may include one or more of the following: a weighted listing or vector of categories (sometimes called a category-profile), key terms from the search queries and/or user visited documents (sometimes called a term profile), and information about the links to the user visited documents (sometimes called a link profile).
- This website profile is stored in the website profile database 128 .
- the search result ranker 126 can retrieve the website profile to re-order the ranks of the documents within a search result.
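- One way to picture operations 235 and 240 is to aggregate the category profiles of the documents that users selected. The sketch below averages them, which is an assumption; the text does not fix an aggregation rule in this passage. Documents without a profile are ignored, which is one of the two options the text mentions. The function name `build_website_profile` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable

# Hypothetical representation: category name -> weight.
Profile = Dict[str, float]


def build_website_profile(clicked_links: Iterable[str],
                          document_profiles: Dict[str, Profile]) -> Profile:
    """Average the category profiles of user-selected documents.

    Averaging is an illustrative choice, not a rule taken from the source.
    """
    totals: Dict[str, float] = defaultdict(float)
    counted = 0
    for link in clicked_links:
        profile = document_profiles.get(link)
        if profile is None:
            continue  # document has no profile yet, so it is ignored
        counted += 1
        for category, weight in profile.items():
            totals[category] += weight
    if counted == 0:
        return {}
    return {category: total / counted for category, total in totals.items()}


document_profiles = {
    "doc-a": {"sports": 1.0},
    "doc-b": {"sports": 0.5, "news": 0.5},
}
site_profile = build_website_profile(["doc-a", "doc-b", "doc-c"], document_profiles)
```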
- operations 235 and 240 are replaced by a clustering operation in which user selected documents are clustered purely based on the fact that the same user clicks their associated links.
- the website profiler directly matches a document's URL against a known set of URLs associated with a particular category. In either case, the website profiler 129 does not need to access the documents' contents in order to generate the website profile.
- operations 230 through 240 are replaced by a process that maps the queries submitted from a website to a set of categories.
- the categorization of queries can be based on the terms in the queries themselves, or by accessing the profiles of the top N search results (e.g., the top 5, 10, 15 or 20 search results), merging those document profiles to produce a query profile for each query, and merging the query profiles, weighted in accordance with their frequency of submission by the users of the website's search box(es) to generate a website profile.
- this process may exclude queries that are deemed to be unlikely to be related to the primary interests of the website's users.
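- The query-based alternative can be sketched in the same style: the profiles of the top N results are merged into a query profile, and the query profiles are merged with weights proportional to how often each query was submitted. Averaging as the merge rule, the dictionary representation, and all function names are assumptions for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List

# Hypothetical representation: category name -> weight.
Profile = Dict[str, float]


def average_profiles(profiles: List[Profile]) -> Profile:
    """Merge profiles by averaging each category weight (assumed merge rule)."""
    if not profiles:
        return {}
    totals: Dict[str, float] = defaultdict(float)
    for profile in profiles:
        for category, weight in profile.items():
            totals[category] += weight
    return {category: total / len(profiles) for category, total in totals.items()}


def query_profile(result_links: List[str],
                  document_profiles: Dict[str, Profile],
                  top_n: int = 5) -> Profile:
    """Build a query profile from the profiles of the top_n search results."""
    profiles = [document_profiles[link]
                for link in result_links[:top_n]
                if link in document_profiles]
    return average_profiles(profiles)


def website_profile_from_queries(submitted_queries: List[str],
                                 results_by_query: Dict[str, List[str]],
                                 document_profiles: Dict[str, Profile],
                                 top_n: int = 5) -> Profile:
    """Merge query profiles, weighted by each query's submission frequency."""
    counts = Counter(submitted_queries)
    total = sum(counts.values())
    merged: Dict[str, float] = defaultdict(float)
    for query, count in counts.items():
        profile = query_profile(results_by_query.get(query, []),
                                document_profiles, top_n)
        for category, weight in profile.items():
            merged[category] += weight * count / total
    return dict(merged)


queries = ["laptop", "laptop", "laptop", "apple pie"]
results_by_query = {"laptop": ["d1", "d2"], "apple pie": ["d3"]}
doc_profiles = {
    "d1": {"electronics": 1.0},
    "d2": {"electronics": 1.0},
    "d3": {"food": 1.0},
}
profile = website_profile_from_queries(queries, results_by_query, doc_profiles)
```

- Excluding queries deemed unrelated to the primary interests of the website's users would amount to filtering `submitted_queries` before the call; no such filter is shown here because the text does not specify the criterion.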
- a website profile is updated from time to time in order to keep track of the current interests of the users visiting the website ( 245 ).
- a website profile is updated at a predetermined time interval (e.g., every week or every day).
- a website profile is updated whenever the number of new search queries at the website reaches a threshold value since a last (i.e., most recent) update.
- the website profiler 129 repeats the aforementioned process to update the website profile.
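- The two update triggers described above, a fixed time interval and a threshold on new search queries, can be expressed as a single check. The text presents them as separate embodiments; combining them with a logical "or" is an assumption, as are the default values and the name `needs_update`.

```python
from datetime import datetime, timedelta


def needs_update(last_update: datetime,
                 now: datetime,
                 new_query_count: int,
                 interval: timedelta = timedelta(days=7),
                 query_threshold: int = 10_000) -> bool:
    """Return True when either update trigger fires.

    interval and query_threshold are illustrative defaults, not values
    taken from the source.
    """
    interval_elapsed = (now - last_update) >= interval
    threshold_reached = new_query_count >= query_threshold
    return interval_elapsed or threshold_reached


last = datetime(2024, 1, 1)
weekly_due = needs_update(last, last + timedelta(days=7), new_query_count=0)
busy_site_due = needs_update(last, last + timedelta(days=1), new_query_count=10_000)
not_due = needs_update(last, last + timedelta(days=1), new_query_count=10)
```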
- different websites attract substantially different magnitudes of traffic and therefore should be treated differently in terms of profile updating. For instance, a popular website may receive tens of thousands of hits per day while a less popular website may have a much lower hit rate.
- the search history database 127 may allocate different amounts of storage space for different websites. As a result, the volume of search history associated with the popular website does not exhaust its designated space and the less popular website does not waste too much space before their next scheduled profile updating.
- Some websites are so popular that it is impractical to store in the search history database 127 all the search history for the purpose of profile updating. For example, an on-line bookstore may have a significantly large number of visitors when a new bestseller is released.
- the search history database 127 may not have the space to store all the search history.
- One approach to solve this issue is to intentionally ignore some of the search queries, search results and user activities.
- This may be accomplished by sampling the search queries, search results and/or user activities so as to produce an unbiased sample of the search history. While the extent of the sampling may vary from one embodiment to another, experiments suggest that a search history encompassing several months of user activities will have sufficient data to generate a reliable website profile, for most websites, so long as (A) the sampling is done in a manner that avoids significant biases, and (B) it includes user activity data corresponding to a few weeks of representative search history.
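- The text does not name a sampling technique. One well-known way to draw a uniform sample of fixed size from a stream of unknown length is reservoir sampling (Algorithm R), sketched below as an assumption about how the search history might be sampled.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)          # fill the reservoir first
        else:
            j = rng.randint(0, i)        # inclusive on both ends
            if j < k:
                sample[j] = item         # replace with probability k / (i + 1)
    return sample
```

Each item in the stream ends up in the sample with equal probability, so the storage needed is bounded by k regardless of traffic volume.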
- the space shortage issue can be solved by generating a series of incremental website profiles for different portions of the search history and merging the incremental website profiles into the website profile.
- the website profiler 129 first generates an incremental profile 311 for the search history section 301 .
- Each search history section 301 , 303 , 305 may include a predefined quantity of search history information, or it may include search history information for a predefined length of time (e.g., an hour), or it may be a portion of the search history selected in accordance with predefined selection criteria.
- the process of generating an incremental website profile is similar to the process discussed above in connection with FIG. 2 .
- the incremental profile 311 is equivalent to the search history section 301 in terms of characterizing the interests of the website users.
- the website profiler 129 can create the new website profile 337 by merging the incremental profiles 311 , 313 , and 315 into the old website profile 331 .
- the website profiler 129 is able to take into account the entire search history by creating incremental website profiles for search history sections 301 , 303 , and 305 and by merging an existing website profile with incremental profiles 311 , 313 , and 315 .
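- Merging incremental profiles into an existing website profile can be sketched as a running weighted average. The function name and the use of each section's volume (for instance, its number of queries) as the merge weight are assumptions; the text does not specify how the merge is weighted.

```python
def merge_incremental(old_profile, old_volume, increments):
    """Merge incremental profiles into an existing profile.

    old_profile: {category: weight} for the existing website profile.
    old_volume: amount of history the old profile summarizes (hypothetical unit).
    increments: list of (profile, volume) pairs, one per search history section.
    Returns (new_profile, new_volume).
    """
    totals = {k: v * old_volume for k, v in old_profile.items()}
    volume = old_volume
    for profile, section_volume in increments:
        volume += section_volume
        for k, v in profile.items():
            totals[k] = totals.get(k, 0.0) + v * section_volume
    if volume == 0:
        return {}, 0
    return {k: v / volume for k, v in totals.items()}, volume
```

Only the old profile and the incremental profiles need to be kept; the raw history of each section can be discarded once its incremental profile exists.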
- a website profile is used for “personalizing” or “flavoring” search results responsive to search queries submitted from a specific website.
- An underlying assumption in the present specification is that these search queries are, more or less, related to the topics covered by the website.
- for a golf website, for example, the search query “Tiger Woods” is reasonably relevant while the search query “Britney Spears” is probably not relevant at all.
- Another source of contamination of the website profile is query terms that, although relevant, have very low popularity. Special treatment may be necessary to make sure that user activities with respect to very low popularity query terms do not significantly bias the search results.
- FIG. 4 is an exemplary curve 400 characterizing the popularity distribution of search queries submitted from a website. All the search queries are divided into three categories by the two thresholds 415 and 425 .
- the leftmost category 410 includes those search queries that are “abnormally” popular, but less relevant, to the website.
- the search query “Britney Spears” being submitted by a golfing website's search window is an example of a search query in this category.
- the website profiler 129 should eliminate or at least reduce the influence of the search history associated with these queries on the website profile by giving them relatively low weights.
- the middle category 420 includes those search queries that are reasonably popular and relevant to the website. The search history corresponding to these search queries should be granted higher weights to make a major contribution to the website profile.
- the rightmost category 430 includes those queries that only appear in the website's search box occasionally. They should be treated in a manner similar to the queries in the leftmost category 410 .
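- The three-way split of FIG. 4 can be sketched as a weighting function over submission counts. Modeling the two thresholds as raw counts and using a single reduced weight of 0.1 for both outer categories are assumptions made for this example.

```python
from collections import Counter

def popularity_weights(query_log, upper, lower, low_weight=0.1):
    """Assign each distinct query a weight multiplier based on its popularity.

    Queries submitted more than `upper` times (abnormally popular) or fewer than
    `lower` times (occasional) get `low_weight`; the middle band gets 1.0.
    """
    counts = Counter(query_log)
    weights = {}
    for query, n in counts.items():
        in_middle_band = lower <= n <= upper
        weights[query] = 1.0 if in_middle_band else low_weight
    return weights
```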
- the popularity of the search query and the amount of user activities on the search result affect the contribution of the search query and the search result on the website profile.
- Time is another important factor.
- recent search history plays a more prominent role than less recent search history in the formation of the website profile.
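- One common way to let recent history count more than older history is exponential decay. The sketch below assumes a half-life parameter and timestamps measured in days; neither is specified in the text.

```python
def recency_weight(age_days, half_life_days=30.0):
    """Weight that halves every `half_life_days` (hypothetical default of 30)."""
    return 0.5 ** (age_days / half_life_days)

def decayed_query_counts(events, now_days, half_life_days=30.0):
    """Sum recency weights per query.

    events: iterable of (query, timestamp_days) pairs.
    """
    totals = {}
    for query, ts in events:
        weight = recency_weight(now_days - ts, half_life_days)
        totals[query] = totals.get(query, 0.0) + weight
    return totals
```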
- One skilled in the art can easily apply similar principles to other aspects of the search history associated with the website.
- FIG. 5 is a block diagram illustrating how the process of creating a website profile is divided into multiple sub-processes in accordance with some embodiments of the present invention.
- it is a non-trivial process to create a profile 530 for a website using its search history.
- the search history involves different types of information from different sources, such as the search queries 501 submitted by users from the website, the search results 503 generated by the search engine in response to the search queries, and the user activities 505 on the search results.
- this process is further divided into multiple sub-processes.
- Each sub-process produces a specific type of website profile characterizing the interests of the website users from a particular perspective. They are:
- the website profile 530 includes only a subset of the profiles 531 , 533 , 535 .
- the website profile 530 may include the term-based profile 533 and the category-based profile 531 , but not the link-based profile 535 .
- the website profile 530 includes a plurality of profiles, at least one of which is a combination of two or more of the aforementioned profiles 531 , 533 , 535 .
- the category-based, term-based and/or link-based profiles are further processed to generate a refined category-based (or cluster-based) profile.
- this refined category-based (or cluster-based) profile appears in the form of multiple category-based (or cluster-based) sub-profiles to characterize different aspects of the website.
- the category-based profile 531 may be constructed, for instance, by mapping search history items (e.g., search queries, content terms, and/or user-selected documents) to categories, and then aggregating the resulting sets of the categories and weighting the categories.
- the categories may be weighted based on their frequency of occurrence in the search history items.
- the categories may be weighted based on the relevance of the search history items to the categories.
- the search history items accumulated over a period of time may be treated as a group for mapping into weighted categories. Other suitable ways of mapping the search history into weighted categories may also be used.
- FIG. 6A illustrates a hierarchical category map 600 according to the Open Directory Project (http://dmoz.org/). Starting from the root level of map 600 , documents are organized under several major topics, such as “Art”, “News”, “Sports”, etc. These major topics are often too broad to delineate the specific interest of a website user. They are further divided into multiple more specific sub-topics.
- the topic “Art” may comprise sub-topics like “Movie”, “Music”, and “Literature”, and the sub-topic “Music” may further comprise sub-sub-topics like “Lyrics”, “News”, and “Reviews.”
- each topic (or sub-topic) is associated with a unique category identifier like 1.1 for “Art”, 1.4.2.3 for “Talk Show”, and 1.6.1 for “Basketball.”
- the categories shown in FIG. 6A are only for illustrative purposes.
- One skilled in the art will appreciate that there are many other ways of categorizing documents. For example, different concepts can be extracted from the contents of the documents and different categories of relevant information are grouped in accordance with these concepts.
- the interests of users of a particular website may be associated with multiple categories at different levels, each having a weight indicative of the category's relevance to the users' interest.
- the categories and their associated weights can be determined from analyzing the search history associated with the website.
- FIG. 6B is a block diagram of an exemplary data structure, a category-based website profile table 650 , which may be used for storing category-based website profiles in accordance with some embodiments of the present invention.
- the category-based profile table 650 includes a table 640 having a plurality of records 642 , each record including a WEBSITE_ID, a FLAVOR_ID and a pointer pointing to another data structure, such as table 660 - 1 .
- a website may have one or more flavors to better serve different user groups. For example, the website “WEBSITE_ 1 ” has at least two different flavors, “FLAVOR_ 1 ” and “FLAVOR_ 2 .” These two different “flavors” may correspond to different search boxes on different webpages.
- Table 660 - 1 includes two columns, CATEGORY_ID and WEIGHT.
- the CATEGORY_ID column contains a category's identifier as shown in FIG. 6A , and the value in the WEIGHT column indicates the relevance of the category to the interests of the website users.
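- The two-level structure of FIG. 6B can be sketched with nested dictionaries: an outer table keyed by (WEBSITE_ID, FLAVOR_ID) whose values stand in for the pointers to the per-flavor CATEGORY_ID/WEIGHT tables. The weight values below are invented for illustration; only the identifiers come from the text.

```python
from typing import Dict, Tuple

# Stand-in for table 660-x: CATEGORY_ID -> WEIGHT
CategoryWeights = Dict[str, float]
# Stand-in for table 640: (WEBSITE_ID, FLAVOR_ID) -> per-flavor category table
ProfileTable = Dict[Tuple[str, str], CategoryWeights]

profile_table: ProfileTable = {
    ("WEBSITE_1", "FLAVOR_1"): {"1.6.1": 0.7, "1.1": 0.1},  # illustrative weights
    ("WEBSITE_1", "FLAVOR_2"): {"1.4.2.3": 0.5},            # illustrative weights
}

def lookup_category_weights(table, website_id, flavor_id):
    """Return the category weights for a website flavor, or an empty dict."""
    return table.get((website_id, flavor_id), {})
```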
- the search history items are automatically classified into different clusters.
- Clusters are usually more dynamic than categories.
- categories are typically pre-generated. Search history items associated with different websites are classified against the same set of categories. In contrast, there may not be a predefined set of clusters for a particular website. The search history items associated with the website fall into an automatically generated set of clusters. Therefore, clusters may be better tailored to characterize the interests and preferences of the users of the website. For convenience, many discussions of the present invention use categories as an example. But it will be clear to one skilled in the art that the underlying algorithms are also applicable to clusters with no or little adjustment.
- the website profile based upon the category map 600 is a topic-oriented implementation.
- the items in a category-based profile can also be organized in other ways.
- the interests of the website users can be categorized based on the formats of the documents identified by the website users, such as HTML, plain text, PDF, Microsoft Word, etc. Different formats may have different weights.
- the interests of the website users can be categorized according to the types of the identified documents, e.g., an organization's homepage, a person's homepage, a research paper, or a news group posting, each type having an associated weight.
- Documents can also be categorized by document origin, for instance the country associated with each document's host.
- two or more of the above-identified category-based profiles may co-exist, with each one reflecting a respective aspect of the interests of the website users.
- FIG. 7 is a block diagram of an exemplary data structure, a term-based profile table 700 , which may be used for storing term-based website profiles in accordance with some embodiments of the present invention.
- the table 700 includes a plurality of records 710 , each record corresponding to a website's term-based profile.
- a term-based profile record 710 includes a plurality of columns including a WEBSITE_ID column 720 and multiple columns of (TERM, WEIGHT) pairs 740 .
- the WEBSITE_ID column stores a website identifier.
- Each (TERM, WEIGHT) pair 740 includes a term of typically one to three words that is deemed relevant to the interests of the website users and a weight associated with the term indicating the relevance of the term. The weight of a term is not necessarily a positive value. A negative weight suggests that the website users disfavor documents including this term in the search results.
- another type of website profile is referred to as a link-based profile.
- the page rank of a document is based on the link structure that connects the document to other documents on the Internet. A document having more links pointing to it is often assigned a higher page rank and is therefore deemed more popular by the search engine.
- Link information of documents selected by a website's users can be used to infer the interests of the website's users.
- a list of preferred URLs is identified for the website users by analyzing the click rate of these URLs. Each preferred URL may be further weighted according to the mouse hovering time by the website users at the URL.
- a list of preferred web hosts is identified for the website users by analyzing the users' visit rate at different web hosts.
- the weights of the two or more URLs may be combined as the weight of the web host.
- FIG. 8 is a block diagram of an exemplary data structure that may be used for storing link-based website profiles in accordance with some embodiments of the present invention.
- the link-based profile table 800 includes a table 810 that includes a plurality of records 820 , each record including a WEBSITE_ID and a pointer pointing to another data structure, such as table 810 - 1 .
- Table 810 - 1 may include two columns, LINK_ID 830 and WEIGHT 840 .
- the LINK_ID 830 may be associated with a preferred URL or host.
- the actual URL/host may be stored in the table instead of the LINK_ID; however, it is preferable to store the LINK_ID to save storage space.
- a preferred list of URLs and/or hosts includes URLs and/or hosts that have been directly identified by the website users.
- the preferred list of URLs and/or hosts may further extend to URLs and/or hosts indirectly identified by using methods such as collaborative filtering or bibliometric analysis, which are known to one of ordinary skill in the art.
- the indirectly identified URLs and/or hosts include URLs or hosts that have links to/from the directly identified URLs and/or hosts. These indirectly identified URLs and/or hosts are weighted by the distance between them and the directly identified URLs or hosts. For example, when a directly identified URL or host has a weight of 1, URLs or hosts that are one link away may have a weight of 0.5, URLs or hosts that are two links away may have a weight of 0.25, etc.
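- The distance-based weighting in the example above (1, 0.5, 0.25) can be sketched as a breadth-first traversal of a link graph. The function name, the adjacency-list input, the hop limit, and taking the maximum when a URL is reachable from several seeds are assumptions for illustration.

```python
from collections import deque

def propagate_link_weights(graph, seeds, decay=0.5, max_hops=2):
    """Assign weights to URLs within `max_hops` links of directly identified URLs.

    graph: url -> iterable of neighboring urls. To cover links "to/from",
           the caller should list neighbors in both directions.
    seeds: url -> weight for directly identified URLs (kept unchanged).
    """
    weights = dict(seeds)
    for seed, seed_weight in seeds.items():
        visited = {seed}
        queue = deque([(seed, 0)])
        while queue:
            url, hops = queue.popleft()
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbor in graph.get(url, ()):
                if neighbor in visited:
                    continue
                visited.add(neighbor)
                candidate = seed_weight * decay ** (hops + 1)
                if neighbor not in seeds:
                    weights[neighbor] = max(weights.get(neighbor, 0.0), candidate)
                queue.append((neighbor, hops + 1))
    return weights
```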
- This procedure can be further refined by reducing the weight of links that are not related to the topic of the original URL or host, e.g., links to copyright pages or web browser software that can be used to view the documents associated with the user-selected URL or host.
- Irrelevant links can be identified based on their context or their distribution. For example, copyright links often use specific terms (e.g., “copyright” and “All rights reserved” are commonly used terms in the anchor text of a copyright link); and links to a website from many unrelated websites may suggest that this website is not topically related (e.g., links to the Internet Explorer website are often included in unrelated websites).
- the indirect links can also be classified according to a set of topics and links with very different topics may be excluded or be assigned a low weight.
- the three types of website profiles discussed above are generally complementary to one another since different profiles characterize the interests of the website users from different vantage points. However, this does not mean that one type of website profile, e.g., the category-based profile, is incapable of playing a role that is typically played by another type of website profile.
- a preferred URL or host in a link-based profile is often associated with a specific topic, e.g., finance.yahoo.com is a URL focusing on financial news. Therefore, what is achieved by a link-based profile that comprises a list of preferred URLs or hosts may also be achievable, at least in part, by a category-based profile that has a set of categories that cover the same topics covered by preferred URLs or hosts.
- FIG. 9 is a flow diagram of a process for generating website-dependent search results using the various types of website profiles in accordance with some embodiments of the present invention.
- the search engine 122 receives a search query from a website 102 submitted by a user through a client 103 ( 910 ).
- the search engine 122 may optionally generate a query strategy ( 915 ).
- the search query is normalized so as to be in proper form for further processing, and/or the search query may be modified in accordance with predefined criteria so as to automatically broaden or narrow the scope of the search query.
- the search engine 122 submits the search query (or the query strategy, if one is generated) to the content database 124 .
- the content database 124 identifies a set of documents that match the search query ( 920 ), each document having a generic ranking score that depends on the document's page rank and the search query. All three operations ( 910 , 915 and 920 ) are typically conducted by the search engine 122 .
- the requesting website's identifier is embedded in the search query.
- Based on the website identifier, the search result ranker 126 identifies the website's profile in the website profile database 128 ( 925 ). Next, the search result ranker 126 analyzes each identified document to determine one or more boost factors using the website profile ( 935 ) and then assigns the document a website-dependent ranking score using the document's generic ranking score and the boost factors ( 940 ). The search result ranker 126 iterates the process for every identified document ( 942 ). Finally, the search result ranker 126 re-orders the list of documents according to their website-dependent ranking scores ( 945 ) and sends a search result including links to the list of documents to the requesting client 103 .
- the analysis of an identified document at 935 includes determining a correlation between the document's content and the website's profile. Furthermore, in some embodiments, this operation includes accessing a previously computed document profile for the document and then determining a correlation between the document profile and the website's profile. In some embodiments, determining the correlation includes one or more operations that are “dot product” computations, which determine the extent of overlap, if any, between the document profile and the website's profile.
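- A “dot product” between two sparse profiles can be sketched as below, assuming each profile is a dictionary mapping an item (a category, term, or link identifier) to a weight. The representation is an assumption; the text only says the computation measures the overlap between the document profile and the website's profile.

```python
def dot_product(doc_profile, site_profile):
    """Sum of weight products over the keys the two profiles share."""
    # Iterate over the smaller profile and look keys up in the larger one.
    small, large = sorted((doc_profile, site_profile), key=len)
    return sum(w * large[k] for k, w in small.items() if k in large)
```

Profiles with no keys in common yield 0, which corresponds to no overlap.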
- FIG. 10 is a block diagram of exemplary data structures that may be used for storing category-based, term-based, and link-based boost factors for documents in the search results in accordance with some embodiments of the present invention.
- category-based document information table 1010 includes a plurality of identified categories and associated weights
- term-based document information table 1030 includes multiple pairs of relevant terms and associated weights
- link-based document information table 1050 includes a set of links and corresponding weights.
- the rightmost column of each of the three tables ( 1010 , 1030 and 1050 ) stores the boost factor (i.e., a computed score) of a document when the document is evaluated using one specific type of website profile.
- a document's boost factor can be determined by combining the weights of the items associated with the document. For instance, a category-based or term-based boost factor may be computed as follows. The users of a website may favor documents related to science with a weight of 0.6, and disfavor documents related to business with a weight of −0.2. Thus, when a science document matches a search query, it will be boosted over a business document. In general, the document topic classification may not be exclusive.
- a candidate document may be classified as being a science document with probability of 0.8 and a business document with probability of 0.4.
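- As a worked illustration of the numbers above, suppose the boost factor is the probability-weighted sum of the website's category weights. This combination rule is an assumption made here for the example; the text only states that the weights are combined.

```latex
\begin{aligned}
\text{BoostFactor}(D) &= \sum_{c} P(c \mid D)\, w_c \\
                      &= 0.8 \times 0.6 + 0.4 \times (-0.2) \\
                      &= 0.48 - 0.08 = 0.40
\end{aligned}
```

Here $P(c \mid D)$ is the probability that document D belongs to category c, and $w_c$ is the website profile's weight for that category (0.6 for science, a negative 0.2 for business).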
- a link-based boost factor may be computed based on the relative weights allocated to the preferred URLs or hosts in the link-based profile.
- the term-based profile rank can be determined using known techniques, such as the term frequency-inverse document frequency (TF-IDF).
- the term frequency of a term is a function of the number of times the term appears in a document.
- the inverse document frequency is an inverse function of the number of documents in which the term appears within a collection of documents. For example, very common terms like “word” occur in many documents and consequently are assigned a relatively low inverse document frequency, while less common terms like “photograph” and “microprocessor” are assigned a relatively high inverse document frequency.
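- A minimal TF-IDF sketch follows. It uses one common variant, raw term count multiplied by the natural logarithm of (number of documents / number of documents containing the term); the text does not say which variant is intended, so this choice is an assumption.

```python
import math
from collections import Counter

def tf_idf(term, document_tokens, corpus):
    """TF-IDF of `term` in one document.

    document_tokens: list of tokens in the document.
    corpus: list of token lists (one per document), including the document itself.
    """
    tf = Counter(document_tokens)[term]               # raw term frequency
    df = sum(1 for doc in corpus if term in doc)      # document frequency
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)
```

A term that occurs in every document gets a score of zero, while a term confined to a few documents scores higher.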
- a candidate document D that satisfies the search query is assigned a query score, QueryScore, in accordance with the search query.
- This query score is then modulated by document D's page rank, PageRank, to generate a generic ranking score, GenericScore, that is expressed as
- This generic ranking score may not appropriately reflect document D's relevance to a particular website's users if the users' interest is dramatically different from that of a random user of the search engine.
- the relevance of document D to the website users can be accurately characterized by a set of boost factors, based on the correlation between document D's content and the website's term-based profile, herein called the TermBoostFactor, the correlation between one or more categories associated with document D and the website's category-based profile, herein called the CategoryBoostFactor, and the correlation between the URL and/or host of document D and the website's link-based profile, herein called the LinkBoostFactor. Therefore, document D may be assigned a website-dependent ranking score that is a function of both the document's generic ranking score and the various website profile-based boost factors. In one embodiment, this website-dependent ranking score can be expressed as:
- WebsiteScore = GenericScore * (TermBoostFactor + CategoryBoostFactor + LinkBoostFactor).
- the website-dependent ranking score can be expressed as:
- WebsiteScore = GenericScore * BoostFactor
- BoostFactor is based on the correlation between document D's content and the website's profile.
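- Re-ranking by a website-dependent score, computed as the generic score multiplied by a boost factor, can be sketched as follows. The function name and the input shapes are hypothetical, and the boost values in the usage note are invented.

```python
def rerank(documents, boost_for):
    """Order documents by generic_score * boost factor, highest first.

    documents: list of (doc_id, generic_score) pairs.
    boost_for: callable mapping doc_id -> boost factor (for example, the sum of
               term-, category-, and link-based boost factors).
    Returns a list of (doc_id, website_score) pairs.
    """
    scored = [(doc_id, generic * boost_for(doc_id)) for doc_id, generic in documents]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, with `documents = [("apple-fruit", 0.9), ("apple-computer", 0.8)]` and boosts of 0.2 and 1.5 respectively, the computer page moves ahead of the fruit page even though its generic score is lower.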
- FIG. 11 is a flow diagram of another process for generating website-dependent search results using website profiles in accordance with some embodiments of the present invention.
- the generic query strategy is modulated by the website's profile to create a website-dependent query strategy ( 1125 , 1165 ). For example, relevant terms from the website profile may be added to the search query with associated weights.
- the website-dependent query strategy is created by the search engine 122 , the front end server 120 , or the search result ranker 126 , respectively.
- the requesting website 102 has a copy of its profile generated by the website profiler 129 and the website-dependent query strategy is created by the requesting website 102 .
- the search engine 122 searches the content database 124 using the website-dependent query strategy ( 1170 ). As a result, the documents identified by the content database 124 are implicitly ordered by their associated website-dependent ranking score ( 1175 ).
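- Adding weighted profile terms to a query can be sketched as below. Representing the query strategy as a list of (term, weight) pairs, giving original terms a weight of 1.0, and adding only the top few positively weighted profile terms are all assumptions for illustration.

```python
def website_dependent_query(query_terms, term_profile, top_n=3):
    """Return (term, weight) pairs: the original terms plus top profile terms.

    query_terms: list of terms in the user's query.
    term_profile: term -> weight from the website's term-based profile.
    """
    strategy = [(t, 1.0) for t in query_terms]
    seen = set(query_terms)
    extras = sorted(
        ((t, w) for t, w in term_profile.items() if w > 0 and t not in seen),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return strategy + extras[:top_n]
```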
- Some embodiments include a computer program product for use in conjunction with a computer system associated with a search engine.
- the computer program product may comprise a computer readable storage medium and a computer program mechanism embedded therein.
- the computer program mechanism includes instructions for receiving from a website distinct from the search engine multiple search queries submitted by users; instructions for providing to the requesting users search results responsive to the search queries; instructions for monitoring activities of the users on the search results; and instructions for generating a profile for a website using the search queries from the website and the user activities on the search results.
- an exemplary information server 1200 typically includes one or more processing units (CPU's) 1202 , one or more network or other communications interfaces 1210 , memory 1212 , and one or more communication buses 1214 for interconnecting these components.
- the communication buses 1214 may include circuitry (sometimes called a chipset) that interconnects and controls communications between system components.
- the system 1200 may optionally include a user interface, for instance a display and a keyboard.
- Memory 1212 may include high speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices.
- Memory 1212 may include mass storage that is remotely located from the CPU's 1202 .
- memory 1212 stores the following programs, modules and data structures, or a subset or superset thereof:
- the information server 106 may not have access to all the search history associated with a website. For example, there may be an agreement between a website 102 and the information server 106 with respect to the search queries submitted from the website 102 . According to the agreement, when a user visiting the website 102 submits a search query to the information server 106 , the information server 106 is required to send the corresponding search result to the website 102 rather than the requesting user at a client 103 .
- the website 102 may modify the search result, e.g., attaching advertisements or other information to the search result, and then serves the modified search result to the requesting user at the client 103 .
- the information server 106 may have no information identifying the requesting user and the client 103 , and may also be unable to monitor the user's activities on the search result. For example, the information server 106 may not receive any information identifying the document links in the search result that have been clicked by the user. Similarly, the information server 106 may not receive any information identifying the document links over which the user moves his or her mouse and the corresponding mouse hovering time. In other words, the information server 106 has very limited or no exposure to the activities of the website users on the search results. Therefore, the information server 106 has to rely on the user activities on the search results from other venues to generate the website profile.
- the information server 106 may identify another website similar to the website in question. Two websites are deemed similar if a predefined number or percentage of the search queries submitted from the two websites are identical. It is also reasonable to infer that users of the two similar websites may have similar interests and therefore the user activities associated with one website are a reasonable proxy of the user activities associated with the other one. If the information server 106 can access the user activities associated with one of the two websites (e.g., there is no agreement to deliver the search results to the website), the information server 106 can use the same user activities to create the profile for the other website.
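- The similarity test based on shared queries can be sketched as follows. Comparing distinct queries, dividing the overlap by the size of the smaller query set, and the 0.5 default threshold are assumptions; the text only requires a predefined number or percentage of identical queries.

```python
def shared_query_fraction(queries_a, queries_b):
    """Fraction of the smaller site's distinct queries that also occur at the other site."""
    a, b = set(queries_a), set(queries_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

def are_similar(queries_a, queries_b, min_fraction=0.5):
    """Deem two websites similar when enough of their queries are identical."""
    return shared_query_fraction(queries_a, queries_b) >= min_fraction
```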
- the information server 106 may utilize monitored user activities associated with search queries submitted directly to the search engine (e.g., search queries submitted using a toolbar search box or a webpage associated with the information server 106 ) as the proxy of a particular website.
- the only search queries for which such “general user population” information will be used are queries that were submitted from the website in question.
- the search query “golf courses in mountain view” may be submitted both to a golf-focused website, and to a general purpose search engine.
- Profile information developed from general user population clicks on the search result of this search query is used to generate the profile for a respective website by combining or aggregating the general user statistical information for the queries received from the respective website.
- the website profile obtained in this way will typically differ significantly from a group profile of the entire user community of the search engine, and therefore the website profile generated in this way will be a reasonable approximation of the website profile that would be generated if user activity information were available for search results returned by the search engine in response to search queries submitted from the website.
- the website profiles can also be used to select advertisements for search queries submitted from different websites. Different advertisements are treated in a way similar to different documents. For example, an advertisement may have a set of key terms. A correlation of this set of key terms with a term-based profile (or a category-based profile, or both) associated with a website produces a boost factor for the advertisement. This boost factor may be used to promote or demote the particular advertisement in response to a search query submitted from the website. For example, when the information server 106 receives a search query “world cup 2006” from a website or webpage devoted to soccer news, it may promote advertisements covering soccer gear, ticket sales for the 2006 FIFA World Cup Germany, and hotel reservations in the German cities hosting the soccer games, etc.
Abstract
In a method of profiling a website, an information server receives multiple search queries from a website submitted by different users. Different search results responsive to the search queries are provided to the requesting users. The information server monitors activities of the users on the search results and generates a profile for the website using the search queries and the user activities. When the information server receives a same search query from two different websites, it identifies a plurality of information items associated with the search query. The information server uses profiles of the two websites to customize the information items into two different orders and serves the information items to the two websites in the two different orders.
Description
- This application is a continuation of U.S. application Ser. No. 11/394,620, filed Mar. 30, 2006, entitled “Generating Website Profiles Based on Queries from Websites and User Activities on the Search Results,” which is incorporated herein by reference in its entirety.
- This application is related to U.S. patent application Ser. No. 10/890,854, filed Jul. 13, 2004, now U.S. Pat. No. 7,693,827, entitled “Personalization of Placed Content Ordering in Search Results,” which is incorporated herein by reference in its entirety.
- This application is also related to U.S. patent application Ser. No. 10/869,492, filed Jun. 15, 2004, now U.S. Pat. No. 7,565,630, entitled “Customization of Search Results for Search Queries Received from Third Party Sites,” which is incorporated herein by reference in its entirety.
- The present invention relates generally to the field of a search engine in a computer network system, and in particular to a system and method for generating a profile for a website and using the profile to customize rankings of search results in response to search queries submitted from the website.
- Search engines are a powerful tool for locating and retrieving documents from the Internet (or an intranet). Many websites include at least one search box on their webpages. The search box on a particular webpage typically enables users to submit search queries to search for documents at the website associated with the webpage, or to search for documents on the Internet. However, most websites do not have an exclusive, dedicated search engine system for processing these search queries. This is especially true if the search box enables searches of the entire Internet for relevant documents. Rather, the search queries are re-directed to and processed by a third-party search engine (e.g., www.google.com). The third-party search engine generates search results responsive to the search queries (e.g., by searching a database of documents) and returns the search results to the requesting users.
- Traditionally, the search results produced by the third-party search engine are independent of the website from which a search query is submitted. For example, the search engine generates the same search result for the search query “apple” irrespective of whether the search query is from the website of an online retail electronics store frequented by Apple computer users or an online shopping website hosted by a grocery store. Clearly visitors to these two websites have different interests and should receive different search results. As a result, the search results returned for the search query “apple” are likely to include results of little interest to visitors to these respective websites.
- A similar issue could arise for a website that includes multiple search boxes associated with different webpages. For instance, a sports news website may have one webpage covering domestic news and another one devoted to international news. A user entering the term “football” into the search box on the domestic news webpage is probably interested in news related to American football, while a user entering the same term “football” into the search box on the international news webpage is probably more interested in news about soccer (which is known as “football” outside the United States). Similar issues may arise if a sports news website has different webpages covering news for different sports, and search boxes in each of these pages. Thus, when a search engine ignores the webpages from which a search query is submitted, users do not receive search results best tailored to their distinct interests.
- In view of the aforementioned, it would be desirable to have a search engine that can customize its search results in accordance with the websites (or webpages) from which the corresponding search queries are submitted so as to highlight information items in the search results that are most likely to be of interest to the users who submit the search queries. Further, it would be desirable for such a system to operate without explicit input from a user with regard to the user's personal preferences and interests and therefore free the user from concerns over exposing private information.
- In a method of profiling a website, an information server receives multiple search queries from a website submitted by different users. Different search results responsive to the search queries are provided to the requesting users. The information server monitors activities of the users on the search results and generates a profile for the website using the search queries and the user activities.
- In a method of providing website-dependent search results, an information server receives a same query from two websites and identifies a plurality of information items associated with the search query. The information server uses profiles of the two websites to customize the information items into two different orders and serves the information items to the two websites in the two different orders. The two website profiles are related to the search histories of the two websites.
- The present invention, including website profile construction and search results re-ordering and/or scoring, can be implemented on either the client side or the server side of a client-server network environment.
- The aforementioned features and advantages of the invention as well as additional features and advantages thereof will be more clearly understood hereinafter as a result of a detailed description of preferred embodiments of the invention when taken in conjunction with the drawings.
-
FIG. 1 is a block diagram of an exemplary distributed system that includes a plurality of websites and clients requesting information from an information server in accordance with some embodiments of the present invention. -
FIG. 2 is a flow diagram of a process for generating a website (or webpage) profile using search queries, search results and user activities associated with the website (or webpage) in accordance with some embodiments of the present invention. -
FIG. 3 is a block diagram of a process for updating a website (or webpage) profile by merging an incremental website (or webpage) profile into the website (or webpage) profile in accordance with some embodiments of the present invention. -
FIG. 4 is a prophetic example of a curve characterizing the popularity distribution of search queries submitted from a website (or webpage). -
FIG. 5 is a block diagram illustrating how the process of creating a website profile is divided into multiple sub-processes in accordance with some embodiments of the present invention. -
FIG. 6A is a block diagram of an exemplary category map that may be used for generating category-based website profiles in accordance with some embodiments of the present invention. -
FIG. 6B is a block diagram of an exemplary data structure that may be used for storing category-based website profiles in accordance with some embodiments of the present invention. -
FIG. 7 is a block diagram of an exemplary data structure that may be used for storing term-based website profiles in accordance with some embodiments of the present invention. -
FIG. 8 is a block diagram of an exemplary data structure that may be used for storing link-based website profiles in accordance with some embodiments of the present invention. -
FIG. 9 is a flow diagram of a process for generating website-dependent search results using website profiles in accordance with some embodiments of the present invention. -
FIG. 10 is a block diagram of exemplary data structures that may be used for storing category-based, term-based, and link-based boost factors for documents in the search results in accordance with some embodiments of the present invention. -
FIG. 11 is a flow diagram of another process for generating website-dependent search results using website profiles in accordance with some embodiments of the present invention. -
FIG. 12 is a block diagram of an exemplary information server in accordance with some embodiments of the present invention. - Like reference numerals refer to corresponding parts throughout the several views of the drawings.
- For illustrative purposes, the embodiments discussed below only include systems and methods that generate a website profile based on the search history associated with the website and then use the website profile to rank search results in response to search queries submitted from the website. However, it will be apparent to one skilled in the art that the underlying principles discussed below can be easily extended to create webpage profiles and generate webpage-dependent search results using the webpage profiles.
-
FIG. 1 is a block diagram of an exemplary environment 100 for implementing some embodiments of the present invention. One or more websites 102 and clients 103 can be connected to a communication network 104. The communication network 104 can be connected to an information server 106. The information server 106 may include a front end server 120, a search engine 122, a document profiler 125, a website profiler 129, a search result ranker 126, a document profile database 123, a content database 124, a search history database 127, and a website profile database 128. - In some embodiments, the
information server 106 contains a subset or superset of the elements illustrated in FIG. 1. Although FIG. 1 shows the information server 106 as a number of discrete items, the figure is intended more as a functional description of the various features which may be present in the information server 106 rather than a structural schematic of the various embodiments. In practice, items shown separately could be combined and some items could be further separated, as would be recognized by one of ordinary skill in the art of designing such systems. For example, the four different databases information server 106 and the allocation of features among the computers will vary from one implementation to another, and may depend in part on the amount of traffic that the information server 106 must handle during peak usage periods as well as during average usage periods. - A
website 102 is typically a collection of webpages associated with a domain name on the Internet. Each website (or webpage) has a uniform resource locator (URL) that uniquely identifies the location of the website (or webpage) on the Internet. Any visitor can visit the website by entering its URL in a browser window. A website can be hosted by a web server exclusively owned by the owner of the domain name or by an Internet service provider whose web server manages multiple websites associated with different domain names. For illustrative purposes, the website 102 includes two webpages search box website 102 or the entire Internet for relevant information by entering a search query into the search box. Depending on the context, the term “website” as used in this document refers to a logical location (e.g., an Internet or intranet location) identified by a URL, or it refers to a web server hosting the website represented by the URL, or both. - A
client 103 can be any of a number of devices (e.g., a computer, an internet kiosk, a personal digital assistant, a cell phone, a gaming device, a desktop computer, or a laptop computer) and can include a client application 132, a client assistant 134, and/or client memory 136. The client application 132 can be a software application that permits a user to interact with the client 103 and/or network resources to perform one or more tasks. For example, the client application 132 can be a browser (e.g., Firefox) or other type of application that permits a user to search for, browse, and/or use resources (e.g., webpages and web services) at the website 102 from the client 103 and/or accessible via the communication network 104. The client assistant 134 can be a software application that performs one or more tasks related to monitoring or assisting a user's activities with respect to the client application 132 and/or other applications. For instance, the client assistant 134 assists a user at the client 103 with browsing for resources (e.g., files) hosted by the website 102; processes information (e.g., search results) received from the information server 106; and monitors the user's activities on the search results. In some embodiments the client assistant 134 is part of the client application 132, available as a plug-in or extension to the client application 132 (provided, for example, from various online sources), while in other embodiments the client assistant 134 is a stand-alone program separate from the client application 132. In some embodiments the client assistant 134 is embedded in one or more webpages or other documents downloaded from one or more servers, such as the information server 106. Client memory 136 can store information such as webpages, documents received from the information server 106, system information, and/or information about a user, among other things. - The
communication network 104 can be any wired or wireless local area network (LAN) and/or wide area network (WAN), such as an intranet, an extranet, or the Internet. It is sufficient that the communication network 104 provide communication capability between the websites 102, the clients 103 and the information server 106. In some embodiments, the communication network 104 uses the Hypertext Transfer Protocol (HTTP) to transport information using the Transmission Control Protocol/Internet Protocol (TCP/IP). HTTP permits client computers to access various resources available via the communication network 104. The various embodiments of the invention, however, are not limited to the use of any particular protocol. The term “resource” as used throughout this specification refers to any piece of information or service that is accessible via a URL and can be, for example, a webpage, a document, a database, an image, a computational object, a search engine, or other online information service. - In order to receive website-dependent search results, a user from a
client 103 first sends to a website 102 a request for a webpage. The website responds by identifying the requested webpage and returning it to the requesting client 103. The webpage may include a document of interest to the user (e.g., a newspaper article). The webpage may also include a search box (e.g., at or near the top of the webpage). While or after browsing the content of the webpage, the user may be interested in getting more information. To do so, the user can enter a search query into the search box and submit the search query to the website 102. The search query may include one or more query terms. - As noted above, many websites do not have a dedicated search engine. Their search requests are actually handled by a third-party search engine. In some embodiments, upon receipt of the search query, the
website 102 generates and sends a search request to the information server 106. In some other embodiments, the client 103 generates and sends the search request directly to the information server 106 without routing the request through the website 102. In either case, the search request includes the search query and unique identifiers of the requesting website 102 and the requesting client 103. - Within the
information server 106, the front end server 120 is configured to handle a variety of requests from the websites 102 and the clients 103 via their respective connections with the communication network 104. As shown in FIG. 1, the front end server 120 is connected to the search engine 122, and the search engine 122 is connected to the content database 124. The content database 124 stores a large number of indexed documents retrieved from different websites. Alternately, or in addition, the content database 124 stores an index of documents stored at various websites. In one embodiment, each indexed document is assigned a page rank according to the document's link structure. The page rank serves as a query-independent measure of the document's importance. - The
front end server 120 passes the search request on to the search engine 122. The search engine 122 then communicates with the content database 124 to select a plurality of documents in response to the search request. The search engine 122 assigns a generic ranking score to each document based on the document's page rank, the text associated with the document, and the search query. - The
search engine 122 is also connected to the document profile database 123. The document profile database 123 stores a document profile for each indexed document in the content database 124. Both the document profile database 123 and the content database 124 are connected to the document profiler 125. For each document in the content database 124, the document profiler 125 generates a document profile by analyzing the content of the document and its link structure. The generation of document profiles is independent of the operation of the search engine 122. In one embodiment, the document profiler 125 is invoked to generate a document profile whenever the information server 106 identifies a new document or a new version of an existing document on the Internet. In another embodiment, the document profiler 125 is invoked periodically to generate document profiles for all new files identified during a predetermined time period. In some embodiments, instead of being two separate entities, the document profile database 123 and the content database 124 are merged together so that a document and its associated profile can be located by a single database query. - There is a connection from the
search engine 122 to the search result ranker 126. Through this connection, the search engine 122 sends the identified documents and their associated document profiles to the search result ranker 126. The search result ranker 126 has a connection to the website profile database 128. Like the document profile database 123, the website profile database 128 stores a large number of website profiles including the profile of the requesting website 102. Using the requesting website 102's profile, the search result ranker 126 converts the generic ranking score of each identified document into a website-dependent ranking score. The documents are then re-ordered in accordance with their respective website-dependent ranking scores. Next, the search result ranker 126 creates a search result in accordance with the updated order of the documents, the search result including multiple document links, one for each document. The search result, or a portion of the search result (e.g., information identifying the top 10, 15 or 20 results), is returned to the requesting client 103 and displayed to the user through the client application 132. The user, after browsing the search result, may click one or more document links in the search result to download and view one or more documents identified by the search result. - While the above description divided tasks among the
search engine 122, search result ranker 126 and front end server 120 in a particular way, this particular division of tasks is exemplary, and other divisions may be used in other embodiments of the present invention. For instance, the website profile (of the website from which a search query is received) may be transmitted with the search query to the search engine 122, and the search engine 122 may use that information to compute website-specific document scores for ranking the search results. In effect, this would merge the search result ranker 126 into the search engine 122. In yet other embodiments, other divisions of tasks may be used. - An important aspect of the process of serving website-dependent search results is the generation and maintenance of the website profiles stored in the
website profile database 128. A website profile should reflect the interests of the users of the associated website, and in many embodiments the website profile will be unique to its associated website. For example, a consumer electronics website should have a website profile that boosts webpages related to electronic products while an on-line grocery store website should have a website profile that promotes webpages related to farm produce. - In most embodiments, a website profile is not static, because a static website profile is unlikely to result in the
information server 106 serving the most relevant search results to users of the associated website. Instead, a website profile is updated from time to time (e.g., periodically) so as to re-align the website profile with the current interests of the users of the website. While some website profiles may remain virtually static for long periods of time (e.g., websites serving a small, static population of users who submit searches from the website on only a very narrow range of topics), many website profiles will vary over time as the users of the website change and as the interests of the website's users vary over time. - There are similarities between a website profile and a user profile. Both profiles can be used to fine-tune the search results generated by the search engine. Both need information about at least one user's search history in order to capture the user's dynamic search interest. But there are also significant differences between the two types of profiles. A typical user profile is generated by analyzing an individual user's search history. This user profile is only used to modulate search results responsive to search queries submitted by the same user. For the same search query, two different users may receive different search results from the same search engine if they have different user profiles. In contrast, a website profile is generated by analyzing the search history of multiple users while visiting the website so as to characterize the multiple users' interests. This website profile can be used to modulate search results responsive to search queries submitted by any user from the same website, including new users of the website who made no prior “contribution” to the website profile. Therefore, the same user submitting the same search query from two different websites may receive different search results if the two websites have different website profiles.
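The effect described above, in which the same query submitted from two websites yields two different orderings, can be illustrated with a minimal sketch. The specification does not give a scoring formula at this point, so the multiplicative boost below, the function name `rerank`, the category labels, and the example scores are all assumptions chosen only for illustration.

```python
def rerank(results, website_profile):
    """Re-order documents using a website's category profile.

    results: list of (doc_id, generic_score, doc_category_profile) tuples.
    website_profile: dict mapping category name -> weight.
    The boost formula is a hypothetical choice, not taken from the specification.
    """
    def site_score(item):
        _, generic_score, doc_profile = item
        # Overlap between the document's categories and the website's interests.
        overlap = sum(weight * website_profile.get(category, 0.0)
                      for category, weight in doc_profile.items())
        # Convert the generic score into a website-dependent score.
        return generic_score * (1.0 + overlap)

    ranked = sorted(results, key=site_score, reverse=True)
    return [doc_id for doc_id, _, _ in ranked]


# Hypothetical results for the query "apple".
results = [
    ("apple-computers", 1.0, {"electronics": 1.0}),
    ("apple-fruit-guide", 0.9, {"food": 1.0}),
]
electronics_site = {"electronics": 0.8}
grocery_site = {"food": 0.8}

order_for_electronics = rerank(results, electronics_site)
order_for_grocery = rerank(results, grocery_site)
```

In this sketch no user identifier appears anywhere: the only input besides the generic results is the profile of the website from which the query was submitted.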
- The website profile also has an important advantage over the user profile in terms of protecting a user's privacy. A user profile is associated with an individual user. To create the user profile, the individual user's personal preferences must be collected, either explicitly (e.g., by having the user complete a survey) or implicitly (e.g., by monitoring or logging search queries and other online activities of the user). This information indicates what information items may be of interest to the user. Further, the user must have an account at a website or a search engine system and the user must log into his or her account to invoke the user profile to personalize the search results. In contrast, the creation and usage of the website profile does not require any personal information from any user. A website profile is associated with a website, not an individual user. Any individual user's activity at the website is attributed to all the users of the website. A user does not need to log into his or her account at the website in order to use the website profile. As long as a search query is submitted from the website, the information server automatically “personalizes” the corresponding search result in accordance with the website profile.
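Because a website profile is attributed to the website rather than to any individual, it can be aggregated from records that carry no user identifier. The following is a minimal sketch under stated assumptions: the click log is a list of (website identifier, document identifier) pairs, each document profile is a dictionary of category weights, and the profile is a normalized sum. The function name `build_website_profiles` and the data layout are hypothetical.

```python
from collections import Counter, defaultdict


def build_website_profiles(click_log, doc_profiles):
    """Aggregate clicked-document category weights per website.

    click_log: iterable of (website_id, doc_id) pairs; no user identifier is needed.
    doc_profiles: dict mapping doc_id -> {category: weight}.
    Returns a dict mapping website_id -> {category: normalized weight}.
    """
    totals = defaultdict(Counter)
    for website_id, doc_id in click_log:
        profile = doc_profiles.get(doc_id)
        if profile is None:
            continue  # documents without a profile are simply ignored here
        for category, weight in profile.items():
            totals[website_id][category] += weight

    profiles = {}
    for website_id, counts in totals.items():
        norm = sum(counts.values())
        if norm > 0:
            profiles[website_id] = {c: w / norm for c, w in counts.items()}
    return profiles


# Hypothetical data: three clicks from one website, one on an unprofiled document.
doc_profiles = {
    "d1": {"sports": 1.0},
    "d2": {"sports": 0.5, "news": 0.5},
}
click_log = [("golf.example", "d1"), ("golf.example", "d2"), ("golf.example", "d3")]
profiles = build_website_profiles(click_log, doc_profiles)
```

Every click contributes to the single profile of the website, so a new visitor benefits from the profile without having contributed to it.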
- As shown in
FIG. 1, the website profiler 129 is responsible for generating and updating website profiles. In order to capture the current user interest associated with a particular website, the website profiler 129 needs to have access to the users' search history at the website. The users' search history includes the search queries submitted by users while visiting the website, the search results responsive to the search queries, and the user activities on the search results (e.g., selection of a document link, sometimes called “clicking” on a search result, or mouse hovering time over a document link). - For example, when the
front end server 120 receives a search query from a website, it submits a copy of the search query to the search engine 122 to solicit a search result. In addition, the front end server 120 sends another copy of the search query to the search history database 127. The search history database 127 then generates a record, the record including at least the search query and an identifier of the website from which the search query was received. - The
search result ranker 126 prepares a search result responsive to the search query. The search result (i.e., information representing at least a portion of the search result) is sent back to the requesting client through the front end server 120. A copy of the search result, or a portion of the search result, is also stored in the search history database 127 together with the search query record. The client assistant 134 at the requesting client monitors the requesting user's activities on the search result, e.g., recording the user's selection(s) of the document links in the search result and/or the mouse hovering time on different document links. In some embodiments, the client assistant 134 or the website profiler 129 determines the document “dwell time” for a document selected by the user, by determining the amount of time between user selection of the corresponding document link and the user exiting from the document. In some embodiments, the client assistant 134 includes executable instructions, stored in the webpage(s) containing the search result, which monitor the user's actions with respect to the search results and transmit information about the monitored user actions back to the information server 106. The information server 106, in turn, stores the information about these user activities in the search history database 127 for subsequent use. - For example, the
website profiler 129 records the moment that a user submits a search query (t0), the moment that the user clicks the first document link in the corresponding search result (t1), and the moment that the user clicks the second document link in the search result (t2), etc. The differences between two consecutive moments (e.g., t1-t0 or t2-t1) are reasonable approximations of the amount of time spent viewing the search result or the document whose link was selected by the user. In some embodiments the website profiler 129 has no information about the user's dwell time for the last document in the search result that the user selects for viewing. In some other embodiments (e.g., where at least some users “opt in” to a version of the client assistant that collects additional information about the users' online activities), the website profiler 129 also receives click and timestamp information for user actions after the user finishes viewing documents from a search result. Continuing the above example, the website profiler 129 further records the moment that the user submits a second query (t3), the moment the user selects a document from the second search results (t4), and so on. Furthermore, the website profiler 129 may record the moment (t5) when the user either closes the browser window that was being used to view search results and documents listed in the search results or navigates away from the website from which the query was received. This additional information enables the website profiler 129 to determine the user dwell time for all search result documents (i.e., documents listed in search results) viewed by a user, which in turn enables the website profiler 129 to generate a more accurate website profile for a website. - Based on a website's search history information, the
website profiler 129 generates a website profile. FIG. 2 is a flow diagram of a process for generating a website profile using the website's search history in accordance with some embodiments of the present invention. Initially, the website profiler 129 identifies search queries submitted from the website (210). While in most cases this will include all search queries submitted from the website, in the case of very popular or busy websites, the identified search queries may comprise a subset or sampling of the submitted search queries. Search queries submitted from a website during a predetermined time period presumably represent the general interest of users using the website. The search queries are especially useful for capturing dynamic user interests that vary over time. In connection with the search queries, the website profiler 129 identifies the corresponding search results (215). In some embodiments, the search results are served to the requesting users with an embedded client assistant 134 that sends information about the user activities on the search results to the website profiler 129. Using the information sent by the client assistants, the website profiler identifies user activities on the search results (230). Identified user activities may include user clicks on document links in search results. In another example, identified user activities may include mouse hovering time on the document links. Generally speaking, a user clicks a document link if the user is interested in the document's content. Similarly, the fact that the mouse moves onto a particular document link and stays there for a substantial amount of time indicates that this document is relevant to the user's interest. In some embodiments, information about the mouse hovering time may be unavailable. - From the user activities on different search results, the
website profiler 129 can identify documents selected by the website users. In some embodiments, the website profiler 129 visits the content database 124 to retrieve the profiles of the corresponding documents (235). As noted above, each identified document may have a profile (e.g., a category profile) that was previously generated. If any of the identified documents do not yet have profiles, those documents can be ignored, or the website profiler may call upon the document profiler 125 to produce document profiles for those documents. A website profile is then generated from the retrieved document profiles (240). The website profile may include one or more of the following: a weighted listing or vector of categories (sometimes called a category profile), key terms from the search queries and/or user visited documents (sometimes called a term profile), and information about the links to the user visited documents (sometimes called a link profile). This website profile is stored in the website profile database 128. The search result ranker 126 can retrieve the website profile to re-order the ranks of the documents within a search result. - In some other embodiments,
operations 235 and 240 are replaced by a clustering operation in which user selected documents are clustered purely based on the fact that the same user clicks their associated links. Alternatively, the website profiler directly matches a document's URL against a known set of URLs associated with a particular category. In either case, the website profiler 129 does not need to access the documents' contents in order to generate the website profile. - In yet other embodiments,
operations 230 through 240 are replaced by a process that maps the queries submitted from a website to a set of categories. The categorization of queries can be based on the terms in the queries themselves, or by accessing the profiles of the top N search results (e.g., the top 5, 10, 15 or 20 search results), merging those document profiles to produce a query profile for each query, and merging the query profiles, weighted in accordance with their frequency of submission by the users of the website's search box(es), to generate a website profile. As discussed below with reference to FIG. 4, this process may exclude queries that are deemed to be unlikely to be related to the primary interests of the website's users. - As noted above, a website profile is updated from time to time in order to keep track of the current interests of the users visiting the website (245). In some embodiments, a website profile is updated at a predetermined time interval (e.g., every week or every day). In some other embodiments, a website profile is updated whenever the number of new search queries at the website reaches a threshold value since a last (i.e., most recent) update. Whenever it is time to update the website profile, the
website profiler 129 repeats the aforementioned process to update the website profile. - In some embodiments, different websites attract substantially different magnitudes of traffic and therefore should be treated differently in terms of profile updating. For instance, a popular website may receive tens of thousands of hits per day while a less popular website may have a much lower hit rate. The
search history database 127 may allocate different amounts of storage space to different websites. As a result, the volume of search history associated with the popular website does not exhaust its designated space, and the less popular website does not waste too much space, before the next scheduled profile update. - Some websites are so popular that it is impractical to store in the
search history database 127 all the search history for the purpose of profile updating. For example, an on-line bookstore may have a very large number of visitors when a new bestseller is released. There are two issues with a website having significant traffic during a short time period. First, the website's profile may be biased by this traffic peak. Special care may be required to make sure that the website profile has an appropriate balance between the short-term and long-term interests of the website users. Second, the search history database 127 may not have the space to store all the search history. One approach to solving this issue is to intentionally ignore some of the search queries, search results and user activities. This may be accomplished by sampling the search queries, search results and/or user activities so as to produce an unbiased sample of the search history. While the extent of the sampling may vary from one embodiment to another, experiments suggest that a search history encompassing several months of user activities will have sufficient data to generate a reliable website profile, for most websites, so long as (A) the sampling is done in a manner that avoids significant biases, and (B) it includes user activity data corresponding to a few weeks of representative search history. - Alternatively, the space shortage issue can be solved by generating a series of incremental website profiles for different portions of the search history and merging the incremental website profiles into the website profile. As shown in
FIG. 3, the website profiler 129 first generates an incremental profile 311 for the search history section 301. Each search history section [...] FIG. 2. The incremental profile 311 is equivalent to the search history section 301 in terms of characterizing the interests of the website users. Once the incremental profile 311 has been created, the corresponding search history section 301 in the database can be overwritten by new entries entering the database. Similarly, the search history section 303 can be overwritten after the incremental profile 313 is generated. After the creation of the incremental profile 315, the website profiler 129 can create the new website profile 337 by merging the incremental profiles with the old website profile 331. In sum, the website profiler 129 is able to take into account the entire search history by creating incremental website profiles for the search history sections and then merging the incremental profiles. - A website profile is used for “personalizing” or “flavoring” search results responsive to search queries submitted from a specific website. An underlying assumption in the present specification is that these search queries are, more or less, related to the topics covered by the website. For example, to a golfing website, the search query “Tiger Woods” is reasonably relevant while the search query “Britney Spears” is probably not relevant at all. But it is quite possible for a user to enter a very popular term like “Britney Spears” into the search box on the golfing website. This is especially true if the search box can be used to search the entire Internet. If not carefully filtered out, the search history associated with these popular, but irrelevant, terms may seriously “contaminate” the website profile and twist the search results in an unexpected direction. Another source of contamination of the website profile is query terms that, although relevant, have very low popularity. 
Special treatment may be necessary to make sure that user activities with respect to very low popularity query terms do not significantly bias the search results.
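The incremental-profile approach described above in connection with FIG. 3 can be illustrated with a short sketch. The specification does not say how the incremental profiles and the old website profile are combined, so everything here is an assumption: profiles are modeled as term-to-weight dictionaries, merging is a plain weighted sum followed by normalization, and `old_weight` is a hypothetical damping factor that lets older history count for less.

```python
from collections import defaultdict


def merge_profiles(old_profile, incremental_profiles, old_weight=0.5):
    """Merge incremental profiles into an old website profile.

    Profiles are dicts mapping an item (term, category or link) to a weight.
    `old_weight` is a hypothetical damping factor, not taken from the source.
    """
    merged = defaultdict(float)
    # The old profile contributes with a reduced weight.
    for item, weight in old_profile.items():
        merged[item] += old_weight * weight
    # Each incremental profile stands in for a search history section
    # that may already have been overwritten in the database.
    for profile in incremental_profiles:
        for item, weight in profile.items():
            merged[item] += weight
    # Normalize so that the absolute weights sum to 1.
    total = sum(abs(w) for w in merged.values())
    if total == 0:
        return {}
    return {item: w / total for item, w in merged.items()}
```

Because only the compact incremental profiles are retained, the search history sections themselves can be discarded once each profile has been generated, which is the storage saving the passage describes.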
-
FIG. 4 is an exemplary curve 400 characterizing the popularity distribution of search queries submitted from a website. All the search queries are divided into three categories by two thresholds. The leftmost category 410 includes those search queries that are “abnormally” popular, but less relevant, to the website. The search query “Britney Spears” being submitted from a golfing website's search box is an example of a search query in this category. The website profiler 129 should eliminate or at least reduce the influence of the search history associated with these queries on the website profile by giving them relatively low weights. The middle category 420 includes those search queries that are reasonably popular and relevant to the website. The search history corresponding to these search queries should be granted higher weights to make a major contribution to the website profile. Finally, the rightmost category 430 includes those queries that only appear in the website's search box occasionally. They should be treated in a manner similar to the queries in the leftmost category 410. - There are multiple factors determining the contribution of a search query (or a corresponding search result) in the
middle category 420 to the website profile. For example, the popularity of the search query and the amount of user activities on the search result affect the contribution of the search query and the search result on the website profile. Time is another important factor. In some embodiments, recent search history plays a more prominent role than less recent search history in the formation of the website profile. One skilled in the art can easily apply similar principles to other aspects of the search history associated with the website. -
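The popularity bands of FIG. 4 and the preference for recent history can be combined into a single per-query weight. The following is a minimal sketch under stated assumptions: the threshold values, the reduced weight for out-of-band queries, and the exponential half-life are all hypothetical parameters chosen for illustration, and popularity is modeled as the query's share of all queries submitted from the website.

```python
def query_weight(popularity, age_days, low=0.001, high=0.05,
                 half_life_days=30.0, outlier_weight=0.1):
    """Weight of one search query's history in the website profile.

    popularity: fraction of the website's queries matching this query.
    age_days:   how long ago the query was submitted.
    All default values are illustrative assumptions.
    """
    # Queries in the middle band (category 420) get full weight; abnormally
    # popular (410) and very rare (430) queries get a reduced weight.
    band = 1.0 if low <= popularity <= high else outlier_weight
    # Recent history counts more: the weight halves every half_life_days.
    recency = 0.5 ** (age_days / half_life_days)
    return band * recency
```

Setting `outlier_weight` to 0 would eliminate out-of-band queries entirely rather than merely reducing their influence; both options are mentioned in the description.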
FIG. 5 is a block diagram illustrating how the process of creating a website profile is divided into multiple sub-processes in accordance with some embodiments of the present invention. As noted above, it is a non-trivial process to create a profile 530 for a website using its search history. The search history involves different types of information from different sources, such as the search queries 501 submitted by users from the website, the search results 503 generated by the search engine in response to the search queries, and the user activities 505 on the search results. In some embodiments, this process is further divided into multiple sub-processes. Each sub-process produces a specific type of website profile characterizing the interests of the website users from a particular perspective. They are: -
- a category-based profile 531: this profile correlates the search history with a set of predefined categories, which may be organized in a hierarchical fashion, with each category being given a weight indicating the relevance of the category to the interests of the website users;
- a term-based profile 533: this profile abstracts the search history with a plurality of terms, wherein each term is given a weight indicating the relevance of the term to the interests of the website users; and
- a link-based profile 535: this profile identifies a plurality of links that are directly or indirectly related to the search history, with each link being given a weight indicating the relevance of the link to the interests of the website users.
- In some embodiments, the
website profile 530 includes only a subset of the profiles 531, 533 and 535. For example, the website profile 530 may include the term-based profile 533 and the category-based profile 531, but not the link-based profile 535. In some embodiments, the website profile 530 includes a plurality of profiles, at least one of which is a combination of two or more of the aforementioned profiles. - The category-based
profile 531 may be constructed, for instance, by mapping search history items (e.g., search queries, content terms, and/or user-selected documents) to categories, and then aggregating the resulting sets of the categories and weighting the categories. The categories may be weighted based on their frequency of occurrence in the search history items. In addition, the categories may be weighted based on the relevance of the search history items to the categories. The search history items accumulated over a period of time may be treated as a group for mapping into weighted categories. Other suitable ways of mapping the search history into weighted categories may also be used. -
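The construction just described (map items to categories, aggregate, then weight) can be sketched as follows. This is only one of the "suitable ways" the passage allows, and it rests on assumptions: `item_to_categories` is a hypothetical lookup supplied by the caller, each mapping carries a relevance value between 0 and 1, and the final weights are normalized to sum to 1.

```python
from collections import Counter


def build_category_profile(history_items, item_to_categories):
    """Aggregate search history items into weighted categories.

    history_items:      list of items (queries, terms or document ids);
                        repeated items count once per occurrence.
    item_to_categories: hypothetical mapping from an item to a list of
                        (category_id, relevance) pairs.
    """
    totals = Counter()
    for item in history_items:
        # Items with no known category are simply skipped.
        for category_id, relevance in item_to_categories.get(item, []):
            # Frequency and relevance both raise a category's weight.
            totals[category_id] += relevance
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {cat: value / grand_total for cat, value in totals.items()}
```

For example, a history in which two thirds of the mapped items fall under one category yields a weight of about 0.67 for that category and about 0.33 for the remainder.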
FIG. 6A illustrates a hierarchical category map 600 according to the Open Directory Project (http://dmoz.org/). Starting from the root level of map 600, documents are organized under several major topics, such as “Art”, “News”, “Sports”, etc. These major topics are often too broad to delineate the specific interest of a website user. They are further divided into multiple more specific sub-topics. For example, the topic “Art” may comprise sub-topics such as “Movie”, “Music”, and “Literature”, and the sub-topic “Music” may further comprise sub-sub-topics such as “Lyrics”, “News”, and “Reviews.” Note that each topic (or sub-topic) is associated with a unique category identifier like 1.1 for “Art”, 1.4.2.3 for “Talk Show”, and 1.6.1 for “Basketball.” - The categories shown in
FIG. 6A are only for illustrative purposes. One skilled in the art will appreciate that there are many other ways of categorizing documents. For example, different concepts can be extracted from the contents of the documents and different categories of relevant information are grouped in accordance with these concepts. The interests of users of a particular website may be associated with multiple categories at different levels, each having a weight indicative of the category's relevance to the users' interest. The categories and their associated weights can be determined from analyzing the search history associated with the website. -
FIG. 6B is a block diagram of an exemplary data structure, a category-based website profile table 650, which may be used for storing category-based website profiles in accordance with some embodiments of the present invention. The category-based profile table 650 includes a table 640 having a plurality of records 642, each record including a WEBSITE_ID, a FLAVOR_ID and a pointer pointing to another data structure, such as table 660-1. A website may have one or more flavors to better serve different user groups. For example, the website “WEBSITE_1” has at least two different flavors, “FLAVOR_1” and “FLAVOR_2.” These two different “flavors” may correspond to different search boxes on different webpages. In other words, the introduction of different flavors for a website refines the interests of the website users. This is particularly useful for a popular website serving a broad spectrum of customers. Table 660-1 includes two columns, CATEGORY_ID and WEIGHT. The CATEGORY_ID column contains a category's identifier as shown in FIG. 6A, and the value in the WEIGHT column indicates the relevance of the category to the interests of the website users. - In some embodiments, the search history items are automatically classified into different clusters. Clusters are usually more dynamic than categories. As noted above, categories are typically pre-generated. Search history items associated with different websites are classified against the same set of categories. In contrast, there may not be a predefined set of clusters for a particular website. The search history items associated with the website fall into an automatically generated set of clusters. Therefore, clusters may be better tailored to characterize the interests and preferences of the users of the website. For convenience, many discussions of the present invention use categories as an example. 
But it will be clear to one skilled in the art that the underlying algorithms are also applicable to clusters with no or little adjustment.
- The website profile based upon the
category map 600 is a topic-oriented implementation. The items in a category-based profile can also be organized in other ways. In one embodiment, the interests of the website users can be categorized based on the formats of the documents identified by the website users, such as HTML, plain text, PDF, Microsoft Word, etc. Different formats may have different weights. In another embodiment, the interests of the website users can be categorized according to the types of the identified documents, e.g., an organization's homepage, a person's homepage, a research paper, or a news group posting, each type having an associated weight. Documents can also be categorized by document origin, for instance the country associated with each document's host. In yet another embodiment, two or more of the above-identified category-based profiles may co-exist, with each one reflecting a respective aspect of the interests of the website users. -
FIG. 7 is a block diagram of an exemplary data structure, a term-based profile table 700, which may be used for storing term-based website profiles in accordance with some embodiments of the present invention. The table 700 includes a plurality of records 710, each record corresponding to a website's term-based profile. A term-based profile record 710 includes a plurality of columns including a WEBSITE_ID column 720 and multiple columns of (TERM, WEIGHT) pairs 740. The WEBSITE_ID column stores a website identifier. Each (TERM, WEIGHT) pair 740 includes a term of typically one to three words that is deemed relevant to the interests of the website users and a weight associated with the term indicating the relevance of the term. The weight of a term is not necessarily a positive value. A negative weight suggests that the website users disfavor documents including this term in the search results. - Besides term-based and category-based profiles, another type of website profile is referred to as a link-based profile. As discussed above, the page rank of a document is based on the link structure that connects the document to other documents on the Internet. A document having more links pointing to it is often assigned a higher page rank and is therefore deemed more popular by the search engine. Link information of documents selected by a website's users can be used to infer the interests of the website's users. In one embodiment, a list of preferred URLs is identified for the website users by analyzing the click rate of these URLs. Each preferred URL may be further weighted according to the mouse hovering time by the website users at the URL. In another embodiment, a list of preferred web hosts is identified for the website users by analyzing the users' visit rate at different web hosts. When two or more preferred URLs are related to the same web host, the weights of the two or more URLs may be combined as the weight of the web host.
-
FIG. 8 is a block diagram of an exemplary data structure that may be used for storing link-based website profiles in accordance with some embodiments of the present invention. The link-based profile table 800 includes a table 810 that includes a plurality of records 820, each record including a WEBSITE_ID and a pointer pointing to another data structure, such as table 810-1. Table 810-1 may include two columns, LINK_ID 830 and WEIGHT 840. The LINK_ID 830 may be associated with a preferred URL or host. The actual URL/host may be stored in the table instead of the LINK_ID; however, it is preferable to store the LINK_ID to save storage space. - A preferred list of URLs and/or hosts includes URLs and/or hosts that have been directly identified by the website users. The preferred list of URLs and/or hosts may further extend to URLs and/or hosts indirectly identified by using methods such as collaborative filtering or bibliometric analysis, which are known to one of ordinary skill in the art. In one embodiment, the indirectly identified URLs and/or hosts include URLs or hosts that have links to/from the directly identified URLs and/or hosts. These indirectly identified URLs and/or hosts are weighted by the distance between them and the directly identified URLs or hosts. For example, when a directly identified URL or host has a weight of 1, URLs or hosts that are one link away may have a weight of 0.5, URLs or hosts that are two links away may have a weight of 0.25, etc. This procedure can be further refined by reducing the weight of links that are not related to the topic of the original URL or host, e.g., links to copyright pages or web browser software that can be used to view the documents associated with the user-selected URL or host. Irrelevant links can be identified based on their context or their distribution. 
For example, copyright links often use specific terms (e.g., “copyright” and “All rights reserved” are commonly used terms in the anchor text of a copyright link); and links to a website from many unrelated websites may suggest that this website is not topically related (e.g., links to the Internet Explorer website are often included in unrelated websites). The indirect links can also be classified according to a set of topics and links with very different topics may be excluded or be assigned a low weight.
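The distance-based weighting of indirectly identified URLs or hosts (1, then 0.5, then 0.25) can be sketched as a breadth-first traversal of a link graph. This is an illustration under assumptions, not the patented procedure: `links` is a hypothetical adjacency mapping supplied by the caller (who would list both inbound and outbound links to cover the "to/from" case), `max_distance` is an arbitrary cut-off, and when a host is reachable from several directly identified hosts the largest candidate weight is kept. Filtering of off-topic links such as copyright pages is not shown.

```python
from collections import deque


def propagate_link_weights(direct_weights, links, decay=0.5, max_distance=2):
    """Extend a link-based profile to indirectly identified hosts.

    direct_weights: host -> weight for hosts chosen directly by users.
    links:          hypothetical adjacency map, host -> iterable of hosts.
    """
    weights = dict(direct_weights)
    for seed, seed_weight in direct_weights.items():
        seen = {seed}
        queue = deque([(seed, 0)])
        while queue:
            host, distance = queue.popleft()
            if distance == max_distance:
                continue  # do not expand beyond the cut-off
            for neighbor in links.get(host, ()):
                if neighbor in seen:
                    continue
                seen.add(neighbor)
                # Weight halves (by default) with every additional link.
                candidate = seed_weight * decay ** (distance + 1)
                if neighbor not in direct_weights:
                    weights[neighbor] = max(weights.get(neighbor, 0.0),
                                            candidate)
                queue.append((neighbor, distance + 1))
    return weights
```

Directly identified hosts keep their original weights, so a host that users select explicitly is never demoted merely because it is also reachable from another seed.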
- The three types of website profiles discussed above are generally complementary to one another since different profiles characterize the interests of the website users from different vantage points. However, this does not mean that one type of website profile, e.g., the category-based profile, is incapable of playing a role that is typically played by another type of website profile. By way of example, a preferred URL or host in a link-based profile is often associated with a specific topic, e.g., finance.yahoo.com is a URL focusing on financial news. Therefore, what is achieved by a link-based profile that comprises a list of preferred URLs or hosts may also be achievable, at least in part, by a category-based profile that has a set of categories that cover the same topics covered by preferred URLs or hosts.
-
FIG. 9 is a flow diagram of a process for generating website-dependent search results using the various types of website profiles in accordance with some embodiments of the present invention. Initially, the search engine 122 receives a search query from a website 102 submitted by a user through a client 103 (910). In response, the search engine 122 may optionally generate a query strategy (915). For example, the search query is normalized so as to be in proper form for further processing, and/or the search query may be modified in accordance with predefined criteria so as to automatically broaden or narrow the scope of the search query. Next, the search engine 122 submits the search query (or the query strategy, if one is generated) to the content database 124. The content database 124 identifies a set of documents that match the search query (920), each document having a generic ranking score that depends on the document's page rank and the search query. All three operations (910, 915 and 920) are typically conducted by the search engine 122. - In some embodiments, the requesting website's identifier is embedded in the search query. Based on the website identifier, the
search result ranker 126 identifies the website's profile in the website profile database 128 (925). Next, the search result ranker 126 analyzes each identified document to determine one or more boost factors using the website profile (935) and then assigns the document a website-dependent ranking score using the document's generic ranking score and the boost factors (940). The search result ranker 126 iterates the process for every identified document (942). Finally, the search result ranker 126 re-orders the list of documents according to their website-dependent ranking scores (945) and sends a search result including links to the list of documents to the requesting client 103. - In some embodiments, the analysis of an identified document at 935 includes determining a correlation between the document's content and the website's profile. Furthermore, in some embodiments, this operation includes accessing a previously computed document profile for the document and then determining a correlation between the document profile and the website's profile. In some embodiments, determining the correlation includes one or more operations that are “dot product” computations, which determine the extent of overlap, if any, between the document profile and the website's profile.
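The “dot product” correlation between a document profile and a website profile can be sketched as below. The sparse dictionary representation and the function name are assumptions for illustration; only items present in both profiles contribute, which is the "extent of overlap" the passage refers to.

```python
def profile_correlation(document_profile, website_profile):
    """Sparse dot product of two item -> weight profiles."""
    # Iterate over the smaller profile to keep the loop short.
    if len(document_profile) > len(website_profile):
        document_profile, website_profile = website_profile, document_profile
    return sum(weight * website_profile[item]
               for item, weight in document_profile.items()
               if item in website_profile)
```

Because term weights may be negative, a document that emphasizes a disfavored term lowers the correlation; a document sharing no items with the website profile scores exactly zero.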
-
FIG. 10 is a block diagram of exemplary data structures that may be used for storing category-based, term-based, and link-based boost factors for documents in the search results in accordance with some embodiments of the present invention. For each candidate document, each identified by a respective DOC_ID, category-based document information table 1010 includes a plurality of identified categories and associated weights, term-based document information table 1030 includes multiple pairs of relevant terms and associated weights, and link-based document information table 1050 includes a set of links and corresponding weights. - The rightmost column of each of the three tables (1010, 1030 and 1050) stores the boost factor (i.e., a computed score) of a document when the document is evaluated using one specific type of website profile. A document's boost factor can be determined by combining the weights of the items associated with the document. For instance, a category-based or term-based boost factor may be computed as follows. The users of a website may favor documents related to science with a weight of 0.6, and disfavor documents related to business with a weight of −0.2. Thus, when a science document matches a search query, it will be boosted over a business document. In general, the document topic classification may not be exclusive. A candidate document may be classified as being a science document with probability of 0.8 and a business document with probability of 0.4. A link-based boost factor may be computed based on the relative weights allocated to the preferred URLs or hosts in the link-based profile. In one embodiment, the term-based profile rank can be determined using known techniques, such as the term frequency-inverse document frequency (TF-IDF). The term frequency of a term is a function of the number of times the term appears in a document. 
The inverse document frequency is an inverse function of the number of documents in which the term appears within a collection of documents. For example, very common terms like “word” occur in many documents and consequently are assigned a relatively low inverse document frequency, while less common terms like “photograph” and “microprocessor” are assigned a relatively high inverse document frequency.
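As a concrete illustration of TF-IDF, the sketch below scores the terms of one document against a small collection. TF-IDF has many variants and the specification does not pick one; this version assumes a length-normalized term frequency and a smoothed logarithmic inverse document frequency, and the function name and corpus layout (a list of token lists) are hypothetical.

```python
import math
from collections import Counter


def tf_idf(document_terms, corpus):
    """Return term -> TF-IDF score for one tokenized document.

    document_terms: list of tokens in the document being scored.
    corpus:         list of tokenized documents (lists of tokens).
    """
    counts = Counter(document_terms)
    n_docs = len(corpus)
    scores = {}
    for term, count in counts.items():
        # Term frequency: share of the document's tokens that are `term`.
        tf = count / len(document_terms)
        # Document frequency: number of documents containing `term`.
        df = sum(1 for doc in corpus if term in doc)
        # Smoothed IDF (an assumed variant): rarer terms score higher.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = tf * idf
    return scores
```

In a collection where “word” appears in every document and “photograph” in only one, the two terms have equal term frequency in a document that uses each once, but “photograph” receives the higher score because of its larger inverse document frequency.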
- In some embodiments, when a search engine generates a search result in response to a search query, a candidate document D that satisfies the search query is assigned a query score, QueryScore, in accordance with the search query. This query score is then modulated by document D's page rank, PageRank, to generate a generic ranking score, GenericScore, that is expressed as
-
GenericScore=QueryScore*PageRank. - This generic ranking score may not appropriately reflect document D's relevance to a particular website's users if the users' interest is dramatically different from that of a random user of the search engine. The relevance of document D to the website users can be accurately characterized by a set of boost factors, based on the correlation between document D's content and the website's term-based profile, herein called the TermBoostFactor, the correlation between one or more categories associated with document D and the website's category-based profile, herein called the CategoryBoostFactor, and the correlation between the URL and/or host of document D and the website's link-based profile, herein called the LinkBoostFactor. Therefore, document D may be assigned a website-dependent ranking score that is a function of both the document's generic ranking score and the various website profile-based boost factors. In one embodiment, this website-dependent ranking score can be expressed as:
-
WebsiteScore=GenericScore*(TermBoostFactor+CategoryBoostFactor+LinkBoostFactor). - In another embodiment, in which the website profile is a single profile, the website-dependent ranking score can be expressed as:
-
WebsiteScore=GenericScore*BoostFactor - where the “BoostFactor” is based on the correlation between document D's content and the website's profile.
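The scoring formulas above can be put together in a short sketch. The two score formulas follow the text; the rest is assumed for illustration: the category boost is computed as a probability-weighted sum of profile weights (one plausible way of "combining the weights of the items associated with the document"), and the function names and candidate layout are hypothetical.

```python
def category_boost(category_profile, doc_category_probs):
    """Probability-weighted sum of category weights (an assumed combination)."""
    return sum(category_profile.get(category, 0.0) * probability
               for category, probability in doc_category_probs.items())


def website_score(query_score, page_rank, term_boost=0.0,
                  category_boost_factor=0.0, link_boost=0.0):
    """WebsiteScore = GenericScore * (sum of the three boost factors)."""
    generic_score = query_score * page_rank  # GenericScore
    return generic_score * (term_boost + category_boost_factor + link_boost)


def rerank(candidates):
    """Order (doc_id, score_inputs) pairs by descending website score."""
    scored = [(website_score(**inputs), doc_id)
              for doc_id, inputs in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]
```

Using the earlier example, a website profile that weights science at 0.6 and business at −0.2, applied to a document classified as science with probability 0.8 and business with probability 0.4, gives a category boost of 0.6 × 0.8 + (−0.2) × 0.4, or about 0.40. Note that with this purely multiplicative form a document whose boost factors sum to zero receives a website score of zero regardless of its generic score.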
-
FIG. 11 is a flow diagram of another process for generating website-dependent search results using website profiles in accordance with some embodiments of the present invention. Unlike the embodiment discussed above in connection with FIG. 9, the generic query strategy is modulated by the website's profile to create a website-dependent query strategy (1125, 1165). For example, relevant terms from the website profile may be added to the search query with associated weights. In various embodiments, the website-dependent query strategy is created by the search engine 122, the front end server 120, or the search result ranker 126, respectively. In some other embodiments, the requesting website 102 has a copy of its profile generated by the website profiler 129 and the website-dependent query strategy is created by the requesting website 102. Next, the search engine 122 searches the content database 124 using the website-dependent query strategy (1170). As a result, the documents identified by the content database 124 are implicitly ordered by their associated website-dependent ranking score (1175). - Some embodiments include a computer program product for use in conjunction with a computer system associated with a search engine. The computer program product may comprise a computer readable storage medium and a computer program mechanism embedded therein. In some embodiments, the computer program mechanism includes instructions for receiving from a website distinct from the search engine multiple search queries submitted by users; instructions for providing to the requesting users search results responsive to the search queries; instructions for monitoring activities of the users on the search results; and instructions for generating a profile for a website using the search queries from the website and the user activities on the search results.
- Referring to
FIG. 12, an exemplary information server 1200 typically includes one or more processing units (CPUs) 1202, one or more network or other communications interfaces 1210, memory 1212, and one or more communication buses 1214 for interconnecting these components. The communication buses 1214 may include circuitry (sometimes called a chipset) that interconnects and controls communications between system components. The system 1200 may optionally include a user interface, for instance a display and a keyboard. Memory 1212 may include high speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices. Memory 1212 may include mass storage that is remotely located from the CPUs 1202. In some embodiments, memory 1212 stores the following programs, modules and data structures, or a subset or superset thereof: -
- an operating system 1216 that includes procedures for handling various basic system services and for performing hardware dependent tasks;
- a network communication module 1218 that is used for connecting the information server 1200 to other servers or computers via one or more communication networks (wired or wireless), such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
- a system initialization module 1220 that initializes other modules and data structures stored in memory 1212 required for the appropriate operation of the information server 1200;
- a search engine 122 for processing a search query, identifying and ordering a search result according to the search query;
- a content database 124 for storing a plurality of indexed documents retrieved from the Internet;
- a website profiler 129 for processing search history associated with a website and creating and updating one or more profiles that characterize the interests of the website users;
- a search history database 127 for storing search histories associated with different websites including search queries, search results and user activities;
- a website profile database 128 for storing website profiles associated with different websites on the Internet;
- a document profiler 125 for analyzing a document's content and context and creating a profile for the document;
- a document profile database 123 for storing document profiles associated with different documents stored in the content database 124; and
- a search result ranker 126 for generating a website-dependent ranking score for each document identified by the search engine 122 using a website profile and re-ordering the documents in a search result in accordance with their website-dependent ranking scores.
- In some embodiments, the
information server 106 may not have access to all the search history associated with a website. For example, there may be an agreement between a website 102 and the information server 106 with respect to the search queries submitted from the website 102. According to the agreement, when a user visiting the website 102 submits a search query to the information server 106, the information server 106 is required to send the corresponding search result to the website 102 rather than the requesting user at a client 103. The website 102 may modify the search result, e.g., attaching advertisements or other information to the search result, and then serve the modified search result to the requesting user at the client 103. - In this scenario, the
information server 106 may have no information identifying the requesting user and the client 103, and may also be unable to monitor the user's activities on the search result. For example, the information server 106 may not receive any information identifying the document links in the search result that have been clicked by the user. Similarly, the information server 106 may not receive any information identifying the document links over which the user moves his or her mouse and the corresponding mouse hovering time. In other words, the information server 106 has very limited or no exposure to the activities of the website users on the search results. Therefore, the information server 106 has to rely on the user activities on the search results from other venues to generate the website profile. - In some embodiments, by examining the search queries submitted from different websites, the
information server 106 may identify another website similar to the website in question. Two websites are deemed similar if a predefined number or percentage of search queries submitted from the two websites is identical. It is also reasonable to infer that users of the two similar websites may have similar interests and therefore the user activities associated with one website are a reasonable proxy of the user activities associated with the other one. If the information server 106 can access the user activities associated with one of the two websites (e.g., there is no agreement to deliver the search results to the website), the information server 106 can use the same user activities to create the profile for the other website. - When there is no other website similar to the website in question, the
information server 106 may utilize monitored user activities associated with search queries submitted directly to the search engine (e.g., search queries submitted using a toolbar search box or a webpage associated with the information server 106) as the proxy of a particular website. However, the only search queries for which such “general user population” information will be used are queries that were submitted from the website in question. For instance, the search query “golf courses in mountain view” may be submitted both to a golf-focused website, and to a general purpose search engine. Profile information developed from general user population clicks on the search result of this search query (as well as general user population clicks on the search results of other search queries submitted both from the website in question and from other users of the search engine) is used to generate the profile for a respective website by combining or aggregating the general user statistical information for the queries received from the respective website. The website profile obtained in this way will typically differ significantly from a group profile of the entire user community of the search engine, and therefore the website profile generated in this way will be a reasonable approximation of the website profile that would be generated if user activity information were available for search results returned by the search engine in response to search queries submitted from the website. - In some embodiments, the website profiles can also be used to select advertisements for search queries submitted from different websites. Different advertisements are treated in a way similar to different documents. For example, an advertisement may have a set of key terms. A correlation of this set of key terms with a term-based profile (or a category-based profile, or both) associated with a website produces a boost factor for the advertisement. 
This boost factor may be used to promote or demote the particular advertisement in response to a search query submitted from the website. For example, when the
information server 106 receives a search query “world cup 2006” from a website or webpage devoted to soccer news, it may promote those advertisements covering soccer gear, ticket sales for the 2006 FIFA World Cup Germany, hotel reservations at the German cities hosting the soccer games, etc. - The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated.
Claims (32)
1. A computer-implemented method, comprising:
at a server having memory and one or more processors:
receiving from a website search queries submitted by users of the website;
providing to the users search results responsive to the search queries;
processing activities of the users on the search results;
generating a website profile for the website using the processed activities on the search results by the users; and
modifying, based on the generated website profile, rankings of search results provided in response to new search queries from the website.
2. The computer-implemented method of claim 1 , wherein the user activities include user selections of the search results and mouse hovering time on the search results.
3. The computer-implemented method of claim 1 , wherein a subset of the search queries is used for generating the website profile by sub-sampling the search queries during a predefined time period.
4. The computer-implemented method of claim 1 , wherein a subset of the search queries is used for generating the website profile by choosing search queries having a predefined range of occurrence frequency.
5. The computer-implemented method of claim 1 , wherein temporally recent search queries and user activities on the search results responsive thereto are given more weight than temporally remote search queries and user activities on the search results responsive thereto during the generation of the website profile.
6. The computer-implemented method of claim 1 , wherein the modifying further includes:
receiving from the website a respective new search query submitted by a user at a client;
identifying a plurality of information items associated with the new search query;
ranking the information items in accordance with the website profile; and
providing the ranked information items for display to the requesting user at the client.
7. The computer-implemented method of claim 6 , wherein the ranking of an information item further includes:
assigning a generic ranking score to the information item, wherein the generic ranking score is independent from the website profile;
generating a website-dependent ranking score by modifying the generic ranking score with a weighting factor that is determined, at least in part, by the website profile; and
determining a display order for the information item in accordance with the website-dependent ranking score.
8. The computer-implemented method of claim 6 , further including:
processing the user's activities on the ranked information items; and
updating the website profile using the processed user activities and the new search query.
9. The computer-implemented method of claim 8 , further including:
generating an incremental website profile using new search queries and new user activities collected during a predefined time period; and
merging the incremental website profile into the website profile to generate an updated website profile.
10. The computer-implemented method of claim 1 , further including:
for a respective webpage of the website:
identifying a group of search queries submitted by a set of users visiting the webpage from respective clients;
providing search results responsive to the group of search queries to the set of users at the respective clients;
processing activities of the set of users on the search results;
generating a webpage profile for the webpage using the processed user activities on the search results by the set of users; and
modifying, based on the generated webpage profile, rankings of search results provided in response to new search queries associated with the webpage.
11. The computer-implemented method of claim 10 , wherein the website profile includes multiple webpage profiles, each webpage profile being associated with at least one webpage of the website.
12. The computer-implemented method of claim 10 , wherein the modifying further includes:
receiving from the website a respective new search query submitted by a user at a client, wherein the new search query was submitted by the user when visiting the webpage;
identifying a plurality of information items associated with the new search query;
ranking the information items in accordance with the webpage profile; and
providing the ranked information items for display to the requesting user at the client.
13. The computer-implemented method of claim 12 , wherein the ranking of an information item further includes:
assigning a generic ranking score to the information item, wherein the generic ranking score is independent from the webpage profile;
generating a webpage-dependent ranking score by multiplying the generic ranking score by a weighting factor that is determined, at least in part, by the webpage profile; and
determining a display order for the information item in accordance with the webpage-dependent ranking score.
14. The computer-implemented method of claim 12 , further including:
processing the user's activities on the ranked information items; and
updating the webpage profile using the processed user activities and the new search query.
15. A computer system, comprising:
memory;
one or more processors; and
one or more programs, stored in the main memory and executed by the one or more processors, the one or more programs including instructions for:
receiving from a website search queries submitted by users of the website;
providing to the users search results responsive to the search queries;
processing activities of the users on the search results;
generating a website profile for the website using the processed activities on the search results by the users; and
modifying, based on the generated website profile, rankings of search results provided in response to new search queries from the website.
16. The computer system of claim 15 , wherein the user activities include user selections of the search results and mouse hovering time on the search results.
17. The computer system of claim 15 , wherein a subset of the search queries is used for generating the website profile by sub-sampling the search queries during a predefined time period.
18. The computer system of claim 15 , wherein a subset of the search queries is used for generating the website profile by choosing search queries having a predefined range of occurrence frequency.
19. The computer system of claim 15 , wherein temporally recent search queries and user activities on the search results responsive thereto are given more weight than temporally remote search queries and user activities on the search results responsive thereto during the generation of the website profile.
20. The computer system of claim 15 , wherein the instructions for modifying rankings of search results further include instructions for:
receiving from the website a respective new search query submitted by a user at a client;
identifying a plurality of information items associated with the new search query;
ranking the information items in accordance with the website profile; and
providing the ranked information items for display to the requesting user at the client.
21. The computer system of claim 20 , wherein the instructions for ranking an information item further include instructions for:
assigning a generic ranking score to the information item, wherein the generic ranking score is independent from the website profile;
generating a website-dependent ranking score by modifying the generic ranking score with a weighting factor that is determined, at least in part, by the website profile; and
determining a display order for the information item in accordance with the website-dependent ranking score.
22. The computer system of claim 20 , wherein the at least one program further includes instructions for:
processing the user's activities on the ranked information items; and
updating the website profile using the processed user activities and the new search query.
23. A non-transitory computer readable storage medium for use in conjunction with a computer system, the computer readable storage medium storing one or more programs for execution by the computer system, the one or more programs comprising instructions for:
receiving from a website search queries submitted by users of the website;
providing to the users search results responsive to the search queries;
processing activities of the users on the search results;
generating a website profile for the website using the processed activities on the search results by the users; and
modifying, based on the generated website profile, rankings of search results provided in response to new search queries from the website.
24. The non-transitory computer readable storage medium of claim 23 , wherein the user activities include user selections of the search results and mouse hovering time on the search results.
25. The non-transitory computer readable storage medium of claim 23 , wherein a subset of the search queries is used for generating the website profile by sub-sampling the search queries during a predefined time period.
26. The non-transitory computer readable storage medium of claim 23 , wherein a subset of the search queries is used for generating the website profile by choosing search queries having a predefined range of occurrence frequency.
27. The non-transitory computer readable storage medium of claim 23 , wherein temporally recent search queries and user activities on the search results responsive thereto are given more weight than temporally remote search queries and user activities on the search results responsive thereto during the generation of the website profile.
28. The non-transitory computer readable storage medium of claim 23 , wherein the instructions for modifying rankings of search results further include instructions for:
receiving from the website a respective new search query submitted by a user at a client;
identifying a plurality of information items associated with the new search query;
ranking the information items in accordance with the website profile; and
providing the ranked information items for display to the requesting user at the client.
29. The non-transitory computer readable storage medium of claim 28 , wherein the instruction for ranking an information item further includes instructions for:
assigning a generic ranking score to the information item, wherein the generic ranking score is independent from the website profile;
generating a website-dependent ranking score by modifying the generic ranking score with a weighting factor that is determined, at least in part, by the website profile; and
determining a display order for the information item in accordance with the website-dependent ranking score.
30. The non-transitory computer readable storage medium of claim 28 , wherein the at least one program further includes instructions for:
processing the user's activities on the ranked information items; and
updating the website profile using the processed user activities and the new search query.
31. A computer-implemented method, comprising:
at a server having memory and one or more processors:
receiving from a website search queries submitted by users of the website;
identifying search results responsive to the search queries;
identifying user activities on at least a subset of the search results at venues other than the website;
generating a website profile for the website using the search queries from the website and the identified user activities; and
modifying, based on the generated website profile, rankings of search results provided in response to new search queries from the website.
32. The computer-implemented method of claim 31 , wherein the modifying further includes:
ranking the search results in accordance with the website profile; and
providing the ranked search results to the users of the website.
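The profile-building and re-ranking steps recited in the claims above (weighting recent activity more heavily, merging an incremental profile, and modifying a generic ranking score with a profile-derived weighting factor) can be sketched as follows. This is an illustrative sketch only: the claims do not specify any formula, so the exponential half-life decay, the `retain` factor used when merging, the `1 + affinity` weighting factor, and every function and parameter name here are assumptions.

```python
from collections import defaultdict


def build_profile(events, now, half_life_days=30.0):
    """Build a category-based profile from (day, category, weight) events.

    Each event stands for a processed user activity on a search result (for
    example a click or a hover). Older events are discounted with an assumed
    exponential half-life so that recent activity counts more.
    """
    profile = defaultdict(float)
    for day, category, weight in events:
        age = now - day
        profile[category] += weight * 0.5 ** (age / half_life_days)
    return dict(profile)


def merge_profiles(base, incremental, retain=0.8):
    """Merge an incremental profile into an existing one.

    The existing weights are scaled by the assumed `retain` factor before the
    incremental weights are added.
    """
    merged = {cat: retain * w for cat, w in base.items()}
    for cat, w in incremental.items():
        merged[cat] = merged.get(cat, 0.0) + w
    return merged


def rerank(items, profile):
    """Re-rank (doc_id, generic_score, categories) tuples for a website.

    The website-dependent score is the generic score multiplied by a
    weighting factor of 1 + affinity, where affinity is the share of the
    profile's total weight that falls in the item's categories.
    """
    total = sum(profile.values()) or 1.0

    def site_score(item):
        _, generic, categories = item
        affinity = sum(profile.get(c, 0.0) for c in categories) / total
        return generic * (1.0 + affinity)

    return sorted(items, key=site_score, reverse=True)


# Hypothetical activity log for a golf-focused website (days as floats).
events = [(0, "golf", 1.0), (60, "golf", 1.0), (60, "travel", 1.0)]
profile = build_profile(events, now=60)

results = [("a", 1.0, ["finance"]), ("b", 0.9, ["golf"])]
ordered = rerank(results, profile)

updated = merge_profiles({"golf": 1.0}, {"golf": 0.5, "news": 0.2})
```

In this example the golf document has the lower generic score but is promoted above the finance document because the profile's weight is concentrated in the golf category.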
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/323,758 US20120089598A1 (en) | 2006-03-30 | 2011-12-12 | Generating Website Profiles Based on Queries from Websites and User Activities on the Search Results |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/394,620 US8078607B2 (en) | 2006-03-30 | 2006-03-30 | Generating website profiles based on queries from webistes and user activities on the search results |
US13/323,758 US20120089598A1 (en) | 2006-03-30 | 2011-12-12 | Generating Website Profiles Based on Queries from Websites and User Activities on the Search Results |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/394,620 Continuation US8078607B2 (en) | 2004-07-13 | 2006-03-30 | Generating website profiles based on queries from webistes and user activities on the search results |
Publications (1)
Publication Number | Publication Date |
---|---|
US20120089598A1 (en) | 2012-04-12 |
Family
ID=38335819
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/394,620 Active 2028-07-31 US8078607B2 (en) | 2004-07-13 | 2006-03-30 | Generating website profiles based on queries from webistes and user activities on the search results |
US11/675,057 Abandoned US20070233671A1 (en) | 2006-03-30 | 2007-02-14 | Group Customized Search |
US13/323,758 Abandoned US20120089598A1 (en) | 2006-03-30 | 2011-12-12 | Generating Website Profiles Based on Queries from Websites and User Activities on the Search Results |
Family Applications Before (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/394,620 Active 2028-07-31 US8078607B2 (en) | 2004-07-13 | 2006-03-30 | Generating website profiles based on queries from webistes and user activities on the search results |
US11/675,057 Abandoned US20070233671A1 (en) | 2006-03-30 | 2007-02-14 | Group Customized Search |
Country Status (4)
Country | Link |
---|---|
US (3) | US8078607B2 (en) |
EP (1) | EP2005339A2 (en) |
CN (1) | CN101454780B (en) |
WO (1) | WO2007115217A2 (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110035375A1 (en) * | 2009-08-06 | 2011-02-10 | Ron Bekkerman | Building user profiles for website personalization |
US20120131008A1 (en) * | 2010-11-23 | 2012-05-24 | Microsoft Corporation | Indentifying referring expressions for concepts |
US8364672B2 (en) | 2010-11-23 | 2013-01-29 | Microsoft Corporation | Concept disambiguation via search engine search results |
US20130332521A1 (en) * | 2012-06-07 | 2013-12-12 | United Video Properties, Inc. | Systems and methods for compiling media information based on privacy and reliability metrics |
US20140108373A1 (en) * | 2012-10-15 | 2014-04-17 | Wixpress Ltd | System for deep linking and search engine support for web sites integrating third party application and components |
US20140143222A1 (en) * | 2012-11-16 | 2014-05-22 | Google Inc. | Ranking signals for sparse corpora |
CN104462357A (en) * | 2014-12-08 | 2015-03-25 | 百度在线网络技术(北京)有限公司 | Method and device for realizing personalized search |
US20150121265A1 (en) * | 2010-03-10 | 2015-04-30 | Lockheed Martin Corporation | Systems and methods for facilitating open source intelligence gathering |
US20150242486A1 (en) * | 2014-02-25 | 2015-08-27 | International Business Machines Corporation | Discovering communities and expertise of users using semantic analysis of resource access logs |
US9646063B1 (en) * | 2007-05-25 | 2017-05-09 | Google Inc. | Sharing of profile information with content providers |
US10061817B1 (en) | 2015-07-29 | 2018-08-28 | Google Llc | Social ranking for apps |
US10147146B2 (en) * | 2012-03-14 | 2018-12-04 | Disney Enterprises, Inc. | Tailoring social elements of virtual environments |
US10698971B2 (en) * | 2016-08-03 | 2020-06-30 | Samsung Electronics Co., Ltd. | Method and apparatus for storing access log based on keyword |
Families Citing this family (266)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7152031B1 (en) * | 2000-02-25 | 2006-12-19 | Novell, Inc. | Construction, manipulation, and comparison of a multi-dimensional semantic space |
US7434219B2 (en) | 2000-01-31 | 2008-10-07 | Commvault Systems, Inc. | Storage of application specific profiles correlating to document versions |
US20090234718A1 (en) * | 2000-09-05 | 2009-09-17 | Novell, Inc. | Predictive service systems using emotion detection |
EP1442387A4 (en) | 2001-09-28 | 2008-01-23 | Commvault Systems Inc | System and method for archiving objects in an information store |
US7565630B1 (en) | 2004-06-15 | 2009-07-21 | Google Inc. | Customization of search results for search queries received from third party sites |
US9047388B2 (en) * | 2004-07-01 | 2015-06-02 | Mindjet Llc | System, method, and software application for displaying data from a web service in a visual map |
US20090228447A1 (en) * | 2004-07-01 | 2009-09-10 | Creekbaum William J | System, method, and solfware application for enabling a user to search an external domain within a visual mapping interface |
GB0506618D0 (en) * | 2005-04-01 | 2005-05-11 | Wine Science Ltd | A method of supplying information articles at a website and system for supplying such articles |
US20060252775A1 (en) * | 2005-05-03 | 2006-11-09 | Henderson Samuel T | Methods for reducing levels of disease associated proteins |
US7925649B2 (en) | 2005-12-30 | 2011-04-12 | Google Inc. | Method, system, and graphical user interface for alerting a computer user to new results for a prior search |
US7631263B2 (en) * | 2006-06-02 | 2009-12-08 | Scenera Technologies, Llc | Methods, systems, and computer program products for characterizing links to resources not activated |
US9443022B2 (en) | 2006-06-05 | 2016-09-13 | Google Inc. | Method, system, and graphical user interface for providing personalized recommendations of popular search queries |
US8103703B1 (en) | 2006-06-29 | 2012-01-24 | Mindjet Llc | System and method for providing content-specific topics in a mind mapping system |
US7577718B2 (en) * | 2006-07-31 | 2009-08-18 | Microsoft Corporation | Adaptive dissemination of personalized and contextually relevant information |
US7849079B2 (en) * | 2006-07-31 | 2010-12-07 | Microsoft Corporation | Temporal ranking of search results |
US7685199B2 (en) * | 2006-07-31 | 2010-03-23 | Microsoft Corporation | Presenting information related to topics extracted from event classes |
US8117197B1 (en) * | 2008-06-10 | 2012-02-14 | Surf Canyon, Inc. | Adaptive user interface for real-time search relevance feedback |
WO2008049955A1 (en) * | 2006-10-27 | 2008-05-02 | Cvon Innovations Ltd | Method and device for managing subscriber connection |
US20100274661A1 (en) * | 2006-11-01 | 2010-10-28 | Cvon Innovations Ltd | Optimization of advertising campaigns on mobile networks |
US8661029B1 (en) | 2006-11-02 | 2014-02-25 | Google Inc. | Modifying search result ranking based on implicit user feedback |
US7734669B2 (en) * | 2006-12-22 | 2010-06-08 | Commvault Systems, Inc. | Managing copies of data |
GB2440990B (en) | 2007-01-09 | 2008-08-06 | Cvon Innovations Ltd | Message scheduling system |
US9405830B2 (en) | 2007-02-28 | 2016-08-02 | Aol Inc. | Personalization techniques using image clouds |
US8892780B2 (en) | 2007-03-08 | 2014-11-18 | Oracle International Corporation | Management of shared storage I/O resources |
US20090077033A1 (en) * | 2007-04-03 | 2009-03-19 | Mcgary Faith | System and method for customized search engine and search result optimization |
US20100100536A1 (en) * | 2007-04-10 | 2010-04-22 | Robin Daniel Chamberlain | System and Method for Evaluating Network Content |
US20080270228A1 (en) * | 2007-04-24 | 2008-10-30 | Yahoo! Inc. | System for displaying advertisements associated with search results |
US9396261B2 (en) | 2007-04-25 | 2016-07-19 | Yahoo! Inc. | System for serving data that matches content related to a search results page |
US20080288310A1 (en) * | 2007-05-16 | 2008-11-20 | Cvon Innovation Services Oy | Methodologies and systems for mobile marketing and advertising |
US8935718B2 (en) * | 2007-05-22 | 2015-01-13 | Apple Inc. | Advertising management method and system |
US7818320B2 (en) * | 2007-05-31 | 2010-10-19 | Yahoo! Inc. | Enhanced search results based on user feedback relating to search result abstracts |
US20080313146A1 (en) * | 2007-06-15 | 2008-12-18 | Microsoft Corporation | Content search service, finding content, and prefetching for thin client |
US8099401B1 (en) * | 2007-07-18 | 2012-01-17 | Emc Corporation | Efficiently indexing and searching similar data |
KR101415022B1 (en) * | 2007-07-24 | 2014-07-09 | 삼성전자주식회사 | Method and apparatus for information recommendation using hybrid algorithm |
US8505046B2 (en) | 2007-08-17 | 2013-08-06 | At&T Intellectual Property I, L.P. | Targeted online, telephone and television advertisements based on cross-service subscriber profiling |
GB2452789A (en) | 2007-09-05 | 2009-03-18 | Cvon Innovations Ltd | Selecting information content for transmission by identifying a keyword in a previous message |
US20090089246A1 (en) * | 2007-09-28 | 2009-04-02 | Yahoo! Inc. | System and method for history clustering |
US8965888B2 (en) * | 2007-10-08 | 2015-02-24 | Sony Computer Entertainment America Llc | Evaluating appropriateness of content |
GB2453810A (en) | 2007-10-15 | 2009-04-22 | Cvon Innovations Ltd | System, Method and Computer Program for Modifying Communications by Insertion of a Targeted Media Content or Advertisement |
US8626823B2 (en) * | 2007-11-13 | 2014-01-07 | Google Inc. | Page ranking system employing user sharing data |
US20090132366A1 (en) * | 2007-11-15 | 2009-05-21 | Microsoft Corporation | Recognizing and crediting offline realization of online behavior |
US8464270B2 (en) * | 2007-11-29 | 2013-06-11 | Red Hat, Inc. | Dependency management with atomic decay |
US8832255B2 (en) | 2007-11-30 | 2014-09-09 | Red Hat, Inc. | Using status inquiry and status response messages to exchange management information |
US20090157616A1 (en) * | 2007-12-12 | 2009-06-18 | Richard Barber | System and method for enabling a user to search and retrieve individual topics in a visual mapping system |
US20090157801A1 (en) * | 2007-12-12 | 2009-06-18 | Richard Barber | System and method for integrating external system data in a visual mapping system |
US8161396B2 (en) * | 2007-12-20 | 2012-04-17 | Mindjet Llc | System and method for facilitating collaboration and communication in a visual mapping system by tracking user presence in individual topics |
US7984035B2 (en) * | 2007-12-28 | 2011-07-19 | Microsoft Corporation | Context-based document search |
US7797314B2 (en) * | 2007-12-31 | 2010-09-14 | International Business Machines Corporation | Adaptive searching |
US8775416B2 (en) * | 2008-01-09 | 2014-07-08 | Yahoo!Inc. | Adapting a context-independent relevance function for identifying relevant search results |
US8244721B2 (en) * | 2008-02-13 | 2012-08-14 | Microsoft Corporation | Using related users data to enhance web search |
US20090210391A1 (en) * | 2008-02-14 | 2009-08-20 | Hall Stephen G | Method and system for automated search for, and retrieval and distribution of, information |
EP2105846A1 (en) * | 2008-03-28 | 2009-09-30 | Sony Corporation | Method of recommending content items |
US8751481B2 (en) * | 2008-04-16 | 2014-06-10 | Iac Search & Media, Inc. | Adaptive multi-channel content selection with behavior-aware query analysis |
JP5089482B2 (en) * | 2008-05-12 | 2012-12-05 | キヤノン株式会社 | Information processing apparatus, data processing method, and program |
KR100987330B1 (en) * | 2008-05-21 | 2010-10-13 | 성균관대학교산학협력단 | A system and method generating multi-concept networks based on user's web usage data |
US8510262B2 (en) * | 2008-05-21 | 2013-08-13 | Microsoft Corporation | Promoting websites based on location |
US8769048B2 (en) | 2008-06-18 | 2014-07-01 | Commvault Systems, Inc. | Data protection scheduling, such as providing a flexible backup window in a data protection system |
US9128883B2 (en) | 2008-06-19 | 2015-09-08 | Commvault Systems, Inc | Data storage resource allocation by performing abbreviated resource checks based on relative chances of failure of the data storage resources to determine whether data storage requests would fail |
US8352954B2 (en) | 2008-06-19 | 2013-01-08 | Commvault Systems, Inc. | Data storage resource allocation by employing dynamic methods and blacklisting resource request pools |
US9183323B1 (en) | 2008-06-27 | 2015-11-10 | Google Inc. | Suggesting alternative query phrases in query results |
US8538958B2 (en) * | 2008-07-11 | 2013-09-17 | Satyam Computer Services Limited Of Mayfair Centre | System and method for context map generation |
US8301437B2 (en) * | 2008-07-24 | 2012-10-30 | Yahoo! Inc. | Tokenization platform |
US20100076786A1 (en) * | 2008-08-06 | 2010-03-25 | H.Lee Moffitt Cancer Center And Research Institute, Inc. | Computer System and Computer-Implemented Method for Providing Personalized Health Information for Multiple Patients and Caregivers |
WO2010022459A1 (en) * | 2008-08-27 | 2010-03-04 | Rob Chamberlain | System and/or method for linking network content |
US8725688B2 (en) | 2008-09-05 | 2014-05-13 | Commvault Systems, Inc. | Image level copy or restore, such as image level restore without knowledge of data object metadata |
US20100070474A1 (en) | 2008-09-12 | 2010-03-18 | Lad Kamleshkumar K | Transferring or migrating portions of data objects, such as block-level data migration or chunk-based data migration |
US20100070891A1 (en) * | 2008-09-18 | 2010-03-18 | Creekbaum William J | System and method for configuring an application via a visual map interface |
US9772798B2 (en) * | 2008-09-19 | 2017-09-26 | Oracle International Corporation | Method and system for implementing workload management by monitoring disk utilizations |
WO2010033877A1 (en) * | 2008-09-19 | 2010-03-25 | Oracle International Corporation | Storage-side storage request management |
US8868831B2 (en) | 2009-09-14 | 2014-10-21 | Oracle International Corporation | Caching data between a database server and a storage system |
US20100082434A1 (en) * | 2008-09-29 | 2010-04-01 | Yahoo! Inc. | Personalized search results to multiple people |
US9064021B2 (en) * | 2008-10-02 | 2015-06-23 | Liveramp, Inc. | Data source attribution system |
US9396455B2 (en) * | 2008-11-10 | 2016-07-19 | Mindjet Llc | System, method, and software application for enabling a user to view and interact with a visual map in an external application |
US10380634B2 (en) * | 2008-11-22 | 2019-08-13 | Callidus Software, Inc. | Intent inference of website visitors and sales leads package generation |
US8645837B2 (en) | 2008-11-26 | 2014-02-04 | Red Hat, Inc. | Graphical user interface for managing services in a distributed computing system |
US8713016B2 (en) | 2008-12-24 | 2014-04-29 | Comcast Interactive Media, Llc | Method and apparatus for organizing segments of media assets and determining relevance of segments to a query |
US11531668B2 (en) | 2008-12-29 | 2022-12-20 | Comcast Interactive Media, Llc | Merging of multiple data sets |
US8386475B2 (en) * | 2008-12-30 | 2013-02-26 | Novell, Inc. | Attribution analysis and correlation |
US8296297B2 (en) * | 2008-12-30 | 2012-10-23 | Novell, Inc. | Content analysis and correlation |
US8301622B2 (en) * | 2008-12-30 | 2012-10-30 | Novell, Inc. | Identity analysis and correlation |
EP2386088A1 (en) * | 2009-01-06 | 2011-11-16 | Tynt Multimedia Inc. | Systems and methods for detecting network resource interaction and improved search result reporting |
US8595228B1 (en) * | 2009-01-09 | 2013-11-26 | Google Inc. | Preferred sites |
US20100185612A1 (en) * | 2009-01-13 | 2010-07-22 | Hotchalk Inc. | Method for Producing an Ordered Search List |
US8352319B2 (en) * | 2009-03-10 | 2013-01-08 | Google Inc. | Generating user profiles |
US8176043B2 (en) * | 2009-03-12 | 2012-05-08 | Comcast Interactive Media, Llc | Ranking search results |
US20100250479A1 (en) * | 2009-03-31 | 2010-09-30 | Novell, Inc. | Intellectual property discovery and mapping systems and methods |
US8185544B2 (en) * | 2009-04-08 | 2012-05-22 | Google Inc. | Generating improved document classification data using historical search results |
US20120046995A1 (en) | 2009-04-29 | 2012-02-23 | Waldeck Technology, Llc | Anonymous crowd comparison |
US8122041B2 (en) * | 2009-05-08 | 2012-02-21 | Microsoft Corporation | Sharing and collaboration of search findings |
US20100318538A1 (en) * | 2009-06-12 | 2010-12-16 | Google Inc. | Predictive searching and associated cache management |
US20110153425A1 (en) * | 2009-06-21 | 2011-06-23 | James Mercs | Knowledge based search engine |
US20100331075A1 (en) * | 2009-06-26 | 2010-12-30 | Microsoft Corporation | Using game elements to motivate learning |
US8979538B2 (en) * | 2009-06-26 | 2015-03-17 | Microsoft Technology Licensing, Llc | Using game play elements to motivate learning |
US8392267B1 (en) | 2009-06-30 | 2013-03-05 | Mindjet Llc | System, method, and software application for dynamically generating a link to an online procurement site within a software application |
US8635255B2 (en) * | 2009-06-30 | 2014-01-21 | Verizon Patent And Licensing Inc. | Methods and systems for automatically customizing an interaction experience of a user with a media content application |
US9892730B2 (en) | 2009-07-01 | 2018-02-13 | Comcast Interactive Media, Llc | Generating topic-specific language models |
US8280869B1 (en) * | 2009-07-10 | 2012-10-02 | Teradata Us, Inc. | Sharing intermediate results |
US9201973B2 (en) * | 2009-07-10 | 2015-12-01 | Geodex Llc | Computerized system and method for tracking the geographic relevance of website listings and providing graphics and data regarding the same |
US8135735B2 (en) | 2009-07-10 | 2012-03-13 | Geodex, Llc | Computerized system and method for tracking the geographic relevance of website listings and providing graphics and data regarding the same |
US9213776B1 (en) * | 2009-07-17 | 2015-12-15 | Open Invention Network, Llc | Method and system for searching network resources to locate content |
US20110015921A1 (en) * | 2009-07-17 | 2011-01-20 | Minerva Advisory Services, Llc | System and method for using lingual hierarchy, connotation and weight of authority |
US8620929B2 (en) * | 2009-08-14 | 2013-12-31 | Google Inc. | Context based resource relevance |
CN101996215B (en) * | 2009-08-27 | 2013-07-24 | 阿里巴巴集团控股有限公司 | Information matching method and system applied to e-commerce website |
US8498974B1 (en) | 2009-08-31 | 2013-07-30 | Google Inc. | Refining search results |
KR20110031087A (en) * | 2009-09-18 | 2011-03-24 | 인터내셔널 비지네스 머신즈 코포레이션 | Link clouds and user/community-driven dynamic interlinking of resources |
US9201965B1 (en) | 2009-09-30 | 2015-12-01 | Cisco Technology, Inc. | System and method for providing speech recognition using personal vocabulary in a network environment |
US8990083B1 (en) | 2009-09-30 | 2015-03-24 | Cisco Technology, Inc. | System and method for generating personal vocabulary from network data |
US8972391B1 (en) * | 2009-10-02 | 2015-03-03 | Google Inc. | Recent interest based relevance scoring |
US8204892B2 (en) * | 2009-10-26 | 2012-06-19 | Oracle International Corporation | Performance boost for sort operations |
US20110106885A1 (en) * | 2009-10-29 | 2011-05-05 | Cisco Technology, Inc. | Methods and apparatus for supporting multiple party login into a single session |
US8560608B2 (en) | 2009-11-06 | 2013-10-15 | Waldeck Technology, Llc | Crowd formation based on physical boundaries and other rules |
US8805838B1 (en) * | 2009-12-22 | 2014-08-12 | Amazon Technologies, Inc. | Systems and methods for automatic item classification |
US20120063367A1 (en) | 2009-12-22 | 2012-03-15 | Waldeck Technology, Llc | Crowd and profile based communication addresses |
US8849785B1 (en) * | 2010-01-15 | 2014-09-30 | Google Inc. | Search query reformulation using result term occurrence count |
US8301364B2 (en) | 2010-01-27 | 2012-10-30 | Navteq B.V. | Method of operating a navigation system to provide geographic location information |
US8732171B2 (en) * | 2010-01-28 | 2014-05-20 | Microsoft Corporation | Providing query suggestions |
US20110191171A1 (en) * | 2010-02-03 | 2011-08-04 | Yahoo! Inc. | Search engine output-associated bidding in online advertising |
TWI616761B (en) * | 2010-03-09 | 2018-03-01 | Alibaba Group Holding Ltd | Information matching method and system applied to e-commerce website |
US9645996B1 (en) | 2010-03-25 | 2017-05-09 | Open Invention Network Llc | Method and device for automatically generating a tag from a conversation in a social networking website |
US8930351B1 (en) * | 2010-03-31 | 2015-01-06 | Google Inc. | Grouping of users |
US9317613B2 (en) * | 2010-04-21 | 2016-04-19 | Yahoo! Inc. | Large scale entity-specific resource classification |
US20110264796A1 (en) * | 2010-04-23 | 2011-10-27 | Ganz | Search and navigational rating system for online social environment |
US20110270850A1 (en) * | 2010-04-30 | 2011-11-03 | Microsoft Corporation | Prioritization of Resources based on User Activities |
US9697500B2 (en) | 2010-05-04 | 2017-07-04 | Microsoft Technology Licensing, Llc | Presentation of information describing user activities with regard to resources |
US8898217B2 (en) | 2010-05-06 | 2014-11-25 | Apple Inc. | Content delivery based on user terminal events |
US8935274B1 (en) | 2010-05-12 | 2015-01-13 | Cisco Technology, Inc. | System and method for deriving user expertise based on data propagating in a network environment |
US8504419B2 (en) * | 2010-05-28 | 2013-08-06 | Apple Inc. | Network-based targeted content delivery based on queue adjustment factors calculated using the weighted combination of overall rank, context, and covariance scores for an invitational content item |
US8370330B2 (en) | 2010-05-28 | 2013-02-05 | Apple Inc. | Predicting content and context performance based on performance history of users |
EP2397952A1 (en) * | 2010-06-15 | 2011-12-21 | Axel Springer Digital TV Guide GmbH | Profile based content retrieval for recommender systems |
US9623119B1 (en) | 2010-06-29 | 2017-04-18 | Google Inc. | Accentuating search results |
US8515980B2 (en) * | 2010-07-16 | 2013-08-20 | Ebay Inc. | Method and system for ranking search results based on categories |
US9020922B2 (en) * | 2010-08-10 | 2015-04-28 | Brightedge Technologies, Inc. | Search engine optimization at scale |
US8510309B2 (en) | 2010-08-31 | 2013-08-13 | Apple Inc. | Selection and delivery of invitational content based on prediction of user interest |
US8640032B2 (en) | 2010-08-31 | 2014-01-28 | Apple Inc. | Selection and delivery of invitational content based on prediction of user intent |
US8577915B2 (en) | 2010-09-10 | 2013-11-05 | Veveo, Inc. | Method of and system for conducting personalized federated search and presentation of results therefrom |
CN101957847B (en) * | 2010-09-21 | 2011-11-23 | 百度在线网络技术(北京)有限公司 | Searching system and implementation method thereof |
US20120158712A1 (en) * | 2010-12-16 | 2012-06-21 | Sushrut Karanjkar | Inferring Geographic Locations for Entities Appearing in Search Queries |
US9465795B2 (en) | 2010-12-17 | 2016-10-11 | Cisco Technology, Inc. | System and method for providing feeds based on activity in a network environment |
US8667169B2 (en) | 2010-12-17 | 2014-03-04 | Cisco Technology, Inc. | System and method for providing argument maps based on activity in a network environment |
US20120166428A1 (en) * | 2010-12-22 | 2012-06-28 | Yahoo! Inc. | Method and system for improving quality of web content |
US8370365B1 (en) | 2011-01-31 | 2013-02-05 | Go Daddy Operating Company, LLC | Tools for predicting improvement in website search engine rankings based upon website linking relationships |
US8972412B1 (en) * | 2011-01-31 | 2015-03-03 | Go Daddy Operating Company, LLC | Predicting improvement in website search engine rankings based upon website linking relationships |
US8996495B2 (en) * | 2011-02-15 | 2015-03-31 | Ebay Inc. | Method and system for ranking search results based on category demand normalized using impressions |
US8849762B2 (en) | 2011-03-31 | 2014-09-30 | Commvault Systems, Inc. | Restoring computing environments, such as autorecovery of file systems at certain points in time |
US8620136B1 (en) | 2011-04-30 | 2013-12-31 | Cisco Technology, Inc. | System and method for media intelligent recording in a network environment |
US8326862B2 (en) * | 2011-05-01 | 2012-12-04 | Alan Mark Reznik | Systems and methods for facilitating enhancements to search engine results |
US11841912B2 (en) | 2011-05-01 | 2023-12-12 | Twittle Search Limited Liability Company | System for applying natural language processing and inputs of a group of users to infer commonly desired search results |
US8819009B2 (en) | 2011-05-12 | 2014-08-26 | Microsoft Corporation | Automatic social graph calculation |
US9477574B2 (en) | 2011-05-12 | 2016-10-25 | Microsoft Technology Licensing, Llc | Collection of intranet activity data |
US8909624B2 (en) | 2011-05-31 | 2014-12-09 | Cisco Technology, Inc. | System and method for evaluating results of a search query in a network environment |
US9552425B2 (en) * | 2011-06-02 | 2017-01-24 | Ebay Inc. | System and method for determining query aspects at appropriate category levels |
US9298776B2 (en) * | 2011-06-08 | 2016-03-29 | Ebay Inc. | System and method for mining category aspect information |
US8812483B2 (en) * | 2011-06-21 | 2014-08-19 | Julien Bieren | System and method for optimizing web searching and scheduling of service providers |
US10346856B1 (en) * | 2011-07-08 | 2019-07-09 | Microsoft Technology Licensing, Llc | Personality aggregation and web browsing |
US8886797B2 (en) * | 2011-07-14 | 2014-11-11 | Cisco Technology, Inc. | System and method for deriving user expertise based on data propagating in a network environment |
US9185152B2 (en) * | 2011-08-25 | 2015-11-10 | Ustream, Inc. | Bidirectional communication on live multimedia broadcasts |
US8954423B2 (en) * | 2011-09-06 | 2015-02-10 | Microsoft Technology Licensing, Llc | Using reading levels in responding to requests |
KR101783721B1 (en) * | 2011-09-27 | 2017-10-11 | 네이버 주식회사 | Group targeting system and group targeting method using range ip |
US8751591B2 (en) | 2011-09-30 | 2014-06-10 | Blackberry Limited | Systems and methods of adjusting contact importance for a computing device |
US9189550B2 (en) * | 2011-11-17 | 2015-11-17 | Microsoft Technology Licensing, Llc | Query refinement in a browser toolbar |
US8898164B1 (en) * | 2011-11-17 | 2014-11-25 | Quantcast Corporation | Consumption history privacy |
WO2013089259A1 (en) * | 2011-12-13 | 2013-06-20 | 日本電気株式会社 | Information collection device, system, method, and program |
US9292504B2 (en) * | 2011-12-15 | 2016-03-22 | Verizon Patent And Licensing Inc. | Context generation from active viewing region for context sensitive searching |
US8862597B2 (en) * | 2011-12-27 | 2014-10-14 | Sap Portals Israel Ltd | Providing contextually-relevant content |
US8983948B1 (en) * | 2011-12-29 | 2015-03-17 | Google Inc. | Providing electronic content based on a composition of a social network |
CN103186619B (en) * | 2011-12-30 | 2018-08-07 | 北京百度网讯科技有限公司 | Method and apparatus for evaluating search results based on non-click operation information |
US8930339B2 (en) * | 2012-01-03 | 2015-01-06 | Microsoft Corporation | Search engine performance evaluation using a task-based assessment metric |
DE102012100470A1 (en) * | 2012-01-20 | 2013-07-25 | Nektoon Ag | Method of compiling documents |
US9201964B2 (en) | 2012-01-23 | 2015-12-01 | Microsoft Technology Licensing, Llc | Identifying related entities |
US8831403B2 (en) | 2012-02-01 | 2014-09-09 | Cisco Technology, Inc. | System and method for creating customized on-demand video reports in a network environment |
US20130204863A1 (en) * | 2012-02-04 | 2013-08-08 | Rod Rigole | System and Method for Displaying Search Results |
US9311650B2 (en) | 2012-02-22 | 2016-04-12 | Alibaba Group Holding Limited | Determining search result rankings based on trust level values associated with sellers |
WO2013149220A1 (en) * | 2012-03-30 | 2013-10-03 | Xen, Inc. | Centralized tracking of user interest information from distributed information sources |
US10157184B2 (en) | 2012-03-30 | 2018-12-18 | Commvault Systems, Inc. | Data previewing before recalling large data files |
US20130297582A1 (en) * | 2012-04-09 | 2013-11-07 | Eli Zukovsky | Peer sharing of personalized views of detected information based on relevancy to a particular user's personal interests |
WO2013165366A1 (en) * | 2012-04-30 | 2013-11-07 | Intel Corporation | Contextual peer based guidance systems and methods |
US8930392B1 (en) * | 2012-06-05 | 2015-01-06 | Google Inc. | Simulated annealing in recommendation systems |
US9141504B2 (en) | 2012-06-28 | 2015-09-22 | Apple Inc. | Presenting status data received from multiple devices |
US9535996B1 (en) | 2012-08-30 | 2017-01-03 | deviantArt, Inc. | Selecting content objects for recommendation based on content object collections |
US8938438B2 (en) | 2012-10-11 | 2015-01-20 | Go Daddy Operating Company, LLC | Optimizing search engine ranking by recommending content including frequently searched questions |
US9633216B2 (en) | 2012-12-27 | 2017-04-25 | Commvault Systems, Inc. | Application of information management policies based on operation with a geographic entity |
US9459968B2 (en) | 2013-03-11 | 2016-10-04 | Commvault Systems, Inc. | Single index to query multiple backup formats |
US20140280576A1 (en) * | 2013-03-14 | 2014-09-18 | Google Inc. | Determining activities relevant to groups of individuals |
US10152500B2 (en) | 2013-03-14 | 2018-12-11 | Oracle International Corporation | Read mostly instances |
US20140351059A1 (en) * | 2013-03-15 | 2014-11-27 | adRise, Inc. | Interactive advertising |
US9588675B2 (en) | 2013-03-15 | 2017-03-07 | Google Inc. | Document scale and position optimization |
US10356461B2 (en) | 2013-03-15 | 2019-07-16 | adRise, Inc. | Adaptive multi-device content generation based on associated internet protocol addressing |
US10594763B2 (en) | 2013-03-15 | 2020-03-17 | adRise, Inc. | Platform-independent content generation for thin client applications |
US10887421B2 (en) | 2013-03-15 | 2021-01-05 | Tubi, Inc. | Relevant secondary-device content generation based on associated internet protocol addressing |
CN103441860A (en) * | 2013-04-16 | 2013-12-11 | 阿里巴巴集团控股有限公司 | Recommendation method and device of internet service |
US9405803B2 (en) | 2013-04-23 | 2016-08-02 | Google Inc. | Ranking signals in mixed corpora environments |
CN104156359B (en) * | 2013-05-13 | 2018-10-30 | 腾讯科技(深圳)有限公司 | Method and device for recommending internal link information |
WO2014184786A2 (en) * | 2013-05-16 | 2014-11-20 | Yandex Europe Ag | Method and system for presenting image information to a user of a client device |
US9195703B1 (en) | 2013-06-27 | 2015-11-24 | Google Inc. | Providing context-relevant information to users |
US9330209B1 (en) * | 2013-07-09 | 2016-05-03 | Quantcast Corporation | Characterizing an entity in an identifier space based on behaviors of unrelated entities in a different identifier space |
US20150212710A1 (en) * | 2013-10-10 | 2015-07-30 | Go Daddy Operating Company, LLC | Card interface for managing domain search results |
US9729327B2 (en) * | 2013-10-29 | 2017-08-08 | International Business Machines Corporation | Computer-based optimization of digital signature generation for records based on eventual selection criteria for products and services |
US9767178B2 (en) | 2013-10-30 | 2017-09-19 | Oracle International Corporation | Multi-instance redo apply |
CN103810241B (en) * | 2013-11-22 | 2017-04-05 | 北京奇虎科技有限公司 | Method and device for filtering low-frequency clicks |
US9465878B2 (en) | 2014-01-17 | 2016-10-11 | Go Daddy Operating Company, LLC | System and method for depicting backlink metrics for a website |
US9460219B2 (en) * | 2014-02-03 | 2016-10-04 | Gogobot, Inc. | Selection and rating of locations and related content based on user categorization |
US10169121B2 (en) | 2014-02-27 | 2019-01-01 | Commvault Systems, Inc. | Work flow management for an information management system |
US9648100B2 (en) | 2014-03-05 | 2017-05-09 | Commvault Systems, Inc. | Cross-system storage management for transferring data across autonomous information management systems |
US20150271211A1 (en) * | 2014-03-21 | 2015-09-24 | Konica Minolta Laboratory U.S.A., Inc. | Rights management policies with nontraditional rights control |
US9823978B2 (en) | 2014-04-16 | 2017-11-21 | Commvault Systems, Inc. | User-level quota management of data objects stored in information management systems |
US9740574B2 (en) | 2014-05-09 | 2017-08-22 | Commvault Systems, Inc. | Load balancing across multiple data paths |
US9582482B1 (en) | 2014-07-11 | 2017-02-28 | Google Inc. | Providing an annotation linking related entities in onscreen content |
US11249858B2 (en) | 2014-08-06 | 2022-02-15 | Commvault Systems, Inc. | Point-in-time backups of a production application made accessible over fibre channel and/or iSCSI as data sources to a remote application by representing the backups as pseudo-disks operating apart from the production application and its host |
US9852026B2 (en) | 2014-08-06 | 2017-12-26 | Commvault Systems, Inc. | Efficient application recovery in an information management system based on a pseudo-storage-device driver |
US9965559B2 (en) | 2014-08-21 | 2018-05-08 | Google Llc | Providing automatic actions for mobile onscreen content |
US10489407B2 (en) | 2014-09-19 | 2019-11-26 | Ebay Inc. | Dynamic modifications of results for search interfaces |
US11250081B1 (en) * | 2014-09-24 | 2022-02-15 | Amazon Technologies, Inc. | Predictive search |
US9444811B2 (en) | 2014-10-21 | 2016-09-13 | Commvault Systems, Inc. | Using an enhanced data agent to restore backed up data across autonomous storage management systems |
US9940409B2 (en) | 2014-10-31 | 2018-04-10 | Bank Of America Corporation | Contextual search tool |
US9922117B2 (en) | 2014-10-31 | 2018-03-20 | Bank Of America Corporation | Contextual search input from advisors |
US9785304B2 (en) | 2014-10-31 | 2017-10-10 | Bank Of America Corporation | Linking customer profiles with household profiles |
CN104361092A (en) * | 2014-11-18 | 2015-02-18 | 百度在线网络技术(北京)有限公司 | Searching method and device |
US10127285B2 (en) * | 2015-07-22 | 2018-11-13 | Ariba, Inc. | Customizable ranking of search engine results in multi-tenant architecture |
US9766825B2 (en) | 2015-07-22 | 2017-09-19 | Commvault Systems, Inc. | Browse and restore for block-level backups |
US10715612B2 (en) * | 2015-09-15 | 2020-07-14 | Oath Inc. | Identifying users' identity through tracking common activity |
US10970646B2 (en) | 2015-10-01 | 2021-04-06 | Google Llc | Action suggestions for user-selected content |
US10678788B2 (en) | 2015-10-22 | 2020-06-09 | Oracle International Corporation | Columnar caching in tiered storage |
US11657037B2 (en) | 2015-10-23 | 2023-05-23 | Oracle International Corporation | Query execution against an in-memory standby database |
US10747752B2 (en) | 2015-10-23 | 2020-08-18 | Oracle International Corporation | Space management for transactional consistency of in-memory objects on a standby database |
US10055390B2 (en) | 2015-11-18 | 2018-08-21 | Google Llc | Simulated hyperlinks on a mobile device based on user intent and a centered selection of text |
ITUB20156079A1 (en) | 2015-12-02 | 2017-06-02 | Torino Politecnico | Method to detect web tracking services |
US10296368B2 (en) | 2016-03-09 | 2019-05-21 | Commvault Systems, Inc. | Hypervisor-independent block-level live browse for access to backed up virtual machine (VM) data and hypervisor-free file-level recovery (block-level pseudo-mount) |
US10698771B2 (en) | 2016-09-15 | 2020-06-30 | Oracle International Corporation | Zero-data-loss with asynchronous redo shipping to a standby database |
US10535005B1 (en) | 2016-10-26 | 2020-01-14 | Google Llc | Providing contextual actions for mobile onscreen content |
US10891291B2 (en) | 2016-10-31 | 2021-01-12 | Oracle International Corporation | Facilitating operations on pluggable databases using separate logical timestamp services |
US20180137179A1 (en) * | 2016-11-15 | 2018-05-17 | Cofame, Inc. | Systems and methods for digital presence profiler service |
US11475006B2 (en) | 2016-12-02 | 2022-10-18 | Oracle International Corporation | Query and change propagation scheduling for heterogeneous database systems |
US11237696B2 (en) | 2016-12-19 | 2022-02-01 | Google Llc | Smart assist for repeated actions |
US10838821B2 (en) | 2017-02-08 | 2020-11-17 | Commvault Systems, Inc. | Migrating content and metadata from a backup system |
US10740193B2 (en) | 2017-02-27 | 2020-08-11 | Commvault Systems, Inc. | Hypervisor-independent reference copies of virtual machine payload data based on block-level pseudo-mount |
US10891069B2 (en) | 2017-03-27 | 2021-01-12 | Commvault Systems, Inc. | Creating local copies of data stored in online data repositories |
US10776329B2 (en) | 2017-03-28 | 2020-09-15 | Commvault Systems, Inc. | Migration of a database management system to cloud storage |
US11074140B2 (en) | 2017-03-29 | 2021-07-27 | Commvault Systems, Inc. | Live browsing of granular mailbox data |
US10423638B2 (en) | 2017-04-27 | 2019-09-24 | Google Llc | Cloud inference system |
US20180336280A1 (en) * | 2017-05-17 | 2018-11-22 | Linkedin Corporation | Customized search based on user and team activities |
US10691722B2 (en) | 2017-05-31 | 2020-06-23 | Oracle International Corporation | Consistent query execution for big data analytics in a hybrid database |
US10664352B2 (en) | 2017-06-14 | 2020-05-26 | Commvault Systems, Inc. | Live browsing of backed up data residing on cloned disks |
US10489425B2 (en) * | 2017-10-26 | 2019-11-26 | Salesforce.Com, Inc. | User clustering based on query history |
US10795927B2 (en) | 2018-02-05 | 2020-10-06 | Commvault Systems, Inc. | On-demand metadata extraction of clinical image data |
US11023551B2 (en) * | 2018-02-23 | 2021-06-01 | Accenture Global Solutions Limited | Document processing based on proxy logs |
US11501006B2 (en) | 2018-03-05 | 2022-11-15 | Hyundai Motor Company | Leveraging natural language processing to refine access control within collections |
CN110232281B (en) * | 2018-03-05 | 2023-07-04 | 现代自动车株式会社 | Improved access control within a collection using natural language processing |
US10789387B2 (en) | 2018-03-13 | 2020-09-29 | Commvault Systems, Inc. | Graphical representation of an information management system |
US11216786B2 (en) * | 2018-07-17 | 2022-01-04 | Kavita Ramchandani Snyder | System and method for dispatching intelligent invitations to users within a network |
US20200104783A1 (en) * | 2018-09-28 | 2020-04-02 | Evernote Corporation | Task-based action generation |
US11170002B2 (en) | 2018-10-19 | 2021-11-09 | Oracle International Corporation | Integrating Kafka data-in-motion with data-at-rest tables |
US10860443B2 (en) | 2018-12-10 | 2020-12-08 | Commvault Systems, Inc. | Evaluation and reporting of recovery readiness in a data storage management system |
US11500930B2 (en) * | 2019-05-28 | 2022-11-15 | Slack Technologies, Llc | Method, apparatus and computer program product for generating tiered search index fields in a group-based communication platform |
US11308034B2 (en) | 2019-06-27 | 2022-04-19 | Commvault Systems, Inc. | Continuously run log backup with minimal configuration and resource usage from the source machine |
US11218443B2 (en) | 2019-07-25 | 2022-01-04 | Coupang Corp. | Dynamic IP address categorization systems and methods |
WO2021039372A1 (en) * | 2019-08-29 | 2021-03-04 | 株式会社Nttドコモ | Re-ranking device |
US11061980B2 (en) * | 2019-09-18 | 2021-07-13 | Capital One Services, Llc | System and method for integrating content into webpages |
US11379532B2 (en) * | 2019-10-17 | 2022-07-05 | The Toronto-Dominion Bank | System and method for generating a recommendation |
US11507576B2 (en) * | 2020-05-20 | 2022-11-22 | T-Mobile Usa, Inc. | Method and system to efficiently analyze and improve database queries |
US10987592B1 (en) | 2020-06-05 | 2021-04-27 | 12traits, Inc. | Systems and methods to correlate user behavior patterns within an online game with psychological attributes of users |
US20220121549A1 (en) * | 2020-10-16 | 2022-04-21 | Oath Inc. | Systems and methods for rendering unified and real-time user interest profiles |
US11775599B2 (en) * | 2020-11-10 | 2023-10-03 | Shopify Inc. | System and method for displaying customized search results based on past behaviour |
US11206263B1 (en) * | 2021-01-25 | 2021-12-21 | 12traits, Inc. | Systems and methods to determine content to present based on interaction information of a given user |
US11727424B2 (en) | 2021-06-04 | 2023-08-15 | Solsten, Inc. | Systems and methods to correlate user behavior patterns within digital application environments with psychological attributes of users to determine adaptations to the digital application environments |
US11962817B2 (en) | 2021-06-21 | 2024-04-16 | Tubi, Inc. | Machine learning techniques for advanced frequency management |
WO2023019089A1 (en) * | 2021-08-11 | 2023-02-16 | Google Llc | User interfaces for surfacing web browser history data |
Citations (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20010037325A1 (en) * | 2000-03-06 | 2001-11-01 | Alexis Biderman | Method and system for locating internet users having similar navigation patterns |
US6314420B1 (en) * | 1996-04-04 | 2001-11-06 | Lycos, Inc. | Collaborative/adaptive search engine |
US6327590B1 (en) * | 1999-05-05 | 2001-12-04 | Xerox Corporation | System and method for collaborative ranking of search results employing user and group profiles derived from document collection content analysis |
US20030014399A1 (en) * | 2001-03-12 | 2003-01-16 | Hansen Mark H. | Method for organizing records of database search activity by topical relevance |
US6606657B1 (en) * | 1999-06-22 | 2003-08-12 | Comverse, Ltd. | System and method for processing and presenting internet usage information |
US20030171977A1 (en) * | 2002-03-07 | 2003-09-11 | Compete, Inc. | Clickstream analysis methods and systems |
US20050060311A1 (en) * | 2003-09-12 | 2005-03-17 | Simon Tong | Methods and systems for improving a search ranking using related queries |
US20050108406A1 (en) * | 2003-11-07 | 2005-05-19 | Dynalab Inc. | System and method for dynamically generating a customized menu page |
US6934748B1 (en) * | 1999-08-26 | 2005-08-23 | Memetrics Holdings Pty Limited | Automated on-line experimentation to measure users behavior to treatment for a set of content elements |
US6959319B1 (en) * | 2000-09-11 | 2005-10-25 | International Business Machines Corporation | System and method for automatically personalizing web portals and web services based upon usage history |
US20060026147A1 (en) * | 2004-07-30 | 2006-02-02 | Cone Julian M | Adaptive search engine |
US20060041550A1 (en) * | 2004-08-19 | 2006-02-23 | Claria Corporation | Method and apparatus for responding to end-user request for information-personalization |
US20060064411A1 (en) * | 2004-09-22 | 2006-03-23 | William Gross | Search engine using user intent |
US20060112079A1 (en) * | 2004-11-23 | 2006-05-25 | International Business Machines Corporation | System and method for generating personalized web pages |
US20070005646A1 (en) * | 2005-06-30 | 2007-01-04 | Microsoft Corporation | Analysis of topic dynamics of web search |
US7203909B1 (en) * | 2002-04-04 | 2007-04-10 | Microsoft Corporation | System and methods for constructing personalized context-sensitive portal pages or views by analyzing patterns of users' information access activities |
US20070150465A1 (en) * | 2005-12-27 | 2007-06-28 | Scott Brave | Method and apparatus for determining expertise based upon observed usage patterns |
US20080065631A1 (en) * | 2006-09-12 | 2008-03-13 | Yahoo! Inc. | User query data mining and related techniques |
US7685191B1 (en) * | 2005-06-16 | 2010-03-23 | Enquisite, Inc. | Selection of advertisements to present on a web page or other destination based on search activities of users who selected the destination |
US7774335B1 (en) * | 2005-08-23 | 2010-08-10 | Amazon Technologies, Inc. | Method and system for determining interest levels of online content navigation paths |
Family Cites Families (67)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5724567A (en) * | 1994-04-25 | 1998-03-03 | Apple Computer, Inc. | System for directing relevance-ranked data objects to computer users |
US6460036B1 (en) * | 1994-11-29 | 2002-10-01 | Pinpoint Incorporated | System and method for providing customized electronic newspapers and target advertisements |
US5758257A (en) * | 1994-11-29 | 1998-05-26 | Herz; Frederick | System and method for scheduling broadcast of and access to video programs and other data using customer profiles |
US6092049A (en) * | 1995-06-30 | 2000-07-18 | Microsoft Corporation | Method and apparatus for efficiently recommending items using automated collaborative filtering and feature-guided automated collaborative filtering |
US5790426A (en) * | 1996-04-30 | 1998-08-04 | Athenium L.L.C. | Automated collaborative filtering system |
US6285999B1 (en) * | 1997-01-10 | 2001-09-04 | The Board Of Trustees Of The Leland Stanford Junior University | Method for node ranking in a linked database |
US6012051A (en) * | 1997-02-06 | 2000-01-04 | America Online, Inc. | Consumer profiling system with analytic decision processor |
US6182068B1 (en) * | 1997-08-01 | 2001-01-30 | Ask Jeeves, Inc. | Personalized search methods |
US5974412A (en) * | 1997-09-24 | 1999-10-26 | Sapient Health Network | Intelligent query system for automatically indexing information in a database and automatically categorizing users |
US6421675B1 (en) * | 1998-03-16 | 2002-07-16 | S. L. I. Systems, Inc. | Search engine |
US6317722B1 (en) * | 1998-09-18 | 2001-11-13 | Amazon.Com, Inc. | Use of electronic shopping carts to generate personal recommendations |
US6338066B1 (en) * | 1998-09-25 | 2002-01-08 | International Business Machines Corporation | Surfaid predictor: web-based system for predicting surfer behavior |
US6845370B2 (en) * | 1998-11-12 | 2005-01-18 | Accenture Llp | Advanced information gathering for targeted activities |
US6385619B1 (en) * | 1999-01-08 | 2002-05-07 | International Business Machines Corporation | Automatic user interest profile generation from structured document access information |
US6907566B1 (en) * | 1999-04-02 | 2005-06-14 | Overture Services, Inc. | Method and system for optimum placement of advertisements on a webpage |
US6493702B1 (en) * | 1999-05-05 | 2002-12-10 | Xerox Corporation | System and method for searching and recommending documents in a collection using share bookmarks |
US6807574B1 (en) * | 1999-10-22 | 2004-10-19 | Tellme Networks, Inc. | Method and apparatus for content personalization over a telephone interface |
US6978303B1 (en) * | 1999-10-26 | 2005-12-20 | Iontal Limited | Monitoring of computer usage |
US6489968B1 (en) * | 1999-11-18 | 2002-12-03 | Amazon.Com, Inc. | System and method for exposing popular categories of browse tree |
EP1107128A1 (en) | 1999-12-03 | 2001-06-13 | Hyundai Electronics Industries Co., Ltd. | Apparatus and method for checking the validity of links in a computer network |
US6785671B1 (en) * | 1999-12-08 | 2004-08-31 | Amazon.Com, Inc. | System and method for locating web-based product offerings |
JP3630057B2 (en) * | 2000-01-26 | 2005-03-16 | 日本電気株式会社 | Data structure construction method for search, apparatus thereof, and machine-readable program recording medium |
US6868525B1 (en) * | 2000-02-01 | 2005-03-15 | Alberti Anemometer Llc | Computer graphic display visualization system and method |
US20010037407A1 (en) * | 2000-03-23 | 2001-11-01 | Zvetan Dragulev | System and method for managing user-specific data |
US7725525B2 (en) * | 2000-05-09 | 2010-05-25 | James Duncan Work | Method and apparatus for internet-based human network brokering |
AU2001271397A1 (en) * | 2000-06-23 | 2002-01-08 | Decis E-Direct, Inc. | Component models |
US6535888B1 (en) * | 2000-07-19 | 2003-03-18 | Oxelis, Inc. | Method and system for providing a visual search directory |
US6687696B2 (en) * | 2000-07-26 | 2004-02-03 | Recommind Inc. | System and method for personalized search, information filtering, and for generating recommendations utilizing statistical latent class models |
US6895406B2 (en) * | 2000-08-25 | 2005-05-17 | Seaseer R&D, Llc | Dynamic personalization method of creating personalized user profiles for searching a database of information |
AU2001291248B2 (en) * | 2000-09-28 | 2006-08-31 | Oracle International Corporation | Enterprise web mining system and method |
JP3934325B2 (en) * | 2000-10-31 | 2007-06-20 | 株式会社日立製作所 | Document search method, document search apparatus, and storage medium for document search program |
US20020138331A1 (en) * | 2001-02-05 | 2002-09-26 | Hosea Devin F. | Method and system for web page personalization |
US8001118B2 (en) * | 2001-03-02 | 2011-08-16 | Google Inc. | Methods and apparatus for employing usage statistics in document retrieval |
US20020198882A1 (en) * | 2001-03-29 | 2002-12-26 | Linden Gregory D. | Content personalization based on actions performed during a current browsing session |
US7165105B2 (en) * | 2001-07-16 | 2007-01-16 | Netgenesis Corporation | System and method for logical view analysis and visualization of user behavior in a distributed computer network |
US7207062B2 (en) * | 2001-08-16 | 2007-04-17 | Lucent Technologies Inc | Method and apparatus for protecting web sites from distributed denial-of-service attacks |
US6732092B2 (en) * | 2001-09-28 | 2004-05-04 | Client Dynamics, Inc. | Method and system for database queries and information delivery |
US6801917B2 (en) * | 2001-11-13 | 2004-10-05 | Koninklijke Philips Electronics N.V. | Method and apparatus for partitioning a plurality of items into groups of similar items in a recommender of such items |
US7567953B2 (en) * | 2002-03-01 | 2009-07-28 | Business Objects Americas | System and method for retrieving and organizing information from disparate computer network information sources |
US6917938B2 (en) * | 2002-05-06 | 2005-07-12 | Ideapivot Corporation | Collaborative context information management system |
US6892198B2 (en) | 2002-06-14 | 2005-05-10 | Entopia, Inc. | System and method for personalized information retrieval based on user expertise |
US20040044571A1 (en) * | 2002-08-27 | 2004-03-04 | Bronnimann Eric Robert | Method and system for providing advertising listing variance in distribution feeds over the internet to maximize revenue to the advertising distributor |
US7836391B2 (en) | 2003-06-10 | 2010-11-16 | Google Inc. | Document search engine including highlighting of confident results |
US7363302B2 (en) * | 2003-06-30 | 2008-04-22 | Google Inc. | Promoting and/or demoting an advertisement from an advertising spot of one type to an advertising spot of another type |
US7610381B2 (en) * | 2003-09-12 | 2009-10-27 | Hewlett-Packard Development Company, L.P. | System and method for evaluating a capacity of a streaming media server for supporting a workload |
US7606798B2 (en) * | 2003-09-22 | 2009-10-20 | Google Inc. | Methods and systems for improving a search ranking using location awareness |
US7693827B2 (en) * | 2003-09-30 | 2010-04-06 | Google Inc. | Personalization of placed content ordering in search results |
US7797316B2 (en) * | 2003-09-30 | 2010-09-14 | Google Inc. | Systems and methods for determining document freshness |
US7346839B2 (en) * | 2003-09-30 | 2008-03-18 | Google Inc. | Information retrieval based on historical data |
US20050071328A1 (en) * | 2003-09-30 | 2005-03-31 | Lawrence Stephen R. | Personalization of web search |
US20050096997A1 (en) * | 2003-10-31 | 2005-05-05 | Vivek Jain | Targeting shoppers in an online shopping environment |
US7240049B2 (en) * | 2003-11-12 | 2007-07-03 | Yahoo! Inc. | Systems and methods for search query processing using trend analysis |
US7634472B2 (en) | 2003-12-01 | 2009-12-15 | Yahoo! Inc. | Click-through re-ranking of images and other data |
US7885901B2 (en) * | 2004-01-29 | 2011-02-08 | Yahoo! Inc. | Method and system for seeding online social network contacts |
US8631001B2 (en) * | 2004-03-31 | 2014-01-14 | Google Inc. | Systems and methods for weighting a search query result |
US20050246391A1 (en) * | 2004-04-29 | 2005-11-03 | Gross John N | System & method for monitoring web pages |
US20070067297A1 (en) * | 2004-04-30 | 2007-03-22 | Kublickis Peter J | System and methods for a micropayment-enabled marketplace with permission-based, self-service, precision-targeted delivery of advertising, entertainment and informational content and relationship marketing to anonymous internet users |
US7562068B2 (en) * | 2004-06-30 | 2009-07-14 | Microsoft Corporation | System and method for ranking search results based on tracked user preferences |
US7716219B2 (en) | 2004-07-08 | 2010-05-11 | Yahoo! Inc. | Database search system and method of determining a value of a keyword in a search |
US7580929B2 (en) * | 2004-07-26 | 2009-08-25 | Google Inc. | Phrase-based personalization of searches in an information retrieval system |
CA2578379A1 (en) * | 2004-08-26 | 2006-03-02 | Omni-Branch Wireless Solutions, Inc. | Opt-in directory of verified individual profiles |
US20060074883A1 (en) * | 2004-10-05 | 2006-04-06 | Microsoft Corporation | Systems, methods, and interfaces for providing personalized search and information access |
US20060161553A1 (en) * | 2005-01-19 | 2006-07-20 | Tiny Engine, Inc. | Systems and methods for providing user interaction based profiles |
US7756855B2 (en) * | 2006-10-11 | 2010-07-13 | Collarity, Inc. | Search phrase refinement by search term replacement |
US20070260597A1 (en) * | 2006-05-02 | 2007-11-08 | Mark Cramer | Dynamic search engine results employing user behavior |
US20080147659A1 (en) * | 2006-12-15 | 2008-06-19 | Ratepoint, Inc. | System and method for determining behavioral similarity between users and user data to identify groups to share user impressions of ratable objects |
US7734641B2 (en) * | 2007-05-25 | 2010-06-08 | Peerset, Inc. | Recommendation systems and methods using interest correlation |
Date | Country | Application | Publication | Status |
---|---|---|---|---|
2006-03-30 | US | US11/394,620 | US8078607B2 | Active |
2007-02-14 | US | US11/675,057 | US20070233671A1 | Abandoned |
2007-03-30 | CN | CN2007800197484A | CN101454780B | Active |
2007-03-30 | WO | PCT/US2007/065710 | WO2007115217A2 | Application Filing |
2007-03-30 | EP | EP07759892A | EP2005339A2 | Ceased |
2011-12-12 | US | US13/323,758 | US20120089598A1 | Abandoned |
Patent Citations (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6314420B1 (en) * | 1996-04-04 | 2001-11-06 | Lycos, Inc. | Collaborative/adaptive search engine |
US6327590B1 (en) * | 1999-05-05 | 2001-12-04 | Xerox Corporation | System and method for collaborative ranking of search results employing user and group profiles derived from document collection content analysis |
US6606657B1 (en) * | 1999-06-22 | 2003-08-12 | Comverse, Ltd. | System and method for processing and presenting internet usage information |
US6934748B1 (en) * | 1999-08-26 | 2005-08-23 | Memetrics Holdings Pty Limited | Automated on-line experimentation to measure users behavior to treatment for a set of content elements |
US20010037325A1 (en) * | 2000-03-06 | 2001-11-01 | Alexis Biderman | Method and system for locating internet users having similar navigation patterns |
US6959319B1 (en) * | 2000-09-11 | 2005-10-25 | International Business Machines Corporation | System and method for automatically personalizing web portals and web services based upon usage history |
US20030014399A1 (en) * | 2001-03-12 | 2003-01-16 | Hansen Mark H. | Method for organizing records of database search activity by topical relevance |
US20030171977A1 (en) * | 2002-03-07 | 2003-09-11 | Compete, Inc. | Clickstream analysis methods and systems |
US7203909B1 (en) * | 2002-04-04 | 2007-04-10 | Microsoft Corporation | System and methods for constructing personalized context-sensitive portal pages or views by analyzing patterns of users' information access activities |
US20050060311A1 (en) * | 2003-09-12 | 2005-03-17 | Simon Tong | Methods and systems for improving a search ranking using related queries |
US20050108406A1 (en) * | 2003-11-07 | 2005-05-19 | Dynalab Inc. | System and method for dynamically generating a customized menu page |
US20060026147A1 (en) * | 2004-07-30 | 2006-02-02 | Cone Julian M | Adaptive search engine |
US20060041550A1 (en) * | 2004-08-19 | 2006-02-23 | Claria Corporation | Method and apparatus for responding to end-user request for information-personalization |
US20060064411A1 (en) * | 2004-09-22 | 2006-03-23 | William Gross | Search engine using user intent |
US20060112079A1 (en) * | 2004-11-23 | 2006-05-25 | International Business Machines Corporation | System and method for generating personalized web pages |
US7685191B1 (en) * | 2005-06-16 | 2010-03-23 | Enquisite, Inc. | Selection of advertisements to present on a web page or other destination based on search activities of users who selected the destination |
US20070005646A1 (en) * | 2005-06-30 | 2007-01-04 | Microsoft Corporation | Analysis of topic dynamics of web search |
US7774335B1 (en) * | 2005-08-23 | 2010-08-10 | Amazon Technologies, Inc. | Method and system for determining interest levels of online content navigation paths |
US20070150465A1 (en) * | 2005-12-27 | 2007-06-28 | Scott Brave | Method and apparatus for determining expertise based upon observed usage patterns |
US20080065631A1 (en) * | 2006-09-12 | 2008-03-13 | Yahoo! Inc. | User query data mining and related techniques |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9646063B1 (en) * | 2007-05-25 | 2017-05-09 | Google Inc. | Sharing of profile information with content providers |
US20110035375A1 (en) * | 2009-08-06 | 2011-02-10 | Ron Bekkerman | Building user profiles for website personalization |
US9348934B2 (en) * | 2010-03-10 | 2016-05-24 | Lockheed Martin Corporation | Systems and methods for facilitating open source intelligence gathering |
US20150121265A1 (en) * | 2010-03-10 | 2015-04-30 | Lockheed Martin Corporation | Systems and methods for facilitating open source intelligence gathering |
US20120131008A1 (en) * | 2010-11-23 | 2012-05-24 | Microsoft Corporation | Indentifying referring expressions for concepts |
US8332426B2 (en) * | 2010-11-23 | 2012-12-11 | Microsoft Corporation | Indentifying referring expressions for concepts |
US8364672B2 (en) | 2010-11-23 | 2013-01-29 | Microsoft Corporation | Concept disambiguation via search engine search results |
US10147146B2 (en) * | 2012-03-14 | 2018-12-04 | Disney Enterprises, Inc. | Tailoring social elements of virtual environments |
US20130332521A1 (en) * | 2012-06-07 | 2013-12-12 | United Video Properties, Inc. | Systems and methods for compiling media information based on privacy and reliability metrics |
US20140108373A1 (en) * | 2012-10-15 | 2014-04-17 | Wixpress Ltd | System for deep linking and search engine support for web sites integrating third party application and components |
US9436765B2 (en) * | 2012-10-15 | 2016-09-06 | Wix.Com Ltd. | System for deep linking and search engine support for web sites integrating third party application and components |
US10534818B2 (en) * | 2012-10-15 | 2020-01-14 | Wix.Com Ltd. | System and method for deep linking and search engine support for web sites integrating third party application and components |
US11113456B2 (en) | 2012-10-15 | 2021-09-07 | Wix.Com Ltd. | System and method for deep linking and search engine support for web sites integrating third party application and components |
US9779140B2 (en) * | 2012-11-16 | 2017-10-03 | Google Inc. | Ranking signals for sparse corpora |
US20140143222A1 (en) * | 2012-11-16 | 2014-05-22 | Google Inc. | Ranking signals for sparse corpora |
US20150242486A1 (en) * | 2014-02-25 | 2015-08-27 | International Business Machines Corporation | Discovering communities and expertise of users using semantic analysis of resource access logs |
US9852208B2 (en) * | 2014-02-25 | 2017-12-26 | International Business Machines Corporation | Discovering communities and expertise of users using semantic analysis of resource access logs |
CN104462357A (en) * | 2014-12-08 | 2015-03-25 | 百度在线网络技术(北京)有限公司 | Method and device for realizing personalized search |
US10061817B1 (en) | 2015-07-29 | 2018-08-28 | Google Llc | Social ranking for apps |
US10698971B2 (en) * | 2016-08-03 | 2020-06-30 | Samsung Electronics Co., Ltd. | Method and apparatus for storing access log based on keyword |
Also Published As
Publication number | Publication date |
---|---|
CN101454780B (en) | 2013-09-11 |
WO2007115217A3 (en) | 2008-01-03 |
EP2005339A2 (en) | 2008-12-24 |
US20070239680A1 (en) | 2007-10-11 |
US20070233671A1 (en) | 2007-10-04 |
WO2007115217A2 (en) | 2007-10-11 |
CN101454780A (en) | 2009-06-10 |
US8078607B2 (en) | 2011-12-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8078607B2 (en) | | Generating website profiles based on queries from webistes and user activities on the search results |
US11816114B1 (en) | | Modifying search result ranking based on implicit user feedback |
US8321278B2 (en) | | Targeted advertisements based on user profiles and page profile |
JP5572596B2 (en) | | Personalize the ordering of place content in search results |
US8938463B1 (en) | | Modifying search result ranking based on implicit user feedback and a model of presentation bias |
US9396238B2 (en) | | Systems and methods for determining user preferences |
US9390143B2 (en) | | Recent interest based relevance scoring |
US6718365B1 (en) | | Method, system, and program for ordering search results using an importance weighting |
US7440968B1 (en) | | Query boosting based on classification |
CN102859516B (en) | | Generating improved document classification data using historical search results |
US8645367B1 (en) | | Predicting data for document attributes based on aggregated data for repeated URL patterns |
US20050222989A1 (en) | | Results based personalization of advertisements in a search engine |
US20060064411A1 (en) | | Search engine using user intent |
US20150161256A1 (en) | | Method, System, and Graphical User Interface for Providing Personalized Recommendations of Popular Search Queries |
US7216122B2 (en) | | Information processing device and method, recording medium, and program |
WO2009140272A2 (en) | | Search results with most clicked next objects |
KR20020025142A (en) | | A Keyword Recommend System and Method for Keyword Advertise Service |
US8874570B1 (en) | | Search boost vector based on co-visitation information |
JP7462198B1 (en) | | Keyword collection method, information processing device, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: GOOGLE LLC, CALIFORNIA. Free format text: CHANGE OF NAME; ASSIGNOR: GOOGLE INC.; REEL/FRAME: 044695/0115. Effective date: 20170929 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |