WO1999045487A1 - Identifying the items most relevant to a current query based on items selected in connection with similar queries - Google Patents

Info

Publication number
WO1999045487A1
WO1999045487A1 PCT/US1998/026985 US9826985W WO9945487A1 WO 1999045487 A1 WO1999045487 A1 WO 1999045487A1 US 9826985 W US9826985 W US 9826985W WO 9945487 A1 WO9945487 A1 WO 9945487A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
items
item
facility
identified
Prior art date
Application number
PCT/US1998/026985
Other languages
French (fr)
Inventor
Dwayne Bowman
Ruben E. Ortega
Greg Linden
Joel R. Spiegel
Original Assignee
Amazon.Com, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/041,081 external-priority patent/US6185558B1/en
Application filed by Amazon.Com, Inc. filed Critical Amazon.Com, Inc.
Priority to EP98964094A priority Critical patent/EP1060449B1/en
Priority to AT98964094T priority patent/ATE243869T1/en
Priority to NZ506229A priority patent/NZ506229A/en
Priority to CA002320293A priority patent/CA2320293C/en
Priority to JP2000534960A priority patent/JP4792551B2/en
Priority to AU19290/99A priority patent/AU757550B2/en
Priority to DE69815898T priority patent/DE69815898T2/en
Publication of WO1999045487A1 publication Critical patent/WO1999045487A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/248Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3325Reformulation based on results of preceding query
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3332Query translation
    • G06F16/3334Selection or weighting of terms from queries, including natural language queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3349Reuse of stored results of previous queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/338Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0623Item investigation
    • G06Q30/0625Directed, with specific intent or strategy
    • G06Q30/0627Directed, with specific intent or strategy using item specifications
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0641Shopping interfaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/08Auctions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution

Definitions

  • the present invention is directed to the field of query processing.
  • Many World Wide Web sites permit users to perform searches to identify a small number of interesting items among a much larger domain of items.
  • several web index sites permit users to search for particular web sites among most of the known web sites.
  • Many online merchants, such as booksellers, permit users to search for particular products among all of the products that can be purchased from a merchant. In many cases, users perform searches in order to ultimately find a single item within an entire domain of items.
  • a user submits a query containing one or more query terms.
  • the query also explicitly or implicitly identifies a domain of items to search.
  • a user may submit a query to an online bookseller containing terms that the user believes are words in the title of a book.
  • a query server program processes the query to identify within the domain items matching the terms of the query.
  • the items identified by the query server program are collectively known as a query result.
  • the query result is a list of books whose titles contain some or all of the query terms.
  • the query result is typically displayed to the user as a list of items. This list may be ordered in various ways. For example, the list may be ordered alphabetically or numerically based on a property of each item, such as the title, author, or release date of each book. As another example, the list may be ordered based on the extent to which each identified item matches the terms of the query.
  • search engines adopt a strategy of effectively automatically revising the query until a non-empty result set is produced.
  • a search engine may progressively delete conjunctive, i.e., ANDed, terms from a multiple term query until the result set produced for that query contains items.
  • This strategy has the disadvantage that important information for choosing the correct items can be lost when query terms are arbitrarily deleted.
  • the first nonempty result set can be quite large, and may contain a large percentage of items that are irrelevant to the original query as a whole. For this reason, a more effective technique for displaying items relating to at least some of the terms in a query even when no items completely match the query would have significant utility.
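The progressive-deletion strategy criticized above can be sketched as follows. This is only an illustration of the idea, not any particular search engine's code: `relax_until_nonempty`, `and_search`, and the toy `CATALOG` are hypothetical names, and dropping the last term first is an arbitrary choice.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical toy catalog: item identifier -> words that describe the item.
CATALOG: Dict[str, Set[str]] = {
    "book-a": {"human", "dynamics"},
    "book-b": {"particle", "dynamics"},
}


def and_search(terms: Sequence[str]) -> List[str]:
    """Return the items that match every term (conjunctive, i.e. ANDed, query)."""
    return [item for item, words in CATALOG.items() if all(t in words for t in terms)]


def relax_until_nonempty(
    terms: Sequence[str], search: Callable[[Sequence[str]], List[str]]
) -> Tuple[List[str], List[str]]:
    """Delete terms one at a time until the result set is non-empty.

    Returns the surviving terms and the result they produced.
    """
    remaining = list(terms)
    while remaining:
        result = search(remaining)
        if result:
            return remaining, result
        remaining.pop()  # arbitrary deletion: information in this term is lost
    return [], []
```

With this catalog, the query `["dynamics", "seagal"]` is relaxed to `["dynamics"]`, which returns both books, including one that may have nothing to do with what the user wanted. That is the weakness the passage describes.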
  • the present invention provides a software facility (“the facility”) for identifying the items most relevant to a current query based on items selected in connection with similar queries.
  • the facility preferably generates ranking values for items indicating their level of relevance to the current query, which specifies one or more query terms.
  • the facility generates a ranking value for an item by combining rating scores, produced by a rating function, that each correspond to the level of relevance of the item to queries containing one of the query terms.
  • the rating function preferably retrieves a rating score for the combination of an item and a term from a rating table generated by the facility.
  • the scores in the rating table preferably reflect, for a particular item and term, how often users have selected the item when the item has been identified in query results produced for queries containing the term.
  • the facility uses the rating scores to either generate a ranking value for each item in a query result, or generate ranking values for a smaller number of items in order to select a few items having the top ranking values.
  • the facility combines the rating scores corresponding to that item and the terms of the query.
  • the facility preferably loops through the items in the query results and, for each item, combines all of the rating scores corresponding to that item and any of the terms in the query.
  • the facility preferably loops through the terms in the query, and, for each term, identifies the top few rating scores for that term and any item. The facility then combines the scores identified for each item to generate ranking values for a relatively small number of items, which may include items not identified in the query result. Indeed, these embodiments of the invention are able to generate ranking values for and display items even in cases in which the query result is empty, i.e., when no items completely satisfy the query.
  • the facility preferably orders the items of the query result in decreasing order of ranking value.
  • the facility may also use the ranking values to subset the items in the query result to a smaller number of items.
  • a query result for a query containing the query terms "human" and "dynamic" may contain a book about human dynamics and a book about the effects on human beings of particle dynamics.
  • selections by users from early query results produced for queries containing the term "human" show that these users select the human dynamics book much more frequently than they select the particle dynamics book.
  • the facility therefore ranks the human dynamics book higher than the particle dynamics book, allowing users that are more interested in the human dynamics book to select it more easily. This benefit of the facility is especially useful in conjunction with the large, heterogeneous query results that are typically generated for single-term queries, which are commonly submitted by users.
  • Various embodiments of the invention base rating scores on different kinds of selection actions performed by the users on items identified in query results. These include whether the user displayed additional information about an item, how much time the user spent viewing the additional information about the item, how many hyperlinks the user followed within the additional information about the item, whether the user added the item to his or her shopping basket, and whether the user ultimately purchased the item. Embodiments of the invention also consider selection actions not relating to query results, such as typing an item's item identifier rather than choosing the item from a query result. Additional embodiments of the invention incorporate into the ranking process information about the user submitting the query by maintaining and applying separate rating scores for users in different demographic groups, such as those of the same sex, age, income, or geographic category. Certain embodiments also incorporate behavioral information about specific users.
  • rating scores may be produced by a rating function that combines different types of information reflecting collective and individual user preferences. Some embodiments of the invention utilize specialized strategies for incorporating into the rating scores information about queries submitted in different time frames.

BRIEF DESCRIPTION OF THE DRAWINGS

  • Figure 1 is a high-level block diagram showing the computer system upon which the facility preferably executes.
  • Figure 2 is a flow diagram showing the steps preferably performed by the facility in order to generate a new rating table.
  • Figures 3 and 4 are table diagrams showing augmentation of an item rating table in accordance with step 206 (Figure 2).
  • Figure 5 is a table diagram showing the generation of rating tables for composite periods of time from rating tables for constituent periods of time.
  • Figure 6 is a table diagram showing a rating table for a composite period.
  • Figure 7 is a flow diagram showing the steps preferably performed by the facility in order to identify user selections within a web server log.
  • Figure 8 is a flow diagram showing the steps preferably performed by the facility to order a query result using a rating table by generating a ranking value for each item in the query result.
  • Figure 9 is a flow diagram showing the steps preferably performed by the facility to select a few items in a query result having the highest ranking values using a rating table.
  • the present invention provides a software facility (“the facility”) for identifying the items most relevant to a current query based on items selected in connection with similar queries.
  • the facility preferably generates ranking values for items indicating their level of relevance to the current query, which specifies one or more query terms.
  • the facility generates a ranking value for an item by combining rating scores, produced by a rating function, that each correspond to the level of relevance of the item to queries containing one of the query terms.
  • the rating function preferably retrieves a rating score for the combination of an item and a term from a rating table generated by the facility.
  • the scores in the rating table preferably reflect, for a particular item and term, how often users have selected the item when the item has been identified in query results produced for queries containing the term.
  • the facility uses the rating scores to either generate a ranking value for each item in a query result, or generate ranking values for a smaller number of items in order to select a few items having the top ranking values.
  • the facility combines the rating scores corresponding to that item and the terms of the query.
  • the facility preferably loops through the items in the query results and, for each item, combines all of the rating scores corresponding to that item and any of the terms in the query.
  • the facility preferably loops through the terms in the query, and, for each term, identifies the top few rating scores for that term and any item. The facility then combines the scores identified for each item to generate ranking values for a relatively small number of items, which may include items not identified in the query result. Indeed, these embodiments of the invention are able to generate ranking values for and display items even in cases in which the query result is empty, i.e., when no items completely satisfy the query.
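The term-driven embodiment just described can be sketched as below. This is a minimal sketch under stated assumptions, not the patent's implementation: the rating table is modeled as a dictionary keyed by `(term, item identifier)`, the function name and the `per_term` and `limit` parameters are hypothetical, and scores are combined by summing. The scores 211 and 116 are the ones the description later cites for item "1883823064"; the other scores are invented for illustration.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

RatingTable = Dict[Tuple[str, str], int]  # (term, item identifier) -> rating score


def top_items_for_query(
    rating_table: RatingTable, terms: Sequence[str], per_term: int = 3, limit: int = 5
) -> List[Tuple[str, int]]:
    """For each query term take the top few rating scores for any item,
    then sum the scores collected for each item into a ranking value."""
    totals: Dict[str, int] = defaultdict(int)
    for term in terms:
        scored = [(score, item) for (t, item), score in rating_table.items() if t == term]
        for score, item in heapq.nlargest(per_term, scored):
            totals[item] += score
    # Highest ranking value first; ties broken by item identifier.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


table: RatingTable = {
    ("human", "1883823064"): 211,
    ("dynamics", "1883823064"): 116,
    ("dynamics", "1887650024"): 40,  # hypothetical score
    ("human", "0000000001"): 5,      # hypothetical item and score
}
ranked = top_items_for_query(table, ["human", "dynamics", "seagal"])
```

Because the loop never consults a query result, it still returns candidates when a term such as "seagal" matches nothing and the conjunctive query result would be empty.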
  • the facility preferably orders the items of the query result in decreasing order of ranking value.
  • the facility may also use the ranking values to subset the items in the query result to a smaller number of items.
  • a query result for a query containing the query terms "human" and "dynamic" may contain a book about human dynamics and a book about the effects on human beings of particle dynamics.
  • selections by users from early query results produced for queries containing the term "human" show that these users select the human dynamics book much more frequently than they select the particle dynamics book.
  • the facility therefore ranks the human dynamics book higher than the particle dynamics book, allowing users, most of whom are more interested in the human dynamics book, to select it more easily. This benefit of the facility is especially useful in conjunction with the large, heterogeneous query results that are typically generated for single-term queries, which are commonly submitted by users.
  • Various embodiments of the invention base rating scores on different kinds of selection actions performed by the users on items identified in query results. These include whether the user displayed additional information about an item, how much time the user spent viewing the additional information about the item, how many hyperlinks the user followed within the additional information about the item, whether the user added the item to his or her shopping basket, and whether the user ultimately purchased the item. Embodiments of the invention also consider selection actions not relating to query results, such as typing an item's item identifier rather than choosing the item from a query result. Additional embodiments of the invention incorporate into the ranking process information about the user submitting the query by maintaining and applying separate rating scores for users in different demographic groups, such as those of the same sex, age, income, or geographic category. Certain embodiments also incorporate behavioral information about specific users. Further, rating scores may be produced by a rating function that combines different types of information reflecting collective and individual user preferences. Some embodiments of the invention utilize specialized strategies for incorporating into the rating scores information about queries submitted in different time frames.
  • FIG. 1 is a high-level block diagram showing the computer system upon which the facility preferably executes.
  • the computer system 100 comprises a central processing unit (CPU) 110, input/output devices 120, and a computer memory (memory) 130.
  • Among the input/output devices are a storage device 121, such as a hard disk drive; a computer-readable media drive 122, which can be used to install software products, including the facility, which are provided on a computer-readable medium, such as a CD-ROM; and a network connection 123 for connecting the computer system 100 to other computer systems (not shown).
  • the memory 130 preferably contains a query server 131 for generating query results from queries, a query result ranking facility 132 for automatically ranking the items in a query result in accordance with collective user preferences, and item rating tables 133 used by the facility. While the facility is preferably implemented on a computer system configured as described above, those skilled in the art will recognize that it may also be implemented on computer systems having different configurations.
  • the facility preferably generates a new rating table periodically, and, when a query result is received, uses the last-generated rating table to rank the items in the query result.
  • Figure 2 is a flow diagram showing the steps preferably performed by the facility in order to generate a new rating table.
  • the facility initializes a rating table for holding entries each indicating the rating score for a particular combination of a query term and an item identifier.
  • the rating table preferably has no entries when it is initialized.
  • the facility identifies all of the query result item selections made by users during the period of time for which the rating table is being generated.
  • the rating table may be generated for the queries occurring during a period of time such as a day, a week, or month.
  • This group of queries is termed a "rating set" of queries.
  • the facility also identifies the terms of the queries that produced these query results in step 202. Performance of step 202 is discussed in greater detail below in conjunction with Figure 7.
  • In steps 203-208, the facility loops through each item selection from a query result that was made by a user during the time period.
  • the facility identifies the terms used in the query that produced the query result in which the item selection took place.
  • In steps 205-207, the facility loops through each term in the query.
  • the facility increases the rating score in the rating table corresponding to the current term and item. Where an entry does not yet exist in the rating table for the term and item, the facility adds a new entry to the rating table for the term and item.
  • Increasing the rating score preferably involves adding an increment value, such as 1, to the existing rating score for the term and item.
  • In step 207, if additional terms remain to be processed, the facility loops back to step 205 to process the next term in the query, else the facility continues in step 208.
  • In step 208, if additional item selections remain to be processed, then the facility loops back to step 203 to process the next item selection, else these steps conclude.
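The table-generation loop of Figure 2 can be sketched as follows. The function names and the dictionary representation (keyed by `(term, item identifier)`) are assumptions made for illustration; the increment of 1 is the example value the text gives.

```python
from typing import Dict, Iterable, Sequence, Tuple

RatingTable = Dict[Tuple[str, str], int]  # (term, item identifier) -> rating score


def augment(table: RatingTable, query_terms: Sequence[str], item_id: str, increment: int = 1) -> None:
    """Inner loop: raise the score for every (term, item) pair of one selection,
    adding a new entry where none exists yet."""
    for term in query_terms:
        table[(term, item_id)] = table.get((term, item_id), 0) + increment


def build_rating_table(selections: Iterable[Tuple[Sequence[str], str]]) -> RatingTable:
    """Outer loop: start from an empty table and process each item selection
    together with the terms of the query that produced it."""
    table: RatingTable = {}
    for query_terms, item_id in selections:
        augment(table, query_terms, item_id)
    return table


# Existing scores 45 and 22 for one item; attaching 45 to "human" is an assumption.
existing: RatingTable = {("human", "1883823064"): 45, ("dynamics", "1883823064"): 22}
augment(existing, ["human", "dynamics"], "1883823064")
```

After the call, the two scores in `existing` are 46 and 23, and any term not seen before for that item would simply start at 1.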
  • Figures 3 and 4 are table diagrams showing augmentation of an item rating table in accordance with step 206 (Figure 2).
  • Figure 3 shows the state of the item rating table before its augmentation.
  • the table 300 contains a number of entries, including entries 301-306.
  • Each entry contains the rating score for a particular combination of a query term and an item identifier. For example, entry 302 identifies the score "22" for the term "dynamics" and the item identifier "1883823064".
  • the facility uses various other data structures to store the rating scores, such as sparse arrays.
  • the facility identifies the selection of the item having item identifier "1883823064" from a query result produced by a query specifying the query terms "human" and "dynamics".
  • Figure 4 shows the state of the item rating table after the item rating table is augmented by the facility to reflect this selection.
  • the facility has incremented the score for this entry from "45" to "46". Similarly, the facility has incremented the rating score for this item identifier and the term "dynamics" from "22" to "23".
  • the facility augments the rating table in a similar manner for the other selections from query results that it identifies during the time period. Rather than generating a new rating table from scratch using the steps shown in Figure 2 each time new selection information becomes available, the facility preferably generates and maintains separate rating tables for different constituent time periods, of a relatively short length, such as one day. Each time a rating table is generated for a new constituent time period, the facility preferably combines this new rating table with existing rating tables for earlier constituent time periods to form a rating table for a composite period.
  • FIG. 5 is a table diagram showing the generation of rating tables for composite periods of time from rating tables for constituent periods of time. It can be seen in Figure 5 that rating tables 501-506 each correspond to a single day between 8-Feb-98 and 13-Feb-98. Each time a new constituent period is completed, the facility generates a new rating table reflecting the user selections made during that constituent period. For example, at the end of 12-Feb-98, the facility generates rating table 505, which reflects all of the user selections occurring during 12-Feb-98. After the facility generates a new rating table for a completed constituent period, the facility also generates a new rating table for a composite period ending with that constituent period.
  • After generating the rating table 505 for the constituent period 12-Feb-98, the facility generates rating table 515 for the composite period 8-Feb-98 to 12-Feb-98.
  • the facility preferably generates such a rating table for a composite period by combining the entries of the rating tables for the constituent periods making up the composite period, and combining the scores of corresponding entries, for example, by summing them.
  • the scores in rating tables for more recent constituent periods are weighted more heavily than those in rating tables for less recent constituent periods.
  • the rating table for the most recent composite period is preferably used. That is, until rating table 516 can be generated, the facility preferably uses rating table 515 to rank query results. After rating table 516 is generated, the facility preferably uses rating table 516 to rank query results.
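Combining constituent-period tables into a composite-period table might look like the sketch below. The summing of corresponding entries follows the text; the `weights` parameter is one hypothetical way to weight recent periods more heavily, since the description does not specify a weighting scheme. All names and the sample scores other than 23 are illustrative.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence, Tuple

RatingTable = Dict[Tuple[str, str], float]  # (term, item identifier) -> rating score


def combine_rating_tables(
    tables: Sequence[RatingTable], weights: Optional[Sequence[float]] = None
) -> RatingTable:
    """Merge constituent tables (oldest first) into one composite table.

    Entries with the same (term, item) key are summed; an entry that appears
    in only one constituent table is carried over. Weights default to 1.0.
    """
    if weights is None:
        weights = [1.0] * len(tables)
    if len(weights) != len(tables):
        raise ValueError("need exactly one weight per constituent table")
    combined: Dict[Tuple[str, str], float] = defaultdict(float)
    for table, weight in zip(tables, weights):
        for key, score in table.items():
            combined[key] += weight * score
    return dict(combined)


day_1 = {("dynamics", "1883823064"): 23}
day_2 = {("dynamics", "1883823064"): 10, ("dynamics", "1887650024"): 4}
composite = combine_rating_tables([day_1, day_2])
recency_weighted = combine_rating_tables([day_1, day_2], weights=[0.5, 1.0])
```

Keeping one small table per day means a new composite table can be produced by re-merging a sliding window of days rather than re-reading every selection.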
  • the lengths of both constituent periods and composite periods are preferably configurable.
  • Figure 6 is a table diagram showing a rating table for a composite period.
  • the contents of rating table 600 constitute the combination of the contents of rating table 400 with several other rating tables for constituent periods.
  • the score for entry 602 is "116", or about five times the score for corresponding entry 402.
  • rating table 400 does not contain an entry for the term "dynamics" and the item identifier "1887650024"
  • entry 607 has been added to table 600 for this combination of term and item identifier, as a
  • the process used by the facility to identify user selections is dependent upon both the kind of selection action used by the facility and the manner in which the data relating to such selection actions is stored.
  • One preferred embodiment uses as its selection action requests to display more information about items identified in query results.
  • the facility extracts this information from logs generated by a web server that generates query results for a user using a web client, and allows the user to select an item with the web client in order to display additional information about it.
  • a web server generally maintains a log detailing all of the HTTP requests that it has received from web clients and responded to. Such a log is generally made up of entries, each containing information about a different HTTP request. Such logs are generally organized chronologically.
  • Log Entry 1 below is a sample log entry showing an HTTP request submitted by a web client on behalf of the user that submits a query.
  • HTTP_REFERER http://www.amazon.com/book_query_page
  • the entry further contains a user identifier corresponding to the identity of the user and, in some embodiments, also to this particular interaction with the web server.
  • In response to receiving the HTTP request documented in Log Entry 1, the query server generates a query result for the query and returns it to the web client submitting the query. Later the user selects an item identified in the query result, and the web client submits another HTTP request to display detailed information about the selected item.
  • Log Entry 2, which occurs at a point after Log Entry 1 in the log, describes this second HTTP request.
  • HTTP_REFERER http://www.amazon.com/book_query
  • the facility preferably identifies user selections by traversing these logs. Such traversal can occur either in a batch processing mode after a log for a specific period of time has been completely generated, or in a real-time processing mode so that log entries are processed as soon as they are generated.
  • FIG. 7 is a flow diagram showing the steps preferably performed by the facility in order to identify user selections within a web server log.
  • the facility positions a first pointer at the top, or beginning, of the log. The facility then repeats steps 702-708 until the first pointer reaches the end of the log.
  • the facility traverses forward with the first pointer to the next item selection event. In terms of the log entry shown above, step 703 involves traversing forward through log entries until one is found that contains in its "HTTP_REFERER" line a keyword denoting a search entry, such as "book_query".
  • the facility extracts item identifier "1883823064" and session identifier "82707238761".
  • the facility synchronizes the position of the second pointer with the position of the first pointer. That is, the facility makes the second pointer point to the same log entry as the first pointer.
  • the facility traverses backwards with the second pointer to a query event having a matching user identifier.
  • the facility traverses backward to the log entry having the keyword "book_query" in its "PATH_INFO" line, and having a matching user identifier on its "User Identifier" line.
  • the facility extracts from the query event to which the second pointer points the terms of the query.
  • the facility extracts the quoted words from the query log entry to which the second pointer points, in the lines after the "PATH_INFO" line.
  • the facility extracts the terms "Seagal", "Human", and "Dynamics".
  • In step 708, if the first pointer has not yet reached the end of the log, then the facility loops back to step 702 to continue processing the log, else these steps conclude.
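The two-pointer traversal of Figure 7 can be sketched over an already-parsed log. The `LogEntry` structure is an assumption: the raw log format is only partly shown in the text, so this sketch supposes each entry has been reduced to a kind ("query" or "selection"), a user identifier, and either query terms or an item identifier. Function and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class LogEntry:
    kind: str                       # "query" or "selection" (assumed classification)
    user_id: str
    terms: Tuple[str, ...] = ()     # set on query entries
    item_id: Optional[str] = None   # set on selection entries


def identify_selections(log: Sequence[LogEntry]) -> List[Tuple[Tuple[str, ...], str]]:
    """Return (query terms, item identifier) pairs found in a chronological log."""
    selections: List[Tuple[Tuple[str, ...], str]] = []
    first = 0  # first pointer starts at the top of the log
    while first < len(log):
        entry = log[first]
        if entry.kind == "selection" and entry.item_id is not None:
            second = first  # synchronize the second pointer with the first
            while second >= 0:  # traverse backward to the matching query event
                candidate = log[second]
                if candidate.kind == "query" and candidate.user_id == entry.user_id:
                    selections.append((candidate.terms, entry.item_id))
                    break
                second -= 1
        first += 1  # continue forward until the end of the log
    return selections


log = [
    LogEntry("query", "82707238761", terms=("Seagal", "Human", "Dynamics")),
    LogEntry("query", "11111111111", terms=("particle",)),
    LogEntry("selection", "82707238761", item_id="1883823064"),
    LogEntry("selection", "22222222222", item_id="1887650024"),  # no earlier query
]
found = identify_selections(log)
```

A selection with no earlier query from the same user is skipped here; whether the facility would instead record it some other way is not stated in the text.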
  • Extracting information about the selection from the web server log can be somewhat more involved. For example, where the facility uses purchase of the item as the selection action, instead of identifying a log entry describing a request by the user for more information about an item, like Log Entry 2, the facility instead identifies a log entry describing a request to purchase items in a "shopping basket." The facility then traverses backwards in the log, using the entries describing requests to add items to and remove items from the shopping basket to determine which items were in the shopping basket at the time of the request to purchase. The facility then continues traversing backward in the log to identify the log entry describing the query, like Log Entry 1, and to extract the search terms.
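The backward reconstruction of the shopping basket could be sketched as below. The event tuples `(user identifier, action, item identifier)` and the action names "add", "remove", and "purchase" are hypothetical stand-ins for whatever the log actually records, and stopping at the user's previous purchase assumes that a purchase empties the basket.

```python
from typing import Dict, Optional, Sequence, Set, Tuple

Event = Tuple[str, str, Optional[str]]  # (user identifier, action, item identifier)


def basket_at_purchase(log: Sequence[Event], purchase_index: int) -> Set[str]:
    """Walk backward from a purchase request to find what was in the basket."""
    user, action, _ = log[purchase_index]
    if action != "purchase":
        raise ValueError("purchase_index must point at a purchase event")
    in_basket: Dict[str, bool] = {}
    for other_user, other_action, item in reversed(log[:purchase_index]):
        if other_user != user:
            continue
        if other_action == "purchase":
            break  # assumption: an earlier purchase emptied the basket
        if other_action in ("add", "remove") and item is not None and item not in in_basket:
            # Going backward, the first event seen for an item is its latest one.
            in_basket[item] = other_action == "add"
    return {item for item, present in in_basket.items() if present}


events = [
    ("u", "add", "A"),
    ("u", "add", "B"),
    ("v", "add", "C"),
    ("u", "remove", "A"),
    ("u", "purchase", None),
]
purchased = basket_at_purchase(events, 4)
```

Here item "A" was added and then removed, and "C" belongs to another user, so only "B" counts as purchased.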
  • Rather than relying solely on a web server log where item purchase is the selection action that is used by the facility, the facility alternatively uses a database separate from the web server log to determine which items are purchased in each
  • This information from the database is then matched up with the log entry containing the query terms for the query from which the item is selected for purchase.
  • This hybrid approach, using the web server logs and a separate database, may be used for any of the different kinds of selection actions. Additionally, where a database separate from the web server log contains all the information necessary to augment the rating table, the facility may use the database exclusively, and avoid traversing the web server log.
  • Figure 8 is a flow diagram showing the steps preferably performed by the facility to order a query result using a rating table by generating a ranking value for each item in the query result.
  • In steps 801-807, the facility loops through each item identified in the query result.
  • In step 802, the facility initializes a ranking value for the current item.
  • In steps 803-805, the facility loops through each term occurring in the query.
  • In step 804, the facility determines the rating score contained by the most recently-generated rating table for the current term and item.
  • In step 805, if any terms of the query remain to be processed, then the facility loops back to step 803, else the facility continues in step 806.
  • In step 806, the facility combines the scores for the current item to generate a ranking value for the item.
  • In processing the datum having item identifier "1883823064", the facility combines the score "116" extracted from entry 602 for this item and the term "dynamics", and the score "211" extracted from entry 605 for this item and the term "human".
  • Step 806 preferably involves summing these scores.
  • scores may be combined in other ways, however. In particular, scores may be adjusted to more directly reflect the number of query terms that are matched by the item, so that items that match more query terms than others are favored in the ranking.
  • In step 807, if any items remain to be processed, the facility loops back to step 801 to process the next item, else the facility continues in step 808.
  • In step 808, the facility displays the items identified in the query result in accordance with the ranking values generated for the items in step 806.
  • Step 808 preferably involves sorting the items in the query result in decreasing order of their ranking values, and/or subsetting the items in the query
  • After step 808, these steps conclude.
  • Figure 9 is a flow diagram showing the steps preferably performed by the facility to select a few items in a query result having the highest ranking values using a rating table.
  • the facility loops through each term in the query.
  • the facility identifies, among the table entries for the current term, those entries having the three highest rating scores. For example, with reference to Figure 6, if the only entries in item rating table 600 for the term "dynamics" are entries 601, 602, 603, and 607, the facility would identify entries 601, 602, and 603, which are the entries for the term "dynamics" having the three highest rating scores. In additional preferred embodiments, a small number of table entries other than three is used.
  • In step 903, if additional terms remain in the query to be processed, then the facility loops back to step 901 to process the next term in the query, else the facility continues in step 904.
  • In steps 904-906, the facility loops through each unique item among the identified entries.
  • In step 905, the facility combines all of the scores for the item among the identified entries.
  • In step 906, if additional unique items remain among the identified entries to be processed, then the facility loops back to step 904 to process the next unique item, else the facility continues in step 907.
  • the facility selects for prominent display items having the top three combined scores.
  • the facility selects a small number of items having the top combined scores that is other than three. In the example discussed above, the facility would select for prominent display the items having item identifiers "1883823064", "0814403484", and "9676530409".
  • Because the facility in step 907 selects items without regard for their presence in the query result, the facility may select items that are not in the query result.
  • This aspect of this embodiment is particularly advantageous in situations in which a complete query result is not available when the facility is invoked. Such is the case, for instance, where the query server only provides a portion of the items satisfying the query at a time.
  • This aspect of the invention is further advantageous in that, by selecting items without regard for their presence in the query result, the facility is able to select and display to the user items relating to the query even where the query result is empty, i.e., when no items completely satisfy the query.
  • the facility may be used to rank query results of all types.
  • the facility may use various formulae to determine in the case of each item selection, the amount by which to augment rating scores with respect to the selection. Further, the facility may employ various formulae to combine rating scores into a ranking value for an item.
  • the facility may also use a variety of different kinds of selection actions to augment the rating table, and may augment the rating table for more than one kind of selection action at a time. Additionally, the facility may augment the rating table to reflect selections by users other than human users, such as software agents or other types of artificial users.
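The selection of a few prominent items described for Figure 9 can be sketched roughly as follows. This is an illustrative sketch only, not the facility's actual implementation: the function name `select_top_items`, the dictionary layout keyed by `(term, item_id)`, and every score other than "116" and "211" are assumptions made for the example.

```python
import heapq
from collections import defaultdict


def select_top_items(query_terms, rating_table, per_term=3, top_n=3):
    """Pick the items with the highest combined scores for a query.

    rating_table maps (term, item_id) -> rating score (assumed layout).
    Items are chosen from the rating table alone, so they need not
    appear in the query result, which may even be empty.
    """
    combined = defaultdict(int)
    for term in query_terms:
        # All table entries for the current term, as (score, item_id) pairs.
        entries = [(score, item) for (t, item), score in rating_table.items()
                   if t == term]
        # Keep only the few highest-scoring entries for this term.
        for score, item in heapq.nlargest(per_term, entries):
            combined[item] += score  # combine scores by summing
    # Return the items with the top combined scores, best first.
    return heapq.nlargest(top_n, combined, key=combined.get)


# Scores "116" and "211" come from the text; the rest are hypothetical.
table = {
    ("dynamics", "0801062272"): 7,
    ("dynamics", "1883823064"): 116,
    ("dynamics", "9676530409"): 80,
    ("dynamics", "1887650024"): 3,
    ("human", "1883823064"): 211,
    ("human", "0814403484"): 150,
}
top = select_top_items(["human", "dynamics"], table)
```

Because the lookup is driven by the query terms rather than by the query result, a query whose terms have no table entries simply yields an empty selection.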

Abstract

The present invention provides a software facility for identifying the items most relevant to a current query based on items selected in connection with similar queries. In preferred embodiments of the invention, the facility receives a query specifying one or more query terms. In response, the facility generates a query result identifying a plurality of items that satisfy the query. The facility then produces a ranking value for at least a portion of the items identified in the query result by combining the relative frequencies with which users selected that item from the query results generated from queries specifying each of the terms specified by the query. The facility identifies as most relevant those items having the highest ranking values.

Description

IDENTIFYING THE ITEMS MOST RELEVANT TO A CURRENT QUERY BASED ON ITEMS SELECTED IN CONNECTION WITH SIMILAR QUERIES
TECHNICAL FIELD
The present invention is directed to the field of query processing.
BACKGROUND OF THE INVENTION
Many World Wide Web sites permit users to perform searches to identify a small number of interesting items among a much larger domain of items. As an example, several web index sites permit users to search for particular web sites among most of the known web sites. Similarly, many online merchants, such as booksellers, permit users to search for particular products among all of the products that can be purchased from a merchant. In many cases, users perform searches in order to ultimately find a single item within an entire domain of items.
In order to perform a search, a user submits a query containing one or more query terms. The query also explicitly or implicitly identifies a domain of items to search. For example, a user may submit a query to an online bookseller containing terms that the user believes are words in the title of a book. A query server program processes the query to identify within the domain items matching the terms of the query. The items identified by the query server program are collectively known as a query result. In the example, the query result is a list of books whose titles contain some or all of the query terms. The query result is typically displayed to the user as a list of items. This list may be ordered in various ways. For example, the list may be ordered alphabetically or numerically based on a property of each item, such as the title, author, or release date of each book. As another example, the list may be ordered based on the extent to which each identified item matches the terms of the query.
When the domain for a query contains a large number of items, it is common for query results to contain tens or hundreds of items. Where the user is performing the search in order to find a single item, application of conventional approaches to ordering the query result often fails to place the sought item or items near the top of the query result, so that the user must read through many other items in the query result before reaching the sought item. In view of this disadvantage of conventional approaches to ordering query results, a new, more effective technique for automatically ordering query results in accordance with collective and individual user behavior would have significant utility.
Further, it is fairly common for users to specify queries that are not satisfied by any items. This may happen, for example, where a user submits a detailed query that is very narrow, or where a user mistypes or misremembers a term in the query. In such cases, conventional techniques, which present only items that satisfy the query, present no items to the user. When no items are presented to a user in response to issuing a query, the user can become frustrated with the search engine, and may even discontinue its use. Accordingly, a technique for displaying items relating to at least some of the terms in a query even when no items completely match the query would have significant utility.
In order to satisfy this need, some search engines adopt a strategy of effectively automatically revising the query until a non-empty result set is produced. For example, a search engine may progressively delete conjunctive, i.e., ANDed, terms from a multiple term query until the result set produced for that query contains items. This strategy has the disadvantage that important information for choosing the correct items can be lost when query terms are arbitrarily deleted. As a result, the first nonempty result set can be quite large, and may contain a large percentage of items that are irrelevant to the original query as a whole. For this reason, a more effective technique for displaying items relating to at least some of the terms in a query even when no items completely match the query would have significant utility.
SUMMARY OF THE INVENTION
The present invention provides a software facility ("the facility") for identifying the items most relevant to a current query based on items selected in connection with similar queries. The facility preferably generates ranking values for items indicating their level of relevance to the current query, which specifies one or more query terms. The facility generates a ranking value for an item by combining rating scores, produced by a rating function, that each correspond to the level of relevance of the item to queries containing one of the query terms. The rating function preferably retrieves a rating score for the combination of an item and a term from a rating table generated by the facility. The scores in the rating table preferably reflect, for a particular item and term, how often users have selected the item when the item has been identified in query results produced for queries containing the term.
In different embodiments, the facility uses the rating scores to either generate a ranking value for each item in a query result, or generate ranking values for a smaller number of items in order to select a few items having the top ranking values. To generate a ranking value for a particular item in a query result, the facility combines the rating scores corresponding to that item and the terms of the query. In embodiments in which the goal is to generate ranking values for each item in the query result, the facility preferably loops through the items in the query results and, for each item, combines all of the rating scores corresponding to that item and any of the terms in the query. On the other hand, in embodiments in which the goal is to select a few items in the query result having the largest ranking values, the facility preferably loops through the terms in the query, and, for each term, identifies the top few rating scores for that term and any item. The facility then combines the scores identified for each item to generate ranking values for a relatively small number of items, which may include items not identified in the query result. Indeed, these embodiments of the invention are able to generate ranking values for and display items even in cases in which the query result is empty, i.e., when no items completely satisfy the query.
Once the facility has generated ranking values for at least some items, the facility preferably orders the items of the query result in decreasing order of ranking value. The facility may also use the ranking values to subset the items in the query result to a smaller number of items. By ordering and/or subsetting the items in the query result in this way in accordance with collective and individual user behavior rather than in accordance with attributes of the items, the facility substantially increases the likelihood that the user will quickly find within the query result the particular item or items that he or she seeks. For example, while a query result for a query containing the query terms "human" and "dynamic" may contain a book about human dynamics and a book about the effects on human beings of particle dynamics, selections by users from early query results produced for queries containing the term "human" show that these users select the human dynamics book much more frequently than they select the particle dynamics book. The facility therefore ranks the human dynamics book higher than the particle dynamics book, allowing users that are more interested in the human dynamics book to select it more easily. This benefit of the facility is especially useful in conjunction with the large, heterogeneous query results that are typically generated for single-term queries, which are commonly submitted by users.
Various embodiments of the invention base rating scores on different kinds of selection actions performed by the users on items identified in query results. These include whether the user displayed additional information about an item, how much time the user spent viewing the additional information about the item, how many hyperlinks the user followed within the additional information about the item, whether the user added the item to his or her shopping basket, and whether the user ultimately purchased the item. Embodiments of the invention also consider selection actions not relating to query results, such as typing an item's item identifier rather than choosing the item from a query result. Additional embodiments of the invention incorporate into the ranking process information about the user submitting the query by maintaining and applying separate rating scores for users in different demographic groups, such as those of the same sex, age, income, or geographic category. Certain embodiments also incorporate behavioral information about specific users. Further, rating scores may be produced by a rating function that combines different types of information reflecting collective and individual user preferences. Some embodiments of the invention utilize specialized strategies for incorporating into the rating scores information about queries submitted in different time frames.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a high-level block diagram showing the computer system upon which the facility preferably executes.
Figure 2 is a flow diagram showing the steps preferably performed by the facility in order to generate a new rating table.
Figures 3 and 4 are table diagrams showing augmentation of an item rating table in accordance with step 206 (Figure 2).
Figure 5 is a table diagram showing the generation of rating tables for composite periods of time from rating tables for constituent periods of time.
Figure 6 is a table diagram showing a rating table for a composite period.
Figure 7 is a flow diagram showing the steps preferably performed by the facility in order to identify user selections within a web server log.
Figure 8 is a flow diagram showing the steps preferably performed by the facility to order a query result using a rating table by generating a ranking value for each item in the query result.
Figure 9 is a flow diagram showing the steps preferably performed by the facility to select a few items in a query result having the highest ranking values using a rating table.
DETAILED DESCRIPTION OF THE INVENTION
The present invention provides a software facility ("the facility") for identifying the items most relevant to a current query based on items selected in connection with similar queries. The facility preferably generates ranking values for items indicating their level of relevance to the current query, which specifies one or more query terms. The facility generates a ranking value for an item by combining rating scores, produced by a rating function, that each correspond to the level of relevance of the item to queries containing one of the query terms. The rating function preferably retrieves a rating score for the combination of an item and a term from a rating table generated by the facility. The scores in the rating table preferably reflect, for a particular item and term, how often users have selected the item when the item has been identified in query results produced for queries containing the term.
In different embodiments, the facility uses the rating scores to either generate a ranking value for each item in a query result, or generate ranking values for a smaller number of items in order to select a few items having the top ranking values. To generate a ranking value for a particular item in a query result, the facility combines the rating scores corresponding to that item and the terms of the query. In embodiments in which the goal is to generate ranking values for each item in the query result, the facility preferably loops through the items in the query results and, for each item, combines all of the rating scores corresponding to that item and any of the terms in the query. On the other hand, in embodiments in which the goal is to select a few items in the query result having the largest ranking values, the facility preferably loops through the terms in the query, and, for each term, identifies the top few rating scores for that term and any item. The facility then combines the scores identified for each item to generate ranking values for a relatively small number of items, which may include items not identified in the query result. Indeed, these embodiments of the invention are able to generate ranking values for and display items even in cases in which the query result is empty, i.e., when no items completely satisfy the query.
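As a rough illustration of the first kind of embodiment, the following sketch ranks every item in a query result by summing its rating scores over the query terms. The function name `rank_query_result`, the `(term, item_id)` dictionary layout, and the score of 50 are assumptions for the example; the scores "116" and "211" are the ones used later in the description.

```python
def rank_query_result(query_terms, result_items, rating_table):
    """Return (item_id, ranking_value) pairs, highest ranking value first.

    rating_table maps (term, item_id) -> rating score (assumed layout).
    A term/item pair with no table entry contributes a score of 0.
    """
    ranked = []
    for item_id in result_items:                  # loop over items in the result
        ranking_value = 0                         # initialize the ranking value
        for term in query_terms:                  # loop over query terms
            ranking_value += rating_table.get((term, item_id), 0)
        ranked.append((item_id, ranking_value))
    # Order the result in decreasing order of ranking value.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


rating_table = {
    ("dynamics", "1883823064"): 116,
    ("human", "1883823064"): 211,
    ("dynamics", "9676530409"): 50,   # hypothetical score
}
ranked = rank_query_result(
    ["human", "dynamics"], ["9676530409", "1883823064"], rating_table
)
```

Summation is only one way to combine scores; another combining formula could be substituted inside the inner loop without changing the overall structure.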
Once the facility has generated ranking values for at least some items, the facility preferably orders the items of the query result in decreasing order of ranking value. The facility may also use the ranking values to subset the items in the query result to a smaller number of items. By ordering and/or subsetting the items in the query result in this way in accordance with collective and individual user behavior rather than in accordance with attributes of the items, the facility substantially increases the likelihood that the user will quickly find within the query result the particular item or items that he or she seeks. For example, while a query result for a query containing the query terms "human" and "dynamic" may contain a book about human dynamics and a book about the effects on human beings of particle dynamics, selections by users from early query results produced for queries containing the term "human" show that these users select the human dynamics book much more frequently than they select the particle dynamics book. The facility therefore ranks the human dynamics book higher than the particle dynamics book, allowing users, most of whom are more interested in the human dynamics book, to select it more easily. This benefit of the facility is especially useful in conjunction with the large, heterogeneous query results that are typically generated for single-term queries, which are commonly submitted by users.
Various embodiments of the invention base rating scores on different kinds of selection actions performed by the users on items identified in query results. These include whether the user displayed additional information about an item, how much time the user spent viewing the additional information about the item, how many hyperlinks the user followed within the additional information about the item, whether the user added the item to his or her shopping basket, and whether the user ultimately purchased the item. Embodiments of the invention also consider selection actions not relating to query results, such as typing an item's item identifier rather than choosing the item from a query result. Additional embodiments of the invention incorporate into the ranking process information about the user submitting the query by maintaining and applying separate rating scores for users in different demographic groups, such as those of the same sex, age, income, or geographic category. Certain embodiments also incorporate behavioral information about specific users. Further, rating scores may be produced by a rating function that combines different types of information reflecting collective and individual user preferences. Some embodiments of the invention utilize specialized strategies for incorporating into the rating scores information about queries submitted in different time frames.
Figure 1 is a high-level block diagram showing the computer system upon which the facility preferably executes. As shown in Figure 1, the computer system 100 comprises a central processing unit (CPU) 110, input/output devices 120, and a computer memory (memory) 130. Among the input/output devices are a storage device 121, such as a hard disk drive; a computer-readable media drive 122, which can be used to install software products, including the facility, which are provided on a computer-readable medium, such as a CD-ROM; and a network connection 123 for connecting the computer system 100 to other computer systems (not shown). The memory 130 preferably contains a query server 131 for generating query results from queries, a query result ranking facility 132 for automatically ranking the items in a query result in accordance with collective user preferences, and item rating tables 133 used by the facility. While the facility is preferably implemented on a computer system configured as described above, those skilled in the art will recognize that it may also be implemented on computer systems having different configurations.
The facility preferably generates a new rating table periodically, and, when a query result is received, uses the last-generated rating table to rank the items in the query result. Figure 2 is a flow diagram showing the steps preferably performed by the facility in order to generate a new rating table. In step 201, the facility initializes a rating table for holding entries each indicating the rating score for a particular combination of a query term and an item identifier. The rating table preferably has no entries when it is initialized. In step 202, the facility identifies all of the query result item selections made by users during the period of time for which the rating table is being generated. The rating table may be generated for the queries occurring during a period of time such as a day, a week, or a month. This group of queries is termed a "rating set" of queries. The facility also identifies the terms of the queries that produced these query results in step 202. Performance of step 202 is discussed in greater detail below in conjunction with Figure 7. In steps 203-208, the facility loops through each item selection from a query result that was made by a user during the time period. In step 204, the facility identifies the terms used in the query that produced the query result in which the item selection took place. In steps 205-207, the facility loops through each term in the query. In step 206, the facility increases the rating score in the rating table corresponding to the current term and item. Where an entry does not yet exist in the rating table for the term and item, the facility adds a new entry to the rating table for the term and item. Increasing the rating score preferably involves adding an increment value, such as 1, to the existing rating score for the term and item. In step 207, if additional terms remain to be processed, the facility loops back to step 205 to process the next term in the query, else the facility continues in step 208. In step 208, if additional item selections remain to be processed, then the facility loops back to step 203 to process the next item selection, else these steps conclude.
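The steps of Figure 2 can be sketched as a short routine. This is a minimal sketch under stated assumptions, not the facility's actual code: the name `build_rating_table`, the representation of each selection as a `(query_terms, item_id)` pair, and the lower-casing of terms are all choices made for the example.

```python
from collections import defaultdict


def build_rating_table(selections, increment=1):
    """Build a rating table from the item selections of one time period.

    selections is an iterable of (query_terms, item_id) pairs (assumed
    representation). The result maps (term, item_id) -> rating score.
    """
    table = defaultdict(int)          # starts with no entries (step 201)
    for query_terms, item_id in selections:       # loop over selections
        for term in query_terms:                  # loop over the query's terms
            # Assumption: terms are lower-cased so that "Dynamics" and
            # "dynamics" share one entry. A missing entry is created at 0
            # and then increased by the increment value (step 206).
            table[(term.lower(), item_id)] += increment
    return dict(table)


selections = [
    (["Human", "Dynamics"], "1883823064"),
    (["dynamics"], "1883823064"),
    (["dynamics"], "9676530409"),
]
table = build_rating_table(selections)
```

In this example the item "1883823064" ends the period with a score of 2 for "dynamics" and 1 for "human".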
Figures 3 and 4 are table diagrams showing augmentation of an item rating table in accordance with step 206 (Figure 2). Figure 3 shows the state of the item rating table before its augmentation. It can be seen that the table 300 contains a number of entries, including entries 301-306. Each entry contains the rating score for a particular combination of a query term and an item identifier. For example, entry 302 identifies the score "22" for the term "dynamics" and the item identifier "1883823064". It can be seen by examining entries 301-303 that, in query results produced from queries including the term "dynamics", the item having item identifier "1883823064" has been selected by users more frequently than the item having item identifier "9676530409", and much more frequently than the item having item identifier "0801062272". In additional embodiments, the facility uses various other data structures to store the rating scores, such as sparse arrays. In augmenting the item rating table 300, the facility identifies the selection of the item having item identifier "1883823064" from a query result produced by a query specifying the query terms "human" and "dynamics". Figure 4 shows the state of the item rating table after the item rating table is augmented by the facility to reflect this selection. It can be seen by comparing entry 405 in item rating table 400 to entry 305 in item rating table 300 that the facility has incremented the score for this entry from "45" to "46". Similarly, the facility has incremented the rating score for this item identifier and the term "dynamics" from "22" to "23". The facility augments the rating table in a similar manner for the other selections from query results that it identifies during the time period.
Rather than generating a new rating table from scratch using the steps shown in Figure 2 each time new selection information becomes available, the facility preferably generates and maintains separate rating tables for different constituent time periods, of a relatively short length, such as one day. Each time a rating table is generated for a new constituent time period, the facility preferably combines this new rating table with existing rating tables for earlier constituent time periods to form a rating table for a longer composite period of time. Figure 5 is a table diagram showing the generation of rating tables for composite periods of time from rating tables for constituent periods of time. It can be seen in Figure 5 that rating tables 501-506 each correspond to a single day between 8-Feb-98 and 13-Feb-98. Each time a new constituent period is completed, the facility generates a new rating table reflecting the user selections made during that constituent period. For example, at the end of 12-Feb-98, the facility generates rating table 505, which reflects all of the user selections occurring during 12-Feb-98. After the facility generates a new rating table for a completed constituent period, the facility also generates a new rating table for a composite period ending with that constituent period. For example, after generating the rating table 505 for the constituent period 12-Feb-98, the facility generates rating table 515 for the composite period 8-Feb-98 to 12-Feb-98. The facility preferably generates such a rating table for a composite period by combining the entries of the rating tables for the constituent periods making up the composite period, and combining the scores of corresponding entries, for example, by summing them. In one preferred embodiment, the scores in rating tables for more recent constituent periods are weighted more heavily than those in rating tables for less recent constituent periods. When ranking query results, the rating table for the most recent composite period is preferably used. That is, until rating table 516 can be generated, the facility preferably uses rating table 515 to rank query results. After rating table 516 is generated, the facility preferably uses rating table 516 to rank query results. The lengths of both constituent periods and composite periods are preferably configurable.
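Combining constituent rating tables into a composite table might look like the sketch below. The function name `combine_rating_tables`, the optional per-table `weights` list, and the sample scores are assumptions for illustration; the text specifies only that corresponding scores are combined, for example by summing, and that more recent periods may be weighted more heavily.

```python
def combine_rating_tables(constituent_tables, weights=None):
    """Combine per-period rating tables into one composite rating table.

    constituent_tables is a list of dicts mapping (term, item_id) -> score,
    ordered oldest to newest. weights, if given, holds one multiplier per
    table (a hypothetical way to favor recent periods); the default of 1
    for every table gives a plain sum.
    """
    if weights is None:
        weights = [1] * len(constituent_tables)
    composite = {}
    for table, weight in zip(constituent_tables, weights):
        for key, score in table.items():
            # An entry present in any constituent table appears in the
            # composite table; corresponding scores are summed.
            composite[key] = composite.get(key, 0) + weight * score
    return composite


# Hypothetical daily tables.
day_1 = {("dynamics", "1883823064"): 23}
day_2 = {("dynamics", "1883823064"): 20, ("dynamics", "1887650024"): 3}

plain = combine_rating_tables([day_1, day_2])
weighted = combine_rating_tables([day_1, day_2], weights=[1, 2])
```

Keeping the constituent tables separate means a new composite table can be produced each day by re-combining stored tables rather than re-reading the underlying selection data.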
Figure 6 is a table diagram showing a rating table for a composite period. By comparing the item rating table 600 shown in Figure 6 to item rating table 400 shown in Figure 4, it can be seen that the contents of rating table 600 constitute the combination of the contents of rating table 400 with several other rating tables for constituent periods. For example, the score for entry 602 is "116", or about five times the score for corresponding entry 402. Further, although rating table 400 does not contain an entry for the term "dynamics" and the item identifier "1887650024", entry 607 has been added to table 600 for this combination of term and item identifier, as a corresponding entry occurs in a rating table for one of the other constituent periods within the composite period.
The process used by the facility to identify user selections is dependent upon both the kind of selection action used by the facility and the manner in which the data relating to such selection actions is stored. One preferred embodiment uses as its selection action requests to display more information about items identified in query results. In this embodiment, the facility extracts this information from logs generated by a web server that generates query results for a user using a web client, and allows the user to select an item with the web client in order to display additional information about it. A web server generally maintains a log detailing all of the HTTP requests that it has received from web clients and responded to. Such a log is generally made up of entries, each containing information about a different HTTP request. Such logs are generally organized chronologically. Log Entry 1 below is a sample log entry showing an HTTP request submitted by a web client on behalf of the user that submits a query.
1. Friday, 13-Feb-98 16:59:27
2. User Identifier=82707238671
3. HTTP_REFERER=http://www.amazon.com/book_query_page
4. PATH_INFO=/book_query
5. author="Seagal"
6. title="Human Dynamics"
Log Entry 1
It can be seen by the occurrence of the keyword "book_query" in the "PATH_INFO" line 4 of Log Entry 1 that this log entry corresponds to a user's submission of a query. It further can be seen in term lines 5 and 6 that the query includes the terms "Seagal", "Human", and "Dynamics". In line 2, the entry further contains a user identifier corresponding to the identity of the user and, in some embodiments, also to this particular interaction with the web server.
In response to receiving the HTTP request documented in Log Entry 1, the query server generates a query result for the query and returns it to the web client submitting the query. Later the user selects an item identified in the query result, and the web client submits another HTTP request to display detailed information about the selected item. Log Entry 2, which occurs at a point after Log Entry 1 in the log, describes this second HTTP request.
1. Friday, 13-Feb-98 17:02:39
2. User Identifier=82707238671
3. HTTP_REFERER=http://www.amazon.com/book_query
4. PATH_INFO=/ISBN=1883823064
Log Entry 2
By comparing the user identifier in line 2 of Log Entry 2 to the user identifier in line 2 of Log Entry 1, it can be seen that these log entries correspond to the same user and time frame. In the "PATH_INFO" line 4 of Log Entry 2, it can be seen that the user has selected an item having item identifier ("ISBN") "1883823064". It can further be seen from the occurrence of the keyword "book_query" on the "HTTP_REFERER" line 3 that the selection of this item was from a query result.
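The way the two sample entries are distinguished can be sketched as follows. This is an illustration only: it assumes each log entry is available as a block of text with the leading line numbers already stripped, and the helper names parse_entry and classify are hypothetical rather than part of the described facility.

```python
LOG_ENTRY_1 = """Friday, 13-Feb-98 16:59:27
User Identifier=82707238671
HTTP_REFERER=http://www.amazon.com/book_query_page
PATH_INFO=/book_query
author="Seagal"
title="Human Dynamics"
"""

LOG_ENTRY_2 = """Friday, 13-Feb-98 17:02:39
User Identifier=82707238671
HTTP_REFERER=http://www.amazon.com/book_query
PATH_INFO=/ISBN=1883823064
"""

def parse_entry(text):
    """Turn one entry into a dict; the first line is treated as the timestamp."""
    lines = [line.strip() for line in text.strip().splitlines()]
    entry = {"timestamp": lines[0]}
    for line in lines[1:]:
        # Split on the first "=" only, so values such as "/ISBN=..." stay intact.
        key, _, value = line.partition("=")
        entry[key.strip()] = value.strip().strip('"')
    return entry

def classify(entry):
    """Label an entry as a query, a selection from a query result, or other."""
    path = entry.get("PATH_INFO", "")
    referer = entry.get("HTTP_REFERER", "")
    if "book_query" in path:
        return "query"
    if path.startswith("/ISBN=") and "book_query" in referer:
        return "selection"
    return "other"

first = parse_entry(LOG_ENTRY_1)
second = parse_entry(LOG_ENTRY_2)
print(classify(first), classify(second))
print(first["User Identifier"] == second["User Identifier"])
```

The path is checked before the referrer because the referrer of Log Entry 1 ("book_query_page") also contains the keyword "book_query".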
Where information about user selections is stored in web server logs such as those discussed above, the facility preferably identifies user selections by traversing these logs. Such traversal can occur either in a batch processing mode after a log for a specific period of time has been completely generated, or in a real-time processing mode so that log entries are processed as soon as they are generated.
Figure 7 is a flow diagram showing the steps preferably performed by the facility in order to identify user selections within a web server log. In step 701, the facility positions a first pointer at the top, or beginning, of the log. The facility then repeats steps 702-708 until the first pointer reaches the end of the log. In step 703, the facility traverses forward with the first pointer to the next item selection event. In terms of the log entries shown above, step 703 involves traversing forward through log entries until one is found that contains in its "HTTP_REFERER" line a keyword denoting a search entry, such as "book_query". In step 704, the facility extracts from this item selection event the identity of the item that was selected and the session identifier that identifies the user that selected the item. In terms of the log entries above, this involves reading the ten-digit number following the string "ISBN=" in the "PATH_INFO" line of the log entry, and reading the user identifier from the "User Identifier" line of the log entry. Thus, in Log Entry 2, the facility extracts item identifier "1883823064" and session identifier "82707238671". In step 705, the facility synchronizes the position of the second pointer with the position of the first pointer. That is, the facility makes the second pointer point to the same log entry as the first pointer. In step 706, the facility traverses backwards with the second pointer to a query event having a matching user identifier. In terms of the log entries above, the facility traverses backward to the log entry having the keyword "book_query" in its "PATH_INFO" line, and having a matching user identifier on its "User Identifier" line. In step 707, the facility extracts from the query event to which the second pointer points the terms of the query. In terms of the query log entries above, the facility extracts the quoted words from the query log entry to which the second pointer points, in the lines after the "PATH_INFO" line. Thus, in Log Entry 1, the facility extracts the terms "Seagal", "Human", and "Dynamics". In step 708, if the first pointer has not yet reached the end of the log, then the facility loops back to step 702 to continue processing the log, else these steps conclude.
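The two-pointer traversal of Figure 7 can be sketched roughly as below. The sketch assumes the log has already been parsed into a chronological list of dictionaries; that representation, the "fields" key holding the quoted query fields, and the function names are all hypothetical.

```python
import re

def query_terms(entry):
    """Step 707: split the quoted field values of a query entry into terms."""
    terms = []
    for value in entry.get("fields", {}).values():
        terms.extend(value.split())
    return terms

def extract_selections(log):
    """Return (user identifier, item identifier, query terms) for each selection."""
    results = []
    first = 0                                    # step 701: start of the log
    while first < len(log):                      # steps 702-708
        entry = log[first]
        match = re.search(r"ISBN=(\d{10})", entry.get("PATH_INFO", ""))
        if match and "book_query" in entry.get("HTTP_REFERER", ""):  # step 703
            item = match.group(1)                # step 704: selected item
            user = entry.get("User Identifier")  # step 704: user identifier
            second = first                       # step 705: synchronize pointers
            while second >= 0:                   # step 706: traverse backward
                candidate = log[second]
                if ("book_query" in candidate.get("PATH_INFO", "")
                        and candidate.get("User Identifier") == user):
                    results.append((user, item, query_terms(candidate)))
                    break
                second -= 1
        first += 1
    return results

log = [
    {"User Identifier": "82707238671",
     "HTTP_REFERER": "http://www.amazon.com/book_query_page",
     "PATH_INFO": "/book_query",
     "fields": {"author": "Seagal", "title": "Human Dynamics"}},
    {"User Identifier": "82707238671",
     "HTTP_REFERER": "http://www.amazon.com/book_query",
     "PATH_INFO": "/ISBN=1883823064",
     "fields": {}},
]

selections = extract_selections(log)
print(selections)
```

A selection with no earlier query by the same user is simply skipped in this sketch; the text does not say how the facility treats that case.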
When other selection actions are used by the facility, extracting information about the selection from the web server log can be somewhat more involved. For example, where the facility uses purchase of the item as the selection action, instead of identifying a log entry describing a request by the user for more information about an item, like Log Entry 2, the facility instead identifies a log entry describing a request to purchase items in a "shopping basket." The facility then traverses backwards in the log, using the entries describing requests to add items to and remove items from the shopping basket to determine which items were in the shopping basket at the time of the request to purchase. The facility then continues traversing backward in the log to identify the log entry describing the query, like Log Entry 1, and to extract the search terms.
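Reconstructing the shopping basket by backward traversal can be sketched as follows. The event format ("user", "action", "item") and the item identifiers are hypothetical, since the text shows no log entries for basket requests. The sketch also assumes that an earlier purchase by the same user empties the basket, which the text does not state.

```python
def basket_at_purchase(log, purchase_index):
    """Walk backward from a purchase request and return the items in the basket."""
    user = log[purchase_index]["user"]
    decided = {}  # item -> True if the item was in the basket at purchase time
    for index in range(purchase_index - 1, -1, -1):
        event = log[index]
        if event["user"] != user:
            continue
        if event["action"] == "purchase":
            break  # assumed: an earlier purchase emptied the basket
        if event["action"] in ("add", "remove") and event["item"] not in decided:
            # Going backward, the first event seen for an item is its latest one.
            decided[event["item"]] = event["action"] == "add"
    return {item for item, present in decided.items() if present}

events = [
    {"user": "u1", "action": "add", "item": "A"},
    {"user": "u1", "action": "add", "item": "B"},
    {"user": "u2", "action": "add", "item": "C"},
    {"user": "u1", "action": "remove", "item": "A"},
    {"user": "u1", "action": "purchase", "item": None},
]

basket = basket_at_purchase(events, 4)
print(basket)
```

Because the traversal runs backward, only the most recent add or remove request for each item needs to be examined.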
Rather than relying solely on a web server log where item purchase is the selection action that is used by the facility, the facility alternatively uses a database separate from the web server log to determine which items are purchased in each purchase transaction. This information from the database is then matched up with the log entry containing the query terms for the query from which the item is selected for purchase. This hybrid approach, using the web server logs and a separate database, may be used for any of the different kinds of selection actions. Additionally, where a database separate from the web server log contains all the information necessary to augment the rating table, the facility may use the database exclusively, and avoid traversing the web server log.
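One way the hybrid matching might work is sketched below. The text does not say how purchase records are matched to query log entries, so the rule used here (the same user's most recent earlier query) is an assumption, as are the record layouts, the function name, and the purchase time.

```python
from datetime import datetime

def match_purchases_to_queries(purchases, queries):
    """Pair each purchased item with the terms of the user's latest earlier query."""
    pairs = []
    for user, bought_at, items in purchases:
        earlier = [q for q in queries if q[0] == user and q[1] <= bought_at]
        if not earlier:
            continue  # no query on record for this purchase
        terms = max(earlier, key=lambda q: q[1])[2]
        for item in items:
            pairs.append((terms, item))
    return pairs

# Query taken from Log Entry 1; the purchase record is hypothetical database data.
queries = [
    ("82707238671", datetime(1998, 2, 13, 16, 59, 27), ["Seagal", "Human", "Dynamics"]),
]
purchases = [
    ("82707238671", datetime(1998, 2, 13, 17, 10, 0), ["1883823064"]),
]

pairs = match_purchases_to_queries(purchases, queries)
print(pairs)
```

Each resulting (terms, item) pair carries the same information as a selection extracted from the log alone, so it could feed the same rating-table update.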
The facility uses rating tables that it has generated to generate ranking values for items in new query results. Figure 8 is a flow diagram showing the steps preferably performed by the facility to order a query result using a rating table by generating a ranking value for each item in the query result. In steps 801-807, the facility loops through each item identified in the query result. In step 802, the facility initializes a ranking value for the current item. In steps 803-805, the facility loops through each term occurring in the query. In step 804, the facility determines the rating score contained by the most recently-generated rating table for the current term and item. In step 805, if any terms of the query remain to be processed, then the facility loops back to step 803, else the facility continues in step 806. In step 806, the facility combines the scores for the current item to generate a ranking value for the item. As an example, with reference to Figure 6, in processing the datum having item identifier "1883823064", the facility combines the score "116" extracted from entry 602 for this item and the term "dynamics", and the score "211" extracted from entry 605 for this item and the term "human". Step 806 preferably involves summing these scores. These scores may be combined in other ways, however. In particular, scores may be adjusted to more directly reflect the number of query terms that are matched by the item, so that items that match more query terms than others are favored in the ranking. In step 807, if any items remain to be processed, the facility loops back to step 801 to process the next item, else the facility continues in step 808. In step 808, the facility displays the items identified in the query result in accordance with the ranking values generated for the items in step 806. Step 808 preferably involves sorting the items in the query result in decreasing order of their ranking values, and/or subsetting the items in the query result to include only those items above a threshold ranking value, or only a predetermined number of items having the highest ranking values. After step 808, these steps conclude.
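The steps of Figure 8 can be sketched as follows, using summation as the combining rule of step 806. The dictionary layout of the rating table, the function and parameter names, and the treatment of a missing (term, item) entry as a score of zero are assumptions. The scores "116" and "211" come from the example above; the entry pairing the score "77" with the term "dynamics" is an assumed assignment.

```python
def rank_query_result(result_items, query_terms, rating_table,
                      min_value=None, top_n=None):
    """Return (item, ranking value) pairs in decreasing order of ranking value."""
    ranked = []
    for item in result_items:                    # steps 801-807
        value = 0                                # step 802
        for term in query_terms:                 # steps 803-805
            # Step 804 lookup, summed as in step 806; a missing entry counts as 0.
            value += rating_table.get((term, item), 0)
        ranked.append((item, value))
    ranked.sort(key=lambda pair: pair[1], reverse=True)   # step 808: sort
    if min_value is not None:                    # optional threshold subset
        ranked = [pair for pair in ranked if pair[1] > min_value]
    if top_n is not None:                        # optional fixed-size subset
        ranked = ranked[:top_n]
    return ranked

rating_table = {
    ("dynamics", "1883823064"): 116,   # entry 602
    ("human", "1883823064"): 211,      # entry 605
    ("dynamics", "0814403484"): 77,    # assumed term assignment
}

ranking = rank_query_result(["0814403484", "1883823064"],
                            ["human", "dynamics"], rating_table)
print(ranking)
```

Passing min_value or top_n corresponds to the two subsetting options mentioned for step 808.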
Figure 9 is a flow diagram showing the steps preferably performed by the facility to select a few items in a query result having the highest ranking values using a rating table. In steps 901-903, the facility loops through each term in the query. In step 902, the facility identifies, among the table entries for the current term, those entries having the three highest rating scores. For example, with reference to Figure 6, if the only entries in item rating table 600 for the term "dynamics" are entries 601, 602, 603, and 607, the facility would identify entries 601, 602, and 603, which are the entries for the term "dynamics" having the three highest rating scores. In additional preferred embodiments, a small number of table entries other than three is used. In step 903, if additional terms remain in the query to be processed, then the facility loops back to step 901 to process the next term in the query, else the facility continues in step 904. In steps 904-906, the facility loops through each unique item among the identified entries. In step 905, the facility combines all of the scores for the item among the identified entries. In step 906, if additional unique items remain among the identified entries to be processed, then the facility loops back to step 904 to process the next unique item, else the facility continues in step 907. As an example, if, in item rating table 600, the facility selected entries 601, 602, and 603 for the term "dynamics", and selected entries 604, 605, and 606 for the term "human", then the facility would combine the scores "116" and "211" for the item having item identifier "1883823064", and would use the following single scores for the remaining item identifiers: "77" for the item having item identifier "0814403484", "45" for the item having item identifier "9676530409", "12" for the item having item identifier "6303702473", and "4" for the item having item identifier "0801062272".
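Steps 901-906, together with a final selection of the items having the top combined scores, can be sketched as follows. The scores are those of the example above, but the text does not say which term each single score belongs to, so the term assignment in the table is an assumption, and the score for entry 607 is hypothetical. The function and parameter names are likewise illustrative.

```python
import heapq
from collections import defaultdict

def select_prominent_items(query_terms, rating_table, per_term=3, top=3):
    """Pick the items with the highest combined scores among per-term top entries."""
    identified = []
    for term in query_terms:                     # steps 901-903
        entries = [(score, item)
                   for (entry_term, item), score in rating_table.items()
                   if entry_term == term]
        identified.extend(heapq.nlargest(per_term, entries))   # step 902
    combined = defaultdict(int)
    for score, item in identified:               # steps 904-906
        combined[item] += score                  # step 905: sum per unique item
    best = sorted(combined.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in best[:top]]

rating_table = {
    ("dynamics", "0814403484"): 77,    # assumed to be entry 601
    ("dynamics", "1883823064"): 116,   # entry 602
    ("dynamics", "6303702473"): 12,    # assumed to be entry 603
    ("dynamics", "1887650024"): 3,     # entry 607; score is hypothetical
    ("human", "9676530409"): 45,       # assumed to be entry 604
    ("human", "1883823064"): 211,      # entry 605
    ("human", "0801062272"): 4,        # assumed to be entry 606
}

chosen = select_prominent_items(["dynamics", "human"], rating_table)
print(chosen)
```

Because the selection works from the rating table alone, it never consults the query result, so it can return items even when the query result is empty or incomplete.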
In step 907, the facility selects for prominent display items having the top three combined scores. In additional embodiments, the facility selects a small number of items having the top combined scores that is other than three. In the example discussed above, the facility would select for prominent display the items having item identifiers "1883823064", "0814403484", and "9676530409". Because the facility in step 907 selects items without regard for their presence in the query result, the facility may select items that are not in the query result. This aspect of this embodiment is particularly advantageous in situations in which a complete query result is not available when the facility is invoked. Such is the case, for instance, where the query server only provides a portion of the items satisfying the query at a time. This aspect of the invention is further advantageous in that, by selecting items without regard for their presence in the query result, the facility is able to select and display to the user items relating to the query even where the query result is empty, i.e., when no items completely satisfy the query. After step 907, these steps conclude.
While the present invention has been shown and described with reference to preferred embodiments, it will be understood by those skilled in the art that various changes or modifications in form and detail may be made without departing from the scope of the invention. For example, the facility may be used to rank query results of all types. The facility may use various formulae to determine, in the case of each item selection, the amount by which to augment rating scores with respect to the selection. Further, the facility may employ various formulae to combine rating scores into a ranking value for an item. The facility may also use a variety of different kinds of selection actions to augment the rating table, and may augment the rating table for more than one kind of selection action at a time. Additionally, the facility may augment the rating table to reflect selections by users other than human users, such as software agents or other types of artificial users.

Claims

We claim:
1. A method in a computer system for ranking items in a search result, the method comprising the steps of: for each of a multiplicity of search terms, compiling data indicating the extent to which users have selected each of a multiplicity of items when returned in search results produced from queries containing the search term; receiving a query and a search result, the received query containing a term among the multiplicity of terms, the received search result identifying a plurality of items among the multiplicity of items that satisfy the received query; and using the compiled data to rank at least a portion of the items identified in the received search result in accordance with the extent to which users have selected each of the plurality of items identified in the received search result when returned in search results produced from queries containing the search term contained in the received query.
2. The method of claim 1 wherein at least a portion of the users are identified with one of a plurality of demographic groups, and wherein the compiling step compiles, for each demographic group of the plurality of demographic groups, data indicating the extent to which users identified with the demographic group have selected each of a multiplicity of items when returned in search results produced from queries containing the search term, and wherein the received query is submitted on behalf of a distinguished user identified with a distinguished demographic group, and wherein the ranking step uses the compiled data to rank the items identified in the received search result in accordance with the extent to which users identified with the distinguished demographic group have selected each of the plurality of items identified in the received search result when returned in search results produced from queries containing the search term contained in the received query.
3. The method of claim 1, further comprising the step of imposing on the items identified in the distinguished search result an order in which the ranking values of the items monotonically decreases.
4. The method of claim 1, further comprising the step of using the ranking values of the items identified in the distinguished search result to create a proper subset of the items.
5. The method of claim 4 wherein the creating step creates a subset of the items identified in the distinguished search result that contains all of the items whose ranking values exceed a predetermined minimum ranking value.
6. The method of claim 4 wherein the creating step creates a subset of the items identified in the distinguished search result that contains all of the items whose ranking values exceed a predetermined minimum ranking value.
7. The method of claim 1 wherein the increasing step increases rating values for selections made to display additional information about items.
8. The method of claim 1 wherein the increasing step increases rating values for selections made to purchase items.
9. The method of claim 1 wherein the increasing step increases rating values for selections made to add items to a tentative list of purchases.
10. The method of claim 1 wherein the increasing step increases rating values for selections of portions of detailed information displayed about items.
11. The method of claim 1 wherein the increasing step increases rating values for units of time for which the user displays detailed information about items.
12. A computer-readable medium whose contents cause a computer system to rank items in a search result by performing the steps of: receiving a query specifying one or more terms; generating a query result identifying a plurality of items satisfying the query; and for each item identified in the query result, combining the relative frequencies with which users selected the item in earlier queries specifying each of the terms of the query to produce a ranking value for the item.
13. The computer-readable medium of claim 12 wherein the contents of the computer-readable medium further cause the computer system to perform the step of adjusting the ranking value produced for each item identified in the query result to reflect the number of terms specified by the query that are matched by the item.
14. A computer system for ranking items in a search result, comprising: a query memory that stores information about previously submitted queries and items selected from the query results of previously submitted queries; a query receiver that receives queries each specifying one or more terms; a query server that generates a query result for each query received by the query receiver that identifies a plurality of items satisfying the query; and an item ranking subsystem that, for each query result generated by the query server, for at least a portion of the items identified in the query result, combines from the contents of the query memory the relative frequencies with which users selected the item in earlier queries specifying each of the terms of the query to produce a ranking value for the item.
15. A computer memory containing a user behavior data structure usable to rank the relevance of items in a query result, the data structure comprising a plurality of rating scores, each rating score corresponding both to a query term and to an item, and reflecting quantitatively the extent to which users have selected the item from query results generated from queries specifying the query term, such that the data structure may be used to rank items in a distinguished query result produced for a distinguished query by, for each item in the distinguished query result, retrieving from the data structure the rating scores corresponding to the item and any term specified in the distinguished query and combining the retrieved rating scores to generate a ranking value for the item.
PCT/US1998/026985 1998-03-03 1998-12-18 Identifying the items most relevant to a current query based on items selected in connection with similar queries WO1999045487A1 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
EP98964094A EP1060449B1 (en) 1998-03-03 1998-12-18 Identifying the items most relevant to a current query based on items selected in connection with similar queries
AT98964094T ATE243869T1 (en) 1998-03-03 1998-12-18 IDENTIFICATION OF THE MOST RELEVANT ANSWERS TO A CURRENT SEARCH QUERY BASED ON ANSWERS ALREADY SELECTED FOR SIMILAR QUERIES
NZ506229A NZ506229A (en) 1998-03-03 1998-12-18 Identifying the items most relevant to a current query based on items selected in connection with similar queries
CA002320293A CA2320293C (en) 1998-03-03 1998-12-18 Identifying the items most relevant to a current query based on items selected in connection with similar queries
JP2000534960A JP4792551B2 (en) 1998-03-03 1998-12-18 Method and system for ranking items in current search results
AU19290/99A AU757550B2 (en) 1998-03-03 1998-12-18 Identifying the items most relevant to a current query based on items selected in connection with similar queries
DE69815898T DE69815898T2 (en) 1998-03-03 1998-12-18 IDENTIFYING THE MOST RELEVANT ANSWERS TO A CURRENT SEARCH REQUEST BASED ON ANSWERS ALREADY SELECTED FOR SIMILAR INQUIRIES

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US3382498A 1998-03-03 1998-03-03
US09/033,824 1998-03-03
US09/041,081 US6185558B1 (en) 1998-03-03 1998-03-10 Identifying the items most relevant to a current query based on items selected in connection with similar queries
US09/041,081 1998-03-10

Publications (1)

Publication Number Publication Date
WO1999045487A1 true WO1999045487A1 (en) 1999-09-10

Family

ID=26710170

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/026985 WO1999045487A1 (en) 1998-03-03 1998-12-18 Identifying the items most relevant to a current query based on items selected in connection with similar queries

Country Status (8)

Country Link
US (4) US7620572B2 (en)
EP (1) EP1060449B1 (en)
AT (1) ATE243869T1 (en)
AU (1) AU757550B2 (en)
CA (1) CA2320293C (en)
DE (1) DE69815898T2 (en)
NZ (1) NZ506229A (en)
WO (1) WO1999045487A1 (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1098258A1 (en) * 1999-11-03 2001-05-09 BRITISH TELECOMMUNICATIONS public limited company Information access
WO2001080079A2 (en) 2000-04-18 2001-10-25 Amazon.Com, Inc. Search query autocompletion
US6466918B1 (en) 1999-11-18 2002-10-15 Amazon. Com, Inc. System and method for exposing popular nodes within a browse tree
US6489968B1 (en) 1999-11-18 2002-12-03 Amazon.Com, Inc. System and method for exposing popular categories of browse tree
WO2003038680A2 (en) 2001-10-31 2003-05-08 Hewlett-Packard Company Method and system for accessing a collection of images in a database
JP2003178086A (en) * 2001-12-11 2003-06-27 Ntt Data Corp Information providing system and method based on request data
US6772150B1 (en) 1999-12-10 2004-08-03 Amazon.Com, Inc. Search query refinement using related search phrases
WO2005076147A1 (en) * 2004-02-10 2005-08-18 Ian Andrew Maxwell A content distribution system
US6963867B2 (en) 1999-12-08 2005-11-08 A9.Com, Inc. Search query processing to provide category-ranked presentation of search results
US7013263B1 (en) 2001-10-25 2006-03-14 Mindfabric, Inc. Online interaction processing
EP1735725A2 (en) * 2004-03-31 2006-12-27 Google, Inc. Query rewriting with entity detection
GB2428836A (en) * 2005-07-27 2007-02-07 Jobserve Ltd Improved Searching Method and System
EP1880323A2 (en) * 2005-05-11 2008-01-23 W.W. Grainger, Inc. System and method for providing a response to a search query
WO2007101194A3 (en) * 2006-02-28 2008-03-13 Yahoo Inc System and method for identifying related queries for languages with multiple writing systems
US7395259B2 (en) 1999-12-08 2008-07-01 A9.Com, Inc. Search engine system and associated content analysis methods for locating web pages with product offerings
JP2008276764A (en) * 2000-04-06 2008-11-13 Apple Inc System and method for providing custom store
US7464086B2 (en) 2000-08-01 2008-12-09 Yahoo! Inc. Metatag-based datamining
US7702541B2 (en) 2000-08-01 2010-04-20 Yahoo! Inc. Targeted e-commerce system
US7827055B1 (en) 2001-06-07 2010-11-02 Amazon.Com, Inc. Identifying and providing targeted content to users having common interests
US7996398B2 (en) 1998-07-15 2011-08-09 A9.Com, Inc. Identifying related search terms based on search behaviors of users
US8171034B2 (en) 2002-09-24 2012-05-01 Google, Inc. Methods and apparatus for serving relevant advertisements
US10824940B1 (en) 2016-11-30 2020-11-03 Amazon Technologies, Inc. Temporal ensemble of machine learning models trained during different time intervals
US10866976B1 (en) 2018-03-20 2020-12-15 Amazon Technologies, Inc. Categorical exploration facilitation responsive to broad search queries

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7403939B1 (en) 2003-05-30 2008-07-22 Aol Llc Resolving queries based on automatic determination of requestor geographic location
US7756750B2 (en) 2003-09-02 2010-07-13 Vinimaya, Inc. Method and system for providing online procurement between a buyer and suppliers over a network
US7562069B1 (en) * 2004-07-01 2009-07-14 Aol Llc Query disambiguation
US7349896B2 (en) 2004-12-29 2008-03-25 Aol Llc Query routing
US7860886B2 (en) * 2006-09-29 2010-12-28 A9.Com, Inc. Strategy for providing query results based on analysis of user intent
US8635203B2 (en) * 2006-11-16 2014-01-21 Yahoo! Inc. Systems and methods using query patterns to disambiguate query intent
US8010520B2 (en) * 2008-01-25 2011-08-30 International Business Machines Corporation Viewing time of search result content for relevancy
US20090254470A1 (en) * 2008-04-02 2009-10-08 Ebay Inc. Method and system for sharing searches
US8244517B2 (en) 2008-11-07 2012-08-14 Yahoo! Inc. Enhanced matching through explore/exploit schemes
US9607324B1 (en) 2009-01-23 2017-03-28 Zakta, LLC Topical trust network
US10191982B1 (en) 2009-01-23 2019-01-29 Zakata, LLC Topical search portal
US10007729B1 (en) 2009-01-23 2018-06-26 Zakta, LLC Collaboratively finding, organizing and/or accessing information
US8301624B2 (en) 2009-03-31 2012-10-30 Yahoo! Inc. Determining user preference of items based on user ratings and user features
CN101887437B (en) 2009-05-12 2016-03-30 阿里巴巴集团控股有限公司 A kind of Search Results generation method and information search system
US8612435B2 (en) * 2009-07-16 2013-12-17 Yahoo! Inc. Activity based users' interests modeling for determining content relevance
US20110197143A1 (en) * 2010-02-05 2011-08-11 David Baszucki Virtual World Location Display Sorting
US8600979B2 (en) 2010-06-28 2013-12-03 Yahoo! Inc. Infinite browse
US8515984B2 (en) 2010-11-16 2013-08-20 Microsoft Corporation Extensible search term suggestion engine
US10346479B2 (en) 2010-11-16 2019-07-09 Microsoft Technology Licensing, Llc Facilitating interaction with system level search user interface
US10073927B2 (en) 2010-11-16 2018-09-11 Microsoft Technology Licensing, Llc Registration for system level search user interface
US20120124072A1 (en) * 2010-11-16 2012-05-17 Microsoft Corporation System level search user interface
US10068266B2 (en) 2010-12-02 2018-09-04 Vinimaya Inc. Methods and systems to maintain, check, report, and audit contract and historical pricing in electronic procurement
US20120143725A1 (en) * 2010-12-02 2012-06-07 Vinimaya Inc. Methods and systems for influencing search and shopping decisions in electronic procurement
US9262513B2 (en) * 2011-06-24 2016-02-16 Alibaba Group Holding Limited Search method and apparatus
US8914390B2 (en) * 2011-07-12 2014-12-16 Facebook, Inc. Repetitive query recognition and processing
US8903951B2 (en) 2011-07-12 2014-12-02 Facebook, Inc. Speculative database authentication
US9619508B2 (en) 2011-07-12 2017-04-11 Facebook, Inc. Speculative begin transaction
US8756217B2 (en) 2011-07-12 2014-06-17 Facebook, Inc. Speculative switch database
US9916301B2 (en) * 2012-12-21 2018-03-13 Microsoft Technology Licensing, Llc Named entity variations for multimodal understanding systems
US9476289B2 (en) 2013-09-12 2016-10-25 G&H Diversified Manufacturing Lp In-line adapter for a perforating gun
CN106716465A (en) * 2014-06-23 2017-05-24 利斯托株式会社 Methods and systems for generating and utilizing crowd-sourced product catalogs
US10409824B2 (en) * 2016-06-29 2019-09-10 International Business Machines Corporation System, method and recording medium for cognitive proximates
US10643178B1 (en) 2017-06-16 2020-05-05 Coupa Software Incorporated Asynchronous real-time procurement system
CN108170743B (en) * 2017-12-19 2019-06-25 北京辰森世纪科技股份有限公司 A kind of search method and device for chain shops's menu
WO2019238526A1 (en) 2018-06-15 2019-12-19 Koninklijke Philips N.V. Synchronized tracking of multiple interventional medical devices
US11204972B2 (en) 2018-06-25 2021-12-21 Ebay Inc. Comprehensive search engine scoring and modeling of user relevance
US10523742B1 (en) * 2018-07-16 2019-12-31 Brandfolder, Inc. Intelligent content delivery networks

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4996642A (en) * 1987-10-01 1991-02-26 Neonics, Inc. System and method for recommending items
US5446891A (en) * 1992-02-26 1995-08-29 International Business Machines Corporation System for adjusting hypertext links with weighed user goals and activities
WO1995029451A1 (en) * 1994-04-25 1995-11-02 Apple Computer, Inc. System for ranking the relevance of information objects accessed by computer users
EP0751471A1 (en) * 1995-06-30 1997-01-02 Massachusetts Institute Of Technology Method and apparatus for item recommendation using automated collaborative filtering

Family Cites Families (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04221489A (en) 1990-12-21 1992-08-11 Yamaha Corp Self-learning type selection auxiliary device
JP3087517B2 (en) 1993-05-24 2000-09-11 日産自動車株式会社 Instructions for creating fillet surface
US5583763A (en) 1993-09-09 1996-12-10 Mni Interactive Method and apparatus for recommending selections based on preferences in a multi-user system
AU1333895A (en) * 1993-11-30 1995-06-19 Raymond R. Burke Computer system for allowing a consumer to purchase packaged goods at home
US5758257A (en) 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US6460036B1 (en) * 1994-11-29 2002-10-01 Pinpoint Incorporated System and method for providing customized electronic newspapers and target advertisements
US5754237A (en) 1995-03-20 1998-05-19 Daewoo Electronics Co., Ltd. Method for determining motion vectors using a hierarchical motion estimation
US7937312B1 (en) * 1995-04-26 2011-05-03 Ebay Inc. Facilitating electronic commerce transactions through binding offers
US5748954A (en) 1995-06-05 1998-05-05 Carnegie Mellon University Method for searching a queued and ranked constructed catalog of files stored on a network
US5640553A (en) 1995-09-15 1997-06-17 Infonautics Corporation Relevance normalization for documents retrieved from an information retrieval system in response to a query
US5659742A (en) 1995-09-15 1997-08-19 Infonautics Corporation Method for storing multi-media information in an information retrieval system
US5822731A (en) 1995-09-15 1998-10-13 Infonautics Corporation Adjusting a hidden Markov model tagger for sentence fragments
US5873076A (en) 1995-09-15 1999-02-16 Infonautics Corporation Architecture for processing search queries, retrieving documents identified thereby, and method for using same
US5742816A (en) 1995-09-15 1998-04-21 Infonautics Corporation Method and apparatus for identifying textual documents and multi-mediafiles corresponding to a search topic
US5675788A (en) 1995-09-15 1997-10-07 Infonautics Corp. Method and apparatus for generating a composite document on a selected topic from a plurality of information sources
US5717914A (en) 1995-09-15 1998-02-10 Infonautics Corporation Method for categorizing documents into subjects using relevance normalization for documents retrieved from an information retrieval system in response to a query
US5877485A (en) 1996-01-25 1999-03-02 Symbol Technologies, Inc. Statistical sampling security methodology for self-scanning checkout system
US5875443A (en) 1996-01-30 1999-02-23 Sun Microsystems, Inc. Internet-based spelling checker dictionary system with automatic updating
JP2800769B2 (en) 1996-03-29 1998-09-21 日本電気株式会社 Information filtering method
US5971589A (en) 1996-05-06 1999-10-26 Amadasoft America, Inc. Apparatus and method for managing and distributing design and manufacturing information throughout a sheet metal production facility
US5826261A (en) 1996-05-10 1998-10-20 Spencer; Graham System and method for querying multiple, distributed databases by selective sharing of local relative significance information for terms related to the query
US5920859A (en) 1997-02-05 1999-07-06 Idd Enterprises, L.P. Hypertext document retrieval system and method
US6006222A (en) * 1997-04-25 1999-12-21 Culliss; Gary Method for organizing information
US6014665A (en) 1997-08-01 2000-01-11 Culliss; Gary Method for organizing information
US6421653B1 (en) * 1997-10-14 2002-07-16 Blackbird Holdings, Inc. Systems, methods and computer program products for electronic trading of financial instruments
EP1062602B8 (en) 1998-02-13 2018-06-13 Oath Inc. Search engine using sales and revenue to weight search results
US7050992B1 (en) * 1998-03-03 2006-05-23 Amazon.Com, Inc. Identifying items relevant to a current query based on items accessed in connection with similar queries
US6185558B1 (en) 1998-03-03 2001-02-06 Amazon.Com, Inc. Identifying the items most relevant to a current query based on items selected in connection with similar queries
US7124129B2 (en) 1998-03-03 2006-10-17 A9.Com, Inc. Identifying the items most relevant to a current query based on items selected in connection with similar queries
US6421675B1 (en) 1998-03-16 2002-07-16 S. L. I. Systems, Inc. Search engine
US6006225A (en) 1998-06-15 1999-12-21 Amazon.Com Refining search queries by the suggestion of correlated terms from prior searches
US6493702B1 (en) * 1999-05-05 2002-12-10 Xerox Corporation System and method for searching and recommending documents in a collection using share bookmarks
US6269361B1 (en) 1999-05-28 2001-07-31 Goto.Com System and method for influencing a position on a search result list generated by a computer network search engine
US7062488B1 (en) * 2000-08-30 2006-06-13 Richard Reisman Task/domain segmentation in applying feedback to command control
US8001118B2 (en) 2001-03-02 2011-08-16 Google Inc. Methods and apparatus for employing usage statistics in document retrieval
GB0307148D0 (en) 2003-03-27 2003-04-30 British Telecomm Data retrieval system
US7734632B2 (en) 2005-10-28 2010-06-08 Disney Enterprises, Inc. System and method for targeted ad delivery

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4996642A (en) * 1987-10-01 1991-02-26 Neonics, Inc. System and method for recommending items
US5446891A (en) * 1992-02-26 1995-08-29 International Business Machines Corporation System for adjusting hypertext links with weighed user goals and activities
WO1995029451A1 (en) * 1994-04-25 1995-11-02 Apple Computer, Inc. System for ranking the relevance of information objects accessed by computer users
EP0751471A1 (en) * 1995-06-30 1997-01-02 Massachusetts Institute Of Technology Method and apparatus for item recommendation using automated collaborative filtering

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7996398B2 (en) 1998-07-15 2011-08-09 A9.Com, Inc. Identifying related search terms based on search behaviors of users
EP1098258A1 (en) * 1999-11-03 2001-05-09 BRITISH TELECOMMUNICATIONS public limited company Information access
WO2001033417A1 (en) * 1999-11-03 2001-05-10 British Telecommunications Public Limited Company Information access
US6606619B2 (en) 1999-11-18 2003-08-12 Amazon.Com, Inc. Computer processes for selecting nodes to call to attention of a user during browsing of a hierarchical browse structure
US6489968B1 (en) 1999-11-18 2002-12-03 Amazon.Com, Inc. System and method for exposing popular categories of browse tree
US6466918B1 (en) 1999-11-18 2002-10-15 Amazon.Com, Inc. System and method for exposing popular nodes within a browse tree
US6963867B2 (en) 1999-12-08 2005-11-08 A9.Com, Inc. Search query processing to provide category-ranked presentation of search results
US7430561B2 (en) 1999-12-08 2008-09-30 A9.Com, Inc. Search engine system for locating web pages with product offerings
US7395259B2 (en) 1999-12-08 2008-07-01 A9.Com, Inc. Search engine system and associated content analysis methods for locating web pages with product offerings
US7617209B2 (en) 1999-12-10 2009-11-10 A9.Com, Inc. Selection of search phrases to suggest to users in view of actions performed by prior users
US6772150B1 (en) 1999-12-10 2004-08-03 Amazon.Com, Inc. Search query refinement using related search phrases
US7424486B2 (en) 1999-12-10 2008-09-09 A9.Com, Inc. Selection of search phrases to suggest to users in view of actions performed by prior users
US8249939B2 (en) 2000-04-06 2012-08-21 Apple Inc. Custom stores
JP4594410B2 (en) * 2000-04-06 2010-12-08 アップル インコーポレイテッド System and method for implementing a custom store
JP2008276764A (en) * 2000-04-06 2008-11-13 Apple Inc System and method for providing custom store
US7526437B1 (en) 2000-04-06 2009-04-28 Apple Inc. Custom stores
WO2001080079A2 (en) 2000-04-18 2001-10-25 Amazon.Com, Inc. Search query autocompletion
US6564213B1 (en) 2000-04-18 2003-05-13 Amazon.Com, Inc. Search query autocompletion
US7702541B2 (en) 2000-08-01 2010-04-20 Yahoo! Inc. Targeted e-commerce system
US7464086B2 (en) 2000-08-01 2008-12-09 Yahoo! Inc. Metatag-based datamining
US7827055B1 (en) 2001-06-07 2010-11-02 Amazon.Com, Inc. Identifying and providing targeted content to users having common interests
US8285589B2 (en) 2001-06-07 2012-10-09 Amazon.Com, Inc. Referring-site based recommendations
USRE43031E1 (en) 2001-10-25 2011-12-13 MD Fab Capital L.L.C. Online interaction processing
US7013263B1 (en) 2001-10-25 2006-03-14 Mindfabric, Inc. Online interaction processing
US7130864B2 (en) 2001-10-31 2006-10-31 Hewlett-Packard Development Company, L.P. Method and system for accessing a collection of images in a database
WO2003038680A2 (en) 2001-10-31 2003-05-08 Hewlett-Packard Company Method and system for accessing a collection of images in a database
WO2003038680A3 (en) * 2001-10-31 2004-01-22 Hewlett Packard Co Method and system for accessing a collection of images in a database
JP2003178086A (en) * 2001-12-11 2003-06-27 Ntt Data Corp Information providing system and method based on request data
US9799052B2 (en) 2002-09-24 2017-10-24 Google Inc. Methods and apparatus for serving relevant advertisements
US10198746B2 (en) 2002-09-24 2019-02-05 Google Llc Methods and apparatus for serving relevant advertisements
US10991005B2 (en) 2002-09-24 2021-04-27 Google Llc Methods and apparatus for serving relevant advertisements
US8171034B2 (en) 2002-09-24 2012-05-01 Google, Inc. Methods and apparatus for serving relevant advertisements
US7930347B2 (en) 2004-02-10 2011-04-19 Enikos Pty. Limited Responsible peer-to-peer (P2P) digital content distribution
WO2005076147A1 (en) * 2004-02-10 2005-08-18 Ian Andrew Maxwell A content distribution system
EP1735725A2 (en) * 2004-03-31 2006-12-27 Google, Inc. Query rewriting with entity detection
EP1880323A2 (en) * 2005-05-11 2008-01-23 W.W. Grainger, Inc. System and method for providing a response to a search query
EP1880323A4 (en) * 2005-05-11 2010-09-29 Ww Grainger Inc System and method for providing a response to a search query
US8364661B2 (en) 2005-05-11 2013-01-29 W.W. Grainger, Inc. System and method for providing a response to a search query
US8051067B2 (en) 2005-05-11 2011-11-01 W.W. Grainger, Inc. System and method for providing a response to a search query
GB2428836A (en) * 2005-07-27 2007-02-07 Jobserve Ltd Improved Searching Method and System
CN102750323A (en) * 2006-02-28 2012-10-24 雅虎公司 System and method for identifying related queries for languages with multiple writing systems
CN102750323B (en) * 2006-02-28 2016-05-11 飞扬管理有限公司 System and method for identifying related queries for languages with multiple writing systems
WO2007101194A3 (en) * 2006-02-28 2008-03-13 Yahoo Inc System and method for identifying related queries for languages with multiple writing systems
CN101390097B (en) * 2006-02-28 2012-07-04 雅虎公司 System and method for identifying related queries for languages with multiple writing systems
US10824940B1 (en) 2016-11-30 2020-11-03 Amazon Technologies, Inc. Temporal ensemble of machine learning models trained during different time intervals
US10866976B1 (en) 2018-03-20 2020-12-15 Amazon Technologies, Inc. Categorical exploration facilitation responsive to broad search queries
US11899700B1 (en) 2018-03-20 2024-02-13 Amazon Technologies, Inc. Categorical exploration facilitation responsive to broad search queries

Also Published As

Publication number Publication date
US7974885B1 (en) 2011-07-05
EP1060449A1 (en) 2000-12-20
AU1929099A (en) 1999-09-20
CA2320293A1 (en) 1999-09-10
EP1060449B1 (en) 2003-06-25
US20060053065A1 (en) 2006-03-09
AU757550B2 (en) 2003-02-27
CA2320293C (en) 2004-08-03
US20140222803A1 (en) 2014-08-07
DE69815898T2 (en) 2003-12-18
US7620572B2 (en) 2009-11-17
NZ506229A (en) 2003-02-28
DE69815898D1 (en) 2003-07-31
US8694385B1 (en) 2014-04-08
ATE243869T1 (en) 2003-07-15

Similar Documents

Publication Publication Date Title
CA2320293C (en) Identifying the items most relevant to a current query based on items selected in connection with similar queries
US6185558B1 (en) Identifying the items most relevant to a current query based on items selected in connection with similar queries
US7921119B2 (en) Identifying the items most relevant to a current query based on items selected in connection with similar queries
US7050992B1 (en) Identifying items relevant to a current query based on items accessed in connection with similar queries
US7574426B1 (en) Efficiently identifying the items most relevant to a current query based on items selected in connection with similar queries
US6169986B1 (en) System and method for refining search queries
KR100719009B1 (en) Apparatus for identifying related searches in a database search system
US6772150B1 (en) Search query refinement using related search phrases
US6853993B2 (en) System and methods for predicting correct spellings of terms in multiple-term search queries
EP1062602B1 (en) Search engine using sales and revenue to weight search results
US6850954B2 (en) Information retrieval support method and information retrieval support system
JP2950222B2 (en) Information retrieval method
US8103659B1 (en) Perspective-based item navigation
MXPA00008603A (en) Identifying the items most relevant to a current query based on items selected in connection with similar queries

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref document number: 2320293

Country of ref document: CA

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 506229

Country of ref document: NZ

Ref document number: 19290/99

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: IN/PCT/2000/00299/MU

Country of ref document: IN

WWE Wipo information: entry into national phase

Ref document number: 1998964094

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: PA/a/2000/008603

Country of ref document: MX

NENP Non-entry into the national phase

Ref country code: KR

ENP Entry into the national phase

Ref document number: 2000 534960

Country of ref document: JP

Kind code of ref document: A

WWP Wipo information: published in national office

Ref document number: 1998964094

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWG Wipo information: grant in national office

Ref document number: 19290/99

Country of ref document: AU

WWG Wipo information: grant in national office

Ref document number: 1998964094

Country of ref document: EP