US20080222145A1 - Visual method and apparatus for enhancing search result navigation - Google Patents

Info

Publication number
US20080222145A1
Authority
US
United States
Prior art keywords
search result
visual
cluster
search
ranked list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/061,720
Inventor
Shixia Liu
Zhong Su
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US12/061,720
Publication of US20080222145A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/9038: Presentation of query results
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99931: Database or file accessing
    • Y10S707/99933: Query processing, i.e. searching
    • Y10S707/99934: Query formulation, input preparation, or translation
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99931: Database or file accessing
    • Y10S707/99933: Query processing, i.e. searching
    • Y10S707/99935: Query augmenting and refining, e.g. inexact access

Definitions

  • the present invention relates to computer information processing technology, specifically, to a visual method and apparatus for enhancing the navigation of search results returned by a search engine.
  • a web user uses a search engine to find required information.
  • the process in which a web user uses a search engine to find required information is as follows: the web user submits a query, which may be, for instance, a single keyword or a combination of keywords, to the search engine. Then, the search engine produces a ranked list of the search results based on the submitted query. The ranked list is returned and displayed on the browser used by the web user. The web user obtains the part of the search results of interest to him through viewing segments of the returned ranked list of the search results.
  • Vivisimo Company proposed a solution in which the search results returned by a search engine are clustered and the clustering results are visually displayed together with the ranked list of the search results.
  • Although this solution may provide a convenient way for web users to see the clustering of the search results, it displays the clustering results and the ranked list of search results simultaneously but independently, without presenting their correlations clearly and visually to the web users.
  • Moreover, the solution clusters and displays only part of the search results (for instance, the first 210 search results). If the web user selects a clustering item in the clustering results, the search results contained in that clustering item are displayed, but no further relevant search results can be produced, so the web user cannot obtain more information of interest.
  • The present invention is proposed to address the above technical problems in the prior art.
  • the objective of the present invention is to provide a visual method and apparatus for enhancing search result navigation, whereby the traditional ranked list of the search results and the visual cluster hierarchy of the search results may be displayed in a joint manner and more search results may be obtained dynamically, so as to help web users to get required information rapidly and accurately.
  • a visual method for enhancing search result navigation comprising the following steps:
  • the first search result contains a predetermined number of search result entries in the search results produced by the search engine based on a query.
  • displaying the visual cluster hierarchy and the ranked list of the first search result in a joint manner comprises any one of the following:
  • the method according to the present invention further comprises the following steps:
  • the step of generating new query keywords comprises: combining the current query keywords with the name of the selected cluster to generate new query keywords.
  • a visual apparatus for enhancing search result navigation comprising:
  • the visual apparatus for enhancing search result navigation further comprises:
  • the keyword generator comprises:
  • a browser that comprises the visual apparatus for enhancing search result navigation.
  • a search engine that comprises the visual apparatus for enhancing search result navigation.
  • a program product comprises: program codes for implementing the method; and carrying media for carrying the program codes.
  • FIG. 1 is a flowchart of a visual method for enhancing search result navigation according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the display of a browser after using the visual method for enhancing search result navigation of the embodiment shown in FIG. 1;
  • FIG. 3 is a flowchart of a visual method for enhancing search result navigation according to another embodiment of the present invention;
  • FIG. 4 is a flowchart of an example of generating new query keywords in the embodiment shown in FIG. 3;
  • FIG. 5 is a block diagram of a visual apparatus for enhancing search result navigation according to an embodiment of the present invention;
  • FIG. 6 is a block diagram of the keyword generator in the embodiment shown in FIG. 5.
  • FIG. 1 is a flowchart of a visual method for enhancing search result navigation according to an embodiment of the present invention.
  • a first search result is obtained from a search engine.
  • After the search engine receives a query submitted by a web user, a search result may be generated based on the query.
  • the search result includes a plurality of documents, each of which constitutes a search result entry.
  • a web user submits a query through a browser, which may be, for instance, an IE browser from Microsoft Company, a Netscape browser from Netscape Company or the like, and the search engine may be a known search engine such as Google or Yahoo!.
  • a query is usually in the form of a single keyword or a combination of keywords, and conforms to the format as defined by the engine being used.
  • the first search result is clustered so as to get the clustering information of the search result.
  • the clustering operation is performed on the first search result by using a clustering algorithm based on the similarities between the segments of the documents in the search result. In this way, the documents related to a subject may be collected into a cluster.
  • the adopted clustering algorithm should not introduce a substantial delay.
  • the clustering algorithm takes document snippets as input and the generated clustering information has a readable description content that is convenient for a web user to browse quickly.
  • The Suffix Tree Clustering (STC) algorithm is a fast, incremental and linear-time clustering algorithm for clustering web search results. Its basic idea is to identify phrases that are common to a set of documents obtained as the search results.
  • A base cluster is defined to be a set of documents that share a common phrase.
  • First, each document in the search result is preprocessed: the string of text representing each document is transformed using a stemming algorithm, sentence boundaries are marked, and non-word tokens, such as numbers, HTML tags and most punctuation, are stripped.
  • Second, base clusters are identified using a suffix tree; this step can be viewed as creating an inverted index of phrases for the set of documents.
  • Finally, the identified base clusters are merged into clusters, and the common phrases may be used as the names of the clusters.
  • STC algorithm is only taken as an example of clustering algorithms. Those skilled in the art may use any other suitable clustering algorithm to cluster the search results.
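  • The three phases described above (preprocessing, base cluster identification, merging) can be sketched as follows. This is a simplified illustration under stated assumptions, not the STC implementation: a plain dictionary of word n-grams stands in for the suffix tree, lowercasing stands in for stemming, the pairwise merge is quadratic rather than linear-time, and the 0.5 overlap threshold and all function names are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def tokenize(snippet):
    """Lowercase the snippet and keep alphabetic tokens only.
    (Assumption: a fuller version would also stem and mark sentence boundaries.)"""
    return re.findall(r"[a-z]+", snippet.lower())


def base_clusters(snippets, max_len=3, min_docs=2):
    """Map each phrase (word n-gram) to the set of documents that contain it.
    A dict of n-grams replaces the suffix tree in this sketch."""
    index = defaultdict(set)
    for doc_id, snippet in enumerate(snippets):
        words = tokenize(snippet)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                index[" ".join(words[i:i + n])].add(doc_id)
    # A base cluster needs a phrase shared by at least `min_docs` documents.
    return {p: docs for p, docs in index.items() if len(docs) >= min_docs}


def merge_clusters(base, overlap=0.5):
    """Merge base clusters whose document sets overlap strongly (union-find)."""
    phrases = list(base)
    parent = {p: p for p in phrases}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(phrases, 2):
        common = len(base[a] & base[b])
        if common / len(base[a]) > overlap and common / len(base[b]) > overlap:
            parent[find(a)] = find(b)

    merged = defaultdict(lambda: {"phrases": [], "docs": set()})
    for p in phrases:
        root = find(p)
        merged[root]["phrases"].append(p)
        merged[root]["docs"] |= base[p]
    # Name each merged cluster after its longest shared phrase.
    return {max(c["phrases"], key=len): c["docs"] for c in merged.values()}
```

  • Because the sketch does no stopword filtering, common words such as "the" would form base clusters on real snippets; a practical version would filter or down-weight them.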
  • The first search result contains only a predetermined number of the search result entries produced by the search engine. For instance, in the example shown in FIG. 2, the first search result contains the first 206 documents in the ranked list of the search result.
  • the number of documents in the first search result may be set by the web user through the user interface of the browser, and may affect the time for performing the clustering operation.
  • The correlations between the clustering information and the ranked list of the first search result include: which search result entries of the first search result each clustering information item contains, the number of search result entries contained in each clustering information item, which clustering information items contain each search result entry, which clustering information item contains the most search result entries, which clustering information item contains the most search result entries of the first page, and so on.
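  • The correlation items listed above can be computed in a single pass over the clusters. The following is a minimal sketch, assuming clusters are given as a mapping from cluster name to a set of entry identifiers and that the first page holds the top `first_page_size` ranked entries; the function name, the data layout and the default page size of 10 are assumptions made for illustration.

```python
from collections import defaultdict


def correlate(clusters, ranked_ids, first_page_size=10):
    """Compute correlations between clusters and a ranked list.

    clusters:   dict mapping cluster name -> set of entry ids (assumed non-empty)
    ranked_ids: entry ids in rank order
    """
    first_page = set(ranked_ids[:first_page_size])

    # Which clusters contain each entry.
    entry_to_clusters = defaultdict(list)
    for name, entries in clusters.items():
        for entry in entries:
            entry_to_clusters[entry].append(name)

    # Number of entries per cluster, overall and on the first page.
    sizes = {name: len(entries) for name, entries in clusters.items()}
    on_first_page = {name: len(entries & first_page) for name, entries in clusters.items()}

    return {
        "entry_to_clusters": dict(entry_to_clusters),
        "sizes": sizes,
        "largest": max(sizes, key=sizes.get),
        "largest_on_first_page": max(on_first_page, key=on_first_page.get),
    }
```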
  • In Step 115, visualization processing is performed on the obtained clustering information. This includes representing the clustering information in a form visible to the web user, preferably by using a tree visualization technique to represent a clustering tree structure, and describing the attributes of the clustering information items, such as the name of each clustering information item, the number of search result entries contained therein, and the like.
  • After visualization processing, the clustering information becomes a visual cluster hierarchy for display on the browser to the web user.
  • Although Step 110 of calculating the correlations between the clustering information and the ranked list of the first search result is performed before Step 115 of performing visualization processing on the clustering information in this embodiment, these two steps may in fact be performed in parallel, without a strict order.
  • Alternatively, Step 115 of performing visualization processing on the clustering information may be performed first, followed by Step 110 of calculating the correlations between the clustering information and the ranked list of the first search result.
  • In Step 120, the visual cluster hierarchy generated in Step 115 and the ranked list of the first search result are displayed in a joint manner, so as to help the web user locate search result entries of interest more easily and understand the clustering of the search result entries in the first search result as a whole.
  • Displaying the clustering information and the ranked list of the first search result in a joint manner in Step 120 comprises the following cases:
  • the cluster that contains the most search result entries of the first search result in the first page may be highlighted. Because usually the search result entries displayed in the first page have high correlation with the query submitted by the web user, the web user is more concerned with the clustering of the search result on this page, so highlighting such a cluster makes the web user locate the content of interest more conveniently.
  • the cluster in the visual cluster hierarchy that contains the selected search result entry and the most search result entries of the first search result in the first page can be highlighted.
  • the web user can get help to quickly know the cluster to which the selected search result entry belongs and which has the most search result entries in the first page.
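  • The two highlighting cases above (no entry selected, or one entry selected) reduce to the same selection rule applied to different candidate sets. A minimal sketch, assuming the same name-to-entry-set layout for clusters; the function name and the tie-breaking behaviour (first maximum wins) are assumptions.

```python
def cluster_to_highlight(clusters, first_page_ids, selected_entry=None):
    """Pick the cluster to highlight.

    With no selection, choose the cluster holding the most first-page entries.
    With a selection, restrict the choice to clusters containing that entry.
    Returns None if no cluster qualifies.
    """
    first_page = set(first_page_ids)
    candidates = {
        name: entries
        for name, entries in clusters.items()
        if selected_entry is None or selected_entry in entries
    }
    if not candidates:
        return None
    return max(candidates, key=lambda name: len(candidates[name] & first_page))
```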
  • Next, in conjunction with FIG. 2, which schematically shows a display of a browser after using the visual method for enhancing search result navigation of the embodiment shown in FIG. 1, a detailed description is given of an example in which the method according to this embodiment is applied in practice.
  • The example uses an IE browser, which is well known to those skilled in the art, and the query keywords submitted by the web user are “information visualization”.
  • The search engine “Google” generates a search result including at least 2,355,000 search result entries, of which the first 206 search result entries are selected as the first search result to be clustered and displayed.
  • On the left of FIG. 2, the visual cluster hierarchy is displayed, wherein the clusters are displayed in the form of nodes together with the name of each cluster and the number of search result entries contained therein; on the right of FIG. 2, the first page of the ranked list of the first 206 search result entries is displayed.
  • The cluster InfoVis is highlighted in the form of a dark node, and contains 18 search result entries.
  • The highlighting may take the form of high-brightness display, enlarged display, or a color different from that of the other clusters. From this it can be seen that, in the search result displayed in the first page, the cluster InfoVis contains the most search result entries. Through this kind of display, the web user can clearly see the clustering of the search results and the most important clustering information item in the clustering information.
  • FIG. 3 is a flowchart of a visual method for enhancing search result navigation according to another embodiment of the present invention.
  • Next, in conjunction with FIG. 3, a description of this embodiment is given; for the parts similar to those in the embodiment shown in FIG. 1, the same notations are used and their explanations are omitted as appropriate.
  • This embodiment is characterized by further searching for the search results related to the cluster selected by the web user and merging them into the cluster, and then performing clustering once more.
  • If the user selects a cluster in the visual cluster hierarchy (Step 300), then, in addition to displaying the ranked list of the search result entries in the first search result that are contained in this cluster, the following operations may be performed. In Step 301, new query keywords are generated and submitted to the search engine. To find more search results related to the cluster, the current query keywords must be further qualified to produce the new query keywords.
  • The new query keywords may be generated by combining the current query keywords with the name of the selected cluster. In the example shown in FIG. 2, if the web user selects the cluster “Software”, the new query keywords would be “information visualization+software”.
  • In Step 305, the search engine generates a new search result based on the new query keywords.
  • In Step 310, a predetermined number of search result entries, for instance, 300 search result entries, are selected from the newly generated search result to generate a second search result.
  • Alternatively, the second search result can be generated by merging these search result entries with the search result entries currently contained in the cluster, which further helps the user find the required information quickly.
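  • Steps 301 to 310 can be sketched as two small helpers: one that qualifies the query with the cluster name, and one that builds the second search result by taking the top entries and merging in the entries already in the selected cluster. This is a sketch under assumptions: entries are hashable identifiers such as URLs, the cluster name is lowercased to match the “information visualization+software” example, duplicates are dropped on merge, and both function names are hypothetical.

```python
def refine_query(current_query, cluster_name):
    """Combine the current query with the selected cluster's name using '+'."""
    name = cluster_name.strip().lower()
    return f"{current_query}+{name}" if name else current_query


def second_search_result(new_ranked, cluster_entries, limit):
    """Take the top `limit` new entries, then append the cluster's existing
    entries that are not already present, preserving order."""
    combined = list(new_ranked[:limit]) + list(cluster_entries)
    # dict.fromkeys keeps the first occurrence of each entry, in order.
    return list(dict.fromkeys(combined))
```

  • For example, `refine_query("information visualization", "Software")` yields `"information visualization+software"`.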
  • In Step 315, the second search result is clustered to obtain sub-clustering information.
  • This step applies a clustering method similar to that used in the embodiment shown in FIG. 1, and its description is omitted here.
  • In Step 320, the correlations between the sub-clustering information and the ranked list of the second search result are calculated. The content of the correlation information has been described in the above embodiment, and its description is omitted here.
  • In Step 325, visualization processing is performed on the sub-clustering information.
  • the visualization processing of the sub-clustering information is also to represent the sub-clustering information in the form of nodes, and depict the name of the sub-clustering information and the number of the search result entries contained.
  • the sub-clustering information after the visualization processing becomes the visual sub-clustering information.
  • Although Step 320 of calculating the correlations is performed before Step 325 of visualization processing in this embodiment, these two steps may in fact be performed in parallel, without a strict order.
  • the step of visualization may be performed first, then the step of calculating the correlations may be performed.
  • In Step 330, the visual sub-clustering information and the ranked list of the second search result are displayed on the browser in a joint manner. This joint display is similar to that in the embodiment shown in FIG. 1, and its description is omitted here.
  • the visual cluster hierarchy and the sub-clustering information are displayed using a tree structure, wherein the clustering information items contained in the visual cluster hierarchy are root nodes and the visual sub-clustering information items contained in the visual sub-clustering information are branch nodes.
  • Using a tree structure to display the visual cluster hierarchy and the visual sub-clustering information helps the web user clearly understand their relations, and allows the web user to drill up and down through the levels of the visual cluster hierarchy.
  • If the web user further selects a visual sub-clustering information item in the visual sub-clustering information (Step 335), Steps 301 to 330 are repeated. If the web user goes on to select a clustering information item in the next level of the visual cluster hierarchy, Steps 301 to 330 are repeated again. By repeating the cycle of “generating new query keywords, searching for a new search result, clustering”, more accurate search results can be provided to the web user.
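  • The tree display described above, with clusters as root nodes and sub-clusters attached beneath the selected cluster, can be modelled with a small recursive node type. A minimal sketch; the class name, its fields and the indented text rendering are assumptions used only to illustrate the structure.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    """One node of the visual cluster hierarchy."""
    name: str
    entry_count: int
    children: list = field(default_factory=list)

    def add_subclusters(self, subclusters):
        """Attach sub-clusters (name -> set of entries) as child nodes."""
        for name, entries in subclusters.items():
            self.children.append(ClusterNode(name, len(entries)))

    def render(self, depth=0):
        """Return the subtree as indented lines of 'name (count)'."""
        lines = ["  " * depth + f"{self.name} ({self.entry_count})"]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines
```

  • Each repetition of Steps 301 to 330 would call `add_subclusters` on the node the user selected, so the tree deepens by one level per drill-down.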
  • a method for generating new query keywords as shown in FIG. 4 may also be used.
  • Next, in conjunction with FIG. 4, a detailed description is given of the generation of new query keywords in the embodiment shown in FIG. 3.
  • In Step 401, the relevant documents are collected.
  • There are two kinds of relevant documents: the documents that have been read by the web user and the documents that belong to the selected cluster.
  • Keywords in the collected relevant documents are determined.
  • the tf-idf method is used to determine keywords.
  • Weights of all words except stopwords in each of the collected relevant documents are calculated, where “stopwords” are words with no semantic value, such as “of,” “the,” “to” and the like. Since such words appear in documents with high frequency but carry no actual semantic meaning, their weights are not calculated.
  • the formula for calculating the weight of a word with actual meaning is as follows:
  • After determining the keywords, in Step 410, these keywords are combined with the current query keywords to generate new query keywords.
  • the method for generating new query keywords as shown in FIG. 4 is only illustrative and not restrictive. Those skilled in the art can apply any other suitable method for generating keywords.
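  • The keyword generation of FIG. 4 can be sketched as follows. The weight formula itself is not reproduced in this text, so the sketch assumes a common tf-idf variant, weight = (term count / document length) × ln(N / document frequency), where N is the number of relevant documents. The stopword list is a small illustrative set, joining keywords with “+” mirrors the earlier query example, and all names are hypothetical.

```python
import math
import re
from collections import Counter

# Illustrative stopword list only; a real list would be much longer.
STOPWORDS = {"of", "the", "to", "a", "an", "and", "in", "for", "is"}


def tfidf_keywords(documents, top_k=2):
    """Return the `top_k` highest-weighted non-stopword terms."""
    tokenized = [
        [w for w in re.findall(r"[a-z]+", doc.lower()) if w not in STOPWORDS]
        for doc in documents
    ]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each word.
    df = Counter(w for words in tokenized for w in set(words))

    scores = Counter()
    for words in tokenized:
        for word, count in Counter(words).items():
            weight = (count / len(words)) * math.log(n_docs / df[word])
            scores[word] = max(scores[word], weight)  # keep best weight per word

    # Sort by descending weight, then alphabetically for stable ties.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


def new_query(current_query, documents, top_k=2):
    """Combine the current query with the top-weighted keywords."""
    return "+".join([current_query, *tfidf_keywords(documents, top_k)])
```

  • Under this variant, a word that occurs in every relevant document receives weight zero, so it is never chosen as a new keyword.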
  • FIG. 5 is a block diagram of a visual apparatus 500 for enhancing search result navigation according to an embodiment of the present invention.
  • the visual apparatus 500 for enhancing search result navigation is installed between a search engine 506 and a browser 505 as a separate apparatus.
  • The visual apparatus 500 for enhancing search result navigation comprises: a dynamic cluster constructor 501, coupled to the search engine 506, and configured to dynamically cluster the search result from the search engine 506 to generate clustering information; a correlation processor 502 configured to calculate the correlations between the clustering information and the ranked list of the search result; and a visualization engine 503, coupled to the browser 505, and configured to perform visualization processing on the clustering information to produce a visual cluster hierarchy and to display the visual cluster hierarchy and the ranked list of the search result on the browser 505 in a joint manner based on the correlations.
  • The search engine 506 may be a known search engine, such as Google, Yahoo! or the like, and the browser 505 may be, for instance, an IE browser from Microsoft Company, a Netscape browser from Netscape Company, or the like.
  • When a web user submits a query through the browser 505, the query is transmitted to the search engine 506 through the visualization engine 503 of the visual apparatus 500.
  • the query usually takes the form of a single keyword or a combination of keywords and conforms to the format defined by the search engine 506 .
  • the search engine 506 generates a search result based on the query.
  • the search result contains a plurality of documents, each of which constitutes a search result entry. Then the search engine 506 returns a ranked list of the search result to the dynamic cluster constructor 501 of the visual apparatus 500 .
  • The dynamic cluster constructor 501 may further comprise: a search result selecting unit 5011 configured to receive the ranked list of the search result returned by the search engine 506, select a predetermined number of search result entries from the received ranked list to generate a first search result, and save the first search result; and a clustering unit 5012 configured to cluster the first search result to generate clustering information and send the clustering information and the ranked list of the first search result to the correlation processor 502.
  • the clustering unit 5012 applies the Suffix Tree Clustering (STC) algorithm to perform the clustering, which algorithm has been described in detail above and its explanation is omitted here.
  • After receiving the generated clustering information and the ranked list of the first search result from the dynamic cluster constructor 501, the correlation processor 502 calculates the correlations between them. The content of the correlation information has been described in the previous embodiments, and its explanation is omitted here.
  • After calculating the correlations, the correlation processor 502 sends the clustering information, the ranked list of the first search result and their correlations to the visualization engine 503, which performs visualization processing, including representing the clustering information in a form readable to the web user, depicting the attributes of the clustering information, and the like.
  • the visualization engine 503 displays the visual cluster hierarchy and the ranked list of the search result on the browser 505 in a joint manner based on the correlations calculated by the correlation processor 502 .
  • the situations involved in the displaying in a joint manner have been described in the previous embodiments and their explanation is omitted here.
  • the visual apparatus 500 for enhancing search result navigation can be implemented in hardware circuits, such as super large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, and also can be implemented in software executed by various kinds of processors, and further can be implemented in a combination of the above-mentioned hardware circuits and software.
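  • For the software implementation mentioned above, the data flow between components 501, 502 and 503 can be sketched as three small classes wired into a pipeline. This is an illustration under assumptions only: the search engine and the clustering algorithm are passed in as plain callables, the output is text rather than a browser display, and the class and method names are hypothetical.

```python
from typing import Callable, Dict, List, Set


class DynamicClusterConstructor:  # corresponds to 501
    def __init__(self, cluster_fn: Callable[[List[str]], Dict[str, Set[int]]], limit: int):
        self.cluster_fn = cluster_fn
        self.limit = limit

    def build(self, ranked: List[str]):
        first = ranked[: self.limit]           # role of selecting unit 5011
        return first, self.cluster_fn(first)   # role of clustering unit 5012


class CorrelationProcessor:  # corresponds to 502
    def correlate(self, clusters, first):
        """Map each rank position to the names of the clusters containing it."""
        return {
            i: sorted(name for name, ids in clusters.items() if i in ids)
            for i in range(len(first))
        }


class VisualizationEngine:  # corresponds to 503
    def render(self, clusters, first, correlations):
        """Return the cluster view and the ranked list annotated with clusters."""
        tree = [f"{name} ({len(ids)})" for name, ids in clusters.items()]
        listing = [
            f"{i + 1}. {doc} [{', '.join(correlations[i])}]"
            for i, doc in enumerate(first)
        ]
        return tree, listing


def run_pipeline(search, query, cluster_fn, limit):
    """Query -> ranked list -> clusters -> correlations -> joint display data."""
    ranked = search(query)
    first, clusters = DynamicClusterConstructor(cluster_fn, limit).build(ranked)
    correlations = CorrelationProcessor().correlate(clusters, first)
    return VisualizationEngine().render(clusters, first, correlations)
```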
  • the visual apparatus 500 for enhancing search result navigation further comprises a keyword generator 504 configured to generate new query keywords when a cluster in the visual cluster hierarchy is selected and send the keywords to the search engine 506 .
  • the web user's selection is also sent to the keyword generator 504 through the visualization engine 503 .
  • the keyword generator 504 generates new query keywords based on the selection. How to generate new keywords has been described in the previous embodiments and its explanation is omitted here.
  • the keyword generator 504 may receive the cluster selected by the user on the browser and transmitted by the visualization engine 503 to generate new keywords, and send the generated new keywords to the search engine 506 for further searching.
  • the search engine 506 generates a ranked list of the new search result based on the new query keywords and returns it to the dynamic cluster constructor 501 .
  • the search result selecting unit 5011 of the dynamic cluster constructor 501 selects a predetermined number of search result entries, for instance, the first 200 search result entries, from the ranked list of the new search result, to generate a second search result and save it.
  • the selected search result entries can also be merged with those search result entries in the currently saved first search result that are contained in the selected cluster to form the second search result and the second search result is saved.
  • the clustering unit 5012 clusters the second search result to generate the sub-clustering information of the selected cluster.
  • the sub-clustering information and the ranked list of the second search result are sent to the correlation processor 502 .
  • The correlation processor 502 calculates the correlations between the sub-clustering information and the ranked list of the second search result. The content of the correlation information has been described in the above embodiments, and its description is omitted here. Then, the correlation processor 502 sends the sub-clustering information, the ranked list of the second search result and their correlations to the visualization engine 503.
  • the visualization engine 503 visualizes the clustering information and the sub-clustering information into a tree structure, wherein the clustering information items contained in the clustering information are taken as root nodes and the sub-clustering information items contained in the sub-clustering information are taken as branch nodes.
  • the visualization engine 503 directs displaying the sub-clustering information and the ranked list of the second search result in a joint manner on the browser 505 .
  • The visual apparatus 500 for enhancing search result navigation may continue to generate new query keywords for the selected sub-clustering information item through the keyword generator 504, and to search for a new search result and perform clustering through the dynamic cluster constructor 501, so as to generate a visual cluster hierarchy at different levels and help the web user find the content of interest.
  • The keyword generator 504 can also be integrated into the visualization engine 503, in which case it receives the selection of the web user through the visualization engine 503, generates new keywords based on the selection, and sends them to the search engine 506 through the visualization engine 503.
  • The visual apparatus 500 for enhancing search result navigation, with the keyword generator 504 incorporated, can dynamically search for more search results on the basis of the original limited search result and cluster the combination of the new search result and the original search result. It thereby constructs clustering information at various levels together with the previous clustering information, making it easy for the web user to get more detailed and more accurate search results.
  • FIG. 6 is a block diagram of an example of the keyword generator 504 . Next, in conjunction with FIG. 6 , a detailed description will be given.
  • the keyword generator 504 comprises: a document collector 601 configured to collect relevant documents required for generating query keywords; a weight calculator 602 configured to calculate the weights of all the words except stopwords in each one of the relevant documents; and a keyword combiner 603 configured to select the words with high weights and combine them with the current query keywords to generate new query keywords.
  • the document collector 601 collects relevant documents required for generating new query keywords based on the selection, so as to determine new keywords from these relevant documents.
  • the relevant documents collected by the document collector 601 are sent to the weight calculator 602 , which calculates the weights of all the words except stopwords in each document.
  • the keyword combiner 603 selects the words with high weights as new keywords and combines them with the current query keywords to generate new query keywords. How to collect the relevant documents and how to calculate the weights have been described in the previous embodiments and their explanation is omitted here.
  • the keyword generator 504 of this embodiment and its components can be implemented in hardware circuits, such as super-large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, and also can be implemented in software executed by various kinds of processors, and further can be implemented in a combination of the above-mentioned hardware circuits and software.
  • The above visual apparatus for enhancing search result navigation may be combined with an existing browser to form a new browser.
  • The existing browser may be, for instance, an IE browser from Microsoft Company, a Netscape browser from Netscape Company or the like.
  • The above visual apparatus for enhancing search result navigation may also be combined with an existing search engine to form a new search engine.
  • The existing search engine may be a known search engine, such as Google, Yahoo! or the like.
  • the present invention further provides a program product, comprising: program codes for implementing all the above methods and carrying media for carrying the program codes.

Abstract

A visual method for enhancing search result navigation, comprising: obtaining a first search result from a search engine; clustering the first search result to get clustering information; calculating the correlations between the clustering information and the ranked list of the first search result, and performing visualization processing on the clustering information; and displaying the visual cluster hierarchy and the ranked list of the first search result in a joint manner based on the correlations. When a cluster is selected, further searching is performed and the search result is clustered again. Using the present invention, through combining a traditional ranked list of search results and the visual cluster hierarchy of these search results to display them in a joint manner, a convenient way is provided for the web user to find the potential correlations between the visual cluster hierarchy and the ranked list of the search results. Besides, through dynamically searching and clustering more search results again, the web user may get more detailed and more accurate search results easily.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. application Ser. No. 11/619,665 filed Jan. 4, 2007, the complete disclosure of which, in its entirety, is herein incorporated by reference.
  • TECHNICAL FIELD
  • The present invention relates to computer information processing technology, specifically, to a visual method and apparatus for enhancing the navigation of search results returned by a search engine.
  • TECHNICAL BACKGROUND
  • With the wide application of the Internet, people can get lots of information from the web. However, due to the rapid growth of the web contents, it becomes more and more difficult for web users to find required information rapidly and accurately. Currently, web users mainly rely on search engines to find required information. Generally, the process in which a web user uses a search engine to find required information is as follows: the web user submits a query, which may be, for instance, a single keyword or a combination of keywords, to the search engine. Then, the search engine produces a ranked list of the search results based on the submitted query. The ranked list is returned and displayed on the browser used by the web user. The web user obtains the part of the search results of interest to him through viewing segments of the returned ranked list of the search results.
  • However, such a method of using a search engine to find information commonly has a problem that the search engine always returns too many search results. In this situation, since the screen size of a computer display is limited, it is impossible to display all the search results simultaneously to the web user. Thus, the web user usually needs to browse many Web pages to find the required information, resulting in low efficiency of the web user getting information. On the other hand, according to an investigation of web users, in most cases, a web user only looks at the first few web pages of the ranked list of search results. Thus, in fact, the search quality of a web user searching information is also very low.
  • In order to improve search quality, some methods for improving the browsability of search results have been proposed in recent years. Vivisimo Company proposed a solution in which the search results returned by a search engine are clustered and the clustering results are visually displayed together with the ranked list of the search results. Although this solution may provide a convenient way for web users to know the clustering of the search results, it only displays the clustering results and the ranked list of search results simultaneously but independently, without presenting their correlations clearly and visually to the web users. Besides, the solution only clusters and displays part of the search results (for instance, the first 210 search results); if the web user selects a clustering item in the clustering results, the search results contained in the clustering item will be displayed, but no more relevant search results can be produced, so the web user cannot get more information of interest.
  • SUMMARY OF THE INVENTION
  • The present invention is proposed to address the above technical problems that exist in the prior art. The objective of the present invention is to provide a visual method and apparatus for enhancing search result navigation, whereby the traditional ranked list of the search results and the visual cluster hierarchy of the search results may be displayed in a joint manner and more search results may be obtained dynamically, so as to help web users to get required information rapidly and accurately.
  • According to an aspect of the present invention, there is provided a visual method for enhancing search result navigation, comprising the following steps:
      • obtaining a first search result from a search engine;
      • clustering the first search result to get clustering information;
      • calculating the correlations between the clustering information and the ranked list of the first search result, and performing visualization processing on the clustering information; and
      • displaying the visual cluster hierarchy and the ranked list of the first search result in a joint manner based on the correlations.
  • Preferably, the first search result contains a predetermined number of search result entries in the search results produced by the search engine based on a query.
  • Preferably, displaying the visual cluster hierarchy and the ranked list of the first search result in a joint manner comprises any one of the following:
      • a. when the pages of the ranked list of the first search result are displayed, the cluster(s) in the visual cluster hierarchy that contains the most search result entries of the first search result is highlighted;
      • b. when the pages of the ranked list of the first search result are displayed, the cluster(s) in the visual cluster hierarchy that contains most of the search results in the first page of the first search result is highlighted;
      • c. when a search result entry in the ranked list of the first search result is selected, the cluster(s) in the visual cluster hierarchy that contains the search result entry and the most search result entries of the first search result is highlighted;
      • d. when a search result entry in the ranked list of the first search result is selected, the cluster(s) in the visual cluster hierarchy that contains the search result entry and the most search result entries of the first search result in the first page is highlighted; and
      • e. when a cluster in the visual cluster hierarchy is selected, the ranked list of the search result entries in the first search result that are contained in the cluster is displayed.
  • Preferably, the method according to the present invention further comprises the following steps:
      • selecting a cluster in the visual cluster hierarchy;
      • generating new query keywords and submitting them to the search engine;
      • generating a new search result based on the new query keywords by the search engine;
      • selecting a predetermined number of search result entries in the new search result to produce a second search result;
      • clustering the second search result to obtain sub-clustering information;
        • calculating the correlations between the sub-clustering information and the ranked list of the second search result, and performing visualization processing on the sub-clustering information;
        • displaying the visual sub-clustering information and the ranked list of the second search result in a joint manner based on the correlations; and
        • when a visual sub-clustering information item in the visual sub-clustering information is selected, repeating the above steps.
  • Preferably, the step of generating new query keywords comprises: combining the current query keywords with the name of the selected cluster to generate new query keywords.
      • Preferably, the step of generating new query keywords comprises:
      • collecting relevant documents;
      • determining keywords in the relevant documents; and
      • combining the keywords with the current query keywords to generate new query keywords.
  • According to another aspect of the present invention, there is provided a visual apparatus for enhancing search result navigation, comprising:
      • a dynamic cluster constructor configured to select search results coming from a search engine to get a first search result and dynamically cluster the first search result to generate clustering information;
      • a correlation processor configured to calculate the correlations between the clustering information and the ranked list of the first search result; and
      • a visualization engine configured to perform visualization processing on the clustering information to produce visual cluster hierarchy, and display the visual cluster hierarchy and the ranked list of the first search result on a browser in a joint manner based on the correlations.
      • Preferably, the dynamic cluster constructor comprises:
      • a search result selecting unit configured to select a predetermined number of search result entries from the received search result to generate a first search result and save the first search result; and
        • a clustering unit configured to cluster the first search result to generate clustering information.
  • Preferably, the visual apparatus for enhancing search result navigation further comprises:
      • a keyword generator configured to generate new query keywords when a cluster in the visual cluster hierarchy is selected and send the keywords to the search engine through the dynamic cluster constructor;
      • wherein, the search engine performs searching based on the new query keywords generated by the keyword generator and returns a ranked list of the new search result;
      • the search result selecting unit of the dynamic cluster constructor selects a predetermined number of search result entries from the received ranked list of the new search result to generate a second search result and save it;
      • the clustering unit of the dynamic cluster constructor clusters the second search result to generate sub-clustering information;
      • the visualization engine visualizes the clustering information and the sub-clustering information into a tree structure, wherein the clustering information items contained in the clustering information are taken as root nodes and the sub-clustering information items contained in the sub-clustering information are taken as branch nodes, and the visualization engine also displays the visual cluster hierarchy and the ranked list of the second search result in a joint manner on the browser.
  • Preferably, the keyword generator comprises:
      • a document collector configured to collect relevant documents;
      • a weight calculator configured to calculate weights of all words except stopwords in each one of the relevant documents; and
      • a keyword combiner configured to select the words with high weights and combine them with the current query keywords to generate new query keywords.
  • According to yet another aspect of the present invention, there is provided a browser that comprises the visual apparatus for enhancing search result navigation.
  • According to still another aspect of the present invention, there is provided a search engine that comprises the visual apparatus for enhancing search result navigation.
  • According to a further aspect of the present invention, there is provided a program product, comprising: program codes for implementing the method; and carrying media for carrying the program codes.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a flowchart of a visual method for enhancing search result navigation according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the display of a browser after using the visual method for enhancing search result navigation of the embodiment shown in FIG. 1;
  • FIG. 3 is a flowchart of a visual method for enhancing search result navigation according to another embodiment of the present invention;
  • FIG. 4 is a flowchart of an example of generating new query keywords in the embodiment shown in FIG. 3;
  • FIG. 5 is a block diagram of a visual apparatus for enhancing search result navigation according to an embodiment of the present invention; and
  • FIG. 6 is a block diagram of the keyword generator in the embodiment shown in FIG. 5.
  • DETAILED DESCRIPTION OF THE INVENTION
  • It is believed that the above and other objectives, features and advantages of the present invention will become more apparent through the following detailed description of particular embodiments of the present invention taken in conjunction with the drawings.
  • FIG. 1 is a flowchart of a visual method for enhancing search result navigation according to an embodiment of the present invention.
  • As shown in FIG. 1, first in Step 101, a first search result is obtained from a search engine. When the search engine receives a query submitted by a web user, a search result may be generated based on the query. The search result includes a plurality of documents, each of which constitutes a search result entry. Usually, a web user submits a query through a browser, which may be, for instance, an IE browser from Microsoft Company, a Netscape browser from Netscape Company or the like, and the search engine may be a known search engine such as Google or Yahoo!. As known by those skilled in the art, a query is usually in the form of a single keyword or a combination of keywords, and conforms to the format as defined by the engine being used.
  • After obtaining the first search result, in Step 105, the first search result is clustered so as to get the clustering information of the search result. The clustering operation is performed on the first search result by using a clustering algorithm based on the similarities between the segments of the documents in the search result. In this way, the documents related to a subject may be collected into a cluster. In order to ensure that the search engine still works in real time, the adopted clustering algorithm should not introduce a substantial delay. The clustering algorithm takes document snippets as input and the generated clustering information has a readable description content that is convenient for a web user to browse quickly.
  • Next, a detailed description will be given of the process of clustering the first search result using a clustering algorithm. In this embodiment, preferably, the Suffix Tree Clustering (STC) algorithm is used as the clustering algorithm. The STC algorithm is a fast, incremental and linear-time clustering algorithm for clustering web search results. Its basic idea is to identify phrases that are common to a set of documents obtained as the search results. First, a base cluster is defined to be a set of documents that share a common phrase. Then each document in the search result is preprocessed, that is, the string of text representing each document is transformed using a stemming algorithm, sentence boundaries are marked, and non-word tokens, such as numbers, HTML tags and most punctuation, are stripped. After that, base clusters are identified using a suffix tree, a step that can be viewed as creating an inverted index of phrases for the set of documents. Finally, these identified base clusters are merged into clusters and the common phrases may be used as the names of the clusters.
  • Here the STC algorithm is only taken as an example of clustering algorithms. Those skilled in the art may use any other suitable clustering algorithm to cluster the search results.
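  • As a rough illustration of the base-cluster step, the following sketch builds an inverted index from phrases to the documents that contain them. It is a simplified stand-in and not the STC algorithm itself: it uses a plain dictionary instead of a suffix tree, skips stemming and sentence-boundary marking, does not merge base clusters, and assumes the snippets are already plain text. The names `tokenize`, `base_clusters`, `max_len` and `min_docs` are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(snippet):
    """Lowercase a snippet and keep alphabetic word tokens only (no stemming)."""
    return re.findall(r"[a-z]+", snippet.lower())


def base_clusters(snippets, max_len=3, min_docs=2):
    """Map each phrase of 1..max_len words to the snippet indices containing it.

    Assumption: a plain dictionary replaces the suffix tree used by STC.
    """
    index = defaultdict(set)
    for doc_id, snippet in enumerate(snippets):
        words = tokenize(snippet)
        for n in range(1, max_len + 1):
            for start in range(len(words) - n + 1):
                index[" ".join(words[start:start + n])].add(doc_id)
    # A base cluster is a phrase shared by at least min_docs documents.
    return {phrase: docs for phrase, docs in index.items() if len(docs) >= min_docs}


snippets = [
    "Information visualization software tools",
    "Open source visualization software",
    "History of information visualization",
]
clusters = base_clusters(snippets)
```

  • In this toy input, the shared phrases act as candidate cluster names, in the same spirit as the common phrases described above.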
  • To make the clustering process fast, preferably, the first search result contains only a predetermined number of search result entries in the search result produced by the search engine, for instance, in the example shown in FIG. 2, the first search result contains the first 206 documents in the ranked list of the search result. The number of documents in the first search result may be set by the web user through the user interface of the browser, and may affect the time for performing the clustering operation.
  • After getting the clustering information of the first search result, in Step 110, the correlations between the clustering information and the ranked list of the first search result are calculated. The correlations comprise, for instance, at least one of the following: which search result entries of the first search result each clustering information item contains; the number of search result entries contained in each clustering information item; which clustering information items contain each search result entry; which clustering information item contains the most search result entries; and which clustering information item contains the most search result entries of the first page.
  • Of course, the above listed examples of the correlations are only illustrative, and this embodiment is not limited thereto. Those skilled in the art can use any other suitable information representing the correlations.
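  • The correlation information listed above can be sketched as a small lookup structure. This is only one possible representation under stated assumptions: clusters are given as a mapping from cluster name to a set of entry identifiers, the ranked list is a list of identifiers, and the function name `compute_correlations` and the parameter `page_size` are hypothetical. Ties are resolved by dictionary order, which the text does not specify.

```python
def compute_correlations(clusters, ranked_ids, page_size=10):
    """Summarize how clusters relate to a ranked list of entry ids.

    clusters: dict mapping cluster name -> set of entry ids (assumed non-empty).
    ranked_ids: entry ids in rank order; the first page is ranked_ids[:page_size].
    """
    first_page = set(ranked_ids[:page_size])
    entries_per_cluster = {name: len(ids) for name, ids in clusters.items()}
    clusters_per_entry = {
        entry: [name for name, ids in clusters.items() if entry in ids]
        for entry in ranked_ids
    }
    first_page_counts = {name: len(ids & first_page) for name, ids in clusters.items()}
    return {
        "entries_per_cluster": entries_per_cluster,
        "clusters_per_entry": clusters_per_entry,
        "largest_cluster": max(entries_per_cluster, key=entries_per_cluster.get),
        "largest_on_first_page": max(first_page_counts, key=first_page_counts.get),
    }


clusters = {"InfoVis": {1, 2, 3}, "Software": {4, 5, 6, 7}, "Research": {2, 8}}
result = compute_correlations(clusters, [1, 2, 3, 4, 5, 6, 7, 8], page_size=4)
```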
  • In Step 115, visualization processing is performed on the obtained clustering information, including representing the clustering information in a form visible to the web user, preferably, by using a tree visualization technique to represent a clustering tree structure; and describing the attributes of various clustering information items in the clustering information, such as, the name of each clustering information item, the number of search result entries contained therein and the like. After the visualization processing, the clustering information becomes visual cluster hierarchy for displaying on the browser to the web user.
  • Although in this embodiment Step 110 of calculating the correlations between the clustering information and the ranked list of the first search result is performed before Step 115 of performing visualization processing on the clustering information, essentially, these two steps may be performed in parallel without strict order. As an alternative, Step 115 of performing visualization processing on the clustering information may be performed first, and then Step 110 of calculating the correlations between the clustering information and the ranked list of the first search result may be performed.
  • Then in Step 120, the visual cluster hierarchy generated in Step 115 and the ranked list of the first search result are displayed in a joint manner, so as to help the web user to locate search result entries of interest more easily and know the clustering of the search result entries in the first search result as the whole.
  • Displaying the clustering information and the ranked list of the first search result in a joint manner in Step 120 comprises the following cases:
      • 1) When the pages of the ranked list of the first search result are displayed on the browser, the cluster in the visual cluster hierarchy that contains the most search result entries of the first search result is highlighted, that is, the cluster that contains the most search result entries is highlighted. Thus, the web user can easily know which cluster contains the most search result entries.
  • Preferably, in this case, in the visual cluster hierarchy the cluster that contains the most search result entries of the first search result in the first page may be highlighted. Because the search result entries displayed in the first page usually have high correlation with the query submitted by the web user, the web user is more concerned with the clustering of the search result on this page, so highlighting such a cluster helps the web user locate the content of interest more conveniently.
      • 2) When a search result entry in the ranked list of the first search result is selected, in the visual cluster hierarchy the cluster that contains the search result entry and the most search result entries of the first search result is highlighted, that is, the cluster to which the selected search result entry belongs and which contains the most search result entries of the first search result is highlighted. Thus, the web user can get help to quickly know the cluster to which the selected search result entry belongs and which has the most search result entries.
  • Preferably, in this case, the cluster in the visual cluster hierarchy that contains the selected search result entry and the most search result entries of the first search result in the first page can be highlighted. Thus, the web user can get help to quickly know the cluster to which the selected search result entry belongs and which has the most search result entries in the first page.
      • 3) When a cluster in the visual cluster hierarchy is selected, the ranked list of the search result entries in the first search result that are contained in the cluster is displayed, that is, the specific search result entries contained in the selected cluster are displayed, enabling the web user to know the specific content of the selected cluster.
  • From the above description it can be seen that, using the visual method for enhancing search result navigation according to this embodiment, through combining a traditional ranked list of search results and the visual cluster hierarchy of these search results to be displayed in a joint manner, a convenient way is provided for the web user to find the potential correlations between the visual cluster hierarchy and the ranked list of the search results, enabling the web user to locate the required content more easily.
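  • The three display cases of Step 120 can be sketched as two small functions: one that picks the cluster to highlight, and one that lists the entries of a selected cluster in rank order. This is a minimal sketch under assumptions, not the patented implementation: clusters are a mapping from name to a set of entry ids, `first_page` is an optional set of ids shown on the first page, and the function names are hypothetical.

```python
def cluster_to_highlight(clusters, selected_entry=None, first_page=None):
    """Return the name of the cluster to highlight, or None if none qualifies.

    Without selected_entry this covers case 1; with it, case 2. When
    first_page is given, cluster size is counted on the first page only.
    """
    candidates = {
        name: ids
        for name, ids in clusters.items()
        if selected_entry is None or selected_entry in ids
    }
    if not candidates:
        return None

    def size(name):
        ids = candidates[name]
        return len(ids & first_page) if first_page is not None else len(ids)

    return max(candidates, key=size)


def entries_for_cluster(clusters, ranked_ids, cluster_name):
    """Case 3: list the entries of the selected cluster in rank order."""
    members = clusters[cluster_name]
    return [entry for entry in ranked_ids if entry in members]


clusters = {"InfoVis": {1, 2, 3}, "Software": {4, 5, 6, 7}, "Research": {2, 8}}
ranked_ids = [1, 2, 3, 4, 5, 6, 7, 8]
```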
  • Referring to FIG. 2 that schematically shows a display of a browser after using the visual method for enhancing search result navigation of the embodiment as shown in FIG. 1, a detailed description will be given to an example in which the visual method for enhancing search result navigation according to this embodiment is practically applied.
  • As shown in FIG. 2, the example uses an IE browser, which is well known to those skilled in the art, and the query keywords submitted by the web user are “information visualization”. Based on these query keywords, the search engine “Google” generates a search result including at least 2,355,000 search result entries, wherein the first 206 search result entries are selected to be the first search result to be clustered and displayed. On the left of FIG. 2 the visual cluster hierarchy is displayed, wherein the clusters are displayed in the form of nodes and the name of each cluster and the number of search result entries contained therein are also displayed; on the right of FIG. 2, the first page of the ranked list of the first 206 search result entries is displayed. According to the above description, in this case the cluster that contains the most search result entries, or the cluster that contains the most search result entries in the first page, is to be highlighted. In FIG. 2, the cluster InfoVis is highlighted in the form of a dark node, and contains 18 search result entries. The highlighting may take the form of high-brightness display, enlarged display or a color different from that of the other clusters. From this it can be seen that, in the search result displayed in the first page, the cluster InfoVis contains the most search result entries. Through this kind of display, the web user can clearly know the clustering of the search results and the most important clustering information item in the clustering information.
  • FIG. 3 is a flowchart of a visual method for enhancing search result navigation according to another embodiment of the present invention. Next, in conjunction with FIG. 3, a description of this embodiment will be given, wherein for the parts similar to those in the embodiment shown in FIG. 1 the same notations will be used and their explanations will be omitted where appropriate.
  • This embodiment is characterized by further searching for the search results related to the cluster selected by the web user and merging them into the cluster, and then performing clustering once more.
  • As shown in FIG. 3, if the user selects a cluster in the visual cluster hierarchy in Step 300, then, in addition to displaying the ranked list of the search result entries in the first search result that are contained in this cluster, the following operations may further be performed. In Step 301, new query keywords are generated and submitted to the search engine. In order to search for more search results related to the cluster, the current query keywords need to be further qualified to generate the new query keywords. The new query keywords may be generated by combining the current query keywords with the name of the selected cluster; in the example shown in FIG. 2, if the web user selects a cluster “Software”, the new query keywords would be “information visualization+software”.
  • In Step 305, the search engine generates a new search result based on the new query keywords. Then in Step 310, a predetermined number of search result entries, for instance, 300 search result entries, are selected from the newly generated search result, so as to generate a second search result. Preferably, the second search result can be generated by merging these search result entries with the search result entries currently contained in the cluster, further helping the user to find the required information quickly.
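  • Steps 301 and 310 can be sketched as follows. The function names, the lowercasing of the cluster name, and the merge order (new entries first, then the entries already in the cluster, with duplicates dropped) are assumptions made for illustration; the text only states that the entries are merged.

```python
def refine_query(current_query, cluster_name):
    """Step 301 sketch: combine the current query with the selected cluster name."""
    return f"{current_query}+{cluster_name.lower()}"


def build_second_result(new_ranked, cluster_entries, limit):
    """Step 310 sketch: keep the first `limit` new entries, then merge in the
    entries already contained in the selected cluster, dropping duplicates
    while preserving order (the merge order is an assumption)."""
    merged, seen = [], set()
    for entry in list(new_ranked[:limit]) + list(cluster_entries):
        if entry not in seen:
            seen.add(entry)
            merged.append(entry)
    return merged


query = refine_query("information visualization", "Software")
second = build_second_result(["u1", "u2", "u3", "u4"], ["u3", "u9"], limit=3)
```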
  • Then, in Step 315, the second search result is clustered to obtain sub-clustering information. This step applies a clustering method similar to that used in the embodiment shown in FIG. 1, and its description is omitted here.
  • After obtaining the sub-clustering information, in Step 320, the correlations between the sub-clustering information and the ranked list of the second search result are calculated, wherein the content of the correlation information has been described in the above embodiments and its description is omitted here.
  • Then, in Step 325, visualization processing is performed on the sub-clustering information. In this embodiment, the visualization processing of the sub-clustering information is also to represent the sub-clustering information in the form of nodes, and depict the name of the sub-clustering information and the number of the search result entries contained. The sub-clustering information after the visualization processing becomes the visual sub-clustering information.
  • Although in this embodiment Step 320 of calculating the correlations is performed before Step 325 of visualization processing, essentially, these two steps may be performed in parallel without strict order. As an alternative, the step of visualization may be performed first, then the step of calculating the correlations may be performed.
  • In Step 330, the visual sub-clustering information and the ranked list of the second search result are displayed on the browser in a joint manner. Displaying the visual sub-clustering information and the ranked list of the second search result in a joint manner is similar to that in the embodiment shown in FIG. 1 and its description is omitted here.
  • In this embodiment, the visual cluster hierarchy and the sub-clustering information are displayed using a tree structure, wherein the clustering information items contained in the visual cluster hierarchy are root nodes and the visual sub-clustering information items contained in the visual sub-clustering information are branch nodes. Using a tree structure to display visual cluster hierarchy and visual sub-clustering information can make the web user clearly understand their relations, allowing the web user to drill up and down in different levels of the visual cluster hierarchy.
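  • A tree of this kind can be sketched with a small node type that stores the attributes mentioned above (the name and the number of search result entries) and a list of child nodes. The class name `ClusterNode`, the text rendering, and the counts in the example are hypothetical and serve only to illustrate the parent and child relation between clusters and sub-clusters.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    """One node of the cluster hierarchy: a name, an entry count, and children."""
    name: str
    entry_count: int
    children: list = field(default_factory=list)

    def add_child(self, node):
        self.children.append(node)
        return node


def render(node, depth=0):
    """Return indented text lines, one per node, as a stand-in for a tree view."""
    lines = [f"{'  ' * depth}{node.name} ({node.entry_count})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Illustrative values only; they are not taken from FIG. 2.
root = ClusterNode("Software", 25)
root.add_child(ClusterNode("Open source", 9))
root.add_child(ClusterNode("Toolkits", 6))
lines = render(root)
```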
  • Besides, if the web user further selects a visual sub-clustering information item in the visual sub-clustering information (Step 335), Steps 301 to 330 will be repeated. If the web user continues to select a clustering information item in the next level clustering information of the visual cluster hierarchy, Steps 301 to 330 will further be repeated. Through such a repeated performing of the operation of “generating new query keywords—searching for a new search result—clustering”, more accurate search results can be provided to the web user.
  • From the above description it can be seen that, using the visual method for enhancing search result navigation according to this embodiment, it is possible to dynamically search for more search results on the basis of the original limited search result and to cluster the combination of the new search results and the original search result, so as to form the clustering information at various levels together with the previous clustering information, enabling the web user to get more detailed and more accurate search results easily.
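  • The repeated cycle of generating new query keywords, searching, and clustering can be sketched as a loop over the clusters the user selects at each level. The search engine and the clustering algorithm are passed in as callables; `drill_down`, `search`, `cluster` and `limit` are hypothetical names, and the stub functions in the example stand in for a real search engine and clusterer. Accumulating the query across levels is an assumption about how "current query keywords" evolves.

```python
def drill_down(query, selections, search, cluster, limit):
    """Run one refine-search-cluster cycle per selected cluster name.

    search: callable taking a query string and returning a ranked list.
    cluster: callable taking a list of entries and returning clustering info.
    Returns a list of (query, clustering info) pairs, one per level.
    """
    levels = []
    for name in selections:
        query = f"{query}+{name.lower()}"   # generate new query keywords
        entries = search(query)[:limit]     # keep a predetermined number of entries
        levels.append((query, cluster(entries)))
    return levels


# Stubs used only to exercise the loop.
def fake_search(q):
    return [f"{q}#{i}" for i in range(5)]


def fake_cluster(entries):
    return {"all": set(entries)}


levels = drill_down(
    "information visualization", ["Software", "Toolkits"],
    fake_search, fake_cluster, limit=2,
)
```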
  • As to how to generate new query keywords, in addition to the above mentioned method of combining the previous query with the name of the selected cluster to generate new query keywords, a method for generating new query keywords as shown in FIG. 4 may also be used. Next, in conjunction with FIG. 4, a detailed description will be given to the generation of new query keywords in the embodiment shown in FIG. 3.
  • As shown in FIG. 4, in Step 401, the relevant documents are collected. There are two kinds of relevant documents: the documents that have been read by the web user and the documents that belong to the selected cluster.
  • Then, keywords in the collected relevant documents are determined. In this embodiment, the tf-idf method is used to determine keywords. First, in Step 405, the weights of all words except stopwords in each of the collected relevant documents are calculated, where “stopwords” are words with no semantic value, such as “of,” “the,” “to” and the like. Since such words appear in every document with high frequency but carry no actual semantic meaning, their weights are not calculated. The formula for calculating the weight of a word with actual meaning is as follows:

  • $value_i = tf \cdot idf$,  (1)
  • where $value_i$ represents the weight of a word; $tf$ is the frequency of the term in the relevant document set; and $idf$ = all_documents/keyword_documents, where all_documents represents the number of all the relevant documents and keyword_documents represents the number of the relevant documents that contain this word. Formula (1) results in larger weights for terms that appear more frequently in the relevant documents, and larger weights for more unusual terms. Then in Step 407, the words with high weights are determined as keywords.
  • After determining the keywords, in Step 410, these keywords are combined with the current query keyword to generate new query keywords.
  • From the above description it can be seen that, using the method for generating new query keywords of this embodiment, it is possible to determine keywords more accurately based on the selection of and the documents read by the web user, and use the keywords to search for the content of interest to the web user.
  • Here, the method for generating new query keywords as shown in FIG. 4 is only illustrative and not restrictive. Those skilled in the art can apply any other suitable method for generating keywords.
  • Under the same inventive concept, FIG. 5 is a block diagram of a visual apparatus 500 for enhancing search result navigation according to an embodiment of the present invention. Next, in conjunction with FIG. 5, a detailed description will be given to this embodiment, in which the visual apparatus 500 for enhancing search result navigation is installed between a search engine 506 and a browser 505 as a separate apparatus.
  • As shown in FIG. 5, the visual apparatus 500 for enhancing search result navigation comprises: a dynamic cluster constructor 501, coupled to the search engine 506 and configured to dynamically cluster the search result from the search engine 506 to generate clustering information; a correlation processor 502 configured to calculate the correlations between the clustering information and the ranked list of the search result; and a visualization engine 503, coupled to the browser 505 and configured to perform visualization processing on the clustering information to produce a visual cluster hierarchy and to display the visual cluster hierarchy and the ranked list of the search result on the browser 505 in a joint manner based on the correlations.
  • In this embodiment, the search engine 506 may be a known search engine, such as Google, Yahoo! or the like, and the browser 505 may be, for example, an IE browser from Microsoft Company, a Netscape browser from Netscape Company, or the like.
  • Next, a detailed description will be given to the specific operation process of the visual apparatus 500 for enhancing search result navigation.
  • When a web user submits a query through the browser 505, the query is transmitted to the search engine 506 through the visualization engine 503 of the visual apparatus 500. The query usually takes the form of a single keyword or a combination of keywords and conforms to the format defined by the search engine 506. The search engine 506 generates a search result based on the query. The search result contains a plurality of documents, each of which constitutes a search result entry. Then the search engine 506 returns a ranked list of the search result to the dynamic cluster constructor 501 of the visual apparatus 500.
  • Preferably, the dynamic cluster constructor 501 may further comprise: a search result selecting unit 5011 configured to receive the ranked list of the search result returned by the search engine 506, select a predetermined number of search result entries from the received ranked list to generate a first search result, and save the first search result; and a clustering unit 5012 configured to cluster the first search result to generate clustering information and send the clustering information and the ranked list of the first search result to the correlation processor 502. In this embodiment, the clustering unit 5012 applies the Suffix Tree Clustering (STC) algorithm to perform the clustering; this algorithm has been described in detail above and its explanation is omitted here. The STC algorithm is taken only as an example of clustering algorithms, and those skilled in the art may use any other suitable clustering algorithm to cluster the search result.
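  • The division of work between the search result selecting unit 5011 and the clustering unit 5012 can be sketched as below. The clustering function is deliberately not STC: it is a much simpler stand-in that groups snippets by shared word phrases and omits the suffix tree, the cluster scoring, and the merging of overlapping clusters. The function names, the `limit`, `max_len` and `min_docs` parameters, and the use of plain-text snippets are assumptions made for illustration.

```python
from collections import defaultdict


def select_first_result(ranked_list, limit=200):
    """Selecting unit: keep a predetermined number of top-ranked entries."""
    return ranked_list[:limit]


def phrase_clusters(snippets, max_len=2, min_docs=2):
    """Clustering unit stand-in: group snippet indices by shared phrases.

    Every word n-gram of up to max_len words becomes a candidate cluster
    label; a candidate is kept when at least min_docs snippets contain it.
    """
    clusters = defaultdict(set)
    for idx, snippet in enumerate(snippets):
        words = snippet.lower().split()
        for n in range(1, max_len + 1):
            for start in range(len(words) - n + 1):
                phrase = " ".join(words[start:start + n])
                clusters[phrase].add(idx)
    return {
        phrase: sorted(ids)
        for phrase, ids in clusters.items()
        if len(ids) >= min_docs
    }
```

  • Because the clustering step is an ordinary function from entries to named groups, any other suitable algorithm can be substituted for it without changing the selecting unit.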
  • After receiving the generated clustering information and the ranked list of the first search result from the dynamic cluster constructor 501, the correlation processor 502 calculates the correlations between them, the content contained in the correlation information having been described in the previous embodiments and its explanation being omitted here.
  • After calculating the correlations, the correlation processor 502 sends the clustering information, the ranked list of the first search result and their correlations to the visualization engine 503, which performs visualization processing, including representing the clustering information in a form readable to the web user, depicting the attributes of the clustering information and the like.
  • Then the visualization engine 503 displays the visual cluster hierarchy and the ranked list of the search result on the browser 505 in a joint manner based on the correlations calculated by the correlation processor 502. The situations involved in the displaying in a joint manner have been described in the previous embodiments and their explanation is omitted here.
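  • One way to picture the role of the correlation processor 502 and the joint display is the sketch below. It assumes, purely for illustration, that the correlations are a two-way mapping between clusters and ranked-list entries, and it shows one of the joint-display situations recited in the claims: highlighting the cluster that contains the most entries of the page currently shown. The function names, the entry identifiers, and the `page_size` parameter are hypothetical.

```python
def build_correlations(clusters, ranked_list):
    """Return (cluster -> entries in rank order, entry -> cluster names).

    clusters maps a cluster name to the entries it contains;
    ranked_list holds the same entries in search-engine rank order.
    """
    rank_of = {entry: pos for pos, entry in enumerate(ranked_list)}
    cluster_to_entries = {
        name: sorted((e for e in members if e in rank_of), key=rank_of.get)
        for name, members in clusters.items()
    }
    entry_to_clusters = {entry: [] for entry in ranked_list}
    for name, members in cluster_to_entries.items():
        for entry in members:
            entry_to_clusters[entry].append(name)
    return cluster_to_entries, entry_to_clusters


def cluster_to_highlight(cluster_to_entries, ranked_list, page, page_size=10):
    """Pick the cluster holding the most entries of the displayed page."""
    visible = set(ranked_list[page * page_size:(page + 1) * page_size])
    counts = {
        name: len(visible.intersection(members))
        for name, members in cluster_to_entries.items()
    }
    best = max(counts, key=counts.get, default=None)
    if best is None or counts[best] == 0:
        return None  # no cluster overlaps the visible page
    return best
```

  • The `cluster_to_entries` mapping also covers the opposite direction of navigation: when a cluster is selected, its value is the ranked list of the entries contained in that cluster.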
  • The visual apparatus 500 for enhancing search result navigation according to this embodiment and its components can be implemented in hardware circuits, such as super large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, and also can be implemented in software executed by various kinds of processors, and further can be implemented in a combination of the above-mentioned hardware circuits and software.
  • From the above description it can be seen that the visual apparatus 500 for enhancing search result navigation according to this embodiment combines a traditional ranked list of search results with the visual cluster hierarchy of these search results and displays them in a joint manner. This gives the web user a convenient way to find the potential correlations between the visual cluster hierarchy and the ranked list of the search results, and so to locate the required content more easily.
  • Preferably, the visual apparatus 500 for enhancing search result navigation further comprises a keyword generator 504 configured to generate new query keywords when a cluster in the visual cluster hierarchy is selected and to send the keywords to the search engine 506. In order to further help the web user obtain more search results with higher relevance, when the web user uses the browser 505 to select a cluster in the visual cluster hierarchy, in addition to the ranked list of the search result entries contained in the cluster being displayed, the web user's selection is also sent to the keyword generator 504 through the visualization engine 503. The keyword generator 504 generates new query keywords based on the selection. How to generate new keywords has been described in the previous embodiments and its explanation is omitted here. Preferably, the keyword generator 504 may receive the cluster selected by the user on the browser and transmitted by the visualization engine 503, generate new keywords, and send the generated new keywords to the search engine 506 for further searching. The search engine 506 generates a ranked list of the new search result based on the new query keywords and returns it to the dynamic cluster constructor 501.
  • After receiving the returned ranked list of the new search result, the search result selecting unit 5011 of the dynamic cluster constructor 501 selects a predetermined number of search result entries, for instance, the first 200 search result entries, from the ranked list of the new search result, to generate a second search result and save it. Preferably, the selected search result entries can also be merged with those search result entries in the currently saved first search result that are contained in the selected cluster to form the second search result and the second search result is saved. Then, the clustering unit 5012 clusters the second search result to generate the sub-clustering information of the selected cluster. The sub-clustering information and the ranked list of the second search result are sent to the correlation processor 502.
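  • The construction of the second search result, including the optional merge with the saved first search result, can be sketched as follows. The text does not specify how the merged entries are ordered or how duplicates are handled, so the choices here (new entries first, then previously saved entries of the selected cluster, each entry kept once) are assumptions, as are the function and parameter names.

```python
def build_second_result(new_ranked_list, first_result,
                        selected_cluster_entries, limit=200):
    """Select top entries of the new result and merge in saved cluster entries.

    new_ranked_list: ranked entries returned for the new query keywords.
    first_result: the currently saved first search result.
    selected_cluster_entries: entries contained in the selected cluster.
    """
    selected = new_ranked_list[:limit]
    seen = set(selected)
    cluster_members = set(selected_cluster_entries)
    merged = list(selected)
    for entry in first_result:
        # Keep only saved entries that belong to the selected cluster
        # and are not already present in the new selection.
        if entry in cluster_members and entry not in seen:
            merged.append(entry)
            seen.add(entry)
    return merged
```

  • The merged list is what the clustering unit 5012 would then cluster to obtain the sub-clustering information of the selected cluster.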
  • Similarly, the correlation processor 502 calculates the correlations between the sub-clustering information and the ranked list of the second search result, and the content of the correlation information has been described in the above embodiments and its description is omitted here. Then, the correlation processor 502 sends the sub-clustering information, the ranked list of the second search result and their correlations to the visualization engine 503.
  • In addition to performing visualization processing on the sub-clustering information, the visualization engine 503 visualizes the clustering information and the sub-clustering information into a tree structure, wherein the clustering information items contained in the clustering information are taken as root nodes and the sub-clustering information items contained in the sub-clustering information are taken as branch nodes.
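  • The tree structure described above can be represented with a small recursive data type. In this sketch the class name `ClusterNode`, the helper functions, and the text rendering are illustrative assumptions; the lookup is recursive so that a sub-clustering information item can itself receive children when the web user drills down further.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    name: str
    entries: list
    children: list = field(default_factory=list)


def find_node(nodes, name):
    """Depth-first search for the node with the given name."""
    for node in nodes:
        if node.name == name:
            return node
        found = find_node(node.children, name)
        if found is not None:
            return found
    return None


def attach_sub_clusters(roots, selected_name, sub_clusters):
    """Hang the sub-clusters of the selected cluster under it as branch nodes."""
    node = find_node(roots, selected_name)
    if node is None:
        raise KeyError(selected_name)
    node.children = [ClusterNode(n, e) for n, e in sub_clusters.items()]
    return node


def render(nodes, depth=0):
    """Return indented text lines, one per node, with entry counts."""
    lines = []
    for node in nodes:
        lines.append("  " * depth + f"{node.name} ({len(node.entries)})")
        lines.extend(render(node.children, depth + 1))
    return lines
```

  • For example, attaching sub-clusters "price" and "review" to a root cluster "cars" and calling `render` yields the root line followed by two lines indented one level.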
  • Then, based on the correlations between the sub-clustering information and the ranked list of the second search result, the visualization engine 503 directs the display of the sub-clustering information and the ranked list of the second search result in a joint manner on the browser 505.
  • If the web user continues to select a sub-clustering information item, the visual apparatus 500 for enhancing search result navigation may continue to generate new query keywords for the selected sub-clustering information item through the keyword generator 504, and to search for a new search result and perform clustering through the dynamic cluster constructor 501, so as to generate visual cluster hierarchies at different levels that help the web user find the content of interest.
  • Alternatively, the keyword generator 504 can be integrated into the visualization engine 503, in which case it receives the web user's selection through the visualization engine 503, generates new keywords based on the selection, and sends them to the search engine 506 through the visualization engine 503.
  • From the above description it can be seen that the visual apparatus 500 for enhancing search result navigation, when it incorporates the keyword generator 504, can dynamically search for more search results on the basis of the original limited search result and cluster the combination of the new search result and the original search result, so as to construct clustering information at various levels together with the previous clustering information, enabling the web user to obtain more detailed and more accurate search results easily.
  • FIG. 6 is a block diagram of an example of the keyword generator 504. Next, in conjunction with FIG. 6, a detailed description will be given.
  • As shown in FIG. 6, the keyword generator 504 comprises: a document collector 601 configured to collect relevant documents required for generating query keywords; a weight calculator 602 configured to calculate the weights of all the words except stopwords in each one of the relevant documents; and a keyword combiner 603 configured to select the words with high weights and combine them with the current query keywords to generate new query keywords.
  • When the keyword generator 504 receives a selection of the web user, the document collector 601 collects relevant documents required for generating new query keywords based on the selection, so as to determine new keywords from these relevant documents. The relevant documents collected by the document collector 601 are sent to the weight calculator 602, which calculates the weights of all the words except stopwords in each document. The keyword combiner 603 selects the words with high weights as new keywords and combines them with the current query keywords to generate new query keywords. How to collect the relevant documents and how to calculate the weights have been described in the previous embodiments and their explanation is omitted here.
  • The keyword generator 504 of this embodiment and its components can be implemented in hardware circuits, such as super-large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, and also can be implemented in software executed by various kinds of processors, and further can be implemented in a combination of the above-mentioned hardware circuits and software.
  • From the above description it can be seen that, using the keyword generator 504 of this embodiment, it is possible to determine keywords more accurately based on the web user's selection and the documents the web user has read, and to use these keywords to search for the content of interest to the web user.
  • Besides, the above visual apparatus for enhancing search result navigation may be combined with an existing browser to form a new browser. The existing browser may be, for instance, an IE browser from Microsoft Company, a Netscape browser from Netscape Company or the like.
  • On the other hand, the above visual apparatus for enhancing search result navigation may be combined with an existing search engine to form a new search engine. The existing search engine may be a known search engine, such as Google, Yahoo! or the like.
  • The present invention further provides a program product, comprising: program codes for implementing all the above methods and carrying media for carrying the program codes.
  • Though a visual method and corresponding apparatus for enhancing search result navigation of the present invention have been described in detail in conjunction with embodiments, it should be understood that those skilled in the art can make various modifications to the above-mentioned embodiments without departing from the spirit and scope of the present invention.

Claims (22)

1. A visual method for enhancing search result navigation, comprising:
obtaining a first search result from a search engine;
clustering the first search result to get clustering information;
calculating the correlations between the clustering information and a ranked list of the first search result, and performing visualization processing on the clustering information; and
displaying the visual cluster hierarchy and the ranked list of the first search result in a joint manner based on the correlations.
2. The visual method for enhancing search result navigation according to claim 1, all the limitations of which are incorporated herein by reference, wherein the first search result contains a predetermined number of search result entries in the search results produced by the search engine based on a query.
3. The visual method for enhancing search result navigation according to claim 1, all the limitations of which are incorporated herein by reference, wherein displaying the visual cluster hierarchy and the ranked list of the first search result in a joint manner comprises one of the following:
a. when the pages of the ranked list of the first search result are displayed, the cluster in the visual cluster hierarchy that contains the most search result entries of the first search result is highlighted;
b. when the pages of the ranked list of the first search result are displayed, the cluster in the visual cluster hierarchy that contains the most search result entries of the first search result in the first page is highlighted;
c. when a search result entry in the ranked list of the first search result is selected, the cluster in the visual cluster hierarchy that contains the search result entry and the most search result entries of the first search result is highlighted;
d. when a search result entry in the ranked list of the first search result is selected, the cluster in the visual cluster hierarchy that contains the search result entry and the most search result entries of the first search result in the first page is highlighted; and
e. when a cluster in the visual cluster hierarchy is selected, the ranked list of the search result entries in the first search result that are contained in the cluster is displayed.
4. The visual method for enhancing search result navigation according to claim 1, all the limitations of which are incorporated herein by reference, wherein the step of clustering the first search result applies the Suffix Tree Clustering algorithm.
5. The visual method for enhancing search result navigation according to claim 1, all the limitations of which are incorporated herein by reference, further comprising:
selecting a cluster in the visual cluster hierarchy;
generating new query keywords and submitting them to the search engine;
producing a new search result based on the new query keywords by the search engine;
selecting a predetermined number of search result entries in the new search result to produce a second search result;
clustering the second search result to obtain sub-clustering information;
calculating the correlations between the sub-clustering information and the ranked list of the second search result, and performing visualization processing on the sub-clustering information;
displaying the visual sub-clustering information and the ranked list of the second search result in a joint manner based on the correlations; and
when a visual sub-clustering information item in the visual sub-clustering information is selected, repeating the above steps.
6. The visual method for enhancing search result navigation according to claim 5, all the limitations of which are incorporated herein by reference, wherein the visual cluster hierarchy and the visual sub-clustering information form a tree structure, wherein the clusters contained in the visual cluster hierarchy are taken as root nodes and the visual sub-clustering information items contained in the visual sub-clustering information are taken as branch nodes.
7. The visual method for enhancing search result navigation according to claim 5, all the limitations of which are incorporated herein by reference, wherein the step of generating new query keywords comprises: combining the current query keywords with the name of the selected cluster to generate new query keywords.
8. The visual method for enhancing search result navigation according to claim 5, all the limitations of which are incorporated herein by reference, wherein the step of generating new query keywords comprises:
collecting relevant documents;
determining keywords in the relevant documents; and
combining the keywords with the current query to produce a new query.
9. The visual method for enhancing search result navigation according to claim 8, all the limitations of which are incorporated herein by reference, wherein the relevant documents are the documents that have been read by the web user or the documents that belong to the selected cluster.
10. The visual method for enhancing search result navigation according to claim 8, all the limitations of which are incorporated herein by reference, wherein the step of determining keywords in the relevant documents comprises:
calculating weights of all words except stopwords in each document of the relevant documents with the following formula:

$\mathrm{value}_i = tf \cdot idf,$
where value represents the weight of a word, tf represents the frequency at which the word appears in the relevant documents;
idf=all_documents/keyword_documents, where all_documents represents the number of all the relevant documents, keyword_documents represents the number of the relevant documents that contain this word; and
determining the words with high weights as keywords.
11. A visual apparatus for enhancing search result navigation, comprising:
a dynamic cluster constructor configured to select search results coming from a search engine to get a first search result and dynamically cluster the first search result to generate clustering information;
a correlation processor configured to calculate the correlations between the clustering information and the ranked list of the first search result; and
a visualization engine configured to perform visualization processing on the clustering information to produce visual cluster hierarchy, and display the visual cluster hierarchy and the ranked list of the first search result on a browser in a joint manner based on the correlations.
12. The visual apparatus for enhancing search result navigation according to claim 11, all the limitations of which are incorporated herein by reference, wherein the dynamic cluster constructor comprises:
a search result selecting unit configured to select a predetermined number of search result entries from the received search result to generate a first search result and save the first search result; and
a clustering unit configured to cluster the first search result to generate the clustering information.
13. The visual apparatus for enhancing search result navigation according to claim 12, all the limitations of which are incorporated herein by reference, wherein the visualization engine displaying the visual cluster hierarchy and the ranked list of the first search result on a browser in a joint manner comprises any one of the following cases:
a. when the pages of the ranked list of the first search result are displayed, the cluster in the visual cluster hierarchy that contains the most search result entries of the first search result is highlighted;
b. when the pages of the ranked list of the first search result are displayed, the cluster in the visual cluster hierarchy that contains the most search result entries of the first search result in the first page is highlighted;
c. when a search result entry in the ranked list of the first search result is selected, the cluster in the visual cluster hierarchy that contains the search result entry and the most search result entries of the first search result is highlighted;
d. when a search result entry in the ranked list of the first search result is selected, the cluster in the visual cluster hierarchy that contains the search result entry and the most search result entries of the first search result in the first page is highlighted; and
e. when a cluster in the visual cluster hierarchy is selected, the ranked list of the search result entries in the first search result that are contained in the cluster is displayed.
14. The visual apparatus for enhancing search result navigation according to claim 12, all the limitations of which are incorporated herein by reference, wherein the clustering unit applies the Suffix Tree Clustering algorithm.
15. The visual apparatus for enhancing search result navigation according to claim 12, all the limitations of which are incorporated herein by reference, further comprising:
a keyword generator configured to generate new query keywords when selecting a cluster in the visual cluster hierarchy and send the keywords to the search engine through the dynamic cluster constructor;
wherein, the search engine performs searching based on the new query keywords generated by the keyword generator and returns a ranked list of the new search result;
the search result selecting unit of the dynamic cluster constructor selects a predetermined number of search result entries from the received ranked list of the new search result to generate a second search result and save it;
the clustering unit of the dynamic cluster constructor clusters the second search result to generate sub-clustering information;
the visualization engine visualizes the clustering information and the sub-clustering information into a tree structure, wherein the clustering information items contained in the clustering information are taken as root nodes and the sub-clustering information items contained in the sub-clustering information are taken as branch nodes, and the visualization engine also displays the visual cluster hierarchy and the ranked list of the second search result in a joint manner on the browser.
16. The visual apparatus for enhancing search result navigation according to claim 15, all the limitations of which are incorporated herein by reference, wherein the keyword generator combines the current query keywords with the name of the selected cluster to generate new query keywords.
17. The visual apparatus for enhancing search result navigation according to claim 15, all the limitations of which are incorporated herein by reference, wherein the keyword generator comprises:
a document collector configured to collect relevant documents;
a weight calculator configured to calculate weights of all words except stopwords in each one of the relevant documents; and
a keyword combiner configured to select the words with high weights and combine them with the current query keywords to generate new query keywords.
18. The visual apparatus for enhancing search result navigation according to claim 17, all the limitations of which are incorporated herein by reference, wherein the relevant documents are the documents that have been read by the web user or the documents that belong to the selected cluster.
19. The visual apparatus for enhancing search result navigation according to claim 17, all the limitations of which are incorporated herein by reference, wherein the weight calculator applies the following formula:

$\mathrm{value}_i = tf \cdot idf;$
where value represents the weight of a word, tf represents the frequency at which the word appears in the relevant documents;

idf=all_documents/keyword_documents,
where all_documents represents the number of all the relevant documents, keyword_documents represents the number of the relevant documents that contain this word.
20. A browser, comprising the visual apparatus for enhancing search result navigation comprising:
a dynamic cluster constructor configured to select search results coming from a search engine to get a first search result and dynamically cluster the first search result to generate clustering information;
a correlation processor configured to calculate the correlations between the clustering information and the ranked list of the first search result; and
a visualization engine configured to perform visualization processing on the clustering information to produce visual cluster hierarchy, and display the visual cluster hierarchy and the ranked list of the first search result on a browser in a joint manner based on the correlations.
21. A search engine, comprising the visual apparatus for enhancing search result navigation comprising:
a dynamic cluster constructor configured to select search results coming from a search engine to get a first search result and dynamically cluster the first search result to generate clustering information;
a correlation processor configured to calculate the correlations between the clustering information and the ranked list of the first search result; and
a visualization engine configured to perform visualization processing on the clustering information to produce visual cluster hierarchy, and display the visual cluster hierarchy and the ranked list of the first search result on a browser in a joint manner based on the correlations.
22. A program product, comprising:
program codes and carrying media for carrying said program codes, for implementing a method comprising:
obtaining a first search result from a search engine;
clustering the first search result to get clustering information;
calculating the correlations between the clustering information and a ranked list of the first search result, and performing visualization processing on the clustering information; and
displaying the visual cluster hierarchy and the ranked list of the first search result in a joint manner based on the correlations.
US12/061,720 2006-01-12 2008-04-03 Visual method and apparatus for enhancing search result navigation Abandoned US20080222145A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/061,720 US20080222145A1 (en) 2006-01-12 2008-04-03 Visual method and apparatus for enhancing search result navigation

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CNB2006100012678A CN100481077C (en) 2006-01-12 2006-01-12 Visual method and device for strengthening search result guide
CNCN9200610001267.8 2006-01-12
US11/619,665 US7502786B2 (en) 2006-01-12 2007-01-04 Visual method and apparatus for enhancing search result navigation
US12/061,720 US20080222145A1 (en) 2006-01-12 2008-04-03 Visual method and apparatus for enhancing search result navigation

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/619,665 Continuation US7502786B2 (en) 2006-01-12 2007-01-04 Visual method and apparatus for enhancing search result navigation

Publications (1)

Publication Number Publication Date
US20080222145A1 true US20080222145A1 (en) 2008-09-11

Family

ID=38692584

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/619,665 Expired - Fee Related US7502786B2 (en) 2006-01-12 2007-01-04 Visual method and apparatus for enhancing search result navigation
US12/061,720 Abandoned US20080222145A1 (en) 2006-01-12 2008-04-03 Visual method and apparatus for enhancing search result navigation

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US11/619,665 Expired - Fee Related US7502786B2 (en) 2006-01-12 2007-01-04 Visual method and apparatus for enhancing search result navigation

Country Status (2)

Country Link
US (2) US7502786B2 (en)
CN (1) CN100481077C (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080120292A1 (en) * 2006-11-20 2008-05-22 Neelakantan Sundaresan Search clustering
US20100211588A1 (en) * 2009-02-13 2010-08-19 Microsoft Corporation Context-Aware Query Suggestion By Mining Log Data
US20100277480A1 (en) * 2009-04-30 2010-11-04 International Business Machines Corporation Layout method and system in a display area for disconnected dynamic networks
US20110029933A1 (en) * 2008-01-23 2011-02-03 Shixian Chu Method and apparatus for information visualized expression, and visualized human computer interactive expression interface thereof
US8756231B2 (en) 2010-01-28 2014-06-17 International Business Machines Corporation Search using proximity for clustering information
US20140250139A1 (en) * 2008-07-24 2014-09-04 Marissa H. Dulaney Method and Apparatus Requesting Information Upon Returning To A Search Results List
WO2014194656A1 (en) * 2013-06-05 2014-12-11 Tencent Technology (Shenzhen) Company Limited Method and device for data screening
CN104881447A (en) * 2015-05-14 2015-09-02 百度在线网络技术(北京)有限公司 Searching method and device
US20160378858A1 (en) * 2010-07-14 2016-12-29 Yahoo! Inc. Clustering of search results
US9922139B2 (en) 2013-06-05 2018-03-20 Tencent Technology (Shenzhen) Company Limited Method and device for data screening
US10963476B2 (en) 2015-08-03 2021-03-30 International Business Machines Corporation Searching and visualizing data for a network search based on relationships within the data
WO2021185326A1 (en) * 2020-03-20 2021-09-23 北京三快在线科技有限公司 User interaction method used for searching

Families Citing this family (168)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8131736B1 (en) 2005-03-01 2012-03-06 Google Inc. System and method for navigating documents
US20070233678A1 (en) * 2006-04-04 2007-10-04 Bigelow David H System and method for a visual catalog
US8930331B2 (en) 2007-02-21 2015-01-06 Palantir Technologies Providing unique views of data based on changes or rules
US8073803B2 (en) * 2007-07-16 2011-12-06 Yahoo! Inc. Method for matching electronic advertisements to surrounding context based on their advertisement content
US20090157610A1 (en) * 2007-12-13 2009-06-18 Allen Jr Lloyd W Method, system, and computer program product for applying a graphical hierarchical context in a search query
US20090216563A1 (en) * 2008-02-25 2009-08-27 Michael Sandoval Electronic profile development, storage, use and systems for taking action based thereon
US7996432B2 (en) * 2008-02-25 2011-08-09 International Business Machines Corporation Systems, methods and computer program products for the creation of annotations for media content to enable the selective management and playback of media content
US7996431B2 (en) * 2008-02-25 2011-08-09 International Business Machines Corporation Systems, methods and computer program products for generating metadata and visualizing media content
US20090216743A1 (en) * 2008-02-25 2009-08-27 International Business Machines Corporation Systems, Methods and Computer Program Products for the Use of Annotations for Media Content to Enable the Selective Management and Playback of Media Content
US8027999B2 (en) * 2008-02-25 2011-09-27 International Business Machines Corporation Systems, methods and computer program products for indexing, searching and visualizing media content
US8255396B2 (en) 2008-02-25 2012-08-28 Atigeo Llc Electronic profile development, storage, use, and systems therefor
WO2009151640A1 (en) * 2008-06-13 2009-12-17 Ebay Inc. Method and system for clustering
US8358308B2 (en) * 2008-06-27 2013-01-22 Microsoft Corporation Using visual techniques to manipulate data
US8090715B2 (en) * 2008-07-14 2012-01-03 Disney Enterprises, Inc. Method and system for dynamically generating a search result
US9348499B2 (en) 2008-09-15 2016-05-24 Palantir Technologies, Inc. Sharing objects that rely on local resources with outside servers
US20100125569A1 (en) * 2008-11-18 2010-05-20 Yahoo! Inc. System and method for autohyperlinking and navigation in url based context queries
US8122820B2 (en) * 2008-12-19 2012-02-28 Whirlpool Corporation Food processor with dicing tool
US9104695B1 (en) 2009-07-27 2015-08-11 Palantir Technologies, Inc. Geotagging structured data
US20110093478A1 (en) * 2009-10-19 2011-04-21 Business Objects Software Ltd. Filter hints for result sets
US20110113357A1 (en) * 2009-11-12 2011-05-12 International Business Machines Corporation Manipulating results of a media archive search
CN102971738A (en) 2010-05-06 2013-03-13 水宙责任有限公司 Systems, methods, and computer readable media for security in profile utilizing systems
FR2960324B1 (en) * 2010-05-20 2013-04-12 Sagemcom Broadband Sas METHOD FOR NAVIGATING A RESULT OF A SEARCH USED BY A SEARCH ENGINE
US9355179B2 (en) 2010-09-24 2016-05-31 Microsoft Technology Licensing, Llc Visual-cue refinement of user query results
US9092482B2 (en) 2013-03-14 2015-07-28 Palantir Technologies, Inc. Fair scheduling for mixed-query loads
US9547693B1 (en) 2011-06-23 2017-01-17 Palantir Technologies Inc. Periodic database search manager for multiple data sources
US8799240B2 (en) 2011-06-23 2014-08-05 Palantir Technologies, Inc. System and method for investigating large amounts of data
US8732574B2 (en) 2011-08-25 2014-05-20 Palantir Technologies, Inc. System and method for parameterizing documents for automatic workflow generation
US9459767B2 (en) 2011-08-29 2016-10-04 Ebay Inc. Tablet web visual browsing
US8504542B2 (en) 2011-09-02 2013-08-06 Palantir Technologies, Inc. Multi-row transactions
US9043350B2 (en) * 2011-09-22 2015-05-26 Microsoft Technology Licensing, Llc Providing topic based search guidance
US9009144B1 (en) * 2012-02-23 2015-04-14 Google Inc. Dynamically identifying and removing potential stopwords from a local search query
US9176948B2 (en) 2012-03-27 2015-11-03 Google Inc. Client/server-based statistical phrase distribution display and associated text entry technique
US9477711B2 (en) * 2012-05-16 2016-10-25 Google Inc. Knowledge panel
CN102937983A (en) * 2012-10-19 2013-02-20 北京奇虎科技有限公司 Personalized website navigation system
US9348677B2 (en) 2012-10-22 2016-05-24 Palantir Technologies Inc. System and method for batch evaluation programs
CN103020206A (en) * 2012-12-05 2013-04-03 北京海量融通软件技术有限公司 Knowledge-network-based search result focusing system and focusing method
US9501507B1 (en) 2012-12-27 2016-11-22 Palantir Technologies Inc. Geo-temporal indexing and searching
US9069882B2 (en) * 2013-01-22 2015-06-30 International Business Machines Corporation Mapping and boosting of terms in a format independent data retrieval query
US9123086B1 (en) 2013-01-31 2015-09-01 Palantir Technologies, Inc. Automatically generating event objects from images
US10037314B2 (en) 2013-03-14 2018-07-31 Palantir Technologies, Inc. Mobile reports
US9514191B2 (en) * 2013-03-14 2016-12-06 Microsoft Technology Licensing, Llc Visualizing ranking factors for items in a search result list
US8788405B1 (en) 2013-03-15 2014-07-22 Palantir Technologies, Inc. Generating data clusters with customizable analysis strategies
US8909656B2 (en) 2013-03-15 2014-12-09 Palantir Technologies Inc. Filter chains with associated multipath views for exploring large data sets
US9965937B2 (en) 2013-03-15 2018-05-08 Palantir Technologies Inc. External malware data item clustering and analysis
US8917274B2 (en) 2013-03-15 2014-12-23 Palantir Technologies Inc. Event matrix based on integrated data
US8937619B2 (en) 2013-03-15 2015-01-20 Palantir Technologies Inc. Generating an object time series from data objects
US8868486B2 (en) 2013-03-15 2014-10-21 Palantir Technologies Inc. Time-sensitive cube
US10275778B1 (en) 2013-03-15 2019-04-30 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation based on automatic malfeasance clustering of related data in various data structures
US8799799B1 (en) 2013-05-07 2014-08-05 Palantir Technologies Inc. Interactive geospatial map
US9720972B2 (en) * 2013-06-17 2017-08-01 Microsoft Technology Licensing, Llc Cross-model filtering
US9223773B2 (en) 2013-08-08 2015-12-29 Palantir Technologies Inc. Template system for custom document generation
US9335897B2 (en) 2013-08-08 2016-05-10 Palantir Technologies Inc. Long click display of a context menu
US8713467B1 (en) 2013-08-09 2014-04-29 Palantir Technologies, Inc. Context-sensitive views
CN104346413A (en) * 2013-08-09 2015-02-11 聚游互动(北京)科技发展有限公司 Method and system for presenting visual search results on mobile terminal
EP2840512B1 (en) * 2013-08-21 2015-10-21 Ontoforce NV A data processing system for adaptive visualisation of faceted search results
US9785317B2 (en) 2013-09-24 2017-10-10 Palantir Technologies Inc. Presentation and analysis of user interaction data
US8938686B1 (en) 2013-10-03 2015-01-20 Palantir Technologies Inc. Systems and methods for analyzing performance of an entity
US8812960B1 (en) 2013-10-07 2014-08-19 Palantir Technologies Inc. Cohort-based presentation of user interaction data
US9116975B2 (en) 2013-10-18 2015-08-25 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive simultaneous querying of multiple data stores
US8924872B1 (en) 2013-10-18 2014-12-30 Palantir Technologies Inc. Overview user interface of emergency call data of a law enforcement agency
US11238056B2 (en) 2013-10-28 2022-02-01 Microsoft Technology Licensing, Llc Enhancing search results with social labels
US9021384B1 (en) 2013-11-04 2015-04-28 Palantir Technologies Inc. Interactive vehicle information map
US9542440B2 (en) 2013-11-04 2017-01-10 Microsoft Technology Licensing, Llc Enterprise graph search based on object and actor relationships
US8868537B1 (en) 2013-11-11 2014-10-21 Palantir Technologies, Inc. Simple web search
US9105000B1 (en) 2013-12-10 2015-08-11 Palantir Technologies Inc. Aggregating data from a plurality of data sources
US10019520B1 (en) * 2013-12-13 2018-07-10 Joy Sargis Muske System and process for using artificial intelligence to provide context-relevant search engine results
US9727622B2 (en) 2013-12-16 2017-08-08 Palantir Technologies, Inc. Methods and systems for analyzing entity performance
US9552615B2 (en) 2013-12-20 2017-01-24 Palantir Technologies Inc. Automated database analysis to detect malfeasance
US10356032B2 (en) 2013-12-26 2019-07-16 Palantir Technologies Inc. System and method for detecting confidential information emails
US9043696B1 (en) 2014-01-03 2015-05-26 Palantir Technologies Inc. Systems and methods for visual definition of data associations
US8832832B1 (en) 2014-01-03 2014-09-09 Palantir Technologies Inc. IP reputation
US11645289B2 (en) 2014-02-04 2023-05-09 Microsoft Technology Licensing, Llc Ranking enterprise graph queries
US9483162B2 (en) 2014-02-20 2016-11-01 Palantir Technologies Inc. Relationship visualizations
US9009827B1 (en) 2014-02-20 2015-04-14 Palantir Technologies Inc. Security sharing system
US9870432B2 (en) 2014-02-24 2018-01-16 Microsoft Technology Licensing, Llc Persisted enterprise graph queries
US11657060B2 (en) 2014-02-27 2023-05-23 Microsoft Technology Licensing, Llc Utilizing interactivity signals to generate relationships and promote content
US10757201B2 (en) 2014-03-01 2020-08-25 Microsoft Technology Licensing, Llc Document and content feed
US10255563B2 (en) 2014-03-03 2019-04-09 Microsoft Technology Licensing, Llc Aggregating enterprise graph content around user-generated topics
US10169457B2 (en) 2014-03-03 2019-01-01 Microsoft Technology Licensing, Llc Displaying and posting aggregated social activity on a piece of enterprise content
US10394827B2 (en) 2014-03-03 2019-08-27 Microsoft Technology Licensing, Llc Discovering enterprise content based on implicit and explicit signals
US9727376B1 (en) 2014-03-04 2017-08-08 Palantir Technologies, Inc. Mobile tasks
US8935201B1 (en) 2014-03-18 2015-01-13 Palantir Technologies Inc. Determining and extracting changed data from a data source
US9857958B2 (en) 2014-04-28 2018-01-02 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive access of, investigation of, and analysis of data objects stored in one or more databases
US9009171B1 (en) 2014-05-02 2015-04-14 Palantir Technologies Inc. Systems and methods for active column filtering
CN104063430A (en) * 2014-06-10 2014-09-24 百度在线网络技术(北京)有限公司 Method and device for displaying search result
US9129219B1 (en) 2014-06-30 2015-09-08 Palantir Technologies, Inc. Crime risk forecasting
US9619557B2 (en) 2014-06-30 2017-04-11 Palantir Technologies, Inc. Systems and methods for key phrase characterization of documents
US9535974B1 (en) 2014-06-30 2017-01-03 Palantir Technologies Inc. Systems and methods for identifying key phrase clusters within documents
US10572496B1 (en) 2014-07-03 2020-02-25 Palantir Technologies Inc. Distributed workflow system and database with access controls for city resiliency
US9021260B1 (en) 2014-07-03 2015-04-28 Palantir Technologies Inc. Malware data item analysis
US9256664B2 (en) 2014-07-03 2016-02-09 Palantir Technologies Inc. System and method for news events detection and visualization
US9202249B1 (en) 2014-07-03 2015-12-01 Palantir Technologies Inc. Data item clustering and analysis
US9785773B2 (en) 2014-07-03 2017-10-10 Palantir Technologies Inc. Malware data item analysis
US9454281B2 (en) 2014-09-03 2016-09-27 Palantir Technologies Inc. System for providing dynamic linked panels in user interface
US10061826B2 (en) 2014-09-05 2018-08-28 Microsoft Technology Licensing, Llc. Distant content discovery
US9501851B2 (en) 2014-10-03 2016-11-22 Palantir Technologies Inc. Time-series analysis system
US9767172B2 (en) 2014-10-03 2017-09-19 Palantir Technologies Inc. Data aggregation and analysis system
US9785328B2 (en) 2014-10-06 2017-10-10 Palantir Technologies Inc. Presentation of multivariate data on a graphical user interface of a computing system
US9984133B2 (en) 2014-10-16 2018-05-29 Palantir Technologies Inc. Schematic and database linking system
US9229952B1 (en) 2014-11-05 2016-01-05 Palantir Technologies, Inc. History preserving data pipeline system and method
US9043894B1 (en) 2014-11-06 2015-05-26 Palantir Technologies Inc. Malicious software detection in a computing system
US10552994B2 (en) 2014-12-22 2020-02-04 Palantir Technologies Inc. Systems and interactive user interfaces for dynamic retrieval, analysis, and triage of data items
US10362133B1 (en) 2014-12-22 2019-07-23 Palantir Technologies Inc. Communication data processing architecture
US9348920B1 (en) 2014-12-22 2016-05-24 Palantir Technologies Inc. Concept indexing among database of documents using machine learning techniques
US9367872B1 (en) 2014-12-22 2016-06-14 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation of bad actor behavior based on automatic clustering of related data in various data structures
US9817563B1 (en) 2014-12-29 2017-11-14 Palantir Technologies Inc. System and method of generating data points from one or more data stores of data items for chart creation and manipulation
US9335911B1 (en) 2014-12-29 2016-05-10 Palantir Technologies Inc. Interactive user interface for dynamic data analysis exploration and query processing
US9870205B1 (en) 2014-12-29 2018-01-16 Palantir Technologies Inc. Storing logical units of program code generated using a dynamic programming notebook user interface
US10372879B2 (en) 2014-12-31 2019-08-06 Palantir Technologies Inc. Medical claims lead summary report generation
CN104598549B (en) * 2014-12-31 2019-03-05 北京畅游天下网络技术有限公司 Data analysing method and system
US10387834B2 (en) 2015-01-21 2019-08-20 Palantir Technologies Inc. Systems and methods for accessing and storing snapshots of a remote application in a document
US9727560B2 (en) 2015-02-25 2017-08-08 Palantir Technologies Inc. Systems and methods for organizing and identifying documents via hierarchies and dimensions of tags
EP3070622A1 (en) 2015-03-16 2016-09-21 Palantir Technologies, Inc. Interactive user interfaces for location-based data analysis
US9886467B2 (en) 2015-03-19 2018-02-06 Palantir Technologies Inc. System and method for comparing and visualizing data entities and data entity series
CN108647276B (en) * 2015-05-11 2022-04-05 何杨洲 Searching method
US10360621B2 (en) * 2015-05-20 2019-07-23 Ebay Inc. Near-identical multi-faceted entity identification in search
US9460175B1 (en) 2015-06-03 2016-10-04 Palantir Technologies Inc. Server implemented geographic information system with graphical interface
US10671677B2 (en) 2015-06-12 2020-06-02 Smugmug, Inc. Advanced keyword search application
US9454785B1 (en) 2015-07-30 2016-09-27 Palantir Technologies Inc. Systems and user interfaces for holistic, data-driven investigation of bad actor behavior based on clustering and scoring of related data
US9996595B2 (en) 2015-08-03 2018-06-12 Palantir Technologies, Inc. Providing full data provenance visualization for versioned datasets
US9456000B1 (en) 2015-08-06 2016-09-27 Palantir Technologies Inc. Systems, methods, user interfaces, and computer-readable media for investigating potential malicious communications
US9600146B2 (en) 2015-08-17 2017-03-21 Palantir Technologies Inc. Interactive geospatial map
US10489391B1 (en) 2015-08-17 2019-11-26 Palantir Technologies Inc. Systems and methods for grouping and enriching data items accessed from one or more databases for presentation in a user interface
US10102369B2 (en) 2015-08-19 2018-10-16 Palantir Technologies Inc. Checkout system executable code monitoring, and user account compromise determination system
US10853378B1 (en) 2015-08-25 2020-12-01 Palantir Technologies Inc. Electronic note management via a connected entity graph
US11150917B2 (en) 2015-08-26 2021-10-19 Palantir Technologies Inc. System for data aggregation and analysis of data from a plurality of data sources
US9485265B1 (en) 2015-08-28 2016-11-01 Palantir Technologies Inc. Malicious activity detection system capable of efficiently processing data accessed from databases and generating alerts for display in interactive user interfaces
US10706434B1 (en) 2015-09-01 2020-07-07 Palantir Technologies Inc. Methods and systems for determining location information
US9639580B1 (en) 2015-09-04 2017-05-02 Palantir Technologies, Inc. Computer-implemented systems and methods for data management and visualization
US9576015B1 (en) 2015-09-09 2017-02-21 Palantir Technologies, Inc. Domain-specific language for dataset transformations
US10296617B1 (en) 2015-10-05 2019-05-21 Palantir Technologies Inc. Searches of highly structured data
CN106815274B (en) * 2015-12-02 2022-02-18 中兴通讯股份有限公司 Hadoop-based log data mining method and system
US9542446B1 (en) 2015-12-17 2017-01-10 Palantir Technologies, Inc. Automatic generation of composite datasets based on hierarchical fields
US10109094B2 (en) 2015-12-21 2018-10-23 Palantir Technologies Inc. Interface to index and display geospatial data
US9823818B1 (en) 2015-12-29 2017-11-21 Palantir Technologies Inc. Systems and interactive user interfaces for automatic generation of temporal representation of data objects
US10089289B2 (en) 2015-12-29 2018-10-02 Palantir Technologies Inc. Real-time document annotation
US9612723B1 (en) 2015-12-30 2017-04-04 Palantir Technologies Inc. Composite graphical interface with shareable data-objects
CN105786969B (en) * 2016-02-01 2020-07-03 百度在线网络技术(北京)有限公司 Information display method and device
US10698938B2 (en) 2016-03-18 2020-06-30 Palantir Technologies Inc. Systems and methods for organizing and identifying documents via hierarchies and dimensions of tags
US10068199B1 (en) 2016-05-13 2018-09-04 Palantir Technologies Inc. System to catalogue tracking data
US10324609B2 (en) 2016-07-21 2019-06-18 Palantir Technologies Inc. System for providing dynamic linked panels in user interface
US10719188B2 (en) 2016-07-21 2020-07-21 Palantir Technologies Inc. Cached database and synchronization system for providing dynamic linked panels in user interface
US9686357B1 (en) 2016-08-02 2017-06-20 Palantir Technologies Inc. Mapping content delivery
US10437840B1 (en) 2016-08-19 2019-10-08 Palantir Technologies Inc. Focused probabilistic entity resolution from multiple data sources
US10318630B1 (en) 2016-11-21 2019-06-11 Palantir Technologies Inc. Analysis of large bodies of textual data
US10515433B1 (en) 2016-12-13 2019-12-24 Palantir Technologies Inc. Zoom-adaptive data granularity to achieve a flexible high-performance interface for a geospatial mapping system
US10270727B2 (en) 2016-12-20 2019-04-23 Palantir Technologies, Inc. Short message communication within a mobile graphical map
US10460602B1 (en) 2016-12-28 2019-10-29 Palantir Technologies Inc. Interactive vehicle information mapping system
US10579239B1 (en) 2017-03-23 2020-03-03 Palantir Technologies Inc. Systems and methods for production and display of dynamically linked slide presentations
US10895946B2 (en) 2017-05-30 2021-01-19 Palantir Technologies Inc. Systems and methods for using tiled data
US11334216B2 (en) 2017-05-30 2022-05-17 Palantir Technologies Inc. Systems and methods for visually presenting geospatial information
US10956406B2 (en) 2017-06-12 2021-03-23 Palantir Technologies Inc. Propagated deletion of database records and derived data
US10403011B1 (en) 2017-07-18 2019-09-03 Palantir Technologies Inc. Passing system with an interactive user interface
US10371537B1 (en) 2017-11-29 2019-08-06 Palantir Technologies Inc. Systems and methods for flexible route planning
US11599706B1 (en) 2017-12-06 2023-03-07 Palantir Technologies Inc. Systems and methods for providing a view of geospatial information
US10698756B1 (en) 2017-12-15 2020-06-30 Palantir Technologies Inc. Linking related events for various devices and services in computer log files on a centralized server
US11599369B1 (en) 2018-03-08 2023-03-07 Palantir Technologies Inc. Graphical user interface configuration system
US10896234B2 (en) 2018-03-29 2021-01-19 Palantir Technologies Inc. Interactive geographical map
US10830599B2 (en) 2018-04-03 2020-11-10 Palantir Technologies Inc. Systems and methods for alternative projections of geographical information
US11585672B1 (en) 2018-04-11 2023-02-21 Palantir Technologies Inc. Three-dimensional representations of routes
US10754822B1 (en) 2018-04-18 2020-08-25 Palantir Technologies Inc. Systems and methods for ontology migration
US10885021B1 (en) 2018-05-02 2021-01-05 Palantir Technologies Inc. Interactive interpreter and graphical user interface
US10429197B1 (en) 2018-05-29 2019-10-01 Palantir Technologies Inc. Terrain analysis for automatic route determination
US11119630B1 (en) 2018-06-19 2021-09-14 Palantir Technologies Inc. Artificial intelligence assisted evaluations and user interface for same
US10467435B1 (en) 2018-10-24 2019-11-05 Palantir Technologies Inc. Approaches for managing restrictions for middleware applications
US11025672B2 (en) 2018-10-25 2021-06-01 Palantir Technologies Inc. Approaches for securing middleware data access
CN109816127B (en) * 2019-01-11 2022-12-30 广州市骑鹅游信息技术咨询服务有限公司 Intelligent ticket recommendation method and system
CN116502241B (en) * 2023-06-29 2023-10-10 中汽智联技术有限公司 Method and system for enhancing vulnerability scanning tool based on PoC load library

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010037347A1 (en) * 2000-03-10 2001-11-01 Kelliher Margaret Therese Method for automated web site maintenance via searching
US6415282B1 (en) * 1998-04-22 2002-07-02 Nec Usa, Inc. Method and apparatus for query refinement
US20020138487A1 (en) * 2001-03-22 2002-09-26 Dov Weiss Method and system for mapping and searching the internet and displaying the results in a visual form
US20020169764A1 (en) * 2001-05-09 2002-11-14 Robert Kincaid Domain specific knowledge-based metasearch system and methods of using
US20040093321A1 (en) * 2002-11-13 2004-05-13 Xerox Corporation Search engine with structured contextual clustering
US20050144158A1 (en) * 2003-11-18 2005-06-30 Capper Liesl J. Computer network search engine
US20060026152A1 (en) * 2004-07-13 2006-02-02 Microsoft Corporation Query-based snippet clustering for search result grouping
US20070208697A1 (en) * 2001-06-18 2007-09-06 Pavitra Subramaniam System and method to enable searching across multiple databases and files using a single search

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002041190A2 (en) * 2000-11-15 2002-05-23 Holbrook David M Apparatus and method for organizing and/or presenting data
US7334195B2 (en) * 2003-10-14 2008-02-19 Microsoft Corporation System and process for presenting search results in a histogram/cluster format

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6415282B1 (en) * 1998-04-22 2002-07-02 Nec Usa, Inc. Method and apparatus for query refinement
US20010037347A1 (en) * 2000-03-10 2001-11-01 Kelliher Margaret Therese Method for automated web site maintenance via searching
US7340464B2 (en) * 2000-03-10 2008-03-04 General Electric Company Method for automated web site maintenance via searching
US20020138487A1 (en) * 2001-03-22 2002-09-26 Dov Weiss Method and system for mapping and searching the internet and displaying the results in a visual form
US7085753B2 (en) * 2001-03-22 2006-08-01 E-Nvent Usa Inc. Method and system for mapping and searching the Internet and displaying the results in a visual form
US20020169764A1 (en) * 2001-05-09 2002-11-14 Robert Kincaid Domain specific knowledge-based metasearch system and methods of using
US20070208697A1 (en) * 2001-06-18 2007-09-06 Pavitra Subramaniam System and method to enable searching across multiple databases and files using a single search
US20040093321A1 (en) * 2002-11-13 2004-05-13 Xerox Corporation Search engine with structured contextual clustering
US20050144158A1 (en) * 2003-11-18 2005-06-30 Capper Liesl J. Computer network search engine
US20060026152A1 (en) * 2004-07-13 2006-02-02 Microsoft Corporation Query-based snippet clustering for search result grouping

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080120292A1 (en) * 2006-11-20 2008-05-22 Neelakantan Sundaresan Search clustering
US8131722B2 (en) * 2006-11-20 2012-03-06 Ebay Inc. Search clustering
US8589398B2 (en) 2006-11-20 2013-11-19 Ebay Inc. Search clustering
US8732621B2 (en) 2008-01-23 2014-05-20 Senovation, L.L.C. Method and apparatus for information visualized expression, and visualized human computer interactive expression interface thereof
US20110029933A1 (en) * 2008-01-23 2011-02-03 Shixian Chu Method and apparatus for information visualized expression, and visualized human computer interactive expression interface thereof
US9449092B2 (en) * 2008-07-24 2016-09-20 Adobe Systems Incorporated Method and apparatus requesting information upon returning to a search results list
US20140250139A1 (en) * 2008-07-24 2014-09-04 Marissa H. Dulaney Method and Apparatus Requesting Information Upon Returning To A Search Results List
US9330165B2 (en) 2009-02-13 2016-05-03 Microsoft Technology Licensing, Llc Context-aware query suggestion by mining log data
US20100211588A1 (en) * 2009-02-13 2010-08-19 Microsoft Corporation Context-Aware Query Suggestion By Mining Log Data
US8674991B2 (en) * 2009-04-30 2014-03-18 International Business Machines Corporation Layout method and system in a display area for disconnected dynamic networks
US20100277480A1 (en) * 2009-04-30 2010-11-04 International Business Machines Corporation Layout method and system in a display area for disconnected dynamic networks
US8756231B2 (en) 2010-01-28 2014-06-17 International Business Machines Corporation Search using proximity for clustering information
US20160378858A1 (en) * 2010-07-14 2016-12-29 Yahoo! Inc. Clustering of search results
WO2014194656A1 (en) * 2013-06-05 2014-12-11 Tencent Technology (Shenzhen) Company Limited Method and device for data screening
US9922139B2 (en) 2013-06-05 2018-03-20 Tencent Technology (Shenzhen) Company Limited Method and device for data screening
CN104881447A (en) * 2015-05-14 2015-09-02 百度在线网络技术(北京)有限公司 Searching method and device
US10963476B2 (en) 2015-08-03 2021-03-30 International Business Machines Corporation Searching and visualizing data for a network search based on relationships within the data
WO2021185326A1 (en) * 2020-03-20 2021-09-23 北京三快在线科技有限公司 User interaction method used for searching

Also Published As

Publication number Publication date
US20070162443A1 (en) 2007-07-12
US7502786B2 (en) 2009-03-10
CN101000607A (en) 2007-07-18
CN100481077C (en) 2009-04-22

Similar Documents

Publication Publication Date Title
US7502786B2 (en) Visual method and apparatus for enhancing search result navigation
US9697249B1 (en) Estimating confidence for query revision models
US9192684B1 (en) Customization of search results for search queries received from third party sites
US7565345B2 (en) Integration of multiple query revision models
US7912847B2 (en) Comparative web search system and method
US8812559B2 (en) Methods and systems for creating an advertising database
US7647314B2 (en) System and method for indexing web content using click-through features
US7707208B2 (en) Identifying sight for a location
US7657504B2 (en) User interface for displaying images of sights
US8504567B2 (en) Automatically constructing titles
US20070174269A1 (en) Generating clusters of images for search results
US20080215565A1 (en) Searching heterogeneous interrelated entities
US20110191327A1 (en) Method for Human Ranking of Search Results
US20060184481A1 (en) Method and system for mining information based on relationships
US20060230005A1 (en) Empirical validation of suggested alternative queries
WO2006014835B1 (en) Search systems and methods using in-line contextual queries
Pandit et al. Navigation-aided retrieval
JP5146108B2 (en) Document importance calculation system, document importance calculation method, and program
Varadarajan et al. Beyond single-page web search results
RU2473119C1 (en) Method and system for semantic search of electronic documents
Lin et al. Knowledge-assisted retrieval of online product information in architectural/engineering/construction
AU2011247862A1 (en) Integration of multiple query revision models
Rokade et al. Summarization of Text Document Using Query Dependent Parsing Techniques
Upate et al. Review on Efficient Approach for Web Search Engine Using Page Level Keyword

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION