WO2000067161A2 - Method and apparatus for categorizing and retrieving network pages and sites - Google Patents

Publication number
WO2000067161A2
WO2000067161A2 PCT/US2000/012376 US0012376W WO0067161A2 WO 2000067161 A2 WO2000067161 A2 WO 2000067161A2 US 0012376 W US0012376 W US 0012376W WO 0067161 A2 WO0067161 A2 WO 0067161A2
Authority
WO
WIPO (PCT)
Prior art keywords
categories
page
pages
search
subject matter
Prior art date
Application number
PCT/US2000/012376
Other languages
French (fr)
Other versions
WO2000067161A3 (en)
Inventor
Lee H. Grant
Susan A. Capizzi
Original Assignee
Grant Lee H
Capizzi Susan A
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Grant Lee H, Capizzi Susan A filed Critical Grant Lee H
Priority to AU49891/00A priority Critical patent/AU4989100A/en
Publication of WO2000067161A2 publication Critical patent/WO2000067161A2/en
Publication of WO2000067161A3 publication Critical patent/WO2000067161A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/01 Automatic library building

Abstract

The invention includes a method for categorizing pages on a network, including the steps of determining whether a page is involved in transacting business or providing information, has information relating to a subject matter category, and has particular types of files associated with it. The invention also includes a method for searching for information on a network. The steps include providing an opportunity to limit a search to categories including commerce and information, subject matter, and file type; and providing an opportunity to limit the search by keyword. Also included are the steps of identifying pages within the selected categories and which contain the keyword, and reporting the results of the search. The invention also includes apparatus for searching for information on a network.

Description

Method and Apparatus for Categorizing and Retrieving Network Pages and Sites
RELATED APPLICATION
This application claims the benefit of the filing date of earlier-filed, co-pending provisional application serial no. 60/132,694 filed on May 4, 1999, pursuant to 35 U.S.C. § 119(e).
BACKGROUND OF THE INVENTION
1. Field Of The Invention
The present invention relates generally to methods and apparatus for categorizing and searching for information on a network and, more specifically, to categorizing and searching Web pages on the Internet.
2. Description Of The Related Art
The Internet contains over one billion Web pages. It has been estimated that two million Web pages are added to the Internet each day (The Industry Standard, February 28, 2000). This vast amount of information is a tremendous resource for the public to use. However, there is no effective way for a user to obtain relevant information. Although 85 percent of users use search engines to find information on the Internet, "a mind-boggling 92 percent of searches fail to find relevant information or to arrange the results in a meaningful order." (The Industry Standard, April 17, 2000, referring to a Forrester Research review of Web sites.)
There are two fundamental problems. First, there is no standardized international categorization system or catalog of the information contained on the Internet. A group of librarians and others have been working on a cataloging system for the Internet for the last few years. This work is referred to as the Dublin Core Metadata Element Set. This system suffers from a number of problems, including requiring a high degree of cataloging knowledge and being time-consuming and very expensive. In addition, because of the size of the Internet, it is a system that is unworkable.
Second, because there is no standardized categorization system or catalog, the existing search methods, which primarily include directories and search engines, are often cumbersome, ineffective, and inefficient.
Directories or indices are human-compiled databases of Web sites or pages. Most directories use editors to review and categorize Web sites. Some use contributions by their visitors. A user searches a directory by reviewing lists of categories and subcategories, or also typing in keywords. The result is a list of documents that the user can access by links. Directories are helpful to familiarize a user with the scope of a subject, but are not very useful in finding specific information. Also, directories can be slow, and the results may be haphazard. Another major problem is that directories review and categorize only a small percentage of pages and sites. Examples of directories commonly used are Yahoo! and LookSmart.
Search engines are huge databases that automatically index large portions of the Internet and continually update that index. Search engines typically include a Web crawler or spider (also called a worm, robot, or bot) that automatically crawls through the Internet on hyperlinks indexing Web pages, a database which is the index compiled by the crawler, and a search tool which the user can use to search the database. The databases of the existing search engines differ in how they are created. Some Web crawlers index each word in a document, some index only keywords, including META tags, and some index other parts of a Web page, such as title, headings, etc. Most search engines require a search to be conducted by typing in keywords. The way in which the search query is formulated may be by Boolean logic, where keywords are used with various terms, or by natural language, where keywords are used in the form of a question. Although natural language searches may be easier for a user to formulate, both types of formulations rely on keywords.
Most search engines use mathematical algorithms to weigh or rank the results, with the most relevant items listed first. These rankings may be based on the number of times a keyword is used on a page or the location of the keyword on the page. Some search engines also allow the user to organize or group the results by category, date, or other variable, such as the folders used by Northern Light, U.S. Patent No. 5,924,090 to Krellenstein. Another search engine, known as the Clever Project, by IBM, analyzes hyperlinks between pages, in addition to text and citations, in order to develop algorithms that are intended to increase the relevancy of search results. This method is a marginal improvement over other search engines, but has its own set of problems. "A shortcoming of Clever has been that for a narrow topic, such as Frank Lloyd Wright's house Fallingwater, the system sometimes broadens its search and retrieves information on a general subject, such as American architecture." ("Hypersearching the Web," Scientific American, June 1999.)
Search engines do not index the entire Internet. Most have indexed about one-third of the available or publicly indexable Web pages (i.e., excluding Web pages with authorization requirements). Examples of search engines are: Inktomi (the largest, with about 500 million Web pages indexed as of April 11, 2000); FAST (with about 340 million Web pages indexed); AltaVista, Northern Light, and Excite. A greater portion of the Internet can be searched using a meta-search. This technology allows the user to search several search engines at the same time and presents all the results in a single list, but exacerbates the problems inherent in existing search engines.
Because they contain such huge databases, existing search engines often produce search results too voluminous for the user to review. Also, the search results typically contain a vast amount of irrelevant or unrelated items. As stated above, it has been found that 92 percent of searches did not yield relevant information or did not organize the results in a usable fashion (The Industry Standard, April 17, 2000). Another problem is that search engines are more likely to index pages with more links, pages with commercial information, and pages in the United States, rather than lesser known, educational, or non-United States pages.
Another major problem of existing search engines is that they may allow minors access to pornography on the Internet. Current filtering software is an ineffective and often clumsy tool that fails to limit access to many pornographic sites, but blocks other sites that are educational or medical in nature. In addition, the controversy surrounding this issue has created enormous difficulties for public libraries with respect to allowing minors access to the Internet.
Still further objects of the inventive method and apparatus disclosed herein will be apparent from the drawings and following detailed description thereof.
SUMMARY OF THE INVENTION
The method and apparatus for categorizing and retrieving network pages and sites of the present invention are adapted to overcome the above-noted shortcomings and to fulfill the stated needs.
The first embodiment of the invention is a method and apparatus for categorizing a network page. The method comprises the steps of providing a list of categories and assigning a page to one or more of a plurality of the categories. The apparatus includes means for providing a list of categories and means for assigning a page to one or more of a plurality of categories.
The second embodiment of the invention is a method and apparatus for categorizing pages on a network. The method comprises the steps of determining whether a page is involved in transacting business or providing information, determining whether a page has information relating to one or more of a plurality of subject matter categories, and determining the type of files associated with a page. The apparatus includes means for determining whether a page is involved in transacting business or providing information, means for determining whether a page has information relating to one or more of a plurality of subject matter categories, and means for determining the type of files associated with a page.
The third embodiment of the invention is a method and apparatus for searching for and locating information on a network. The method comprises the steps of providing the opportunity to limit the search to categories for pages involved in transacting business, pages involved in providing information, and pages involved in both transacting business and providing information; providing an opportunity to limit the search to one or more of a plurality of subject matter categories; providing an opportunity to limit the search to one or more of a plurality of file-type categories; and providing an opportunity to limit the search by keyword. The apparatus comprises means for providing an opportunity to limit the search to one or more of a plurality of categories, means for providing an opportunity to limit the search by keyword, means for identifying pages categorized into all the categories to which the search was limited, means for determining which of the identified pages contain the keyword to which the search was limited, and means for reporting to a user all said identified pages and keyword-containing pages.
It is an object of the invention to provide a method and apparatus for categorizing a page on a network, during or after the time that the page is created, according to whether the page is involved in transacting business or providing information.
It is an object of the invention to provide a method and apparatus for categorizing a page on a network, during or after the time that the page is created, according to the subject matter contained on the page.
It is a further object of the invention to provide a method and apparatus for categorizing a page on a network, during or after the time that the page is created, according to the type of files associated with a page.
It is also an object of the invention to provide a method and apparatus for searching a network containing pages that have been categorized according to whether the page is involved in transacting business or providing information, the subject matter of the page, and the type of files associated with a page.
It is an object of the invention to provide a method and apparatus for searching a network, such as the Internet, to allow the user access to a larger percentage of information contained on the network.
It is a further object of the invention to provide a method and apparatus for searching a network, such as the Internet, to obtain more relevant results more quickly than existing methods for searching allow.
It is a further object of the invention to provide a method and apparatus to easily obtain audio or visual material located on a network.
It is another object of the invention to provide a method and apparatus for searching a network that is easy to use.
It is also an object of the invention to provide a method and apparatus that does not require the user to understand or use a particular language, including English.
It is a further object of the invention to provide a method and apparatus for limiting the results of a search, such as a search on the Internet, to exclude pornographic materials.
It is also an object of the invention to provide a method and apparatus with the advantages of pornography-filtering software, but without the disadvantages of such software.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a representation of the preferred graphical user interface showing the three tiers and the categories within those tiers.
Figure 2 is a chart of the Government, Medical, News, and History categories of the second tier showing examples of topics contained within those categories.
Figure 3 is a chart of the Education & Social Sciences, Science & Technology, Sports & Recreation, and Arts & Humanities categories of the second tier showing examples of topics contained within those categories.
Figure 4 is a chart of the Finance & Business, Reference, Explicit, and Other categories of the second tier showing examples of topics contained within those categories.
Figure 5 is a Venn diagram showing the intersection of the domains corresponding to the categories of Commerce and Information.
Figure 6 is a Venn diagram showing the intersection of the domains corresponding to the categories of Information and Medical.
Figure 7 is a Venn diagram showing the intersection of the domains corresponding to the categories of Information, Medical, and History.
Figure 8 is a Venn diagram showing the intersection of the domains corresponding to the categories of Information, Medical, History, and Visual.
Figure 9 is a diagram showing an example of the relationship between the subcategory created by selecting a combination of the categories and the keyword search.
DESCRIPTION OF THE PREFERRED EMBODIMENT
The invention includes methods and apparatus for categorizing a page as it is being created or as it exists on a network, and for searching a network. Networks include the Internet and private corporate networks, such as intranets and local area networks. Pages on the Internet are identifiable by unique addresses and include both Web sites and Web pages.
As shown on Figure 1, the invention utilizes a graphical user interface (GUI) 10, including a hierarchy of three tiers, 12, 14, and 16, to categorize, and to search for information located on, Web pages. First tier 12 is a division into one or both of two major categories: pages that are involved in transacting business and pages that are involved in providing information. In the preferred embodiment of the invention, the first category 18 is designated "Commerce" and the second category 20 is designated "Information." Web pages involved in transacting business include e-commerce pages, which provide users with the ability to conduct online purchases, sales, leases, or other financial transactions, pages that may be involved in transacting business, but do not enable the user to conduct the transaction on-line, and other pages that contain commercial information. Web pages involved in providing information include pages that contain articles, journals, publications, or other non-commercial materials. Some Web pages may be involved in both transacting business and providing information and thereby fall within both the categories of "Commerce" and "Information."
Second tier 14 is a division into one or more categories based on the subject matter the Web page contains. Many different categories can be used and many different terms may be used to identify a given category. The preferred embodiment of the invention includes twelve categories encompassing like subjects that have been carefully selected to allow users to locate and access information in an efficient manner: Government 22, Medical 24, Education & Social Science 26, News 28, Sports & Recreation 30, History 32, Science & Technology 34, Arts & Humanities 36, Finance & Business 38, Reference 40, Explicit 42, and Other 44. Each of these categories includes many topics. Figures 2, 3, and 4 list examples of the topics included in each category. For example, category 22, Government, includes the following topics: federal/state/local government, law, military, nations, politics, and taxes. Category 42, Explicit, includes pornography and sexually-explicit material. Category 44, Other, is for subjects that do not fit into any of the other categories of second tier 14.
Third tier 16 is a division into one or more categories according to the type of files associated with a Web page. There are several different types of files, including text, graphics, audio, video, multimedia, and files for communications between persons. Most search engines can recognize the type of files associated with a Web page by scanning the files and identifying the file extensions (for example, .gif, .au, .wav). The preferred embodiment of the invention includes the following five file-type categories: Visual 46, Audio 48, Multimedia 50, Text-only 52, and Communication 54. Category 46, Visual, includes files containing pictures, charts, graphs, and diagrams. Category 48, Audio, includes files containing sound, such as music, voice, and sound effects. Category 50, Multimedia, includes files containing video, film clips, and virtual reality. Category 52, Text-only, includes files that do not contain any visual, audio, or multimedia material. Category 54, Communication, includes files containing e-mail, telnet links, ICQ, and other messaging systems.
The first embodiment of the invention is a method and apparatus for categorizing a page on a network, as the page is being created or during editing at a later time. The method includes the steps of providing the creator with a list of categories and allowing the creator to assign the page to one or more of the categories. The preferred categories are the categories of the three tiers 12, 14, and 16, as shown in Figure 1. The list of categories includes a different indicium to indicate each category. The indicium is preferably a universal symbol or icon that is not associated with any one language. The indicia preferably used are shown in Figure 1.
The creator of a Web page may assign the Web page to any number or combination of the categories of three tiers 12, 14, and 16, depending on which categories best characterize the Web page. The steps of assigning a page to categories may be performed in several different ways known to those skilled in the art. The creator may also decide not to assign the page to any of the categories of a particular tier. The outcome of the categorization method is that a page is designated to be "in" or "within" the categories that best characterize the page.
First tier 12 includes two categories: Commerce 18 and Information 20, as shown in Figure 1. The creator may assign the page to either one of the two categories of Commerce 18 or Information 20. If the page is involved in both transacting business and providing information, the creator may assign it to both Commerce 18 and Information 20.
Second tier 14 includes twelve subject matter categories: Government 22, Medical 24, Education & Social Science 26, News 28, Sports & Recreation 30, History 32, Science & Technology 34, Arts & Humanities 36, Finance & Business 38, Reference 40, Explicit 42, and Other 44, as shown on Figure 1. The creator may assign the page to one or more of these twelve categories.
Third tier 16 includes five file-type categories: Visual 46, Audio 48, Multimedia 50, Text-only 52, and Communication 54, as shown in Figure 1. The creator may assign the page to one or more of the five file-type categories.
After the creator decides to which categories to assign the page, the creator may mark or tag the page as belonging in or within the assigned categories by associating, with the page, the corresponding indicium for each assigned category.
In addition, or alternatively, the creator may communicate the categories to which the page is assigned to one or more search engines for the purpose of allowing such search engines to locate the page, by its assigned categories, in conducting a search.
The creator may change the categories during editing at a later point in time as frequently as desired.
A risk with any system whereby the creators of pages are permitted to categorize their own pages is that the creator will assign more categories to the page than are justified in order to increase the number of visitors to the page. The invention addresses this problem by including a method for verifying the accuracy of categorization of a network page. The method includes the step of scanning Web pages categorized into one or more categories, which step can be performed by a Web crawler. Pages assigned to a larger number of categories are scanned more frequently. The crawler will determine whether the page was categorized automatically, for example, by a Web crawler. If the Web page was not categorized automatically, the Web crawler further determines whether the page was properly assigned to each such category.
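The verification method above can be illustrated with a minimal sketch. It assumes each page is a dictionary holding the categories claimed by its creator and a flag recording whether it was categorized automatically; the names `scan_order`, `unjustified_categories`, and the `detect` callable (standing in for the crawler's own categorization) are hypothetical and are not part of the specification. Ordering by category count is a simplification of "scanned more frequently."

```python
def scan_order(pages):
    """Put pages with more claimed categories first, a simple stand-in
    for scanning heavily categorized pages more frequently."""
    return sorted(pages, key=lambda p: len(p["claimed"]), reverse=True)


def unjustified_categories(page, detect):
    """Return creator-claimed categories not supported by the crawler.

    `detect` is a hypothetical callable returning the set of categories
    the crawler itself would assign to the page.
    """
    if page["auto_categorized"]:
        # Automatically categorized pages are not re-verified.
        return set()
    return set(page["claimed"]) - set(detect(page))


pages = [
    {"url": "http://a.example", "claimed": {"News"}, "auto_categorized": False},
    {"url": "http://b.example", "claimed": {"News", "Medical", "History"},
     "auto_categorized": False},
]

first = scan_order(pages)[0]
# Suppose the crawler's own analysis supports only the News category.
flagged = unjustified_categories(first, lambda page: {"News"})
```

Here `flagged` holds the categories that the creator claimed but the crawler could not confirm; what is done with such pages is left open by the text.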
The apparatus for categorizing a page includes means or mechanisms for providing a list of categories with corresponding indicia, and means for assigning the page to one or more of a plurality of categories. The preferred categories are the categories of the three tiers 12, 14, and 16, as shown in Figure 1.
The second embodiment of the invention is a method and apparatus for categorizing pages on a network. This method may be performed by a Web crawler. The method includes the steps of determining whether a page is involved in transacting business or providing information; assigning a business-transacting page to one category, an information-providing page to a second category, and a page that is involved in both transacting business and providing information to both the first and second categories; determining whether a page has information relating to one or more subject matter categories; assigning a page to one or more subject matter categories; determining the types of files associated with a page; and assigning a page to one or more file-type categories. The method further includes the step of assigning a page that has been assigned to any two or more categories, to a subcategory that consists of only pages assigned to the identical two or more categories. The outcome of the method is that a page is determined to be "in" or "within" the categories that best characterize the page.
The step of determining whether a page is involved in transacting business (i.e., is a business-transacting page) may be performed by determining whether the page includes encryption software. If the page includes encryption software, it will be determined to be involved in transacting business. Additionally, or alternatively, the step may be performed by determining whether the page has the capability of permitting a user to conduct a financial transaction through the page. If so, the page will be determined to be involved in transacting business (i.e., a business-transacting page). A page involved in providing information will be determined to be an information-providing page.
The step of assigning business-transacting pages to one category (preferably designated Commerce 18), pages involved in providing information to a second category (preferably designated Information 20), and pages that are involved in both transacting business and providing information to both categories is preferably performed by assigning business-transacting pages to a first list (containing only business-transacting pages), assigning pages involved in providing information to a second list (containing only information-providing pages), and assigning pages that are involved in both transacting business and providing information to both the first and second lists. The lists are preferably databases.
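As a minimal sketch of this assignment step, assume the determining step has already produced two boolean findings for each page. The names `assign_first_tier`, `commerce_list`, and `information_list` are illustrative only, and Python sets stand in for the preferred databases.

```python
# Hypothetical in-memory stand-ins for the two first-tier lists.
commerce_list = set()      # pages assigned to Commerce 18
information_list = set()   # pages assigned to Information 20


def assign_first_tier(url, transacts_business, provides_information):
    """Add a page to the Commerce list, the Information list, or both."""
    if transacts_business:
        commerce_list.add(url)
    if provides_information:
        information_list.add(url)


assign_first_tier("http://shop.example", True, False)
assign_first_tier("http://journal.example", False, True)
assign_first_tier("http://catalog.example", True, True)

# Pages present on both lists form the Commerce-and-Information overlap.
both = commerce_list & information_list
```

Keeping two independent lists, rather than a single three-valued label, means the overlap can be obtained by intersection and a page's membership in one list never has to be revised when it is added to the other.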
The step of determining whether a page has information relating to one or more subject matter categories is preferably performed by parsing the text of the page. There are various technologies currently available that parse text that may perform this function satisfactorily.
The step of assigning a page to one or more subject matter categories is preferably performed by assigning a page that has information related to particular subject matter categories to a separate list for each such subject matter category, where each list contains only pages having information related to that subject matter category. The categories are preferably the twelve categories of second tier 14. The lists are preferably databases.
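The specification does not name a particular text-parsing technology, so the following is only a naive sketch of the determining and assigning steps using whole-word term matching. The Government terms are taken from the topics listed for category 22; the Medical terms, the `TOPIC_TERMS` table, and the function names are assumptions made for illustration.

```python
import re

# Illustrative topic terms. Only the Government terms come from the text;
# the Medical terms are assumed for the sake of the example.
TOPIC_TERMS = {
    "Government": {"government", "law", "military", "nations", "politics", "taxes"},
    "Medical": {"medicine", "disease", "hospital", "surgery"},
}

# One list per subject matter category, each holding only matching pages.
subject_lists = {name: set() for name in TOPIC_TERMS}


def subject_categories(text):
    """Return the categories having at least one term present in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {name for name, terms in TOPIC_TERMS.items() if words & terms}


def assign_subjects(url, text):
    """Add the page to the list of every category its text relates to."""
    for name in subject_categories(text):
        subject_lists[name].add(url)


assign_subjects("http://example.org/a", "Military hospital funding and taxes")
```

A production system would likely need stemming, phrase handling, and weighting; the point here is only that one page can land on several subject matter lists at once.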
The step of determining the type of files associated with a page may be performed by identifying files containing text, graphics, audio, video, multimedia, and communications between persons. This step can be satisfactorily accomplished by search engines that scan Web pages and recognize file extensions such as .au (audio), .wav (sound), .gif (image), .jpeg (image), .jpg (image), .avi (video), .mpeg (movies), and .mpg (movies). The step of assigning a page to one or more categories based on file type is preferably performed by assigning a page that is associated with particular file types to a separate list for each such file type where each list contains only pages associated with a single file type. The categories are preferably the five file-type categories of third tier 16. The lists are preferably databases.
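Recognition by file extension might be sketched as follows. The mapping covers extensions of the kind named above. Placing video and movie formats under Multimedia follows the description of category 50, and falling back to Text-only when no visual, audio, or multimedia file is found is an assumption based on the description of category 52. The names `EXTENSION_CATEGORY` and `file_type_categories` are illustrative.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension-to-category table (illustrative, not exhaustive).
EXTENSION_CATEGORY = {
    ".au": "Audio", ".wav": "Audio",
    ".gif": "Visual", ".jpeg": "Visual", ".jpg": "Visual",
    ".avi": "Multimedia", ".mpeg": "Multimedia", ".mpg": "Multimedia",
}


def file_type_categories(file_urls):
    """Return the file-type categories for the files linked from a page."""
    categories = set()
    for url in file_urls:
        # Look only at the path so query strings do not hide the extension.
        suffix = PurePosixPath(urlparse(url).path).suffix.lower()
        if suffix in EXTENSION_CATEGORY:
            categories.add(EXTENSION_CATEGORY[suffix])
    # Assumption: a page with no recognized media files is Text-only.
    return categories or {"Text-only"}
```

For instance, a page linking to `logo.GIF` and `intro.wav` would be placed in both Visual and Audio. The Communication category is not detectable from these extensions and is left out of the sketch.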
The step of assigning a page to a subcategory is performed after the page has been assigned to all possible categories from three tiers 12, 14, and 16. The Web crawler assigns a page that has been assigned to two or more categories to a subcategory consisting of only pages assigned to the identical categories. For example, a page that has been categorized into the categories of Information, History, Medical, and Visual would be assigned to a subcategory containing only pages also assigned to the identical categories of Information 20, History 32, Medical 24, and Visual 46. A separate list is created for each of the possible combinations of any two or more categories of three tiers 12, 14, and 16. Each list is preferably a separate database. Examples of software that can be used for creating and managing databases are Oracle 8i version 2 with the File System option and Informix Dynamic server.
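To give a sense of scale, the three tiers of the preferred embodiment hold 2 + 12 + 5 = 19 categories, so the number of combinations of two or more categories is 2^19 - 1 - 19 = 524,268. The sketch below computes that figure and keys each subcategory list by the exact set of categories. Creating lists lazily, only when a page first needs one, is an assumption made here for practicality (the text describes a list for every possible combination), and the names are illustrative.

```python
from collections import defaultdict

# 2 first-tier + 12 second-tier + 5 third-tier categories.
TOTAL_CATEGORIES = 2 + 12 + 5

# All subsets, minus the empty set, minus the single-category subsets.
POSSIBLE_SUBCATEGORIES = 2 ** TOTAL_CATEGORIES - 1 - TOTAL_CATEGORIES

# Subcategory lists keyed by the exact combination of categories.
subcategory_lists = defaultdict(set)


def assign_subcategory(url, categories):
    """File the page under the subcategory matching its identical categories."""
    key = frozenset(categories)
    if len(key) >= 2:
        subcategory_lists[key].add(url)
    return key


key = assign_subcategory(
    "http://example.org/anatomy",
    ["Information", "History", "Medical", "Visual"],
)
```

Using a `frozenset` as the key makes the order in which categories were assigned irrelevant, so two pages with the identical categories always share one list.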
The apparatus for categorizing pages on a network includes means or mechanisms for determining whether a page is involved in transacting business or providing information; means for assigning business-transacting pages to one category, information-providing pages to a second category, and pages involved in both transacting business and providing information to both the first and second categories; means for determining whether a page has information related to one or more subject matter categories; means for assigning a page to one or more subject matter categories; means for determining the types of files associated with a page; and means for assigning the page to one or more file-type categories. The apparatus may also include means for indicating to a search engine that the page has been categorized automatically.
The third embodiment of the invention is a method and apparatus for searching for and locating information on a network. The method allows the user to search pages on a network that have already been categorized into three tiers of categories 12, 14, and 16. The categorization may have been done by the creator of a page at the time the page was created or during editing at a later time, or by a Web crawler automatically at some time after the page was created. The method also includes a categorization step, preferably performed by a search engine, before the search is begun in order to categorize any new pages that have not yet been categorized. The categorizing step comprises assigning the page to one or more categories, including a category for pages involved in transacting business and a category for pages involved in providing information, assigning the page to one or more subject matter categories, and assigning the page to one or more file-type categories. This categorizing step may be accomplished using a Web crawler, by the method and apparatus of the second embodiment.
The method provides the user with the opportunity to limit the search by selecting categories from three tiers 12, 14, and 16 and by utilizing a keyword search. The user may select one or more categories from each of three tiers 12, 14, and 16, from one or two of the tiers, or from none of the tiers, and may or may not use the keyword search function. For convenience, as is well known in the art, when an icon is selected, its appearance changes such that it is emphasized (for example, highlighted).
The user may select, from first tier 12, the category of Commerce 18, the category of Information 20, or both categories 18 and 20. The categories may be conveniently represented on the user's screen by an icon or a symbol, for example, as is preferred: "$" for Commerce 18 and "i" for Information 20. If the user selects "$," the search will be restricted to only those Web pages that are categorized as Commerce 18. This will include all pages in the Commerce category 18 as well as the subcategory that is both Commerce 18 and Information 20. Pages only in the Information category 20, and not also in Commerce 18, will automatically be excluded. If the user selects "i," the search will be restricted to only those Web pages that are categorized as Information 20. This will include all pages in Information category 20 as well as the subcategory that is both Information 20 and Commerce 18. Pages only in the Commerce category 18, and not also in Information 20, will automatically be excluded. If the user selects both "$" and "i," as shown in Figure 5, the search will be restricted to only those Web pages that are categorized as both Commerce 18 and Information 20. Only subcategory 56 of Commerce and Information will be searched. Pages only in Commerce 18 and pages only in Information 20 will be excluded. If none of the categories of first tier 12 are selected, the search will include Web pages of both categories and the subcategory and will not be narrowed based on whether the page is involved in transacting business or providing information.
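The four first-tier cases described above reduce to a single subset test: a page is searched when every selected category is among the page's categories. The following sketch, with hypothetical page names and a hypothetical `search_first_tier` helper, walks through each case.

```python
def matches_first_tier(page_categories, selected):
    """True if the page falls within every selected first-tier category."""
    return set(selected) <= set(page_categories)


# Three illustrative pages: Commerce only, Information only, and both.
pages = {
    "shop": {"Commerce"},
    "journal": {"Information"},
    "catalog": {"Commerce", "Information"},
}


def search_first_tier(selected):
    """Return the names of pages not excluded by the first-tier selection."""
    return sorted(
        name for name, cats in pages.items()
        if matches_first_tier(cats, selected)
    )


only_commerce = search_first_tier({"Commerce"})
only_information = search_first_tier({"Information"})
both_selected = search_first_tier({"Commerce", "Information"})
nothing_selected = search_first_tier(set())
```

Because the empty set is a subset of every set, the "none selected" case needs no special handling and leaves the search unnarrowed.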
The user next may select one or more categories from second tier 14: Government 22, Medical 24, Education & Social Science 26, News 28, Sports & Recreation 30, History 32, Science & Technology 34, Arts & Humanities 36, Finance & Business 38, Reference 40, Explicit 42, and Other 44. As shown in Figure 1, each of these twelve categories may be conveniently represented on the user's screen by a different icon or symbol, for example, as is preferred: a flag for Government, a caduceus for Medical, a mortarboard for Education & Social Science, a satellite dish for News, a bicycle for Sports & Recreation, a pyramid for History, a microscope for Science & Technology, an artist's palette for Arts & Humanities, a briefcase for Finance & Business, a book for Reference, an "X" for Explicit (pornographic or sexually-explicit material), and a "?" for Other. The user may also view a list of topics included in each category by clicking on the category. The twelve subject matter categories and their corresponding topics are shown in Figures 2, 3, and 4. If none of the categories are selected, the search will include Web pages of all twelve categories and will not be narrowed based on the subject matter contained in the page.
Next, the user may select one or more categories from third tier 16: Visual 46, Audio 48, Multimedia 50, Text-only 52, and Communication 54. As shown in Figure 1, each of the five categories may be conveniently represented on the user's screen by an icon or symbol, for example, as is preferred: an eye for Visual, an ear for Audio, a lightning bolt for Multimedia, a text page for Text-only, and a mouth for Communication. If no selection is made from this tier, the results from the search will include Web pages that are associated with file-types of text, visual, audio, multimedia, and communications and will not be narrowed based on the types of files contained on the page.
Combining categories restricts the search results to only the relevant categories and subcategories. The greater the number of categories chosen, the more refined the search and the greater the number of pages that are excluded from the search. When the user selects several categories, the user does not get results from each of those categories, but only from the subcategory that is created from the combination of the selected categories. Combining categories acts as a filtering process, eliminating irrelevant material from the search and from subsequent results. This method allows the user to exclude unwanted material, such as pornography, which is contained in Explicit category 42.
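The combination of categories described above can be illustrated as a set intersection over per-category lists of pages. The sketch below assumes a hypothetical index that maps each category name to the set of page addresses assigned to it; the index layout, the function name, and the example addresses are assumptions made for illustration only.

```python
def pages_in_subcategory(index, selected):
    """Return the pages assigned to every selected category.

    index:    dict mapping a category name to a set of page addresses
              (a hypothetical layout, one list of pages per category).
    selected: iterable of category names chosen from any of the tiers.

    With no selection, all categorized pages are returned.
    """
    result = set().union(*index.values())  # every categorized page
    for category in selected:
        # Each additional category can only shrink the result.
        result &= index.get(category, set())
    return result


index = {
    "Information": {"http://example.org/a", "http://example.org/b"},
    "Medical": {"http://example.org/b", "http://example.org/c"},
    "Explicit": {"http://example.org/d"},
}
print(sorted(pages_in_subcategory(index, ["Information", "Medical"])))
```

Because each added category is intersected with the running result, selecting more categories can only reduce the number of pages returned, which matches the filtering behavior described above.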
The user may next enter a keyword 58, which can be a single word or multiple words. The keyword search can be formulated by using either Boolean logic terms or natural language.
After making the selections, the user initiates the search. The symbols for the categories selected and the keyword preferably remain visible on the user's screen during the search.
After the search is initiated, a determination is made as to whether a page is categorized. A page may have been categorized using the same categories as are available to the user to limit the search, or the site may have been categorized using different categories. The determination of whether a page is categorized is preferably performed by determining whether the page is contained or referred to on a list of categorized pages. The list may be a database or an index created automatically by a Web crawler, which contains the addresses of Web pages. Where the network being searched contains at least one page categorized into one or more of the categories which were provided to the user to limit the search, after a user initiates a category-limited search, an identification is made of all pages that have been assigned all of the categories to which the search was limited. This may be accomplished by a search engine reviewing a database corresponding to a subcategory that is equal to the combination of categories selected by the user. If the search has been limited using a keyword, an identification is made of all pages containing the keyword. If the search is both category-limited and keyword-limited, an identification is made of all pages that have been assigned to all of the categories to which the search was category-limited and which also contain the keyword.
An example of how a search works is shown in Figures 6 through 9. As shown in Figure 6, if the user selects category 20 Information from first tier 12 and category 24 Medical from second tier 14, the search and subsequent search results will be limited to subcategory 60 that is created by the combination of Information 20 and Medical 24 categories, as shown by the gray area. The search results will not include pages from Information category 20 or Medical category 24 that are not contained within smaller subcategory 60.
Figure 7 shows a search in which the user selected Information 20 from first tier 12 and History 32 and Medical 24 from second tier 14. In that case, the search and subsequent search results would be limited to subcategory 62 created by the combination of Information 20, Medical 24, and History 32 categories, as shown by the gray area. The search results will not include pages from Information 20, Medical 24, or History 32 categories that are not contained within smaller subcategory 62.
Figure 8 shows a search in which the user selected Information 20 from first tier 12, Medical 24 and History 32 from second tier 14, and Visual 46 from third tier 16. In that case, the search and subsequent search results would be limited to subcategory 64 created by the combination of Information 20, Medical 24, History 32, and Visual 46 categories, as shown by the gray area. The search results will not include pages from Information 20, Medical 24, History 32, or Visual 46 categories that are not contained within smaller subcategory 64.
Figure 9 shows a search in which the user selected Information 20 from first tier 12, Medical 24 and History 32 from second tier 14, Visual 46 from third tier 16, and the keyword 58 "Pasteur." In that case, the search and subsequent search results would be limited to the subcategory created by the combination of Information 20, Medical 24, History 32, and Visual 46 categories that contain the keyword 58 "Pasteur." The search results will not include pages from Information 20, Medical 24, History 32, and Visual 46 categories that are not contained in the subcategory.
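The progressive narrowing of Figures 6 through 9 can be traced with a small sketch. The page records, example addresses, and page text below are hypothetical, and the keyword test is a simple case-insensitive substring match chosen for brevity; the disclosure itself allows Boolean or natural-language keyword searches and does not prescribe a matching method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str               # network address of the page
    categories: frozenset  # all categories assigned to the page
    text: str              # page text used for the keyword test


def search(pages, selected=(), keyword=None):
    """Return addresses of pages in all selected categories that also
    contain the keyword, when a keyword is given."""
    wanted = frozenset(selected)
    hits = []
    for page in pages:
        if not wanted <= page.categories:
            continue  # page lacks at least one selected category
        if keyword and keyword.lower() not in page.text.lower():
            continue  # keyword-limited search and keyword not found
        hits.append(page.url)
    return hits


# Hypothetical categorized pages.
PAGES = [
    Page("http://example.org/a", frozenset({"Information", "Medical"}),
         "Clinic opening hours"),
    Page("http://example.org/b",
         frozenset({"Information", "Medical", "History"}),
         "A history of surgery"),
    Page("http://example.org/c",
         frozenset({"Information", "Medical", "History", "Visual"}),
         "Portraits of Louis Pasteur"),
    Page("http://example.org/d",
         frozenset({"Information", "Medical", "History", "Visual"}),
         "Photographs of early hospitals"),
    Page("http://example.org/e", frozenset({"Commerce", "Medical"}),
         "Buy Pasteur biographies"),
]

fig6 = search(PAGES, {"Information", "Medical"})
fig7 = search(PAGES, {"Information", "Medical", "History"})
fig8 = search(PAGES, {"Information", "Medical", "History", "Visual"})
fig9 = search(PAGES, {"Information", "Medical", "History", "Visual"},
              keyword="Pasteur")
for label, hits in [("Fig. 6", fig6), ("Fig. 7", fig7),
                    ("Fig. 8", fig8), ("Fig. 9", fig9)]:
    print(label, hits)
```

With these hypothetical pages, the four searches return four, three, two, and one page respectively. The Commerce page that mentions Pasteur is never returned, because it is not assigned to the Information category.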
All sites identified by the search are reported as search results to the user, by network address, such as a Web page's "uniform resource locator" (URL), so that the user can access any identified page. Other information, such as the first line, may also be reported. For each site reported, the results will show all of the symbols corresponding to all of the categories to which that page had been assigned. The results will also indicate whether the categorization step was performed automatically (for example, by a Web crawler).
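One possible shape for a reported result is sketched below. Only the "$", "i", "X", and "?" symbols are taken from the description; the bracketed text standing in for the remaining icons, the field separator, the function name, and the wording of the categorization note are assumptions made for illustration.

```python
# Symbols named in the description; other categories use graphical icons,
# which are represented here by bracketed text (an assumption).
SYMBOLS = {"Commerce": "$", "Information": "i", "Explicit": "X", "Other": "?"}


def format_result(url, categories, first_line="", auto_categorized=False):
    """Build one line of search output for a page.

    Shows the network address, an optional first line of the page, the
    symbols of every category assigned to the page, and whether the
    categorization was performed automatically.
    """
    icons = " ".join(SYMBOLS.get(c, f"[{c}]") for c in categories)
    how = "automatic" if auto_categorized else "not automatic"
    parts = [url, icons, f"categorized: {how}"]
    if first_line:
        parts.insert(1, first_line)
    return " | ".join(parts)


print(format_result("http://example.org/c", ["Information", "Medical"],
                    first_line="Portraits", auto_categorized=True))
```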
The apparatus for searching for and locating information on a network includes means or mechanisms for providing an opportunity to limit the search to one or more categories from three tiers 12, 14, and 16; means for providing an opportunity to limit the search by keyword; means for identifying all pages categorized into the categories to which the search was limited which contain the keyword; and means for reporting the results to a user.
The foregoing detailed disclosure of the inventive method and apparatus is considered as only illustrative of the preferred embodiment of, and not a limitation upon the scope of, the invention. Those skilled in the art will envision many other variations of the method and apparatus disclosed herein that nevertheless fall within the scope of the following claims. Alternative uses for this inventive method and apparatus may later be realized. Accordingly, the scope of the invention should be determined with reference to the appended claims and not by the examples that have been given herein.

Claims

1. A method of categorizing a network page, comprising the steps of: a. providing a list of categories; and, b. assigning a page to one or more of a plurality of said categories.
2. The method of Claim 1, wherein said categories include a category for pages involved in transacting business and a category for pages involved in providing information.
3. The method of Claim 1, wherein said categories include a plurality of categories based on subject matter.
4. The method of Claim 3, wherein said categories comprise categories related to government, medical, education and social science, news, sports and recreation, history, science and technology, arts and humanities, finance and business, reference, explicit, and other.
5. The method of Claim 1, wherein said categories include a plurality of categories based on the type of files associated with a page.
6. The method of Claim 5, wherein said categories comprise visual, audio, multimedia, text-only, and communication.
7. The method of Claim 1, wherein said categories include: a. a category for pages involved in transacting business and a category for pages involved in providing information; b. a plurality of categories based on subject matter; and, c. a plurality of categories based on the type of files associated with a page.
8. The method of Claim 1, further comprising the step of providing an indicium for each of said categories.
9. The method of Claim 8, wherein said indicium comprises an icon.
10. The method of Claim 1, further comprising the step of communicating said categories assigned to a page to a search engine.
11. A method for verifying the accuracy of categorization of a network page, comprising the steps of: a. for a page that has been categorized into one or more of a plurality of categories, scanning the page; and, b. determining whether said page is properly included in each of said categories.
12. Apparatus for categorizing a network page, comprising: a. means for providing a list of categories; and, b. means for assigning a page to one or more of a plurality of said categories.
13. A method for categorizing pages on a network, comprising the steps of: a. determining whether a page is involved in transacting business or in providing information; b. determining whether a page has information relating to one or more of a plurality of subject matter categories; and, c. determining the type of files associated with a page.
14. The method of Claim 13, wherein said involvement-determining step comprises the step of determining whether a page includes encryption software.
15. The method of Claim 13, wherein said involvement-determining step comprises the step of determining whether a page includes the capability of permitting a user to conduct a financial transaction through a page.
16. The method of Claim 13, further comprising the step of assigning said business-transacting pages to a first category, said information-providing pages to a second category, and pages involved in both transacting business and providing information to both said first and second categories.
17. The method of Claim 16, wherein said assigning step comprises assigning said business-transacting pages to a first list, said information-providing pages to a second list, and pages involved in both transacting business and providing information to both said first and second lists.
18. The method of Claim 16, wherein said first category consists of all pages that may be utilized in the buying, selling, or leasing of a product or service.
19. The method of Claim 13, wherein said subject matter-determining step comprises the step of parsing the text of a page.
20. The method of Claim 13, further comprising the step of assigning a page to one or more of a plurality of subject matter categories.
21. The method of Claim 20, wherein said assigning step comprises assigning a page that has information relating to a subject matter category to a list containing only pages having information relating to said subject matter category.
22. The method of Claim 20, wherein said plurality of subject matter categories comprises categories related to science and technology and medical.
23. The method of Claim 20, wherein said plurality of subject matter categories comprises categories related to finance and business and reference.
24. The method of Claim 20, wherein said plurality of subject matter categories comprises categories related to government, medical, and news.
25. The method of Claim 20, wherein said plurality of subject matter categories comprises categories related to sports and recreation and history.
26. The method of Claim 20, wherein said plurality of subject matter categories comprises categories related to education and social science, arts and humanities, and reference.
27. The method of Claim 20, wherein said plurality of subject matter categories comprises categories related to pornography or sexually-explicit material.
28. The method of Claim 20, wherein said plurality of subject matter categories comprise categories related to finance and business, government, and news.
29. The method of Claim 20, wherein said plurality of subject matter categories comprises categories related to government, medical, education and social science, news, sports and recreation, history, science and technology, arts and humanities, finance and business, reference, explicit, and other.
30. The method of Claim 13, wherein said file type-determining step comprises the step of identifying files containing text, graphics, audio, video, multimedia, and communications between persons.
31. The method of Claim 13, wherein said file-type determining step comprises identifying file extensions comprising .au, .jpeg, .jpg, .mpg, .mpeg, .avi, .wav, and .gif.
32. The method of Claim 13, further comprising the step of assigning a page to one or more of a plurality of categories based on the file type.
33. The method of Claim 32, wherein said assigning step comprises assigning a page that contains a file type to a list containing only pages containing said file type.
34. The method of Claim 32, wherein said plurality of file-type categories comprise visual, audio, multimedia, text-only, and communication.
35. A method for categorizing pages on a network, comprising the steps of: a. determining whether a page is involved in transacting business or involved in providing information; b. assigning a business-transacting page to a first category, an information-providing page to a second category, and a page involved in both transacting business and providing information to both said first and second categories; c. determining whether a page has information relating to one or more of a plurality of subject matter categories; d. assigning a page to one or more of a plurality of subject matter categories; e. determining the type of files contained on a page; and, f. assigning a page to one or more of a plurality of categories based on the type of files associated with the page.
36. The method of Claim 35, further comprising assigning a page that has been assigned to a plurality of categories to a subcategory consisting only of pages assigned to all of said plurality of categories.
37. Apparatus for categorizing pages on a network, comprising: a. means for determining whether a page is involved in transacting business or in providing information; b. means for determining whether a page has information relating to one or more of a plurality of subject matter categories; and, c. means for determining the type of files associated with a page.
38. The apparatus of Claim 37, wherein said means for determining whether a page is involved in transacting business comprises means for determining whether a page includes encryption software.
39. The apparatus of Claim 37, wherein said means for determining whether a page is involved in transacting business comprises means for determining whether a page includes the capability of permitting a user to conduct a financial transaction through said page.
40. The apparatus of Claim 37, further comprising means for assigning said business-transacting pages to a first category, said information-providing pages to a second category, and pages involved in both transacting business and providing information to both said first and second categories.
41. The apparatus of Claim 40, wherein said assigning means operates to assign said business-transacting pages to a first list, said information-providing pages to a second list, and pages involved in both transacting business and providing information to both said first and second lists.
42. The apparatus of Claim 40, wherein said first category consists of all pages that may be utilized in the buying, selling, or leasing of a product or service.
43. The apparatus of Claim 37, wherein said means for determining whether a page has information relating to one or more of a plurality of subject matter categories comprises means for parsing the text of a page.
44. The apparatus of Claim 37, further comprising means for assigning a page to one or more of a plurality of subject matter categories.
45. The apparatus of Claim 44, wherein said assigning means operates to assign a page that has information relating to a subject matter category to a list containing only pages having information relating to said subject matter category.
46. The method of Claim 44, wherein said plurality of subject matter categories comprises categories related to sports and recreation and history.
47. The apparatus of Claim 44, wherein said plurality of subject matter categories comprises categories related to finance and business, government, and news.
48. The apparatus of Claim 44, wherein said plurality of subject matter categories comprises categories related to education and social science, arts and humanities, and reference.
49. The apparatus of Claim 44, wherein said plurality of subject matter categories comprises categories related to science and technology and medical.
50. The apparatus of Claim 44, wherein said plurality of subject matter categories comprises categories related to finance and business and reference.
51. The method of Claim 44, wherein said plurality of subject matter categories comprises categories related to government, medical, and news.
52. The apparatus of Claim 44, wherein said plurality of subject matter categories comprises categories related to government, medical, education and social science, news, sports and recreation, history, science and technology, arts and humanities, finance and business, reference, explicit, and other.
53. The apparatus of Claim 37, wherein said means for determining the file types associated with a page comprises means for identifying files containing text, graphics, audio, video, multimedia, and communications between persons.
54. The apparatus of Claim 37, wherein said means for determining the file types contained on a page comprises means for identifying file extensions comprising .au, .jpeg, .jpg, .mpg, .mpeg, .avi, .wav, and .gif.
55. The apparatus of Claim 37, further comprising means for assigning a page to one or more of a plurality of categories based on the file type.
56. The apparatus of Claim 55, wherein said assigning means operates to assign a page that contains a file type to a list containing only pages associated with said file type.
57. The apparatus of Claim 55, wherein said plurality of file-type categories comprises visual, audio, multimedia, text-only, and communication.
58. Apparatus for categorizing a page on a network, comprising: a. means for categorizing a page based on whether it is involved in transacting business or in providing information, comprising: i. means for determining whether a page is involved in transacting business or in providing information; and, ii. means for assigning said business-transacting pages to a first category, said information-providing pages to a second category, and pages involved in both transacting business and providing information to both said first and second categories; b. means for categorizing a page based on whether it has information relating to one or more of a plurality of subject matter categories, comprising: i. means for determining whether a page has information relating to one or more of a plurality of subject matter categories; and, ii. means for assigning a page to one or more of a plurality of subject matter categories; and, c. means for categorizing a page based on the type of files associated with a page, comprising: i. means for determining the type of files associated with a page; and, ii. means for assigning a page to one or more of a plurality of categories based on the type of files associated with a page.
59. The apparatus of Claim 58, further comprising means for assigning a page that has been assigned to a plurality of categories to a subcategory consisting of only pages assigned to all of said plurality of categories.
60. A method for searching for and locating information on a network, comprising the steps of: a. providing an opportunity to limit the search to one or more of a plurality of categories, wherein the categories are pages involved in transacting business, pages involved in providing information, and pages involved in both transacting business and providing information; b. providing an opportunity to limit the search to one or more of a plurality of subject matter categories; c. providing an opportunity to limit the search to one or more of a plurality of categories based on the type of files associated with a page; and, d. providing an opportunity to limit the search by keyword.
61. The method of Claim 60, wherein the opportunity to limit the search by category is exercised by a user selecting an indicium corresponding to each such category.
62. The method of Claim 60, wherein the step of providing an opportunity to limit the search to one or more of a plurality of subject matter categories further comprises the step of providing a separate subject matter category for pornographic material and providing an opportunity to limit the search to categories other than said pornographic material category.
63. The method of Claim 60, wherein said subject matter categories comprise categories related to government, medical, education and social science, news, sports and recreation, history, science and technology, arts and humanities, finance and business, referenced, explicit, and other.
64. The method of Claim 60, wherein said file-type categories comprise visual, audio, multimedia, text-only, and communications.
65. The method of Claim 60, before providing the opportunity to limit the search, further comprising the step of categorizing a page on a network.
66. The method of Claim 65, wherein the categorizing step comprises assigning a page to one or more of a plurality of categories, wherein the categories are pages involved in transacting business, pages involved in providing information, and pages involved in both transacting business and providing information.
67. The method of Claim 65, wherein the categorizing step comprises assigning a page to one or more of a plurality of subject matter categories.
68. The method of Claim 65, wherein the categorizing step comprises assigning a page to one or more of a plurality of categories based on the type of files associated with a page.
69. The method of Claim 65, wherein the categorizing step comprises: a. assigning a page to one or more of a plurality of categories, wherein the categories are pages involved in transacting business, pages involved in providing information, and pages involved in both transacting business and providing information; b. assigning a page to one or more of a plurality of subject matter categories; and, c. assigning a page to one or more of a plurality of categories based on the type of files associated with a page.
70. The method of Claim 69, further comprising assigning a page that has been assigned to a plurality of categories, to a subcategory consisting of only pages assigned to all of said plurality of categories.
71. The method of Claim 60, further comprising the step of, after a user initiates a category-limited search, making a determination whether a page is categorized into one or more categories to which the user has had the opportunity to limit the search.
72. The method of Claim 71, wherein the step of determining whether a page is categorized comprises the step of determining whether the page is contained on a list of categorized pages.
73. The method of Claim 60, on a network having a plurality of pages categorized into one or more categories to which the user has had an opportunity to limit the search, after a user initiates a category-limited search, further comprising the step of identifying all pages categorized into all of the categories to which the search was limited.
74. The method of Claim 73, further comprising the step of reporting to a user all said identified pages.
75. The method of Claim 60, after a user initiates a keyword-limited search, further comprising the step of identifying all pages containing the keyword to which the search was limited.
76. The method of Claim 75, further comprising the step of reporting to a user all said identified pages.
77. The method of Claim 73, wherein said category-limited search was also keyword-limited, further comprising the step of determining which of said identified pages contain the keyword to which the search was limited.
78. The method of Claim 77, further comprising the step of reporting to a user all said keyword-containing pages.
79. A method for searching for and locating information on a network, comprising the steps of: a. providing an opportunity to limit the search to one or more of a plurality of categories, wherein the categories are pages involved in transacting business, pages involved in providing information, and pages involved in both transacting business and providing information; b. providing an opportunity to limit the search to one or more of a plurality of subject matter categories, wherein said subject matter categories are: government, medical, education and social science, news, sports and recreation, history, science and technology, arts and humanities, finance and business, reference, explicit, and other; c. providing an opportunity to limit the search to one or more of a plurality of categories based on the type of files associated with a page, wherein said categories are: visual, audio, multimedia, text-only, and communications; and, d. providing an opportunity to limit the search by keyword.
80. The method of Claim 79, before providing the opportunity to limit the search, further comprising the step of categorizing a page on a network.
81. The method of Claim 80, wherein the categorizing step comprises: a. assigning a page to one or more of a plurality of categories, wherein the categories are pages involved in transacting business, pages involved in providing information, and pages involved in both transacting business and providing information; b. assigning a page to one or more of a plurality of subject matter categories; and, c. assigning a page to one or more of a plurality of categories based on the type of files associated with a page.
82. The method of Claim 81, further comprising assigning a page that has been assigned to a plurality of categories, to a subcategory consisting of only pages assigned to all of said plurality of categories.
83. The method of Claim 79, on a network having a plurality of pages categorized into one or more categories to which the user has had an opportunity to limit the search, after a user initiates a category-limited search, further comprising the step of identifying all pages categorized into all of the categories to which the search was limited.
84. The method of Claim 83, further comprising the step of reporting to a user all said identified pages.
85. The method of Claim 79, after a user initiates a keyword-limited search, further comprising the step of identifying all pages containing the keyword to which the search was limited.
86. The method of Claim 85, further comprising the step of reporting to a user all said identified pages.
87. The method of Claim 83, wherein said category-limited search was also keyword-limited, further comprising the step of determining which of said identified pages contain the keyword to which the search was limited.
88. The method of Claim 87, further comprising the step of reporting to a user all said keyword-containing pages.
89. Apparatus for searching for and locating information on a network, comprising: a. means for providing an opportunity to limit the search to one or more of a plurality of categories; b. means for providing an opportunity to limit the search by keyword; c. means for identifying all pages categorized into all of the categories to which the search was limited; d. means for determining which of said identified pages contain said keyword to which the search was limited; e. means for reporting to a user all said identified pages; and, f. means for reporting to a user all said keyword-containing identified pages.
PCT/US2000/012376 1999-05-04 2000-05-03 Method and apparatus for categorizing and retrieving network pages and sites WO2000067161A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU49891/00A AU4989100A (en) 1999-05-04 2000-05-03 Method and apparatus for categorizing and retrieving network pages and sites

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US13269499P 1999-05-04 1999-05-04
US60/132,694 1999-05-04
US56569500A 2000-05-03 2000-05-03
US09/565,695 2000-05-03

Publications (2)

Publication Number Publication Date
WO2000067161A2 (en) 2000-11-09
WO2000067161A3 WO2000067161A3 (en) 2002-06-06

Family

ID=26830640

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/012376 WO2000067161A2 (en) 1999-05-04 2000-05-03 Method and apparatus for categorizing and retrieving network pages and sites

Country Status (2)

Country Link
AU (1) AU4989100A (en)
WO (1) WO2000067161A2 (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002054292A2 (en) * 2000-12-29 2002-07-11 Treetop Ventures Llc A cooperative, interactive, heuristic system for the creation and ongoing modification of categorization systems
GB2386440A (en) * 2002-03-12 2003-09-17 Univ Hertfordshire Searching and navigating an information source
EP1388091A2 (en) * 2001-02-28 2004-02-11 Microsoft Corporation Category name service
US7168034B2 (en) * 1999-03-31 2007-01-23 Microsoft Corporation Method for promoting contextual information to display pages containing hyperlinks
WO2008030529A2 (en) * 2006-09-06 2008-03-13 Nexplore Corporation System and method for providing focused search term results
WO2009034473A2 (en) 2007-09-12 2009-03-19 Novartis Ag Gas57 mutant antigens and gas57 antibodies
WO2009081274A2 (en) 2007-12-21 2009-07-02 Novartis Ag Mutant forms of streptolysin o
WO2010079464A1 (en) 2009-01-12 2010-07-15 Novartis Ag Cna_b domain antigens in vaccines against gram positive bacteria
EP2258365A1 (en) 2003-03-28 2010-12-08 Novartis Vaccines and Diagnostics, Inc. Use of organic compounds for immunopotentiation
EP2277595A2 (en) 2004-06-24 2011-01-26 Novartis Vaccines and Diagnostics, Inc. Compounds for immunopotentiation
EP2357184A1 (en) 2006-03-23 2011-08-17 Novartis AG Imidazoquinoxaline compounds as immunomodulators
EP2360175A2 (en) 2005-11-22 2011-08-24 Novartis Vaccines and Diagnostics, Inc. Norovirus and Sapovirus virus-like particles (VLPs)
WO2011149564A1 (en) 2010-05-28 2011-12-01 Tetris Online, Inc. Interactive hybrid asynchronous computer game infrastructure
EP2583678A2 (en) 2004-06-24 2013-04-24 Novartis Vaccines and Diagnostics, Inc. Small molecule immunopotentiators and assays for their detection
EP2612679A1 (en) 2004-07-29 2013-07-10 Novartis Vaccines and Diagnostics, Inc. Immunogenic compositions for gram positive bacteria such as streptococcus agalactiae
US8549436B1 (en) 2007-06-04 2013-10-01 RedZ, Inc. Visual web search interface

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5706507A (en) * 1995-07-05 1998-01-06 International Business Machines Corporation System and method for controlling access to data located on a content server
US5924090A (en) * 1997-05-01 1999-07-13 Northern Light Technology Llc Method and apparatus for searching a database of records

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"SIMPLIFICATION OF A DATABASE QUERY THROUGH THE USE OF A CATEGORY WINDOW" IBM TECHNICAL DISCLOSURE BULLETIN, IBM CORP. NEW YORK, US, vol. 33, no. 3B, 1 August 1990 (1990-08-01), pages 459-461, XP000124421 ISSN: 0018-8689 *
ANONYMOUS: "Taxonomized Web Search" IBM TECHNICAL DISCLOSURE BULLETIN, IBM CORP. NEW YORK, US, vol. 40, no. 5, 1 May 1997 (1997-05-01), pages 195-196, XP002133594 ISSN: 0018-8689 *
BEERUD SHETH ET AL: "EVOLVING AGENTS FOR PERSONALIZED INFORMATION FILTERING" PROCEEDINGS OF THE CONFERENCE ON ARTIFICIAL INTELLIGENCE FOR APPLICATIONS. ORLANDO, MAR. 1 - 5, 1993, LOS ALAMITOS, IEEE COMP. SOC. PRESS, US, vol. CONF. 9, 1 March 1993 (1993-03-01), pages 345-352, XP000379626 Florida, USA ISBN: 0-8186-3840-0 *
CHEN H ET AL: "INTERNET CATEGORIZATION AND SEARCH: A SELF-ORGANIZING APPROACH" JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, ACADEMIC PRESS, INC, US, vol. 7, no. 1, 1 March 1996 (1996-03-01), pages 88-102, XP000619822 ISSN: 1047-3203 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7168034B2 (en) * 1999-03-31 2007-01-23 Microsoft Corporation Method for promoting contextual information to display pages containing hyperlinks
WO2002054292A3 (en) * 2000-12-29 2003-11-06 Treetop Ventures Llc A cooperative, interactive, heuristic system for the creation and ongoing modification of categorization systems
WO2002054292A2 (en) * 2000-12-29 2002-07-11 Treetop Ventures Llc A cooperative, interactive, heuristic system for the creation and ongoing modification of categorization systems
EP1388091A2 (en) * 2001-02-28 2004-02-11 Microsoft Corporation Category name service
EP1388091A4 (en) * 2001-02-28 2006-01-18 Microsoft Corp Category name service
US7213069B2 (en) 2001-02-28 2007-05-01 Microsoft Corporation Category name service able to override the category name based on requestor privilege information
GB2386440A (en) * 2002-03-12 2003-09-17 Univ Hertfordshire Searching and navigating an information source
EP2258365A1 (en) 2003-03-28 2010-12-08 Novartis Vaccines and Diagnostics, Inc. Use of organic compounds for immunopotentiation
EP2583678A2 (en) 2004-06-24 2013-04-24 Novartis Vaccines and Diagnostics, Inc. Small molecule immunopotentiators and assays for their detection
EP2277595A2 (en) 2004-06-24 2011-01-26 Novartis Vaccines and Diagnostics, Inc. Compounds for immunopotentiation
EP2612679A1 (en) 2004-07-29 2013-07-10 Novartis Vaccines and Diagnostics, Inc. Immunogenic compositions for gram positive bacteria such as streptococcus agalactiae
EP2360175A2 (en) 2005-11-22 2011-08-24 Novartis Vaccines and Diagnostics, Inc. Norovirus and Sapovirus virus-like particles (VLPs)
EP2357184A1 (en) 2006-03-23 2011-08-17 Novartis AG Imidazoquinoxaline compounds as immunomodulators
WO2008030529A3 (en) * 2006-09-06 2008-05-22 Nexplore Corp System and method for providing focused search term results
WO2008030529A2 (en) * 2006-09-06 2008-03-13 Nexplore Corporation System and method for providing focused search term results
US8549436B1 (en) 2007-06-04 2013-10-01 RedZ, Inc. Visual web search interface
WO2009034473A2 (en) 2007-09-12 2009-03-19 Novartis AG GAS57 mutant antigens and GAS57 antibodies
WO2009081274A2 (en) 2007-12-21 2009-07-02 Novartis AG Mutant forms of streptolysin O
EP2537857A2 (en) 2007-12-21 2012-12-26 Novartis AG Mutant forms of streptolysin O
WO2010079464A1 (en) 2009-01-12 2010-07-15 Novartis AG Cna_B domain antigens in vaccines against gram positive bacteria
WO2011149564A1 (en) 2010-05-28 2011-12-01 Tetris Online, Inc. Interactive hybrid asynchronous computer game infrastructure

Also Published As

Publication number Publication date
AU4989100A (en) 2000-11-17
WO2000067161A3 (en) 2002-06-06

Similar Documents

Publication Publication Date Title
US7181459B2 (en) Method of coding, categorizing, and retrieving network pages and sites
US6363377B1 (en) Search data processor
Lawrence Context in web search
Schwartz Web search engines
Choi et al. Searching for images: The analysis of users' queries for image retrieval in American history
US9576055B2 (en) Techniques for including collection items in search results
US5920859A (en) Hypertext document retrieval system and method
US7065523B2 (en) Scoping queries in a search engine
US6684218B1 (en) Standard specific
Nelson We have the information you want, but getting it will cost you! held hostage by information overload.
US6574625B1 (en) Real-time bookmarks
US20020129062A1 (en) Apparatus and method for cataloging data
US20060129538A1 (en) Text search quality by exploiting organizational information
JP2009238241A (en) Method and apparatus for searching data of database
US8364718B2 (en) Collaborative bookmarking
WO2000067161A2 (en) Method and apparatus for categorizing and retrieving network pages and sites
WO2007084852A2 (en) Systems and methods for providing sorted search results
WO2008109980A1 (en) Entity recommendation system using restricted information tagged to selected entities
US7024405B2 (en) Method and apparatus for improved internet searching
US6847960B1 (en) Document retrieval by information unit
EP1586021A2 (en) Database for allowing multiple customized views
Gill Metadata and the world wide web
WO1997049048A1 (en) Hypertext document retrieval system and method
Pu An analysis of Web image queries for search
Prime‐Claverie et al. Transposition of the cocitation method with a view to classifying web pages

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase in:

Ref country code: JP