WO2001048579A2 - Information search and retrieval system - Google Patents
- Publication number
- WO2001048579A2
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- information source
- user
- request
- words
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Definitions
- the present invention relates generally to the search and retrieval of information from an information source
- the Internet contains an overwhelming amount of information about a multitude of topics and the information available continues to increase at a rapid rate
- the nature of the Internet is that of an unorganized mass of information. Navigation to a specific desired page requires either knowing the uniform resource locator (URL) for the site, having a bookmark to the site (which is actually a stored URL for the site), or making successive requests for Web pages until the site is found
- URL uniform resource locator
- HTTP Hypertext transfer protocol
- An HTTP request includes details such as the URL of the site, cookies (data stored on the computer of a user to track his patterns and preferences), and a referer field (the page from which the current request was made)
- Cookies data stored on the computer of a user to track his patterns and preferences
- a referer field the page from which the current request was made
- the first page, 101, represents the US Patent and Trademark Office (USPTO) home page. It shows three choices: "general information", "searchable databases", and "PTO fees". Selecting "searchable databases" returns page 103, Search on the PTO Web Server. This page has two choices: "patent bibliographic and abstract database" and "trademark database with images". Selecting "patent bibliographic and abstract database" returns page 105, USPTO Web Patent Database. Under the heading Bibliographic Database are three choices: "boolean search", "advanced search", and "patent number search". Selecting "patent number search" returns page 107A, Patent Bibliographic Database.
- the user needs information about a specific U.S. patent, for example, US 5,123,456. If the user initially only has the URL of the home page of the USPTO, he has to make each of the following Web requests:
- the user is looking for information about a specific topic, but does not know where to find the information. In such a case, he may make use of a search engine to find sources for the information he needs.
- the user could search, using any of the public domain search engines available on the Web, using criteria such as "patent" or "US patent", and would find one of the USPTO Web pages among the matching results. The more knowledgeable the user, the narrower he can make the search criteria, thus resulting in a more finely tuned list of results. Additionally, there exist some products that make shortcuts for specific searches.
- using AltaVista Search, a search engine available from AltaVista Company at http://www.altavista.com/, the user enters "us patent 5123456" in the "Find This" box
- AltaVista will return what it calls an "Internet keyword", in this case U.S. patent 5123456RN, which, if clicked, will return a page similar to page 111, the USPTO Full Text and Image Database page containing the full text of the US patent.
- This search requires three Web retrievals, herein called "information requests": one to get the AltaVista Search page, one for the search itself, and one for the retrieval of the abstract page.
- An object of the present invention is to provide an improved information retrieval and searching system and method
- a system for retrieving information from at least one information source. The system includes an input unit, a specification unit, and a retrieval engine
- the input unit receives words from a user
- the specification unit receives a user-selected specification which provides details for the retrieval of information from the at least one information source
- the retrieval engine retrieves the information from the at least one information source using the words and the specification
- the retrieval engine includes a unit for generating at least one information request to the at least one information source
- the specification is a script for generating information requests for at least one specific information source
- the specification includes variables and the words are the values for the variables used to generate the information request
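The relationship between variables in a specification and the words supplied by the user can be illustrated with a minimal sketch. It assumes a specification reduced to a URL template with `{variable}` placeholders; the `dictionarySpec` object, the `buildRequestUrl` helper, and the `dictionary.example` address are all hypothetical and are not taken from the patent.

```javascript
// Hypothetical specification: a URL template whose {variables} are filled
// from the words the user supplies. Not the patent's actual format.
const dictionarySpec = {
  title: "Dictionary",
  urlTemplate: "https://dictionary.example/define?word={text}&field={context}",
};

// Replace each {name} placeholder with the URL-encoded value of that variable.
function buildRequestUrl(spec, values) {
  return spec.urlTemplate.replace(/\{(\w+)\}/g, (match, name) => {
    if (!(name in values)) {
      throw new Error(`missing value for variable "${name}"`);
    }
    return encodeURIComponent(values[name]);
  });
}

const exampleUrl = buildRequestUrl(dictionarySpec, {
  text: "negotiator",
  context: "law",
});
console.log(exampleUrl);
```

The user's words never need to mention the site or its form fields; only the template knows where each value goes.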
- the specification unit includes a unit for updating the specification from an external information source
- the selected specification belongs to a group of specifications and the specification unit includes a unit for updating the group of specifications from an external information source
- the specification group generally forms a related unit with a common theme
- the system for retrieving information also includes an interface unit connected to the input unit and to the specification unit
- the interface unit includes the input unit, a plurality of user selectable references each referring to a different specification, and a unit for providing the specification corresponding to the selected reference to the specification unit.
- an information request includes instructions to an external application.
- the external application includes a unit for manipulating the information request into a new information request and a unit for sending the new information request to the one information source.
- the external application includes a unit for generating multiple information requests, a unit for manipulating the output result of the information requests of each information source, and a unit for sending the manipulated output to the user.
- an interface unit operable with a system for retrieving information from at least one information source.
- the interface unit includes an input area where words may be entered, a plurality of user selectable references, and a unit for providing the words and the specification corresponding to the selected reference to the system for retrieving information.
- Each user selectable reference refers to a different specification and each specification provides details for the retrieval of information from one different information source.
- the one information source is a World Wide Web (“Web”) site.
- Web World Wide Web
- the one information request is a hypertext transfer protocol (HTTP) request.
- HTTP hypertext transfer protocol
- the external application is accessible on an external information source.
- a method for retrieving information from at least one information source includes the steps of inputting words received from a user, receiving a user-selected specification, and retrieving the information.
- the user-selected specification at least provides details for the retrieval of information from the at least one information source.
- the retrieval of the information from the at least one information source uses the words and the specification.
- a system for retrieving information from at least one information source includes an input unit for receiving words from a user, at least one specification, selectable by a user, wherein each specification at least provides details for the retrieval of information from the at least one information source, a specification unit for receiving the user-selected specification and a retrieval engine at least for retrieving the information from the at least one information source using the words and the specification.
- a method for retrieving information from at least one information source includes the steps of having at least one specification, wherein each specification at least provides details for the retrieval of information from the at least one information source, inputting words received from a user, receiving a user-selected one of the at least one specification, and retrieving the information from the at least one information source using the words and the specification.
- Fig. 1 is a schematic illustration of prior art searches and retrievals of information using the Internet
- Fig. 2 is a block diagram illustration of an information search and retrieval system, constructed and operative in accordance with a preferred embodiment of the present invention
- Fig. 3 is a block diagram illustration of the engine of Fig. 2;
- Fig. 4 is a schematic illustration of the service selector of Fig. 2, constructed and operative in accordance with a preferred embodiment of the present invention
- Fig. 5 is a schematic illustration of a graphical user input device for an information search and retrieval system, constructed and operative in accordance with a preferred embodiment of the present invention.
- Figs. 6A, 6B, and 6C are illustrations of data flow during the execution of an information request system of the invention.
- the present invention is a system and method for information retrieval and for searching in an information source
- This invention includes a number of features, for example, allowing a user to successfully retrieve information without knowledge about the resources available for use with a specific information source, and bypassing intermediate search and retrieval requests. This is true for all types of information retrieval, though for the sake of clarity, examples using the World Wide Web ("Web") will be described
- Fig. 2 is a block diagram illustration of an information search and retrieval system, constructed and operative in accordance with a preferred embodiment of the present invention
- the system comprises a service selector 10, a text handler 12, and an information search and retrieval engine 14, hereinbelow referred to as the "engine"
- Service selector 10 contains a plurality of specifications 11
- Service selector 10 passes a selected specification 11, describing a user-selected information source, to engine 14
- Selected specification 11 contains details particular to an information source 16, so that an appropriate information request can be built for the information source
- text handler 12 passes user-provided words, defining a search to be done, to engine 14. Engine 14 then uses these two inputs to build an information request defined by the words of the user and appropriate to the user-selected information source 16
- Information source 16 performs the search or retrieval and provides the results back to the user
- selected specification 11 describes the details for retrieval of the information the user wants. Thus, the user does not need to know what Web page is an appropriate data source, the uniform resource locator (URL) of the page, or how to fill in the necessary data in the form on the page
- Service selector 10 contains up to N specifications 11 that the user chooses from. The specifications 11 contain details telling engine 14 how to build the information request. Specification 11 provides a definition of where to search, what specific resource(s) to use, and the parameters needed for the resource being used
- an exemplary specification 11 for patent searches of the Web refers to a Web page, for example page 107 described hereinabove. The user need not know anything about the details of the site or its specification 11, only that he wants a certain type of data source and search
- other examples of specifications 11 could be a dictionary search or a search of Broadway show listings
- Specification 11 for a dictionary search would include the URL of an on-line dictionary and would include code for creating a request for the definition of a specific word. Specification 11 for a Broadway show listing would include the URL of a site listing Broadway shows and code for selecting the requested show and finding its Web page
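As a rough sketch of how service selector 10 and engine 14 divide the work, the following models each specification as a source URL plus request-building code. The `specifications` map, the `selectSpecification` and `buildInformationRequest` helpers, the parameter names, and the `.example` URLs are illustrative assumptions rather than anything defined in the patent.

```javascript
// Hypothetical specifications: each pairs a source URL with code that
// builds a request for that source from the user's words.
const specifications = new Map([
  ["dictionary", {
    url: "https://dictionary.example/define",
    buildRequest(words) {
      return { url: this.url, params: { word: words.join(" ") } };
    },
  }],
  ["broadway", {
    url: "https://shows.example/listings",
    buildRequest(words) {
      return { url: this.url, params: { show: words.join(" ") } };
    },
  }],
]);

// Service selector role: look up the user-selected specification.
function selectSpecification(name) {
  const spec = specifications.get(name);
  if (!spec) {
    throw new Error(`unknown specification: ${name}`);
  }
  return spec;
}

// Engine role: combine the selected specification with the user's words.
function buildInformationRequest(name, words) {
  return selectSpecification(name).buildRequest(words);
}

const showRequest = buildInformationRequest("broadway", ["Cats"]);
console.log(showRequest);
```

Changing a site's URL or parameter names only touches its entry in the map; callers keep passing a specification name and words.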
- Specifications 11 can be updated as changes are made to the underlying Web pages, but this does not affect the user, since these details are hidden within the specifications 11. Using text handler 12, the user specifies the information he wants.
- the words would minimally be a patent number, whereas in the dictionary example they would include the word(s) whose definition the user wants.
- the information could be taken from a document and could include words alluding to the context within which the information is described. For example, in the dictionary example, if the unknown word is "negotiator", other information could limit the definition to the field of law.
- Text handler 12 outputs words that describe the information wanted. Text handler 12 can receive input from any application integrated into the computer desktop.
- Fig. 3 is a block diagram illustration of engine 14 of Fig. 2.
- Engine 14 comprises a specification interpreter 20, a request engine 24, and a request object 26.
- Specification interpreter 20 manages engine 14 and receives words and selected specification 11. Specification interpreter 20 interprets the lines of selected specification 11 and provides the relevant information to request engine 24. This is described in detail with respect to an exemplary specification 11 hereinbelow.
- the instruction code contained in specification 11 may be written in any language that specification interpreter 20 understands.
- code including "key words” in an Extensible Markup Language (XML) wrapper around JavaScript is used.
- the code describing the instructions for the exemplary specification 11 "patent number" is the following:
- HttpRequest ("http://128.109.179.23/cgi-bin/num_srch4?INDEX 0");
- Lines 1 - 4 of Code 1 are used by specification interpreter 20.
- the title is used to designate specification 11 to the user and the URL of line 1 refers to the site from which specification 11 will be refreshed, as described hereinbelow.
- Specification interpreter 20 passes the instructions, in lines 5 - 16 of the code, and the words, as the text and context parameters of line 5, to request engine 24 to compose request object 26, which is an abstract object encapsulating the request details.
- Request engine 24 can use any available scripting engine.
- request engine 24 interprets JavaScript. It interprets and runs the function "guidelet" referred to in line 5, instantiating, in line 10, a "req" object that is an instance of an HttpRequest object, and adding its parameters as defined in lines 11 - 15. The resulting "req" object is request object 26 that is returned to specification interpreter 20 in line 16 of the code.
- request engine 24 creates an object making use of parameters DBSELECT2, RANKTYPE, ELEMENT_SET, and QUERY.
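Since the full listing of Code 1 is not reproduced here, the following is only a sketch of what a "guidelet" function of this shape might look like. The `HttpRequest` class is a stand-in written for this example, `addParameter` is a hypothetical method name, the URL is a placeholder, and the parameter values are placeholders; only the function name, the `text` and `context` arguments, the `req` variable, and the four parameter names come from the description.

```javascript
// Stand-in for the HttpRequest object named in the text. Its real
// interface is not shown in the description, so this class is an assumption.
class HttpRequest {
  constructor(url) {
    this.url = url;
    this.parameters = [];
  }

  // addParameter is a hypothetical method name.
  addParameter(name, value) {
    this.parameters.push([name, value]);
    return this;
  }
}

// Sketch of a "guidelet": receives the user's text and context and
// returns a request object describing the search.
function guidelet(text, context) {
  const req = new HttpRequest("http://patents.example/cgi-bin/search"); // placeholder URL
  req.addParameter("DBSELECT2", "PLACEHOLDER_DB");    // placeholder value
  req.addParameter("RANKTYPE", "PLACEHOLDER_RANK");   // placeholder value
  req.addParameter("ELEMENT_SET", "PLACEHOLDER_SET"); // placeholder value
  req.addParameter("QUERY", text); // the user's words, e.g. a patent number
  return req;
}

const patentRequest = guidelet("5123456", "");
console.log(patentRequest.parameters.map(([name]) => name).join(","));
```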
- Specification interpreter 20 then uses the contents of request object 26 to construct an information request appropriate to the information source.
- the information request would be written in HTTP.
- the resulting HTTP code for the patent number example would be:
- Proxy-Connection: Keep-Alive
- User-Agent: Mozilla/4.7 [en] (WinNT; I)
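The step from an abstract request object to HTTP text can be sketched as follows. This is an illustration under assumptions, not the patent's Code 2: the shape of the request object, the `toHttpRequestText` helper, and the example URL and header values are invented for the example, and a simple HTTP/1.0 GET request is assumed.

```javascript
// Turn an abstract request object into the text of an HTTP/1.0 GET request.
// The request object shape (url, parameters, headers) is an assumption.
function toHttpRequestText(request) {
  const target = new URL(request.url);
  for (const [name, value] of request.parameters) {
    target.searchParams.append(name, value);
  }
  const lines = [
    `GET ${target.pathname}${target.search} HTTP/1.0`,
    `Host: ${target.host}`,
    ...Object.entries(request.headers).map(([name, value]) => `${name}: ${value}`),
    "", // blank line that ends the header section
    "",
  ];
  return lines.join("\r\n");
}

const sampleRequest = {
  url: "http://patents.example/cgi-bin/search", // placeholder URL
  parameters: [["QUERY", "5123456"]],
  headers: { "User-Agent": "ExampleClient/1.0" },
};
const httpText = toHttpRequestText(sampleRequest);
console.log(httpText);
```

Keeping the request abstract until this last step is what lets the same specification code target a protocol other than HTTP.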
- Code lines 1 - 18 of Code 1 hereinabove describe the instructions for a simple specification 11.
- code lines 32 - 82 of Code 3 below describe the instructions for a specification 11 in which specification interpreter 20 first manipulates the information on the client machine. Specification interpreter 20 then sends an information request to information source 16 incorporating the results of the manipulations.
- the domain name is extracted from a user given URL, and an information request for information about this domain name is made.
- Lines 70 - 80 are similar to the function guidelet in Code 1, line 5 hereinabove.
- the function "getNameDomain” is called on line 75 with the text parameter set to the URL the user entered.
- new parameters, such as "result.name" and "result.domain", can be added to "req", request object 26, on lines 76 and 77.
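Because the body of "getNameDomain" is not shown, the sketch below is one possible reading of this client-side manipulation. It assumes that "name" means the second-level label of the host and "domain" means the top-level label; that split, the `req` object shape, and the `whois.example` address are assumptions made for illustration.

```javascript
// Hypothetical reading of getNameDomain: split the host of a user-given URL
// into a name part and a top-level domain part. The patent's code is not shown.
function getNameDomain(text) {
  // Accept input with or without a scheme, e.g. "www.uspto.gov/patents".
  const withScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(text) ? text : `http://${text}`;
  const labels = new URL(withScheme).hostname.split(".");
  if (labels.length < 2) {
    throw new Error(`no domain found in "${text}"`);
  }
  return {
    name: labels[labels.length - 2],
    domain: labels[labels.length - 1],
  };
}

const result = getNameDomain("http://www.altavista.com/");

// Add the extracted values to the request object before it is sent.
const req = { url: "https://whois.example/lookup", parameters: [] }; // placeholder URL
req.parameters.push(["name", result.name], ["domain", result.domain]);
console.log(req.parameters);
```

This naive split does not handle multi-part suffixes such as `co.uk`; a real implementation would need a suffix list.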
- parameter and result manipulation can be performed on the server machine as described hereinbelow with respect to Figs. 6A - 6B.
- Service selector 10 comprises a specification handler 32 and an interface unit 40.
- Interface unit 40 comprises one or more file references 42, each file reference 42 comprising one or more specification references 44.
- File reference 42 refers to a file containing related specifications 11.
- Specification reference 44 refers to the section of code within the file that encapsulates details of one information request, in other words, one specification 11.
- Code lines 83 - 134 represent the contents of one exemplary file containing two different specifications 11, one titled "Mega Search" and the other "Golden Retriever".
- code lines 83 - 98 and 134 are used to describe file reference 42.
- Line 87 gives the title of file reference 42 used to designate this grouping of specifications 11 to the user.
- Each specification reference 44 refers to a specific section of the code.
- Lines 92 - 97 show the hierarchy of specification references 44 included in the file. In this example, there are two specification references 44: "Mega Search” and "Golden Retriever", as given by lines 93 and 95.
- Lines 94 and 96 give a "url" value which is matched in lines 99 and 116 to find the specification 11 code relating to the specific specification reference 44.
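The matching of a specification reference to its code by a shared "url" value can be sketched with a plain in-memory structure. The object layout, the file title, the `url` strings, and the `findSpecification` helper are hypothetical; only the two specification titles and the idea of matching on "url" come from the description.

```javascript
// Hypothetical in-memory form of one specification file: a title, a list of
// references, and the specification sections those references point to.
const specificationFile = {
  title: "Example searches", // placeholder title
  references: [
    { title: "Mega Search", url: "mega-search" },
    { title: "Golden Retriever", url: "golden-retriever" },
  ],
  specifications: [
    { url: "mega-search", code: "/* request-building code */" },
    { url: "golden-retriever", code: "/* request-building code */" },
  ],
};

// Resolve a user-selected reference to its specification by matching "url".
function findSpecification(file, referenceTitle) {
  const reference = file.references.find((r) => r.title === referenceTitle);
  if (!reference) {
    throw new Error(`no reference titled "${referenceTitle}"`);
  }
  const specification = file.specifications.find((s) => s.url === reference.url);
  if (!specification) {
    throw new Error(`no specification with url "${reference.url}"`);
  }
  return specification;
}

const selected = findSpecification(specificationFile, "Golden Retriever");
console.log(selected.url);
```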
- When a user chooses a specification reference 44 using interface unit 40, interface unit 40 passes the selected specification reference 44 to specification handler 32, as shown by the arrow labeled "selected reference". Specification handler 32 requests specification 11 corresponding to specification reference 44 from a local storage, as shown by arrow 34.
- the local storage contains a plurality of local specifications 35 and returns the local specification 35 corresponding to specification reference 44, as shown by arrow 38.
- the returned local specification 35 is used in specification handler 32 as selected specification 11.
- Specifications 11 can require management, for example, to update or add to currently available specifications 11. For example, specification 11 must be updated whenever changes are made to its associated information source. Further, if specification 11 refers to a broad type of search for which multiple sites exist (like the dictionary example), as better resources become available, the associated specification 11 can change to use these resources instead. There are two modes for updating specification 11. It can be done whenever a user chooses a specification 11 using interface unit 40, or it can be done in the background, by a process invoked by specification handler 32. Updates are retrieved from an information source that contains a multiplicity of remote specifications 37.
- specification handler 32 queries the original information source, typically through the Web, as shown by arrow 36. If a corresponding remote specification 37 does not have the same version number as selected specification 11, remote specification 37 is retrieved, as shown by arrow 39, and replaces selected specification 11. The retrieved remote specification 37 is then stored in the local storage as well, as shown by arrow 34, replacing the appropriate local specification 35.
- This updated selected specification, now labeled 11, is the input to specification interpreter 20 of engine 14.
- specification handler 32 performs a process similar to that just described. For each file being updated, the version number of its associated remote specification 37 is checked against the version number of local specification 35 and replaced as necessary.
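The version comparison used in both update modes can be sketched as follows. Both stores are modeled as in-memory maps for illustration; in practice the remote side would be reached over the network. The `refreshSpecification` helper and the `version` field name are assumptions.

```javascript
// Replace the local specification when the remote copy carries a different
// version number. Both stores are plain Maps here; the remote one stands in
// for a network lookup.
function refreshSpecification(name, localStore, remoteStore) {
  const local = localStore.get(name);
  const remote = remoteStore.get(name);
  if (remote && (!local || remote.version !== local.version)) {
    localStore.set(name, remote); // the remote copy replaces the local one
    return remote;
  }
  return local; // versions match, or no remote copy exists
}

const localSpecs = new Map([["dictionary", { version: 1, url: "https://old.example" }]]);
const remoteSpecs = new Map([["dictionary", { version: 2, url: "https://new.example" }]]);

const current = refreshSpecification("dictionary", localSpecs, remoteSpecs);
console.log(current.version);
```

The check is for inequality rather than "newer than", which follows the wording of the description; a supplier could therefore also roll a specification back.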
- Specification 11 may be written for an individual user, for a local group of users, or for a remote group of users. Specifications 11 can be obtained locally or from other users or suppliers, for example, from a Web site. When a user finds a specification 11 of interest, it is stored by specification handler 32 as local specification 35.
- FIG. 5 is a schematic illustration of a graphical user input device 50, useful in the system of the present invention.
- Graphical user input device 50 consists of an input area 51, a graphical interface unit 40', and a description box 55.
- Input area 51 is available to receive the words the user has selected or has typed.
- Graphical interface unit 40' has a number of graphical file references 42' listed. Those graphical file references 42' containing several graphical specification references 44' contain an arrow 53. Selecting one of graphical file references 42' containing arrow 53, for example "dictionaries", causes a pull-down list of its graphical specification references 44' to appear. When a specific graphical specification reference 44' is selected, its description box 55 appears.
- Description box 55 provides a "descriptive tip" to the user, a short sentence describing the use of graphical specification reference 44'.
- description box 55 might read "Enter word/phrase. Get meaning” for an exemplary graphical specification reference 44' called “Dictionary.com”.
- Figs. 6A, 6B, and 6C are illustrations of data flow during the execution of an information request in the information search and retrieval system of the invention.
- Some searches make use of a server 47 on a remote information source. Server 47 performs special operations and manipulates query parameters creating new information request queries.
- Fig. 6A illustrates the simplest type of information request, such as the patent search described hereinabove. In this search, all manipulation and interpretation of the text, context, and specification can be accomplished within engine 14. Thus, the information request is sent directly to a specific information resource 46 and the results are returned.
- Figs. 6B and 6C illustrate other types of information requests.
- Refinement of the request requires information on server 47, which is accessed on a remote information source.
- Lines 106 and 123 of code section 4 instruct engine 14 to send the information request to server 47 first. Examples requiring such a data flow model would be the use of a history file (which, for example, keeps a record of instructions on a computer) or a database relating to the context of the search being used to cause a search redirection or a more refined search.
- engine 14 produces an information request, which is sent to server 47.
- Server 47 uses the information available to it locally, for example the history file, to create a new information request that is sent to an external information resource 46.
- Information resource 46 then outputs the results.
- Specification reference 44 "Golden Retriever", detailed in lines 116 - 133 of Code 4 above, describes this type of search. Line 128 of the code indicates to the server what type of manipulation of the parameters is necessary.
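The Fig. 6B data flow, in which server 47 refines a request using information available only on the server, can be illustrated with a small sketch. The history format, the refinement rule (appending the most recent earlier query as context), the `refineWithHistory` helper, and the target URL are all hypothetical; the description does not specify how the history file is used.

```javascript
// Server role in the Fig. 6B flow: build a new request from an incoming one
// using local data. The history format and the refinement rule are assumptions.
function refineWithHistory(request, history) {
  const target = "https://resource.example/search"; // placeholder resource URL
  const previous = history.filter((entry) => entry.query !== request.query);
  if (previous.length === 0) {
    return { ...request, target }; // nothing to refine with
  }
  const latest = previous[previous.length - 1];
  return { ...request, target, query: `${request.query} ${latest.query}` };
}

const history = [{ query: "patent law" }, { query: "contract law" }];
const refined = refineWithHistory({ query: "negotiator" }, history);
console.log(refined.query);
```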
- Fig. 6C illustrates the most complex of the data flows. This is an example of meta searching. Such an algorithm is described in the above mentioned US provisional patent application 60/171,586, entitled "Autonomous Context-Driven Search".
- the information request is passed to server 47, in a manner similar to that explained for Fig. 6B.
- Server 47 creates one or more new information requests to be executed on one or more different information resources 46 as shown by arrows 48.
- the results of the information request(s) are returned to server 47, shown by arrows 49, and further manipulated, possibly resulting in further information requests to the same or different information resources 46.
- server 47 manipulates or combines the intermediate search results and returns the final results to the user.
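A meta search of the kind shown in Fig. 6C can be sketched as a fan-out followed by a merge. Here the information resources are stand-in functions instead of network calls, and dropping duplicate titles is just one assumed way of combining intermediate results; the `metaSearch` helper and the result shape are invented for the example.

```javascript
// Stand-ins for information resources 46: plain functions, not network calls.
const resources = [
  (query) => [
    { title: `${query} overview`, source: "A" },
    { title: `${query} FAQ`, source: "A" },
  ],
  (query) => [{ title: `${query} overview`, source: "B" }],
];

// Server role: send the query to every resource, then merge the results,
// keeping the first hit seen for each title.
function metaSearch(query, resourceList) {
  const seen = new Set();
  const combined = [];
  for (const resource of resourceList) {
    for (const hit of resource(query)) {
      if (!seen.has(hit.title)) {
        seen.add(hit.title);
        combined.push(hit);
      }
    }
  }
  return combined;
}

const hits = metaSearch("patent", resources);
console.log(hits.map((hit) => hit.title));
```

A real server would issue the requests concurrently and might issue further requests based on the intermediate results, as the description notes.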
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU20217/01A AU2021701A (en) | 1999-12-23 | 2000-12-21 | Information search and retrieval system |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17158699P | 1999-12-23 | 1999-12-23 | |
US60/171,586 | 1999-12-23 | ||
US52456900A | 2000-03-13 | 2000-03-13 | |
US09/524,569 | 2000-03-13 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2001048579A2 (en) | 2001-07-05 |
WO2001048579A3 (en) | 2002-03-21 |
Family
ID=26867229
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IL2000/000851 WO2001048579A2 (en) | 1999-12-23 | 2000-12-21 | Information search and retrieval system |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2021701A (en) |
WO (1) | WO2001048579A2 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5715444A (en) * | 1994-10-14 | 1998-02-03 | Danish; Mohamed Sherif | Method and system for executing a guided parametric search |
US5907838A (en) * | 1996-12-10 | 1999-05-25 | Seiko Epson Corporation | Information search and collection method and system |
US5983216A (en) * | 1997-09-12 | 1999-11-09 | Infoseek Corporation | Performing automated document collection and selection by providing a meta-index with meta-index values indentifying corresponding document collections |
US6005565A (en) * | 1997-03-25 | 1999-12-21 | Sony Corporation | Integrated search of electronic program guide, internet and other information resources |
-
2000
- 2000-12-21 WO PCT/IL2000/000851 patent/WO2001048579A2/en active Application Filing
- 2000-12-21 AU AU20217/01A patent/AU2021701A/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
WO2001048579A3 (en) | 2002-03-21 |
AU2021701A (en) | 2001-07-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
AK | Designated states |
Kind code of ref document: A3 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A3 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
122 | Ep: pct application non-entry in european phase | ||
NENP | Non-entry into the national phase |
Ref country code: JP |