WO2002027562A3 - Method and apparatus to retrieve information from a network - Google Patents

Info

Publication number
WO2002027562A3
Authority
WO
WIPO (PCT)
Prior art keywords
links
weighted
relevant
additional
files
Prior art date
Application number
PCT/US2001/030584
Other languages
French (fr)
Other versions
WO2002027562A2 (en)
Inventor
Gabriel J Kaigham
Evan M Indianer
Christopher M Umbel
Joel Lenhart
Original Assignee
Ninesigma Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-09-29
Filing date
2001-09-28
Publication date
2003-10-09
Application filed by Ninesigma Inc
Priority to AU2001293193A
Publication of WO2002027562A2
Publication of WO2002027562A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access

Abstract

A method and apparatus to index network information is described. A network is searched for files of information relevant to people and resources in a particular field using a search list of weighted links to said files. The information is parsed into content and additional links to additional files. The content is weighted and copied to memory (such as a database). A determination is made as to whether the additional links are relevant to the people and resources in the given technical field. Those additional links that are relevant are weighted using a predetermined weighting algorithm. The relevant additional weighted links are copied to the search list. The process continues until an ending condition occurs.
PCT/US2001/030584 2000-09-29 2001-09-28 Method and apparatus to retrieve information from a network WO2002027562A2 (en)
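
The loop described in the abstract (take a link from a weighted search list, parse the file into content and additional links, weight and store the content, then weight the relevant additional links and add them to the search list) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not disclose the "predetermined weighting algorithm", so `keyword_weight` (a keyword-overlap score), the rule that a link inherits the weight of the file it was found in, the `min_weight` relevance threshold, the `max_files` ending condition, and the in-memory `pages` network are all hypothetical.

```python
import heapq
from typing import Callable, Dict, Iterable, List, Tuple

# A fetched file, already parsed into (content, additional links).
Page = Tuple[str, List[str]]


def keyword_weight(text: str, keywords: Iterable[str]) -> float:
    """Assumed weighting: fraction of the field keywords present in the text."""
    words = set(text.lower().split())
    keys = [k.lower() for k in keywords]
    return sum(k in words for k in keys) / len(keys) if keys else 0.0


def crawl(
    seeds: Dict[str, float],
    fetch: Callable[[str], Page],
    keywords: List[str],
    min_weight: float = 0.1,
    max_files: int = 100,
) -> Dict[str, Tuple[float, str]]:
    """Follow the heaviest link first; return {url: (content weight, content)}."""
    # Search list of weighted links. heapq is a min-heap, so weights are negated.
    search_list = [(-weight, url) for url, weight in seeds.items()]
    heapq.heapify(search_list)
    seen = set(seeds)
    store: Dict[str, Tuple[float, str]] = {}  # stands in for the database

    # Assumed ending condition: the list is exhausted or the file budget is spent.
    while search_list and len(store) < max_files:
        _, url = heapq.heappop(search_list)
        content, links = fetch(url)
        content_weight = keyword_weight(content, keywords)
        store[url] = (content_weight, content)  # content is weighted and copied
        for link in links:
            if link in seen:
                continue
            seen.add(link)
            # Assumption: an additional link inherits the weight of its parent
            # file, and counts as relevant only at or above min_weight.
            if content_weight >= min_weight:
                heapq.heappush(search_list, (-content_weight, link))
    return store


# Hypothetical five-file network used only to exercise the sketch.
pages: Dict[str, Page] = {
    "a": ("sensor research lab staff", ["b", "c"]),
    "b": ("mems sensor engineer resume", ["d"]),
    "c": ("cooking recipes and travel", ["e"]),
    "d": ("sensor fabrication equipment", []),
    "e": ("more recipes", []),
}

result = crawl({"a": 1.0}, lambda u: pages.get(u, ("", [])),
               ["sensor", "mems", "fabrication"])
print(list(result))
```

In this toy run the file "e" is never fetched, because it is linked only from a file whose content scored zero against the keywords. A priority queue is one simple way to keep the search list ordered by weight; the source does not say how the list is stored.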

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001293193A AU2001293193A1 (en) 2000-09-29 2001-09-28 Method and apparatus to retrieve information from a network

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/675,594 2000-09-29
US09/675,594 US6584468B1 (en) 2000-09-29 2000-09-29 Method and apparatus to retrieve information from a network

Publications (2)

Publication Number Publication Date
WO2002027562A2 WO2002027562A2 (en) 2002-04-04
WO2002027562A3 WO2002027562A3 (en) 2003-10-09

Family

ID=24711174

Family Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2001/030584 WO2002027562A2 (en) 2000-09-29 2001-09-28 Method and apparatus to retrieve information from a network

Country Status (3)

Country Link
US (1) US6584468B1 (en)
AU (1) AU2001293193A1 (en)
WO (1) WO2002027562A2 (en)

Families Citing this family (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7596755B2 (en) * 1997-12-22 2009-09-29 Ricoh Company, Ltd. Multimedia visualization and integration environment
EP1348168A1 (en) 2000-10-24 2003-10-01 Singingfish.Com, Inc. Method of collecting data using an embedded media player page
US8122236B2 (en) 2001-10-24 2012-02-21 Aol Inc. Method of disseminating advertisements using an embedded media player page
US20020103920A1 (en) * 2000-11-21 2002-08-01 Berkun Ken Alan Interpretive stream metadata extraction
US7062486B2 (en) * 2000-12-05 2006-06-13 International Business Machines Corporation Method, system and program product for enabling authorized access and request-initiated translation of data files
US7047482B1 (en) 2001-02-28 2006-05-16 Gary Odom Automatic directory supplementation
US7272594B1 (en) 2001-05-31 2007-09-18 Autonomy Corporation Ltd. Method and apparatus to link to a related document
CN1589445B (en) * 2001-11-19 2010-04-28 富士通株式会社 Information navigation system
US20040064500A1 (en) * 2001-11-20 2004-04-01 Kolar Jennifer Lynn System and method for unified extraction of media objects
US8527495B2 (en) * 2002-02-19 2013-09-03 International Business Machines Corporation Plug-in parsers for configuring search engine crawler
US7949648B2 (en) * 2002-02-26 2011-05-24 Soren Alain Mortensen Compiling and accessing subject-specific information from a computer network
US7743045B2 (en) 2005-08-10 2010-06-22 Google Inc. Detecting spam related and biased contexts for programmable search engines
US7716199B2 (en) 2005-08-10 2010-05-11 Google Inc. Aggregating context data for programmable search engines
US7693830B2 (en) 2005-08-10 2010-04-06 Google Inc. Programmable search engine
CN1628452B (en) * 2002-05-17 2010-09-01 株式会社Ntt都科摩 De-fragmentation of transmission sequences
US7024405B2 (en) * 2002-07-18 2006-04-04 The United States Of America As Represented By The Secretary Of The Air Force Method and apparatus for improved internet searching
JP3997412B2 (en) * 2002-11-13 2007-10-24 ソニー株式会社 Information processing apparatus and method, recording medium, and program
US7028029B2 (en) * 2003-03-28 2006-04-11 Google Inc. Adaptive computation of ranking
US7216123B2 (en) * 2003-03-28 2007-05-08 Board Of Trustees Of The Leland Stanford Junior University Methods for ranking nodes in large directed graphs
DE10319427A1 (en) * 2003-04-29 2004-12-02 Contraco Consulting & Software Ltd. Method for creating short data records characteristic of data records from a database, in particular from the World Wide Web, method for determining data records relevant to a specifiable search query from a database and search system for carrying out the method
US8321278B2 (en) * 2003-09-30 2012-11-27 Google Inc. Targeted advertisements based on user profiles and page profile
US20050222989A1 (en) * 2003-09-30 2005-10-06 Taher Haveliwala Results based personalization of advertisements in a search engine
US20050071328A1 (en) * 2003-09-30 2005-03-31 Lawrence Stephen R. Personalization of web search
GB2411014A (en) * 2004-02-11 2005-08-17 Autonomy Corp Ltd Automatic searching for relevant information
US7716223B2 (en) 2004-03-29 2010-05-11 Google Inc. Variable personalization of search results in a search engine
US7565630B1 (en) 2004-06-15 2009-07-21 Google Inc. Customization of search results for search queries received from third party sites
US7340672B2 (en) * 2004-09-20 2008-03-04 Intel Corporation Providing data integrity for data streams
US20070073894A1 (en) * 2005-09-14 2007-03-29 O Ya! Inc. Networked information indexing and search apparatus and method
CN1790335A (en) * 2005-12-19 2006-06-21 无锡永中科技有限公司 XML file data access method
US9633356B2 (en) 2006-07-20 2017-04-25 Aol Inc. Targeted advertising for playlists based upon search queries
US9165040B1 (en) 2006-10-12 2015-10-20 Google Inc. Producing a ranking for pages using distances in a web-link graph
US8156056B2 (en) * 2007-04-03 2012-04-10 Fernando Luege Mateos Method and system of classifying, ranking and relating information based on weights of network links
US9477719B2 (en) * 2008-08-28 2016-10-25 Oracle International Corporation Search using business intelligence dimensions
US9836538B2 (en) * 2009-03-03 2017-12-05 Microsoft Technology Licensing, Llc Domain-based ranking in document search
US9529915B2 (en) * 2011-06-16 2016-12-27 Microsoft Technology Licensing, Llc Search results based on user and result profiles
US9436726B2 (en) 2011-06-23 2016-09-06 BCM International Regulatory Analytics LLC System, method and computer program product for a behavioral database providing quantitative analysis of cross border policy process and related search capabilities
US9323767B2 (en) 2012-10-01 2016-04-26 Longsand Limited Performance and scalability in an intelligent data operating layer system
US9910899B1 (en) * 2014-09-03 2018-03-06 State Farm Mutual Automobile Insurance Company Systems and methods for electronically mining intellectual property
CN116186368B (en) * 2023-03-17 2023-11-14 广东朝恒科技有限公司 Data crawling method and system

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5724567A (en) 1994-04-25 1998-03-03 Apple Computer, Inc. System for directing relevance-ranked data objects to computer users
US5758257A (en) 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US5855015A (en) * 1995-03-20 1998-12-29 Interval Research Corporation System and method for retrieval of hyperlinked information resources
US6067552A (en) 1995-08-21 2000-05-23 Cnet, Inc. User interface system and method for browsing a hypertext database
US5974409A (en) 1995-08-23 1999-10-26 Microsoft Corporation System and method for locating information in an on-line network
US5867799A (en) 1996-04-04 1999-02-02 Lang; Andrew K. Information system and method for filtering a massive flow of information entities to meet user information classification needs
US5903892A (en) * 1996-05-24 1999-05-11 Magnifi, Inc. Indexing of media content on a network
US5913208A (en) 1996-07-09 1999-06-15 International Business Machines Corporation Identifying duplicate documents from search results without comparing document content
US5842206A (en) 1996-08-20 1998-11-24 Iconovex Corporation Computerized method and system for qualified searching of electronically stored documents
US6085186A (en) 1996-09-20 2000-07-04 Netbot, Inc. Method and system using information written in a wrapper description language to execute query on a network
EP0945811B1 (en) 1996-10-23 2003-01-22 Access Co., Ltd. Information apparatus having automatic web reading function
GB2331166B (en) * 1997-11-06 2002-09-11 Ibm Database search engine
US6078914A (en) 1996-12-09 2000-06-20 Open Text Corporation Natural language meta-search system and method
US5966126A (en) 1996-12-23 1999-10-12 Szabo; Andrew J. Graphic user interface for database system
US5875446A (en) 1997-02-24 1999-02-23 International Business Machines Corporation System and method for hierarchically grouping and ranking a set of objects in a query context based on one or more relationships
US5987454A (en) * 1997-06-09 1999-11-16 Hobbs; Allen Method and apparatus for selectively augmenting retrieved text, numbers, maps, charts, still pictures and/or graphics, moving pictures and/or graphics and audio information from a network resource
US6078917A (en) 1997-12-18 2000-06-20 International Business Machines Corporation System for searching internet using automatic relevance feedback
US6055538A (en) 1997-12-22 2000-04-25 Hewlett Packard Company Methods and system for using web browser to search large collections of documents
US5983221A (en) 1998-01-13 1999-11-09 Wordstream, Inc. Method and apparatus for improved document searching
US6421675B1 (en) 1998-03-16 2002-07-16 S. L. I. Systems, Inc. Search engine
US6038574A (en) 1998-03-18 2000-03-14 Xerox Corporation Method and apparatus for clustering a collection of linked documents using co-citation analysis
AU3874899A (en) 1998-05-01 1999-11-23 Citizen 1 Software, Inc. Method and apparatus for simultaneously accessing a plurality of dispersed databases
US6356899B1 (en) * 1998-08-29 2002-03-12 International Business Machines Corporation Method for interactively creating an information database including preferred information elements, such as preferred-authority, world wide web pages
WO2000038086A2 (en) 1998-12-08 2000-06-29 Livetechnology (Pty) Ltd. Information network search engine
US6434556B1 (en) * 1999-04-16 2002-08-13 Board Of Trustees Of The University Of Illinois Visualization of Internet search information
US6295559B1 (en) * 1999-08-26 2001-09-25 International Business Machines Corporation Rating hypermedia for objectionable content
US6389467B1 (en) * 2000-01-24 2002-05-14 Friskit, Inc. Streaming media search and continuous playback system of media resources located by multiple network addresses
US6438539B1 (en) * 2000-02-25 2002-08-20 Agents-4All.Com, Inc. Method for retrieving data from an information network through linking search criteria to search strategy

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
DE BRA P M E ET AL: "Information retrieval in the World-Wide Web: Making client-based searching feasible", COMPUTER NETWORKS AND ISDN SYSTEMS, NORTH HOLLAND PUBLISHING. AMSTERDAM, NL, vol. 27, no. 2, 1 November 1994 (1994-11-01), pages 183 - 192, XP004037989, ISSN: 0169-7552 *
HERSOVICI M ET AL: "The shark-search algorithm. An application: tailored Web site mapping", COMPUTER NETWORKS AND ISDN SYSTEMS, NORTH HOLLAND PUBLISHING. AMSTERDAM, NL, vol. 30, no. 1-7, 1 April 1998 (1998-04-01), pages 317 - 326, XP004121415, ISSN: 0169-7552 *
JUNGHOO C ET AL: "Efficient crawling through URL ordering", COMPUTER NETWORKS AND ISDN SYSTEMS, NORTH HOLLAND PUBLISHING. AMSTERDAM, NL, vol. 30, no. 1-7, 1 April 1998 (1998-04-01), pages 161 - 172, XP004121430, ISSN: 0169-7552 *
SOUMEN CHAKRABARTI ET AL: "Distributed hypertext resource discovery through examples", PROCEEDINGS OF 25TH INTERNATIONAL CONFERENCE ON VERY LARGE DATABASES, EDINBURGH, UK, 7 September 1999 (1999-09-07) - 10 September 1999 (1999-09-10), 1999, Orlando, FL, USA, Morgan Kaufmann Publishers, USA, pages 375 - 386, XP002250581 *

Also Published As

Publication number Publication date
AU2001293193A1 (en) 2002-04-08
WO2002027562A2 (en) 2002-04-04
US6584468B1 (en) 2003-06-24

Similar Documents

Publication Publication Date Title
WO2002027562A3 (en) Method and apparatus to retrieve information from a network
US20180374491A1 (en) Systems and Methods for Recognizing Sound and Music Signals in High Noise and Distortion
US5855015A (en) System and method for retrieval of hyperlinked information resources
US8650025B2 (en) Method and apparatus for determining text passage similarity
WO2001084374A3 (en) Information access method
JP2001519952A (en) Data summarization device
US20040167876A1 (en) Method and apparatus for improved web scraping
WO2005070019A3 (en) Contextual searching
CN101452470A (en) Method and apparatus for a web search engine generating summary-style search results
CN102622445A (en) User interest perception based webpage push system and webpage push method
WO2003107127A3 (en) System and method for personalized information retrieval based on user expertise
WO2002027541A1 (en) A method and apparatus for concept-based searching across a network
Williams et al. What's Next? Index Structures for Efficient Phrase Querying.
CN110795627A (en) Information recommendation method and device and electronic equipment
CA2353533A1 (en) Search engine for video and graphics
Hannappel et al. MSEEC-a multi search engine with multiple clustering
Gey et al. Term importance, Boolean conjunct training, negative terms, and foreign language retrieval: probabilistic algorithms at TREC-5.
CN111723378B (en) Website directory blasting method based on website map
CN108280085A (en) The method and device of data deduplication
EA200100467A1 (en) METHOD OF SEARCHING FOR STORAGE OF ELECTRON DOCUMENTS AND THEIR FRAGMENTS ON STORAGE DEVICES
KR20010092899A (en) Method of mpeg-7 meta data hiding and detection to retrieve multimedia for multimedia indexing retrieval system
WO1998052130A1 (en) Text retrieval method
WO2001065416A3 (en) Probabilistic matching engine
WO2002003311A3 (en) File search service system and method through the internet
US20040049496A1 (en) Interactive searching system and method

Legal Events

Date Code Title Description
AK Designated states
    Kind code of ref document: A2
    Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW
AL Designated countries for regional patents
    Kind code of ref document: A2
    Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 2004-01-01)
REG Reference to national code
    Ref country code: DE
    Ref legal event code: 8642
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase
    Ref country code: JP