WO2010076785A4 - System and method for aggregating data from a plurality of web sites - Google Patents
- Publication number: WO2010076785A4
- Application number: PCT/IL2009/001218
- Authority: WO, WIPO (PCT)
- Prior art keywords: record, records, analyzing, geometrical, data
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
RU2011130218/08A RU2011130218A (en) | 2008-12-31 | 2009-12-27 | System and method for aggregating data from multiple websites |
JP2011542972A JP5501373B2 (en) | 2008-12-31 | 2009-12-27 | System and method for collecting and ranking data from multiple websites |
EP09807502A EP2380099A1 (en) | 2008-12-31 | 2009-12-27 | System and method for aggregating data from a plurality of web sites |
CN2009801568512A CN102317937A (en) | 2008-12-31 | 2009-12-27 | System and method for aggregating and ranking data from a plurality of web sites |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US61/193,862 (US19386208P) | 2008-12-31 | 2008-12-31 | |
US12/567,773 US8880498B2 (en) | 2008-12-31 | 2009-09-27 | System and method for aggregating and ranking data from a plurality of web sites |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2010076785A1 (en) | 2010-07-08 |
WO2010076785A4 (en) | 2010-10-07 |
Family
ID=42286118
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IL2009/001218 WO2010076785A1 (en) | 2008-12-31 | 2009-12-27 | System and method for aggregating data from a plurality of web sites |
Country Status (6)
Country | Link |
---|---|
US (2) | US8880498B2 (en) |
EP (1) | EP2380099A1 (en) |
JP (1) | JP5501373B2 (en) |
CN (1) | CN102317937A (en) |
RU (1) | RU2011130218A (en) |
WO (1) | WO2010076785A1 (en) |
Families Citing this family (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006108069A2 (en) * | 2005-04-06 | 2006-10-12 | Google, Inc. | Searching through content which is accessible through web-based forms |
US10380652B1 (en) | 2008-10-18 | 2019-08-13 | Clearcapital.Com, Inc. | Method and system for providing a home data index model |
US8484286B1 (en) * | 2009-11-16 | 2013-07-09 | Hydrabyte, Inc | Method and system for distributed collecting of information from a network |
WO2012006509A1 (en) * | 2010-07-09 | 2012-01-12 | Google Inc. | Table search using recovered semantic information |
US9183573B2 (en) | 2011-06-03 | 2015-11-10 | Facebook, Inc. | Überfeed |
US20130019195A1 (en) * | 2011-07-12 | 2013-01-17 | Oracle International Corporation | Aggregating multiple information sources (dashboard4life) |
US10083247B2 (en) | 2011-10-01 | 2018-09-25 | Oracle International Corporation | Generating state-driven role-based landing pages |
US10210465B2 (en) * | 2011-11-11 | 2019-02-19 | Facebook, Inc. | Enabling preference portability for users of a social networking system |
US9672252B2 (en) | 2012-03-08 | 2017-06-06 | Hewlett-Packard Development Company, L.P. | Identifying and ranking solutions from multiple data sources |
US20130238972A1 (en) * | 2012-03-09 | 2013-09-12 | Nathan Woodman | Look-alike website scoring |
US8688713B1 (en) * | 2012-03-22 | 2014-04-01 | Google Inc. | Resource identification from organic and structured content |
US20130311440A1 (en) * | 2012-05-15 | 2013-11-21 | International Business Machines Corporation | Comparison search queries |
CN102750372A (en) * | 2012-06-15 | 2012-10-24 | 翁时锋 | Analytical method for automatically acquiring webpage structured information |
US9582494B2 (en) | 2013-02-22 | 2017-02-28 | Altilia S.R.L. | Object extraction from presentation-oriented documents using a semantic and spatial approach |
US9733638B2 (en) * | 2013-04-05 | 2017-08-15 | Symbotic, LLC | Automated storage and retrieval system and control system thereof |
US9317873B2 (en) | 2014-03-28 | 2016-04-19 | Google Inc. | Automatic verification of advertiser identifier in advertisements |
US11080777B2 (en) * | 2014-03-31 | 2021-08-03 | Monticello Enterprises LLC | System and method for providing a social media shopping experience |
US20150287099A1 (en) | 2014-04-07 | 2015-10-08 | Google Inc. | Method to compute the prominence score to phone numbers on web pages and automatically annotate/attach it to ads |
US11115529B2 (en) | 2014-04-07 | 2021-09-07 | Google Llc | System and method for providing and managing third party content with call functionality |
US10817884B2 (en) * | 2014-05-08 | 2020-10-27 | Google Llc | Building topic-oriented audiences |
JP6386089B2 (en) | 2014-06-26 | 2018-09-05 | グーグル エルエルシー | Optimized browser rendering process |
CN106462582B (en) | 2014-06-26 | 2020-05-15 | 谷歌有限责任公司 | Batch optimized rendering and fetching architecture |
KR102133486B1 (en) * | 2014-06-26 | 2020-07-13 | 구글 엘엘씨 | Optimized browser rendering process |
US20160048548A1 (en) * | 2014-08-13 | 2016-02-18 | Microsoft Corporation | Population of graph nodes |
US10529031B2 (en) * | 2014-09-25 | 2020-01-07 | Sai Suresh Ganesamoorthi | Method and systems of implementing a ranked health-content article feed |
US20160125081A1 (en) * | 2014-10-31 | 2016-05-05 | Yahoo! Inc. | Web crawling |
US10083295B2 (en) * | 2014-12-23 | 2018-09-25 | Mcafee, Llc | System and method to combine multiple reputations |
US10643258B2 (en) * | 2014-12-24 | 2020-05-05 | Keep Holdings, Inc. | Determining commerce entity pricing and availability based on stylistic heuristics |
WO2017115272A1 (en) * | 2015-12-28 | 2017-07-06 | Sixgill Ltd. | Dark web monitoring, analysis and alert system and method |
US10469424B2 (en) | 2016-10-07 | 2019-11-05 | Google Llc | Network based data traffic latency reduction |
US11023526B2 (en) * | 2017-06-02 | 2021-06-01 | International Business Machines Corporation | System and method for graph search enhancement |
US11461829B1 (en) | 2019-06-27 | 2022-10-04 | Amazon Technologies, Inc. | Machine learned system for predicting item package quantity relationship between item descriptions |
JP7002804B2 (en) | 2019-12-13 | 2022-01-20 | 翼 加藤 | Search device, search application and search method |
CN111291155A (en) * | 2020-01-17 | 2020-06-16 | 青梧桐有限责任公司 | Method and system for identifying homonymous cells based on text similarity |
CN112734165A (en) * | 2020-12-18 | 2021-04-30 | 中国平安财产保险股份有限公司 | Intelligent function display method, device, equipment and storage medium |
Family Cites Families (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5659732A (en) * | 1995-05-17 | 1997-08-19 | Infoseek Corporation | Document retrieval over networks wherein ranking and relevance scores are computed at the client for multiple database documents |
US6067552A (en) * | 1995-08-21 | 2000-05-23 | Cnet, Inc. | User interface system and method for browsing a hypertext database |
US6012053A (en) * | 1997-06-23 | 2000-01-04 | Lycos, Inc. | Computer system with user-controlled relevance ranking of search results |
US6275820B1 (en) * | 1998-07-16 | 2001-08-14 | Perot Systems Corporation | System and method for integrating search results from heterogeneous information resources |
AU4712601A (en) * | 1999-12-08 | 2001-07-03 | Amazon.Com, Inc. | System and method for locating and displaying web-based product offerings |
US7240067B2 (en) * | 2000-02-08 | 2007-07-03 | Sybase, Inc. | System and methodology for extraction and aggregation of data from dynamic content |
WO2001075664A1 (en) * | 2000-03-31 | 2001-10-11 | Kapow Aps | Method of retrieving attributes from at least two data sources |
US7346858B1 (en) * | 2000-07-24 | 2008-03-18 | The Hive Group | Computer hierarchical display of multiple data characteristics |
JP2002108846A (en) * | 2000-09-27 | 2002-04-12 | Fuji Xerox Co Ltd | Device/method for processing document image and recording medium |
US7231381B2 (en) * | 2001-03-13 | 2007-06-12 | Microsoft Corporation | Media content search engine incorporating text content and user log mining |
JP2003216647A (en) * | 2002-01-18 | 2003-07-31 | Matsushita Electric Ind Co Ltd | Merchandise search device for use in cyber store, cyber store service providing device, media, and information assembly |
US7246306B2 (en) * | 2002-06-21 | 2007-07-17 | Microsoft Corporation | Web information presentation structure for web page authoring |
JP4370783B2 (en) * | 2002-06-27 | 2009-11-25 | 沖電気工業株式会社 | Information processing apparatus and method |
US7251648B2 (en) * | 2002-06-28 | 2007-07-31 | Microsoft Corporation | Automatically ranking answers to database queries |
US20060047649A1 (en) * | 2003-12-29 | 2006-03-02 | Ping Liang | Internet and computer information retrieval and mining with intelligent conceptual filtering, visualization and automation |
US7330608B2 (en) * | 2004-12-22 | 2008-02-12 | Ricoh Co., Ltd. | Semantic document smartnails |
US7672958B2 (en) * | 2005-01-14 | 2010-03-02 | Im2, Inc. | Method and system to identify records that relate to a pre-defined context in a data set |
EP1866806A1 (en) * | 2005-03-09 | 2007-12-19 | Medio Systems, Inc. | Method and system for active ranking of browser search engine results |
WO2006108069A2 (en) * | 2005-04-06 | 2006-10-12 | Google, Inc. | Searching through content which is accessible through web-based forms |
US20060282455A1 (en) * | 2005-06-13 | 2006-12-14 | It Interactive Services Inc. | System and method for ranking web content |
US20070078814A1 (en) * | 2005-10-04 | 2007-04-05 | Kozoru, Inc. | Novel information retrieval systems and methods |
US8065286B2 (en) * | 2006-01-23 | 2011-11-22 | Chacha Search, Inc. | Scalable search system using human searchers |
US20070208732A1 (en) * | 2006-02-07 | 2007-09-06 | Future Vistas, Inc. | Telephonic information retrieval systems and methods |
US20070294240A1 (en) * | 2006-06-07 | 2007-12-20 | Microsoft Corporation | Intent based search |
US20080033996A1 (en) * | 2006-08-03 | 2008-02-07 | Anandsudhakar Kesari | Techniques for approximating the visual layout of a web page and determining the portion of the page containing the significant content |
US8510298B2 (en) * | 2006-08-04 | 2013-08-13 | Thefind, Inc. | Method for relevancy ranking of products in online shopping |
US7917492B2 (en) * | 2007-09-21 | 2011-03-29 | Limelight Networks, Inc. | Method and subsystem for information acquisition and aggregation to facilitate ontology and language-model generation within a content-search-service system |
US20080098300A1 (en) * | 2006-10-24 | 2008-04-24 | Brilliant Shopper, Inc. | Method and system for extracting information from web pages |
US8707167B2 (en) * | 2006-11-15 | 2014-04-22 | Ebay Inc. | High precision data extraction |
US7930302B2 (en) * | 2006-11-22 | 2011-04-19 | Intuit Inc. | Method and system for analyzing user-generated content |
JP5056133B2 (en) * | 2007-04-13 | 2012-10-24 | 日本電気株式会社 | Information extraction system, information extraction method, and information extraction program |
US8392446B2 (en) | 2007-05-31 | 2013-03-05 | Yahoo! Inc. | System and method for providing vector terms related to a search query |
US20090077180A1 (en) * | 2007-09-14 | 2009-03-19 | Flowers John S | Novel systems and methods for transmitting syntactically accurate messages over a network |
US8117208B2 (en) | 2007-09-21 | 2012-02-14 | The Board Of Trustees Of The University Of Illinois | System for entity search and a method for entity scoring in a linked document database |
KR100938830B1 (en) | 2007-12-18 | 2010-01-26 | 한국과학기술정보연구원 | Method constructing knowledge base and thereof server |
US20090265611A1 (en) * | 2008-04-18 | 2009-10-22 | Yahoo! Inc. | Web page layout optimization using section importance |
US20100169352A1 (en) * | 2008-12-31 | 2010-07-01 | Flowers John S | Novel systems and methods for transmitting syntactically accurate messages over a network |
US8874552B2 (en) | 2009-11-29 | 2014-10-28 | Rinor Technologies Inc. | Automated generation of ontologies |
Date | Country | Application Number | Publication | Status |
---|---|---|---|---|
2009-09-27 | US | US12/567,773 | US8880498B2 | Expired - Fee Related (not active) |
2009-12-27 | CN | CN2009801568512A | CN102317937A | Pending (active) |
2009-12-27 | RU | RU2011130218/08A | RU2011130218A | Unknown |
2009-12-27 | EP | EP09807502A | EP2380099A1 | Ceased (not active) |
2009-12-27 | WO | PCT/IL2009/001218 | WO2010076785A1 | Application Filing (active) |
2009-12-27 | JP | JP2011542972A | JP5501373B2 | Expired - Fee Related (not active) |
2014-09-28 | US | US14/499,188 | US9430569B2 | Active |
Also Published As
Publication number | Publication date |
---|---|
CN102317937A (en) | 2012-01-11 |
JP5501373B2 (en) | 2014-05-21 |
US9430569B2 (en) | 2016-08-30 |
JP2013515977A (en) | 2013-05-09 |
US20100169301A1 (en) | 2010-07-01 |
RU2011130218A (en) | 2013-02-10 |
EP2380099A1 (en) | 2011-10-26 |
US8880498B2 (en) | 2014-11-04 |
WO2010076785A1 (en) | 2010-07-08 |
US20150134636A1 (en) | 2015-05-14 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
WO2010076785A4 (en) | System and method for aggregating data from a plurality of web sites | |
Gauch et al. | ProFusion*: Intelligent fusion from multiple, distributed search engines | |
US8117208B2 (en) | System for entity search and a method for entity scoring in a linked document database | |
US8161050B2 (en) | Visualizing hyperlinks in a search results list | |
Barbosa et al. | Organizing hidden-web databases by clustering visible web documents | |
US20070162448A1 (en) | Adaptive hierarchy structure ranking algorithm | |
US20100299343A1 (en) | Identifying Task Groups for Organizing Search Results | |
US9460207B2 (en) | Automated database generation for answering fact lookup queries | |
WO2005031614A1 (en) | Systems and methods for clustering search results | |
JP2000339350A (en) | Multi-mode information access | |
US9405803B2 (en) | Ranking signals in mixed corpora environments | |
CN111506727B (en) | Text content category acquisition method, apparatus, computer device and storage medium | |
Radu et al. | A hybrid machine-crowd approach to photo retrieval result diversification | |
Tsai | A review of image retrieval methods for digital cultural heritage resources | |
US9779140B2 (en) | Ranking signals for sparse corpora | |
KR19990048712A (en) | Map Type Classification Search Method | |
WO2001039008A1 (en) | Method and system for collecting topically related resources | |
Sathya et al. | Link based K-Means clustering algorithm for information retrieval | |
Bokhari et al. | A new criterion for evaluating news search systems | |
Yoshida et al. | Query transformation by visualizing and utilizing information about what users are or are not searching | |
Yadav et al. | Ontdr: An ontology-based augmented method for document retrieval | |
AU5126700A (en) | Method and system for creating a topical data structure | |
Umesh et al. | Web images evaluations based on visual content | |
Vadivu et al. | Ranking images in web documents based on HTML TAGs for image retrieval from WWW | |
Vadivu et al. | Image Retrieval From WWW Using Attributes in HTML TAGs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | WIPO information: entry into national phase | Ref document number: 200980156851.2; Country of ref document: CN |
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 09807502; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2011542972; Country of ref document: JP; Kind code of ref document: A |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWE | WIPO information: entry into national phase | Ref document number: 2009807502; Country of ref document: EP |
| WWE | WIPO information: entry into national phase | Ref document number: 2011130218; Country of ref document: RU |