CN1755677B - System and method for scoping searches using index keys - Google Patents

System and method for scoping searches using index keys

Info

Publication number
CN1755677B
Authority
CN
China
Prior art keywords
documents
scope
relative index
index
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2005100882120A
Other languages
English (en)
Other versions
CN1755677A (zh)
Inventor
C·C·梅里耿
D·J·李
D·梅耶松
K·G·佩尔托宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Publication of CN1755677A
Application granted
Publication of CN1755677B
Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99932 Access augmentation or optimizing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99936 Pattern matching access
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99943 Generating database or data structure, e.g. via user interface

Abstract

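An additional set of index keys is included in an index search system; these keys are associated with the scope of a search rather than with the content of the documents that are the target of the search. These scope-related index keys, or scope keys, allow the scope of the search to be selected, reducing the number of documents that a search must sift through to obtain results. Furthermore, compound scopes are recognized and stored, so that indexes of complex search scopes are available and searches based on those complex scopes do not have to be rehashed.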

Description

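System and method for scoping searches using index keys
Background
Searches for content among networks and file systems have been provided in many forms, but most commonly by some variant of a search engine. A search engine is a program that searches documents for specified keywords and returns a list of the documents in which the keywords were found.
Typically, a search engine works by sending out a spider to fetch as many documents as possible. Another program, called an indexer, then reads these documents and creates an index based on the words contained in each document. An index is a key or a list of keys, each of which uniquely identifies a record. Indexes make it faster to find specific records and to sort records by the index field. Each search engine typically uses its own algorithm to create its indexes so that, ideally, only meaningful results are returned for each query made by a client or user.
One fairly consistent aspect of these queries is the use of keywords or index keys. Whether the user enters a search query as a long text string or as keywords joined by Boolean operators, these search engines examine all the records of documents that match the string or that correspond to the keywords entered. A subset of the records is then returned that satisfies the constraints of the Boolean operands or that corresponds to the long string. Examining these records can be a time-consuming and expensive task. In addition, the client may not need the full record of documents that contain a particular keyword.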
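Summary
Embodiments of the present invention relate to a system and method that address the above limitations by providing a class of index keys, referred to as scope keys, that define the scope of a search rather than merely supplying keywords. When a scope key is entered in a search query, the scope key limits the range of index records that are searched. For example, one scope key may limit the scope of the search by restricting the results to a certain file type, such as .mpg files. Another scope key may limit the scope of the search according to a URL (Uniform Resource Locator), so that only documents under that URL are searched. Still another scope key may limit the scope of the search to a specific database on the user's computer or on another networked computer. The invention therefore addresses the problems described above by allowing the user to limit the scope of a search with a specific class of index keys, which significantly reduces the time and expense of the search.
In accordance with another aspect of the present invention, compound scopes are also recognized and stored. This additional index partition contains scope definitions that are combinations of the basic scopes. When one of these compound scopes is referenced, the documents corresponding to it have already been identified, which allows a faster search.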
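Brief description of the drawings
FIG. 1 illustrates an exemplary computing device that may be used in embodiments of the present invention.
FIG. 2 illustrates a functional block diagram of an exemplary system for scoping searches using index keys in accordance with embodiments of the present invention.
FIG. 3 illustrates a functional block diagram of an exemplary structure of an index in accordance with embodiments of the present invention.
FIG. 4 illustrates an exemplary block diagram for managing compound scopes in accordance with embodiments of the present invention.
FIG. 5 illustrates a logical flow diagram of an exemplary process for generating an index in accordance with embodiments of the present invention.
FIG. 6 illustrates a logical flow diagram of an exemplary process for compiling an index in accordance with embodiments of the present invention.
FIG. 7 illustrates a logical flow diagram of an exemplary process for handling a query in accordance with embodiments of the present invention.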
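Detailed description
The present invention will now be described more fully with reference to the accompanying drawings, which form a part of this document and which show, by way of illustration, specific exemplary embodiments for practicing the invention. The invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. Among other things, the invention may be embodied as methods or devices. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. The following detailed description is therefore not to be taken in a limiting sense.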
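Illustrative operating environment
With reference to FIG. 1, one exemplary system for implementing the invention includes a computing device, such as computing device 100. The computing device may be configured as a client, a server, a mobile device, or any other computing device. In a very basic configuration, computing device 100 typically includes at least one processing unit 102 and system memory 104. Depending on the exact configuration and type of computing device, system memory 104 may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.) or some combination of the two. System memory 104 typically includes an operating system 105, one or more applications 106, and may include program data 107. In one embodiment, application 106 includes a search scoping application 120 for implementing the functionality of the present invention. This basic configuration is illustrated in FIG. 1 by the components within dashed line 108.
Computing device 100 may have additional features or functionality. For example, computing device 100 may also include additional data storage devices (removable and/or non-removable) such as magnetic disks, optical disks, or tape. Such additional storage is illustrated in FIG. 1 by removable storage 109 and non-removable storage 110. Computer storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. System memory 104, removable storage 109 and non-removable storage 110 are all examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by computing device 100. Computing device 100 may also have input devices 112 such as a keyboard, mouse, pen, voice input device, touch input device, etc. Output devices 114 such as a display, speakers, printer, etc. may also be included.
Computing device 100 also contains communication connections 116 that allow the device to communicate with other computing devices 118, such as over a network. Communication connection 116 is one example of communication media. Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. The term computer readable media as used here includes both storage media and communication media.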
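Illustrative embodiments for scoping searches
Throughout the following description and the claims, the term "document" refers to any possible resource that may be returned as the result of a search query or a crawl of a network, such as network documents, files, folders, web pages, and other resources. The term "index key" refers to any keyword or search-related key that is used in a search query, or in the creation of an index, to target a range of documents. The term "scope key" refers to any index key that is used to narrow the scope of a search, so that the number of documents searched is reduced before the search begins. The scope of a search may be narrowed according to a property such as file type, according to a location such as a certain database or URL, or according to other criteria that reduce the number of documents searched.
Embodiments of the present invention improve the efficiency of queries over any scope by representing each item's scope membership in the keyword index and by limiting the scope through the addition of the appropriate scope keys to the query. This approach contrasts with recomputing the scope condition, based on document properties (such as the URL or other metadata), for every document that matches the user's keywords. Since most scopes represent a narrow slice of the entire set of crawled items, the efficiency of a scoped query improves by a factor that is smaller than, but related to, the narrowness of the scope. For example, the administrator of a portal (e.g., a web site that provides hosted services including search) may decide that the portal's users are particularly interested in functional specifications. It is therefore advantageous to provide a search mechanism that searches for keywords only in those specifications, without sifting through occurrences of the words in other documents. Through an administrative interface, the administrator may define a scope with the following rule: feature="spec"; alternatively, this scope may already be defined as a basic scope. The administrator may use basic scopes to define a compound scope with a rule such as: feature="spec" AND filetype="Text" AND author!="John Doe". After the scope has been given a friendly name for the client user interface, and the sites from which it is available have been specified, the scope appears in the client's drop-down list of scopes. Selecting this scope returns only documents with that property value as query results. Scopes defined by an administrator in this way may be referred to as authored scopes.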
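The compound rule quoted above can be read as plain set algebra over the document lists of basic scopes. The following is a minimal sketch under stated assumptions, not the patent's implementation: the dictionary `basic_scopes`, the key strings, and the document IDs are all hypothetical, and AND is modeled as set intersection while the `!=` clause is modeled as set difference.

```python
from typing import Dict, Set

# Hypothetical basic scopes: each scope key maps to the IDs of matching documents.
basic_scopes: Dict[str, Set[int]] = {
    'feature="spec"': {1, 2, 3, 5},
    'filetype="Text"': {2, 3, 4, 5},
    'author="John Doe"': {3, 6},
}


def compound_spec_scope(scopes: Dict[str, Set[int]]) -> Set[int]:
    """Evaluate feature="spec" AND filetype="Text" AND author!="John Doe"."""
    # AND becomes intersection; the != clause removes John Doe's documents.
    matching = scopes['feature="spec"'] & scopes['filetype="Text"']
    return matching - scopes['author="John Doe"']


result = compound_spec_scope(basic_scopes)
print(sorted(result))
```

Because the result depends only on precomputed document lists, it can be stored once and reused, which is the motivation the text gives for keeping compound scopes in the index.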
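In addition, several default scope selections may be available. An "all content" option may be presented to general clients; it simply represents a search with no scope specified. A "this site" scope selection searches all documents in the current site and its subsites. Scopes may also be generated that exclude certain results, rather than simply identifying the documents contained in a particular subset. For example, one scope definition may correspond to a search of all documents except text documents (.txt). Another scope definition may correspond to a search of all documents on the network except those associated with a particular web site URL. The scope selections, whether default, authored, or otherwise, are not limited to those described here.
In another embodiment, clients are also given the option of typing scope keys directly into the search request. For the clients' convenience, scope keys are given names, and typing a name into the search request limits the search to the associated scope.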
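FIG. 2 illustrates a functional block diagram of an exemplary system for scoping searches using index keys in accordance with the present invention. System 200 includes index 210, pipeline 220, document interface 230, client interface 240, scopes plug-in 250, indexing plug-in 260, scope descriptions 270, and administrative interface 280.
Index 210 is structured to include separate indexes for content keys (e.g., keywords) and for scope keys. A more detailed description of the structure of index 210 is provided in FIG. 3. The index records are used to provide results for client queries. In one embodiment, index 210 corresponds to multiple databases that collectively provide the storage for the index records.
Pipeline 220 is an illustrative representation of the gathering mechanism that obtains the documents, or records of the documents, for indexing. Pipeline 220 allows data to be filtered by various plug-ins (e.g., scopes plug-in 250) before the records corresponding to the data are entered into index 210.
Document interface 230 provides the protocols, network access points, and database access points for retrieving documents across multiple databases and network locations. For example, document interface 230 may provide access to the Internet while also providing access to a database on a local server and to a database on the current computing device. Other embodiments may access other document locations through a variety of protocols without departing from the spirit or scope of the invention.
Client interface 240 gives a client access to define and initiate a search. The search may be defined according to keywords and/or scope keys. An exemplary method for processing search queries is discussed in greater detail with FIG. 7 below.
Scopes plug-in 250 is one of several gatherer pipeline plug-ins. Scopes plug-in 250 identifies property values that are to be re-emitted as scope keys (that is, as items to be indexed in the scope indexes). Properties identified as being of interest with respect to scope (e.g., document type, URL, etc.) are gathered by scopes plug-in 250 as the documents provided through document interface 230 are crawled. Scopes plug-in 250 re-emits these properties into pipeline 220 for inclusion in index 210. The properties are also available to an administrator or other entity for providing scope selections to clients.
Indexing plug-in 260 is another plug-in connected to pipeline 220. The indexing plug-in provides the mechanism for generating, partitioning, and updating index 210. An exemplary method for generating index 210 is discussed in detail with FIG. 5 below. An exemplary method for updating index 210 is discussed in detail with FIG. 6 below. In one embodiment, indexing plug-in 260 provides word lists that temporarily cache the keywords and scope keys generated from crawled documents before the results are flushed to index 210. The records of index 210 are populated from the crawl results contained in these word lists.
Scope descriptions 270 provides tables that store information about the scopes. For example, scope descriptions 270 may contain administrative and internal information, such as scopes, scope rules, visibility, and other attributes associated with scopes and with the scope selections generated for use in search queries. Scope descriptions 270 receives properties through scopes plug-in 250 in order to generate scope selections. Scope descriptions 270 is also accessed by indexing plug-in 260 for generating and organizing the scope indexes within index 210. Index 210 also accesses scope descriptions 270 for generating and updating the compound scope index (see FIGS. 3 and 4). Scope descriptions 270 may also be accessed from client interface 240, so that a client can select a scope to add to a search query or to apply to a search.
Administrative interface 280 also accesses scope descriptions 270 to allow an administrator or other control mechanism (e.g., an automated program) to take the properties provided by scopes plug-in 250 and create scope selections for use in search queries. Administrative interface 280 may be provided in any form that allows scope selections to be created and scope descriptions to be manipulated (e.g., through login access over the Internet).
Although one-way and two-way communication between functional blocks is illustrated in system 200, any of these communication types may be changed to another type without departing from the spirit or scope of the invention (e.g., all communications may carry an acknowledgment message, which requires two-way rather than one-way communication).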
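FIG. 3 illustrates a functional block diagram of an exemplary structure of an index in accordance with the present invention. Index 300 includes content index (.ci) 310, basic scope index (.bsi) 320, and compound scope index (.csi) 330.
Content index 310 includes records organized as an inverted index that lists the documents corresponding to the keywords and other index keys used in search queries. Scope keys, however, are diverted to basic scope index 320.
Basic scope index 320 includes records of the documents that correspond to basic scopes. A basic scope generally refers to a scope selection that corresponds to a single scope-related property of a document. For example, the numeric IDs of documents crawled at http://www.example.com may be recorded in the document list of the scope key, emitted by scopes plug-in 250, that represents the property (site) and the value ("example.com").
Compound scope index 330 includes scopes generated from combinations of the basic scopes in basic scope index 320. For example, a compound scope may include the records of documents of a specific file type that are associated with a specific URL.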
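To make the three-partition layout concrete, here is a small sketch that assumes each partition is an in-memory mapping from a key to a sorted list of document IDs. The class name `Index`, the method `build_compound`, and the key strings are hypothetical illustrations; the patent does not prescribe this representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Index:
    """Three partitions, each mapping a key to a sorted list of document IDs."""
    content: Dict[str, List[int]] = field(default_factory=dict)         # plays the role of .ci
    basic_scope: Dict[str, List[int]] = field(default_factory=dict)     # plays the role of .bsi
    compound_scope: Dict[str, List[int]] = field(default_factory=dict)  # plays the role of .csi

    def build_compound(self, name: str, basic_keys: List[str]) -> None:
        """Store the documents common to every listed basic scope under `name`."""
        doc_sets = [set(self.basic_scope.get(key, [])) for key in basic_keys]
        common = set.intersection(*doc_sets) if doc_sets else set()
        self.compound_scope[name] = sorted(common)


index = Index()
index.content["budget"] = [2, 7, 9]
index.basic_scope["site:example.com"] = [1, 2, 7]
index.basic_scope["filetype:doc"] = [2, 3, 7, 8]
# A compound scope: documents of one file type under one site.
index.build_compound("example.com docs", ["site:example.com", "filetype:doc"])
print(index.compound_scope)
```

Keeping scope keys out of the content partition means a keyword lookup never has to skip over scope postings, and a scope lookup never touches keyword postings.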
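FIG. 4 illustrates an exemplary block diagram for managing compound scopes in accordance with the present invention. Diagram 400 includes basic scope index 410, original compound scope index 420, and new compound scope index 430.
When the basic scope index is updated with additional basic scopes (see FIG. 6), the compound scope index must also be updated. Position line 422 in original compound scope index 420 marks the position where a new compound scope is to be included. New compound scope index 430 is produced by copying original compound scope index 420. The copy first proceeds only up to the position where the new compound scope is to be included. New compound scope 432 is then written into new compound scope index 430. After new compound scope 432 has been included, copying of original compound scope index 420 resumes. The compound scopes 434 that follow new compound scope 432 are copied from original compound scope index 420 with an offset that compensates for the inclusion of new compound scope 432.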
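The copy-insert-copy sequence of FIG. 4 can be sketched as follows. This is an illustration under assumptions of my own: the compound scope index is modeled as a directory of `(name, offset, length)` entries over one flat, contiguous list of document IDs, and the function name `insert_compound_scope` is hypothetical.

```python
from typing import List, Tuple

Entry = Tuple[str, int, int]  # (scope name, offset into postings, number of doc IDs)


def insert_compound_scope(entries: List[Entry], postings: List[int], position: int,
                          name: str, doc_ids: List[int]) -> Tuple[List[Entry], List[int]]:
    """Build a new index with `name` inserted at `position`.

    Assumes the entries are stored contiguously, in order, starting at offset 0.
    """
    new_entries: List[Entry] = []
    new_postings: List[int] = []
    # 1. Copy the original index up to the insertion position; offsets are unchanged.
    for scope_name, offset, length in entries[:position]:
        new_entries.append((scope_name, offset, length))
        new_postings.extend(postings[offset:offset + length])
    # 2. Write the new compound scope at the insertion position.
    new_entries.append((name, len(new_postings), len(doc_ids)))
    new_postings.extend(doc_ids)
    # 3. Resume copying; later scopes shift by the size of the inserted scope.
    shift = len(doc_ids)
    for scope_name, offset, length in entries[position:]:
        new_entries.append((scope_name, offset + shift, length))
        new_postings.extend(postings[offset:offset + length])
    return new_entries, new_postings


old_entries = [("A", 0, 2), ("C", 2, 3)]
old_postings = [1, 4, 2, 5, 9]
new_entries, new_postings = insert_compound_scope(old_entries, old_postings, 1, "B", [3, 4])
print(new_entries, new_postings)
```

Writing into a fresh copy leaves the original index intact while the new one is being built, so readers of the old index are not affected until the new index is complete.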
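FIG. 5 illustrates a logical flow diagram of an exemplary process for generating an index in accordance with the present invention. Process 500 starts at block 502, where access is provided to a corpus of documents. Processing continues at block 504.
At block 504, the corpus of documents is crawled to determine which documents exist and which properties (e.g., file type) are associated with them. An identifier, or ID, for each document and its associated properties are then forwarded as crawl results. Processing continues at block 506.
At block 506, the document properties that relate to scope are obtained by the scopes plug-in. The scopes plug-in creates scope definitions from the properties. An administrator may use the scope definitions to create scope selections that allow a client to limit a search by scope. Processing continues at block 508.
At block 508, the scope definitions created from the obtained properties are emitted as scope keys within the crawl results. These scope keys operate much like the keywords and other index keys generated from the crawl, except that they are directed to the scope of the search rather than to document content. The properties obtained for generating scope keys include document type, document URL, document author, and other properties. A scope key is generated to include an identifier (ID) for the type of scope key and a text string that identifies the particular scope key. For example, if the ID for scope keys associated with URLs is 237, then the scope key corresponding to documents at http://www.example.com is "[237]http://www.example.com". The scope key is emitted into the pipeline and is subsequently associated with the documents in the index. Once the scope keys have been emitted, processing continues at block 510.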
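The key format described for block 508 is simple enough to sketch directly. In the snippet below, only the ID 237 for URL scope keys comes from the text; the other type IDs, the table `SCOPE_TYPE_IDS`, and both function names are hypothetical.

```python
from typing import Dict, List

# Type IDs for scope keys. Only 237 (URL) appears in the text; the rest are made up.
SCOPE_TYPE_IDS: Dict[str, int] = {"url": 237, "filetype": 238, "author": 239}


def make_scope_key(scope_type: str, value: str) -> str:
    """Format a scope key as "[<type ID>]<value>"."""
    return f"[{SCOPE_TYPE_IDS[scope_type]}]{value}"


def emit_scope_keys(doc_properties: Dict[str, str]) -> List[str]:
    """Emit one scope key per scope-related property; other properties are skipped."""
    return [make_scope_key(name, value)
            for name, value in doc_properties.items()
            if name in SCOPE_TYPE_IDS]


keys = emit_scope_keys({"url": "http://www.example.com", "title": "Q3 plan", "filetype": ".doc"})
print(keys)
```

Prefixing the value with a type ID keeps a URL scope key distinct from, say, a keyword that happens to have the same text.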
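At block 510, the scope keys, keywords, and other accumulated properties found across all documents are flushed to the index. The flush writes the keywords and properties to disk. During the flush, the scope keys are separated out and sent to the basic scope index, while the remaining data is sent to the content index. Processing continues at block 512.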
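The split performed during the flush might look like the sketch below. It assumes, as an illustration only, that scope keys can be recognized by a leading "[<digits>]" prefix and that the word list and both partitions are in-memory mappings from key to a set of document IDs; the real flush writes to disk.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Assumption: a scope key starts with a bracketed numeric type ID, e.g. "[237]...".
SCOPE_KEY = re.compile(r"^\[\d+\]")


def flush(word_list: Dict[str, Set[int]],
          content_index: Dict[str, Set[int]],
          basic_scope_index: Dict[str, Set[int]]) -> None:
    """Move cached postings into the index, routing scope keys to the scope partition."""
    for key, doc_ids in word_list.items():
        target = basic_scope_index if SCOPE_KEY.match(key) else content_index
        target[key].update(doc_ids)
    word_list.clear()  # the cache is empty once its contents are in the index


content_index: Dict[str, Set[int]] = defaultdict(set)
basic_scope_index: Dict[str, Set[int]] = defaultdict(set)
word_list = {"budget": {1, 2}, "[237]http://www.example.com": {2, 3}}
flush(word_list, content_index, basic_scope_index)
print(dict(content_index), dict(basic_scope_index))
```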
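At block 512, the compound scope index is generated within the index. In one embodiment, the compound scope index is generated by a compilation process that runs in response to the generation of the index. An exemplary process for generating the compound scope index is described with FIG. 6 below. In one embodiment, the compound scopes are defined by queries from clients. In another embodiment, a list of compound scopes is generated by an administrator before the index is instantiated. Once the compound scope index has been generated, processing continues to block 514, where process 500 ends.
In one embodiment, the basic scope index is populated as the crawl begins, but the compound scopes are not populated until the crawl has finished and the basic scope index has been fully built. Waiting to build the compound scope index reduces overhead by reducing the number of queries made against the basic scope index.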
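FIG. 6 illustrates a logical flow diagram of an exemplary process for compiling an index in accordance with the present invention. Process 600 starts at block 602, where the compilation process begins. In one embodiment, process 600 is started asynchronously at a set interval (e.g., every 15 minutes) to update any existing compound scopes. In another embodiment, process 600 starts when process 500 of FIG. 5 enters block 512, in order to generate the compound scope index from the newly generated basic scope index. In still other embodiments, process 600 starts in response to other updates to the index. Once process 600 has started, processing continues at decision block 604.
At decision block 604, a determination is made whether the change record for each scope in the compound scope index indicates that the current compound scope has changed. In one embodiment, the same process generates the compound scope index for the first time, using a default setting that assumes all scopes have changed. Generating a new compound scope index and updating an existing one are therefore handled by the same compilation process. If the compound scope has changed, processing moves to block 606.
At block 606, the compilation process executes a query similar to a user query and lets the query process update the list of documents that corresponds to the scope in the compound scope index. As the compound scope is copied from its previous version (when a previous version exists), the document list is appended to the compound scope index. Processing continues at decision block 610.
Alternatively, if the change record indicates that a particular scope has not changed, processing moves to block 608. At block 608, since the scope has not changed, the list of document IDs corresponding to the scope in the previous version of the compound scope index is copied verbatim.
At decision block 610, a determination is made whether more compound scopes remain to be copied into the new compound scope index during compilation. If more compound scopes remain, processing returns to decision block 604 to determine whether the next compound scope has changed. If no more compound scopes need to be sent to the new compound scope index, processing moves to block 612, where process 600 ends.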
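The loop through blocks 604 to 610 can be summarized in a few lines. The sketch below is an approximation with hypothetical names: `previous` is the prior compound scope index, `changed` stands in for the change records, and `run_scope_query` stands in for the query the compilation process executes for a changed scope.

```python
from typing import Callable, Dict, Iterable, List, Set


def compile_compound_index(previous: Dict[str, List[int]],
                           changed: Set[str],
                           run_scope_query: Callable[[str], List[int]],
                           scope_names: Iterable[str]) -> Dict[str, List[int]]:
    """Build a new compound scope index from the previous one."""
    new_index: Dict[str, List[int]] = {}
    for name in scope_names:
        # A scope with no previous version is treated as changed, which is how
        # a first-time build falls out of the same loop.
        if name not in previous or name in changed:
            new_index[name] = run_scope_query(name)   # recompute the document list
        else:
            new_index[name] = list(previous[name])    # copy the old list verbatim
    return new_index


previous = {"specs": [1, 2], "docs": [3]}
fresh = {"specs": [1, 2, 4], "docs": [99], "new": [7]}
calls: List[str] = []


def run_scope_query(name: str) -> List[int]:
    calls.append(name)
    return fresh[name]


new_index = compile_compound_index(previous, {"specs"}, run_scope_query,
                                   ["specs", "docs", "new"])
print(new_index, calls)
```

Only changed or new scopes pay the cost of a query; unchanged scopes are a plain copy, which is what makes a short compilation interval affordable.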
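Updates to the document corpus may occur at any time, and the document IDs of the corpus are continually updated in one or more in-memory word lists. The word lists are populated from client-initiated searches, from refresh actions that cause the corpus to be re-crawled, or from various other operations that cause changes in the corpus to be discovered. When a document changes (e.g., is added, deleted, or modified), the ID of the changed document is forwarded to the in-memory word list together with the type of change. The word list with the updated document IDs is then flushed to the index. A document change results in an update to the content index, and the basic scope index is updated as well. Once the incremental crawl that discovered the change has completed and the basic scope index has been updated, the compound scope index is also updated to reflect the changes to the documents on the network. Process 600 is used to reflect the updates in the compound scopes regardless of whether a new document has been added to the corpus being searched, a document has been removed from the corpus, or a modification has occurred that affects the scope of a document. Because the compilation process runs asynchronously, the compound scope index is updated frequently enough to reflect the changes to the documents on the network.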
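FIG. 7 illustrates a logical flow diagram of an exemplary process for handling a query in accordance with the present invention. Process 700 starts at block 702, where the index has been instantiated and is ready to accept queries from clients. Processing continues at decision block 704.
At decision block 704, a determination is made whether a search query has been initiated by a client. The client may correspond to a user who initiates the query or to a program that requests the search. If no search has been initiated, processing loops back to decision block 704 to wait for a search query. Once a search query has been initiated, processing continues at decision block 706.
At decision block 706, a determination is made whether a scope key is used in the search request. If no scope key is present, processing advances to decision block 716. If a scope key is present in the search request, processing continues at decision block 708.
At decision block 708, a determination is made whether the scope key instance is part of a compound scope. If the scope key is not used as part of a compound scope, processing advances to decision block 712. If the scope key is part of a compound scope, processing moves to block 710.
At block 710, the compound scope index (.csi) is consulted for the documents, identified by their document IDs, that correspond to the compound scope included in the search query. The document IDs corresponding to these documents are then returned and added to the pending search results. Processing continues at decision block 712.
At decision block 712, a determination is made whether the scope key instance corresponds to a basic scope. If the scope key does not correspond to a basic scope, processing moves to decision block 716. If the scope key does correspond to a basic scope, processing moves to block 714.
At block 714, the basic scope index (.bsi) is queried for the documents, identified by their document IDs, that correspond to the scope key included in the search query. The document IDs corresponding to these documents are then returned and added to the pending search results. Processing moves to decision block 716.
At decision block 716, a determination is made whether keywords or other index keys related to document content are included in the search request. If no keywords are included in the search request, processing advances to block 720. If keywords are included in the search request, processing moves to block 718.
At block 718, the content index (.ci) is queried for the documents, identified by their document IDs, that correspond to the keywords included in the search query. In one embodiment, when the content index is searched for the keywords, the search is limited to the scope previously defined by the basic scopes and/or compound scopes. The document IDs corresponding to these documents are then returned and added to the pending search results. Processing moves to block 720.
At block 720, the set of document IDs that coincide across the different index partitions is returned as the result of the query. For example, a document ID may correspond to a scope in the basic scope index, and the same document may also contain a particular keyword. If the search request is limited by that particular scope and includes that keyword, the document ID overlaps between the index partitions. These overlapping IDs represent the search results. A pointer to each document contained in the results can then be provided to the client in response to the search request. Typically, determining the overlapping documents in the index is much faster than querying document properties to verify whether each document is within a particular scope. While documents sit at random locations in a database, the index is clustered on disk by key (scope key or keyword). The invention therefore greatly increases the speed and convenience of searches by applying scopes to search requests. Once the results have been provided, processing moves to block 722, where the process ends.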
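The overlap computed at block 720 amounts to intersecting the document lists gathered from each partition. The sketch below assumes a hypothetical in-memory index with partitions named "csi", "bsi" and "ci", each mapping a key to a list of document IDs; the function name `run_query` and the sample keys are illustrative only.

```python
from typing import Dict, List, Optional, Sequence, Set

Partition = Dict[str, List[int]]


def run_query(index: Dict[str, Partition],
              compound_scopes: Sequence[str] = (),
              basic_scopes: Sequence[str] = (),
              keywords: Sequence[str] = ()) -> List[int]:
    """Return the document IDs common to every scope and keyword in the query."""
    lookups = ([(index["csi"], key) for key in compound_scopes]   # block 710
               + [(index["bsi"], key) for key in basic_scopes]    # block 714
               + [(index["ci"], key) for key in keywords])        # block 718
    result: Optional[Set[int]] = None
    for partition, key in lookups:
        doc_ids = set(partition.get(key, ()))
        result = doc_ids if result is None else result & doc_ids
    return sorted(result or [])  # block 720: the overlapping IDs


index = {
    "ci": {"budget": [2, 5, 7]},
    "bsi": {"[237]http://www.example.com": [1, 2, 7]},
    "csi": {"specs-text": [2, 3]},
}
scoped = run_query(index, basic_scopes=["[237]http://www.example.com"], keywords=["budget"])
narrowed = run_query(index, compound_scopes=["specs-text"],
                     basic_scopes=["[237]http://www.example.com"], keywords=["budget"])
print(scoped, narrowed)
```

No document property is read at query time in this sketch: scope membership is decided entirely by list lookups, which is the speed advantage the text attributes to indexing scopes as keys.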
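In other embodiments, the processing steps provided in operation blocks 708-718 are not sequential. Instead, the basic scope index, the compound scope index, or the content index is queried according to the regular keywords and scope keys of the query, in an order that depends on the ordering of the document IDs corresponding to those keys. In addition, the processing steps provided in operation blocks 708-718 may need to be repeated multiple times, since a query request may contain multiple scopes, including both compound scopes and basic scopes.
In another embodiment, if a new compound scope is created by the request, the compound scope index is updated with the new compound scope after the query request. The compound scope index is then updated according to the method discussed with FIG. 4 above.
In yet another embodiment, the scope portion of the search request is determined by a search selection made by the client. The search selection corresponds to a scope chosen for the results from a provided list of scopes. This list of scopes may be generated by an administrator from the scope definitions.
The above specification, examples and data provide a complete description of the manufacture and use of the composition of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims appended below.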

Claims (23)

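1. A computer-implemented method for establishing a search scope for a query of a plurality of documents stored on a network of computing devices, the method comprising:
generating at least two scope-related index keys, wherein each scope-related index key is generated according to a property associated with a scope of the plurality of documents;
generating one or more content-related index keys, the content-related index keys being associated with a crawl of the plurality of documents;
generating one or more compound scope-related index keys, the compound scope-related index keys being associated with one or more combinations of the at least two scope-related index keys;
generating an index, the index comprising:
a first document list for identifying a first subset of the plurality of documents that corresponds to the at least two scope-related index keys;
a second document list for identifying a second subset of the plurality of documents that corresponds to the one or more content-related index keys;
a third document list for identifying a third subset of the plurality of documents that corresponds to the one or more compound scope-related index keys;
storing the index on at least one computing device in the network;
providing results of the query, wherein the query includes a first scope-related index key of the at least two scope-related index keys and a first content-related index key of the one or more content-related index keys, by:
searching only the first document list to identify one or more documents of the plurality of documents that correspond to the first scope-related index key; and
searching only the second document list to identify one or more documents of the plurality of documents that correspond to the first content-related index key.
2. The computer-implemented method of claim 1, wherein the query further includes a first compound scope-related index key, and providing the results of the query further comprises searching only the third document list to identify one or more documents of the plurality of documents that correspond to the first compound scope-related index key.
3. The computer-implemented method of claim 1, wherein the one or more combinations of the at least two scope-related index keys correspond to a Boolean combination.
4. The computer-implemented method of claim 1, further comprising updating the third document list when a new combination of scope-related index keys is created, by copying the third document list and inserting an additional subset of the plurality of documents that corresponds to the new combination into the third document list to create an updated third document list.
5. The computer-implemented method of claim 1, wherein a change in the plurality of documents being queried results in an update to the index, such that when an additional document is associated with a first property that relates to the first scope-related index key, the additional document is associated with the first scope-related index key in the index.
6. The computer-implemented method of claim 1, wherein generating the index further comprises providing one or more document identifiers in the first, second, and third document lists, wherein each document identifier identifies one document of the plurality of documents.
7. The computer-implemented method of claim 1, further comprising generating a scope selection according to the at least two scope-related index keys, such that the scope selection is selectable by a client to provide a scope for a query generated by the client.
8. The computer-implemented method of claim 1, further comprising providing an interface for manually generating and manipulating additional scope-related index keys from additional properties associated with additional search scopes of the documents.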
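9. A system for establishing a search scope for a query of a plurality of documents, the system comprising:
a processor;
a memory containing computer-executable instructions that define a scopes plug-in and an indexing plug-in;
wherein the scopes plug-in, when executed by the processor, is configured to identify properties of the plurality of documents that are associated with a scope of the plurality of documents and to generate a scope-related index key according to the identified properties, wherein the scope-related index key is emitted among a set of other index keys associated with a crawl of the plurality of documents;
wherein the indexing plug-in, when executed by the processor, is configured to generate an index comprising a plurality of document lists, the document lists comprising:
a first document list for identifying a first subset of the plurality of documents that corresponds to the scope-related index key; and
a second document list for identifying a second subset of the plurality of documents that corresponds to the set of other index keys; and
a computer-readable medium of the memory for storing the index so as to allow only the first document list in the index to be searched to provide results of the query.
10. The system of claim 9, wherein the set of other index keys relates to the content of the plurality of documents.
11. The system of claim 9, further comprising:
a second scope-related index key generated according to an additional identified property of the plurality of documents; and
wherein the index further comprises a third document list, organized according to one or more combinations of the scope-related index keys, for identifying a third subset of the plurality of documents, such that another search scope is defined by the combination.
12. The system of claim 9, wherein the index is further configured to be updated when an additional document is inserted into the plurality of documents being queried, such that when the additional document is associated with the scope-related index key, the additional document is associated with the scope-related index key within the first document list in the index.
13. The system of claim 9, further comprising an interface for manually generating and manipulating additional scope-related index keys from additional properties associated with additional search scopes of the documents.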
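14. A method for establishing a search scope for a query of a plurality of documents, the method comprising:
generating one or more content-related index keys associated with a crawl of the plurality of documents;
generating at least two scope-related index keys according to one or more properties associated with a scope of the plurality of documents;
generating one or more compound scope-related index keys associated with one or more combinations of the scope-related index keys;
generating an index, the index comprising:
a content partition for identifying a first subset of the plurality of documents that corresponds to the one or more content-related index keys;
a basic scope partition for identifying a second subset of the plurality of documents that corresponds to the at least two scope-related index keys; and
a compound scope partition for identifying a third subset of the plurality of documents that corresponds to the one or more compound scope-related index keys;
storing the index on at least one computing device in a network; and
providing results of the query, wherein the query includes a first compound scope-related index key of the one or more compound scope-related index keys, a first basic scope-related index key of the at least two scope-related index keys, and a first content-related index key of the one or more content-related index keys, by:
searching only the compound scope partition to identify one or more documents of the plurality of documents that correspond to the first compound scope-related index key;
searching only the basic scope partition to identify one or more documents of the plurality of documents that correspond to the first basic scope-related index key; and
searching only the content partition to identify one or more documents of the plurality of documents that correspond to the first content-related index key.
15. The method of claim 14, further comprising updating the compound scope partition when a new combination of scope-related index keys is created, by copying the compound scope partition to create a new compound scope partition while inserting a new document list that corresponds to the new combination into the new compound scope partition.
16. The method of claim 14, further comprising updating the compound scope partition asynchronously from updates to the content partition and the basic scope partition.
17. The method of claim 14, further comprising updating the index when an additional document is inserted into the plurality of documents to be queried, such that the additional document is included in the basic scope partition in association with the scope-related index keys.
18. The method of claim 14, wherein the one or more documents that correspond to the first content-related index key and are identified in the content partition, the one or more documents that correspond to the first basic scope-related index key and are identified in the basic scope partition, and the one or more documents that correspond to the first compound scope-related index key and are identified in the compound scope partition are each identified by a document identifier, wherein each document identifier corresponds to one document of the plurality of documents.
19. The method of claim 14, further comprising generating a scope selection according to the at least two scope-related index keys, such that the scope selection is selectable by a client to provide a scope for a query generated by the client.
20. The method of claim 14, further comprising providing an interface for manually generating and manipulating additional scope-related index keys from additional properties associated with additional search scopes of the documents.
21. The computer-implemented method of claim 1, wherein providing the results of the query further comprises determining a set of overlapping documents that are common to the one or more documents of the plurality of documents that correspond to the first scope-related index key and the one or more documents of the plurality of documents that correspond to the first content-related index key.
22. The computer-implemented method of claim 4, wherein the third document list is updated asynchronously from updates to the first document list and the second document list.
23. The method of claim 18, wherein providing the results of the query further comprises identifying a set of common document identifiers from the searches of the content partition, the basic scope partition, and the compound scope partition.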
CN2005100882120A 2004-09-27 2005-07-25 System and method for scoping searches using index keys Active CN1755677B (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/951,123 2004-09-27
US10/951,123 US7606793B2 (en) 2004-09-27 2004-09-27 System and method for scoping searches using index keys

Publications (2)

Publication Number Publication Date
CN1755677A CN1755677A (zh) 2006-04-05
CN1755677B true CN1755677B (zh) 2010-05-12

Family

ID=34940142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2005100882120A Active CN1755677B (zh) 2004-09-27 2005-07-25 System and method for scoping searches using index keys

Country Status (5)

Country Link
US (2) US7606793B2 (zh)
EP (1) EP1659505A1 (zh)
JP (4) JP5323300B2 (zh)
KR (1) KR100981857B1 (zh)
CN (1) CN1755677B (zh)



Family Cites Families (374)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4819156A (en) 1986-06-13 1989-04-04 International Business Machines Corporation Database index journaling for enhanced recovery
US5369778A (en) 1987-08-21 1994-11-29 Wang Laboratories, Inc. Data processor that customizes program behavior by using a resource retrieval capability
US5226161A (en) 1987-08-21 1993-07-06 Wang Laboratories, Inc. Integration of data between typed data structures by mutual direct invocation between data managers corresponding to data types
US5222236A (en) 1988-04-29 1993-06-22 Overdrive Systems, Inc. Multiple integrated document assembly data processing system
US5321833A (en) 1990-08-29 1994-06-14 Gte Laboratories Incorporated Adaptive ranking system for information retrieval
US5257577A (en) 1991-04-01 1993-11-02 Clark Melvin D Apparatus for assist in recycling of refuse
JPH08506911A (ja) 1992-11-23 1996-07-23 パラゴン、コンセプツ、インコーポレーテッド ファイル・アクセスを行うためにユーザーがカテゴリを選択するコンピュータ・ファイリング・システム
US6202058B1 (en) * 1994-04-25 2001-03-13 Apple Computer, Inc. System for ranking the relevance of information objects accessed by computer users
US6038310A (en) * 1994-08-01 2000-03-14 British Telecommunications Public Limited Company Service node for a telephony network
US5606609A (en) * 1994-09-19 1997-02-25 Scientific-Atlanta Electronic document verification system and method
US5594660A (en) * 1994-09-30 1997-01-14 Cirrus Logic, Inc. Programmable audio-video synchronization method and apparatus for multimedia systems
US5642502A (en) * 1994-12-06 1997-06-24 University Of Central Florida Method and system for searching for relevant documents from a text database collection, using statistical ranking, relevancy feedback and small pieces of text
US5729730A (en) * 1995-03-28 1998-03-17 Dex Information Systems, Inc. Method and apparatus for improved information storage and retrieval system
US5826269A (en) 1995-06-21 1998-10-20 Microsoft Corporation Electronic mail interface for a network server
US5933851A (en) 1995-09-29 1999-08-03 Sony Corporation Time-stamp and hash-based file modification monitor with multi-user notification and method thereof
US5974455A (en) * 1995-12-13 1999-10-26 Digital Equipment Corporation System for adding new entry to web page table upon receiving web page including link to another web page not having corresponding entry in web page table
JPH09204442A (ja) * 1996-01-24 1997-08-05 Dainippon Screen Mfg Co Ltd ドキュメントデータ検索システム
US5855020A (en) 1996-02-21 1998-12-29 Infoseek Corporation Web scan process
US6314420B1 (en) 1996-04-04 2001-11-06 Lycos, Inc. Collaborative/adaptive search engine
JP3113814B2 (ja) * 1996-04-17 2000-12-04 インターナショナル・ビジネス・マシーンズ・コーポレ−ション 情報検索方法及び情報検索装置
US5905866A (en) 1996-04-30 1999-05-18 A.I. Soft Corporation Data-update monitoring in communications network
US5828999A (en) 1996-05-06 1998-10-27 Apple Computer, Inc. Method and system for deriving a large-span semantic language model for large-vocabulary recognition systems
JP3653333B2 (ja) * 1996-05-13 2005-05-25 株式会社日立製作所 データベース管理方法およびシステム
US5920859A (en) 1997-02-05 1999-07-06 Idd Enterprises, L.P. Hypertext document retrieval system and method
US6038610A (en) 1996-07-17 2000-03-14 Microsoft Corporation Storage of sitemaps at server sites for holding information regarding content
EP0822502A1 (en) * 1996-07-31 1998-02-04 BRITISH TELECOMMUNICATIONS public limited company Data access system
US5745890A (en) 1996-08-09 1998-04-28 Digital Equipment Corporation Sequential searching of a database index using constraints on word-location pairs
US5765150A (en) 1996-08-09 1998-06-09 Digital Equipment Corporation Method for statistically projecting the ranking of information
US5920854A (en) 1996-08-14 1999-07-06 Infoseek Corporation Real-time document collection search engine with phrase indexing
JP4025379B2 (ja) 1996-09-17 2007-12-19 株式会社ニューズウオッチ 検索システム
US5870739A (en) * 1996-09-20 1999-02-09 Novell, Inc. Hybrid query apparatus and method
US5893116A (en) * 1996-09-30 1999-04-06 Novell, Inc. Accessing network resources using network resource replicator and captured login script for use when the computer is disconnected from the network
US5870740A (en) * 1996-09-30 1999-02-09 Apple Computer, Inc. System and method for improving the ranking of information retrieval results for short queries
GB2323003B (en) * 1996-10-02 2001-07-04 Nippon Telegraph & Telephone Method and apparatus for graphically displaying hierarchical structure
GB2331166B (en) * 1997-11-06 2002-09-11 Ibm Database search engine
US5966126A (en) 1996-12-23 1999-10-12 Szabo; Andrew J. Graphic user interface for database system
US6285999B1 (en) 1997-01-10 2001-09-04 The Board Of Trustees Of The Leland Stanford Junior University Method for node ranking in a linked database
US6415319B1 (en) 1997-02-07 2002-07-02 Sun Microsystems, Inc. Intelligent network browser using incremental conceptual indexer
US5960383A (en) 1997-02-25 1999-09-28 Digital Equipment Corporation Extraction of key sections from texts using automatic indexing techniques
JPH10240757A (ja) 1997-02-27 1998-09-11 Hitachi Ltd 協調分散検索システム
US5890147A (en) * 1997-03-07 1999-03-30 Microsoft Corporation Scope testing of documents in a search engine using document to folder mapping
US5848404A (en) 1997-03-24 1998-12-08 International Business Machines Corporation Fast query search in large dimension database
US6272507B1 (en) 1997-04-09 2001-08-07 Xerox Corporation System for ranking search results from a collection of documents using spreading activation techniques
US6256675B1 (en) 1997-05-06 2001-07-03 At&T Corp. System and method for allocating requests for objects and managing replicas of objects on a network
AUPO710597A0 (en) * 1997-06-02 1997-06-26 Knowledge Horizons Pty. Ltd. Methods and systems for knowledge management
US6029164A (en) * 1997-06-16 2000-02-22 Digital Equipment Corporation Method and apparatus for organizing and accessing electronic mail messages using labels and full text and label indexing
US6012053A (en) * 1997-06-23 2000-01-04 Lycos, Inc. Computer system with user-controlled relevance ranking of search results
JPH1125104A (ja) 1997-06-30 1999-01-29 Canon Inc 情報処理装置および方法
JPH1125119A (ja) 1997-06-30 1999-01-29 Canon Inc ハイパーテキスト閲覧システム
US5933822A (en) 1997-07-22 1999-08-03 Microsoft Corporation Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision
JPH1145243A (ja) 1997-07-25 1999-02-16 Just Syst Corp 索引作成支援装置およびその装置としてコンピュータを機能させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体
US5983216A (en) 1997-09-12 1999-11-09 Infoseek Corporation Performing automated document collection and selection by providing a meta-index with meta-index values indentifying corresponding document collections
US6182113B1 (en) * 1997-09-16 2001-01-30 International Business Machines Corporation Dynamic multiplexing of hyperlinks and bookmarks
US5956722A (en) 1997-09-23 1999-09-21 At&T Corp. Method for effective indexing of partially dynamic documents
US6999959B1 (en) * 1997-10-10 2006-02-14 Nec Laboratories America, Inc. Meta search engine
US6026398A (en) * 1997-10-16 2000-02-15 Imarket, Incorporated System and methods for searching and matching databases
US6070191A (en) 1997-10-17 2000-05-30 Lucent Technologies Inc. Data distribution techniques for load-balanced fault-tolerant web access
US6351467B1 (en) * 1997-10-27 2002-02-26 Hughes Electronics Corporation System and method for multicasting multimedia content
US6594682B2 (en) 1997-10-28 2003-07-15 Microsoft Corporation Client-side system for scheduling delivery of web content and locally managing the web content
US6128701A (en) 1997-10-28 2000-10-03 Cache Flow, Inc. Adaptive and predictive cache refresh policy
US5991756A (en) 1997-11-03 1999-11-23 Yahoo, Inc. Information retrieval from hierarchical compound documents
US5943670A (en) 1997-11-21 1999-08-24 International Business Machines Corporation System and method for categorizing objects in combined categories
US5987457A (en) 1997-11-25 1999-11-16 Acceleration Software International Corporation Query refinement method for searching documents
US6473752B1 (en) 1997-12-04 2002-10-29 Micron Technology, Inc. Method and system for locating documents based on previously accessed documents
US6389436B1 (en) 1997-12-15 2002-05-14 International Business Machines Corporation Enhanced hypertext categorization using hyperlinks
US6145003A (en) 1997-12-17 2000-11-07 Microsoft Corporation Method of web crawling utilizing address mapping
US7010532B1 (en) * 1997-12-31 2006-03-07 International Business Machines Corporation Low overhead methods and apparatus for shared access storage devices
US6151624A (en) 1998-02-03 2000-11-21 Realnames Corporation Navigating network resources based on metadata
JP3998794B2 (ja) 1998-02-18 2007-10-31 株式会社野村総合研究所 Browsing client-server system
KR100285265B1 (ko) * 1998-02-25 2001-04-02 윤덕용 Inverted index storage structure using sub-indexes and large objects for tight coupling of a database management system and information retrieval
US6185558B1 (en) * 1998-03-03 2001-02-06 Amazon.Com, Inc. Identifying the items most relevant to a current query based on items selected in connection with similar queries
US5913210A (en) 1998-03-27 1999-06-15 Call; Charles G. Methods and apparatus for disseminating product information via the internet
US6125361A (en) 1998-04-10 2000-09-26 International Business Machines Corporation Feature diffusion across hyperlinks
US6151595A (en) 1998-04-17 2000-11-21 Xerox Corporation Methods for interactive visualization of spreading activation using time tubes and disk trees
US6167402A (en) 1998-04-27 2000-12-26 Sun Microsystems, Inc. High performance message store
US6240407B1 (en) 1998-04-29 2001-05-29 International Business Machines Corp. Method and apparatus for creating an index in a database system
US6314421B1 (en) 1998-05-12 2001-11-06 David M. Sharnoff Method and apparatus for indexing documents for message filtering
JPH11328191A (ja) 1998-05-13 1999-11-30 Nec Corp WWW robot search system
US6098064A (en) 1998-05-22 2000-08-01 Xerox Corporation Prefetching and caching documents according to probability ranked need S list
US6285367B1 (en) 1998-05-26 2001-09-04 International Business Machines Corporation Method and apparatus for displaying and navigating a graph
US6182085B1 (en) * 1998-05-28 2001-01-30 International Business Machines Corporation Collaborative team crawling:Large scale information gathering over the internet
US6208988B1 (en) * 1998-06-01 2001-03-27 Bigchalk.Com, Inc. Method for identifying themes associated with a search query using metadata and for organizing documents responsive to the search query in accordance with the themes
CA2334616A1 (en) 1998-06-08 1999-12-16 Kaufman Consulting Services Ltd. Method and system for retrieving relevant documents from a database
US6006225A (en) 1998-06-15 1999-12-21 Amazon.Com Refining search queries by the suggestion of correlated terms from prior searches
JP3665480B2 (ja) * 1998-06-24 2005-06-29 富士通株式会社 Document organizing apparatus and method
US6216123B1 (en) * 1998-06-24 2001-04-10 Novell, Inc. Method and system for rapid retrieval in a full text indexing system
US6638314B1 (en) 1998-06-26 2003-10-28 Microsoft Corporation Method of web crawling utilizing crawl numbers
US6424966B1 (en) 1998-06-30 2002-07-23 Microsoft Corporation Synchronizing crawler with notification source
EP1105819B1 (en) 1998-08-26 2008-03-19 Fractal Edge Limited Methods and devices for mapping data files
US6324551B1 (en) * 1998-08-31 2001-11-27 Xerox Corporation Self-contained document management based on document properties
RU2138076C1 (ru) 1998-09-14 1999-09-20 Закрытое акционерное общество "МедиаЛингва" System for searching information in a computer network
US6115709A (en) 1998-09-18 2000-09-05 Tacit Knowledge Systems, Inc. Method and system for constructing a knowledge profile of a user having unrestricted and restricted access portions according to respective levels of confidence of content of the portions
US6549897B1 (en) 1998-10-09 2003-04-15 Microsoft Corporation Method and system for calculating phrase-document importance
US6360215B1 (en) * 1998-11-03 2002-03-19 Inktomi Corporation Method and apparatus for retrieving documents based on information other than document content
US6385602B1 (en) 1998-11-03 2002-05-07 E-Centives, Inc. Presentation of search results using dynamic categorization
US6574632B2 (en) * 1998-11-18 2003-06-03 Harris Corporation Multiple engine information retrieval and visualization system
US6628304B2 (en) 1998-12-09 2003-09-30 Cisco Technology, Inc. Method and apparatus providing a graphical user interface for representing and navigating hierarchical networks
US6167369A (en) 1998-12-23 2000-12-26 Xerox Company Automatic language identification using both N-gram and word information
JP2000194713A (ja) 1998-12-25 2000-07-14 Nippon Telegr & Teleph Corp <Ntt> Character string search method and apparatus, and storage medium storing a character string search program
US6922699B2 (en) 1999-01-26 2005-07-26 Xerox Corporation System and method for quantitatively representing data objects in vector space
US6418433B1 (en) 1999-01-28 2002-07-09 International Business Machines Corporation System and method for focussed web crawling
JP3347088B2 (ja) 1999-02-12 2002-11-20 インターナショナル・ビジネス・マシーンズ・コーポレーション Related information search method and system
US6510406B1 (en) 1999-03-23 2003-01-21 Mathsoft, Inc. Inverse inference engine for high performance web search
US6862710B1 (en) * 1999-03-23 2005-03-01 Insightful Corporation Internet navigation using soft hyperlinks
US6763496B1 (en) 1999-03-31 2004-07-13 Microsoft Corporation Method for promoting contextual information to display pages containing hyperlinks
US6304864B1 (en) 1999-04-20 2001-10-16 Textwise Llc System for retrieving multimedia information from the internet using multiple evolving intelligent agents
US6336117B1 (en) * 1999-04-30 2002-01-01 International Business Machines Corporation Content-indexing search system and method providing search results consistent with content filtering and blocking policies implemented in a blocking engine
US6327590B1 (en) 1999-05-05 2001-12-04 Xerox Corporation System and method for collaborative ranking of search results employing user and group profiles derived from document collection content analysis
US7835943B2 (en) 1999-05-28 2010-11-16 Yahoo! Inc. System and method for providing place and price protection in a search result list generated by a computer network search engine
US6990628B1 (en) * 1999-06-14 2006-01-24 Yahoo! Inc. Method and apparatus for measuring similarity among electronic documents
US7072888B1 (en) 1999-06-16 2006-07-04 Triogo, Inc. Process for improving search engine efficiency using feedback
US6973490B1 (en) 1999-06-23 2005-12-06 Savvis Communications Corp. Method and system for object-level web performance and analysis
US6547829B1 (en) 1999-06-30 2003-04-15 Microsoft Corporation Method and system for detecting duplicate documents in web crawls
US6631369B1 (en) 1999-06-30 2003-10-07 Microsoft Corporation Method and system for incremental web crawling
US6873982B1 (en) * 1999-07-16 2005-03-29 International Business Machines Corporation Ordering of database search results based on user feedback
US6557036B1 (en) 1999-07-20 2003-04-29 Sun Microsystems, Inc. Methods and apparatus for site wide monitoring of electronic mail systems
US7181438B1 (en) * 1999-07-21 2007-02-20 Alberti Anemometer, Llc Database access system
US6598047B1 (en) 1999-07-26 2003-07-22 David W. Russell Method and system for searching text
CA2279119C (en) 1999-07-29 2004-10-19 Ibm Canada Limited-Ibm Canada Limitee Heuristic-based conditional data indexing
JP3931496B2 (ja) 1999-08-11 2007-06-13 富士ゼロックス株式会社 Hypertext analysis apparatus
US6442606B1 (en) 1999-08-12 2002-08-27 Inktomi Corporation Method and apparatus for identifying spoof documents
US6636853B1 (en) 1999-08-30 2003-10-21 Morphism, Llc Method and apparatus for representing and navigating search results
US6381597B1 (en) * 1999-10-07 2002-04-30 U-Know Software Corporation Electronic shopping agent which is capable of operating with vendor sites which have disparate formats
US7346604B1 (en) * 1999-10-15 2008-03-18 Hewlett-Packard Development Company, L.P. Method for ranking hypertext search results by analysis of hyperlinks from expert documents and keyword scope
US6687698B1 (en) * 1999-10-18 2004-02-03 Fisher Rosemount Systems, Inc. Accessing and updating a configuration database from distributed physical locations within a process control system
JP3772606B2 (ja) 1999-10-19 2006-05-10 株式会社日立製作所 Electronic document management method and system, and recording medium
AU1039301A (en) 1999-10-29 2001-05-08 British Telecommunications Public Limited Company Method and apparatus for processing queries
US6263364B1 (en) 1999-11-02 2001-07-17 Alta Vista Company Web crawler system using plurality of parallel priority level queues having distinct associated download priority levels for prioritizing document downloading and maintaining document freshness
US6351755B1 (en) * 1999-11-02 2002-02-26 Alta Vista Company System and method for associating an extensible set of data with documents downloaded by a web crawler
US6418453B1 (en) 1999-11-03 2002-07-09 International Business Machines Corporation Network repository service for efficient web crawling
US6418452B1 (en) 1999-11-03 2002-07-09 International Business Machines Corporation Network repository service directory for efficient web crawling
US6539376B1 (en) * 1999-11-15 2003-03-25 International Business Machines Corporation System and method for the automatic mining of new relationships
US7016540B1 (en) * 1999-11-24 2006-03-21 Nec Corporation Method and system for segmentation, classification, and summarization of video images
US6886129B1 (en) 1999-11-24 2005-04-26 International Business Machines Corporation Method and system for trawling the World-wide Web to identify implicitly-defined communities of web pages
US6772141B1 (en) * 1999-12-14 2004-08-03 Novell, Inc. Method and apparatus for organizing and using indexes utilizing a search decision table
US6546388B1 (en) 2000-01-14 2003-04-08 International Business Machines Corporation Metadata search results ranking system
US6883135B1 (en) 2000-01-28 2005-04-19 Microsoft Corporation Proxy server using a statistical model
US7240067B2 (en) 2000-02-08 2007-07-03 Sybase, Inc. System and methodology for extraction and aggregation of data from dynamic content
US6931397B1 (en) 2000-02-11 2005-08-16 International Business Machines Corporation System and method for automatic generation of dynamic search abstracts contain metadata by crawler
US6910029B1 (en) 2000-02-22 2005-06-21 International Business Machines Corporation System for weighted indexing of hierarchical documents
JP2001265774A (ja) 2000-03-16 2001-09-28 Nippon Telegr & Teleph Corp <Ntt> Information retrieval method and apparatus, recording medium storing an information retrieval program, and hypertext information retrieval system
US6516312B1 (en) * 2000-04-04 2003-02-04 International Business Machines Corporation System and method for dynamically associating keywords with domain-specific search engine queries
US6633867B1 (en) 2000-04-05 2003-10-14 International Business Machines Corporation System and method for providing a session query within the context of a dynamic search result set
US6549896B1 (en) 2000-04-07 2003-04-15 Nec Usa, Inc. System and method employing random walks for mining web page associations and usage to optimize user-oriented web page refresh and pre-fetch scheduling
US6718365B1 (en) 2000-04-13 2004-04-06 International Business Machines Corporation Method, system, and program for ordering search results using an importance weighting
US6859800B1 (en) * 2000-04-26 2005-02-22 Global Information Research And Technologies Llc System for fulfilling an information need
US6772160B2 (en) * 2000-06-08 2004-08-03 Ingenuity Systems, Inc. Techniques for facilitating information acquisition and storage
US6741986B2 (en) * 2000-12-08 2004-05-25 Ingenuity Systems, Inc. Method and system for performing information extraction and quality control for a knowledgebase
DE10029644B4 (de) 2000-06-16 2008-02-07 Deutsche Telekom Ag Method for relevance evaluation in the indexing of hypertext documents by means of a search engine
JP3573688B2 (ja) * 2000-06-28 2004-10-06 松下電器産業株式会社 Similar document retrieval apparatus and related keyword extraction apparatus
US6678692B1 (en) * 2000-07-10 2004-01-13 Northrop Grumman Corporation Hierarchy statistical analysis system and method
JP2002024015A (ja) * 2000-07-11 2002-01-25 Misawa Van Corp Method for constructing a client-server system
US6601075B1 (en) 2000-07-27 2003-07-29 International Business Machines Corporation System and method of ranking and retrieving documents based on authority scores of schemas and documents
US6633868B1 (en) 2000-07-28 2003-10-14 Shermann Loyall Min System and method for context-based document retrieval
US7080073B1 (en) 2000-08-18 2006-07-18 Firstrain, Inc. Method and apparatus for focused crawling
KR100378240B1 (ko) 2000-08-23 2003-03-29 학교법인 통진학원 Method for adjusting document ranking by applying entropy and a user profile
US6959326B1 (en) 2000-08-24 2005-10-25 International Business Machines Corporation Method, system, and program for gathering indexable metadata on content at a data repository
US20030217052A1 (en) 2000-08-24 2003-11-20 Celebros Ltd. Search engine method and apparatus
WO2002017212A1 (en) * 2000-08-25 2002-02-28 Jonas Ulenas Method and apparatus for obtaining consumer product preferences through product selection and evaluation
JP3472540B2 (ja) 2000-09-11 2003-12-02 日本電信電話株式会社 Server selection apparatus, server selection method, and recording medium storing a server selection program
NO313399B1 (no) * 2000-09-14 2002-09-23 Fast Search & Transfer Asa Method for searching and analysing information in data networks
US6598051B1 (en) 2000-09-19 2003-07-22 Altavista Company Web page connectivity server
JP3525885B2 (ja) 2000-10-25 2004-05-10 日本電信電話株式会社 Multifaceted search service method and recording medium storing a program therefor
US6560600B1 (en) 2000-10-25 2003-05-06 Alta Vista Company Method and apparatus for ranking Web page search results
JP2002140365A (ja) * 2000-11-01 2002-05-17 Mitsubishi Electric Corp Data retrieval method
US7200606B2 (en) 2000-11-07 2007-04-03 The Regents Of The University Of California Method and system for selecting documents by measuring document quality
US6622140B1 (en) 2000-11-15 2003-09-16 Justsystem Corporation Method and apparatus for analyzing affect and emotion in text
JP2002157271A (ja) 2000-11-20 2002-05-31 Yozan Inc Browser device, server device, recording medium, search system, and search method
US20020103920A1 (en) 2000-11-21 2002-08-01 Berkun Ken Alan Interpretive stream metadata extraction
US8402068B2 (en) 2000-12-07 2013-03-19 Half.Com, Inc. System and method for collecting, associating, normalizing and presenting product and vendor information on a distributed network
US20020078045A1 (en) 2000-12-14 2002-06-20 Rabindranath Dutta System, method, and program for ranking search results using user category weighting
US6898592B2 (en) * 2000-12-27 2005-05-24 Microsoft Corporation Scoping queries in a search engine
JP2002202992A (ja) 2000-12-28 2002-07-19 Speed System:Kk Home page search system
US6778997B2 (en) 2001-01-05 2004-08-17 International Business Machines Corporation XML: finding authoritative pages for mining communities based on page structure criteria
US7356530B2 (en) 2001-01-10 2008-04-08 Looksmart, Ltd. Systems and methods of retrieving relevant information
US6766316B2 (en) 2001-01-18 2004-07-20 Science Applications International Corporation Method and system of ranking and clustering for document indexing and retrieval
US6526440B1 (en) * 2001-01-30 2003-02-25 Google, Inc. Ranking search results by reranking the results based on local inter-connectivity
US20020103798A1 (en) 2001-02-01 2002-08-01 Abrol Mani S. Adaptive document ranking method based on user behavior
US20020107886A1 (en) 2001-02-07 2002-08-08 Gentner Donald R. Method and apparatus for automatic document electronic versioning system
WO2002063493A1 (en) 2001-02-08 2002-08-15 2028, Inc. Methods and systems for automated semantic knowledge leveraging graph theoretic analysis and the inherent structure of communication
US20040003028A1 (en) * 2002-05-08 2004-01-01 David Emmett Automatic display of web content to smaller display devices: improved summarization and navigation
JP2002245089A (ja) 2001-02-19 2002-08-30 Hitachi Eng Co Ltd Web page search system, secondary information collection device, and interface device
US7627596B2 (en) 2001-02-22 2009-12-01 International Business Machines Corporation Retrieving handwritten documents using multiple document recognizers and techniques allowing both typed and handwritten queries
US8001118B2 (en) 2001-03-02 2011-08-16 Google Inc. Methods and apparatus for employing usage statistics in document retrieval
US7269545B2 (en) 2001-03-30 2007-09-11 Nec Laboratories America, Inc. Method for retrieving answers from an information retrieval system
US20020169770A1 (en) 2001-04-27 2002-11-14 Kim Brian Seong-Gon Apparatus and method that categorize a collection of documents into a hierarchy of categories that are defined by the collection of documents
US7188106B2 (en) * 2001-05-01 2007-03-06 International Business Machines Corporation System and method for aggregating ranking results from various sources to improve the results of web searching
US20020165860A1 (en) 2001-05-07 2002-11-07 Nec Research Institute, Inc. Selective retrieval metasearch engine
US6738764B2 (en) 2001-05-08 2004-05-18 Verity, Inc. Apparatus and method for adaptively ranking search results
IES20020335A2 (en) * 2001-05-10 2002-11-13 Changing Worlds Ltd Intelligent internet website with hierarchical menu
US6865295B2 (en) 2001-05-11 2005-03-08 Koninklijke Philips Electronics N.V. Palette-based histogram matching with recursive histogram vector generation
US6782383B2 (en) 2001-06-18 2004-08-24 Siebel Systems, Inc. System and method to implement a persistent and dismissible search center frame
US6947920B2 (en) 2001-06-20 2005-09-20 Oracle International Corporation Method and system for response time optimization of data query rankings and retrieval
US7519529B1 (en) 2001-06-29 2009-04-14 Microsoft Corporation System and methods for inferring informational goals and preferred level of detail of results in response to questions posed to an automated information-retrieval or question-answering service
CN1533540A (zh) * 2001-07-19 2004-09-29 Method and system for reorganizing a tablespace in a database
US7039234B2 (en) * 2001-07-19 2006-05-02 Microsoft Corporation Electronic ink as a software object
US6928425B2 (en) * 2001-08-13 2005-08-09 Xerox Corporation System for propagating enrichment between documents
US6868411B2 (en) * 2001-08-13 2005-03-15 Xerox Corporation Fuzzy text categorizer
KR100509276B1 (ko) 2001-08-20 2005-08-22 엔에이치엔(주) Web page search method based on per-page visit popularity, and apparatus therefor
JP3895955B2 (ja) 2001-08-24 2007-03-22 株式会社東芝 Information retrieval method and information retrieval system
US7076483B2 (en) 2001-08-27 2006-07-11 Xyleme Sa Ranking nodes in a graph
US20030046389A1 (en) * 2001-09-04 2003-03-06 Thieme Laura M. Method for monitoring a web site's keyword visibility in search engines and directories and resulting traffic from such keyword visibility
US6970863B2 (en) * 2001-09-18 2005-11-29 International Business Machines Corporation Front-end weight factor search criteria
US6766422B2 (en) 2001-09-27 2004-07-20 Siemens Information And Communication Networks, Inc. Method and system for web caching based on predictive usage
US6944609B2 (en) 2001-10-18 2005-09-13 Lycos, Inc. Search results using editor feedback
US7428695B2 (en) 2001-10-22 2008-09-23 Hewlett-Packard Development Company, L.P. System for automatic generation of arbitrarily indexed hyperlinked text
JP2003208434A (ja) 2001-11-07 2003-07-25 Nec Corp Information retrieval system and information retrieval method used therein
US20030101183A1 (en) * 2001-11-26 2003-05-29 Navin Kabra Information retrieval index allowing updating while in use
US6763362B2 (en) 2001-11-30 2004-07-13 Micron Technology, Inc. Method and system for updating a search engine
US7565367B2 (en) 2002-01-15 2009-07-21 Iac Search & Media, Inc. Enhanced popularity ranking
JP3871201B2 (ja) 2002-01-29 2007-01-24 ソニー株式会社 Content provision and acquisition system
US6829606B2 (en) 2002-02-14 2004-12-07 Infoglide Software Corporation Similarity search engine for use with relational databases
JP4021681B2 (ja) 2002-02-22 2007-12-12 日本電信電話株式会社 Page rating/filtering method and apparatus, page rating/filtering program, and computer-readable recording medium storing the program
US20060004732A1 (en) * 2002-02-26 2006-01-05 Odom Paul S Search engine methods and systems for generating relevant search results and advertisements
US6934714B2 (en) 2002-03-04 2005-08-23 Intelesis Engineering, Inc. Method and system for identification and maintenance of families of data records
US7693830B2 (en) * 2005-08-10 2010-04-06 Google Inc. Programmable search engine
KR100490748B1 (ko) 2002-04-11 2005-05-24 한국전자통신연구원 Effective home page search method through similarity recalculation based on URL inclusion relationships
US7039631B1 (en) 2002-05-24 2006-05-02 Microsoft Corporation System and method for providing search results with configurable scoring formula
RU2273879C2 (ru) 2002-05-28 2006-04-10 Владимир Владимирович Насыпный Method for synthesizing a self-learning system for extracting knowledge from text documents for search systems
US20040006559A1 (en) * 2002-05-29 2004-01-08 Gange David M. System, apparatus, and method for user tunable and selectable searching of a database using a weigthted quantized feature vector
AU2003243533A1 (en) 2002-06-12 2003-12-31 Jena Jordahl Data storage, retrieval, manipulation and display tools enabling multiple hierarchical points of view
JP3922693B2 (ja) 2002-06-17 2007-05-30 Necシステムテクノロジー株式会社 Internet information retrieval system
JP2004054588A (ja) 2002-07-19 2004-02-19 Just Syst Corp Document retrieval apparatus, document retrieval method, and program for causing a computer to execute the method
CA2395905A1 (en) * 2002-07-26 2004-01-26 Teraxion Inc. Multi-grating tunable chromatic dispersion compensator
US7599911B2 (en) * 2002-08-05 2009-10-06 Yahoo! Inc. Method and apparatus for search ranking using human input and automated ranking
US7152059B2 (en) 2002-08-30 2006-12-19 Emergency24, Inc. System and method for predicting additional search results of a computerized database search user based on an initial search query
US7013458B2 (en) * 2002-09-09 2006-03-14 Sun Microsystems, Inc. Method and apparatus for associating metadata attributes with program elements
JP2004164555A (ja) 2002-09-17 2004-06-10 Fuji Xerox Co Ltd Search apparatus and method, and index construction apparatus and method therefor
US20040064442A1 (en) 2002-09-27 2004-04-01 Popovitch Steven Gregory Incremental search engine
US6886010B2 (en) 2002-09-30 2005-04-26 The United States Of America As Represented By The Secretary Of The Navy Method for data and text mining and literature-based discovery
US7085755B2 (en) 2002-11-07 2006-08-01 Thomson Global Resources Ag Electronic document repository management and access system
US7231379B2 (en) * 2002-11-19 2007-06-12 Noema, Inc. Navigation in a hierarchical structured transaction processing system
US7386527B2 (en) 2002-12-06 2008-06-10 Kofax, Inc. Effective multi-class support vector machine classification
US7020648B2 (en) 2002-12-14 2006-03-28 International Business Machines Corporation System and method for identifying and utilizing a secondary index to access a database using a management system without an internal catalogue of online metadata
US7734565B2 (en) 2003-01-18 2010-06-08 Yahoo! Inc. Query string matching method and apparatus
US20040148278A1 (en) 2003-01-22 2004-07-29 Amir Milo System and method for providing content warehouse
RU2236699C1 (ru) 2003-02-25 2004-09-20 Открытое акционерное общество "Телепортал. Ру" Method for searching and selecting information with increased relevance
JP4299022B2 (ja) 2003-02-28 2009-07-22 トヨタ自動車株式会社 Index generation apparatus for content search
US20040181515A1 (en) 2003-03-13 2004-09-16 International Business Machines Corporation Group administration of universal resource identifiers with members identified in search result
US6947930B2 (en) 2003-03-21 2005-09-20 Overture Services, Inc. Systems and methods for interactive search query refinement
DE60315947T2 (de) 2003-03-27 2008-05-21 Sony Deutschland Gmbh Method for language modeling
US7028029B2 (en) 2003-03-28 2006-04-11 Google Inc. Adaptive computation of ranking
US7216123B2 (en) * 2003-03-28 2007-05-08 Board Of Trustees Of The Leland Stanford Junior University Methods for ranking nodes in large directed graphs
US7451130B2 (en) 2003-06-16 2008-11-11 Google Inc. System and method for providing preferred country biasing of search results
US7451129B2 (en) 2003-03-31 2008-11-11 Google Inc. System and method for providing preferred language ordering of search results
US7051023B2 (en) 2003-04-04 2006-05-23 Yahoo! Inc. Systems and methods for generating concept units from search queries
US7197497B2 (en) * 2003-04-25 2007-03-27 Overture Services, Inc. Method and apparatus for machine learning a document relevance function
US7283997B1 (en) 2003-05-14 2007-10-16 Apple Inc. System and method for ranking the relevance of documents retrieved by a query
US7502779B2 (en) * 2003-06-05 2009-03-10 International Business Machines Corporation Semantics-based searching for information in a distributed data processing system
US8239380B2 (en) 2003-06-20 2012-08-07 Microsoft Corporation Systems and methods to tune a general-purpose search engine for a search entry point
US7228301B2 (en) 2003-06-27 2007-06-05 Microsoft Corporation Method for normalizing document metadata to improve search results using an alias relationship directory service
US7630963B2 (en) * 2003-06-30 2009-12-08 Microsoft Corporation Fast ranked full-text searching
US7308643B1 (en) 2003-07-03 2007-12-11 Google Inc. Anchor tag indexing in a web crawler system
KR100543255B1 (ko) 2003-08-19 2006-01-20 문영섭 Cutting apparatus for welded portions
US20050060186A1 (en) * 2003-08-28 2005-03-17 Blowers Paul A. Prioritized presentation of medical device events
US7505964B2 (en) * 2003-09-12 2009-03-17 Google Inc. Methods and systems for improving a search ranking using related queries
US7454417B2 (en) * 2003-09-12 2008-11-18 Google Inc. Methods and systems for improving a search ranking using population information
US8589373B2 (en) 2003-09-14 2013-11-19 Yaron Mayer System and method for improved searching on the internet or similar networks and especially improved MetaNews and/or improved automatically generated newspapers
US20050071328A1 (en) * 2003-09-30 2005-03-31 Lawrence Stephen R. Personalization of web search
US7346839B2 (en) 2003-09-30 2008-03-18 Google Inc. Information retrieval based on historical data
US7693827B2 (en) 2003-09-30 2010-04-06 Google Inc. Personalization of placed content ordering in search results
US7552109B2 (en) 2003-10-15 2009-06-23 International Business Machines Corporation System, method, and service for collaborative focused crawling of documents on a network
US20050086192A1 (en) 2003-10-16 2005-04-21 Hitachi, Ltd. Method and apparatus for improving the integration between a search engine and one or more file servers
US7346208B2 (en) 2003-10-25 2008-03-18 Hewlett-Packard Development Company, L.P. Image artifact reduction using a neural network
US7231399B1 (en) 2003-11-14 2007-06-12 Google Inc. Ranking documents based on large data sets
US7181447B2 (en) 2003-12-08 2007-02-20 Iac Search And Media, Inc. Methods and systems for conceptually organizing and presenting information
CN100495392C (zh) 2003-12-29 2009-06-03 西安迪戈科技有限责任公司 Intelligent search method
US20060047649A1 (en) * 2003-12-29 2006-03-02 Ping Liang Internet and computer information retrieval and mining with intelligent conceptual filtering, visualization and automation
US7685104B2 (en) * 2004-01-08 2010-03-23 International Business Machines Corporation Dynamic bitmap processing, identification and reusability
US7483891B2 (en) 2004-01-09 2009-01-27 Yahoo, Inc. Content presentation and management system associating base content and relevant additional content
US7392278B2 (en) 2004-01-23 2008-06-24 Microsoft Corporation Building and using subwebs for focused search
US7499913B2 (en) 2004-01-26 2009-03-03 International Business Machines Corporation Method for handling anchor text
JP2004192657A (ja) * 2004-02-09 2004-07-08 Nec Corp Information retrieval system, information retrieval method, and recording medium storing an information retrieval program
US7310632B2 (en) 2004-02-12 2007-12-18 Microsoft Corporation Decision-theoretic web-crawling and predicting web-page change
US7281002B2 (en) 2004-03-01 2007-10-09 International Business Machines Corporation Organizing related search results
US9104689B2 (en) 2004-03-17 2015-08-11 International Business Machines Corporation Method for synchronizing documents for disconnected operation
US7584221B2 (en) 2004-03-18 2009-09-01 Microsoft Corporation Field weighting in text searching
JP2005277445A (ja) * 2004-03-22 2005-10-06 Fuji Xerox Co Ltd Conference video processing apparatus, conference video processing method, and program
US7343374B2 (en) 2004-03-29 2008-03-11 Yahoo! Inc. Computation of page authority weights using personalized bookmarks
US7580568B1 (en) 2004-03-31 2009-08-25 Google Inc. Methods and systems for identifying an image as a representative image for an article
US7693825B2 (en) 2004-03-31 2010-04-06 Google Inc. Systems and methods for ranking implicit search results
US20050251499A1 (en) 2004-05-04 2005-11-10 Zezhen Huang Method and system for searching documents using readers valuation
US7257577B2 (en) 2004-05-07 2007-08-14 International Business Machines Corporation System, method and service for ranking search results using a modular scoring system
US7136851B2 (en) * 2004-05-14 2006-11-14 Microsoft Corporation Method and system for indexing and searching databases
US7260573B1 (en) 2004-05-17 2007-08-21 Google Inc. Personalizing anchor text scores in a search engine
US20050283473A1 (en) 2004-06-17 2005-12-22 Armand Rousso Apparatus, method and system of artificial intelligence for data searching applications
US7716225B1 (en) 2004-06-17 2010-05-11 Google Inc. Ranking documents based on user behavior and/or feature data
US7730012B2 (en) 2004-06-25 2010-06-01 Apple Inc. Methods and systems for managing data
US8131674B2 (en) 2004-06-25 2012-03-06 Apple Inc. Methods and systems for managing data
US7428530B2 (en) 2004-07-01 2008-09-23 Microsoft Corporation Dispersing search engine results by using page category information
US7363296B1 (en) 2004-07-01 2008-04-22 Microsoft Corporation Generating a subindex with relevant attributes to improve querying
US7634461B2 (en) * 2004-08-04 2009-12-15 International Business Machines Corporation System and method for enhancing keyword relevance by user's interest on the search result documents
US7395260B2 (en) * 2004-08-04 2008-07-01 International Business Machines Corporation Method for providing graphical representations of search results in multiple related histograms
US20060036598A1 (en) * 2004-08-09 2006-02-16 Jie Wu Computerized method for ranking linked information items in distributed sources
US20060047643A1 (en) * 2004-08-31 2006-03-02 Chirag Chaman Method and system for a personalized search engine
WO2006033763A2 (en) * 2004-09-16 2006-03-30 Telenor Asa A method, system, and computer program product for searching for, navigating among, and ranking of documents in a personal web
WO2006036781A2 (en) * 2004-09-22 2006-04-06 Perfect Market Technologies, Inc. Search engine using user intent
US7606793B2 (en) 2004-09-27 2009-10-20 Microsoft Corporation System and method for scoping searches using index keys
US7644107B2 (en) * 2004-09-30 2010-01-05 Microsoft Corporation System and method for batched indexing of network documents
US7761448B2 (en) 2004-09-30 2010-07-20 Microsoft Corporation System and method for ranking search results using click distance
US7739277B2 (en) 2004-09-30 2010-06-15 Microsoft Corporation System and method for incorporating anchor text into ranking search results
US7827181B2 (en) * 2004-09-30 2010-11-02 Microsoft Corporation Click distance determination
US20060074883A1 (en) 2004-10-05 2006-04-06 Microsoft Corporation Systems, methods, and interfaces for providing personalized search and information access
US20060074781A1 (en) 2004-10-06 2006-04-06 Leano Hector V System for facilitating turnkey real estate investment in Mexico
US7702599B2 (en) 2004-10-07 2010-04-20 Bernard Widrow System and method for cognitive memory and auto-associative neural network based pattern recognition
US7533092B2 (en) 2004-10-28 2009-05-12 Yahoo! Inc. Link-based spam detection
US7716198B2 (en) 2004-12-21 2010-05-11 Microsoft Corporation Ranking search results using feature extraction
US7698331B2 (en) 2005-01-18 2010-04-13 Yahoo! Inc. Matching and ranking of sponsored search listings incorporating web search technology and web content
US20060173828A1 (en) 2005-02-01 2006-08-03 Outland Research, Llc Methods and apparatus for using personal background data to improve the organization of documents retrieved in response to a search query
US7689615B2 (en) 2005-02-25 2010-03-30 Microsoft Corporation Ranking results using multiple nested ranking
US20060200460A1 (en) 2005-03-03 2006-09-07 Microsoft Corporation System and method for ranking search results using file types
US7792833B2 (en) 2005-03-03 2010-09-07 Microsoft Corporation Ranking search results using language types
US7574436B2 (en) 2005-03-10 2009-08-11 Yahoo! Inc. Reranking and increasing the relevance of the results of Internet searches
US20060206460A1 (en) 2005-03-14 2006-09-14 Sanjay Gadkari Biasing search results
US8095487B2 (en) 2005-03-16 2012-01-10 Yahoo! Inc. System and method for biasing search results based on topic familiarity
WO2006102122A2 (en) 2005-03-18 2006-09-28 Wink Technologies, Inc. Search engine that applies feedback from users to improve search results
US7870147B2 (en) 2005-03-29 2011-01-11 Google Inc. Query revision using known highly-ranked queries
US7693829B1 (en) 2005-04-25 2010-04-06 Google Inc. Search engine with fill-the-blanks capability
US7401073B2 (en) 2005-04-28 2008-07-15 International Business Machines Corporation Term-statistics modification for category-based search
KR100672277B1 (ko) 2005-05-09 2007-01-24 엔에이치엔(주) Personalized search method and search server
JP4648455B2 (ja) 2005-05-06 2011-03-09 エヌエイチエヌ コーポレーション Personalized search method and personalized search system
US7451124B2 (en) 2005-05-12 2008-11-11 Xerox Corporation Method of analyzing documents
US7962462B1 (en) 2005-05-31 2011-06-14 Google Inc. Deriving and using document and site quality signals from search query streams
CA2544324A1 (en) 2005-06-10 2006-12-10 Unicru, Inc. Employee selection via adaptive assessment
US20060282455A1 (en) 2005-06-13 2006-12-14 It Interactive Services Inc. System and method for ranking web content
US7627564B2 (en) 2005-06-21 2009-12-01 Microsoft Corporation High scale adaptive search systems and methods
US7599917B2 (en) * 2005-08-15 2009-10-06 Microsoft Corporation Ranking search results using biased click distance
US7653617B2 (en) * 2005-08-29 2010-01-26 Google Inc. Mobile sitemaps
US7499919B2 (en) * 2005-09-21 2009-03-03 Microsoft Corporation Ranking functions using document usage statistics
US7716226B2 (en) * 2005-09-27 2010-05-11 Patentratings, Llc Method and system for probabilistically quantifying and visualizing relevance between two or more citationally or contextually related data objects
US7689531B1 (en) * 2005-09-28 2010-03-30 Trend Micro Incorporated Automatic charset detection using support vector machines with charset grouping
US20070085716A1 (en) 2005-09-30 2007-04-19 International Business Machines Corporation System and method for detecting matches of small edit distance
US7873624B2 (en) 2005-10-21 2011-01-18 Microsoft Corporation Question answering over structured content on the web
US20070150473A1 (en) 2005-12-22 2007-06-28 Microsoft Corporation Search By Document Type And Relevance
US7814099B2 (en) 2006-01-31 2010-10-12 Louis S. Wang Method for ranking and sorting electronic documents in a search result list based on relevance
US7689559B2 (en) * 2006-02-08 2010-03-30 Telenor Asa Document similarity scoring and ranking method, device and computer program product
US7685091B2 (en) 2006-02-14 2010-03-23 Accenture Global Services Gmbh System and method for online information analysis
EP2016510A1 (en) 2006-04-24 2009-01-21 Telenor ASA Method and device for efficiently ranking documents in a similarity graph
US20070260597A1 (en) 2006-05-02 2007-11-08 Mark Cramer Dynamic search engine results employing user behavior
EP1862916A1 (en) 2006-06-01 2007-12-05 Microsoft Corporation Indexing Documents for Information Retrieval based on additional feedback fields
US20080005068A1 (en) * 2006-06-28 2008-01-03 Microsoft Corporation Context-based search, retrieval, and awareness
US20080016053A1 (en) * 2006-07-14 2008-01-17 Bea Systems, Inc. Administration Console to Select Rank Factors
US8595245B2 (en) 2006-07-26 2013-11-26 Xerox Corporation Reference resolution for text enrichment and normalization in mining mixed data
US7720830B2 (en) 2006-07-31 2010-05-18 Microsoft Corporation Hierarchical conditional random fields for web extraction
KR100818553B1 (ko) 2006-08-22 2008-04-01 에스케이커뮤니케이션즈 주식회사 Method for assigning document rankings and computer-readable recording medium storing a program capable of performing the same
US20080140641A1 (en) 2006-12-07 2008-06-12 Yahoo! Inc. Knowledge and interests based search term ranking for search results validation
US7792883B2 (en) 2006-12-11 2010-09-07 Google Inc. Viewport-relative scoring for location search queries
JP4839195B2 (ja) 2006-12-12 2011-12-21 日本電信電話株式会社 Method for calculating the relevance of XML documents, program therefor, and information processing apparatus
US7685084B2 (en) * 2007-02-09 2010-03-23 Yahoo! Inc. Term expansion using associative matching of labeled term pairs
US7996392B2 (en) 2007-06-27 2011-08-09 Oracle International Corporation Changing ranking algorithms based on customer settings
US20090006358A1 (en) 2007-06-27 2009-01-01 Microsoft Corporation Search results
US8122032B2 (en) * 2007-07-20 2012-02-21 Google Inc. Identifying and linking similar passages in a digital text corpus
US8201081B2 (en) * 2007-09-07 2012-06-12 Google Inc. Systems and methods for processing inoperative document links
US7840569B2 (en) 2007-10-18 2010-11-23 Microsoft Corporation Enterprise relevancy ranking using a neural network
US9348912B2 (en) 2007-10-18 2016-05-24 Microsoft Technology Licensing, Llc Document length as a static relevance feature for ranking search results
US20090106221A1 (en) 2007-10-18 2009-04-23 Microsoft Corporation Ranking and Providing Search Results Based In Part On A Number Of Click-Through Features
WO2009072174A1 (ja) 2007-12-03 2009-06-11 Pioneer Corporation Information retrieval device, information retrieval method, and retrieval processing program
US7707229B2 (en) 2007-12-12 2010-04-27 Yahoo! Inc. Unsupervised detection of web pages corresponding to a similarity class
JP2009146248A (ja) 2007-12-17 2009-07-02 Fujifilm Corp Content presentation system and program
US20090164929A1 (en) 2007-12-20 2009-06-25 Microsoft Corporation Customizing Search Results
JP2009204442A (ja) 2008-02-28 2009-09-10 Athlete Fa Kk Weighing device for granular material
US8412702B2 (en) 2008-03-12 2013-04-02 Yahoo! Inc. System, method, and/or apparatus for reordering search results
US7974974B2 (en) 2008-03-20 2011-07-05 Microsoft Corporation Techniques to perform relative ranking for search results
JP5328212B2 (ja) 2008-04-10 2013-10-30 株式会社エヌ・ティ・ティ・ドコモ Recommendation information evaluation device and recommendation information evaluation method
US8812493B2 (en) 2008-04-11 2014-08-19 Microsoft Corporation Search results ranking using editing distance and document information
CN101359331B (zh) 2008-05-04 2014-03-19 索意互动(北京)信息技术有限公司 Method and system for re-ranking search results
US20090307209A1 (en) 2008-06-10 2009-12-10 David Carmel Term-statistics modification for category-based search
US8326829B2 (en) 2008-10-17 2012-12-04 Centurylink Intellectual Property Llc System and method for displaying publication dates for search results
US8224847B2 (en) 2009-10-29 2012-07-17 Microsoft Corporation Relevant individual searching using managed property and ranking features
US8527507B2 (en) 2009-12-04 2013-09-03 Microsoft Corporation Custom ranking model schema
US8422786B2 (en) 2010-03-26 2013-04-16 International Business Machines Corporation Analyzing documents using stored templates
US8738635B2 (en) 2010-06-01 2014-05-27 Microsoft Corporation Detection of junk in search result ranking
US8370331B2 (en) 2010-07-02 2013-02-05 Business Objects Software Limited Dynamic visualization of search results on a graphical user interface
US9495462B2 (en) 2012-01-27 2016-11-15 Microsoft Technology Licensing, Llc Re-ranking search results

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1226033A (zh) * 1997-11-19 1999-08-18 国际商业机器公司 Method and apparatus for specifying data processing operations
US6199081B1 (en) * 1998-06-30 2001-03-06 Microsoft Corporation Automatic tagging of documents and exclusion by content
US6598040B1 (en) * 2000-08-14 2003-07-22 International Business Machines Corporation Method and system for processing electronic search expressions

Also Published As

Publication number Publication date
JP6006267B2 (ja) 2016-10-12
EP1659505A1 (en) 2006-05-24
JP2014222538A (ja) 2014-11-27
JP2016181306A (ja) 2016-10-13
US8843486B2 (en) 2014-09-23
JP5323300B2 (ja) 2013-10-23
US20100017403A1 (en) 2010-01-21
JP2012069152A (ja) 2012-04-05
KR100981857B1 (ko) 2010-09-13
US7606793B2 (en) 2009-10-20
JP2006092515A (ja) 2006-04-06
KR20060049239A (ko) 2006-05-18
CN1755677A (zh) 2006-04-05
US20060074865A1 (en) 2006-04-06

Similar Documents

Publication Publication Date Title
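CN1755677B (zh) System and method for scoping searches using index keys
JP4856627B2 (ja) Partial query caching
JP4406609B2 (ja) Technique for managing multiple hierarchies of data from a single interface
CN1647080B (zh) Method and computer for accessing data in a multi-database environment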
US7778966B2 (en) Method and system for attribute management in a namespace
US7107261B2 (en) Search engine providing match and alternative answer
US7707168B2 (en) Method and system for data retrieval from heterogeneous data sources
US8271530B2 (en) Method and mechanism for managing and accessing static and dynamic data
CN100465953C (zh) Method and system for querying physical fields using a logical model or for processing abstract queries
US20120059838A1 (en) Providing entity-specific content in response to a search query
US20090234823A1 (en) Remote Access of Heterogeneous Data
JP2006107446A (ja) System and method for batched indexing of network documents
JP2006012146A (ja) System and method for impact analysis
CN102725757A (zh) Contextual queries
US11762775B2 (en) Systems and methods for implementing overlapping data caching for object application program interfaces
US7970867B2 (en) Hypermedia management system
KR100672278B1 (ko) Personalized search method and search server using the favorites list of a web browser
Hughes et al. A metadata search engine for digital language archives
US7089232B2 (en) Method of synchronizing distributed but interconnected data repositories
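JP3565117B2 (ja) Method for accessing multiple heterogeneous information sources, client device, and storage medium storing a program for accessing multiple heterogeneous information sources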
US10198249B1 (en) Accessing schema-free databases
WO2007099331A2 (en) Data processing apparatus
Williams Planning and evaluation of federated queries on the web

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: MICROSOFT TECHNOLOGY LICENSING LLC

Free format text: FORMER OWNER: MICROSOFT CORP.

Effective date: 20150504

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20150504

Address after: Washington State

Patentee after: Microsoft Technology Licensing LLC

Address before: Washington State

Patentee before: Microsoft Corp.