WO2012106941A1 - Method and device for full-text search - Google Patents

Method and device for full-text search

Info

Publication number
WO2012106941A1
Authority
WO
WIPO (PCT)
Prior art keywords
classification
search
item
search result
index
Prior art date
Application number
PCT/CN2011/077788
Other languages
French (fr)
Chinese (zh)
Inventor
樊彪
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2011/077788 (WO2012106941A1)
Priority to CN2011800013237A (CN102317943B)
Publication of WO2012106941A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/338: Presentation of query results

Definitions

  • The present invention relates to the field of information search, and in particular to a method and apparatus for full-text search.
  • Search engines are an effective means of obtaining the information a user needs efficiently and accurately.
  • Most search engines use keyword matching: they match the keywords entered by the user against the data in the information repository to obtain the information the user needs.
  • Embodiments of the present invention provide a method and apparatus for full-text search to solve the problems of information search in the prior art.
  • A method for full-text search, comprising:
  • A device for full-text search, characterized in that the device comprises:
  • a search module, configured to receive a keyword input by the user and match the keyword in the index library to obtain search results;
  • a classification information obtaining module, configured to extract the classification information of the search results from the index library;
  • a classification display module, configured to obtain, according to the classification information, the broad category and the precise category to which the search results belong, and to display the search results by category according to that broad category and precise category;
  • The preset information model includes the classification information of all the searchable documents and the titles and body content of all the searchable documents.
  • The beneficial effects of the technical solution provided by the embodiments of the present invention are as follows:
  • The content of the searchable documents is saved in the index library using a preset information model, and classification information is added. When the search engine outputs the search results, it can therefore obtain, according to the classification information, the broad category and precise category to which the search results belong and display the search results by category. The user can then filter by broad category and precise category to reach the desired results quickly, which makes searching more convenient and reduces the user's search workload.
  • FIG. 1 is a schematic flowchart of a method for full-text search provided in Embodiment 1 of the present invention.
  • FIG. 2 is a schematic flowchart of a method for full-text search provided in Embodiment 2 of the present invention.
  • FIG. 3 is a schematic structural diagram of a classifier provided in Embodiment 2 of the present invention.
  • FIG. 4 is a schematic diagram of the display of search results after a full-text search, provided in Embodiment 2 of the present invention.
  • FIG. 5 is a schematic diagram of the display of sorted search results, provided in Embodiment 2 of the present invention.
  • FIG. 6 is a schematic structural diagram of a first full-text search apparatus provided in Embodiment 3 of the present invention.
  • FIG. 7 is a schematic structural diagram of a second full-text search apparatus provided in Embodiment 3 of the present invention.
  • FIG. 8 is a schematic structural diagram of a third full-text search apparatus provided in Embodiment 3 of the present invention.
  • FIG. 9 is a schematic structural diagram of a fourth full-text search apparatus provided in Embodiment 3 of the present invention.
  • An embodiment of the present invention provides a method for full-text search, the method comprising:
  • The index library stores all searchable documents after mapping by the preset information model, and the preset information model includes the classification information of all searchable documents and the title and body content of all searchable documents.
  • In the full-text search method provided by this embodiment, the content of the searchable documents is saved in the index library using a preset information model, and classification information is added. When the search engine outputs the search results, it can therefore obtain, according to the classification information, the broad category and precise category to which the search results belong and display the search results by category. The user can then filter by broad category and precise category to reach the desired results quickly, which makes searching more convenient and reduces the user's search workload.
  • Embodiment 2 is a refinement based on Embodiment 1, to explain the method provided by the present invention.
  • a method for full-text search specifically comprising:
  • Saving the content of all searchable documents in the index library according to a preset information model specifically means: mapping the content of a searchable document into information consisting of a set of metadata, the body content of the searchable document, and the document title.
  • The metadata includes at least the classification information of the searchable document.
  • The preset information model defines a custom information format, and the content of a searchable document can be mapped to the format defined by the preset information model and saved. Saving the content of one searchable document according to the preset information model is described below as an example:
  • <meta name="keywords" content="alarm box fault, alarm box reporting a critical alarm when no alarm occurs">
  • In this embodiment, saving the content of the searchable document mapped by the information model in HTML format is used as an example. In practice, the mapped content may also be saved in other formats.
  • The index library may specifically be a database, used to save the information obtained by mapping the content of the searchable documents through the preset information model.
  • The preset information model may further include an abstract field, a keywords field, and the like. In the example above, the keywords field is "alarm box fault, alarm box reporting a critical alarm when no alarm occurs".
  • Fields may also be extended in the preset information model, for example by adding custom fields that add classification information and rules indicating how categories are displayed.
  • The classifier stores the correspondence between broad categories, precise categories, and classification information.
  • The classifier consists of one or more text files, and its structure has three parts: broad categories, precise categories, and metadata. Taking FIG. 3 as an example, the broad categories include A, B, and N; the precise categories include A1, A2, AN, B1, B2, BN, and N1; and the metadata includes VA1, VA2, VAN, VB1, VB2, VBN, and VN1.
  • The broad categories are defined, precise categories are defined under each broad category, and the classification information in the metadata is mapped to the precise categories.
  • Each precise category corresponds to classification information in the metadata of the preset information model. The classification information is extracted from the metadata of the information model corresponding to each searchable document, and the precise category and broad category to which the searchable document belongs are determined according to the classifier.
  • A searchable document may be defined with several pieces of classification information, in which case the document is placed under several broad categories and precise categories. For example, if the classification information read from the information model of a searchable document is VA2 and VB2, the document is classified under precise category A2 of broad category A and under precise category B2 of broad category B.
  • The classifier can be extended to add new broad categories and/or precise categories.
  • the document index is established.
  • The search engine automatically extracts the classification information from the metadata of all searchable documents saved according to the preset information model, obtains through the classifier the broad categories and precise categories to which all the searchable documents belong, and saves them in the index library as the document index.
  • Obtaining through the classifier the broad categories and precise categories to which all searchable documents belong specifically includes: obtaining, according to the classification information in the metadata, the precise categories to which all searchable documents belong; determining, according to the correspondence between broad categories and precise categories saved in the classifier, the broad categories to which they belong; and saving these broad categories and precise categories in the index library as the document index.
  • Obtaining the broad category and precise category to which the search results belong specifically includes: reading, from the document index saved in the index library, the broad category and precise category to which the search results belong.
  • Displaying the search results by category according to the classification information specifically includes:
  • displaying the search results according to the broad category and precise category to which they belong.
  • The searchable documents in the search results fall under three broad categories: operation and maintenance process, document type, and product model. Each of these broad categories is divided into several precise categories.
  • the above method further includes:
  • A category selected by the user is received, a filtered search is performed by keyword within the searchable documents included in that category, and the filtered search results are displayed.
  • For example, a filtered search may be performed by keyword among the 388 search results included in the category "product A", and only the search results in that category are displayed. Narrowing the range of search results makes it easier for the user to find the closest match.
  • the method for the full-text search described above may further include:
  • The key fields include the content fields and classification fields defined in the preset information model: content fields such as the title and keywords fields, and classification fields such as the summary and various custom fields. The body content field of the searchable document may be
  • The key field weighter is specifically a text file, and key fields are weighted according to it. The greater the weighted result of a searchable document, the higher its relevance and the higher its priority in the output search results.
  • The title, keyword, and instruction command are all key fields defined in the preset information model.
  • The body can be used as a standard field.
  • Step 206 specifically includes: displaying all the search results by category according to the classification information, and sorting them by relevance. FIG. 5 shows the ranked search results after key field weighting; the top-ranked result is the most relevant. This improves the accuracy of the results the user sees, keeps the documents the user is looking for near the front, and avoids excessive page turning.
  • In the full-text search method provided by this embodiment, the content of the searchable documents is saved in the index library using a preset information model, and classification information is added. When the search engine outputs the search results, it can therefore obtain, according to the classification information, the broad category and precise category to which the search results belong and display the search results by category. The user can then filter by broad category and precise category to reach the desired results quickly, which makes searching more convenient and reduces the user's search workload.
  • Embodiment 3
  • An embodiment of the present invention provides a device for full-text search. As shown in FIG. 6, the device specifically includes:
  • the search module 301 is configured to receive a keyword input by the user, and perform matching according to the keyword in the index library to obtain a search result;
  • a classification information obtaining module 302 configured to extract classification information of the search result in the index library
  • the classification display module 303 is configured to obtain, according to the classification information, the broad category and precise category to which the search results belong, and to display the search results by category according to that broad category and precise category;
  • the foregoing apparatus further includes:
  • the document indexing module 304 is configured to: before the search module 301 receives the keyword input by the user, establish a classifier according to the preset information model, extract the classification information from all searchable documents mapped by the preset information model, obtain through the classifier and the classification information the broad categories and precise categories to which all the searchable documents belong, save them as a document index, and save the document index in the index library;
  • the classifier stores the correspondence between broad categories, precise categories, and classification information.
  • the classification display module 303 is specifically configured to: obtain from the document index, according to the classification information of the search results, the broad category and precise category to which the search results belong, and display the search results by category according to that broad category and precise category.
  • the foregoing apparatus further includes:
  • the filtering search module 305 is configured to: after the classification display module 303 displays the search results by category according to the classification information, receive a category selected by the user, perform a filtered search by keyword within the searchable documents included in that category, and display the filtered search results.
  • the foregoing apparatus further includes:
  • the weighted index establishing module 306 is configured to: before the search module 301 receives the keyword input by the user, establish a key field weighter according to the preset information model, define different weights for different key fields in the key field weighter, calculate the weighted results of all searchable documents mapped by the preset information model, and store the weighted results in the index library as a weighted index.
  • the classification display module 303 is specifically configured to display the search results by category according to the classification information, obtain the weighted results of the search results saved in the weighted index, and display the search results in descending order of weighted result.
  • In the full-text search apparatus provided by this embodiment, the content of the searchable documents is saved in the index library using a preset information model, and classification information is added. When the search engine outputs the search results, it can therefore obtain, according to the classification information, the broad category and precise category to which the search results belong and display the search results by category. The user can then filter by broad category and precise category to reach the desired results quickly, which makes searching more convenient and reduces the user's search workload.
  • The division into functional modules in the full-text search apparatus of the foregoing embodiment is only an example. In actual applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above.
  • The full-text search apparatus provided by the foregoing embodiment is based on the same concept as the full-text search method embodiments. Its specific implementation process is described in the method embodiments and is not repeated here.
  • All or part of the technical solutions provided by the above embodiments may be implemented by software programming, and the software program is stored in a readable storage medium such as a hard disk, an optical disk or a floppy disk in a computer.

Abstract

Disclosed are a method and device for full-text search, belonging to the field of information search. In the present invention, the contents of searchable documents are stored in an index library using a preset information model, and classification information is added. When the search engine outputs the search results, it can therefore acquire, according to the classification information, the broad category and the precise category to which the search results belong and display the search results by category. The user can quickly reach the desired results by filtering on the broad category and the precise category, which makes searching more convenient and reduces the user's search workload.

Description

Method and device for full-text search
Technical Field
The present invention relates to the field of information search, and in particular to a method and device for full-text search.
Background
With the rapid growth in the volume of information, obtaining the information a user needs efficiently and accurately has become an increasingly urgent problem. At present, search engines are an effective means of solving this problem. Most search engines use keyword matching: they match the keywords entered by the user against the data in the information repository to obtain the information the user needs.
However, the inventor has found that the prior-art method of finding the information a user needs through a search engine has the following drawback: the number of search results is huge and the results may span many fields, so it is often difficult for users to find the content they care about among them.
Summary of the Invention
Embodiments of the present invention provide a method and device for full-text search, to solve the problems of information search in the prior art.
A method for full-text search, the method comprising:
receiving a keyword input by a user, and matching the keyword in an index library to obtain search results;
extracting classification information of the search results from the index library;
obtaining, according to the classification information, the broad category and the precise category to which the search results belong, and displaying the search results by category according to that broad category and precise category;
wherein the index library stores all searchable documents after mapping by a preset information model, and the preset information model includes the classification information of all the searchable documents and the titles and body content of all the searchable documents.
A device for full-text search, characterized in that the device comprises:
a search module, configured to receive a keyword input by a user and match the keyword in an index library to obtain search results;
a classification information obtaining module, configured to extract classification information of the search results from the index library;
a classification display module, configured to obtain, according to the classification information, the broad category and the precise category to which the search results belong, and to display the search results by category according to that broad category and precise category;
wherein the index library stores all searchable documents after mapping by a preset information model, and the preset information model includes the classification information of all the searchable documents and the titles and body content of all the searchable documents.
The beneficial effects of the technical solution provided by the embodiments of the present invention are as follows: the content of the searchable documents is saved in the index library using a preset information model, and classification information is added. When the search engine outputs the search results, it can therefore obtain, according to the classification information, the broad category and precise category to which the search results belong and display the search results by category. The user can then filter by broad category and precise category to reach the desired results quickly, which makes searching more convenient and reduces the user's search workload.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a method for full-text search provided in Embodiment 1 of the present invention;
FIG. 2 is a schematic flowchart of a method for full-text search provided in Embodiment 2 of the present invention;
FIG. 3 is a schematic structural diagram of the classifier provided in Embodiment 2 of the present invention;
FIG. 4 is a schematic diagram of the display of search results after a full-text search, provided in Embodiment 2 of the present invention;
FIG. 5 is a schematic diagram of the display of sorted search results, provided in Embodiment 2 of the present invention;
FIG. 6 is a schematic structural diagram of a first full-text search device provided in Embodiment 3 of the present invention;
FIG. 7 is a schematic structural diagram of a second full-text search device provided in Embodiment 3 of the present invention;
FIG. 8 is a schematic structural diagram of a third full-text search device provided in Embodiment 3 of the present invention;
FIG. 9 is a schematic structural diagram of a fourth full-text search device provided in Embodiment 3 of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1
As shown in FIG. 1, an embodiment of the present invention provides a method for full-text search, the method comprising:
101. Receive a keyword input by a user, and match the keyword in the index library to obtain search results;
102. Extract the classification information of the search results from the index library;
103. Obtain, according to the classification information, the broad category and precise category to which the search results belong, and display the search results by category according to that broad category and precise category.
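Steps 101 to 103 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the in-memory `INDEX_LIBRARY`, the `CATEGORIES` table, the category names, and all function names are hypothetical, and keyword matching is reduced to a case-insensitive substring test.

```python
from collections import defaultdict

# Hypothetical index library: each entry is a searchable document that has
# already been mapped to a title, a body, and classification information.
INDEX_LIBRARY = [
    {"title": "Alarm box fault",
     "body": "The alarm box reports a critical alarm.",
     "classification": ["VA1"]},
    {"title": "Alarm box installation",
     "body": "How to install the alarm box.",
     "classification": ["VA2"]},
    {"title": "Power module manual",
     "body": "Power module reference.",
     "classification": ["VA1"]},
]

# Hypothetical mapping from classification information to
# (broad category, precise category).
CATEGORIES = {
    "VA1": ("Document type", "Reference"),
    "VA2": ("Document type", "Guide"),
}


def search(keyword, index_library):
    """Step 101: naive keyword match against title and body."""
    needle = keyword.lower()
    return [doc for doc in index_library
            if needle in doc["title"].lower() or needle in doc["body"].lower()]


def group_by_category(results, categories):
    """Steps 102 and 103: read each result's classification information and
    group the result titles by broad category, then by precise category."""
    grouped = defaultdict(lambda: defaultdict(list))
    for doc in results:
        for info in doc["classification"]:
            if info in categories:
                broad, precise = categories[info]
                grouped[broad][precise].append(doc["title"])
    # Convert to plain dicts so the result prints and compares cleanly.
    return {broad: dict(precise) for broad, precise in grouped.items()}


def full_text_search(keyword):
    results = search(keyword, INDEX_LIBRARY)
    return group_by_category(results, CATEGORIES)
```

Calling `full_text_search("alarm box")` returns a nested dictionary keyed by broad category and then by precise category, which a user interface could render as a two-level filter.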
The index library stores all searchable documents after mapping by a preset information model, and the preset information model includes the classification information of all searchable documents and the titles and body content of all searchable documents.
In the full-text search method provided by this embodiment, the content of the searchable documents is saved in the index library using a preset information model, and classification information is added. When the search engine outputs the search results, it can therefore obtain, according to the classification information, the broad category and precise category to which the search results belong and display the search results by category. The user can then filter by broad category and precise category to reach the desired results quickly, which makes searching more convenient and reduces the user's search workload.
Embodiment 2
As shown in FIG. 2, an embodiment of the present invention provides a method for full-text search. Embodiment 2 refines Embodiment 1 in order to explain the method provided by the present invention.
A method for full-text search specifically includes:
201. Save the content of all searchable documents in the index library according to a preset information model;
Saving the content of all searchable documents in the index library according to a preset information model specifically means: mapping the content of a searchable document into information consisting of a set of metadata, the body content of the searchable document, and the document title. The metadata includes at least the classification information of the searchable document.
Note that the preset information model defines a custom information format, and the content of a searchable document can be mapped to the format defined by the preset information model and saved. Saving the content of one searchable document according to the preset information model is described below as an example:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="DC.Type" content="reference">
<meta name="abstract" content="This describes the symptoms.">
<meta name="keywords" content="alarm box fault, alarm box reporting a critical alarm when no alarm occurs">
<meta name="DC.Audience.Job" content="trouble management">
</head>
<title>Alarm Box Reporting a Critical Alarm When No Alarm Occurs</title>
<body>
Hello, World
</body>
</html>
In this embodiment, saving the content of the searchable document mapped by the information model in HTML format is used as an example. In practice, the mapped content may also be saved in other formats.
In the information obtained above by mapping the content of the searchable document through the preset information model, the content between "<head>" and "</head>" is metadata. The elements <meta name="DC.Type" content="reference"> and <meta name="DC.Audience.Job" content="trouble management"> are classification information: "DC.Type" is a metadata name indicating that the field is classification information, and "reference" is the value of the metadata "DC.Type"; "DC.Audience.Job" is likewise a metadata name indicating that the field is classification information, and "trouble management" is the value of the metadata "DC.Audience.Job". "Alarm Box Reporting a Critical Alarm When No Alarm Occurs" and "Hello, World" are the title and body content of the searchable document, respectively.
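The mapping described above can be illustrated by reading such an HTML document back into its parts. The following is a minimal sketch using Python's standard `html.parser` module; the class name `InfoModelParser`, the function `map_document`, and the assumption that exactly the names `DC.Type` and `DC.Audience.Job` mark classification information are hypothetical choices made for this example, not part of the patent.

```python
from html.parser import HTMLParser

# Assumption: these two metadata names carry classification information,
# as in the example document.
CLASSIFICATION_FIELDS = {"DC.Type", "DC.Audience.Job"}

SAMPLE_HTML = """<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="DC.Type" content="reference">
<meta name="abstract" content="This describes the symptoms.">
<meta name="DC.Audience.Job" content="trouble management">
</head>
<title>Alarm Box Reporting a Critical Alarm When No Alarm Occurs</title>
<body>
Hello, World
</body>
</html>"""


class InfoModelParser(HTMLParser):
    """Collects <meta> name/content pairs, the title, and the body text."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self.title = ""
        self.body = ""
        self._current = None  # "title" or "body" while inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)
            # Skip <meta> elements without a name, such as http-equiv.
            if attr.get("name") and attr.get("content") is not None:
                self.metadata[attr["name"]] = attr["content"].strip()
        elif tag in ("title", "body"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "body":
            self.body += data


def map_document(html_text):
    """Return the title, body, metadata, and classification information."""
    parser = InfoModelParser()
    parser.feed(html_text)
    parser.close()
    classification = {name: value for name, value in parser.metadata.items()
                      if name in CLASSIFICATION_FIELDS}
    return {
        "title": parser.title.strip(),
        "body": parser.body.strip(),
        "metadata": parser.metadata,
        "classification": classification,
    }


record = map_document(SAMPLE_HTML)
```

Here `record["classification"]` holds only the two classification fields, while `record["metadata"]` keeps every named `<meta>` entry, so further fields can be added without changing the parser.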
The index library may specifically be a database, used to store the information obtained by mapping the content of the searchable documents through the preset information model.
Further, the preset information model may also include an abstract field, a keyword field, and so on. For example, in the information obtained after mapping through the preset information model:
the abstract field "abstract" is "This describes the symptoms.", and the keyword field "keywords" is "alarm box fault, alarm box reporting a critical alarm when no alarm occurs".
Further, the preset information model may also be extended with additional fields, for example custom fields that add classification information and specify rules for displaying classification items.
202. Establish a classifier according to the preset information model;
The classifier stores the correspondence among major classification items, minor classification items, and classification information.
Specifically, the classifier consists of one or more text files, and its structure has three parts: major classification items, minor classification items, and metadata. Taking FIG. 3 as an example, the major classification items include items A, B, and N; the minor classification items include items A1, A2, AN, B1, B2, BN, and N1; and the metadata includes VA1, VA2, VAN, VB1, VB2, VBN, and VN1.
In the classifier structure shown in FIG. 3, major classification items are defined in correspondence with the preset information model, minor classification items are defined under each major classification item, and the classification information in the metadata is mapped to the minor classification items.
It should be noted that, in the classifier structure shown in FIG. 3, each minor classification item corresponds to classification information in the metadata of the preset information model. By extracting the classification information from the metadata of the information model corresponding to each searchable document, the classifier can determine the minor and major classification items to which that document belongs.
Further, a searchable document may define multiple pieces of classification information, which places the document under multiple major and minor classification items. For example, if the classification information read from the information model of a searchable document is VA2 and VB2, the document is classified both under minor item A2 of major item A and under minor item B2 of major item B.
Further, the classifier may be extended with custom definitions to add new major and/or minor classification items.
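The three-part classifier structure can be sketched as a simple lookup table in Python. This is a minimal illustration, not the patent's file format: the in-memory dictionary and the functions classify and extend_classifier are assumptions, and the labels A, A2, VA2, and so on follow the naming used for FIG. 3.

```python
# Classification value (metadata) -> (major item, minor item).
# The dictionary layout is an assumption; the patent only states that the
# classifier is stored in one or more text files.
CLASSIFIER = {
    "VA1": ("A", "A1"),
    "VA2": ("A", "A2"),
    "VB1": ("B", "B1"),
    "VB2": ("B", "B2"),
}


def classify(classification_values, classifier=CLASSIFIER):
    """Return the sorted (major, minor) pairs for a document's classification values.

    Values that the classifier does not know are ignored.
    """
    pairs = {classifier[v] for v in classification_values if v in classifier}
    return sorted(pairs)


def extend_classifier(classifier, value, major, minor):
    """Return a copy of the classifier with one extra mapping (custom extension)."""
    extended = dict(classifier)
    extended[value] = (major, minor)
    return extended


# A document carrying VA2 and VB2 belongs to A/A2 and to B/B2.
memberships = classify(["VA2", "VB2"])

# Adding a new major item N with minor item N1, keyed by the value VN1.
extended = extend_classifier(CLASSIFIER, "VN1", "N", "N1")
```

Returning a copy from extend_classifier keeps the original table unchanged, so an extension can be tried without affecting documents that are already indexed.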
203. Build a document index and save it to the index library;
Building the document index specifically means that the search engine automatically extracts the classification information from the metadata of all searchable documents saved according to the preset information model, obtains through the classifier the major and minor classification items to which each searchable document belongs, and saves them in the index library as the document index.
Further, obtaining through the classifier the major and minor classification items to which all searchable documents belong specifically includes: obtaining, according to the classification information in the metadata, the minor classification items to which each searchable document belongs through the classifier; determining the major classification items to which each searchable document belongs according to the correspondence between major and minor classification items stored in the classifier; and saving the major and minor classification items of all searchable documents in the index library as the document index.
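Step 203 can be sketched in Python as a single pass over the documents' metadata. The sketch makes several assumptions that are not in the patent: the index is an in-memory dictionary rather than a database, the classifier is keyed by (metadata name, value) pairs, and the category labels and document identifiers are hypothetical.

```python
# Hypothetical classifier: (metadata name, value) -> (major item, minor item).
CLASSIFIER = {
    ("DC.Type", "reference"): ("Document type", "Reference"),
    ("DC.Audience.Job", "trouble management"): ("O&M process", "Trouble management"),
}

# Metadata already extracted from each mapped document (hypothetical data).
DOCUMENTS = {
    "doc1": {"DC.Type": "reference", "DC.Audience.Job": "trouble management"},
    "doc2": {"DC.Type": "reference"},
    "doc3": {"DC.Type": "unclassified value"},
}


def build_document_index(documents, classifier):
    """Map each document id to the (major, minor) items it belongs to."""
    index = {}
    for doc_id, metadata in documents.items():
        entries = []
        for name, value in metadata.items():
            category = classifier.get((name, value))
            # Skip values the classifier does not know and avoid duplicates.
            if category is not None and category not in entries:
                entries.append(category)
        index[doc_id] = entries
    return index


document_index = build_document_index(DOCUMENTS, CLASSIFIER)
```

Because the categories are resolved once at indexing time, a later search only has to read them from the index instead of consulting the classifier again.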
204. Receive a keyword entered by the user, and match the keyword against the index library to obtain search results;
205. Obtain, through the index library, the major and minor classification items to which the search results belong;
Obtaining these items through the index library specifically means reading, in the index library, the major and minor classification items of the search results as stored in the document index.
206. Display the search results grouped by the major and minor classification items to which they belong.
In this embodiment, displaying the search results by category according to the classification information specifically includes:
displaying the search results grouped according to the major and minor classification items to which they belong.
In the search result display shown in FIG. 4, the searchable documents in the search results fall under three major classification items: operation and maintenance process, document type, and product model. Each of these is further divided into several minor classification items.
Further, the above method also includes:
receiving a minor classification item selected by the user, performing a filtered search by the keyword within the searchable documents included in that minor classification item, and displaying the search results of the filtered search.
For example, with the search results shown in FIG. 4, a filtered search by the keyword can be performed within the 388 search results included in the minor classification item Product A, and only the search results included in Product A are displayed. This narrows the range of the search results and makes it easier for the user to reach the closest results.
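Steps 204 to 206 and the filtered search can be sketched together in Python. This is an illustration under assumptions: keyword matching is reduced to a case-insensitive substring test, the index is a small in-memory dictionary, and the function names and sample documents are hypothetical.

```python
from collections import defaultdict

# Toy index: document id -> searchable text plus the categories stored for it.
INDEX = {
    "doc1": {
        "text": "Alarm box reporting a critical alarm",
        "categories": [("Product model", "Product A"), ("Document type", "Reference")],
    },
    "doc2": {
        "text": "Alarm threshold configuration",
        "categories": [("Product model", "Product B")],
    },
    "doc3": {
        "text": "Installing the cabinet",
        "categories": [("Product model", "Product A")],
    },
}


def search(index, keyword):
    """Step 204: return ids of documents whose text contains the keyword."""
    needle = keyword.lower()
    return [doc_id for doc_id, entry in index.items() if needle in entry["text"].lower()]


def group_by_category(index, doc_ids):
    """Steps 205 and 206: group the hits as major item -> minor item -> document ids."""
    groups = defaultdict(lambda: defaultdict(list))
    for doc_id in doc_ids:
        for major, minor in index[doc_id]["categories"]:
            groups[major][minor].append(doc_id)
    return {major: dict(minors) for major, minors in groups.items()}


def filter_search(index, keyword, major, minor):
    """Filtered search: keep only hits that belong to the selected minor item."""
    return [d for d in search(index, keyword) if (major, minor) in index[d]["categories"]]


hits = search(INDEX, "alarm")
grouped = group_by_category(INDEX, hits)
only_product_a = filter_search(INDEX, "alarm", "Product model", "Product A")
```

The per-item lists in grouped also give the counts that a display like FIG. 4 would show next to each minor classification item.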
Further, the above full-text search method may also include:
establishing a key field weighter, defining different weights for different key fields in the key field weighter, computing weighted results for all searchable documents stored in the index library after mapping through the preset information model, and saving the weighted results to the index library as a weighted index.
The key fields include the content fields and classification fields defined in the preset information model. Content fields are, for example, the title and keyword fields; classification fields are, for example, the abstract and various custom fields. The body content field of a searchable document may serve as the standard field. The key field weighter is specifically a text file, according to which the weights of the key fields are applied. The larger the weighted result of a searchable document, the higher its relevance and the higher its priority when search results are output.
For example, Table 1 illustrates the key field weighter:
Table 1
[Table 1 is an image in the original publication (Figure imgf000008_0001); its contents are not reproduced here.]
In Table 1, title, keyword, and instruction command are all key fields defined in the preset information model, such as the information model in step 201; body may serve as the standard field.
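The idea of field weighting can be sketched in Python as follows. The sketch rests on assumptions throughout: the weight values are hypothetical and not the ones in Table 1, and the scoring rule (field weight multiplied by the number of keyword occurrences, summed over fields) is a simple stand-in, since the description does not give a formula.

```python
# Hypothetical weights; the actual values in Table 1 are not available.
FIELD_WEIGHTS = {"title": 5.0, "keywords": 3.0, "body": 1.0}

# Hypothetical documents with their key fields.
DOCUMENTS = {
    "a": {"title": "Alarm box fault", "body": "The alarm box reports an alarm."},
    "b": {"title": "Cabinet installation", "keywords": "alarm box", "body": "No alarm here."},
    "c": {"title": "Power supply", "body": "Nothing relevant."},
}


def weighted_score(document, keyword, weights=FIELD_WEIGHTS):
    """Sum, over the weighted fields, weight * number of keyword occurrences."""
    needle = keyword.lower()
    return sum(
        weight * document.get(field, "").lower().count(needle)
        for field, weight in weights.items()
    )


def rank(documents, keyword, weights=FIELD_WEIGHTS):
    """Return ids of matching documents, highest weighted score first."""
    scored = [(weighted_score(doc, keyword, weights), doc_id) for doc_id, doc in documents.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ordered if score > 0]


ranking = rank(DOCUMENTS, "alarm")
```

With these assumed weights, a match in the title counts for more than several matches in the body, which is the effect the key field weighter is meant to achieve.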
Accordingly, after the weighted calculation has been performed on all searchable documents mapped according to the preset information model, step 206 specifically includes:
displaying all search results by category according to the classification information, and sorting all search results for display according to their relevance.
FIG. 5 shows search results sorted after key field weighting; the top-ranked result is the one with the highest relevance. This improves the accuracy of the results the user obtains: the documents the user needs appear near the top, sparing the user excessive paging.
In the full-text search method provided by this embodiment of the present invention, the content of searchable documents is saved in the index library using a preset information model with classification information added. When outputting search results, the search engine can therefore obtain, according to the classification information, the major and minor classification items to which the results belong and display the results by category. Users can then filter by major and minor classification items to reach the desired results quickly, which makes searching more convenient and faster and reduces the user's search workload.
Embodiment 3
An embodiment of the present invention provides an apparatus for full-text search. As shown in FIG. 6, the apparatus specifically includes:
a search module 301, configured to receive a keyword entered by the user and match the keyword against the index library to obtain search results;
a classification information obtaining module 302, configured to extract the classification information of the search results from the index library;
a classification display module 303, configured to obtain, according to the classification information, the major and minor classification items to which the search results belong, and to display the search results by category according to those items;
The index library stores all searchable documents mapped through the preset information model, and the preset information model includes the classification information of all searchable documents and the body content of all searchable documents.
Further, as shown in FIG. 7, the apparatus also includes:
a document index building module 304, configured to, before the search module 301 receives the keyword entered by the user, establish a classifier according to the preset information model, extract the classification information from all searchable documents mapped through the preset information model, obtain according to the classifier and the classification information the major and minor classification items to which all searchable documents belong, save them as a document index, and save the document index in the index library;
The classifier stores the correspondence among major classification items, minor classification items, and classification information.
Accordingly, the classification display module 303 is specifically configured to obtain, from the document index according to the classification information of the search results, the major and minor classification items to which the search results belong, and to display the search results grouped by those items.
Further, as shown in FIG. 8, the apparatus also includes:
a filtered search module 305, configured to, after the classification display module 303 displays the search results by category according to the classification information, receive a minor classification item selected by the user, perform a filtered search by the keyword within the searchable documents included in that minor classification item, and display the search results of the filtered search.
Further, as shown in FIG. 9, the apparatus also includes:
a weighted index building module 306, configured to, before the search module 301 receives the keyword entered by the user, establish a key field weighter according to the preset information model, define different weights for different key fields in the key field weighter, compute weighted results for all searchable documents mapped through the preset information model, and save the weighted results in the index library as a weighted index.
Accordingly, the classification display module 303 is specifically configured to display the search results by category according to the classification information, obtain the weighted results of the search results stored in the weighted index, and display the search results sorted from the highest weighted result to the lowest.
In the full-text search apparatus provided by this embodiment of the present invention, the content of searchable documents is saved in the index library using a preset information model with classification information added. When outputting search results, the search engine can therefore obtain, according to the classification information, the major and minor classification items to which the results belong and display the results by category. Users can then filter by major and minor classification items to reach the desired results quickly, which makes searching more convenient and faster and reduces the user's search workload.
It should be noted that the full-text search apparatus provided by the above embodiment is described using the above division into functional modules only as an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above. In addition, the full-text search apparatus provided by the above embodiment and the full-text search method embodiment are based on the same concept; for the specific implementation process, see the method embodiment, which is not repeated here.
All or part of the technical solutions provided by the above embodiments may be implemented through software programming, with the software program stored in a readable storage medium such as a hard disk, an optical disc, or a floppy disk in a computer.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A method for full-text search, characterized in that the method comprises:
receiving a keyword entered by a user, and matching the keyword against an index library to obtain search results;
extracting classification information of the search results from the index library;
obtaining, according to the classification information, the major and minor classification items to which the search results belong, and displaying the search results by category according to the major and minor classification items to which the search results belong;
wherein the index library stores all searchable documents mapped through a preset information model, and the preset information model includes the classification information of all the searchable documents and the titles and body content of all the searchable documents.
2. The method according to claim 1, characterized in that, before the receiving of the keyword entered by the user, the method further comprises:
establishing a classifier according to the preset information model;
extracting the classification information from all the searchable documents mapped through the preset information model;
obtaining, according to the classifier and the classification information, the major and minor classification items to which all the searchable documents belong, saving them as a document index, and saving the document index in the index library;
wherein the classifier stores the correspondence among the major classification items, the minor classification items, and the classification information.
3. The method according to claim 2, characterized in that the displaying of the search results by category according to the classification information specifically comprises:
obtaining, from the document index according to the classification information of the search results, the major and minor classification items to which the search results belong, and displaying the search results grouped by the major and minor classification items to which the search results belong.
4. The method according to claim 3, characterized in that, after the displaying of the search results by category according to the classification information, the method further comprises:
receiving a minor classification item selected by the user, performing a filtered search by the keyword within the searchable documents included in the minor classification item, and displaying the search results of the filtered search.
5. The method according to any one of claims 1 to 4, characterized in that, before the receiving of the keyword entered by the user, the method further comprises:
establishing a key field weighter according to the preset information model, and defining different weights for different key fields in the key field weighter;
computing weighted results for all the searchable documents mapped through the preset information model, and saving the weighted results in the index library as a weighted index.
6. The method according to claim 5, characterized in that the displaying of the search results by category according to the classification information specifically comprises:
displaying the search results by category according to the classification information, obtaining the weighted results of the search results stored in the weighted index, and displaying the search results sorted from the highest weighted result to the lowest.
7. An apparatus for full-text search, characterized in that the apparatus comprises:
a search module, configured to receive a keyword entered by a user and match the keyword against an index library to obtain search results;
a classification information obtaining module, configured to extract classification information of the search results from the index library;
a classification display module, configured to obtain, according to the classification information, the major and minor classification items to which the search results belong, and to display the search results by category according to the major and minor classification items to which the search results belong;
wherein the index library stores all searchable documents mapped through a preset information model, and the preset information model includes the classification information of all the searchable documents and the titles and body content of all the searchable documents.
8. The apparatus according to claim 7, characterized in that the apparatus further comprises:
a document index building module, configured to, before the search module receives the keyword entered by the user, establish a classifier according to the preset information model, extract the classification information from all the searchable documents mapped through the preset information model, obtain according to the classifier and the classification information the major and minor classification items to which all the searchable documents belong, save them as a document index, and save the document index in the index library;
wherein the classifier stores the correspondence among the major classification items, the minor classification items, and the classification information.
9. The apparatus according to claim 8, characterized in that the classification display module is specifically configured to obtain, from the document index according to the classification information of the search results, the major and minor classification items to which the search results belong, and to display the search results grouped by the major and minor classification items to which the search results belong.
10. The apparatus according to claim 9, characterized in that the apparatus further comprises:
a filtered search module, configured to, after the classification display module displays the search results by category according to the classification information, receive a minor classification item selected by the user, perform a filtered search by the keyword within the searchable documents included in the minor classification item, and display the search results of the filtered search.
11. The apparatus according to any one of claims 7 to 10, characterized in that the apparatus further comprises:
a weighted index building module, configured to, before the search module receives the keyword entered by the user, establish a key field weighter according to the preset information model, define different weights for different key fields in the key field weighter, compute weighted results for all the searchable documents mapped through the preset information model, and save the weighted results in the index library as a weighted index.
12. The apparatus according to claim 11, characterized in that the classification display module is specifically configured to display the search results by category according to the classification information, obtain the weighted results of the search results stored in the weighted index, and display the search results sorted from the highest weighted result to the lowest.
PCT/CN2011/077788 2011-07-29 2011-07-29 Method and device for full-text search WO2012106941A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2011/077788 WO2012106941A1 (en) 2011-07-29 2011-07-29 Method and device for full-text search
CN2011800013237A CN102317943B (en) 2011-07-29 2011-07-29 Method and device for full-text search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/077788 WO2012106941A1 (en) 2011-07-29 2011-07-29 Method and device for full-text search

Publications (1)

Publication Number Publication Date
WO2012106941A1 true WO2012106941A1 (en) 2012-08-16

Family

ID=45429420

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/077788 WO2012106941A1 (en) 2011-07-29 2011-07-29 Method and device for full-text search

Country Status (2)

Country Link
CN (1) CN102317943B (en)
WO (1) WO2012106941A1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102651031A (en) * 2012-03-31 2012-08-29 百度在线网络技术(北京)有限公司 Method and equipment for providing searching result
CN103678350B (en) * 2012-09-10 2018-01-05 腾讯科技(深圳)有限公司 Social network search result methods of exhibiting and device
CN102968454B (en) * 2012-10-26 2016-08-03 北京百度网讯科技有限公司 A kind of for obtaining the method and apparatus promoting object search results
CN104123366A (en) * 2014-07-23 2014-10-29 谢建平 Search method and server
CN106815220A (en) * 2015-11-27 2017-06-09 英业达科技有限公司 Data are classified and method for searching
CN105843867B (en) * 2016-03-17 2019-09-03 畅捷通信息技术股份有限公司 Search method based on metadata schema and the retrieval device based on metadata schema
WO2018023428A1 (en) * 2016-08-02 2018-02-08 步晓芳 Search result display method and search engine
CN107391535B (en) * 2017-04-20 2021-01-12 创新先进技术有限公司 Method and device for searching document in document application
CN107423349A (en) * 2017-05-18 2017-12-01 福建中金在线信息科技有限公司 A kind of method and system of full-text search
CN109657151A (en) * 2018-12-25 2019-04-19 华联世纪工程咨询股份有限公司 A kind of engineering material searching method and device using scene based on user
CN112004126A (en) * 2020-08-24 2020-11-27 海信视像科技股份有限公司 Search result display method and display device
CN112445830A (en) * 2020-11-26 2021-03-05 湖南智慧政务区块链科技有限公司 Data analysis system based on block chain technology
CN113127629A (en) * 2021-03-11 2021-07-16 维沃移动通信有限公司 Keyword searching method and device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101082914A (en) * 2005-12-30 2007-12-05 香港应用科技研究院有限公司 Category search for structured documents
WO2009136426A1 (en) * 2008-05-08 2009-11-12 三菱電機株式会社 Search query providing equipment
CN101714172A (en) * 2009-11-13 2010-05-26 华中科技大学 Index structure supporting access control and search method thereof
CN101840400A (en) * 2009-03-19 2010-09-22 北大方正集团有限公司 Multilevel classification retrieval method and system
US20100293174A1 (en) * 2009-05-12 2010-11-18 Microsoft Corporation Query classification

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100478962C (en) * 2007-07-24 2009-04-15 华为技术有限公司 Method, device and system for searching web page and device for establishing index database

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101082914A (en) * 2005-12-30 2007-12-05 香港应用科技研究院有限公司 Category search for structured documents
WO2009136426A1 (en) * 2008-05-08 2009-11-12 三菱電機株式会社 Search query providing equipment
CN101840400A (en) * 2009-03-19 2010-09-22 北大方正集团有限公司 Multilevel classification retrieval method and system
US20100293174A1 (en) * 2009-05-12 2010-11-18 Microsoft Corporation Query classification
CN101714172A (en) * 2009-11-13 2010-05-26 华中科技大学 Index structure supporting access control and search method thereof

Also Published As

Publication number Publication date
CN102317943A (en) 2012-01-11
CN102317943B (en) 2013-10-02

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase. Ref document number: 201180001323.7; Country of ref document: CN
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 11858381; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 11858381; Country of ref document: EP; Kind code of ref document: A1