US20100241630A1 - Methods for indexing and retrieving information - Google Patents


Publication number
US20100241630A1
Authority
US
United States
Prior art keywords
information
word
association
identifying
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/661,607
Inventor
Frank John Williams
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An indexing methodology, an index table and a method for retrieving information are disclosed. In a preferred method, an association between a plurality of word elements from a first data corpus, such as a query, is identified through identification information such as a number. A preferred index table is also disclosed; it comprises additional information, such as numbers, identifying the associations between the word elements and/or multiple word elements of other or target data corpuses, such as a data source. In conjunction, these lead to the retrieval of information free of irrelevant results.

Description

    RELATED APPLICATIONS
  • This application claims the benefit of U.S. provisional patent application Ser. No. 61/210,396 filed 2009 Mar. 18 by the present inventor.
  • BACKGROUND
  • 1. Field of Invention
  • The present invention relates generally to a method for retrieving information. More particularly, it relates to a novel method(s) for retrieving information that implements indexing information identifying an association between word elements.
  • 2. Description of Related Art
  • The revolution of the computer and the digital age is responsible for a series of inventions, new forms of communication and transfer of knowledge, and the storage of large amounts of valuable data upon which humanity sustains its progress. New scientific disciplines such as Computational Linguistics and Natural Language Processing were born to study and understand communication mediums such as natural languages. Intranets and the Internet were built to distribute this communication and knowledge and to serve the specific information needs of millions of people every day. In particular, search engines are in charge of retrieving and delivering millions of documents to fulfill those needs. However, current search technologies fail to retrieve information in a specific manner, requiring their users to spend time and effort reading through large collections of text to find what they specifically need. For example, a user looking to buy “red boots” may simply enter the words “red boots” in a search engine. The search engine then retrieves every document comprising the words “red” and “boots,” producing data such as “red hat and yellow boots,” which has nothing to do with “red boots” and therefore fails to serve the specific wants of its user. As a result, users are forced to spend valuable time and concentration sorting through large quantities of relevant and irrelevant data, which ultimately contributes to user confusion, frustration, discouragement and loss of concentration.
  • In view of these shortcomings, the present invention distinguishes over the prior art by providing a more compelling and effective method for retrieving specific information, giving search engines and other applications the ability to remove irrelevant data from their results and so better serve the needs of their users, while providing additional unknown, unsolved and unrecognized advantages as described in the following summary.
  • SUMMARY OF THE INVENTION
  • The present invention teaches certain benefits in use and construction which give rise to the objectives and advantages described below. The methods and systems embodied by the present invention overcome the limitations and shortcomings encountered when retrieving information. The method(s) permits, through the use of a more compelling form of indexing technology, a more accurate and precise form of massive information retrieval, which by the implementation of associations between word elements, is capable of eliminating all the irrational and nonsensical data from user results.
  • OBJECTS AND ADVANTAGES
  • A primary objective inherent in the above described methods of use is to provide several methods and systems to index and identify the desired associations between words, thus allowing the methods and systems to effectively reduce or remove the retrieval of irrelevant data, together with further advantages and objectives not taught by the prior art. Accordingly, several objects and advantages of the invention are:
  • Another objective is to save user time by providing only conceptually matching data.
  • A further objective is to decrease the amount of effort users spend discriminating or sorting between relevant and irrelevant data.
  • A further objective is to improve the quality and quantity of results.
  • A further objective is to give machines and applications the ability to handle natural language more efficiently.
  • A further objective is to improve the ability of portable devices to manipulate natural language.
  • Another further objective is to permit the unification of the world's knowledge regardless of language and/or grammar.
  • Another further objective is to permit the retrieval of only relevant data from large collections of stored information.
  • Other features and advantages of the described methods of use will become apparent from the following more detailed description, taken in conjunction with the accompanying drawings which illustrate, by way of example, the principles of the presently described apparatus and method of its use.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The accompanying drawings illustrate examples of at least one of the best mode embodiments of the present method and methods of use. In such drawings:
  • FIG. 1 illustrates several exemplary non-limiting diagrams of some steps of the inventive method identifying and/or numbering the relationships between the elements of several exemplary data corpuses;
  • FIG. 2 is a non-limiting exemplary block diagram of some steps of the inventive method displaying an index table exploiting the additional concept of using information for identifying the associations between the different word elements;
  • FIG. 3A is a non-limiting exemplary block diagram of some significant steps of the inventive method handling a query and an index table for identifying information that matches the query and therefore needs to be retrieved;
  • FIG. 3B is another non-limiting exemplary block diagram of some significant steps of the inventive method handling a query with several associations and an index table for identifying information that matches the associations in the query for retrieving matching information;
  • FIG. 3C is another non-limiting exemplary block diagram of a variation of some of the most significant steps of the inventive method discussed in FIG. 3B, manipulating group identifiers;
  • FIG. 3D is another non-limiting exemplary block diagram of a variation of some of the most significant steps of the inventive method discussed in FIG. 3B, manipulating eeggis instead of words;
  • FIG. 3E is another non-limiting exemplary diagram of a variation of some important steps of the inventive method this time dealing with several index tables, such as a Double Index Table and an Association Number Index Table;
  • FIG. 4 is a non-limiting block diagram of some of the steps of the inventive method displayed in FIG. 1 and FIG. 2 for producing or providing an indexing table for finding and/or comparing and/or retrieving matching information;
  • FIG. 5 is a non-limiting block diagram of some of the main steps of the inventive method displayed in FIGS. 3A, 3B, 3C and 3D for retrieving information.
  • DETAILED DESCRIPTION
  • The above described drawing figures illustrate the described methods and use in at least one of its preferred, best mode embodiment, which are further defined in detail in the following description. Those having ordinary skill in the art may be able to make alterations and modifications from what is described herein without departing from its spirit and scope. Therefore, it must be understood that what is illustrated is set forth only for the purposes of example and that it should not be taken as a limitation in the scope of the present system and method of use.
  • FIG. 1 illustrates several exemplary non-limiting diagrams of some steps of the inventive method, identifying and/or numbering the relationships between the word elements of several exemplary data corpuses that were found or attained by an Associative Analysis Protocol such as CIRN. Notably, CIRN discovers and/or forms associations between the different word elements of a given and/or analyzed data corpus, such as associating nouns with their verbs. The First Data Corpus 1010 (FIG. 1), the sentence “red boots and black hats,” is displayed with its corresponding First Table of Relations 1015 (FIG. 1), which contains the relationships found, formed or desired between the words or elements of that first sentence. For example, in the First Table of Relations, the top row displays the word “red” (under column Word1) next to the word “boots” (under column Word2) along with their association number “1” (under column Association Number, or “Assoc. No.” for short). In the bottom row, the word “black” (under column Word1), the word “hats” (under column Word2) and their association number “2” (under column Assoc. No.) are displayed together. In such fashion, each of the two associations (“red-boots” and “black-hats”) formed or found in the first sentence is uniquely identified, differentiated and/or numbered. The Second Data Corpus 1020 (FIG. 1), “Mary ran quickly,” is shown with its corresponding table of relationships, the Second Table of Relations 1025 (FIG. 1). In this table, the top row associates the word “Mary,” the word “ran” and their association number “15.” In similar fashion, the bottom row associates the word “ran,” the word “quickly” and their association number “16.” As a result, each of the relations formed or found in the second sentence is uniquely numbered or identified. The Third Data Corpus 1030 (FIG. 1), the sentence “silly kitty jumps,” is shown with its corresponding table of relationships, the Third Table of Relations 1035 (FIG. 1). In this third table, the top row associates the word “silly” and the word “kitty” with their association number “R17,” wherein R17 is the information responsible for identifying the association or relationship between “silly” and “kitty.” In similar fashion, the bottom row displays the word “kitty,” the word “jumps” and their association number “M81.” As a result, each of the relationships formed or found in the Third Data Corpus is uniquely numbered or identified. Please note that in this particular example the information identifying each of the associations is not in series but rather in random order. The Fourth Data Corpus 1040 (FIG. 1) is a sentence made of group identifiers, “Adj333 Nou112 Ver777,” which in English spells the sentence “silly kitty jumps,” along with its corresponding table of relationships, the Fourth Table of Relations 1045 (FIG. 1). In this table, the top row associates the group identifiers Adj333 (silly) and Nou112 (kitty) with “6,” wherein “6” is the information identifying their unique association. The bottom row associates another set of group identifiers, Nou112 (kitty) and Ver777 (jumps), with number “12,” which is the information identifying their unique association. As a result, each of the relationships found, formed or desired from the Fourth Data Corpus is uniquely numbered, identified and/or differentiated. The Fifth Data Corpus 1050 (FIG. 1) is another sentence, this time made of eeggis, “Adj33.1 Nou11.4 Ver77.1,” which in English spells the sentence “silly kitty jumps.” The Fifth Table of Relations 1055 (FIG. 1) illustrates the associations that were found, formed or desired between the eeggis of the fifth sentence. In this table, the top row associates the eeggis Adj33.1 (silly) and Nou11.4 (kitty) with their association number “50.” The bottom row displays another association between another group of eeggis, Nou11.4 (kitty) and Ver77.1 (jumps), with number “18,” which is the information identifying their unique association. As a result, each of the eeggi associations within the Fifth Data Corpus has its own association number (Assoc. No.). Please note that in this table of relations, although the associations occur next to each other, the numbers identifying them are not continuous or in series but rather in random order.
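  • The tables of relations described for FIG. 1 can be sketched as a small data structure. The following is a minimal Python illustration, not the inventor's implementation: the names `Relation` and `table_of_relations` are hypothetical, and the associations are typed in by hand because the associative analysis (CIRN) is not specified in enough detail to model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One row of a table of relations: two word elements and their identifier."""
    word1: str
    word2: str
    assoc_id: str  # any unique token, e.g. "1", "R17" or "M81"


def table_of_relations(pairs, ids):
    """Pair each association with a caller-supplied identifier.

    Identifiers only need to be unique within one table; they do not have
    to be consecutive or even numeric.
    """
    labels = [str(i) for i in ids]
    if len(pairs) != len(labels):
        raise ValueError("need exactly one identifier per association")
    if len(set(labels)) != len(labels):
        raise ValueError("identifiers must be unique within a table")
    return [Relation(w1, w2, label) for (w1, w2), label in zip(pairs, labels)]


# Hand-entered associations (assumption: these stand in for CIRN output).
first_table = table_of_relations([("red", "boots"), ("black", "hats")], [1, 2])
third_table = table_of_relations(
    [("silly", "kitty"), ("kitty", "jumps")], ["R17", "M81"]
)
```

Storing the identifier as a string lets the same structure hold serial numbers such as “1” and “2” as well as arbitrary tokens such as “R17” and “M81.”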
  • FIG. 2 is a non-limiting exemplary diagram of some steps of the inventive method displaying a novel type of index table, here introduced as an Index and Associations' Number Table, which exploits the concept of using information such as an association number for identifying the word elements of an association. The set of Data Corpuses 2010 (FIG. 2) comprises three exemplary documents or pages: the first sentence “[1] red boots and black hats,” the second sentence “[2] black boots and red hats” and the third sentence “[3] black hats and red boots.” Beneath it is the inventive Index and Associations' Number Table 2050 (FIG. 2), displaying several rows (1-12) and columns (Word, Page No. and Assoc. No.). In such fashion, each word in the Index Table has information identifying its data corpus or page number (Page No.) and its association number (Assoc. No.). Consequently, the association number on a given page represents the linkage between two words, as in “red boots.” For example, in the Index Table 2050 (FIG. 2) the sixth row says that the word “boots” is present in page number 3 ([3]) and that it is associated with another word through an association identified by number “2.” As another example, the twelfth or last row says that the word “red” is present in data corpus number 3 (page number [3]) and that it is also linked to another word through association number 2. As a result, association number 2 in page [3] can be used to identify both of its corresponding word elements, “red” and “boots.” In this fashion, the association number becomes a representative of the elements involved in its association. As a further example, the second and fifth rows of the Index and Associations' Number Table say that association number 1 of page [2] involves “black” and “boots” respectively.
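  • One way to picture how such a table could be produced is sketched below in Python. This is an illustration under stated assumptions: the per-page relations in `DOCS` are entered by hand rather than derived by CIRN, the association numbers follow the rows quoted in the text, and the function name `build_index` is hypothetical.

```python
# Page number -> list of (word1, word2, association number) for that page.
# Assumption: these relations stand in for the output of the associative analysis.
DOCS = {
    1: [("red", "boots", 1), ("black", "hats", 2)],   # [1] red boots and black hats
    2: [("black", "boots", 1), ("red", "hats", 2)],   # [2] black boots and red hats
    3: [("black", "hats", 1), ("red", "boots", 2)],   # [3] black hats and red boots
}


def build_index(docs):
    """Return rows of (word, page number, association number), sorted by word."""
    rows = []
    for page, relations in docs.items():
        for word1, word2, assoc in relations:
            # Both members of an association carry the same number on that page.
            rows.append((word1, page, assoc))
            rows.append((word2, page, assoc))
    return sorted(rows)


index = build_index(DOCS)
for row_number, (word, page, assoc) in enumerate(index, start=1):
    print(f"{row_number:>2}  {word:<6} [{page}]  {assoc}")
```

With the rows sorted by word and then by page, row 6 reads “boots, [3], 2” and row 12 reads “red, [3], 2,” which agrees with the rows cited for FIG. 2.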
  • FIG. 3A is a non-limiting exemplary diagram of some significant steps of the inventive method handling a query and an Index and Associations' Number Table for identifying information that matches the query for retrieval. The Query 3010 (FIG. 3A) comprises the phrase or sentence “red hats.” The Associative Procedure 3020 (FIG. 3A), such as CIRN (Conceptual Interrelating Network Protocol), identifies whether a relationship is present or possible between the words of the phrase or sentence “red hats.” Please note that a variety of methodologies or protocols, such as different types of CIRN, are available for forming, producing and/or identifying associations (desired or not) between the different word elements of several kinds of data corpuses, such as those using only text, words, group identifiers, eeggis, sounds, etc. In this example, the Query Table of Relations 3030 (FIG. 3A) illustrates that a single relationship is attained between the word elements “red” and “hats” of the Query. Consequently, any document(s) wherein the words “red” and “hats” are associated will represent a match. Next, the Index and Associations' Number Table 3050 (FIG. 3A) provides the information needed to locate those pages or data corpuses wherein “red” and “hats” are indeed related, as implied by their association number. For example, in the Index and Associations' Number Table, the tenth row indicates that “red” is present in data corpus number 1 (page 1 or [1] under the Page No. column) and that it is associated with another word element through association identification number “1.” In similar fashion, the first row of the table indicates that the word “black” is found in page number 1 and that it is associated or linked to another element through association number “2.” The Possible Documents Table 3070 (FIG. 3A) illustrates the collection of pages (data corpuses) wherein the words of the query are present, together with their corresponding association numbers. For example, the Possible Documents Table illustrates that the word “red” is present in pages [1], [2] and [3] and that it relates to other words through association numbers 1, 2 and 2 respectively. The word “hats,” which is also present in pages [1], [2] and [3], relates to other words through association numbers 2, 2 and 1 respectively. Consequently, after performing an intersection of the information for both words in the Possible Documents Table, we can see that the words “red” and “hats” in the second page (data corpus [2]) have identical association numbers (number 2), indicating that both words are indeed related or associated in that page. As a result, the second data corpus or second page is a match to the Query (in both the Query and the second data corpus the words “red” and “hats” are associated), and the second page is therefore retrieved and/or displayed in the Results Display 3090 (FIG. 3A).
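  • The intersection step described for FIG. 3A can be sketched as a set intersection over (page, association number) pairs. The Python below is a minimal illustration, assuming the twelve index rows implied by the text; the function names `postings` and `pages_where_associated` are hypothetical.

```python
# Rows of the index table as (word, page number, association number).
INDEX = [
    ("black", 1, 2), ("black", 2, 1), ("black", 3, 1),
    ("boots", 1, 1), ("boots", 2, 1), ("boots", 3, 2),
    ("hats", 1, 2), ("hats", 2, 2), ("hats", 3, 1),
    ("red", 1, 1), ("red", 2, 2), ("red", 3, 2),
]


def postings(index, word):
    """All (page, association number) pairs recorded for one word."""
    return {(page, assoc) for w, page, assoc in index if w == word}


def pages_where_associated(index, word1, word2):
    """Pages on which both words carry the same association number."""
    shared = postings(index, word1) & postings(index, word2)
    return sorted({page for page, _ in shared})


print(pages_where_associated(INDEX, "red", "hats"))  # [2]
```

Both words occur on every page, so a plain keyword search would return all three pages; requiring a shared (page, association number) pair leaves only page [2].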
  • FIG. 3B is another non-limiting exemplary diagram of some significant steps of the inventive method handling a query with several associations and an Index and Associations' Number Table for identifying and retrieving information that matches the associations in the query. The Query 3010 (FIG. 3B) comprises the phrase or sentence “black hats and red boots.” Notably, this query involves two phrases: “black hats” is the first and “red boots” is the second. The Associative Procedure 3020 (FIG. 3B), such as CIRN (Conceptual Interrelating Network Protocol), identifies whether a relationship(s) is present or possible between several groups of words from the set of phrases or sentences “black hats and red boots.” Please note that a variety of methodologies or protocols, such as different types of CIRN, are available for forming, producing and/or identifying associations (desired or not) between the different word elements of several kinds of data corpuses, such as those using only text, words, group identifiers, eeggis, sounds, etc. In this particular example, the Query Table of Relations 3030 (FIG. 3B) illustrates that two different relationships between the word elements of the Query are possible or created. For example, “black” associates with “hats” through the information “F1,” while “red” and “boots” are associated through the information “H2.” Notably, F1 and H2 are representatives of the associations of their corresponding word elements, black hats and red boots respectively. Consequently, any document(s) wherein the word “black” is associated with “hats” and wherein the word “red” is associated with the word “boots” would be a match. Next, the Index Table 3050 (FIG. 3B) provides the information needed to locate those pages (data corpuses) wherein “black” is related to “hats” and “red” is related to “boots,” as implied by their corresponding association numbers. For example, in the Index and Associations' Number Table, the fourth row indicates that “boots” is present in data corpus number 1 (page 1 or [1] under the Page No. column) and that it is associated with another word element by association identification number “1.” In similar fashion, the last or twelfth row indicates that the word “red” is present in the third data corpus (third page or [3]) and that it is associated or linked to another element through association number “2.” The Possible Documents Table 3070 (FIG. 3B) illustrates the collection of pages (data corpuses) wherein the words of the query are present, together with their corresponding association numbers. For example, the Possible Documents Table, which displays a synopsis or summary of all the information about the query's words found in the Index Table, illustrates that “black” is present in pages [1], [2] and [3] and that it relates to other words through association numbers 2, 1 and 1 respectively. It also illustrates that “hats” is found in pages [1], [2] and [3] and relates to other words through association numbers 2, 2 and 1 respectively. It also shows that “red” is found in pages [1], [2] and [3] and is linked to other words through association numbers 1, 2 and 2 respectively. Finally, it illustrates that “boots” is found in pages [1], [2] and [3] and is linked to other words through association numbers 1, 1 and 2 respectively. Consequently, after performing an intersection of the sets of information as implied by the Query, we can observe that “black” and “hats” in page [1] share the same association number, “2.” In addition, “red” and “boots” in page [1] have identical association information, “1.” As a result, page [1] contains all the words experiencing the same associations that the Query's words experience among themselves. Consequently, the Results Display 3090 (FIG. 3B) displays the matching record, page [1]. In similar fashion, according to the Index and Associations' Number Table or the Possible Documents Table, page [3] contains “black” and “hats” related by the same association number and also contains “red” and “boots” associated through the same association number. As a result, page [3] is also displayed in the Results Display window. Notably, the Possible Documents Table 3070 (FIG. 3B) operates as a summary of the information attained directly from the Index and Associations' Number Table regarding the word elements of the search. In such fashion, the Possible Documents Table serves as an aid in the teaching and disclosure of the present inventive method.
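  • For a query with several associations, the same comparison can be repeated per association and the resulting page sets intersected. The Python sketch below assumes the twelve index rows implied by the text; `build_postings` and `match_query` are hypothetical names. The query's own identifiers (F1, H2) only group the query's words into pairs and are not compared with the pages' numbers.

```python
from collections import defaultdict

# Index rows as (word, page number, association number).
ROWS = [
    ("black", 1, 2), ("black", 2, 1), ("black", 3, 1),
    ("boots", 1, 1), ("boots", 2, 1), ("boots", 3, 2),
    ("hats", 1, 2), ("hats", 2, 2), ("hats", 3, 1),
    ("red", 1, 1), ("red", 2, 2), ("red", 3, 2),
]


def build_postings(rows):
    """Map each word to its set of (page, association number) pairs."""
    table = defaultdict(set)
    for word, page, assoc in rows:
        table[word].add((page, assoc))
    return table


def match_query(table, query_relations):
    """Pages on which every association of the query is also present."""
    pages = None
    for word1, word2 in query_relations.values():
        shared = table.get(word1, set()) & table.get(word2, set())
        found = {page for page, _ in shared}
        pages = found if pages is None else pages & found
    return sorted(pages or [])


# Query table of relations: identifier -> associated pair.
QUERY = {"F1": ("black", "hats"), "H2": ("red", "boots")}
print(match_query(build_postings(ROWS), QUERY))  # [1, 3]
```

Page [2] is rejected even though it contains all four words, because there “black” shares its number with “boots” rather than with “hats.”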
  • FIG. 3C is another non-limiting exemplary diagram of a variation of some of the most significant steps of the inventive method discussed in FIG. 3B, manipulating group identifiers instead of words. In this example, the Query 3010 (FIG. 3C) comprises two group identifier sentences, “aj88 no44+aj99 no33,” which in English spell or mean “black hats and red boots.” The Associative Procedure 3020 (FIG. 3C), such as CIRN (Conceptual Interrelating Network Protocol), identifies any relationships present or possible between several sets of group identifiers from the phrase(s) or sentence(s) in the Query. Please note that the CIRN protocols in this example are designed to handle the group identifiers involved. The Query Table of Relations 3030 (FIG. 3C) illustrates the two resulting relationships from the Query. For example, “aj88” (black) relates to “no44” (hats) and to “F1” (information identifying the association), while “aj99” (red) relates to “no33” (boots) and to “H2” (information identifying the association). In such fashion, any document(s) wherein “aj88” and “no44” relate and wherein “aj99” and “no33” relate will represent a match to the query. The Index and Associations' Number Table 3050 (FIG. 3C) provides the information needed to retrieve matching data. For example, in the Index and Associations' Number Table, the fifth row shows that “no33” is in page [2] and that it relates to another word element (group identifier) by number “1” (the Assoc. No.). Also, in the Index Table we can see that “aj88” (black) is present in pages [1], [2] and [3] and that it relates to other identifiers through association numbers 2, 1 and 1 respectively. It also illustrates that “no44” (hats) is found in pages [1], [2] and [3] and relates to other identifiers through association numbers 2, 2 and 1 respectively. It also shows that “aj99” (red) is found in pages [1], [2] and [3] and is linked to other identifiers through association numbers 1, 2 and 2 respectively. Finally, it illustrates that “no33” (boots) is found in pages [1], [2] and [3] and is linked to other identifiers through association numbers 1, 1 and 2 respectively. The Possible Documents Table 3070 (FIG. 3C) illustrates a summary or comparative collection of data wherein aj88, aj99, no33 and no44 are found. Notably, either the Index and Associations' Number Table or the Possible Documents Table can be used to illustrate and/or retrieve the desired data. As a result, page [1] contains all the sets of identifiers and corresponding associations required by the Query. Consequently, the Results Display 3090 (FIG. 3C) displays the matching page [1]. In similar fashion, according to the Index and Associations' Number Table or the Possible Documents Table, page [3] contains the same sets of identifiers forming the same associations as the Query. As a result, page [3] is also displayed in the Results Display window.
  • FIG. 3D is another non-limiting exemplary block diagram of a variation of some of the most significant steps of the inventive method discussed in FIG. 3B, manipulating eeggis instead of words. Notably, eeggis are indices that group all synonyms through a value spectrum or region, such as a number with decimals. In this example, the Query 3010 (FIG. 3D) comprises the two eeggi sentences “aj8.0 no4.1+aj9.2 no3.1,” which in English spell or mean “black hats and red boots.” The Associative Procedure 3020 (FIG. 3D), such as CIRN (Conceptual Interrelating Network Protocol), identifies any relationships present or possible between several groups of eeggi from the phrase or sentence in the Query. Please note that the CIRN protocols in this example are designed to handle the eeggi involved. The Query Table of Relations 3030 (FIG. 3D) illustrates the two resulting relationships from the Query. For example, “aj8.” (black and/or any synonyms of black, represented by the region aj8.xxx) relates to “no4.” (hats and/or any synonyms of hats, represented by the spectrum no4.xxx) and to “F1” (information identifying the association), while “aj9.” (red and/or any synonyms of red, such as “crimson” or aj9.5) relates to “no3.” (boots and/or any synonyms of boots) and to “H2” (information identifying the association). In such fashion, any document(s) wherein “aj8.xxx” and “no4.xxx” relate through the same association number and wherein “aj9.xxx” and “no3.xxx” relate through the same association number will represent a match to the query. The Index and Associations' Number Table 3050 (FIG. 3D) provides the information needed to retrieve matching data. For example, in this table, the tenth row shows that “aj9.5” (crimson, a synonym of red) is in page [1] and that it relates to another word element (eeggi) through association number “1” (Assoc. No.). In the Index and Associations' Number Table we can see that “aj8.0” (black) is present in pages [1], [2] and [3] and that it relates to other identifiers through association numbers 2, 1 and 1 respectively. It also illustrates that “no4.1” (hats) is found in pages [1], [2] and [3] and relates to other identifiers through association numbers 2, 2 and 1 respectively. It also shows that “aj9.2” (red) and “aj9.5” (crimson) are found in pages [1], [2] and [3] and that they are linked to other eeggi through association numbers 1, 2 and 2 respectively. Finally, it illustrates that “no3.1” (boots) is found in pages [1], [2] and [3] and is linked to other eeggi through association numbers 1, 1 and 2 respectively. The Possible Documents Table 3070 (FIG. 3D) illustrates a summary or comparative collection of data wherein the spectrums aj8, aj9, no3 and no4 are found. Notably, either the Index Table or the Possible Documents Table can be used to illustrate and/or retrieve the desired data. As a result, page [1] contains all the eeggi regions or spectrums and corresponding associations implied by the Query. Consequently, the Results Display 3090 (FIG. 3D) displays the matching page [1] (crimson boots and black hats). Notably, page [1] involves “crimson” (a synonym of red) instead of “red” or its eeggi. In similar fashion, according to the Index Table and/or the Possible Documents Table, page [3] contains the same eeggi spectrums forming the same associations as the Query. As a result, page [3] is also displayed in the Results Display window.
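  • The synonym behavior described for FIG. 3D can be sketched by comparing eeggis on their region (the part before the decimal point) rather than on their full value. The Python below is an illustration only: treating the region as a string prefix is an assumption, placing “aj9.2” on pages [2] and [3] is inferred from page [1] using “crimson,” and the names `region`, `postings` and `match` are hypothetical.

```python
# Index rows as (eeggi, page number, association number).
ROWS = [
    ("aj8.0", 1, 2), ("aj8.0", 2, 1), ("aj8.0", 3, 1),   # black
    ("no3.1", 1, 1), ("no3.1", 2, 1), ("no3.1", 3, 2),   # boots
    ("no4.1", 1, 2), ("no4.1", 2, 2), ("no4.1", 3, 1),   # hats
    ("aj9.5", 1, 1),                                     # crimson, page [1]
    ("aj9.2", 2, 2), ("aj9.2", 3, 2),                    # red (assumed pages)
]


def region(eeggi):
    """Reduce an eeggi to its synonym region, e.g. 'aj9.5' -> 'aj9'."""
    return eeggi.split(".", 1)[0]


def postings(rows, key):
    """Group (page, association number) pairs under key(eeggi)."""
    table = {}
    for eeggi, page, assoc in rows:
        table.setdefault(key(eeggi), set()).add((page, assoc))
    return table


def match(rows, query_pairs, key=region):
    """Pages on which every query pair shares an association number."""
    table = postings(rows, key)
    pages = None
    for e1, e2 in query_pairs:
        shared = table.get(key(e1), set()) & table.get(key(e2), set())
        found = {page for page, _ in shared}
        pages = found if pages is None else pages & found
    return sorted(pages or [])


QUERY = [("aj8.0", "no4.1"), ("aj9.2", "no3.1")]  # black hats + red boots
print(match(ROWS, QUERY))           # region match: [1, 3]
print(match(ROWS, QUERY, key=str))  # exact match only: [3]
```

Matching on the region retrieves page [1], where “crimson” stands in for “red,” whereas exact matching on the full eeggi would miss it.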
  • FIG. 3E is another non-limiting exemplary diagram of a variation of some important steps of the inventive method this time dealing with several index tables, such as a Double Index Table and an Association Number Index Table. In this example, the Query 3010 (FIG. 3E) comprises the word sentence “black boots and red hats.” The Associative Procedure 3020 (FIG. 3E) such as CIRN (Conceptual Interrelating Network Protocol) identifies any relationships present or possible between several words of the phrase or sentence in the Query. The Query Table of Relations 3030 (FIG. 3E) illustrates the resulting two relationships from the Query. For example, “black” relates to “boots” and “red” relates to “hats.” In such fashion, any document(s) wherein “black” and “boots” relate and wherein “red” and “hats” relate too, will represent a match to the query. The Double Index Table 3051 (FIG. 3E) provides part of the information needed to retrieve matching data. For example, in the Double Index Table, the topmost or first row shows that “black” and “hats” are related across an association number “317.” Noteworthy, this is called a Double Index Table because both columns, “Word1” and “Word2,” are used to identify the word elements relating in a given association. As a result, paying close attention to the Double Index Table, we can see that “black” and “boots” or vice versa (second row), are identified by association number “1028;” while “red” and “hats” (last or fourth row) are identified by association number “28371.” The Association Number Index Table 3052 (FIG. 3E) further identifies which associations are found or are experienced on which documents. For example, in the Association Number Index Table, association number 317 (top most) is found in pages [1] and [3]. As a result, paying close inspection of the Association Number Index Table, we can see that both associations of the query (association numbers “1028” and “28371”) can be found in page [2]. 
The Possible Documents 3085 (FIG. 3E) displays all the pages, including those that are to be retrieved. Consequently, from both tables it is concluded that only page [2] contains the same associations as the query. The Results Display 3090 (FIG. 3E) displays the matching page [2] (black boots and red hats), which matched the associations and word elements of the Query.
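The two-table lookup of FIG. 3E might be sketched as follows. The association numbers 317, 1028 and 28371 and the pages for 317 come from the text; everything else is an assumption made for illustration. In particular, the third row of the Double Index Table is not described in this passage and is left out, and association numbers 1028 and 28371 are listed only on page [2] because no other pages are given for them. The names `DOUBLE_INDEX`, `ASSOCIATION_PAGES` and `retrieve` are hypothetical.

```python
# Double Index Table: an unordered word pair -> association number.
# Rows 1, 2 and 4 are transcribed from the text; row 3 is not given and is omitted.
DOUBLE_INDEX = {
    frozenset({"black", "hats"}): 317,
    frozenset({"black", "boots"}): 1028,
    frozenset({"red", "hats"}): 28371,
}

# Association Number Index Table: association number -> pages.
# 317 -> pages 1 and 3 is from the text. For 1028 and 28371 the text only says
# both occur on page 2; listing no other pages for them is an assumption.
ASSOCIATION_PAGES = {
    317: {1, 3},
    1028: {2},
    28371: {2},
}


def retrieve(query_pairs):
    """Return the pages containing every association named by the query pairs."""
    pages = None
    for first, second in query_pairs:
        # frozenset makes the lookup order-independent ("or vice versa").
        number = DOUBLE_INDEX.get(frozenset({first, second}))
        if number is None:
            return []  # the pair was never indexed as an association
        found = ASSOCIATION_PAGES.get(number, set())
        pages = set(found) if pages is None else pages & found
    return sorted(pages or [])


print(retrieve([("black", "boots"), ("red", "hats")]))  # [2]
```

Keying the first table on an unordered pair is one way to honour the remark that “black” and “boots” may appear in either column; a real table could equally store both orderings as separate rows.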
  • FIG. 4 is a non-limiting block diagram of some of the steps of the inventive method displayed in FIG. 1 and FIG. 2 for producing or providing an indexing table for finding and/or comparing and/or retrieving matching information. The First Step 4010 (FIG. 4) involves identifying a First Word Element (a word element is information identifying at least one of a: word, concept, idea, meaning, image and grammatical element) in a Data Corpus. For example, in a query with four word elements, one of the elements is identified or selected. The Second Step 4020 (FIG. 4) involves identifying another or Second Word Element in the said Data Corpus. For example, from the query in the First Step, in which one element was identified, another of the remaining three elements is identified or selected in this second step. The next or Third Step 4030 (FIG. 4) involves identifying and/or finding an association between said First and Second Word Elements through the use of an associative protocol such as CIRN. For example, CIRN (Conceptual Inter-relating Network Protocols) identifies and/or forms associations between different types of word elements of a particular data corpus. In such fashion, a sentence such as “fat cats and silly dogs,” when analyzed by CIRN, yields an association between “fat” and “cats” and another association between “silly” and “dogs.” The next or Fourth Step 4040 (FIG. 4) involves implementing information for identifying the association from the previous Third Step. For example, in the sentence or data corpus “fat cats and silly dogs” two associations are made, which results in one information identifying the first association (between “fat” and “cats”) and another, different information identifying the second association (between “silly” and “dogs”). The next or Fifth Step 4050 (FIG. 4) involves the obvious step of implementing an information for identifying the data corpus comprising the said First Word Element and said Second Word Element. For example, in indexing tables of the current art, each word in the table has or uses an information for identifying the documents or pages wherein the word is present. In such fashion, search engines can quickly retrieve those documents comprising the words of the query. The last or Sixth Step 4060 (FIG. 4) involves registering or recording all the needed information to form the indexing table, such as the first information (information identifying the association between at least two word elements), the second information (such as the information identifying the pages or documents comprising the indexed word), and at least one of the word elements in the index table.
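The six indexing steps of FIG. 4 can be approximated by the sketch below. It is not the disclosed implementation: CIRN itself is not specified in this passage, so `toy_associations` is a hypothetical stand-in that simply pairs the two words of each phrase around “and,” and the running counter used as the association identifier is likewise an assumption. The names `build_index` and `toy_associations` are invented for illustration.

```python
from collections import defaultdict
from itertools import count


def toy_associations(text):
    """Hypothetical stand-in for CIRN: pair the two words of each 'and'-separated phrase."""
    pairs = []
    for phrase in text.lower().split(" and "):
        words = phrase.split()
        if len(words) == 2:
            pairs.append((words[0], words[1]))
    return pairs


def build_index(corpora, find_associations):
    """Build word element -> [(association id, document id), ...].

    corpora: {document id: text}
    find_associations: callable returning (first word, second word) pairs
    """
    index = defaultdict(list)
    next_id = count(1)  # assumed numbering scheme for association identifiers
    for doc_id, text in corpora.items():
        # Steps 1-3: identify two word elements and the association between them.
        for first, second in find_associations(text):
            # Step 4: information identifying the association.
            assoc_id = next(next_id)
            # Steps 5-6: register the association id and the document id
            # against each word element in the index table.
            index[first].append((assoc_id, doc_id))
            index[second].append((assoc_id, doc_id))
    return dict(index)


index = build_index({"page1": "fat cats and silly dogs"}, toy_associations)
print(index["fat"], index["dogs"])  # [(1, 'page1')] [(2, 'page1')]
```

Because the counter runs across the whole collection, the same word pair receives a new identifier each time it occurs. A table in the style of the Double Index Table of FIG. 3E would instead reuse one number per pair; either choice fits the steps as described.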
  • FIG. 5 is a non-limiting block diagram of some of the main steps of the inventive method displayed in FIGS. 3A, 3B, 3C, 3D and 3E for retrieving information. The First Step 5010 (FIG. 5) involves identifying an association between a plurality of word elements from a first data corpus such as a query. For example, a query such as “red hats” involves an association between the words “red” and “hats.” The next or Second Step 5020 (FIG. 5) involves identifying a word element from said first data corpus or said association. For example, from the query “red hats” the word “red” is identified or selected. The next or Third Step 5030 (FIG. 5) involves searching an index table identifying a first word element from the query and at least one of a: information identifying an association between said first word element and another word element, information identifying the location or data corpus comprising said first word element. For example, an index table identifying the word “red” could mention that “red” is in pages X, Y and Z, and that in page X “red” is associated with another word such as “hats,” or that in page X the association of “red” with another word is identified by an association number such as “F1.” In such fashion, looking up the word “red” in the index table yields not only the pages in which “red” is found, but also the information identifying the associations that “red” has with other word elements. The next or Fourth Step 5040 (FIG. 5) involves searching an index table, this time identifying the second word element of the query and at least one of a: information identifying an association between said second word element and another word element, information identifying the location or data corpus comprising said second word element. 
For example, an index table (the same as in the Third Step or another) identifying the second word element of the query, “hats,” could mention that “hats” is in pages X, V and U, and that in page X “hats” is also associated with another word, wherein their association is also identified by “F1.” In such fashion, looking up the word “hats” in the index table yields not only the pages in which “hats” is found, but also the information identifying the associations that “hats” has with other word elements. The next or Fifth Step 5050 (FIG. 5) involves identifying an information identifying a data corpus wherein said first word element and said second word element have the same said information identifying an association. In other words, it identifies a corpus of information where the first word is associated with the second word (same associative information). For example, for a query such as “red hats,” whose words form an association, find all those pages in the index table wherein “red” and “hats” are in the same page and also share the same information identifying their association. The final or Sixth Step 5060 (FIG. 5) involves the obvious step of retrieving the data corpuses or pages comprising the word elements with the same associations as the query.
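The “red hats” retrieval walk-through can be sketched as follows. Only the pages (X, Y, Z for “red”; X, V, U for “hats”) and the shared identifier “F1” on page X come from the text. The identifiers F2 to F5 on the other pages are made up, and the layout assumes, for simplicity, a single association per word per page. `INDEX` and `retrieve` are hypothetical names.

```python
# word element -> {page: association identifier}.
# Pages and "F1" are from the text; F2..F5 are placeholder identifiers
# invented for the pages whose associations the text does not describe.
INDEX = {
    "red": {"X": "F1", "Y": "F2", "Z": "F3"},
    "hats": {"X": "F1", "V": "F4", "U": "F5"},
}


def retrieve(index, first, second):
    """Return pages holding both words under the same association identifier."""
    first_entry = index.get(first, {})    # Third Step: look up the first word
    second_entry = index.get(second, {})  # Fourth Step: look up the second word
    # Fifth Step: same page and same association identifier.
    shared_pages = first_entry.keys() & second_entry.keys()
    # Sixth Step: the matching pages are what would be retrieved.
    return sorted(page for page in shared_pages if first_entry[page] == second_entry[page])


print(retrieve(INDEX, "red", "hats"))  # ['X']
```

A plain keyword index would also return page X here, since it is the only page containing both words; the association identifier matters when both words share a page without being related to each other.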
  • Notably, there are several forms of arranging an index table and several ways of combining several types of indexing tables. In addition, there is a myriad of CIRN types and types of word elements such as words, group identifiers and eeggis, as well as several types of languages, grammar, desired associations and combinations for identifying the elements and their associations, thus leading to possibly hundreds of other figures and corresponding detailed descriptions without ever departing from the main spirit and scope of the disclosed inventive method. Consequently, to ease and facilitate the illustration, description and teaching of the inventive method, the disclosed figures are assumed or expected to suffice for the description of the main steps and enablements of the disclosed inventive method.
  • The enablements described in detail above are considered novel over the prior art of record and are considered critical to the operation of at least one aspect of an apparatus and its method of use and to the achievement of the above described objectives. The words used in this specification to describe the instant embodiments are to be understood not only in the sense of their commonly defined meanings, but to include by special definition in this specification: structure, material or acts beyond the scope of the commonly defined meanings. Thus if an element can be understood in the context of this specification as including more than one meaning, then its use must be understood as being generic to all possible meanings supported by the specification and by the word or words describing the element.
  • The definitions of the words or drawing elements described herein are meant to include not only the combination of elements which are literally set forth, but all equivalent structure, material or acts for performing substantially the same function in substantially the same way to obtain substantially the same result. In this sense it is therefore contemplated that an equivalent substitution of two or more elements may be made for any one of the elements described and its various embodiments or that a single element may be substituted for two or more elements in a claim.
  • Changes from the claimed subject matter as viewed by a person with ordinary skill in the art, now known or later devised, are expressly contemplated as being equivalents within the scope intended and its various embodiments. Therefore, obvious substitutions now or later known to one with ordinary skill in the art are defined to be within the scope of the defined elements. This disclosure is thus meant to be understood to include what is specifically illustrated and described above, what is conceptually equivalent, what can be obviously substituted, and also what incorporates the essential ideas.
  • The scope of this description is to be interpreted only in conjunction with the appended claims and it is made clear, here, that each named inventor believes that the claimed subject matter is what is intended to be patented.
  • CONCLUSION
  • From the foregoing, a series of novel methods for forming an index table, implementing an indexing methodology and retrieving information can be appreciated. The described methods overcome the limitations encountered by current information technologies such as search engines, speech recognition, word processors, and others which fail to identify and/or effectively implement the underlying associations between different kinds of word elements, a failure that potentially leads to the generation of irrelevant data, irrational data and user confusion. The methods thereby allow current and future information technologies to properly and effectively manipulate, identify, select, match and retrieve data.

Claims (4)

1. A Method for indexing information comprising the steps of:
a) Identifying a first word element such as an information identifying a word, concept, idea, meaning, image and grammatical information in a first data corpus
b) Identifying a second word element such as an information identifying a word, concept, idea, meaning, image and grammatical information in said first data corpus
c) Identifying a first association between said first word element and said second word element implementing an associative protocol such as CIRN
d) Implementing a first information for identifying said first association
e) Implementing a second information for identifying said first data corpus
f) Registering at least one of a said: first information and second information with at least one of a said: first word element and second word element
2. A method for retrieving information comprising the steps of:
a) Identifying an association between one of a plurality of word elements from a first data corpus such as a query
b) Identifying a word element from said plurality
c) Searching an index table comprising at least one of a: said first word element, information identifying a data corpus of said first word element and information identifying an association of said first word element
d) Searching an index table comprising at least one of a: said second word element, information identifying a data corpus of said second word element and information identifying an association of said second word element
e) Identifying an information identifying a data corpus; wherein said first word element and said second word element have the same said information identifying an association
f) Retrieving said data corpus
3. A method for providing an index table comprising the steps of:
a) Identifying an index table
b) Adding an information field to said index table for containing information for identifying at least one information identifying an association between its indexed word element and one other word element.
c) Registering information in said information field
4. A method for identifying information of an index table in a data corpus such as a query, the method comprising the steps of:
a) Identifying a first group of word elements in a data corpus such as a query,
b) Identifying a first association between said first group of word elements,
c) Identifying a second group of word elements in said data corpus,
d) Identifying a second association between said second group of word elements,
e) Assigning each said association a unique identifying information.
US12/661,607 2009-03-18 2010-03-18 Methods for indexing and retrieving information Abandoned US20100241630A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/661,607 US20100241630A1 (en) 2009-03-18 2010-03-18 Methods for indexing and retrieving information

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US21039609P 2009-03-18 2009-03-18
US12/661,607 US20100241630A1 (en) 2009-03-18 2010-03-18 Methods for indexing and retrieving information

Publications (1)

Publication Number Publication Date
US20100241630A1 true US20100241630A1 (en) 2010-09-23

Family

ID=42738394

Family Applications (3)

Application Number Title Priority Date Filing Date
US12/661,612 Active 2033-03-05 US9063923B2 (en) 2009-03-18 2010-03-18 Method for identifying the integrity of information
US12/661,607 Abandoned US20100241630A1 (en) 2009-03-18 2010-03-18 Methods for indexing and retrieving information
US12/661,613 Abandoned US20100241631A1 (en) 2009-03-18 2010-03-18 Methods for indexing and retrieving information

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US12/661,612 Active 2033-03-05 US9063923B2 (en) 2009-03-18 2010-03-18 Method for identifying the integrity of information

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/661,613 Abandoned US20100241631A1 (en) 2009-03-18 2010-03-18 Methods for indexing and retrieving information

Country Status (1)

Country Link
US (3) US9063923B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110314001A1 (en) * 2010-06-18 2011-12-22 Microsoft Corporation Performing query expansion based upon statistical analysis of structured data
US20230096705A1 (en) * 2021-09-28 2023-03-30 Yohsuke Utoh Information processing apparatus, data management method, and non-transitory recording medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110145269A1 (en) * 2009-12-09 2011-06-16 Renew Data Corp. System and method for quickly determining a subset of irrelevant data from large data content
US9116996B1 (en) * 2011-07-25 2015-08-25 Google Inc. Reverse question answering
US9858336B2 (en) * 2016-01-05 2018-01-02 International Business Machines Corporation Readability awareness in natural language processing systems
US9910912B2 (en) 2016-01-05 2018-03-06 International Business Machines Corporation Readability awareness in natural language processing systems
US10459900B2 (en) * 2016-06-15 2019-10-29 International Business Machines Corporation Holistic document search
US10628743B1 (en) 2019-01-24 2020-04-21 Andrew R. Kalukin Automated ontology system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6853993B2 (en) * 1998-07-15 2005-02-08 A9.Com, Inc. System and methods for predicting correct spellings of terms in multiple-term search queries
US20050198068A1 (en) * 2004-03-04 2005-09-08 Shouvick Mukherjee Keyword recommendation for internet search engines

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6405162B1 (en) * 1999-09-23 2002-06-11 Xerox Corporation Type-based selection of rules for semantically disambiguating words
CN1302030B (en) * 1999-12-24 2010-04-21 纽昂斯通讯公司 Machine translation method and system for resolving word ambiguity
US6675159B1 (en) * 2000-07-27 2004-01-06 Science Applic Int Corp Concept-based search and retrieval system
US7860706B2 (en) * 2001-03-16 2010-12-28 Eli Abir Knowledge system method and appparatus
US7403890B2 (en) * 2002-05-13 2008-07-22 Roushar Joseph C Multi-dimensional method and apparatus for automated language interpretation
US7539619B1 (en) * 2003-09-05 2009-05-26 Spoken Translation Ind. Speech-enabled language translation system and method enabling interactive user supervision of translation and speech recognition accuracy
US7512596B2 (en) * 2005-08-01 2009-03-31 Business Objects Americas Processor for fast phrase searching
US20080071737A1 (en) 2005-08-01 2008-03-20 Frank John Williams Method for retrieving searched results
US20070214199A1 (en) 2006-03-09 2007-09-13 Williams Frank J Method for registering information for searching
US20070266009A1 (en) 2006-03-09 2007-11-15 Williams Frank J Method for searching and retrieving information implementing a conceptual control
US20070214125A1 (en) 2006-03-09 2007-09-13 Williams Frank J Method for identifying a meaning of a word capable of identifying a plurality of meanings
US20070299831A1 (en) 2006-06-10 2007-12-27 Williams Frank J Method of searching, and retrieving information implementing metric conceptual identities
US20080082511A1 (en) 2006-08-31 2008-04-03 Williams Frank J Methods for providing, displaying and suggesting results involving synonyms, similarities and others
US20080091411A1 (en) 2006-10-12 2008-04-17 Frank John Williams Method for identifying a meaning of a word capable of identifying several meanings
US20080109416A1 (en) 2006-11-06 2008-05-08 Williams Frank J Method of searching and retrieving synonyms, similarities and other relevant information
US20080140635A1 (en) 2006-11-27 2008-06-12 Frank John Williams Methods for providing categorical and/or subcategorical information from a query
US20080140634A1 (en) 2006-11-27 2008-06-12 Frank John Williams Methods for relational searching, discovering relational information, and responding to interrogations
US20080140649A1 (en) 2006-11-27 2008-06-12 Frank John Williams Methods for providing suggestive results
US7899666B2 (en) * 2007-05-04 2011-03-01 Expert System S.P.A. Method and system for automatically extracting relations between concepts included in text
US9053089B2 (en) * 2007-10-02 2015-06-09 Apple Inc. Part-of-speech tagging using latent analogy
WO2010107327A1 (en) * 2009-03-20 2010-09-23 Syl Research Limited Natural language processing method and system

Also Published As

Publication number Publication date
US20100241419A1 (en) 2010-09-23
US20100241631A1 (en) 2010-09-23
US9063923B2 (en) 2015-06-23

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION