US20110119291A1 - Entity Identification and/or Association Using Multiple Data Elements - Google Patents

Entity Identification and/or Association Using Multiple Data Elements

Info

Publication number
US20110119291A1
Authority
US
United States
Prior art keywords
data
entity
values
data elements
entities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/877,096
Inventor
Scott Gregory Robert Rice
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qsent Inc
TransUnion Teledata LLC
Original Assignee
Qsent Inc
TransUnion Teledata LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/283Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2264Multidimensional index structures

Definitions

  • Embodiments described in the present application relate to the field of identification and/or association of one or more data records representing entities within one or more data sources.
  • data sources such as commercial data repositories, utility company customer databases, etc., to list only a few examples, store data records corresponding to individual entities, such as people, companies, etc.
  • the data records are typically comprised of multiple data elements, and the value for each data element typically represents a particular aspect of the entity's identity, or other information related to the entity.
  • Numerous commercial and noncommercial enterprises employ such data sources in a variety of ways as an integral part of their product or service offerings and daily operations.
  • Unfortunately, given the potentially vast array of records a data source can include, it often proves to be a challenge to search, analyze, and/or manipulate the entity-representing data in a meaningful way.
  • some data sources contain inaccurate or out-dated information. For example, even using a well-indexed data source, it often can be difficult to identify with sufficient certainty that one or more particular records actually correspond to the specific entity they putatively represent. It can also be difficult to identify associations between multiple seemingly independent entity data records. Due to variations in the type, amount, and structure of data elements each data source can employ for its respective data records, the challenges of identifying and associating individual entities can be greatly magnified if multiple data sources are employed.
  • Embodiments consistent with the present application can utilize, at least in part, entity data records comprising multiple data elements to facilitate identification of entities and/or associations being made among entities represented by the data records.
  • the data records can originate from and/or be maintained within one or more data sources.
  • Such embodiments can combine data values from a plurality of data elements to form an entity identifier, which can serve, at least in part, as a key for facilitating the identification of and/or associations among entities represented by a plurality of data records.
  • the entity identifier can facilitate the identification of one or more data records corresponding to a unique entity, from one or more data sources.
  • an embodiment can facilitate the association of multiple data records that represent the same entity, and/or multiple data records representing separate, associated entities. For example, associations can be made among two or more unique entities and/or their respective representative data records if they correspond to substantially the same entity identifier.
  • the number, type, and/or characteristics of data elements used to form an entity identifier can be selected so that the entity identifier is expected to represent an individual entity and/or associated entities with at least a predetermined confidence level.
  • FIG. 1 illustrates a system in accordance with one embodiment.
  • FIG. 2 presents one embodiment of a process flow diagram consistent with the claimed subject matter.
  • FIG. 3 conceptually illustrates associations among entities using entity identifiers in accordance with one embodiment.
  • FIG. 4 presents a second embodiment of a process flow diagram consistent with the claimed subject matter.
  • Embodiments consistent with the present application can be implemented as systems, apparatuses, methods, and/or other implementations of subject matter for combining a plurality of data element values from data records originating from one or more data sources to form an entity identifier that can be used, at least in part, to facilitate identification of and/or associations among entities, as represented by the data records.
  • data elements selected to form an entity identifier can be selected at least in part, so that their respective data values can be combined and/or otherwise employed to form an entity identifier that can be substantially statistically unique.
  • Employing an entity identifier that is substantially statistically unique can facilitate identifications and/or associations being made with confidence levels that are appropriately high for a given application.
  • the term “confidence level” corresponds to a probability that an identified association does not represent a false positive association. Thus, the higher the confidence level, the less likely it is that data records will be erroneously associated with one another.
  • One advantage of embodiments consistent with the claimed subject matter is the ability to tailor the extent to which an entity identifier is statistically unique, which can correspondingly yield appropriately tailored confidence levels in the associations made among data records.
  • the degree of uniqueness can be selected to suit the particular application, field of use, and/or implementation in which the entity identifier is to be employed. In certain embodiments and/or implementations, a high confidence level in the identification and/or association results can be desirable. In such instances a more statistically unique entity identifier can be used. In alternative embodiments and/or implementations, lower confidence in the identification and/or association of unique entities can be acceptable. In such instances an entity identifier that is less statistically unique can be employed.
  • The term “substantially statistically unique” is employed herein consistent with the above-described notions of flexibility, scalability, and customization. Two different entity identifiers can both be considered substantially statistically unique, even if one is more statistically unique than the other. One or more embodiments can require that an entity identifier be at least statistically unique enough to provide meaningful and/or useful results for a given implementation.
  • a relatively wider variety of data elements can be selected to form an entity identifier.
  • Such implementations can employ data elements having values that are not highly unique to individual entities, such as a name or date of birth, as but two examples. In a large dataset, there can be multiple data records representing several different entities with the same name and date of birth. However, in an implementation that requires entities to be identified and/or associated with a high degree of confidence, data elements can be selected so as to form entity identifiers that can indicate associations with a confidence level that is sufficiently high for a particular intended application.
  • customization through varying the selection and/or number of data elements employed to form entity identifiers, as well as other factors, can allow the accuracy and/or reliability of operations performed on the data records to be tailored, at least in part, in accordance with the requirements of each specific implementation.
  • an entity identifier embodiment can be formed using data elements representing components of present and/or historic contact information and/or other identifying data stored in data records representing individual entities.
  • data elements can include values representing, in whole or in part, address, phone number, e-mail handle, and/or other data representing an entity, to name but a few examples.
  • An entity identifier embodiment can be formed from values for these and/or other contact information data elements and can be employed, at least in part, to facilitate identification of and/or an association among one or more entities as disclosed in more detail below.
  • the statistical uniqueness of an entity identifier can be improved by selecting data elements to form the entity identifier that have values that are as evenly distributed across the population of entities as possible and/or practicable.
  • even distribution of values can be reasonably achieved by selecting data elements having values that are believed to be substantially randomly assigned to the entities represented by the data records.
  • the concept of even distribution of values can be illustrated graphically as a histogram, frequency diagram, and/or other suitable depiction graphing the range of possible values for a selected data element against the number of instances of each possible value occurring in a given set of data records.
  • Data elements having a distribution of values that graphs more flat, rather than as a bell-curve, can facilitate the formation of entity identifiers that are expected to identify associations among the data records with increased confidence.
  • a data element storing values for the last four digits of a contact phone number would facilitate formation of an entity identifier that would yield higher confidence results than would an entity identifier formed from values for a data element storing zip code data.
  • zip codes as in the example of the United States postal system designation, are not randomly assigned to, or evenly distributed among, entities. Rather, they are assigned based on entity location within geographic groupings.
  • the last four digits of an entity's contact phone number more closely approximate a random distribution throughout the population of entities.
  • using a data element storing full telephone number values decreases the evenness and/or reasonable randomness of the value distribution, as telephone number area codes and prefixes are assigned based at least in part on geographic grouping.
  • the values for the last four digits of a nine-digit U.S. Social Security Number are relatively evenly and/or reasonably randomly assigned throughout the population of U.S. individuals, while the values for the first three and middle two digits exhibit grouping characteristics.
  • Data records that have multiple common values for data elements having reasonably random and/or evenly distributed values can be more confidently associated with one another than can data records that have multiple common values for data elements having values with grouped distribution among the entities. Associating data records based on commonalities in poorly distributed values can lead to false positive associations.
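The contrast between flat and grouped value distributions can be made concrete with a small sketch. The measure used here, normalized Shannon entropy over the observed values, is an assumption chosen for illustration and is not named in the application; the function name and the toy samples are likewise hypothetical. A score near 1.0 indicates a flat histogram, while lower scores indicate grouping.

```python
import math
from collections import Counter

def normalized_entropy(values):
    """Shannon entropy of the observed values divided by the maximum entropy
    for that many distinct values (1.0 means a perfectly flat histogram)."""
    counts = Counter(values)
    total = sum(counts.values())
    if len(counts) < 2:
        return 0.0  # a single repeated value has no distinguishing power
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))

# Hypothetical toy samples, not real data.
phone_last4 = ["0417", "9921", "3350", "7702", "1186", "5643"]
zip_codes = ["97201", "97201", "97201", "97201", "97205", "97209"]

flat = normalized_entropy(phone_last4)   # every value occurs once
grouped = normalized_entropy(zip_codes)  # one value dominates
```

Note that this score only considers values that actually occur in the sample; a fuller analysis would also account for possible values that never appear.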
  • two data records can coincidentally contain two or more matching historical zip code values even though the data records represent neither the same entity nor entities that should be properly associated (such as family members, spouses, non-familial cohabitants, etc.).
  • specific data elements can be selected so as to form entity identifiers that can be used to associate data records with sufficiently high confidence and reduced instances of false positive associations.
  • Embodiments are described below as employing house number values as contact information data elements used to form the entity identifiers.
  • For example, if a data record includes an address of 1234 Main Street for an entity, the “1234” portion of the address is an example of a house number.
  • Use of the term “house number” however is not meant to limit the claimed subject matter to addresses for houses, which are typically unattached single family dwellings.
  • the term “house number” can apply to the corresponding portion of any address data, regardless of the form or type of dwelling, building, or edifice that exists at that location.
  • a house number is but one example of a data element that can be employed consistent with the present application. House-number embodiments are described below only for illustrative purposes and not by way of limitation on the claimed subject matter. Those skilled in the relevant art will appreciate that additional and/or alternative data elements can also be employed consistent with this application and the claimed subject matter.
  • embodiments can be implemented to facilitate entity authentication and/or identification with improved accuracy and reliability. This is facilitated, at least in part, by the fact that, within a range of common house number values, the values can be sufficiently evenly distributed among, and/or randomly assigned to, entities.
  • Embodiments can use present and/or historic house number values, as but two examples.
  • the quantity, specifications, and/or characteristics of the house number data elements can also be chosen so as to achieve a confidence level that is substantially sufficient and/or tailored for a particular application and/or implementation.
  • an embodiment can combine values from a predetermined quantity of house number data elements associated with an entity to form one or more entity identifiers. For example, in one implementation, having a particular data set and/or grouping of data records, an entity identifier formed from values for two house numbers can be sufficiently unique to identify useful associations. A different implementation can require that values from three or more house numbers are used to form entity identifiers. Other variations are also possible consistent with the claimed subject matter.
  • multiple data elements can be selected based at least in part on having values exhibiting characteristics that make them suitable for combining to form an entity identifier that is substantially statistically unique for a given set of data records and/or represented entities.
  • the number of possible values for an entity identifier can be approximated as the number of available, distinct values per digit raised to the power of the number of digits composing the entity identifier.
  • Specific data elements can be selected so the number of possible unique values for an entity identifier formed from values for the selected data elements exceeds the number of distinct entities within the population.
  • Such an entity identifier can be considered substantially statistically unique with respect to the population of entities represented by the data records.
  • the factor by which the number of possible unique entity identifier values exceeds the number of entities can be customized for a desired confidence level in associations made among data records.
  • the greater the factor of excess the more statistically unique the entity identifier is and the better the quality and specificity of the associations made using that entity identifier.
  • a factor can be predetermined and can be designated, selected, and/or applied for an intended application and/or specific implementation to yield associations having a desired confidence level and/or quality.
  • data records can be selected so as to form one or more substantially statistically unique entity identifiers for the given population.
  • the extent to which the formed entity identifier is statistically unique relative to the applicable population can be customized based, at least in part, on selection of data elements. For example, if data elements are selected such that a formed entity identifier includes nine digits, with each digit possessing ten possible numerical values (0-9), then there are approximately one billion possible values for the entity identifier (10^9). This represents a factor of about 3.33 relative to a population of roughly 300 million entities (the population size implied by that factor), meaning there are approximately three and one third possible entity identifier values per entity in the population.
  • data elements can be selected so that a formed entity identifier includes twelve digits, with each digit possessing ten possible numerical values (0-9). In such an embodiment, there are approximately one trillion possible entity identifier values (10^12). Because the number of possible entity identifier values exceeds the number of entities within the population by a factor of over 3,333, the corresponding twelve-digit entity identifiers are more statistically unique than the nine-digit entity identifiers. It should be noted that various data elements can be combined to achieve the results indicated above. For example, two data elements with six-digit values can combine to form an entity identifier that is statistically comparable to an entity identifier formed from three data elements having values of four digits each.
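The arithmetic behind the nine-digit and twelve-digit examples can be checked with a short sketch. The population of 300 million is an assumption inferred from the stated factor of about 3.33; the function names are hypothetical.

```python
def identifier_space(num_digits, values_per_digit=10):
    """Number of distinct identifiers: values per digit raised to the digit count."""
    return values_per_digit ** num_digits

def excess_factor(num_digits, population, values_per_digit=10):
    """How many possible identifier values exist per entity in the population."""
    return identifier_space(num_digits, values_per_digit) / population

# Assumption: population size implied by the 3.33 factor in the text.
ASSUMED_POPULATION = 300_000_000

nine_digit_factor = excess_factor(9, ASSUMED_POPULATION)     # about 3.33
twelve_digit_factor = excess_factor(12, ASSUMED_POPULATION)  # about 3,333

# Two six-digit elements span the same space as three four-digit elements.
same_space = identifier_space(6) ** 2 == identifier_space(4) ** 3
```

These counts are upper bounds: they assume every digit value is equally likely, which holds only to the extent that the chosen data elements are evenly distributed.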
  • the threshold for quantifying and/or qualifying a predetermined factor for a specific implementation and/or application can be determined based at least in part on a number of applicable considerations, including, without limitation, the extent to which the values for the selected data elements are randomly assigned and/or evenly distributed among the entities in the population, and the application's tolerance for false-positives, to name only a couple of examples.
  • Those skilled in the relevant arts will appreciate that certain requirements, considerations, and/or characteristics of an intended application can require associations to achieve a specific confidence level and an appropriately applicable factor value and/or acceptable range of factor values can be determined accordingly.
  • entity identifiers can be defined to include a predetermined number of digits (e.g., 10, 12, 20, etc.). A sufficient number of house number data elements can be combined to achieve the desired number of digits in the entity identifier. Those skilled in the relevant arts will appreciate that increasing the quantity of house numbers providing values used to create the entity identifier also increases the statistical uniqueness of the formed entity identifier.
  • entity identifiers can be created that are substantially statistically unique (e.g., it can be said with substantially high statistical confidence, sufficient for the intended application and/or implementation, that a specific entity identifier corresponds to either one unique entity or separate unique entities that can be properly associated).
  • entities in the United States for which sufficient corresponding address data records exist can be identified or associated using three or more house numbers contained in their financial, utility, or other address history records.
  • the degree and/or extent of statistical uniqueness of the resulting identifier can be sufficiently and/or substantially high if the three house numbers contain a total of twelve or more digits when combined.
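The twelve-digit guideline for three house numbers can be expressed as a simple check. This is a minimal sketch; the function names are hypothetical, and house numbers are held as strings so that any leading zeros are counted.

```python
def total_digits(house_numbers):
    """Count the digits contributed by a collection of house-number strings."""
    return sum(len(number) for number in house_numbers)

def meets_digit_threshold(house_numbers, min_digits=12):
    """True if the combined house numbers supply at least min_digits digits."""
    return total_digits(house_numbers) >= min_digits

enough = meets_digit_threshold(["1234", "5678", "9012"])  # 12 digits in total
too_few = meets_digit_threshold(["900", "725", "1255"])   # 10 digits in total
```

An implementation could use such a check to decide whether a fourth house number is needed before an identifier is formed.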
  • FIG. 1 illustrates one example of a system for implementing identification and/or association embodiments consistent with the claimed subject matter.
  • a computer system 100 is provided to access data records from one or more data sources 102 .
  • Computer system 100 can access data sources 102 directly, or via an optional network connection 104 , such as the Internet, an intranet, LAN, WAN, and/or other network. Accordingly, data sources 102 can be maintained locally and/or remotely with respect to the location of computer system 100 .
  • Computer system 100 can also include and/or have access to a processing engine 106 capable of executing programming instructions for generating and/or applying one or more entity identifiers to identify and/or associate entities represented by the data records in data sources 102 .
  • Results of identification and/or association operations can be further processed and/or applied within computer system 100 , or they can be organized for and/or communicated to one or more separate systems for subsequent handling, if or to the extent necessary and/or desirable given the intended functionality and/or specific implementation in which an embodiment operates.
  • FIG. 2 presents a process flow diagram including examples of steps that can be included in one embodiment of such a process.
  • The process of FIG. 2 can facilitate the identification of non-obvious associations between entities.
  • the process of FIG. 2 can include step 200 , for establishing access to data records, which can include securing access to data records not previously accessed and/or possessed.
  • the data records can be contained within, maintained by, and/or otherwise made available from one or more separate, discrete data sources.
  • In step 202, data records for each individual entity can be processed to identify entity identifiers that correspond to that entity.
  • For example, in an embodiment of step 202 using house numbers, if an entity's data record has house number values including 900, 725, 1255, and 1221, then using combinations of three house numbers, and ignoring order, the following four entity identifiers can be formed: 9007251255, 9007251221, 72512551221, and 90012551221.
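The house-number example for step 202 can be reproduced with a short sketch. It assumes that an identifier is formed by plain concatenation of the selected house numbers in the order they appear in the record, which matches the four values listed; the function name is hypothetical.

```python
from itertools import combinations

def form_entity_identifiers(house_numbers, group_size=3):
    """Concatenate every unordered combination of group_size house numbers."""
    return ["".join(combo) for combo in combinations(house_numbers, group_size)]

# House numbers from the example, kept as strings to preserve their digits.
identifiers = form_entity_identifiers(["900", "725", "1255", "1221"])
```

Plain concatenation can be ambiguous when house numbers differ in length (for instance, "12" followed by "345" yields the same string as "123" followed by "45"), so an implementation might instead keep the components as a tuple or insert a delimiter.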
  • In step 204, results of step 202 can be organized and/or grouped.
  • One example grouping embodiment can include grouping results first according to entity identifier, and then according to individual entity corresponding to each entity identifier. Other grouping methodologies can additionally and/or alternatively be employed.
  • step 206 identifies associations among the entities and can initiate and/or facilitate additional processing of one or more of the data records based, at least in part, on the identified associations. Entities that share entity identifiers can be associated. If an entity identifier from the list grouped in step 204 corresponds to two data records, the entities represented by those data records can accordingly also be associated.
  • the records representing those entities are either separate records representing the same entity, or separate records representing different entities that can be properly associated with one another.
  • Context or data within the data records, application, and/or specific data elements can be used to distinguish between the two types of associations. For example, presence of the same substantially universal key, such as a full Social Security Number, in both data records can indicate that the associated records represent the same entity.
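The grouping and association steps can be sketched as follows. This is one possible reading of steps 204 and 206, not the application's own implementation; the function names, record labels, and sample identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def group_by_identifier(entity_identifiers):
    """Map each identifier to the set of entities that produced it (step 204)."""
    groups = defaultdict(set)
    for entity, identifiers in entity_identifiers.items():
        for identifier in identifiers:
            groups[identifier].add(entity)
    return groups

def find_associations(groups):
    """Return unordered entity pairs sharing at least one identifier (step 206)."""
    pairs = set()
    for entities in groups.values():
        for first, second in combinations(sorted(entities), 2):
            pairs.add((first, second))
    return pairs

# Hypothetical input: record label -> identifiers formed for it in step 202.
sample = {
    "record_A": {"9007251255", "9007251221"},
    "record_B": {"9007251221"},
    "record_C": {"4410882017"},
}
associations = find_associations(group_by_identifier(sample))
```

Deciding whether an associated pair is the same entity or two related entities would need additional context, such as a shared universal key, which this sketch does not attempt.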
  • FIG. 3 presents a diagram conceptually illustrating the association of multiple entities and/or data records using a common entity identifier.
  • Data elements, such as house numbers, in a first entity data record 300 can form two entity identifiers, illustrated in FIG. 3 as entity identifier 302 and entity identifier 304.
  • Entity data record 306 can also form entity identifier 304 .
  • Entity data record 308 forms entity identifier 310 . Because entity data record 300 and entity data record 306 share entity identifier 304 in common, they can be associated. However, because entity data record 308 does not share a common entity identifier, it cannot be associated with either entity data record 300 or entity data record 306 .
  • FIG. 4 presents an alternative methodology and process flow to that depicted in FIG. 2 .
  • FIG. 4 illustrates a process flow diagram for identifying and/or associating entities using, as separate components, individual data elements that collectively can be combined to form an entity identifier.
  • an embodiment implementing the process of FIG. 4 can identify additional records corresponding to the same entity, as well as separate entities that can be associated with the original entity.
  • The process begins at step 400 by identifying and selecting the type and/or quantity of data elements that can be employed so as to yield a substantially statistically unique entity identifier.
  • the values for each of those data element components are gathered from an identified data record representing an original entity.
  • one or more data sources can be queried to identify data records that include a value matching and/or substantially matching the value for any of the data element components identified in step 402 . Separate searches/queries can be executed for each data element value.
  • associations can be identified among entities included in the search results. For example, data records with element values matching each of the search queries either represent the original entity, or entities that can be associated with the original entity with substantial statistical reliability.
  • For example, if a substantially statistically unique entity identifier can be formed for a given set of data records using three house numbers, separate searches can be conducted using each house number as a query. Entities represented by data records that appear three times in the search results list, indicating that the data record included a match for each separate house number value searched, represent the original entity and/or associated entities (e.g., related entities sharing a common address history, etc.).
  • the searching procedure can also employ filtering logic to substantially reduce processing requirements when executing searches. Rather than searching all data records in all data sources for matches based on each search criterion, searching on the second criterion can be limited to those entities returned as results of a search on the first criterion. Similarly, a search performed using the third criterion can be limited to the results of the second search, and so on.
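The filtering logic described above can be sketched as a sequence of searches in which each search examines only the survivors of the previous one. The in-memory dictionary standing in for a data source, the record labels, and the function names are all assumptions made for illustration.

```python
def search(records, house_number, candidates=None):
    """Return ids of records whose house-number history contains house_number.
    When candidates is given, only those record ids are examined."""
    pool = records.keys() if candidates is None else candidates
    return {record_id for record_id in pool if house_number in records[record_id]}

def progressive_search(records, house_numbers):
    """Search on each criterion in turn, narrowing the candidate set each time."""
    candidates = None
    for number in house_numbers:
        candidates = search(records, number, candidates)
        if not candidates:
            break  # nothing left to filter, so later searches can be skipped
    return candidates if candidates is not None else set()

# Hypothetical data source: record id -> set of historical house numbers.
records = {
    "r1": {"900", "725", "1255"},
    "r2": {"900", "725", "1255", "1221"},
    "r3": {"900", "18"},
    "r4": {"725", "1255"},
}
matches = progressive_search(records, ["900", "725", "1255"])
```

Here the second and third searches touch only three and two records respectively rather than all four, which is the saving the filtering approach is meant to provide.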
  • entity identifiers corresponding to two or more data records do not have to represent exact matches in order for the data records and/or the corresponding entities to be associated. Based, at least in part, on factors such as the tolerance for false positive associations in a given application and/or implementation, a certain acceptable margin of error can be allowed for purposes of identifying matches between entity identifiers or selected data element component values.
  • the use of the phrases “substantially matching,” “substantial match,” or the like in this application and the attached claims is meant to indicate matches that are either exact, or within a predetermined acceptable margin of error for a given implementation and/or application.
  • the order in which the data values appear in two or more entity identifiers can be ignored for purposes of comparing entity identifiers and associating corresponding data records.
  • An alternative embodiment can elect to ignore duplicate values in the formation and/or comparison of entity identifiers.
  • Still other embodiments can allow other variances from exact matching.
  • A few such examples include rounding conventions for numeric data values and common synonyms, abbreviations, and/or alternative spellings for alphanumeric data values.
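Two of the relaxed-matching options, ignoring the order of values and ignoring duplicate values, can be sketched by comparing identifiers through a canonical key. This assumes each identifier is kept as a list of its component values rather than as a concatenated string; the function names are hypothetical.

```python
def canonical_key(values, ignore_duplicates=False):
    """Build an order-insensitive comparison key from an identifier's components."""
    if ignore_duplicates:
        return tuple(sorted(set(values)))
    return tuple(sorted(values))

def identifiers_match(first, second, ignore_duplicates=False):
    """True if two identifiers have the same components under the chosen rules."""
    return (canonical_key(first, ignore_duplicates)
            == canonical_key(second, ignore_duplicates))

same_order_free = identifiers_match(["900", "725", "1255"], ["1255", "900", "725"])
strict_dupes = identifiers_match(["900", "900", "725"], ["725", "900"])
loose_dupes = identifiers_match(["900", "900", "725"], ["725", "900"],
                                ignore_duplicates=True)
```

Tolerances such as rounding or synonym handling would require normalizing each value before the key is built, which is left out of this sketch.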

Abstract

Data values from a plurality of data elements can be combined to form one or more entity identifiers to facilitate identifications of and/or associations among a plurality of data records representing one or more entities. Associated data records can represent the same entity and/or multiple entities that can be properly associated. Associations can be made among two or more unique entities and/or their respective representative data records if they correspond to substantially the same entity identifier. In one embodiment, the number, type, and/or characteristics of values for data elements used to form an entity identifier can be selected so that the entity identifier is substantially statistically unique.

Description

    RELATED APPLICATIONS
  • This patent application is a continuation of and claims the benefit of priority from U.S. Nonprovisional patent application Ser. No. 11/818,908, filed Jun. 14, 2007, which is a nonprovisional of and claims the benefit of priority from U.S. Provisional Patent Application No. 60/813,792, filed Jun. 14, 2006, both of which are hereby incorporated by reference in their entirety.
  • COPYRIGHT NOTICE
  • ©2007 TransUnion TeleData, LLC. A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. 37 CFR §1.71(d), (e).
  • TECHNICAL FIELD
  • Embodiments described in the present application relate to the field of identification and/or association of one or more data records representing entities within one or more data sources.
  • BACKGROUND
  • Many data sources, such as commercial data repositories, utility company customer databases, etc., to list only a few examples, store data records corresponding to individual entities, such as people, companies, etc. The data records are typically comprised of multiple data elements, and the value for each data element typically represents a particular aspect of the entity's identity, or other information related to the entity. Numerous commercial and noncommercial enterprises employ such data sources in a variety of ways as an integral part of their product or service offerings and daily operations.
  • Unfortunately, given the potentially vast array of records a data source can include, it often proves to be a challenge to search, analyze, and/or manipulate the entity-representing data in a meaningful way. Furthermore, some data sources contain inaccurate or out-dated information. For example, even using a well-indexed data source, it often can be difficult to identify with sufficient certainty that one or more particular records actually correspond to the specific entity they putatively represent. It can also be difficult to identify associations between multiple seemingly independent entity data records. Due to variations in the type, amount, and structure of data elements each data source can employ for its respective data records, the challenges of identifying and associating individual entities can be greatly magnified if multiple data sources are employed.
  • SUMMARY
  • Embodiments consistent with the present application can utilize, at least in part, entity data records comprising multiple data elements to facilitate identification of entities and/or associations being made among entities represented by the data records. The data records can originate from and/or be maintained within one or more data sources. Such embodiments can combine data values from a plurality of data elements to form an entity identifier, which can serve, at least in part, as a key for facilitating the identification of and/or associations among entities represented by a plurality of data records.
  • In one embodiment, the entity identifier can facilitate the identification of one or more data records corresponding to a unique entity, from one or more data sources. In addition or in the alternative, an embodiment can facilitate the association of multiple data records that represent the same entity, and/or multiple data records representing separate, associated entities. For example, associations can be made among two or more unique entities and/or their respective representative data records if they correspond to substantially the same entity identifier. In one embodiment, the number, type, and/or characteristics of data elements used to form an entity identifier can be selected so that the entity identifier is expected to represent an individual entity and/or associated entities with at least a predetermined confidence level.
  • Additional aspects and advantages will be apparent from the following detailed description of preferred embodiments, which proceeds with reference to the accompanying drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates a system in accordance with one embodiment.
  • FIG. 2 presents one embodiment of a process flow diagram consistent with the claimed subject matter.
  • FIG. 3 conceptually illustrates associations among entities using entity identifiers in accordance with one embodiment.
  • FIG. 4 presents a second embodiment of a process flow diagram consistent with the claimed subject matter.
  • DETAILED DESCRIPTION
  • Embodiments consistent with the present application can be implemented as systems, apparatuses, methods, and/or other implementations of subject matter for combining a plurality of data element values from data records originating from one or more data sources to form an entity identifier that can be used, at least in part, to facilitate identification of and/or associations among entities, as represented by the data records. In one embodiment, data elements selected to form an entity identifier can be selected at least in part, so that their respective data values can be combined and/or otherwise employed to form an entity identifier that can be substantially statistically unique. Employing an entity identifier that is substantially statistically unique can facilitate identifications and/or associations being made with confidence levels that are appropriately high for a given application. As used throughout this application and the attached claims, the term “confidence level” corresponds to a probability that an identified association does not represent a false positive association. Thus, the higher the confidence level, the less likely it is that data records will be erroneously associated with one another.
  • One advantage of embodiments consistent with the claimed subject matter is the ability to tailor the extent to which an entity identifier is statistically unique, which can correspondingly yield appropriately tailored confidence levels in the associations made among data records. The degree of uniqueness can be selected to suit the particular application, field of use, and/or implementation in which the entity identifier is to be employed. In certain embodiments and/or implementations, a high confidence level in the identification and/or association results can be desirable. In such instances a more statistically unique entity identifier can be used. In alternative embodiments and/or implementations, lower confidence in the identification and/or association of unique entities can be acceptable. In such instances an entity identifier that is less statistically unique can be employed. The phrase “substantially statistically unique” is employed herein consistent with the above described notions of flexibility, scalability, and customization. Two different entity identifiers can both be considered substantially statistically unique, even if one is more statistically unique than the other. One or more embodiments can require that an entity identifier should be at least statistically unique enough to provide meaningful and/or useful results for a given implementation.
  • In an implementation for which a relatively low confidence level in associations is acceptable, a wider variety of data elements can be selected to form an entity identifier. Such implementations can employ data elements having values that are not highly specific to individual entities, such as a name or date of birth, as but two examples. In a large dataset, there can be multiple data records representing several different entities with the same name and date of birth. However, in an implementation that requires entities to be identified and/or associated with a high degree of confidence, data elements can be selected so as to form entity identifiers that can indicate associations with a confidence level that is sufficiently high for a particular intended application. As disclosed in more detail below, customization through varying the selection and/or number of data elements employed to form entity identifiers, as well as other factors, can allow the accuracy and/or reliability of operations performed on the data records to be tailored, at least in part, in accordance with the requirements of each specific implementation.
  • For purposes of facilitating discussion, and not by way of limitation on the claimed subject matter, one example of an entity identifier embodiment, presented for illustrative purposes, can be formed using data elements representing components of present and/or historic contact information and/or other identifying data stored in data records representing individual entities. For example, such data elements can include values representing, in whole or in part, address, phone number, e-mail handle, and/or other data representing an entity, to name but a few examples. An entity identifier embodiment can be formed from values for these and/or other contact information data elements and can be employed, at least in part, to facilitate identification of and/or an association among one or more entities as disclosed in more detail below.
  • In one or more embodiments, the statistical uniqueness of an entity identifier can be improved by selecting data elements to form the entity identifier that have values that are as evenly distributed across the population of entities as possible and/or practicable. In one embodiment, even distribution of values can be reasonably achieved by selecting data elements having values that are believed to be substantially randomly assigned to the entities represented by the data records. The concept of even distribution of values can be illustrated graphically as a histogram, frequency diagram, and/or other suitable depiction graphing the range of possible values for a selected data element against the number of instances of each possible value occurring in a given set of data records. Data elements having a distribution of values that graphs as relatively flat, rather than as a bell curve, can facilitate the formation of entity identifiers that are expected to identify associations among the data records with increased confidence.
  • To illustrate the above point, a data element storing values for the last four digits of a contact phone number would facilitate formation of an entity identifier that would yield higher confidence results than would an entity identifier formed from values for a data element storing zip code data. This is because zip codes, as in the example of the United States postal system designation, are not randomly assigned to, or evenly distributed among, entities. Rather, they are assigned based on entity location within geographic groupings. In comparison, the last four digits of an entity's contact phone number more closely approximate a random distribution throughout the population of entities. However, using a data element storing full telephone number values decreases the evenness and/or reasonable randomness of the value distribution, as telephone number area codes and prefixes are assigned based at least in part on geographic grouping. Similarly, the values for the last four digits of a nine-digit U.S. Social Security Number are relatively evenly and/or reasonably randomly assigned throughout the population of U.S. individuals, while the values for the first three and middle two digits exhibit grouping characteristics. Data records that have multiple common values for data elements having reasonably random and/or evenly distributed values can be more confidently associated with one another than can data records that have multiple common values for data elements having values with grouped distribution among the entities. Associating data records based on commonalities in poorly distributed values can lead to false positive associations. For example, it is possible that two data records can coincidentally contain two or more matching historical zip code values even though the data records represent neither the same entity nor entities that should be properly associated (such as family members, spouses, non-familial cohabitants, etc.). 
By obtaining knowledge of the content and/or characteristics of the data records and data elements included therein, specific data elements can be selected so as to form entity identifiers that can be used to associate data records with sufficiently high confidence and reduced instances of false positive associations.
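  • The notion of an evenly distributed data element can be made concrete with a small sketch. The example below scores a column of values by its normalized Shannon entropy, which is a measure chosen here for illustration and not one named in this application; the function name and the sample values are hypothetical. A score near 1.0 suggests a flat distribution, while lower scores suggest grouping.

```python
import math
from collections import Counter


def normalized_entropy(values):
    """Shannon entropy of the observed values, divided by the maximum
    entropy possible for that many distinct values (1.0 = perfectly even)."""
    counts = Counter(values)
    total = sum(counts.values())
    if len(counts) < 2:
        return 0.0  # a single repeated value carries no distinguishing power
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))


# Hypothetical sample data, invented for illustration only.
last_four_digits = ["0412", "7731", "1985", "6620", "3307", "9054"]
zip_codes = ["97201", "97201", "97201", "97201", "97209", "97209"]

even_score = normalized_entropy(last_four_digits)  # every value occurs once
grouped_score = normalized_entropy(zip_codes)      # values cluster on one zip code
```

  • Note that this sketch normalizes against the values actually observed rather than the full range of possible values, so it is only a rough screening aid for comparing candidate data elements.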
  • To facilitate discussion, one or more embodiments are described below as employing house number values as contact information data elements used to form the entity identifiers. For illustration, if a data record includes an address of 1234 Main Street for an entity, the “1234” portion of the address is an example of a house number. Use of the term “house number” however is not meant to limit the claimed subject matter to addresses for houses, which are typically unattached single family dwellings. The term “house number” can apply to the corresponding portion of any address data, regardless of the form or type of dwelling, building, or edifice that exists at that location. Furthermore, a house number is but one example of a data element that can be employed consistent with the present application. House-number embodiments are described below only for illustrative purposes and not by way of limitation on the claimed subject matter. Those skilled in the relevant art will appreciate that additional and/or alternative data elements can also be employed consistent with this application and the claimed subject matter.
  • Continuing with reference to embodiments employing house-number data elements, for purposes of discussion, such embodiments can be implemented to facilitate entity authentication and/or identification with improved accuracy and reliability. This is facilitated, at least in part, by the fact that, within a range of common house number values, the values can be sufficiently evenly distributed among, and/or randomly assigned to, entities. Embodiments can use present and/or historic house number values, as but two examples. In addition to selecting house numbers as the type of data element to use in forming an entity identifier, the quantity, specifications, and/or characteristics of the house number data elements can also be chosen so as to achieve a confidence level that is substantially sufficient and/or tailored for a particular application and/or implementation. Such choices can be made, at least in part, based on the number and/or characteristics of the available data records and/or the entities the data records represent. As but one example, in an implementation using data records from data sources that reflect address histories for entities, an embodiment can combine values from a predetermined quantity of house number data elements associated with an entity to form one or more entity identifiers. For example, in one implementation, having a particular data set and/or grouping of data records, an entity identifier formed from values for two house numbers can be sufficiently unique to identify useful associations. A different implementation can require that values from three or more house numbers are used to form entity identifiers. Other variations are also possible consistent with the claimed subject matter.
  • Consistent with the claimed subject matter, multiple data elements can be selected based at least in part on having values exhibiting characteristics that make them suitable for combining to form an entity identifier that is substantially statistically unique for a given set of data records and/or represented entities. For example, in one embodiment, the number of possible values for an entity identifier can be approximated as the number of available, distinct values per digit raised to the power of the number of digits composing the entity identifier. Specific data elements can be selected so the number of possible unique values for an entity identifier formed from values for the selected data elements exceeds the number of distinct entities within the population. Such an entity identifier can be considered substantially statistically unique with respect to the population of entities represented by the data records.
  • The factor by which the number of possible unique entity identifier values exceeds the number of entities can be customized for a desired confidence level in associations made among data records. The greater the factor of excess, the more statistically unique the entity identifier is and the better the quality and specificity of the associations made using that entity identifier. A factor can be predetermined and can be designated, selected, and/or applied for an intended application and/or specific implementation to yield associations having a desired confidence level and/or quality.
  • For example, in an embodiment having data records representing a population of approximately 300,000,000 entities, data elements can be selected so as to form one or more substantially statistically unique entity identifiers for the given population. Additionally, the extent to which the formed entity identifier is statistically unique relative to the applicable population can be customized based, at least in part, on selection of data elements. For example, if data elements are selected such that a formed entity identifier includes nine digits, with each digit possessing ten possible numerical values (0-9), then there are approximately one billion possible values for the entity identifier (10^9). This represents a factor of about 3.33, meaning there are approximately three and one third possible entity identifier values per entity in the population. In an alternative embodiment, data elements can be selected so that a formed entity identifier includes twelve digits, with each digit possessing ten possible numerical values (0-9). In such embodiment, there are approximately one trillion possible entity identifier values (10^12). Because the number of possible entity identifier values exceeds the number of entities within the population by a factor of over 3,333, the corresponding twelve-digit entity identifiers are more statistically unique than the nine-digit entity identifiers. It should be noted that various data elements can be combined to achieve the results indicated above. For example, two data elements with six-digit values can combine to form an entity identifier that is statistically comparable to an entity identifier formed from three data elements having values of four digits each. 
The threshold for quantifying and/or qualifying a predetermined factor for a specific implementation and/or application can be determined based at least in part on a number of applicable considerations, including, without limitation, the extent to which the values for the selected data elements are randomly assigned and/or evenly distributed among the entities in the population, and the application's tolerance for false-positives, to name only a couple of examples. Those skilled in the relevant arts will appreciate that certain requirements, considerations, and/or characteristics of an intended application can require associations to achieve a specific confidence level and an appropriately applicable factor value and/or acceptable range of factor values can be determined accordingly.
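  • The arithmetic behind the factor of excess can be sketched as follows. The function names are hypothetical, and the calculation assumes every digit takes each of its possible values independently, which real data elements only approximate.

```python
def excess_factor(num_digits, population, values_per_digit=10):
    """Ratio of possible identifier values to entities in the population."""
    possible_values = values_per_digit ** num_digits
    return possible_values / population


def digits_needed(population, factor, values_per_digit=10):
    """Smallest digit count whose identifier space is at least
    `factor` times the population size."""
    digits = 1
    while values_per_digit ** digits < population * factor:
        digits += 1
    return digits


population = 300_000_000
nine_digit_factor = excess_factor(9, population)     # about 3.33
twelve_digit_factor = excess_factor(12, population)  # about 3,333

# Two six-digit elements and three four-digit elements both give twelve digits.
same_space = 10 ** (2 * 6) == 10 ** (3 * 4)
```

  • A helper such as `digits_needed` could be used to work backward from a desired factor to an identifier length; choosing the factor itself remains an application-specific judgment.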
  • Continuing for illustrative purposes with the example of house number data elements, and as one example of an additional and/or alternative requirement, entity identifiers can be defined to include a predetermined number of digits (e.g., 10, 12, 20, etc.). A sufficient number of house number data elements can be combined to achieve the desired number of digits in the entity identifier. Those skilled in the relevant arts will appreciate that increasing the quantity of house numbers providing values used to create the entity identifier also increases the statistical uniqueness of the formed entity identifier. As additional quantities of house number values are combined to create the entity identifier, it becomes statistically less likely that the same identifier can match multiple data records without the data records either referring to the same entity or entities that can be properly associated with one another (familial relatives, roommates, etc.).
  • At least in part by choosing sufficiently restrictive data element requirements and/or characteristics to form the entity identifiers, entity identifiers can be created that are substantially statistically unique (i.e., it can be said with substantially high statistical confidence, sufficient for the intended application and/or implementation, that a specific entity identifier corresponds to either one unique entity or separate unique entities that can be properly associated). For example, in one embodiment, entities in the United States for which sufficient corresponding address data records exist can be identified or associated using three or more house numbers contained in their financial, utility, or other address history records. In such an embodiment, the degree and/or extent of statistical uniqueness of the resulting identifier can be sufficiently and/or substantially high if the three house numbers contain a total of twelve or more digits when combined. This is because, in the United States, for example, a majority of house numbers have values with three or more digits that range between 100 and 20000. By volume, the majority of house numbers have four digits. Therefore, the odds of any two entity data records including the same three house numbers, regardless of sequence, without the represented entities being associated are approximately 1 in (20000−100)^3, or 1 in 7,880,599,000,000. In alternative implementations (for example, in a system for associating entities with addresses outside the United States, etc.), other characteristics can be chosen for the data elements used to form the entity identifier. Data elements can be chosen so as to produce confidence levels that are specifically tailored for the given application, implementation, and/or data records.
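  • The odds figure quoted above can be reproduced with a short calculation. This is a sketch under the simplifying assumption that house numbers are drawn uniformly and independently from the stated range; the variable names are illustrative.

```python
import math

low, high = 100, 20000
values_per_house_number = high - low  # 19,900 distinct values, as in the text

# Ordered triples of house numbers: the figure given in the text.
ordered_triples = values_per_house_number ** 3

# Unordered triples of distinct house numbers, relevant when sequence is ignored.
unordered_triples = math.comb(values_per_house_number, 3)
```

  • The cube counts ordered triples. If sequence is ignored, the count of distinct triples is smaller by roughly a factor of six, which leaves the order of magnitude of the estimate unchanged.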
  • Because of the statistical improbability of two discrete entities randomly sharing a common entity identifier with the above characteristics, embodiments consistent with the claimed subject matter can implement such substantially statistically unique entity identifiers in a wide variety of business applications and/or for other implementations and/or purposes that can require substantially high levels of accuracy in identifying and/or associating entities represented by data records from one or more data sources. FIG. 1 illustrates one example of a system for implementing identification and/or association embodiments consistent with the claimed subject matter. The system of FIG. 1 is presented for illustrative purposes and to facilitate discussion; it is not meant as a limitation on the scope of the attached claims, and those skilled in the relevant art will appreciate that apparatuses or other systems can be provided with fewer, alternative, and/or additional components and/or configurations while remaining consistent with the claimed subject matter.
  • With specific reference to FIG. 1, a computer system 100 is provided to access data records from one or more data sources 102. Computer system 100 can access data sources 102 directly, or via an optional network connection 104, such as the Internet, an intranet, LAN, WAN, and/or other network. Accordingly, data sources 102 can be maintained locally and/or remotely with respect to the location of computer system 100. Computer system 100 can also include and/or have access to a processing engine 106 capable of executing programming instructions for generating and/or applying one or more entity identifiers to identify and/or associate entities represented by the data records in data sources 102. Results of identification and/or association operations can be further processed and/or applied within computer system 100, or they can be organized for and/or communicated to one or more separate systems for subsequent handling, if or to the extent necessary and/or desirable given the intended functionality and/or specific implementation in which an embodiment operates.
  • Consistent with the present application, apparatuses or systems, such as the system illustrated in FIG. 1, can implement various identification and/or association processes using substantially statistically unique identifiers as disclosed herein. FIG. 2 presents a process flow diagram including examples of steps that can be included in one embodiment of such a process. In particular, FIG. 2 can facilitate the identification of non-obvious associations between entities. The process of FIG. 2 can include step 200, for establishing access to data records, which can include securing access to data records not previously accessed and/or possessed. The data records can be contained within, maintained by, and/or otherwise made available from one or more separate, discrete data sources. At step 202, data records for each individual entity can be processed to identify entity identifiers that correspond to that entity. As one example illustrating step 202 using house numbers, if an entity's data record has house number values including 900, 725, 1255, and 1221, using combinations of three house numbers, and ignoring order, the following four entity identifiers can be formed: 9007251255, 9007251221, 72512551221, and 90012551221.
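  • Step 202 can be sketched with the house number example above. The function name is hypothetical, and plain concatenation in record order is used only because it reproduces the four identifiers listed in the text.

```python
from itertools import combinations


def form_entity_identifiers(house_numbers, size=3):
    """Concatenate every `size`-element combination of house numbers,
    preserving the order in which they appear in the record."""
    return [
        "".join(str(number) for number in combo)
        for combo in combinations(house_numbers, size)
    ]


# House number values from the example in the text.
identifiers = form_entity_identifiers([900, 725, 1255, 1221])
```

  • Plain concatenation can be ambiguous (for instance, 12 followed by 345 and 123 followed by 45 produce the same digits), so an implementation might prefer a delimiter or a sorted tuple of values as the identifier. That choice is an assumption of this sketch, not something the application specifies.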
  • At step 204, results of step 202 can be organized and/or grouped. One example grouping embodiment can include grouping results first according to entity identifier, and then according to individual entity corresponding to each entity identifier. Other grouping methodologies can additionally and/or alternatively be employed. Using the results grouped in step 204, step 206 identifies associations among the entities and can initiate and/or facilitate additional processing of one or more of the data records based, at least in part, on the identified associations. Entities that share entity identifiers can be associated. If an entity identifier from the list grouped in step 204 corresponds to two data records, the entities represented by those data records can accordingly also be associated. The records representing those entities are either separate records representing the same entity, or separate records representing different entities that can be properly associated with one another. Context or data within the data records, application, and/or specific data elements can be used to distinguish between the two types of associations. For example, presence of the same substantially universal key, such as a full Social Security Number, in both data records can indicate that the associated records represent the same entity.
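  • Steps 204 and 206 can be sketched as grouping records by identifier and then reporting identifiers shared by more than one record. The record names and house numbers below are invented for illustration, and sorted tuples are used as identifiers so that order is ignored; both are assumptions of this sketch.

```python
from collections import defaultdict
from itertools import combinations


def group_by_identifier(records, size=3):
    """Map each order-independent identifier to the set of records that form it."""
    groups = defaultdict(set)
    for record_id, house_numbers in records.items():
        # Sorting and de-duplicating gives a canonical form for each identifier.
        for combo in combinations(sorted(set(house_numbers)), size):
            groups[combo].add(record_id)
    return groups


def find_associations(groups):
    """Keep only identifiers shared by two or more records."""
    return {ident: ids for ident, ids in groups.items() if len(ids) > 1}


# Hypothetical data records keyed by record name.
records = {
    "record_a": [900, 725, 1255, 1221],
    "record_b": [1221, 900, 725],
    "record_c": [410, 3388, 52],
}
associations = find_associations(group_by_identifier(records))
```

  • In this sketch, `record_a` and `record_b` share one identifier and are therefore associated, while `record_c` shares none. Deciding whether associated records represent the same entity or separate related entities would require additional context, as the text describes.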
  • FIG. 3 presents a diagram conceptually illustrating the association of multiple entities and/or data records using a common entity identifier. For an embodiment as illustrated in FIG. 3, data elements, such as house numbers, in one or more data records for a first entity data record 300 can form two entity identifiers, illustrated in FIG. 3 as entity identifier 302, and entity identifier 304. Entity data record 306 can also form entity identifier 304. Entity data record 308 forms entity identifier 310. Because entity data record 300 and entity data record 306 share entity identifier 304 in common, they can be associated. However, because entity data record 308 does not share a common entity identifier, it cannot be associated with either entity data record 300 or entity data record 306.
  • FIG. 4 presents an alternative methodology and process flow to that depicted in FIG. 2. FIG. 4 illustrates a process flow diagram for identifying and/or associating entities using, as separate components, individual data elements that collectively can be combined to form an entity identifier. Given an original entity for which the values of data elements in a representative data record are known, an embodiment implementing the process of FIG. 4 can identify additional records corresponding to the same entity, as well as separate entities that can be associated with the original entity. With particular reference to FIG. 4, step 400 begins by identifying and selecting the type and/or quantity of data elements that can be employed so as to yield a substantially statistically unique entity identifier. In step 402, the values for each of those data element components are gathered from an identified data record representing an original entity. In step 404, one or more data sources can be queried to identify data records that include a value matching and/or substantially matching the value for any of the data element components identified in step 402. Separate searches/queries can be executed for each data element value. In step 406, associations can be identified among entities included in the search results. For example, data records with element values matching each of the search queries either represent the original entity, or entities that can be associated with the original entity with substantial statistical reliability. 
For example, in one embodiment consistent with the claimed subject matter, presented for illustrative purposes and not by way of limitation, if a substantially statistically unique entity identifier can be formed for a given set of data records using three house numbers, separate searches can be conducted using each house number as a query. Entities represented by data records that appear three times in the search results list, indicating that the data record included a match for each separate house number value searched, represent the original entity and/or associated entities (e.g., related entities sharing a common address history, etc.).
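  • The separate-search approach of FIG. 4 can be sketched as one query per house number followed by a count of how often each record appears in the combined results. The in-memory dictionary stands in for a data source, and all names and values are hypothetical.

```python
from collections import Counter


def search(data_source, house_number):
    """Return the ids of records containing the given house number."""
    return {rid for rid, numbers in data_source.items() if house_number in numbers}


def match_by_separate_searches(data_source, query_values):
    """Run one search per distinct query value and keep records that
    appear in the results of every search."""
    distinct_values = set(query_values)
    hits = Counter()
    for value in distinct_values:
        hits.update(search(data_source, value))
    return {rid for rid, count in hits.items() if count == len(distinct_values)}


# Hypothetical data source: record id -> set of house numbers in its history.
data_source = {
    "r1": {900, 725, 1255},
    "r2": {900, 725, 1255, 40},
    "r3": {900, 18},
}
matches = match_by_separate_searches(data_source, [900, 725, 1255])
```

  • Here `r1` and `r2` appear in all three result sets and are returned, while `r3` matches only one query and is excluded.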
  • For efficiency or process optimization purposes, the searching procedure can also employ filtering logic to substantially reduce processing requirements when executing searches. Rather than searching all data records in all data sources for matches based on each search criterion, searching on the second criterion can be limited to those entities returned as results of a search on the first criterion. Similarly, a search performed using the third criterion can be limited to the results of the second search, and so on.
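  • The filtering logic described above can be sketched as a progressively narrowing candidate set, where each criterion is applied only to the survivors of the previous one. The data source and function name are hypothetical stand-ins.

```python
def filtered_search(data_source, query_values):
    """Apply each search criterion only to records that matched all prior ones."""
    candidates = set(data_source)  # start with every record id
    for value in query_values:
        candidates = {rid for rid in candidates if value in data_source[rid]}
        if not candidates:
            break  # nothing left to search, so later criteria can be skipped
    return candidates


# Hypothetical data source: record id -> set of house numbers in its history.
data_source = {
    "r1": {900, 725, 1255},
    "r2": {900, 725, 1255, 40},
    "r3": {900, 18},
}
survivors = filtered_search(data_source, [900, 725, 1255])
```

  • Ordering the criteria so that the most selective value is searched first would shrink the candidate set fastest, although that ordering is a design choice this sketch does not attempt.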
  • It should also be noted that, consistent with the claimed subject matter, entity identifiers corresponding to two or more data records do not have to represent exact matches in order for the data records and/or the corresponding entities to be associated. Based, at least in part, on factors such as the tolerance for false positive associations in a given application and/or implementation, a certain acceptable margin of error can be allowed for purposes of identifying matches between entity identifiers or selected data element component values. The use of the phrases “substantially matching,” “substantial match,” or the like in this application and the attached claims is meant to indicate matches that are either exact, or within a predetermined acceptable margin of error for a given implementation and/or application. For example, in an embodiment using an entity identifier formed from multiple data element values, the order in which the data values appear in two or more entity identifiers can be ignored for purposes of comparing entity identifiers and associating corresponding data records. An alternative embodiment can elect to ignore duplicate values in the formation and/or comparison of entity identifiers. Still other embodiments can allow for other variances from exact matching. A few such examples can include rounding conventions for numeric data values, and/or common synonyms, abbreviations, and/or alternative spellings for alphanumeric data values.
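  • Two of the variances mentioned above, ignoring order and ignoring duplicate values, can be sketched as a normalization step applied before comparison. The function names and the `ignore_duplicates` flag are hypothetical; other tolerances such as rounding or synonym handling are not covered.

```python
def normalize(values, ignore_duplicates=True):
    """Return a canonical, order-independent form of the component values."""
    if ignore_duplicates:
        values = set(values)
    return tuple(sorted(values))


def substantially_match(values_a, values_b, ignore_duplicates=True):
    """Compare two identifiers' component values after normalization."""
    return normalize(values_a, ignore_duplicates) == normalize(values_b, ignore_duplicates)


same_values_any_order = substantially_match([900, 725, 1255], [1255, 900, 725])
duplicates_ignored = substantially_match([900, 900, 725], [725, 900])
duplicates_counted = substantially_match([900, 900, 725], [725, 900], ignore_duplicates=False)
```

  • With duplicates ignored, the second comparison succeeds; with duplicates counted, the same inputs do not match, which illustrates how the tolerance setting changes the outcome.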
  • It will be obvious to those having skill in the art that many changes may be made to the details of the above-described embodiments without departing from the underlying principles of the invention. The scope of the present invention should, therefore, be determined only by the following claims.

Claims (10)

1. A method for associating data records representing one or more entities, comprising:
obtaining access to a plurality of data records, each data record including a plurality of data elements;
selecting two or more data elements from the plurality of data elements, the selected data elements being selected so as to enable one or more entity identifiers to be formed from values for the selected data elements from the plurality of data records; and
associating a first data record with a second data record if a first entity identifier formed from values for the selected data elements from the first data record substantially matches a second entity identifier formed from values for the selected data elements from the second data record.
2. The method of claim 1 wherein the data elements are selected so that the formed one or more entity identifiers are substantially statistically unique.
3. The method of claim 2 wherein the data elements are selected so that the formed one or more entity identifiers have a number of possible values in excess of a number of entities represented by the plurality of data records.
4. The method of claim 3 wherein the selected data elements are selected so that the number of possible values for the formed one or more entity identifiers exceeds the number of entities by at least a predetermined factor.
5. The method of claim 4 wherein the predetermined factor is determined based at least in part on a source for the plurality of data records.
6. The method of claim 4 wherein the predetermined factor is determined at least in part according to an intended purpose for associating the first data record with the second data record.
7. The method of claim 4 wherein the predetermined factor is determined so that the associating of the first data record with the second data record achieves at least a predetermined confidence level.
8. The method of claim 1 wherein the selected data elements encompass values that are substantially randomly assigned to entities represented by the plurality of data records.
9. The method of claim 1 wherein the selected data elements encompass values that are substantially evenly distributed among entities represented by the plurality of data records.
10. The method of claim 1, further comprising defining a criterion for the one or more entity identifiers, wherein the data elements are selected so that the one or more entity identifiers are formed to satisfy the criterion.
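The selection and association steps recited in claims 1, 3 and 4 can be illustrated with a minimal sketch. It is not the applicant's implementation. The function names, the sample records and the per-element value counts in `DOMAIN_SIZES` are hypothetical, and "substantially matches" is approximated here by exact equality after trimming and lower-casing each value.

```python
from collections import defaultdict
from itertools import combinations
from math import prod

# Hypothetical count of possible values per data element (assumed figures).
DOMAIN_SIZES = {"birth_date": 36_500, "phone_last4": 10_000, "zip": 40_000}

# Hypothetical data records; "name" is deliberately not used in the identifier.
RECORDS = [
    {"birth_date": "1970-01-02", "phone_last4": "1234", "zip": "97201", "name": "J. Smith"},
    {"birth_date": "1970-01-02 ", "phone_last4": "1234", "zip": "97201", "name": "John Smith"},
    {"birth_date": "1970-01-02", "phone_last4": "9999", "zip": "97201", "name": "Jane Smith"},
]


def select_elements(domain_sizes, n_entities, factor):
    """Pick the smallest set of two or more elements whose combined value
    space is at least factor * n_entities (compare claims 3 and 4).
    Returns a tuple of element names, or None if no combination qualifies."""
    names = sorted(domain_sizes)
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            if prod(domain_sizes[e] for e in combo) >= factor * n_entities:
                return combo
    return None


def entity_identifier(record, elements):
    """Form an identifier from the record's values for the selected elements."""
    return tuple(str(record[e]).strip().lower() for e in elements)


def associate(records, elements):
    """Group record indices whose identifiers match (compare claim 1).
    Only groups containing two or more records are returned."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[entity_identifier(record, elements)].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]


if __name__ == "__main__":
    chosen = select_elements(DOMAIN_SIZES, n_entities=300_000_000, factor=1)
    print(chosen)
    print(associate(RECORDS, chosen))
```

Raising `factor` forces more elements into the identifier, which trades a lower chance of two distinct entities sharing an identifier against a higher chance that a missing or mistyped value prevents a legitimate match.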
US12/877,096 2006-06-14 2010-09-07 Entity Identification and/or Association Using Multiple Data Elements Abandoned US20110119291A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/877,096 US20110119291A1 (en) 2006-06-14 2010-09-07 Entity Identification and/or Association Using Multiple Data Elements

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US81379206P 2006-06-14 2006-06-14
US11/818,908 US7792864B1 (en) 2006-06-14 2007-06-14 Entity identification and/or association using multiple data elements
US12/877,096 US20110119291A1 (en) 2006-06-14 2010-09-07 Entity Identification and/or Association Using Multiple Data Elements

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/818,908 Continuation US7792864B1 (en) 2006-06-14 2007-06-14 Entity identification and/or association using multiple data elements

Publications (1)

Publication Number Publication Date
US20110119291A1 (en) 2011-05-19

Family

ID=42669742

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/818,908 Active 2028-05-21 US7792864B1 (en) 2006-06-14 2007-06-14 Entity identification and/or association using multiple data elements
US12/877,096 Abandoned US20110119291A1 (en) 2006-06-14 2010-09-07 Entity Identification and/or Association Using Multiple Data Elements

Country Status (1)

Country Link
US (2) US7792864B1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7792864B1 (en) * 2006-06-14 2010-09-07 TransUnion Teledata, L.L.C. Entity identification and/or association using multiple data elements
US8036979B1 (en) 2006-10-05 2011-10-11 Experian Information Solutions, Inc. System and method for generating a finance attribute from tradeline data
US20080103800A1 (en) * 2006-10-25 2008-05-01 Domenikos Steven D Identity Protection
US8606626B1 (en) 2007-01-31 2013-12-10 Experian Information Solutions, Inc. Systems and methods for providing a direct marketing campaign planning environment
US8606666B1 (en) 2007-01-31 2013-12-10 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US20100281351A1 (en) * 2009-04-29 2010-11-04 Soiba Mohammed Web print content control using html
US9665643B2 (en) * 2011-12-30 2017-05-30 Microsoft Technology Licensing, Llc Knowledge-based entity detection and disambiguation
US9063991B2 (en) 2013-01-25 2015-06-23 Wipro Limited Methods for identifying unique entities across data sources and devices thereof
US10262362B1 (en) 2014-02-14 2019-04-16 Experian Information Solutions, Inc. Automatic generation of code for attributes
US20150242407A1 (en) * 2014-02-22 2015-08-27 SourceThought, Inc. Discovery of Data Relationships Between Disparate Data Sets
WO2015161899A1 (en) * 2014-04-25 2015-10-29 Hewlett Packard Development Company L.P. Determine relationships between entities in datasets
US9977807B1 (en) * 2017-02-13 2018-05-22 Sas Institute Inc. Distributed data set indexing
US10445152B1 (en) 2014-12-19 2019-10-15 Experian Information Solutions, Inc. Systems and methods for dynamic report generation based on automatic modeling of complex data structures

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7526487B1 (en) * 1999-10-29 2009-04-28 Computer Sciences Corporation Business transaction processing systems and methods
US7103605B1 (en) * 1999-12-10 2006-09-05 A21, Inc. Timeshared electronic catalog system and method
US20030200216A1 (en) * 2002-01-22 2003-10-23 Recording Industry Association Of America Method and system for identification of music industry releases and licenses
US20060161573A1 (en) * 2005-01-14 2006-07-20 International Business Machines Corporation Logical record model entity switching
US7203677B1 (en) * 2006-01-05 2007-04-10 International Business Machines Corporation Creation of duration episodes from single time events

Patent Citations (70)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4479196A (en) * 1982-11-15 1984-10-23 At&T Bell Laboratories Hyperedge entity-relationship data base systems
US5724575A (en) * 1994-02-25 1998-03-03 Actamed Corp. Method and system for object-based relational distributed databases
US5758351A (en) * 1995-03-01 1998-05-26 Sterling Software, Inc. System and method for the creation and use of surrogate information system objects
US5634049A (en) * 1995-03-16 1997-05-27 Pitkin; John R. Method and apparatus for constructing a new database from overlapping databases
US20020052884A1 (en) * 1995-04-11 2002-05-02 Kinetech, Inc. Identifying and requesting data in network using identifiers which are based on contents of data
US5978791A (en) * 1995-04-11 1999-11-02 Kinetech, Inc. Data processing system using substantially unique identifiers to identify data items, whereby identical data items have the same identifiers
US5826250A (en) * 1996-06-19 1998-10-20 Pegasystems Inc. Rules bases and methods of access thereof
US6047280A (en) * 1996-10-25 2000-04-04 Navigation Technologies Corporation Interface layer for navigation system
US5860917A (en) * 1997-01-15 1999-01-19 Chiron Corporation Method and apparatus for predicting therapeutic outcomes
US5991758A (en) * 1997-06-06 1999-11-23 Madison Information Technologies, Inc. System and method for indexing information about entities from different information sources
US6523041B1 (en) * 1997-07-29 2003-02-18 Acxiom Corporation Data linking system and method using tokens
USRE38572E1 (en) * 1997-11-17 2004-08-31 Donald Tetro System and method for enhanced fraud detection in automated electronic credit card processing
US6311186B1 (en) * 1998-02-20 2001-10-30 Priority Call Management, Inc. Telecommunications switching system utilizing a channelized database access mechanism
US5991714A (en) * 1998-04-22 1999-11-23 The United States Of America As Represented By The National Security Agency Method of identifying data type and locating in a file
US6629097B1 (en) * 1999-04-28 2003-09-30 Douglas K. Keith Displaying implicit associations among items in loosely-structured data sets
US6801915B1 (en) * 1999-07-28 2004-10-05 Robert Mack Paired keys for data structures
US7000183B1 (en) * 1999-09-27 2006-02-14 John M. Crawford, Jr. Method and apparatus for viewer-specific presentation of information
US6643642B1 (en) * 1999-12-07 2003-11-04 Bitpipe Communication, Inc. Hierarchical mapped database system for identifying searchable terms associated with data nodes
US20020038296A1 (en) * 2000-02-18 2002-03-28 Margolus Norman H. Data repository and method for promoting network storage of data
US7516137B1 (en) * 2000-03-21 2009-04-07 Arrayworks Inc. System and method for dynamic management of business processes
US20020010686A1 (en) * 2000-04-04 2002-01-24 Whitesage Michael D. System and method for managing purchasing contracts
US20020038304A1 (en) * 2000-06-30 2002-03-28 Boris Gelfand Data cells and data cell generations
US6879983B2 (en) * 2000-10-12 2005-04-12 Qas Limited Method and apparatus for retrieving data representing a postal address from a plurality of postal addresses
US20020059260A1 (en) * 2000-10-16 2002-05-16 Frank Jas Database method implementing attribute refinement model
US6694325B2 (en) * 2000-10-16 2004-02-17 Frank Jas Database method implementing attribute refinement model
US20020178271A1 (en) * 2000-11-20 2002-11-28 Graham Todd D. Dynamic file access control and management
US7184653B2 (en) * 2000-11-22 2007-02-27 Microsoft Corporation Unique digital content identifier generating methods and arrangements
US6895412B1 (en) * 2001-04-12 2005-05-17 Ncr Corporation Methods for dynamically configuring the cardinality of keyword attributes
US20060080300A1 (en) * 2001-04-12 2006-04-13 Primentia, Inc. System and method for organizing data
US20030018616A1 (en) * 2001-06-05 2003-01-23 Wilbanks John Thompson Systems, methods and computer program products for integrating databases to create an ontology network
US7461077B1 (en) * 2001-07-31 2008-12-02 Nicholas Greenwood Representation of data records
US20030046213A1 (en) * 2001-08-31 2003-03-06 Vora Poorvi L. Anonymous processing of usage rights with variable degrees of privacy and accuracy
US20030046280A1 (en) * 2001-09-05 2003-03-06 Siemens Medical Solutions Health Services Corporation System for processing and consolidating records
US7243108B1 (en) * 2001-10-14 2007-07-10 Frank Jas Database component packet manager
US20050060332A1 (en) * 2001-12-20 2005-03-17 Microsoft Corporation Methods and systems for model matching
US20030126155A1 (en) * 2001-12-28 2003-07-03 Parker Daniel J. Method and apparatus for generating a weather index
US20040001568A1 (en) * 2002-06-03 2004-01-01 Lockheed Martin Corporation System and method for detecting alteration of objects
US7415613B2 (en) * 2002-06-03 2008-08-19 Lockheed Martin Corporation System and method for detecting alteration of objects
US20050246268A1 (en) * 2002-08-16 2005-11-03 Inter-Net Payments Patents Limeted Funds transfer method and system
US20040078364A1 (en) * 2002-09-03 2004-04-22 Ripley John R. Remote scoring and aggregating similarity search engine for use with relational databases
US7403942B1 (en) * 2003-02-04 2008-07-22 Seisint, Inc. Method and system for processing data records
US7912842B1 (en) * 2003-02-04 2011-03-22 Lexisnexis Risk Data Management Inc. Method and system for processing and linking data records
US20120078932A1 (en) * 2003-05-29 2012-03-29 Experian Marketing Solutions, Inc. System, Method and Software for Providing Persistent Entity Identification and Linking Entity Information in an Integrated Data Repository
US8001153B2 (en) * 2003-05-29 2011-08-16 Experian Marketing Solutions, Inc. System, method and software for providing persistent personal and business entity identification and linking personal and business entity information in an integrated data repository
US20050021551A1 (en) * 2003-05-29 2005-01-27 Locateplus Corporation Current mailing address identification and verification
US20100287158A1 (en) * 2003-07-22 2010-11-11 Kinor Technologies Inc. Information access using ontologies
US20050222894A1 (en) * 2003-09-05 2005-10-06 Moshe Klein Universal transaction identifier
US7614054B2 (en) * 2003-12-31 2009-11-03 Intel Corporation Behavioral model based multi-threaded architecture
US20080195579A1 (en) * 2004-03-19 2008-08-14 Kennis Peter H Methods and systems for extraction of transaction data for compliance monitoring
US20070282796A1 (en) * 2004-04-02 2007-12-06 Asaf Evenhaim Privacy Preserving Data-Mining Protocol
US20070067298A1 (en) * 2004-04-21 2007-03-22 Thomas Stoneman Two-stage data validation and mapping for database access
US7739287B1 (en) * 2004-06-11 2010-06-15 Seisint, Inc. System and method for dynamically creating keys in a database system
US7307820B2 (en) * 2004-06-21 2007-12-11 Siemens Energy & Automation, Inc. Systems, methods, and device for arc fault detection
US20060053107A1 (en) * 2004-09-07 2006-03-09 Stuart Robert O More efficient search algorithm (MESA) using virtual search parameters
US20060123010A1 (en) * 2004-09-15 2006-06-08 John Landry System and method for managing data in a distributed computer system
US20060116827A1 (en) * 2004-11-30 2006-06-01 Webb Peter G Systems and methods for producing chemical array layouts
US20060230039A1 (en) * 2005-01-25 2006-10-12 Markmonitor, Inc. Online identity tracking
US20060212487A1 (en) * 2005-03-21 2006-09-21 Kennis Peter H Methods and systems for monitoring transaction entity versions for policy compliance
US7966495B2 (en) * 2005-03-21 2011-06-21 Revinetix, Inc. Conserving file system with backup and validation
US20060217925A1 (en) * 2005-03-23 2006-09-28 Taron Maxime G Methods for entity identification
US20060277176A1 (en) * 2005-06-01 2006-12-07 Mydrew Inc. System, method and apparatus of constructing user knowledge base for the purpose of creating an electronic marketplace over a public network
US20070078842A1 (en) * 2005-09-30 2007-04-05 Zola Scot G System and method for responding to a user reference query
US20090222442A1 (en) * 2005-11-09 2009-09-03 Henry Houh User-directed navigation of multimedia search results
US7940685B1 (en) * 2005-11-16 2011-05-10 At&T Intellectual Property Ii, Lp Method and apparatus for monitoring a network
US20070217317A1 (en) * 2006-03-20 2007-09-20 Nec Electronics Corporation Optical disk device, playback method of the optical disk device, and reproduction signal generating circuit
US7792864B1 (en) * 2006-06-14 2010-09-07 TransUnion Teledata, L.L.C. Entity identification and/or association using multiple data elements
US20080127296A1 (en) * 2006-11-29 2008-05-29 International Business Machines Corporation Identity assurance method and system
US20090177201A1 (en) * 2007-11-14 2009-07-09 Michael Soltz Staple with Multiple Cross Sectional Shapes
US8363720B2 (en) * 2009-01-26 2013-01-29 Panasonic Corporation Moving image processing device, moving image processing method and imaging apparatus
US20100325145A1 (en) * 2009-06-17 2010-12-23 Pioneer Corporation Search word candidate outputting apparatus, search apparatus, search word candidate outputting method, computer-readable recording medium in which search word candidate outputting program is recorded, and computer-readable recording medium in which data structure is recorded

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10878955B2 (en) 2006-09-26 2020-12-29 Centrifyhealth, Llc Individual health record system and apparatus
US11170879B1 (en) 2006-09-26 2021-11-09 Centrifyhealth, Llc Individual health record system and apparatus
US8359278B2 (en) 2006-10-25 2013-01-22 IndentityTruth, Inc. Identity protection
US10909617B2 (en) 2010-03-24 2021-02-02 Consumerinfo.Com, Inc. Indirect monitoring and reporting of a user's credit data
US10593004B2 (en) 2011-02-18 2020-03-17 Csidentity Corporation System and methods for identifying compromised personally identifiable information on the internet
US9710868B2 (en) 2011-02-18 2017-07-18 Csidentity Corporation System and methods for identifying compromised personally identifiable information on the internet
US9235728B2 (en) 2011-02-18 2016-01-12 Csidentity Corporation System and methods for identifying compromised personally identifiable information on the internet
US9558368B2 (en) 2011-02-18 2017-01-31 Csidentity Corporation System and methods for identifying compromised personally identifiable information on the internet
US8819793B2 (en) 2011-09-20 2014-08-26 Csidentity Corporation Systems and methods for secure and efficient enrollment into a federation which utilizes a biometric repository
US9237152B2 (en) 2011-09-20 2016-01-12 Csidentity Corporation Systems and methods for secure and efficient enrollment into a federation which utilizes a biometric repository
US9501583B2 (en) 2011-10-05 2016-11-22 Google Inc. Referent based search suggestions
US9305108B2 (en) 2011-10-05 2016-04-05 Google Inc. Semantic selection and purpose facilitation
US9594474B2 (en) 2011-10-05 2017-03-14 Google Inc. Semantic selection and purpose facilitation
US9652556B2 (en) 2011-10-05 2017-05-16 Google Inc. Search suggestions based on viewport content
US9032316B1 (en) 2011-10-05 2015-05-12 Google Inc. Value-based presentation of user-selectable computing actions
US9779179B2 (en) 2011-10-05 2017-10-03 Google Inc. Referent based search suggestions
US10013152B2 (en) 2011-10-05 2018-07-03 Google Llc Content selection disambiguation
US8890827B1 (en) 2011-10-05 2014-11-18 Google Inc. Selected content refinement mechanisms
US8878785B1 (en) 2011-10-05 2014-11-04 Google Inc. Intent determination using geometric shape input
US8825671B1 (en) * 2011-10-05 2014-09-02 Google Inc. Referent determination from selected content
US11568348B1 (en) 2011-10-31 2023-01-31 Consumerinfo.Com, Inc. Pre-data breach monitoring
US11030562B1 (en) 2011-10-31 2021-06-08 Consumerinfo.Com, Inc. Pre-data breach monitoring
JP2015512103A (en) * 2012-02-28 2015-04-23 シークオティエント インコーポレイテッド System, method and apparatus for identifying links between interactive digital data
US10592982B2 (en) 2013-03-14 2020-03-17 Csidentity Corporation System and method for identifying related credit inquiries
US11941635B1 (en) 2014-10-31 2024-03-26 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US10339527B1 (en) 2014-10-31 2019-07-02 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US11436606B1 (en) 2014-10-31 2022-09-06 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US10990979B1 (en) 2014-10-31 2021-04-27 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US11151468B1 (en) 2015-07-02 2021-10-19 Experian Information Solutions, Inc. Behavior analysis using distributed representations of event data
US11157650B1 (en) 2017-09-28 2021-10-26 Csidentity Corporation Identity security architecture systems and methods
US11580259B1 (en) 2017-09-28 2023-02-14 Csidentity Corporation Identity security architecture systems and methods
US10699028B1 (en) 2017-09-28 2020-06-30 Csidentity Corporation Identity security architecture systems and methods
US10896472B1 (en) 2017-11-14 2021-01-19 Csidentity Corporation Security and identity verification system and architecture
WO2019136407A1 (en) * 2018-01-08 2019-07-11 Equifax Inc. Facilitating entity resolution, keying, and search match without transmitting personally identifiable information in the clear
US11775679B2 (en) 2018-01-08 2023-10-03 Equifax Inc. Facilitating entity resolution, keying, and search match without transmitting personally identifiable information in the clear
US11074423B2 (en) 2018-01-29 2021-07-27 Hewlett-Packard Development Company, L.P. Object ID-centered workflow
US11688153B2 (en) 2018-01-29 2023-06-27 Hewlett-Packard Development Company, L.P. Object ID-centered workflow
US11669514B2 (en) 2019-04-03 2023-06-06 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11586613B2 (en) 2019-04-03 2023-02-21 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11593353B2 (en) 2019-04-03 2023-02-28 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11620278B2 (en) 2019-04-03 2023-04-04 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11636097B2 (en) 2019-04-03 2023-04-25 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11301461B2 (en) 2019-04-03 2022-04-12 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11281662B2 (en) 2019-04-03 2022-03-22 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11741085B2 (en) 2019-04-03 2023-08-29 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11755566B2 (en) 2019-04-03 2023-09-12 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11226959B2 (en) 2019-04-03 2022-01-18 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
US11775505B2 (en) 2019-04-03 2023-10-03 Unitedhealth Group Incorporated Managing data objects for graph-based data structures
CN110347480A (en) * 2019-06-26 2019-10-18 联动优势科技有限公司 The preferred access path method and device of data source containing coincidence data item label

Also Published As

Publication number Publication date
US7792864B1 (en) 2010-09-07

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION