US20040215656A1 - Automated data mining runs - Google Patents

Automated data mining runs

Publication number
US20040215656A1
Authority
US
United States
Prior art keywords
data
analytical
data source
processing
mining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/423,011
Inventor
Marcus Dill
Harish Mahabal
Lakshmi Shankar
Jens Weidner
Bernd Ecker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SAP SE
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US10/423,011 (US20040215656A1)
Assigned to SAP AKTIENGESELLSCHAFT. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MAHABAL, HARISH HOSKERE; SHANKAR, LAKSHMI; ECKER, BERND; DILL, MARCUS; WEIDNER, JENS
Priority to US10/816,909 (US7571191B2)
Priority to US10/816,910 (US20040267751A1)
Priority to PCT/EP2004/003742 (WO2004097667A2)
Priority to EP04726131A (EP1623343A2)
Publication of US20040215656A1
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99942 Manipulating data structure, e.g. compression, compaction, compilation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99943 Generating database or data structure, e.g. via user interface

Definitions

  • This description relates to loading and using data in a data warehouse on a computer system.
  • Computer systems often are used to manage and process business data. To do so, a business enterprise may use various application programs running on one or more computer systems. Application programs may be used to process business transactions, such as taking and fulfilling customer orders, providing supply chain and inventory management, performing human resource management functions, and performing financial management functions. Data used in business transactions may be referred to as transaction data or operational data. Often, transaction processing systems provide real-time access to data, and such systems may be referred to as on-line transaction processing (OLTP) systems.
  • Application programs also may be used for analyzing data, including analyzing data obtained through transaction processing systems.
  • the data needed for analysis may have been produced by various transaction processing systems and may be located in many different data management systems.
  • a large volume of data may be available to a business enterprise for analysis.
  • An analysis data repository may store data obtained from transaction processing systems and used for analytical processing.
  • the analysis data repository may be referred to as a data warehouse or a data mart.
  • the term data mart typically is used when an analysis data repository stores data for a portion of a business enterprise or stores a subset of data stored in another, larger analysis data repository, which typically is referred to as a data warehouse.
  • a business enterprise may use a sales data mart for sales data and a financial data mart for financial data.
  • Analytical processing may be used to analyze data stored in a data warehouse or other type of analytical data repository.
  • When an analytical processing tool accesses the data warehouse on a real-time basis, the analytical processing tool may be referred to as an on-line analytical processing (OLAP) system.
  • An OLAP system may support complex analyses using a large volume of data.
  • An OLAP system may produce an information model using a three-dimensional presentation, which may be referred to as an information cube or a data cube.
  • One type of analytical processing identifies relationships in data stored in a data warehouse or another type of data repository.
  • the process of identifying data relationships by means of an automated computer process may be referred to as data mining.
  • a data mining mart may be used to store a subset of data extracted from a data warehouse.
  • a data mining process may be performed on data in the data mining mart, rather than the data mining process being performed on data in the data warehouse.
  • the results of the data mining process then are stored in the data warehouse.
  • the use of a data mining mart that is separate from a data warehouse may help decrease the impact on the data warehouse of a data mining process that requires significant system resources, such as processing capacity or input/output capacity.
  • data mining marts may be optimized for access by data mining analyses that provide faster and more flexible access.
  • One type of data relationship that may be identified by a data mining process is an associative relationship in which one data value is associated or otherwise occurs in conjunction with another data value or event.
  • an association between two or more products that are purchased by a customer at the same time may be identified by analyzing sales receipts or sales orders. This may be referred to as a sales basket analysis or a cross-selling analysis.
  • the association of product purchases may be based on a pairing of two products, such as when a customer who purchases product A also purchases product B.
  • the analysis may also reveal relationships between three products, such as when a customer purchases product A and product B, the customer also typically purchases product C.
  • the results of a cross-selling analysis may be used to promote associated products, such as through a marketing campaign that promotes the associated products or by locating the associated products near one another in a retail store, such as by locating the products in the same aisle or shelf.
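The pairwise association described above can be illustrated with a minimal sketch, assuming each sales order is available as a plain list of product names. The function name `pair_counts` and the sample orders are hypothetical and do not come from the description; the sketch only counts how often two products occur in the same order, which is the raw input a cross-selling analysis would rank.

```python
from collections import Counter
from itertools import combinations

def pair_counts(orders):
    """Count how often each unordered pair of products appears in the same order."""
    counts = Counter()
    for products in orders:
        # Deduplicate and sort so that (A, B) and (B, A) count as the same pair.
        for pair in combinations(sorted(set(products)), 2):
            counts[pair] += 1
    return counts

# Hypothetical sales orders, each a list of purchased products.
orders = [["A", "B"], ["A", "B", "C"], ["A", "C"], ["B"]]
counts = pair_counts(orders)
```

Extending the second argument of `combinations` from 2 to 3 would count three-product groupings in the same way.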
  • Customers that are at risk of not renewing a sales contract or not purchasing products in the future also may be identified by data mining. Such an analysis may be referred to as a churn analysis in which the likelihood of churn refers to the likelihood that a customer will not purchase products or services in the future.
  • a customer at risk of churning may be identified based on having similar characteristics to customers that have already churned. The ability to identify a customer at risk of churning may be advantageous, particularly when steps may be taken to reduce the number of customers who do churn.
  • a churn analysis may also be referred to as a customer loyalty analysis.
  • a customer may be able to switch from one telecommunications provider to another relatively easily.
  • a telecommunications provider may be able to identify, using data mining techniques, particular customers that are likely to switch to a different telecommunications provider.
  • the telecommunications provider may be able to provide an incentive to at-risk customers to decrease the number of customers who switch.
  • a delay in performing data mining analysis may be problematic when the results of the analysis are most useful at a particular time. For example, the value of a churn prediction for a particular customer or group of customers may be time-sensitive. After a customer purchases a service or product elsewhere, the opportunity of the business enterprise to influence the behavior of the customer is lost. When the identification of a high likelihood of churning occurs after the customer has been lost, the data mining result is wasted.
  • Some aspects of creating or using a data warehouse may be automated, that is initiated without user manipulation.
  • an automated software agent may be employed to collect data from various distributed databases for a data warehouse.
  • a report or other type of output may be automatically generated and sent to various receiving devices, such as a personal digital assistant, a printer, or a pager.
  • the online transaction data may be automatically summarized and stored as summary data.
  • the invention automates the triggering of special analyses directly after having loaded new data in a data warehouse environment to enrich the newly loaded data with new attributes.
  • the invention automates, without requiring user manipulation, copying data from a data warehouse to a data mining mart, the triggering of a data mining procedure (such as a training or a prediction procedure) that enriches the data with new attributes, and the triggering of the upload of the enriched data to the data warehouse.
  • the invention also may automate, without requiring user manipulation, the loading of transaction data from a source system into a data warehouse before triggering the data mining process.
  • One area where the invention may find specific applicability is in performing a data mining procedure on a regular, predetermined basis. For example, sales receipts for a particular month may be automatically loaded into a data warehouse and analyzed for associative sales relationships. Another example is performing periodic analysis of customer activity to identify customers that are at risk of churning for the purpose of influencing customer behavior.
  • a data mining process may be automatically triggered.
  • An analytical process is triggered based on the presence of data in a data source that is used for analytical processing.
  • the analytical process is performed on data from the data source after the analytical process has been triggered.
  • the analytical process uses a procedure that also is usable in a data extraction process.
  • the created data attribute is stored in the data source.
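As a rough illustration of the aspect summarized above, the following sketch triggers an analytical process only when the data source contains records that have not yet been enriched, and then stores the created attribute back in the same data source. It assumes the data source is a list of dictionaries; the names `trigger_if_loaded`, `at_risk`, and `orders` are hypothetical and chosen only for this example.

```python
def trigger_if_loaded(data_source, analytical_process, attribute):
    """Run analytical_process on records lacking `attribute` and store the result in place.

    Returns the number of records that were enriched (0 means nothing triggered).
    """
    pending = [record for record in data_source if attribute not in record]
    if not pending:
        return 0  # no newly loaded data is present, so the analysis is not triggered
    for record in pending:
        record[attribute] = analytical_process(record)
    return len(pending)

# Hypothetical data source: the first record was just loaded, the second is already enriched.
warehouse = [
    {"customer": 1, "orders": 0},
    {"customer": 2, "orders": 5, "at_risk": False},
]
enriched = trigger_if_loaded(warehouse, lambda r: r["orders"] == 0, "at_risk")
enriched_again = trigger_if_loaded(warehouse, lambda r: r["orders"] == 0, "at_risk")
```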
  • Implementations may include one or more of the features noted above and one or more of the following features.
  • the analytical process may be triggered based on the completion of a computer program for loading data to the data source that is used for analytical processing.
  • data may be extracted from a data source used for transaction processing and loaded to the data source that is used for analytical processing.
  • a person may initiate at most the step of extracting data from the data source used for transaction processing or the step of loading the extracted data. The occurrence of a predetermined date and time may trigger extracting the data or may trigger loading the data.
  • In addition to extracting data from a transaction data source, data also may be extracted from the data source that is used for analytical processing and loaded to temporary data storage. The analytical process may be performed on the data stored in the temporary data storage.
  • the types of analytical processes that may be triggered include an analytical process to determine a relationship between two data values in the data source, or determine a relationship between two data values that predict a likelihood of whether a particular customer will fail to purchase a service or product in the future.
  • the analytical process also may apply a relationship that has been previously-determined to data values in the data source.
  • the analytical process also may identify products or services that are purchased in the same transaction. For example, the analytical process may determine the likelihood of whether a particular customer will fail to purchase a service or a product in the future. The likelihood may be based on characteristics associated with customers who have been identified as failing to purchase a service or a product.
  • Implementations of the techniques discussed above may include a method or process, a system or apparatus, or computer software on a computer-accessible medium.
  • the details of one or more implementations of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
  • FIG. 1 is a block diagram of a system incorporating various aspects of the invention.
  • FIG. 2 is a block diagram illustrating the enrichment of data stored in the data warehouse based on an automated data mining run.
  • FIGS. 3 and 4 are flow charts of processes to automate a data mining process.
  • FIG. 5 is a block diagram of the components of a software architecture for automating a data mining run.
  • FIG. 6 is a block diagram of a process to use a data mining workbench to design an automated data mining process.
  • FIG. 1 shows a block diagram of a system 100 of networked computers, including a computer system 110 for a data warehouse and transaction computer systems 120 and 130 .
  • the loading of new data to the data warehouse 110 from the transaction computer systems 120 and 130 triggers a special analysis to enrich the newly loaded data with new attributes.
  • the system 100 includes a computer system 110 for a data warehouse, a client computer 115 used to administer the data warehouse, and transaction computer systems 120 and 130 , all of which are capable of executing instructions on data.
  • each computer system 110 , 120 or 130 includes a server 140 , 142 or 144 and a data storage device 145 , 146 or 148 associated with each server.
  • Each of the data storage devices 145 , 146 and 148 includes data 150 , 152 or 154 and executable instructions 155 , 156 or 158 .
  • a particular portion of data here referred to as business objects 162 or 164 , is stored in computer systems 120 and 130 , respectively.
  • Each of business objects 162 or 164 includes multiple business objects.
  • Each business object in business objects 162 or 164 is a collection of data attribute values, and typically is associated with a principal entity represented in a computing device or a computing system.
  • Examples of a business object include information about a customer, an employee, a product, a business partner, a sales invoice, and a sales order.
  • a business object may be stored as a row in a relational database table, an object instance in an object-oriented database, data in an extensible mark-up language (XML) file, or a record in a data file. Attributes are associated with a business object.
  • a customer business object may be associated with a series of attributes including a customer number uniquely identifying the customer, a first name, a last name, an electronic mail address, a mailing address, a daytime telephone number, an evening telephone number, date of first purchase by the customer, date of the most recent purchase by the customer, birth date or age of customer, and the income level of customer.
  • a sales order business object may include a customer number of the purchaser, the date on which the sales order was placed, and a list of products, services, or both products and services purchased.
  • the data warehouse computer system 110 stores a particular portion of data, here referred to as data warehouse 165 .
  • the data warehouse 165 is a central repository of data, such as business objects 162 or 164, extracted from transaction computer system 120 or 130.
  • the data in the data warehouse 165 is used for special analyses, such as data mining analyses used to identify relationships among data.
  • the results of the data mining analysis also are stored in the data warehouse 165 .
  • the data warehouse computer system 110 includes an automated data mining process 168 having a data warehouse upload process 170 and a data mining analysis process 172 .
  • the data warehouse upload process 170 includes executable instructions for automatically extracting, transmitting and loading data from the transaction computer systems 120 and 130 to the data warehouse computer system 110 .
  • the data mining analysis process 172 includes executable instructions for triggering a data mining analysis in the data warehouse computer system 110 , and enriching the data in the data warehouse 165 with new attributes determined by the data mining analysis, as described more fully below.
  • the data warehouse computer system 110 also may include a data mining mart 174 that temporarily stores data from the data warehouse 165 for use in data mining.
  • the data mining analysis process 172 also may extract data from the data warehouse 165 , store the extracted data to the data mining mart 174 , trigger a data mining analysis that operates on the data from the data mining mart 174 , and enrich the data in the data warehouse 165 with the new attributes determined by the data mining analysis.
  • the data warehouse computer system 110 is capable of delivering and exchanging data with the transaction computer systems 120 and 130 through wired or wireless communication pathways 176 and 178, respectively.
  • the data warehouse computer system 110 also is able to communicate with the on-line client 115 that is connected to the computer system 110 through a communication pathway 182.
  • the data warehouse computer system 110 , the transaction computer systems 120 and 130 , and the on-line client 115 may be arranged to operate within or in concert with one or more other systems, such as, for example, one or more LANs (“Local Area Networks”) and/or one or more WANs (“Wide Area Networks”).
  • the on-line client 115 may be a general-purpose computer that is capable of operating as a client of the application program (e.g., a desktop personal computer, a workstation, or a laptop computer running an application program), or a more special-purpose computer (e.g., a device specifically programmed to operate as a client of a particular application program).
  • the on-line client 115 uses communication pathway 182 to communicate with the data warehouse computer system 110 .
  • FIG. 1 illustrates only a single on-line client 115 for system 100 .
  • the data warehouse computer system 110 initiates an automated data mining process. This may be accomplished, for example, through the use of a task scheduler (not shown) that initiates the automated data mining process at a particular day and time.
  • the automated data mining process 1) uses the data warehouse upload process 170 to initiate the extraction, transformation and loading of data to the data warehouse 165 from the source systems 120 and 130 , and 2) uses the data mining analysis process 172 to initiate a data mining run that creates new attributes by performing a special analysis of the data and loads the new attributes to the data warehouse 165 without user manipulation.
  • a particular automated data mining run may be scheduled as a recurring event based on the occurrence of a predetermined time or date (such as the first day of a month, every Saturday at one o'clock a.m., or the first day of a quarter). Examples of automated data mining processes are described more fully in FIGS. 3-5.
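The recurring schedules mentioned above (every Saturday at one o'clock a.m., or the first day of a month) can be computed with the standard `datetime` module. This is a minimal sketch of what a task scheduler might calculate; the function names are hypothetical, and the description does not specify how the scheduler itself is implemented.

```python
from datetime import datetime, timedelta

def next_saturday_1am(now):
    """Return the next Saturday at 01:00 that is strictly after `now`."""
    candidate = now.replace(hour=1, minute=0, second=0, microsecond=0)
    # weekday(): Monday is 0, Saturday is 5.
    candidate += timedelta(days=(5 - candidate.weekday()) % 7)
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate

def next_first_of_month(now):
    """Return midnight on the first day of the month following `now`."""
    if now.month == 12:
        return datetime(now.year + 1, 1, 1)
    return datetime(now.year, now.month + 1, 1)

# Example: Friday, 25 April 2003 at noon.
weekly_run = next_saturday_1am(datetime(2003, 4, 25, 12, 0))
monthly_run = next_first_of_month(datetime(2003, 4, 25, 12, 0))
```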
  • the data warehouse computer system 110 uses the automated data mining process 168 to initiate the data warehouse upload process 170 .
  • the data warehouse upload process 170 extracts or copies a portion of data, such as all or some of business objects 162 , from the data storage 146 of the transaction computer system 120 .
  • the extracted data is transmitted over the connection 176 to the data warehouse computer system 110 , where the extracted data are stored in data warehouse 165 .
  • the data warehouse computer system 110 also may transform the extracted data from a format suitable to computer system 120 into a different format that is suitable for the data warehouse computer system 110.
  • the data warehouse computer system 110 may extract a portion of data from data storage 148 of the computer system 130, such as all or some of business objects 164, transmit the extracted data over connection 178, store the extracted data in the data warehouse 165, and optionally transform the extracted data.
  • the automated data mining process 168 initiates the data mining analysis 172 .
  • the data mining analysis 172 performs a particular data mining procedure to analyze data from the data warehouse 165 , enrich the data with new attributes, and store the enriched data in the data warehouse 165 .
  • a particular data mining procedure also may be referred to as a data mining run.
  • a data mining run may be a training run in which data relationships are determined, a prediction run that applies a determined relationship to a collection of data relevant to a future event, such as a customer failing to renew a service contract or make another purchase, or both a training run and a prediction run.
  • the prediction run results in the creation of a new attribute for each business object in the data warehouse 165 .
  • the creation of a new attribute may be referred to as data enrichment.
  • an attribute for the likelihood of churn for each customer is stored in the data warehouse 165 . That is, the data warehouse 165 is enriched with the new attribute.
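The distinction between a training run and a prediction run can be sketched as follows. This is a deliberately simple stand-in, not the data mining method of the description: the training run estimates a churn rate per customer segment from customers whose outcome is known, and the prediction run enriches each current customer with that rate as a new attribute. The field names `segment`, `churned`, and `churn_likelihood` are assumptions made for the example.

```python
from collections import defaultdict

def train(history):
    """Training run: estimate the churn rate for each customer segment."""
    totals, churned = defaultdict(int), defaultdict(int)
    for customer in history:
        totals[customer["segment"]] += 1
        churned[customer["segment"]] += customer["churned"]  # True counts as 1
    return {segment: churned[segment] / totals[segment] for segment in totals}

def predict(model, customers, default=0.0):
    """Prediction run: enrich each customer with a likelihood-of-churn attribute."""
    for customer in customers:
        customer["churn_likelihood"] = model.get(customer["segment"], default)
    return customers

# Hypothetical historical customers with a known outcome.
history = [
    {"segment": "prepaid", "churned": True},
    {"segment": "prepaid", "churned": False},
    {"segment": "contract", "churned": False},
]
model = train(history)
current = predict(model, [{"id": 1, "segment": "prepaid"}, {"id": 2, "segment": "contract"}])
```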
  • the combination of the data warehouse upload process 170 and the data mining analysis 172 in the automated data mining process 168 may increase the coupling of the data mining with the upload of new data to the data warehouse, which, in turn, may reduce the time until the results of new data mining analyses are available.
  • the combination of the data warehouse upload process 170 and the data mining analysis 172 in the automated data mining process 168 also enables the use of the same monitoring process to monitor both the data warehouse upload process 170 and the data mining analysis process 172, which, in turn, may help simplify the monitoring of the automated data mining process 168.
  • the data warehouse computer system 110 also includes a data warehouse monitor 180 that reports on the administration of the automated data mining process 168 .
  • an end user of online client 115 is able to view when an automated data mining process is scheduled to next occur, the frequency or other basis on which the automated data mining process is scheduled, and the status of the automated data mining process.
  • the end user may be able to determine that the automated data mining process 168 is executing.
  • the end user may be able to view the progress and status of each of the steps within the data mining method.
  • the end user may be able to view the time that the data warehouse upload process 170 was initiated.
  • the ability to monitor the execution of the automated data mining process may be useful to ensure that the automated data mining process 168 is operating as desired.
  • a notification of the problem may be sent to an administrator for the data warehouse or other type of end user.
  • the use of the data warehouse monitor 180 with both the data upload process 170 and the data mining analysis 172 may be advantageous.
  • a system administrator or another type of user need only access a single monitoring process (here, data warehouse monitor 180 ) to monitor both sub-processes (here, the data upload process 170 and the data mining analysis 172 ).
  • the use of the same monitoring process for different sub-processes may result in consistent process behavior across the different sub-processes.
  • the use of the same monitoring process also may reduce the amount of training required for system administrators to be able to use the data warehouse monitor 180 .
  • the ability to trigger special analyses directly after having loaded new data in a data warehouse environment to enrich the newly loaded data by new attributes may be useful.
  • Multiple users, often geographically or organizationally distributed, are typically responsible for performing different aspects of the process, all aspects of which must be completed before the newly loaded data is enriched by the special analyses. This may result in a delay from the time when the transaction data is available for analysis to the time when the results of the analysis are available.
  • the delay may be significant or may negatively impact the business enterprise. For example, a business enterprise may be harmed by lost sales by the delay of product arrangements in a retail store based on a cross-selling analysis or by the delay of a promotional marketing campaign to target at-risk customers.
  • FIG. 2 shows the results 200 of enriching the data stored in the data warehouse based on an automated data mining process.
  • the results 200 are stored in a relational database system that logically organizes data into a database table.
  • the database table arranges data associated with an entity (here, a customer) in a series of columns 210 - 216 and rows 220 - 223 .
  • Each column 210 , 211 , 212 , 213 , 214 , 215 , or 216 describes an attribute of the customer for which data is being stored.
  • Each row 220, 221, 222 or 223 represents a collection of attribute values for a particular customer identified by a customer identifier 210.
  • the attributes 210 - 215 were extracted from a source system, such as a customer relationship management system, and loaded into the data warehouse.
  • the attribute 216 represents the likelihood of churn for each customer 220 , 221 , 222 and 223 .
  • the likelihood-of-churn attribute 216 was created and loaded into the data warehouse by an automated data mining process, such as the automated data mining process described in FIGS. 1, 3 and 4 .
  • FIG. 3 illustrates an automated data mining process 300 .
  • the automated data mining process 300 may be performed by a processor on a computing system, such as data warehouse computer system 110 of FIG. 1.
  • the automated data mining processor is directed by a method, script, or other type of computer program that includes executable instructions for performing the automated data mining process 300 .
  • An example of such a collection of executable instructions is the automated data mining process 168 of FIG. 1.
  • the automated data mining process 300 includes an extract, transform and load (ETL) sub-process 310 , a data mining sub-process 320 , and a data enrichment sub-process 330 .
  • the automated data mining process 300 begins at a predetermined time and date, typically a recurring predetermined time and date.
  • a system administrator or another type of user may manually initiate the automated data mining process 300 .
  • the automated data mining process 300 once initiated, automatically triggers sub-processes 310 , 320 and 330 without requiring further user manipulation.
  • a churn management automated data mining process may be associated with a script that includes a remote procedure call to extract data from one or more source systems in step 340 , a computer program to transform the extracted data, a database script for loading the data warehouse with the transformed data, and a computer program to perform a churn analysis on the customer data in the data warehouse.
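The chaining of steps in such a script can be sketched as a single driver function that starts each step as soon as the previous one returns, so that no user action is needed between steps. The four callables below are hypothetical stand-ins for the remote procedure call, the transformation program, the database load script, and the churn analysis program; none of their names or behaviors come from the description.

```python
def run_automated_mining(extract, transform, load, analyze):
    """Run the sub-processes in order; each starts as soon as the previous one returns."""
    completed = []
    rows = extract()
    completed.append("extract")
    rows = transform(rows)
    completed.append("transform")
    warehouse = load(rows)
    completed.append("load")
    analyze(warehouse)
    completed.append("analyze")
    return warehouse, completed

# Hypothetical stand-ins for the four programs named in the script.
def extract():
    return [{"customer": 1, "orders": 0}, {"customer": 2, "orders": 3}]

def transform(rows):
    # Convert the customer number to the text format assumed for the warehouse.
    return [dict(row, customer=str(row["customer"])) for row in rows]

def load(rows):
    return list(rows)

def analyze(warehouse):
    # Toy churn rule: a customer with no orders is flagged as at risk.
    for row in warehouse:
        row["at_risk"] = row["orders"] == 0

warehouse, completed = run_automated_mining(extract, transform, load, analyze)
```

The `completed` list is the kind of per-step status that a single monitoring process could report for both the load and the analysis.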
  • the data warehouse processor extracts from a source system appropriate data and transmits the extracted data to the data warehouse (step 340 ).
  • the data warehouse processor may execute a remote procedure call on the source system to trigger the extraction and transmission of data from the source system to the computer system on which the data warehouse resides.
  • the data warehouse processor may connect to a web service on the source system to request the extraction and transmission of the data.
  • the data to be extracted is data from a transaction system, such as an OLTP system.
  • the data extracted may be a complete set of the appropriate data (such as all sales orders or all customers) from the source system, or may be only the data that has been changed since the last extraction.
  • the processor may extract and transmit the data from the source system in a series of data groups, such as data blocks.
  • the extraction may be performed either as a background process or an on-line process, as may the transmission.
  • the ability to extract and transmit data in groups, extract and transmit only changed data, and extract and transmit as a background process may collectively or individually be useful, particularly when a large volume of data is to be extracted and transmitted.
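Extracting only changed data and transmitting it in groups can be combined in a small generator. This is a minimal sketch, assuming each source record carries a change timestamp; the field name `changed_at`, the integer timestamps, and the function name `extract_changed` are illustrative assumptions.

```python
def extract_changed(records, last_extraction, block_size):
    """Yield blocks of the records that changed after `last_extraction`."""
    changed = [r for r in records if r["changed_at"] > last_extraction]
    for start in range(0, len(changed), block_size):
        yield changed[start:start + block_size]

# Hypothetical source records with integer change timestamps 1 through 5.
records = [{"order": n, "changed_at": n} for n in range(1, 6)]
blocks = list(extract_changed(records, last_extraction=2, block_size=2))
block_orders = [[r["order"] for r in block] for block in blocks]
```

Because the function is a generator, a caller can transmit each block before the next one is produced, which keeps memory use bounded when the changed set is large.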
  • the extracted data also may be transformed from the format used by the source system to a different format used by the data warehouse (step 345 ).
  • the data transformation may include transforming data values representing a particular attribute to a different field type and length that is used by the data warehouse.
  • the data transformation also may include translating a data code used by the source system to a corresponding but different data code used by the data warehouse.
  • the source system may store a country value using a numeric code (for example, a “1” for the United States and a “2” for the United Kingdom) whereas the data warehouse may store a country value as a textual abbreviation (for example, “U.S.” for the United States and “U.K.” for the United Kingdom).
  • the data transformation also may include translating a proprietary key numbering system in which primary keys are created by sequentially allocating numbers within an allocated number range to a corresponding GUID (“globally unique identifier”) key that is produced from a well-known algorithm and is able to be processed by any computer system using the well-known algorithm.
  • the processor may use a translation table or other software engineering or programming techniques to perform the transformations required. For example, the processor may use a translation table that translates the various possible values from one system to another system for a particular data attribute (for example, translating a country code of “1” to “U.S.” and “2” to “U.K.” or translating a particular proprietary key to a corresponding GUID key).
  • the processor may aggregate data or generate additional data values based on the extracted data.
  • the processor may determine a geographic region for a customer based on the customer's mailing address or may determine the total amount of sales to a particular customer that is associated with multiple sales orders.
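  • The transformations of step 345 might be sketched as follows. The country codes come from the example above; the field names, the namespace value, and the use of Python's uuid.uuid5 to derive a reproducible GUID are assumptions chosen for illustration and are not the algorithm of any particular system.

```python
import uuid
from collections import defaultdict

# Translation table from the source system's numeric country codes to the
# warehouse's textual abbreviations (values taken from the example above).
COUNTRY_CODES = {"1": "U.S.", "2": "U.K."}

# Assumed namespace for deriving GUIDs; any fixed UUID would do.
KEY_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def to_guid(proprietary_key: int) -> str:
    """Map a sequential proprietary key to a reproducible GUID string."""
    return str(uuid.uuid5(KEY_NAMESPACE, str(proprietary_key)))


def transform(order: dict) -> dict:
    """Convert one extracted sales order to the warehouse format."""
    return {
        "customer_guid": to_guid(order["customer_key"]),
        "country": COUNTRY_CODES[order["country_code"]],
        "amount": float(order["amount"]),  # change of field type
    }


def total_sales_per_customer(orders: list) -> dict:
    """Aggregate the sales amount over all orders of each customer."""
    totals = defaultdict(float)
    for order in orders:
        totals[order["customer_guid"]] += order["amount"]
    return dict(totals)


# Hypothetical extracted rows in the source system's format.
extracted = [
    {"customer_key": 1001, "country_code": "1", "amount": "250.00"},
    {"customer_key": 1001, "country_code": "1", "amount": "100.50"},
    {"customer_key": 1002, "country_code": "2", "amount": "75.25"},
]
transformed = [transform(o) for o in extracted]
totals = total_sales_per_customer(transformed)
```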
  • the data warehouse processor loads the extracted data into data storage associated with the data warehouse, such as the data warehouse 165 of FIG. 1 (step 350 ).
  • the data warehouse processor may execute a computer program having executable instructions for loading the extracted data into the data storage and identified by the automated data mining method directing the process 300 .
  • a database script may be executed that includes database commands to load the data to the data warehouse.
  • the use of a separate computer program for loading the data may increase the modularity of the data mining method, which, in turn, may improve the efficiency of modifying the automated data mining process 300 .
  • Steps 340 - 350 may be referred to as the ETL sub-process 310 .
  • the data warehouse processor automatically triggers a data mining process (step 360 ). This may be accomplished, for example, by using a script or other type of computer program to control the execution of multiple programs.
  • the data warehouse processor performs a data mining run (step 365 ). To do so, the data warehouse processor may apply a data mining model or another type of collection of data mining rules that defines the type of analysis to be performed. The data mining model may be applied to all or a portion of the data in the data warehouse. In some implementations, the data warehouse processor may store the data to be used in the data mining run in transient or persistent storage peripheral to the data warehouse processor where the data is accessed during the data mining run. This may be particularly advantageous when the data warehouse includes a very large volume of data and/or the data warehouse also is used for OLAP processing. In some cases, the storage of the data to transient or persistent storage may be referred to as extracting or staging the data to a data mart for data mining purposes.
  • the data mining run may be a training run or a prediction run. In some implementations, both a training run and a prediction run may be performed during process 300 .
  • the results of the data mining run are stored in temporary storage. To do so, the data warehouse processor may copy the results stored in the temporary data structure to the data warehouse. For example, in a customer churn analysis data mining process, the likelihood of churn for each customer may be assessed and stored in a temporary results data structure. Steps 360 - 365 may be referred to as a data mining sub-process 320 .
  • the data warehouse processor stores the data mining results in the data warehouse (step 370 ). For example, a new column for the data mining results may be added to a table in a relational data management system being used for the data warehouse.
  • the likelihood of churn for each customer may be added as a new attribute in the data warehouse and appropriately populated with the likelihood data generated when the data mining run was performed in step 365 .
  • the process of storing the data created by the data mining run in the data warehouse may be referred to as a data enrichment sub-process 330 .
  • the process 300 may be used for an automated customer-churn data mining process.
  • a system administrator develops computer programs, each of which is executed to accomplish a portion of the automated customer-churn data mining process.
  • the system administrator also develops a script that identifies each of the computer programs to be executed and the order in which the computer programs are to be executed to accomplish the automated customer-churn data mining process.
  • the system administrator, using a task scheduling program, schedules the automated customer-churn data mining script to be triggered on a monthly basis, such as on the first Saturday of each month beginning at one o'clock a.m.
  • the task scheduling program triggers the data warehouse processor to execute the automated customer-churn data mining script.
  • the data warehouse processor executes a remote procedure call in a customer relationship management system to extract customer data and transmit the data to the data warehouse computer system.
  • the data warehouse computer system receives and stores the extracted customer data.
  • the data warehouse processor executes a computer program, as directed by the executing automated customer-churn data mining process script, to transform the customer data to a format usable by the data warehouse.
  • the data warehouse processor continues to execute the automated customer-churn data mining process script, which then triggers a data mining training run to identify hidden relationships within the customer data. Specifically, the characteristics of customers who have not renewed a service contract in the last eighteen months are identified. The characteristics identified may include, for example, an income above or below a particular level, a geographic region in which the non-returning customer resides, the types of service contract that were not renewed, and the median age of a non-renewing customer.
  • the data warehouse processor then, under the continued direction of the automated customer-churn data mining process script, triggers a data mining prediction run to identify particular customers who are at risk of not renewing a service contract. The prediction is made based on the customer characteristics identified in the data mining training run.
  • the data warehouse processor determines a likelihood-of-churn for each customer.
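  • The relationship between the training run and the prediction run can be illustrated with a deliberately simple stand-in model. The scoring rule below (the mean observed churn rate of a customer's attribute values) and all names are assumptions made for illustration; the description does not specify a particular data mining algorithm.

```python
from collections import defaultdict


def training_run(history: list, attributes: list) -> dict:
    """Learn the observed churn rate for each (attribute, value) pair."""
    counts = defaultdict(lambda: [0, 0])  # (attribute, value) -> [churned, total]
    for customer in history:
        for attribute in attributes:
            entry = counts[(attribute, customer[attribute])]
            entry[0] += 1 if customer["churned"] else 0
            entry[1] += 1
    return {key: churned / total for key, (churned, total) in counts.items()}


def prediction_run(model: dict, customers: list, attributes: list) -> dict:
    """Score each customer as the mean churn rate of its attribute values.

    Attribute values never seen during training contribute a rate of 0.0.
    """
    scores = {}
    for customer in customers:
        rates = [model.get((a, customer[a]), 0.0) for a in attributes]
        scores[customer["id"]] = sum(rates) / len(rates)
    return scores


# Hypothetical history of customers whose renewal outcome is known.
history = [
    {"region": "north", "contract": "basic", "churned": True},
    {"region": "north", "contract": "basic", "churned": True},
    {"region": "north", "contract": "premium", "churned": False},
    {"region": "south", "contract": "premium", "churned": False},
]
attributes = ["region", "contract"]
model = training_run(history, attributes)

current = [
    {"id": "C1", "region": "south", "contract": "basic"},
    {"id": "C2", "region": "south", "contract": "premium"},
]
scores = prediction_run(model, current, attributes)
```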
  • the data warehouse is enriched with the likelihood-of-churn for each customer such that a likelihood-of-churn attribute is added to the customer data in the data warehouse and the likelihood-of-churn value for each customer is stored in the new attribute.
  • when a subsequent likelihood-of-churn value for a customer is determined, such as a likelihood-of-churn value that is determined in the following month, the likelihood-of-churn value from the previous data mining prediction run may be replaced so that a customer has only one likelihood-of-churn value at any time.
  • some implementations may store the new likelihood-of-churn value each month, in addition to a previous value for the likelihood-of-churn, to develop a time-dependent prediction—that is, a new prediction for the same type of prediction is stored each time a prediction run is performed for a customer.
  • the time-dependent prediction may help improve the accuracy of the data mining training runs because the predicted values may be monitored over time and compared with actual customer behavior.
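  • The two storage alternatives, replacing the previous value or keeping a time-dependent series, might be sketched as follows. The class PredictionStore and its methods are hypothetical names introduced only for this illustration.

```python
from datetime import date


class PredictionStore:
    """Hold likelihood-of-churn values, optionally keeping past values."""

    def __init__(self, keep_history: bool) -> None:
        self.keep_history = keep_history
        self._values = {}  # customer id -> list of (run date, likelihood)

    def store(self, customer_id: str, run_date: date, likelihood: float) -> None:
        entry = (run_date, likelihood)
        if self.keep_history:
            self._values.setdefault(customer_id, []).append(entry)
        else:
            self._values[customer_id] = [entry]  # replace the previous value

    def latest(self, customer_id: str) -> float:
        return self._values[customer_id][-1][1]

    def history(self, customer_id: str) -> list:
        return list(self._values[customer_id])


# Made-up values from two consecutive monthly prediction runs.
single_value = PredictionStore(keep_history=False)
time_dependent = PredictionStore(keep_history=True)
for store in (single_value, time_dependent):
    store.store("C1", date(2003, 3, 1), 0.30)
    store.store("C1", date(2003, 4, 5), 0.55)
```

  • With keep_history enabled, the stored series can later be compared with actual customer behavior, which is the use the preceding paragraph describes.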
  • FIG. 4 illustrates another example of an automated data mining process.
  • automated data mining process 400 replicates data from a data warehouse, such as data warehouse 165 in FIG. 1, to a data mining mart, such as data mining mart 174 of FIG. 1.
  • the data mining process 400 then performs the data mining analysis on data in the data mart, and stores the data mining results as enriched data in the data warehouse.
  • the automated data mining process 400 may be performed by a processor on a computing system, such as data warehouse computer system 110 of FIG. 1.
  • the automated data mining processor is directed by a method, script, or other type of computer program that includes executable instructions for performing the automated data mining process 400 .
  • An example of such a collection of executable instructions is the automated data mining process 168 of FIG. 1.
  • the automated data mining process 400 includes an extract, transform and load (ETL) sub-process 410 , a data mining sub-process 420 that uses a data mart, and a data enrichment sub-process 430 .
  • the automated data mining process 400 begins at a predetermined time and date, typically a recurring predetermined time and date.
  • the ETL sub-process 410 extracts data from a transactional processing or other type of source system and loads the data to a data warehouse, as described previously with respect to ETL sub-process 310 of FIG. 3.
  • the data warehouse processor automatically triggers a data mining run, as described with respect to step 360 in FIG. 3 (step 440 ).
  • the data warehouse processor copies data from the data warehouse to the data mining mart for use in a data mining run (step 450 ).
  • the data warehouse processor may insert into database tables of a data mining mart a copy of some of the data rows stored in the data warehouse.
  • the data warehouse processor may extract data from the data warehouse on a computer system and transmit the data to the data mart located on a different computer system.
  • the data warehouse processor then may execute a remote procedure call or other collection of executable instructions to load data into the data mart.
  • the data warehouse processor may replicate data from the data warehouse to the data mining mart; that is, the data warehouse processor copies the data to the data mining mart and synchronizes the data mining mart with the data warehouse such that changes made to one of the data warehouse or the data mining mart are reflected in the other.
  • the data warehouse processor may transform the data from the data warehouse before storing the data in the data mining mart.
  • the data warehouse processor then performs a data mining run, as described in step 365 in FIG. 3, using data in the data mining mart (step 460 ).
  • the steps 440 - 460 may be referred to as a data mining sub-process 420 .
  • the data warehouse processor stores the data mining results in the data warehouse (step 470 ), as described in step 370 and sub-process 330 in FIG. 3.
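  • Steps 450 through 470 can be sketched with two in-memory SQLite databases standing in for the data warehouse and the data mining mart. The table names, column names, and the 12-month threshold rule are assumptions made for illustration; the threshold is a placeholder and not a real data mining model.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
mart = sqlite3.connect(":memory:")

warehouse.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, region TEXT, months_inactive INTEGER)"
)
warehouse.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "north", 14), (2, "north", 2), (3, "south", 20)],
)

# Step 450: copy only the rows needed for the run into the data mining mart.
rows = warehouse.execute(
    "SELECT id, months_inactive FROM customer WHERE region = ?", ("north",)
).fetchall()
mart.execute("CREATE TABLE mining_input (id INTEGER PRIMARY KEY, months_inactive INTEGER)")
mart.executemany("INSERT INTO mining_input VALUES (?, ?)", rows)
mart.commit()

# Step 460: run the analysis against the mart, not the warehouse.
results = [
    (1 if months_inactive > 12 else 0, customer_id)
    for customer_id, months_inactive in mart.execute(
        "SELECT id, months_inactive FROM mining_input"
    )
]

# Step 470: store the results in the warehouse as a new attribute.
warehouse.execute("ALTER TABLE customer ADD COLUMN at_risk INTEGER")
warehouse.executemany("UPDATE customer SET at_risk = ? WHERE id = ?", results)
warehouse.commit()

enriched = warehouse.execute("SELECT id, at_risk FROM customer ORDER BY id").fetchall()
mart_row_count = mart.execute("SELECT COUNT(*) FROM mining_input").fetchone()[0]
```

  • Customer 3 was not copied to the mart, so its new attribute stays empty (NULL) after enrichment.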
  • FIG. 5 depicts the components of a software architecture 500 for an automated data mining process.
  • the software architecture 500 may be used to implement the automated data mining process 300 described in FIG. 3 or the automated data mining process 400 described in FIG. 4.
  • the software architecture 500 may be implemented, for example, on computer system 110 of FIG. 1.
  • FIG. 5 also illustrates a data flow and a process flow using the components of the software architecture to implement the automated data mining process 400 in FIG. 4.
  • the software architecture 500 includes an automated data mining task scheduler 510 , a transaction data extractor 515 , and a data mining extractor 520 .
  • the software architecture also includes a transaction processing data management system 525 for a transaction processing system, such as transaction computer system 120 or transaction computer system 130 in FIG. 1.
  • the software architecture also includes a data warehouse 530 , such as the data warehouse 165 in FIG. 1, and a data mart 535 , such as the optional data mart 174 in FIG. 1.
  • One example of the automated data mining task scheduler 510 is a process chain for triggering the transaction data extractor 515 and the data mining extractor 520 at a predetermined date and time.
  • a process chain is a computer program that defines particular tasks that are to occur in a particular order at a predetermined date and time. For example, a system administrator or another type of user may schedule the process chain to occur at regular intervals, such as at one o'clock a.m. the first Saturday of a month, every Sunday at eight o'clock a.m., or at two o'clock a.m. on the first day and the fifteenth day of each month.
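  • One of the recurring schedules mentioned above, one o'clock a.m. on the first Saturday of a month, can be computed as in the following sketch. The function names are hypothetical, and a production scheduler would also need to handle time zones, which are ignored here.

```python
from datetime import datetime, timedelta


def first_saturday_run(year: int, month: int) -> datetime:
    """Return one o'clock a.m. on the first Saturday of the given month."""
    first_day = datetime(year, month, 1, hour=1)
    # datetime.weekday() numbers Monday as 0, so Saturday is 5.
    days_until_saturday = (5 - first_day.weekday()) % 7
    return first_day + timedelta(days=days_until_saturday)


def next_run(now: datetime) -> datetime:
    """Return the next scheduled run strictly after `now`."""
    candidate = first_saturday_run(now.year, now.month)
    if candidate > now:
        return candidate
    if now.month == 12:
        return first_saturday_run(now.year + 1, 1)
    return first_saturday_run(now.year, now.month + 1)
```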
  • a process chain may include dependencies between the defined tasks in the process chain such that a subsequent task is not triggered until a previous task has been successfully completed.
  • the automated data mining task scheduler 510 is a process chain that calls two extractor processes: the transaction data extractor 515 and the data mining extractor 520 .
  • the data mining extractor 520 is only initiated after the successful completion of the transaction data extractor 515 .
  • the automated data mining task scheduler 510 starts the transaction data extractor 515 at a predetermined date and time, as illustrated by process flow 542 .
  • an extractor is a computer program that performs the extraction of data from a data source using a set of predefined settings. Typical settings for an extractor include data selection settings that identify the particular data attributes to be extracted and data filter settings that identify the criteria for selecting the particular records to be extracted. For example, an extractor may identify three attributes (customer number, last purchase date, and amount of last purchase) that are to be extracted for all customers that are located in a particular geographic region. The extractor then reads the attribute values for the records that meet the filter condition from the data source, maps the data to the attributes included in the data warehouse, and loads the data to the data warehouse. An extractor also may be referred to as an upload process.
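  • An extractor driven by selection and filter settings might be sketched as follows. The class ExtractorSettings, the function run_extractor, and the attribute names are hypothetical and chosen to mirror the three-attribute example above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ExtractorSettings:
    # Data selection: source attribute name -> warehouse attribute name.
    selection: Dict[str, str]
    # Data filter: returns True for records that are to be extracted.
    record_filter: Callable[[dict], bool]


def run_extractor(source: List[dict], settings: ExtractorSettings, warehouse: List[dict]) -> int:
    """Read matching records, map their attributes, and load them.

    Returns the number of records loaded into the warehouse.
    """
    loaded = 0
    for record in source:
        if not settings.record_filter(record):
            continue
        warehouse.append(
            {target: record[src] for src, target in settings.selection.items()}
        )
        loaded += 1
    return loaded


settings = ExtractorSettings(
    selection={
        "cust_no": "customer_number",
        "last_purch": "last_purchase_date",
        "last_amt": "last_purchase_amount",
    },
    record_filter=lambda record: record["region"] == "EMEA",
)

# Hypothetical source records.
source = [
    {"cust_no": 7, "last_purch": "2003-04-01", "last_amt": 120.0, "region": "EMEA"},
    {"cust_no": 8, "last_purch": "2003-04-03", "last_amt": 80.0, "region": "APAC"},
]
warehouse_rows: List[dict] = []
loaded_count = run_extractor(source, settings, warehouse_rows)
```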
  • the transaction data extractor 515 extracts, using predefined settings, data from the transaction processing data management system, as indicated by data flow line 544 , and transforms the data as necessary to prepare the data to be loaded to the data warehouse 530 .
  • the transaction data extractor 515 then loads the extracted data to the data warehouse 530 , as indicated by data flow 546 .
  • the transaction data extractor 515 returns processing control to the automated data mining task scheduler 510 , as indicated by process flow 548 . When returning processing control, the transaction data extractor 515 also reports the successful completion of the extraction.
  • the automated data mining task scheduler 510 starts the data mining extractor 520 , as illustrated by process flow 552 .
  • the data mining extractor initiates a data mining process using the newly loaded transaction data in the data warehouse 530 .
  • the data mining process analyzes the data and writes the results back to the data warehouse.
  • the data mining extractor 520 extracts data from the data warehouse 530 (function 555 ), as illustrated by data flow 556 , and loads the extracted data to the data mart 535 , as illustrated by data flow 558 , for use by the data mining analysis.
  • the data mining extractor 520 then performs a data mining training analysis (function 560 ) using the data from the data mart 535 , as illustrated by data flow 562 .
  • the data mining extractor 520 updates the appropriate data mining model in data mining model 565 with the results of the data mining training analysis, as illustrated by data flow 564 .
  • the data mining extractor 520 uses the results of the data mining training analysis from the data mining model 565 , as illustrated by data flow 566 , to perform a data mining prediction analysis (function 568 ).
  • the data mining extractor 520 stores the results of the data mining prediction analysis in the data mart 535 , as illustrated by data flow 569 .
  • the data mining extractor 520 then performs a data enrichment function (function 570 ) using the results from the data mart 535 , as illustrated by data flow 572 , to load the data mining results into the data warehouse 530 , as illustrated by data flow 574 .
  • After enriching the data warehouse 530 with the data mining analysis results, the data mining extractor 520 returns processing control to the automated data mining task scheduler 510 , as depicted by process flow 576 . When returning processing control, the data mining extractor 520 also reports to the automated data mining task scheduler 510 the successful completion of the data mining analyses and enrichment of the data warehouse. To do so, the data mining extractor 520 may report a return code that is consistent with a successful process.
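  • The control flow between the task scheduler and the two extractors can be sketched as a small process chain in which each task reports a return code. The convention that 0 means success, and all function names, are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

SUCCESS = 0  # assumed return code for a successful task


def run_process_chain(tasks: List[Tuple[str, Callable[[], int]]]) -> List[str]:
    """Run tasks in order; stop as soon as one reports a non-zero return code.

    Returns the names of the tasks that completed successfully.
    """
    completed = []
    for name, task in tasks:
        return_code = task()  # control returns to the scheduler here
        if return_code != SUCCESS:
            break  # dependent tasks are not triggered
        completed.append(name)
    return completed


log: List[str] = []


def transaction_data_extractor() -> int:
    log.append("load transaction data into the warehouse")
    return SUCCESS


def data_mining_extractor() -> int:
    log.append("mine the data and enrich the warehouse")
    return SUCCESS


def failing_extractor() -> int:
    log.append("extraction failed")
    return 1


completed = run_process_chain([
    ("transaction data extractor", transaction_data_extractor),
    ("data mining extractor", data_mining_extractor),
])
log_after_success = list(log)

log.clear()
halted = run_process_chain([
    ("transaction data extractor", failing_extractor),
    ("data mining extractor", data_mining_extractor),
])
```

  • In the second chain the data mining extractor is never started, which mirrors the rule that it is initiated only after the successful completion of the transaction data extractor.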
  • the use of a task scheduler, here in the form of a process chain, to link the task of extracting the transaction data from a source system with the task of performing the data mining process may be useful.
  • the process for loading transaction data to the data warehouse is combined with an immediate data mining analysis and enrichment of the data warehouse data with the results of the analysis.
  • the linkage of the transactional data availability with the automatic performance of the data mining analysis may reduce, perhaps even substantially reduce, the lag between the time at which the transaction data first becomes available in the data warehouse and the time at which the data enriched with data mining analysis results becomes available in the data warehouse.
  • it also may be useful to use a type of data loading computer program for both (1) the load of the transaction data to the data warehouse and (2) the performance of the data mining analysis and the enrichment of the data warehouse data with the data mining analysis results.
  • This may be particularly true when a data mart is used for temporary storage of data from the data warehouse in which an extraction is to be performed.
  • a task scheduler may be available only for use with a data loading process and may not be available for general use with a data mining process. In such a case, wrapping the data mining process within a data loading process allows a data mining process to be automatically triggered at a predetermined time on a scheduled basis (such as daily, weekly or monthly at a particular time).
  • the use of the same types of techniques, procedures and processes for both a data extraction process and an analytical process of a data mining run may be useful. For example, it may enable the use of a common software tool for administering a data warehouse and a data mining run, particularly when data is extracted from a data warehouse for use by a data mining run.
  • the use of the same techniques, procedures and processes for both a data extraction process and an analytical process also may make a function available to both processes when the function was previously available only to one of the analytical process or the data extraction process. It also may encourage consistent behavior from a data warehouse process and a data mining analysis, which may, in turn, reduce the amount of training required by a system administrator.
  • FIG. 6 depicts a process 600 supported by a data mining workbench for defining an automated data mining process.
  • the data mining workbench presents a user interface to guide a user to define a particular type of automated data mining analysis.
  • the data mining workbench uses a generic template for a particular type of data mining analysis, receives user-entered information applicable to the generic template, receives scheduling information from the user, and generates a particular automated data mining process.
  • the process 600 to define an automated data mining process begins when the data mining workbench presents a user interface for the user to enter identifying information for the data mining analysis process being defined (step 610 ). For example, the user may enter a name or another type of identifier and a description of the data mining analysis.
  • the data mining workbench then presents an interface that allows a user to identify the data mining analysis template to be used (step 620 ).
  • the data mining workbench may present a list of data mining analysis templates, such as a template for a particular type of customer loyalty analysis or a template for a particular type of cross-selling analysis, from which the user selects.
  • the data mining workbench presents an appropriate interface to guide the user through the process of entering the user-configuration data mining information to configure the template for the particular analysis being defined (step 630 ).
  • the user enters an identifier for the particular marketing campaign to be analyzed, the particular customer attributes to be analyzed, the attributes to be measured to determine the effect of the marketing campaign (such as a sales attribute), and the filter criteria for selecting the records to be analyzed.
  • the data mining analysis template includes a portion for the transaction data extraction, such as transaction data extractor 515 in FIG. 5, and a portion for data mining extraction, such as data mining extractor 520 in FIG. 5.
  • the user then schedules when the data mining analysis process should be automatically triggered (step 640 ). For example, the user may identify a recurring pattern of dates and times for triggering the data mining analysis. This may be accomplished through the presentation of a calendar or the presentation of a set of schedule options from which the user selects.
  • the data mining workbench then stores a version of the generic data mining analysis template with the user-entered information (step 650 ).
  • the data mining workbench may use the name or identifier entered by the user as the name of the stored automated data mining process.
  • the automated data mining process may be added to a task scheduler and scheduled based on the information the user entered.
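  • Process 600 can be summarized as combining a generic template with user-entered settings and a schedule. The following sketch uses hypothetical class, function, and setting names; the free-text schedule string stands in for whatever representation a real task scheduler would require.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisTemplate:
    name: str
    required_settings: tuple  # settings the user must supply (step 630)


@dataclass
class AutomatedMiningProcess:
    identifier: str
    description: str
    template: AnalysisTemplate
    settings: dict
    schedule: str


def define_process(identifier, description, template, settings, schedule):
    """Create a stored process definition from a template (steps 610-650)."""
    missing = [key for key in template.required_settings if key not in settings]
    if missing:
        raise ValueError(f"missing settings: {', '.join(missing)}")
    return AutomatedMiningProcess(
        identifier, description, template, dict(settings), schedule
    )


# Hypothetical template for a marketing campaign analysis.
campaign_template = AnalysisTemplate(
    name="marketing campaign analysis",
    required_settings=(
        "campaign_id",
        "customer_attributes",
        "measured_attributes",
        "filter_criteria",
    ),
)

process = define_process(
    identifier="campaign-effect",
    description="Effect of a campaign on sales",
    template=campaign_template,
    settings={
        "campaign_id": "CAMPAIGN-1",
        "customer_attributes": ["region", "age_group"],
        "measured_attributes": ["sales"],
        "filter_criteria": {"region": "north"},
    },
    schedule="first Saturday of each month, 01:00",
)
```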

Abstract

A data mining run that includes special analyses is triggered directly after having loaded new data in a data warehouse environment, to enrich the newly loaded data by new attributes. The process automates replicating transaction data from a source system into a data warehouse, triggering a data mining procedure (such as a training or a prediction procedure) that enriches the data with new attributes, and triggering the upload of the enriched data back into the data warehouse.

Description

    TECHNICAL FIELD
  • This description relates to loading and using data in a data warehouse on a computer system. [0001]
    BACKGROUND
  • Computer systems often are used to manage and process business data. To do so, a business enterprise may use various application programs running on one or more computer systems. Application programs may be used to process business transactions, such as taking and fulfilling customer orders, providing supply chain and inventory management, performing human resource management functions, and performing financial management functions. Data used in business transactions may be referred to as transaction data or operational data. Often, transaction processing systems provide real-time access to data, and such systems may be referred to as on-line transaction processing (OLTP) systems. [0002]
  • Application programs also may be used for analyzing data, including analyzing data obtained through transaction processing systems. In many cases, the data needed for analysis may have been produced by various transaction processing systems and may be located in many different data management systems. A large volume of data may be available to a business enterprise for analysis. [0003]
  • When data used for analysis is produced in a different computer system than the computer system used for analysis or when a large volume of data is used for analysis, the use of an analysis data repository separate from the transaction computer system may be helpful. An analysis data repository may store data obtained from transaction processing systems and used for analytical processing. The analysis data repository may be referred to as a data warehouse or a data mart. The term data mart typically is used when an analysis data repository stores data for a portion of a business enterprise or stores a subset of data stored in another, larger analysis data repository, which typically is referred to as a data warehouse. For example, a business enterprise may use a sales data mart for sales data and a financial data mart for financial data. [0004]
  • Analytical processing may be used to analyze data stored in a data warehouse or other type of analytical data repository. When an analytical processing tool accesses the data warehouse on a real-time basis, the analytical processing tool may be referred to as an OLAP system. An OLAP system may support complex analyses using a large volume of data. An OLAP system may produce an information model using a three-dimensional presentation, which may be referred to as an information cube or a data cube. [0005]
  • One type of analytical processing identifies relationships in data stored in a data warehouse or another type of data repository. The process of identifying data relationships by means of an automated computer process may be referred to as data mining. Sometimes a data mining mart may be used to store a subset of data extracted from a data warehouse. A data mining process may be performed on data in the data mining mart, rather than the data mining process being performed on data in the data warehouse. The results of the data mining process then are stored in the data warehouse. The use of a data mining mart that is separate from a data warehouse may help decrease the impact on the data warehouse of a data mining process that requires significant system resources, such as processing capacity or input/output capacity. Also, data mining marts may be optimized for access by data mining analyses that provide faster and more flexible access. [0006]
  • One type of data relationship that may be identified by a data mining process is an associative relationship in which one data value is associated or otherwise occurs in conjunction with another data value or event. For example, an association between two or more products that are purchased by a customer at the same time may be identified by analyzing sales receipts or sales orders. This may be referred to as a sales basket analysis or a cross-selling analysis. The association of product purchases may be based on a pairing of two products, such as when a customer purchases product A, the customer also purchases product B. The analysis may also reveal relationships between three products, such as when a customer purchases product A and product B, the customer also typically purchases product C. The results of a cross-selling analysis may be used to promote associated products, such as through a marketing campaign that promotes the associated products or by locating the associated products near one another in a retail store, such as by locating the products in the same aisle or shelf. [0007]
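  • A pairwise association of the kind described above can be illustrated by counting how often two products appear on the same sales receipt. This is a minimal sketch with made-up receipts and hypothetical function names, not a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets: list) -> Counter:
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


def confidence(baskets: list, antecedent: str, consequent: str) -> float:
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


# Made-up sales receipts, each listing the products bought together.
baskets = [["A", "B"], ["A", "B", "C"], ["A", "C"], ["B"]]
counts = pair_counts(baskets)
c_implies_a = confidence(baskets, "C", "A")
```

  • Here every receipt that contains product C also contains product A, so the rule "a customer who purchases C also purchases A" holds for all observed receipts.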
  • Customers that are at risk of not renewing a sales contract or not purchasing products in the future also may be identified by data mining. Such an analysis may be referred to as a churn analysis in which the likelihood of churn refers to the likelihood that a customer will not purchase products or services in the future. A customer at risk of churning may be identified based on having similar characteristics to customers that have already churned. The ability to identify a customer at risk of churning may be advantageous, particularly when steps may be taken to reduce the number of customers who do churn. A churn analysis may also be referred to as a customer loyalty analysis. [0008]
  • For example, in the telecommunications industry a customer may be able to switch from one telecommunications provider to another relatively easily. A telecommunications provider may be able to identify, using data mining techniques, particular customers that are likely to switch to a different telecommunications provider. The telecommunications provider may be able to provide an incentive to at-risk customers to decrease the number of customers who switch. [0009]
  • In general, using data for special data analysis, such as the application of data mining techniques, involves a fixed sequence of processes, in which each process occurs only after the completion of a predecessor process. For example, in a data warehouse that uses a separate data mining mart for the performance of a data mining process, three processes may need to be performed in order. First, data must be loaded to a data warehouse from a transaction data management system. Second, data from the data warehouse must be copied to a data mining mart and the data mining process must be performed. Third, the enriched or new data that results from the data mining process must be loaded to the data warehouse. Each of those processes may be triggered separately, often by different users. As a result, the data mining process is performed separately from the loading of the new data to the data warehouse. In some cases, performing the data mining process may occur days, or even weeks, after the data has been loaded from the transaction processing system and is available for analysis. [0010]
  • A delay in performing data mining analysis may be problematic when the results of the analysis are most useful at a particular time. For example, the value of a churn prediction for a particular customer or group of customers may be time-sensitive. After a customer purchases a service or product elsewhere, the opportunity of the business enterprise to influence the behavior of the customer is lost. When the identification of a high likelihood of churning occurs after the customer has been lost, the data mining result is wasted. [0011]
  • Some aspects of creating or using a data warehouse may be automated, that is, initiated without user manipulation. For example, an automated software agent may be employed to collect data from various distributed databases for a data warehouse. Using an OLAP system, a report or other type of output may be automatically generated and sent to various receiving devices, such as a personal digital assistant, a printer, or a pager. When transaction data is input to a transaction processing system, the online transaction data may be automatically summarized and stored as summary data. [0012]
    SUMMARY
  • Generally, the invention automates the triggering of special analyses directly after having loaded new data in a data warehouse environment to enrich the newly loaded data with new attributes. The invention automates, without requiring user manipulation, copying data from a data warehouse to a data mining mart, the triggering of a data mining procedure (such as a training or a prediction procedure) that enriches the data with new attributes, and the triggering of the upload of the enriched data to the data warehouse. The invention also may automate, without requiring user manipulation, the loading of transaction data from a source system into a data warehouse before triggering the data mining process. One area where the invention may find specific applicability is in performing a data mining procedure on a regular, predetermined basis. For example, sales receipts for a particular month may be automatically loaded into a data warehouse and analyzed for associative sales relationships. Another example is performing periodic analysis of customer activity to identify customers that are at risk of churning for the purpose of influencing customer behavior. [0013]
  • In one general aspect, a data mining process may be automatically triggered. An analytical process is triggered based on the presence of data in a data source that is used for analytical processing. The analytical process is performed on data from the data source after the analytical process has been triggered. The analytical process uses a procedure that also is usable in a data extraction process. The created data attribute is stored in the data source. [0014]
  • Implementations may include one or more of the features noted above and one or more of the following features. For example, the analytical process may be triggered based on the completion of a computer program for loading data to the data source that is used for analytical processing. [0015]
  • Also, data may be extracted from a data source used for transaction processing and loaded to the data source that is used for analytical processing. A person may initiate at most the step of extracting data from the data source used for transaction processing or the step of loading the extracted data. The occurrence of a predetermined date and time may trigger extracting the data or may trigger loading the data. [0016]
  • In addition to extracting data from a transaction data source, data also may be extracted from the data source that is used for analytical processing and loaded to temporary data storage. The analytical process may be performed on the data stored in the temporary data storage. [0017]
  • The types of analytical processes that may be triggered include an analytical process to determine a relationship between two data values in the data source, or determine a relationship between two data values that predict a likelihood of whether a particular customer will fail to purchase a service or product in the future. The analytical process also may apply a relationship that has been previously-determined to data values in the data source. The analytical process also may identify products or services that are purchased in the same transaction. For example, the analytical process may determine the likelihood of whether a particular customer will fail to purchase a service or a product in the future. The likelihood may be based on characteristics associated with customers who have been identified as failing to purchase a service or a product. [0018]
  • Implementations of the techniques discussed above may include a method or process, a system or apparatus, or computer software on a computer-accessible medium. The details of one or more implementations of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims. [0019]
  • DESCRIPTION OF DRAWINGS
  • FIG. 1 is a block diagram of a system incorporating various aspects of the invention. [0020]
  • FIG. 2 is a block diagram illustrating the enrichment of data stored in the data warehouse based on an automated data mining run. [0021]
  • FIGS. 3 and 4 are flow charts of processes to automate a data mining process. [0022]
  • FIG. 5 is a block diagram of the components of a software architecture for automating a data mining run. [0023]
  • FIG. 6 is a block diagram of a process to use a data mining workbench to design an automated data mining process. [0024]
  • DETAILED DESCRIPTION
  • FIG. 1 shows a block diagram of a [0025] system 100 of networked computers, including a computer system 110 for a data warehouse and transaction computer systems 120 and 130. The loading of new data to the data warehouse 110 from the transaction computer systems 120 and 130 triggers a special analysis to enrich the newly loaded data with new attributes.
  • The [0026] system 100 includes a computer system 110 for a data warehouse, a client computer 115 used to administer the data warehouse, and transaction computer systems 120 and 130, all of which are capable of executing instructions on data. As is conventional, each computer system 110, 120 or 130 includes a server 140, 142 or 144 and a data storage device 145, 146 or 148 associated with each server. Each of the data storage devices 145, 146 and 148 includes data 150, 152 or 154 and executable instructions 155, 156 or 158. A particular portion of data, here referred to as business objects 162 or 164, is stored in computer systems 120 and 130, respectively. Each of business objects 162 or 164 includes multiple business objects. Each business object in business objects 162 or 164 is a collection of data attribute values, and typically is associated with a principal entity represented in a computing device or a computing system. Examples of a business object include information about a customer, an employee, a product, a business partner, a sales invoice, and a sales order. A business object may be stored as a row in a relational database table, an object instance in an object-oriented database, data in an extensible mark-up language (XML) file, or a record in a data file. Attributes are associated with a business object. In one example, a customer business object may be associated with a series of attributes including a customer number uniquely identifying the customer, a first name, a last name, an electronic mail address, a mailing address, a daytime telephone number, an evening telephone number, date of first purchase by the customer, date of the most recent purchase by the customer, birth date or age of customer, and the income level of customer. In another example, a sales order business object may include a customer number of the purchaser, the date on which the sales order was placed, and a list of products, services, or both products and services purchased.
  • The data [0027] warehouse computer system 110 stores a particular portion of data, here referred to as data warehouse 165. The data warehouse 165 is a central repository of data, extracted from transaction computer system 120 or 130 such as business objects 162 or 164. The data in the data warehouse 165 is used for special analyses, such as data mining analyses used to identify relationships among data. The results of the data mining analysis also are stored in the data warehouse 165.
  • The data [0028] warehouse computer system 110 includes an automated data mining process 168 having a data warehouse upload process 170 and a data mining analysis process 172. The data warehouse upload process 170 includes executable instructions for automatically extracting, transmitting and loading data from the transaction computer systems 120 and 130 to the data warehouse computer system 110. The data mining analysis process 172 includes executable instructions for triggering a data mining analysis in the data warehouse computer system 110, and enriching the data in the data warehouse 165 with new attributes determined by the data mining analysis, as described more fully below.
  • In some implementations, the data [0029] warehouse computer system 110 also may include a data mining mart 174 that temporarily stores data from the data warehouse 165 for use in data mining. In such a case, the data mining analysis process 172 also may extract data from the data warehouse 165, store the extracted data to the data mining mart 174, trigger a data mining analysis that operates on the data from the data mining mart 174, and enrich the data in the data warehouse 165 with the new attributes determined by the data mining analysis.
  • The data [0030] warehouse computer system 110 is capable of delivering and exchanging data with the transaction computer systems 120 and 130 through wired or wireless communication pathways 176 and 178, respectively. The data warehouse computer system 110 also is able to communicate with the on-line client 115 that is connected to the computer system 110 through a communication pathway 182.
  • The data [0031] warehouse computer system 110, the transaction computer systems 120 and 130, and the on-line client 115 may be arranged to operate within or in concert with one or more other systems, such as, for example, one or more LANs (“Local Area Networks”) and/or one or more WANs (“Wide Area Networks”). The on-line client 115 may be a general-purpose computer that is capable of operating as a client of the application program (e.g., a desktop personal computer, a workstation, or a laptop computer running an application program), or a more special-purpose computer (e.g., a device specifically programmed to operate as a client of a particular application program). The on-line client 115 uses communication pathway 182 to communicate with the data warehouse computer system 110. For brevity, FIG. 1 illustrates only a single on-line client 115 for system 100.
  • At predetermined times, the data [0032] warehouse computer system 110 initiates an automated data mining process. This may be accomplished, for example, through the use of a task scheduler (not shown) that initiates the automated data mining process at a particular day and time. In general, the automated data mining process 1) uses the data warehouse upload process 170 to initiate the extraction, transformation and loading of data to the data warehouse 165 from the source systems 120 and 130, and 2) uses the data mining analysis process 172 to initiate a data mining run that creates new attributes by performing a special analysis of the data and loads the new attributes to the data warehouse 165 without user manipulation. A particular automated data mining run may be scheduled as a recurring event based on the occurrence of a predetermined time or date (such as the first day of a month, every Saturday at one o'clock a.m., or the first day of a quarter). Examples of automated data mining processes are described more fully in FIGS. 3-5.
  • More specifically, the data [0033] warehouse computer system 110 uses the automated data mining process 168 to initiate the data warehouse upload process 170. The data warehouse upload process 170 extracts or copies a portion of data, such as all or some of business objects 162, from the data storage 146 of the transaction computer system 120. The extracted data is transmitted over the connection 176 to the data warehouse computer system 110, where the extracted data is stored in data warehouse 165. The data warehouse computer system 110 also may transform the extracted data from a format suitable to computer system 120 into a different format that is suitable for the data warehouse computer system 110. Similarly, the data warehouse computer system 110 may extract a portion of data from data storage 148 of the computer system 130, such as all or some of business objects 164, transmit the extracted data over connection 178, store the extracted data in the data warehouse 165, and optionally transform the extracted data.
  • After the data have been extracted from the source computer systems (here, [0034] transaction computer systems 120 and 130), the automated data mining process 168 initiates the data mining analysis 172. The data mining analysis 172 performs a particular data mining procedure to analyze data from the data warehouse 165, enrich the data with new attributes, and store the enriched data in the data warehouse 165. A particular data mining procedure also may be referred to as a data mining run. There are different types of data mining runs. A data mining run may be a training run in which data relationships are determined, a prediction run that applies a determined relationship to a collection of data relevant to a future event, such as a customer failing to renew a service contract or make another purchase, or both a training run and a prediction run. The prediction run results in the creation of a new attribute for each business object in the data warehouse 165. The creation of a new attribute may be referred to as data enrichment. For example, when the data mining run predicts the likelihood that each customer will churn, an attribute for the likelihood of churn for each customer is stored in the data warehouse 165. That is, the data warehouse 165 is enriched with the new attribute.
  • The combination of the data warehouse upload [0035] process 170 and the data mining analysis 172 in the automated data mining process 168 may increase the coupling of the data mining with the upload of new data to the data warehouse, which, in turn, may reduce the time until the results of new data mining analyses are available. The combination of the data warehouse upload process 170 and the data mining analysis 172 in the automated data mining process 168 also enables the use of the same monitoring process to monitor both the data warehouse upload process 170 and the data mining analysis process 172, which, in turn, may help simplify the monitoring of the automated data mining process 168.
  • The data [0036] warehouse computer system 110 also includes a data warehouse monitor 180 that reports on the administration of the automated data mining process 168. For example, an end user of online client 115 is able to view when an automated data mining process is scheduled to next occur, the frequency or other basis on which the automated data mining process is scheduled, and the status of the automated data mining process. For example, the end user may be able to determine that the automated data mining process 168 is executing. When the automated data mining process 168 is executing, the end user may be able to view the progress and status of each of the steps within the data mining method. For example, the end user may be able to view the time that the data warehouse upload process 170 was initiated. The ability to monitor the execution of the automated data mining process may be useful to ensure that the automated data mining process 168 is operating as desired. In some implementations, when a problem is detected in the automation of a data mining process, a notification of the problem may be sent to an administrator for the data warehouse or other type of end user. The use of the data warehouse monitor 180 with both the data upload process 170 and the data mining analysis 172 may be advantageous. For example, a system administrator or another type of user need only access a single monitoring process (here, data warehouse monitor 180) to monitor both sub-processes (here, the data upload process 170 and the data mining analysis 172). The use of the same monitoring process for different sub-processes may result in consistent process behavior across the different sub-processes. The use of the same monitoring process also may reduce the amount of training required for system administrators to be able to use the data warehouse monitor 180.
  • The ability to trigger special analyses directly after having loaded new data in a data warehouse environment to enrich the newly loaded data by new attributes may be useful. Multiple users, often geographically or organizationally distributed, are typically responsible for performing different aspects of the process, all aspects of which must be completed before the newly loaded data is enriched by the special analyses. This may result in a delay from the time when the transaction data is available for analysis to the time when the results of the analysis are available. The delay may be significant or may negatively impact the business enterprise. For example, a business enterprise may be harmed by lost sales by the delay of product arrangements in a retail store based on a cross-selling analysis or by the delay of a promotional marketing campaign to target at-risk customers. [0037]
  • FIG. 2 shows the [0038] results 200 of enriching the data stored in the data warehouse based on an automated data mining process. The results 200 are stored in a relational database system that logically organizes data into a database table. The database table arranges data associated with an entity (here, a customer) in a series of columns 210-216 and rows 220-223. Each column 210, 211, 212, 213, 214, 215, or 216 describes an attribute of the customer for which data is being stored. Each row 220, 221, 222 or 223 represents a collection of attribute values for a particular customer identified by a customer identifier 210. The attributes 210-215 were extracted from a source system, such as a customer relationship management system, and loaded into the data warehouse. The attribute 216 represents the likelihood of churn for each customer 220, 221, 222 and 223. The likelihood-of-churn attribute 216 was created and loaded into the data warehouse by an automated data mining process, such as the automated data mining process described in FIGS. 1, 3 and 4.
  • FIG. 3 illustrates an automated [0039] data mining process 300. The automated data mining process 300 may be performed by a processor on a computing system, such as data warehouse computer system 110 of FIG. 1. The automated data mining processor is directed by a method, script, or other type of computer program that includes executable instructions for performing the automated data mining process 300. An example of such a collection of executable instructions is the automated data mining process 168 of FIG. 1.
  • The automated [0040] data mining process 300 includes an extract, transform and load (ETL) sub-process 310, a data mining sub-process 320, and a data enrichment sub-process 330. The automated data mining process 300 begins at a predetermined time and date, typically a recurring predetermined time and date. In some implementations, a system administrator or another type of user may manually initiate the automated data mining process 300. In such a case, the automated data mining process 300, once initiated, automatically triggers sub-processes 310, 320 and 330 without requiring further user manipulation.
  • For example, a churn management automated data mining process may be associated with a script that includes a remote procedure call to extract data from one or more source systems in [0041] step 340, a computer program to transform the extracted data, a database script for loading the data warehouse with the transformed data, and a computer program to perform a churn analysis on the customer data in the data warehouse. Thus, once the script for the churn management automated data mining process has been initiated, by a task scheduler or other type of computer program, the tasks are then automatically triggered based on the completion of the previous script component.
  • The data warehouse processor extracts from a source system appropriate data and transmits the extracted data to the data warehouse (step [0042] 340). For example, the data warehouse processor may execute a remote procedure call on the source system to trigger the extraction and transmission of data from the source system to the computer system on which the data warehouse resides. Alternatively, the data warehouse processor may connect to a web service on the source system to request the extraction and transmission of the data. Typically, the data to be extracted is data from a transaction system, such as an OLTP system. The data extracted may be a complete set of the appropriate data (such as all sales orders or all customers) from the source system, or may be only the data that has been changed since the last extraction. The processor may extract and transmit the data from the source system in a series of data groups, such as data blocks. The extraction may be performed either as a background process or an on-line process, as may the transmission. The ability to extract and transmit data in groups, extract and transmit only changed data, and extract and transmit as a background process may collectively or individually be useful, particularly when a large volume of data is to be extracted and transmitted.
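The extraction of only changed data in a series of data groups, as described above, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `extract_changed`, the `changed_at` field, and the in-memory record list are hypothetical stand-ins for whatever change tracking the source system provides.

```python
from datetime import date


def extract_changed(records, last_extraction, block_size):
    """Yield records changed after last_extraction, grouped into blocks.

    `records` is a list of dicts carrying a hypothetical `changed_at` date;
    a real source system would expose its own change indicator.
    """
    changed = [r for r in records if r["changed_at"] > last_extraction]
    # Hand the changed records over in fixed-size groups so that a large
    # volume of data never has to be transmitted in a single piece.
    for start in range(0, len(changed), block_size):
        yield changed[start:start + block_size]
```

Passing a date earlier than every record's `changed_at` yields the complete data set, which corresponds to the full-extraction case.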
  • In some implementations, the extracted data also may be transformed from the format used by the source system to a different format used by the data warehouse (step [0043] 345). The data transformation may include transforming data values representing a particular attribute to a different field type and length that is used by the data warehouse. The data transformation also may include translating a data code used by the source system to a corresponding but different data code used by the data warehouse. For example, the source system may store a country value using a numeric code (for example, a “1” for the United States and a “2” for the United Kingdom) whereas the data warehouse may store a country value as a textual abbreviation (for example, “U.S.” for the United States and “U.K.” for the United Kingdom). The data transformation also may include translating a proprietary key numbering system in which primary keys are created by sequentially allocating numbers within an allocated number range to a corresponding GUID (“globally unique identifier”) key that is produced from a well-known algorithm and is able to be processed by any computer system using the well-known algorithm. The processor may use a translation table or other software engineering or programming techniques to perform the transformations required. For example, the processor may use a translation table that translates the various possible values from one system to another system for a particular data attribute (for example, translating a country code of “1” to “U.S.” and “2” to “U.K.” or translating a particular proprietary key to a corresponding GUID key).
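A translation table of the kind described above might look like the following sketch. The table contents mirror the country-code example in the text; the function names, the record layout, and the use of name-based UUIDs (`uuid.uuid5`) as the well-known GUID algorithm are assumptions made for illustration only.

```python
import uuid

# Translation table: source-system numeric country code -> warehouse abbreviation.
COUNTRY_CODES = {"1": "U.S.", "2": "U.K."}

# Hypothetical namespace under which proprietary keys are mapped to GUIDs.
KEY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "warehouse.example")


def translate_country(code):
    """Translate a source country code, failing loudly on unknown codes."""
    try:
        return COUNTRY_CODES[code]
    except KeyError:
        raise ValueError(f"no translation for country code {code!r}")


def key_to_guid(source_system, proprietary_key):
    """Derive a deterministic GUID from a source system name and its key.

    uuid5 is one well-known algorithm that any system can recompute;
    the patent does not say which algorithm is used.
    """
    return str(uuid.uuid5(KEY_NAMESPACE, f"{source_system}:{proprietary_key}"))


def transform_record(source_system, record):
    """Convert one extracted record into the (assumed) warehouse format."""
    return {
        "customer_guid": key_to_guid(source_system, record["customer_no"]),
        "country": translate_country(record["country"]),
    }
```

Because the GUID is derived rather than looked up, the same proprietary key always maps to the same warehouse key, and keys from different source systems do not collide even when their number ranges overlap.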
  • Other types of data transformation also may be performed by the data warehouse processor. For example, the processor may aggregate data or generate additional data values based on the extracted data. For example, the processor may determine a geographic region for a customer based on the customer's mailing address or may determine the total amount of sales to a particular customer that is associated with multiple sales orders. [0044]
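The aggregation mentioned above, totaling sales for a customer across multiple sales orders, can be sketched in a few lines. The field names `customer_no` and `amount` are hypothetical; the patent does not define a record layout.

```python
from collections import defaultdict


def total_sales_by_customer(sales_orders):
    """Sum the order amounts for each customer across all of its sales orders."""
    totals = defaultdict(float)
    for order in sales_orders:
        totals[order["customer_no"]] += order["amount"]
    return dict(totals)
```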
  • The data warehouse processor loads the extracted data into data storage associated with the data warehouse, such as the [0045] data warehouse 165 of FIG. 1 (step 350). The data warehouse processor may execute a computer program having executable instructions for loading the extracted data into the data storage and identified by the automated data mining method directing the process 300. For example, a database script may be executed that includes database commands to load the data to the data warehouse. The use of a separate computer program for loading the data may increase the modularity of the data mining method, which, in turn, may improve the efficiency of modifying the automated data mining process 300. Steps 340-350 may be referred to as the ETL sub-process 310.
  • After completing the [0046] ETL sub-process 310, the data warehouse processor automatically triggers a data mining process (step 360). This may be accomplished, for example, by using a script or other type of computer program to control the execution of multiple programs.
  • The data warehouse processor performs a data mining run (step [0047] 365). To do so, the data warehouse processor may apply a data mining model or another type of collection of data mining rules that defines the type of analysis to be performed. The data mining model may be applied to all or a portion of the data in the data warehouse. In some implementations, the data warehouse processor may store the data to be used in the data mining run in transient or persistent storage peripheral to the data warehouse processor where the data is accessed during the data mining run. This may be particularly advantageous when the data warehouse includes a very large volume of data and/or the data warehouse also is used for OLAP processing. In some cases, the storage of the data to transient or persistent storage may be referred to as extracting or staging the data to a data mart for data mining purposes.
  • The data mining run may be a training run or a prediction run. In some implementations, both a training run and a prediction run may be performed during [0048] process 300. The results of the data mining run are stored in a temporary data structure, from which the data warehouse processor may later copy the results to the data warehouse. For example, in a customer churn analysis data mining process, the likelihood of churn for each customer may be assessed and stored in a temporary results data structure. Steps 360-365 may be referred to as a data mining sub-process 320.
  • When the [0049] data mining sub-process 320 is completed, the data warehouse processor stores the data mining results in the data warehouse (step 370). For example, a new column for the data mining results may be added to a table in a relational data management system being used for the data warehouse. In a customer churn analysis data mining process, the likelihood of churn for each customer may be added as a new attribute in the data warehouse and appropriately populated with the likelihood data generated when the data mining run was performed in step 365. The process of storing the data created by the data mining run in the data warehouse may be referred to as a data enrichment sub-process 330.
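Adding a column for the data mining results and populating it can be sketched with Python's built-in `sqlite3` module. SQLite merely stands in for the relational data management system, and the table name `customer` and the column names `customer_no` and `churn_likelihood` are assumptions for the example.

```python
import sqlite3


def enrich_with_churn(conn, churn_by_customer):
    """Add a churn-likelihood column if missing, then fill it per customer.

    `churn_by_customer` maps a customer number to the likelihood produced
    by the data mining run (the temporary results structure).
    """
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
    if "churn_likelihood" not in columns:
        conn.execute("ALTER TABLE customer ADD COLUMN churn_likelihood REAL")
    conn.executemany(
        "UPDATE customer SET churn_likelihood = ? WHERE customer_no = ?",
        [(likelihood, number) for number, likelihood in churn_by_customer.items()],
    )
    conn.commit()
```

Checking for the column first lets the same function run on every scheduled cycle: the first run creates the attribute, and later runs only update its values.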
  • In one example, the [0050] process 300 may be used for an automated customer-churn data mining process. A system administrator develops computer programs, each of which is executed to accomplish a portion of the automated customer-churn data mining process. The system administrator also develops a script that identifies each of the computer programs to be executed and the order in which the computer programs are to be executed to accomplish the automated customer-churn data mining process. The system administrator, using a task scheduling program, schedules the automated customer-churn data mining script to be triggered on a monthly basis, such as on the first Saturday of each month, beginning at one o'clock a.m.
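The monthly trigger time in this example can be computed as in the sketch below. The function name `first_saturday_run` is hypothetical, and a real task scheduler would normally provide this kind of recurrence rule itself.

```python
from datetime import datetime, timedelta


def first_saturday_run(year, month, hour=1):
    """Return the datetime of the first Saturday of the month at `hour` o'clock."""
    first_of_month = datetime(year, month, 1, hour)
    # datetime.weekday() numbers Monday as 0, so Saturday is 5.
    days_until_saturday = (5 - first_of_month.weekday()) % 7
    return first_of_month + timedelta(days=days_until_saturday)
```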
  • At the scheduled time, the task scheduling program triggers the data warehouse processor to execute the automated customer-churn data mining script. The data warehouse processor executes a remote procedure call in a customer relationship management system to extract customer data and transmit the data to the data warehouse computer system. The data warehouse computer system receives and stores the extracted customer data. The data warehouse processor executes a computer program, as directed by the executing automated customer-churn data mining process script, to transform the customer data to a format usable by the data warehouse. [0051]
  • The data warehouse processor continues to execute the automated customer-churn data mining process script, which then triggers a data mining training run to identify hidden relationships within the customer data. Specifically, the characteristics of customers who have not renewed a service contract in the last eighteen months are identified. The characteristics identified may include, for example, an income above or below a particular level, a geographic region in which the non-returning customer resides, the types of service contract that were not renewed, and the median age of a non-renewing customer. [0052]
  • The data warehouse processor then, under the continued direction of the automated customer-churn data mining process script, triggers a data mining prediction run to identify particular customers who are at risk of not renewing a service contract. The prediction is made based on the customer characteristics identified in the data mining training run. The data warehouse processor determines a likelihood-of-churn for each customer. The data warehouse is enriched with the likelihood-of-churn for each customer such that a likelihood-of-churn attribute is added to the customer data in the data warehouse and the likelihood-of-churn value for each customer is stored in the new attribute. [0053]
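The relationship between a training run and a prediction run can be illustrated with a deliberately simple model. The sketch below assumes each customer is reduced to a single hypothetical "segment" label (for example, a region or income band) and uses the observed churn rate per segment as the likelihood; an actual data mining procedure would use a far richer model.

```python
from collections import defaultdict


def train(history):
    """Training run: learn the churn rate of each customer segment.

    `history` is an iterable of (segment, churned) pairs for past customers.
    """
    counts = defaultdict(lambda: [0, 0])  # segment -> [churned, total]
    for segment, churned in history:
        counts[segment][0] += int(churned)
        counts[segment][1] += 1
    return {seg: churned / total for seg, (churned, total) in counts.items()}


def predict(model, customers, default=0.0):
    """Prediction run: apply the learned rates to current customers.

    `customers` maps a customer number to its segment; segments never seen
    in training receive `default`.
    """
    return {number: model.get(segment, default) for number, segment in customers.items()}
```

The dictionary returned by `predict` plays the role of the likelihood-of-churn values that are then stored as a new attribute in the data warehouse.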
  • In some implementations, when a subsequent likelihood-of-churn value for a customer is determined, such as a likelihood-of-churn value for a customer that is determined in the following month, the likelihood-of-churn value from the previous data mining prediction run may be replaced so that a customer has only one likelihood-of-churn value at any time. In contrast, some implementations may store the new likelihood-of-churn value each month, in addition to a previous value for the likelihood-of-churn, to develop a time-dependent prediction—that is, a new prediction for the same type of prediction is stored each time a prediction run is performed for a customer. The time-dependent prediction may help improve the accuracy of the data mining training runs because the predicted values may be monitored over time and compared with actual customer behavior. [0054]
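The two storage policies described above, replacing the previous value or keeping a time-dependent series, differ only in whether earlier entries are discarded. A minimal sketch, using a hypothetical in-memory `history` dictionary in place of the data warehouse:

```python
def store_prediction(history, customer_no, run_date, likelihood, keep_history):
    """Record a likelihood-of-churn value for one customer.

    With keep_history=False the customer holds only the latest value;
    with keep_history=True every prediction run appends a dated entry.
    """
    entries = history.setdefault(customer_no, [])
    if not keep_history:
        entries.clear()
    entries.append((run_date, likelihood))
```

With the series retained, each dated prediction can later be compared against whether the customer actually churned, which is the monitoring use the text describes.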
  • FIG. 4 illustrates another example of an automated data mining process. In contrast to the automated [0055] data mining process 300 of FIG. 3, automated data mining process 400 replicates data from a data warehouse, such as data warehouse 165 in FIG. 1, to a data mining mart, such as data mining mart 174 of FIG. 1. The data mining process 400 then performs the data mining analysis on data in the data mart, and stores the data mining results as enriched data in the data warehouse.
  • The automated [0056] data mining process 400 may be performed by a processor on a computing system, such as data warehouse computer system 110 of FIG. 1. The automated data mining processor is directed by a method, script, or other type of computer program that includes executable instructions for performing the automated data mining process 400. An example of such a collection of executable instructions is the automated data mining process 168 of FIG. 1.
  • The automated [0057] data mining process 400 includes an extract, transform and load (ETL) sub-process 410, a data mining sub-process 420 that uses a data mart, and a data enrichment sub-process 430. The automated mining process 400 begins at a predetermined time and date, typically a recurring predetermined time and date. The ETL sub-process 410 extracts data from a transactional processing or other type of source system and loads the data to a data warehouse, as described previously with respect to ETL sub-process 310 of FIG. 3.
  • After completing the [0058] ETL sub-process 410, the data warehouse processor automatically triggers a data mining run, as described with respect to step 360 in FIG. 3 (step 440). The data warehouse processor copies data from the data warehouse to the data mining mart for use in a data mining run (step 450). For example, when the data warehouse and the data mining mart are located on the same computer system, the data warehouse processor may insert into database tables of a data mining mart a copy of some of the data rows stored in the data warehouse. Alternatively, when the data warehouse is located on a different computer system than the computer system on which the data mart is located, the data warehouse processor may extract data from the data warehouse on a computer system and transmit the data to the data mart located on a different computer system. The data warehouse processor then may execute a remote procedure call or other collection of executable instructions to load data into the data mart. In some implementations, the data warehouse processor may replicate data from the data warehouse to the data mining mart; that is, the data warehouse processor copies the data to the data mining mart and synchronizes the data mining mart with the data warehouse such that changes made to one of the data warehouse or the data mining mart are reflected in the other. In some implementations, the data warehouse processor may transform the data from the data warehouse before storing the data in the data mining mart.
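For the case in which the data warehouse and the data mining mart reside on the same computer system, copying a subset of rows into the mart can be sketched with `sqlite3`. The table names `customer` and `mart_customer`, their columns, and the date-based selection criterion are assumptions for the example.

```python
import sqlite3


def stage_to_mart(conn, min_last_purchase):
    """Copy recently active customers from the warehouse table to the mart table.

    Returns the number of rows now held in the mart.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mart_customer ("
        "customer_no INTEGER PRIMARY KEY, income REAL, last_purchase TEXT)"
    )
    # The mart is temporary working storage, so refresh it on every run.
    conn.execute("DELETE FROM mart_customer")
    conn.execute(
        "INSERT INTO mart_customer (customer_no, income, last_purchase) "
        "SELECT customer_no, income, last_purchase FROM customer "
        "WHERE last_purchase >= ?",
        (min_last_purchase,),
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM mart_customer").fetchone()[0]
```

Dates are stored here as ISO 8601 strings so that string comparison orders them chronologically.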
  • The data warehouse processor then performs a data mining run, as described in [0059] step 365 in FIG. 3, using data in the data mining mart (step 460). The steps 440-460 may be referred to as a data mining sub-process 420. When the data mining sub-process 420 is completed, the data warehouse processor stores the data mining results in the data warehouse (step 470), as described in step 370 and sub-process 330 in FIG. 3.
  • [0060] FIG. 5 depicts the components of a software architecture 500 for an automated data mining process. The software architecture 500 may be used to implement the automated data mining process 300 described in FIG. 3 or the automated data mining process 400 described in FIG. 4. The software architecture 500 may be implemented, for example, on computer system 110 of FIG. 1. FIG. 5 also illustrates a data flow and a process flow using the components of the software architecture to implement the automated data mining process 400 in FIG. 4.
  • [0061] The software architecture 500 includes an automated data mining task scheduler 510, a transaction data extractor 515, and a data mining extractor 520. The software architecture also includes a transaction processing data management system 525 for a transaction processing system, such as transaction computer system 120 or transaction computer system 130 in FIG. 1. The software architecture also includes a data warehouse 530, such as the data warehouse 165 in FIG. 1, and a data mart 535, such as the optional data mart 174 in FIG. 1.
  • [0062] One example of the automated data mining task scheduler 510 is a process chain for triggering the transaction data extractor 515 and the data mining extractor 520 at a predetermined date and time. In general, a process chain is a computer program that defines particular tasks that are to occur in a particular order at a predetermined date and time. For example, a system administrator or another type of user may schedule the process chain to occur at regular intervals, such as at one o'clock a.m. on the first Saturday of a month, every Sunday at eight o'clock a.m., or at two o'clock a.m. on the first day and the fifteenth day of each month. A process chain may include dependencies between the defined tasks in the process chain such that a subsequent task is not triggered until a previous task has been successfully completed. In this example, the automated data mining task scheduler 510 is a process chain that calls two extractor processes: the transaction data extractor 515 and the data mining extractor 520. The data mining extractor 520 is only initiated after the successful completion of the transaction data extractor 515.
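  • A minimal sketch of such a process chain follows, assuming each task reports success as a boolean. The function names, the boolean convention, and the schedule check are hypothetical and are shown only to illustrate the ordering dependency and one of the example schedules (one o'clock a.m. on the first Saturday of a month).

```python
from datetime import datetime
from typing import Callable, List, Tuple

Task = Tuple[str, Callable[[], bool]]


def is_first_saturday_at_one(now: datetime) -> bool:
    """True at 1:00 a.m. on the first Saturday of a month."""
    # weekday() == 5 is Saturday; the first Saturday falls on day 1 to 7.
    return now.weekday() == 5 and now.day <= 7 and (now.hour, now.minute) == (1, 0)


def run_process_chain(tasks: List[Task]) -> List[str]:
    """Run tasks in order and stop at the first one that reports failure.

    Returns the names of the tasks that completed successfully.
    """
    completed = []
    for name, task in tasks:
        if not task():
            break  # a later task is not triggered after a failure
        completed.append(name)
    return completed


def transaction_data_extractor() -> bool:
    return True  # placeholder for the extract-and-load work


def data_mining_extractor() -> bool:
    return True  # placeholder for the data mining work


chain = [
    ("transaction_data_extractor", transaction_data_extractor),
    ("data_mining_extractor", data_mining_extractor),
]

completed = []
if is_first_saturday_at_one(datetime(2003, 4, 5, 1, 0)):
    completed = run_process_chain(chain)
```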
  • [0063] The automated data mining task scheduler 510 starts the transaction data extractor 515 at a predetermined date and time, as illustrated by process flow 542. In general, an extractor is a computer program that performs the extraction of data from a data source using a set of predefined settings. Typical settings for an extractor include data selection settings that identify the particular data attributes to be extracted and data filter settings that specify the criteria for selecting the particular records to be extracted. For example, an extractor may identify three attributes (customer number, last purchase date, and amount of last purchase) that are to be extracted for all customers that are located in a particular geographic region. The extractor then reads the attribute values for the records that meet the filter condition from the data source, maps the data to the attributes included in the data warehouse, and loads the data to the data warehouse. An extractor also may be referred to as an upload process.
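  • The extractor settings described above can be sketched as a small settings object plus a read, map, and load loop. The class name, field names, sample records, and the warehouse attribute names below are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


@dataclass
class ExtractorSettings:
    attributes: List[str]                    # data selection settings
    record_filter: Callable[[Record], bool]  # data filter settings
    attribute_map: Dict[str, str]            # source name -> warehouse name


def run_extractor(source: Iterable[Record],
                  settings: ExtractorSettings,
                  warehouse: List[Record]) -> int:
    """Read matching records, map the selected attributes, load them.

    Returns the number of records loaded to the warehouse.
    """
    loaded = 0
    for record in source:
        if not settings.record_filter(record):
            continue
        # Keep only the selected attributes, renamed for the warehouse.
        row = {settings.attribute_map.get(a, a): record[a]
               for a in settings.attributes}
        warehouse.append(row)
        loaded += 1
    return loaded


source = [
    {"cust_no": 1, "last_date": "2003-03-01", "last_amt": 40.0, "region": "North"},
    {"cust_no": 2, "last_date": "2003-03-09", "last_amt": 15.0, "region": "South"},
    {"cust_no": 3, "last_date": "2003-04-02", "last_amt": 99.0, "region": "North"},
]
settings = ExtractorSettings(
    attributes=["cust_no", "last_date", "last_amt"],
    record_filter=lambda r: r["region"] == "North",
    attribute_map={"cust_no": "customer_number",
                   "last_date": "last_purchase_date",
                   "last_amt": "last_purchase_amount"},
)
warehouse: List[Record] = []
loaded = run_extractor(source, settings, warehouse)
```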
  • [0064] The transaction data extractor 515 extracts, using predefined settings, data from the transaction processing data management system, as indicated by data flow line 544, and transforms the data as necessary to prepare the data to be loaded to the data warehouse 530. The transaction data extractor 515 then loads the extracted data to the data warehouse 530, as indicated by data flow 546. After the extracted data has been loaded, the transaction data extractor 515 returns processing control to the automated data mining task scheduler 510, as indicated by process flow 548. When returning processing control, the transaction data extractor 515 also reports the successful completion of the extraction.
  • [0065] Based on the successful completion of the transaction data extractor 515, the automated data mining task scheduler 510 starts the data mining extractor 520, as illustrated by process flow 552. In general, the data mining extractor initiates a data mining process using the newly loaded transaction data in the data warehouse 530. The data mining process analyzes the data and writes the results back to the data warehouse.
  • [0066] First, the data mining extractor 520 extracts data from the data warehouse 530 (function 555), as illustrated by data flow 556, and loads the extracted data to the data mart 535, as illustrated by data flow 558, for use by the data mining analysis. The data mining extractor 520 then performs a data mining training analysis (function 560) using the data from the data mart 535, as illustrated by data flow 562. The data mining extractor 520 updates the appropriate data mining model in data mining model 565 with the results of the data mining training analysis, as illustrated by data flow 564.
  • [0067] The data mining extractor 520 uses the results of the data mining training analysis from the data mining model 565, as illustrated by data flow 566, to perform a data mining prediction analysis (function 568). The data mining extractor 520 stores the results of the data mining prediction analysis in the data mart 535, as illustrated by data flow 569.
  • [0068] The data mining extractor 520 then performs a data enrichment function (function 570) using the results from the data mart 535, as illustrated by data flow 572, to load the data mining results into the data warehouse 530, as illustrated by data flow 574. After enriching the data warehouse 530 with the data mining analysis results, the data mining extractor 520 returns processing control to the automated data mining task scheduler 510, as depicted by process flow 576. When returning processing control, the data mining extractor 520 also reports to the automated data mining task scheduler 510 the successful completion of the data mining analyses and enrichment of the data warehouse. To do so, the data mining extractor 520 may report a return code that is consistent with a successful process.
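  • The sequence of functions 555, 560, 568 and 570 can be illustrated with a toy sketch. The "training" here is a deliberately simple threshold on days since last purchase, chosen only to keep the example short; the record layout, the attribute names, the threshold rule, and the use of 0 as the success return code are all assumptions and are not the analysis described in this document.

```python
from statistics import mean
from typing import Dict, List

Record = Dict[str, object]


def data_mining_extractor(warehouse: List[Record], model: Dict[str, float]) -> int:
    """Toy run of extract, train, predict and enrich; returns 0 on success."""
    # Function 555: extract from the warehouse and load into the data mart.
    mart = [dict(row) for row in warehouse]

    # Function 560: training on records whose outcome is already known.
    churned = [r["days_since_purchase"] for r in mart if r["churned"] is True]
    retained = [r["days_since_purchase"] for r in mart if r["churned"] is False]
    # Update the stored model with the training result (toy threshold rule).
    model["threshold"] = (mean(churned) + mean(retained)) / 2

    # Function 568: prediction using the stored model; results go to the mart.
    for r in mart:
        r["churn_risk"] = r["days_since_purchase"] > model["threshold"]

    # Function 570: enrich the warehouse with the new attribute.
    risk_by_customer = {r["customer_no"]: r["churn_risk"] for r in mart}
    for row in warehouse:
        row["churn_risk"] = risk_by_customer[row["customer_no"]]

    return 0  # return code consistent with a successful process


warehouse = [
    {"customer_no": 1, "days_since_purchase": 10, "churned": False},
    {"customer_no": 2, "days_since_purchase": 200, "churned": True},
    {"customer_no": 3, "days_since_purchase": 150, "churned": None},
    {"customer_no": 4, "days_since_purchase": 20, "churned": None},
]
model: Dict[str, float] = {}
return_code = data_mining_extractor(warehouse, model)
```

  • The created attribute (here, churn_risk) ends up in the same store that supplied the input data, which is the enrichment step the text describes.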
  • [0069] The use of a task scheduler, here in the form of a process chain, to link the task of extracting the transaction data from a source system with the task of performing the data mining process may be useful. For example, the process for loading transaction data to the data warehouse is combined with an immediate data mining analysis and enrichment of the data warehouse data with the results of the analysis. The linkage of the transactional data availability with the automatic performance of the data mining analysis may reduce, perhaps even substantially reduce, the lag between the time at which the transaction data first becomes available in the data warehouse and the time at which the data enriched with data mining analysis results becomes available in the data warehouse.
  • [0070] There also may be advantages in using a type of data loading computer program (here, an extractor) for both (1) the load of the transaction data to the data warehouse and (2) the performance of the data mining analysis and the enrichment of the data warehouse data with the data mining analysis results. This may be particularly true when a data mart is used for temporary storage of data from the data warehouse in which an extraction is to be performed. For example, in some data warehousing systems, a task scheduler may be available only for use with a data loading process and may not be available for general use with a data mining process. In such a case, wrapping the data mining process within a data loading process allows a data mining process to be automatically triggered at a predetermined time on a scheduled basis (such as daily, weekly or monthly at a particular time).
  • [0071] More generally, the use of the same types of techniques, procedures and processes for both a data extraction process and an analytical process of a data mining run may be useful. For example, it may enable the use of a common software tool for administering a data warehouse and a data mining run, particularly when data is extracted from a data warehouse for use by a data mining run. The use of the same techniques, procedures and processes for both a data extraction process and an analytical process also may make a function available to both processes when the function was previously available only to one of the analytical process or the data extraction process. It also may encourage consistent behavior from a data warehouse process and a data mining analysis, which may, in turn, reduce the amount of training required by a system administrator.
  • [0072] FIG. 6 depicts a process 600 supported by a data mining workbench for defining an automated data mining process. The data mining workbench presents a user interface to guide a user to define a particular type of automated data mining analysis. In general, the data mining workbench uses a generic template for a particular type of data mining analysis, receives user-entered information applicable to the generic template, receives scheduling information from the user, and generates a particular automated data mining process.
  • [0073] The process 600 to define an automated data mining process begins when the data mining workbench presents a user interface for the user to enter identifying information for the data mining analysis process being defined (step 610). For example, the user may enter a name or another type of identifier and a description of the data mining analysis.
  • [0074] The data mining workbench then presents an interface that allows a user to identify the data mining analysis template to be used (step 620). For example, the data mining workbench may present a list of data mining analysis templates, such as a template for a particular type of customer loyalty analysis or a template for a particular type of cross-selling analysis, from which the user selects.
  • [0075] Based on the data mining analysis template selected, the data mining workbench presents an appropriate interface to guide the user through the process of entering the user-configuration data mining information to configure the template for the particular analysis being defined (step 630). In one example of defining an automated data mining analysis for determining the effect of a particular marketing campaign, the user enters an identifier for the particular marketing campaign to be analyzed, the particular customer attributes to be analyzed, the attributes to be measured to determine the effect of the marketing campaign (such as a sales attribute), and the filter criteria for selecting the records to be analyzed. The data mining analysis template includes a portion for the transaction data extraction, such as transaction data extractor 515 in FIG. 5, and a portion for data mining extraction, such as data mining extractor 520 in FIG. 5.
  • [0076] The user then schedules when the data mining analysis process should be automatically triggered (step 640). For example, the user may identify a recurring pattern of dates and times for triggering the data mining analysis. This may be accomplished through the presentation of a calendar or the presentation of a set of schedule options from which the user selects.
  • [0077] The data mining workbench then stores a version of the generic data mining analysis template with the user-entered information (step 650). To do so, for example, the data mining workbench may use the name or identifier entered by the user as the name of the stored automated data mining process. The automated data mining process may be added to a task scheduler and scheduled based on the information the user entered.
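  • Steps 610 through 650 amount to copying a generic template, filling it with user-entered values, attaching a schedule, and storing the result under the user's chosen name. The sketch below shows one way this could look; the template layout, the key names, and the in-memory store are hypothetical.

```python
import copy
from typing import Dict

# Hypothetical generic template with one portion per extractor.
TEMPLATES: Dict[str, dict] = {
    "campaign_effect": {
        "transaction_extraction": {"attributes": [], "filter": None},
        "mining_extraction": {"campaign_id": None, "measured_attributes": []},
    }
}


def define_automated_process(name: str, description: str, template_id: str,
                             user_config: Dict[str, dict], schedule: str,
                             store: Dict[str, dict]) -> dict:
    """Create and store a configured copy of a generic template."""
    # Step 620: start from a copy so the generic template stays unchanged.
    definition = copy.deepcopy(TEMPLATES[template_id])
    # Step 630: apply the user-entered configuration to each portion.
    for portion, values in user_config.items():
        definition[portion].update(values)
    # Steps 610 and 640: identifying information and schedule.
    definition["description"] = description
    definition["schedule"] = schedule
    # Step 650: store under the user-entered name.
    store[name] = definition
    return definition


store: Dict[str, dict] = {}
definition = define_automated_process(
    name="spring_campaign_effect",
    description="Effect of the spring campaign on sales",
    template_id="campaign_effect",
    user_config={
        "transaction_extraction": {"attributes": ["customer_no", "sales"],
                                   "filter": "region = 'North'"},
        "mining_extraction": {"campaign_id": "SPRING-01",
                              "measured_attributes": ["sales"]},
    },
    schedule="every Sunday at 08:00",
    store=store,
)
```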
  • [0078] Although the techniques and concepts described above refer to a single data mining process, the applicability of the techniques and concepts is not limited to a single data mining process. For example, a particular data warehouse may be used for, and typically is used for, many different data mining processes, many of which may benefit from being automated as described herein.
  • [0079] A number of implementations of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other implementations are within the scope of the following claims.

Claims (34)

What is claimed is:
1. A method for conducting a data mining process, the method comprising:
triggering the data mining process wherein:
the data mining process includes an analytical process,
the triggering is based on the presence of data in a data source that is used for analytical processing, and
the analytical process uses a procedure that also is usable in a data extraction process;
creating a data attribute by performing the analytical process on data from the data source after the analytical process has been triggered; and
storing the created data attribute in the data source.
2. The method of claim 1 further comprising extracting data from a data source used for transaction processing.
3. The method of claim 2 wherein a person initiates at most the step of extracting data from the data source used for transaction processing.
4. The method of claim 2 further comprising loading the extracted data to the data source that is used for analytical processing.
5. The method of claim 4 wherein a person initiates at most the step of loading the extracted data.
6. The method of claim 1 further comprising:
extracting data from the data source that is used for analytical processing; and
loading the data extracted from the data source that is used for analytical processing to temporary data storage,
wherein performing the analytical process comprises performing the analytical process using data stored in the temporary data storage.
7. The method of claim 1 wherein triggering the analytical process based on the presence of data in the data source that is used for analytical processing comprises triggering an analytical process based on the completion of a computer program for loading data to the data source that is used for analytical processing.
8. The method of claim 1 further comprising triggering, based on an occurrence of a predetermined date and time, loading the data extracted from a data source used for transaction processing to the data source that is used for analytical processing.
9. The method of claim 1 further comprising triggering, based on an occurrence of a predetermined date and time, extracting data from a data source used for transaction processing and loading the extracted data to the data source that is used for analytical processing.
10. The method of claim 1 wherein performing the analytical process comprises determining a relationship between two data values in the data source.
11. The method of claim 10 wherein performing the analytical process comprises determining a relationship between two data values that predict a likelihood of whether a particular customer will fail to purchase a service or product in the future.
12. The method of claim 10 wherein performing the analytical process comprises identifying products that are purchased in the same transaction.
13. The method of claim 10 wherein performing the analytical process comprises identifying services that are purchased in the same transaction.
14. The method of claim 10 wherein performing the analytical process comprises applying a previously-determined relationship between two data values to data in the data source.
15. The method of claim 14 wherein applying a previously-determined relationship comprises determining a likelihood of whether a particular customer will fail to purchase a service or a product in the future based on characteristics associated with customers who have been identified as failing to purchase a service or a product.
16. The method of claim 1 wherein performing the analytical process comprises determining a relationship between two data values in the data source and applying the determined relationship to data in the data source.
17. The method of claim 16 wherein:
determining a relationship between two data values in the data source comprises determining a relationship between two data values that predict a likelihood of whether a particular customer will fail to purchase a service or product in the future, and
applying the determined relationship to data in the data source comprises determining a likelihood of whether a particular customer will fail to purchase a service or a product in the future based on the relationship determined between two data values that predict a likelihood of whether a particular customer will fail to purchase a service or product in the future.
18. A method for conducting a data mining process, the method comprising:
triggering the data mining process wherein:
the data mining process includes an analytical process,
the triggering is based on the presence of data in a data source that is used for analytical processing, and
the analytical process uses a procedure that also is usable in a data extraction process;
extracting data from the data source that is used for analytical processing;
loading the data extracted from the data source that is used for analytical processing to temporary data storage;
creating a data attribute by performing the analytical process on data in the temporary data storage; and
storing the created data attribute in the data source that is used for analytical processing.
19. The method of claim 18 further comprising:
extracting data from a data source used for transaction processing, and
loading the extracted data to the data source that is used for analytical processing.
20. The method of claim 19 wherein a person initiates at most the step of extracting data from the data source used for transaction processing.
21. The method of claim 18 wherein performing the analytical process comprises determining a relationship between two data values in the data source.
22. A computer-readable medium or propagated signal having embodied thereon a computer program configured to conduct a data mining process, the medium or signal comprising one or more code segments configured to:
trigger the data mining process wherein:
the data mining process includes an analytical process,
the triggering is based on the presence of data in a data source that is used for analytical processing, and
the analytical process uses a procedure that also is usable in a data extraction process;
create a data attribute by performing the analytical process on data from the data source after the analytical process has been triggered; and
store the created data attribute in the data source.
23. The medium or signal of claim 22 wherein the one or more code segments are further configured to:
extract data from the data source that is used for analytical processing; and
load the data extracted from the data source that is used for analytical processing to temporary data storage,
wherein the one or more code segments configured to perform the analytical process comprise one or more code segments configured to perform the analytical process using data stored in the temporary data storage.
24. The medium or signal of claim 22 wherein the one or more code segments configured to trigger the analytical process comprise one or more code segments configured to trigger an analytical process based on the completion of a computer program for loading data to the data source that is used for analytical processing.
25. The medium or signal of claim 22 wherein the one or more code segments are further configured to trigger, based on an occurrence of a predetermined date and time, loading the data extracted from a data source used for transaction processing to the data source that is used for analytical processing.
26. The medium or signal of claim 22 wherein the one or more code segments are further configured to trigger, based on an occurrence of a predetermined date and time, extracting data from a data source used for transaction processing and loading the extracted data to the data source that is used for analytical processing.
27. The medium or signal of claim 22 wherein the one or more code segments configured to perform the analytical process comprise one or more code segments configured to determine a relationship between two data values in the data source.
28. A system for conducting a data mining process, the system comprising a processor connected to a storage device and one or more input/output devices, wherein the processor is configured to:
trigger the data mining process wherein:
the data mining process includes an analytical process,
the triggering is based on the presence of data in a data source that is used for analytical processing, and
the analytical process uses a procedure that also is usable in a data extraction process;
create a data attribute by performing the analytical process on data from the data source after the analytical process has been triggered; and
store the created data attribute in the data source.
29. The system of claim 28 wherein the processor is further configured to:
extract data from the data source that is used for analytical processing;
load the data extracted from the data source that is used for analytical processing to temporary data storage; and
perform the analytical process using data stored in the temporary data storage.
30. The system of claim 28 wherein the processor is configured to trigger an analytical process based on the completion of a computer program for loading data to the data source that is used for analytical processing.
31. The system of claim 28 wherein the processor is further configured to trigger, based on an occurrence of a predetermined date and time, loading the data extracted from a data source used for transaction processing to the data source that is used for analytical processing.
32. The system of claim 28 wherein the processor is further configured to trigger, based on an occurrence of a predetermined date and time, extracting data from a data source used for transaction processing and loading the extracted data to the data source that is used for analytical processing.
33. The system of claim 28 wherein the processor is configured to determine a relationship between two data values in the data source.
34. A method for defining an automated data mining process, the method comprising:
presenting a user interface for:
identifying a template for a type of automated data mining process for triggering an analytical process, the analytical process using a procedure that also is usable in a data extraction process, based on the presence of data in a data source that is used for analytical processing, creating a data attribute by performing the analytical process on data from the data source after the analytical process has been triggered, and storing the created data attribute in the data source; and
entering information for defining the automated data mining process;
associating the entered information with the identified template; and
storing the associated entered information with the identified template as a computer program configured to perform the automated data mining process.
US10/423,011 2003-04-25 2003-04-25 Automated data mining runs Abandoned US20040215656A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US10/423,011 US20040215656A1 (en) 2003-04-25 2003-04-25 Automated data mining runs
US10/816,909 US7571191B2 (en) 2003-04-25 2004-04-05 Defining a data analysis process
US10/816,910 US20040267751A1 (en) 2003-04-25 2004-04-05 Performing a data analysis process
PCT/EP2004/003742 WO2004097667A2 (en) 2003-04-25 2004-04-07 Automated data mining runs
EP04726131A EP1623343A2 (en) 2003-04-25 2004-04-07 Automated data mining runs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/423,011 US20040215656A1 (en) 2003-04-25 2003-04-25 Automated data mining runs

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US10/816,910 Continuation-In-Part US20040267751A1 (en) 2003-04-25 2004-04-05 Performing a data analysis process
US10/816,909 Continuation-In-Part US7571191B2 (en) 2003-04-25 2004-04-05 Defining a data analysis process

Publications (1)

Publication Number Publication Date
US20040215656A1 true US20040215656A1 (en) 2004-10-28

Family

ID=33299004

Family Applications (3)

Application Number Title Priority Date Filing Date
US10/423,011 Abandoned US20040215656A1 (en) 2003-04-25 2003-04-25 Automated data mining runs
US10/816,910 Abandoned US20040267751A1 (en) 2003-04-25 2004-04-05 Performing a data analysis process
US10/816,909 Active 2025-08-04 US7571191B2 (en) 2003-04-25 2004-04-05 Defining a data analysis process

Country Status (3)

Country Link
US (3) US20040215656A1 (en)
EP (1) EP1623343A2 (en)
WO (1) WO2004097667A2 (en)

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030222903A1 (en) * 2002-05-31 2003-12-04 Wolfgang Herzog Distributing customized computer settings to affected systems
US20040088730A1 (en) * 2002-11-01 2004-05-06 Srividya Gopalan System and method for maximizing license utilization and minimizing churn rate based on zero-reject policy for video distribution
US20050038701A1 (en) * 2003-08-13 2005-02-17 Alan Matthew Computer system for card in connection with, but not to carry out, a transaction
US20060010434A1 (en) * 2004-07-07 2006-01-12 Wolfgang Herzog Providing customizable configuration data in computer systems
US20060010163A1 (en) * 2004-07-07 2006-01-12 Wolfgang Herzog Configuring computer systems with business configuration information
US20060080277A1 (en) * 2004-10-04 2006-04-13 Peter Nador Method and system for designing, implementing and documenting OLAP
US20060136443A1 (en) * 2004-12-16 2006-06-22 International Business Machines Corporation Method and apparatus for initializing data propagation execution for large database replication
US20060167911A1 (en) * 2005-01-24 2006-07-27 Stephane Le Cam Automatic data pattern recognition and extraction
US20060179061A1 (en) * 2005-02-07 2006-08-10 D Souza Roy P Multi-dimensional surrogates for data management
US20060190486A1 (en) * 2005-02-24 2006-08-24 Qi Zhou Configuring a computer application with preconfigured business content
US20070112615A1 (en) * 2005-11-11 2007-05-17 Matteo Maga Method and system for boosting the average revenue per user of products or services
US20070143374A1 (en) * 2005-02-07 2007-06-21 D Souza Roy P Enterprise service availability through identity preservation
US20070143365A1 (en) * 2005-02-07 2007-06-21 D Souza Roy P Synthetic full copies of data and dynamic bulk-to-brick transformation
US20070143373A1 (en) * 2005-02-07 2007-06-21 D Souza Roy P Enterprise server version migration through identity preservation
US20070150499A1 (en) * 2005-02-07 2007-06-28 D Souza Roy P Dynamic bulk-to-brick transformation of data
US20070156793A1 (en) * 2005-02-07 2007-07-05 D Souza Roy P Synthetic full copies of data and dynamic bulk-to-brick transformation
US20070156792A1 (en) * 2005-02-07 2007-07-05 D Souza Roy P Dynamic bulk-to-brick transformation of data
US20070168500A1 (en) * 2005-02-07 2007-07-19 D Souza Roy P Enterprise service availability through identity preservation
US20070233756A1 (en) * 2005-02-07 2007-10-04 D Souza Roy P Retro-fitting synthetic full copies of data
EP1895410A1 (en) * 2006-09-01 2008-03-05 France Telecom Method and system for extraction of a data table from a database and corresponding computer program product
US20080071812A1 (en) * 2006-09-15 2008-03-20 Oracle International Corporation Evolution of XML schemas involving partial data copy
US20080082560A1 (en) * 2006-09-28 2008-04-03 Oracle International Corporation Implementation of backward compatible XML schema evolution
US20090300280A1 (en) * 2008-06-02 2009-12-03 Curtis Edward Jutzi Detecting data mining processes to increase caching efficiency
US7657780B2 (en) 2005-02-07 2010-02-02 Mimosa Systems, Inc. Enterprise service availability through identity preservation
US20100146510A1 (en) * 2008-12-10 2010-06-10 Jan Teichmann Automated Scheduling of Mass Data Run Objects
US20130198093A1 (en) * 2012-01-09 2013-08-01 W. C. Taylor, III Data mining and logic checking tools
US20130253977A1 (en) * 2012-03-23 2013-09-26 Commvault Systems, Inc. Automation of data storage activities
US8577833B2 (en) 2012-01-04 2013-11-05 International Business Machines Corporation Automated data analysis and transformation
US20140012862A1 (en) * 2012-07-04 2014-01-09 Sony Corporation Information processing apparatus, information processing method, program, and information processing system
US20140067803A1 (en) * 2012-09-06 2014-03-06 Sap Ag Data Enrichment Using Business Compendium
US20140074760A1 (en) * 2012-09-13 2014-03-13 Nokia Corporation Method and apparatus for providing standard data processing model through machine learning
US20140177544A1 (en) * 2012-11-29 2014-06-26 Telefonakiebolaget L M Ericsson (Publ) Network resource configuration
FR3032538A1 (en) * 2015-02-09 2016-08-12 Orbite COMPUTER SYSTEM FOR AUTOMATIC DATA COLLECTION
US20170323326A1 (en) * 2016-05-03 2017-11-09 Eric Kim Method and systems for determining programmatically expected performances
US20170337567A1 (en) * 2016-05-17 2017-11-23 Sap Se Real-time system to identify and analyze behavioral patterns to predict churn risk and increase retention
US9898515B1 (en) * 2014-10-29 2018-02-20 Jpmorgan Chase Bank, N.A. Data extraction and transformation method and system
WO2018075817A1 (en) * 2016-10-19 2018-04-26 Salesforce.Com, Inc. Streamlined creation and updating of olap analytic databases
US20180300388A1 (en) * 2017-04-17 2018-10-18 International Business Machines Corporation System and method for automatic data enrichment from multiple public datasets in data integration tools
US20180341889A1 (en) * 2017-05-25 2018-11-29 Centene Corporation Entity level classifier using machine learning
EP3276504A4 (en) * 2015-03-24 2018-12-12 Gixo Ltd. Data processing system, data processing method, program, and computer memory medium
US10306013B2 (en) * 2015-07-15 2019-05-28 Sap Se Churn risk scoring using call network analysis
US10599527B2 (en) 2017-03-29 2020-03-24 Commvault Systems, Inc. Information management cell health monitoring system
US10860401B2 (en) 2014-02-27 2020-12-08 Commvault Systems, Inc. Work flow management for an information management system
US11113702B1 (en) * 2018-12-12 2021-09-07 Amazon Technologies, Inc. Online product subscription recommendations based on a customers failure to perform a computer-based action and a monetary value threshold
US11138193B2 (en) * 2015-05-29 2021-10-05 International Business Machines Corporation Estimating the cost of data-mining services
US11263600B2 (en) 2015-03-24 2022-03-01 4 S Technologies, LLC Automated trustee payments system
US11687567B2 (en) * 2017-09-21 2023-06-27 Vmware, Inc. Trigger based analytics database synchronization

Families Citing this family (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7693847B1 (en) * 2004-07-13 2010-04-06 Teradata Us, Inc. Administering workload groups
US8600845B2 (en) * 2006-10-25 2013-12-03 American Express Travel Related Services Company, Inc. System and method for reconciling one or more financial transactions
EP1723583A2 (en) * 2004-02-04 2006-11-22 Sap Ag Method, system and software application for real time accounting data processing
US7593916B2 (en) * 2004-08-19 2009-09-22 Sap Ag Managing data administration
JP2006227896A (en) * 2005-02-17 2006-08-31 Fuji Xerox Co Ltd Information analyzing apparatus, information analyzing method and program
US8984636B2 (en) 2005-07-29 2015-03-17 Bit9, Inc. Content extractor and analysis system
US7895651B2 (en) 2005-07-29 2011-02-22 Bit 9, Inc. Content tracking in a network security system
US8272058B2 (en) 2005-07-29 2012-09-18 Bit 9, Inc. Centralized timed analysis in a network security system
US7979569B2 (en) 2005-12-01 2011-07-12 Firestar Software, Inc. System and method for exchanging information among exchange applications
US7801761B2 (en) * 2005-12-02 2010-09-21 Satyam Computer Services Ltd. System and method for tracking customer satisfaction index based on intentional context
EP1801689A1 (en) * 2005-12-23 2007-06-27 Sap Ag Methods, systems and software applications including tab panel elements
US8954852B2 (en) * 2006-02-03 2015-02-10 Sonic Solutions, Llc. Adaptive intervals in navigating content and/or media
US7756881B2 (en) * 2006-03-09 2010-07-13 Microsoft Corporation Partitioning of data mining training set
US8190571B2 (en) 2006-06-07 2012-05-29 Microsoft Corporation Managing data with backup server indexing
US7818203B1 (en) * 2006-06-29 2010-10-19 Emc Corporation Method for scoring customer loyalty and satisfaction
US7551176B2 (en) * 2006-08-24 2009-06-23 Via Technologies, Inc. Systems and methods for providing shared attribute evaluation circuits in a graphics processing unit
JP4979307B2 (en) * 2006-08-25 2012-07-18 シスメックス株式会社 Blood sample measuring device
US20080065476A1 (en) * 2006-09-07 2008-03-13 Loyalty Builders, Inc. Online direct marketing system
US20080177892A1 (en) * 2007-01-19 2008-07-24 International Business Machines Corporation Method for service oriented data extraction transformation and load
US7933861B2 (en) * 2007-04-09 2011-04-26 University Of Pittsburgh - Of The Commonwealth System Of Higher Education Process data warehouse
US20080281695A1 (en) 2007-05-11 2008-11-13 Verizon Services Organization Inc. Systems and methods for using voice services records to provide targeted marketing services
US8452636B1 (en) * 2007-10-29 2013-05-28 United Services Automobile Association (Usaa) Systems and methods for market performance analysis
US20090259995A1 (en) * 2008-04-15 2009-10-15 Inmon William H Apparatus and Method for Standardizing Textual Elements of an Unstructured Text
US9659073B2 (en) * 2008-06-18 2017-05-23 Oracle International Corporation Techniques to extract and flatten hierarchies
US8566185B2 (en) * 2008-06-26 2013-10-22 Sap Ag Managing consistent interfaces for financial instrument business objects across heterogeneous systems
US20100153432A1 (en) * 2008-12-11 2010-06-17 Sap Ag Object based modeling for software application query generation
US8639653B2 (en) * 2008-12-12 2014-01-28 At&T Intellectual Property I, L.P. Methods, systems, and computer program products for managing batch operations in an enterprise data integration platform environment
CA2660748C (en) * 2009-03-31 2016-08-09 Trapeze Software Inc. System for aggregating data and a method for providing the same
CA2713039C (en) * 2009-08-31 2014-06-10 Accenture Global Services Gmbh Flexible cube data warehousing
US8401993B2 (en) * 2009-09-14 2013-03-19 International Business Machines Corporation Analytics integration server within a comprehensive framework for composing and executing analytics applications in business level languages
US10127299B2 (en) * 2009-09-14 2018-11-13 International Business Machines Corporation Analytics information directories within a comprehensive framework for composing and executing analytics applications in business level languages
US10242406B2 (en) * 2009-09-14 2019-03-26 International Business Machines Corporation Analytics integration workbench within a comprehensive framework for composing and executing analytics applications in business level languages
CN102129425B (en) * 2010-01-20 2016-08-03 阿里巴巴集团控股有限公司 The access method of big object set table and device in data warehouse
US8717917B1 (en) * 2010-04-27 2014-05-06 Openwave Mobility, Inc. System and method for managing transaction data in a mobile communication network using selective sampling
US9186642B2 (en) * 2010-04-28 2015-11-17 The Procter & Gamble Company Delivery particle
US10671628B2 (en) * 2010-07-09 2020-06-02 State Street Bank And Trust Company Systems and methods for data warehousing
US9147195B2 (en) 2011-06-14 2015-09-29 Microsoft Technology Licensing, Llc Data custodian and curation system
US9244956B2 (en) 2011-06-14 2016-01-26 Microsoft Technology Licensing, Llc Recommending data enrichments
US8893028B2 (en) * 2011-09-21 2014-11-18 International Business Machines Corporation Supplementary calculation of numeric data in a web browser
US8954376B2 (en) * 2012-03-08 2015-02-10 International Business Machines Corporation Detecting transcoding tables in extract-transform-load processes
US8583626B2 (en) 2012-03-08 2013-11-12 International Business Machines Corporation Method to detect reference data tables in ETL processes
US11755663B2 (en) * 2012-10-22 2023-09-12 Recorded Future, Inc. Search activity prediction
US8914308B2 (en) * 2013-01-24 2014-12-16 Bank Of America Corporation Method and apparatus for initiating a transaction on a mobile device
US9535970B2 (en) 2013-06-28 2017-01-03 Sap Se Metric catalog system
US9426219B1 (en) 2013-12-06 2016-08-23 Amazon Technologies, Inc. Efficient multi-part upload for a data warehouse
US10902368B2 (en) * 2014-03-12 2021-01-26 Dt360 Inc. Intelligent decision synchronization in real time for both discrete and continuous process industries
US20150262095A1 (en) * 2014-03-12 2015-09-17 Bahwan CyberTek Private Limited Intelligent Decision Synchronization in Real Time for both Discrete and Continuous Process Industries
US20210182749A1 (en) * 2014-03-12 2021-06-17 Dt360 Inc. Method of predicting component failure in drive train assembly of wind turbines
US10101889B2 (en) 2014-10-10 2018-10-16 Salesforce.Com, Inc. Dashboard builder with live data updating without exiting an edit mode
US10049141B2 (en) 2014-10-10 2018-08-14 salesforce.com,inc. Declarative specification of visualization queries, display formats and bindings
US9449188B2 (en) 2014-10-10 2016-09-20 Salesforce.Com, Inc. Integration user for analytical access to read only data stores generated from transactional systems
US9767145B2 (en) 2014-10-10 2017-09-19 Salesforce.Com, Inc. Visual data analysis with animated informational morphing replay
US9396018B2 (en) * 2014-10-10 2016-07-19 Salesforce.Com, Inc. Low latency architecture with directory service for integration of transactional data system with analytical data structures
US9600548B2 (en) 2014-10-10 2017-03-21 Salesforce.Com Row level security integration of analytical data store with cloud architecture
CN104408641B (en) * 2014-10-29 2018-02-06 深圳先进技术研究院 The brand identity extracting method and system of ecommerce recommended models
CN106156125B (en) * 2015-04-08 2019-08-23 中国人民解放军国防科学技术大学 A method of the virtual identity management system copy based on different data organizational form
US11089160B1 (en) * 2015-07-14 2021-08-10 Ujet, Inc. Peer-to-peer VoIP
US10115213B2 (en) 2015-09-15 2018-10-30 Salesforce, Inc. Recursive cell-based hierarchy for data visualizations
US10089368B2 (en) 2015-09-18 2018-10-02 Salesforce, Inc. Systems and methods for making visual data representations actionable
US11263338B2 (en) 2017-10-16 2022-03-01 Sentience Inc. Data security maintenance method for data analysis application
US11880803B1 (en) * 2022-12-19 2024-01-23 Tbk Bank, Ssb System and method for data mapping and transformation

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US144174A (en) * 1873-10-28 Improvement in steam-engine valve-gears
US6049599A (en) * 1996-01-11 2000-04-11 Gte Telecommunication Services Incorporated Churn amelioration system and method therefor
US6173310B1 (en) * 1999-03-23 2001-01-09 Microstrategy, Inc. System and method for automatic transmission of on-line analytical processing system report output
US6266668B1 (en) * 1998-08-04 2001-07-24 Dryken Technologies, Inc. System and method for dynamic data-mining and on-line communication of customized information
US6272478B1 (en) * 1997-06-24 2001-08-07 Mitsubishi Denki Kabushiki Kaisha Data mining apparatus for discovering association rules existing between attributes of data
US6301471B1 (en) * 1998-11-02 2001-10-09 Openwave System Inc. Online churn reduction and loyalty system
US6430545B1 (en) * 1998-03-05 2002-08-06 American Management Systems, Inc. Use of online analytical processing (OLAP) in a rules based decision management system
US20020133490A1 (en) * 2000-01-13 2002-09-19 Erinmedia, Llc Privacy compliant multiple dataset correlation and content delivery system and methods
US6460037B1 (en) * 1998-04-01 2002-10-01 Mitel Knowledge Corporation Agent-based data mining and warehousing
US6473757B1 (en) * 2000-03-28 2002-10-29 Lucent Technologies Inc. System and method for constraint based sequential pattern mining
US6510457B1 (en) * 1998-06-17 2003-01-21 Hitachi, Ltd. Data analysis method and apparatus for data mining
US6636860B2 (en) * 2001-04-26 2003-10-21 International Business Machines Corporation Method and system for data mining automation in domain-specific analytic applications
US20030212789A1 (en) * 2002-05-09 2003-11-13 International Business Machines Corporation Method, system, and program product for sequential coordination of external database application events with asynchronous internal database events
US20040002961A1 (en) * 2002-06-27 2004-01-01 International Business Machines Corporation Intelligent query re-execution
US20040122790A1 (en) * 2002-12-18 2004-06-24 Walker Matthew J. Computer-assisted data processing system and method incorporating automated learning
US20040210579A1 (en) * 2003-04-17 2004-10-21 International Business Machines Corporation Rule application management in an abstract database
US20040215501A1 (en) * 2003-04-24 2004-10-28 D'ornano Emmanuel Method and system for automated marketing
US20050144163A1 (en) * 2002-06-21 2005-06-30 Microsoft Corporation Systems and methods for generating prediction queries
US20070053513A1 (en) * 1999-10-05 2007-03-08 Hoffberg Steven M Intelligent electronic appliance system and method
US20070130103A1 (en) * 2001-10-03 2007-06-07 Malone Donna B Methods and Systems for Processing a Plurality of Errors

Family Cites Families (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6385604B1 (en) * 1999-08-04 2002-05-07 Hyperroll, Israel Limited Relational database management system having integrated non-relational multi-dimensional data store of aggregated data elements
US6434568B1 (en) * 1999-08-31 2002-08-13 Accenture Llp Information services patterns in a netcentric environment
US6442748B1 (en) * 1999-08-31 2002-08-27 Accenture Llp System, method and article of manufacture for a persistent state and persistent object separator in an information services patterns environment
US6640244B1 (en) * 1999-08-31 2003-10-28 Accenture Llp Request batcher in a transaction services patterns environment
US7003560B1 (en) * 1999-11-03 2006-02-21 Accenture Llp Data warehouse computing system
US6490585B1 (en) * 1999-11-12 2002-12-03 Unisys Corp Cellular multiprocessor data warehouse
US6829615B2 (en) * 2000-02-25 2004-12-07 International Business Machines Corporation Object type relationship graphical user interface
US6622218B2 (en) * 2000-06-10 2003-09-16 Hewlett-Packard Development Company, Lp. Cache coherence protocol engine and method for efficient processing of interleaved memory transactions in a multiprocessor system
US6954758B1 (en) * 2000-06-30 2005-10-11 Ncr Corporation Building predictive models within interactive business analysis processes
US6718336B1 (en) * 2000-09-29 2004-04-06 Battelle Memorial Institute Data import system for data analysis system
US20020123996A1 (en) * 2001-02-06 2002-09-05 O'brien Christopher Data mining system, method and apparatus for industrial applications
US20020169735A1 (en) * 2001-03-07 2002-11-14 David Kil Automatic mapping from data to preprocessing algorithms
US6643635B2 (en) * 2001-03-15 2003-11-04 Sagemetrics Corporation Methods for dynamically accessing, processing, and presenting data acquired from disparate data sources
AU2002317119A1 (en) 2001-07-06 2003-01-21 Angoss Software Corporation A method and system for the visual presentation of data mining models
US6772034B1 (en) * 2001-07-12 2004-08-03 Advanced Micro Devices, Inc. System and software for data distribution in semiconductor manufacturing and method thereof
JP3773426B2 (en) * 2001-07-18 2006-05-10 株式会社日立製作所 Preprocessing method and preprocessing system in data mining
US6804669B2 (en) * 2001-08-14 2004-10-12 International Business Machines Corporation Methods and apparatus for user-centered class supervision
US20030120528A1 (en) * 2001-10-23 2003-06-26 Kruk Jeffrey M. System and method for managing compliance with strategic business rules
US20030130871A1 (en) * 2001-11-02 2003-07-10 Rao R. Bharat Patient data mining for clinical trials
US20030130996A1 (en) * 2001-12-21 2003-07-10 International Business Machines Corporation Interactive mining of time series data
US20040215522A1 (en) 2001-12-26 2004-10-28 Eder Jeff Scott Process optimization system
US6714893B2 (en) * 2002-02-15 2004-03-30 International Business Machines Corporation Enhanced concern indicator failure prediction system
US6985904B1 (en) * 2002-02-28 2006-01-10 Oracle International Corporation Systems and methods for sharing of execution plans for similar database statements
US20030182284A1 (en) * 2002-03-25 2003-09-25 Lucian Russell Dynamic data mining process
US20030191727A1 (en) * 2002-04-04 2003-10-09 Ibm Corporation Managing multiple data mining scoring results
US6999977B1 (en) * 2002-05-09 2006-02-14 Oracle International Corp Method and apparatus for change data capture in a database system
US20040128204A1 (en) 2002-12-27 2004-07-01 Cihla Virgil F. Systems for procuring products in a distributed system
US20040164961A1 (en) * 2003-02-21 2004-08-26 Debasis Bal Method, system and computer product for continuously monitoring data sources for an event of interest
US20040215695A1 (en) * 2003-03-31 2004-10-28 Sue-Chen Hsu Method and system for implementing accurate and convenient online transactions in a loosely coupled environments
US7350191B1 (en) * 2003-04-22 2008-03-25 Noetix, Inc. Computer implemented system and method for the generation of data access applications

Cited By (95)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030222903A1 (en) * 2002-05-31 2003-12-04 Wolfgang Herzog Distributing customized computer settings to affected systems
US20040088730A1 (en) * 2002-11-01 2004-05-06 Srividya Gopalan System and method for maximizing license utilization and minimizing churn rate based on zero-reject policy for video distribution
US20050038701A1 (en) * 2003-08-13 2005-02-17 Alan Matthew Computer system for card in connection with, but not to carry out, a transaction
US8095563B2 (en) 2004-07-07 2012-01-10 Sap Aktiengesellschaft Configuring computer systems with business configuration information
US20100281243A1 (en) * 2004-07-07 2010-11-04 Sap Aktiengesellschaft Configuring Computer Systems with Business Configuration Information
US7735063B2 (en) 2004-07-07 2010-06-08 Sap Aktiengesellschaft Providing customizable configuration data in computer systems
US7774369B2 (en) 2004-07-07 2010-08-10 Sap Aktiengesellschaft Configuring computer systems with business configuration information
US20100281244A1 (en) * 2004-07-07 2010-11-04 Sap Aktiengesellschaft Configuring Computer Systems with Business Configuration Information
US20060010163A1 (en) * 2004-07-07 2006-01-12 Wolfgang Herzog Configuring computer systems with business configuration information
US8095562B2 (en) 2004-07-07 2012-01-10 Sap Aktiengesellschaft Configuring computer systems with business configuration information
US8095564B2 (en) 2004-07-07 2012-01-10 Sap Aktiengesellschaft Configuring computer systems with business configuration information
US20060010434A1 (en) * 2004-07-07 2006-01-12 Wolfgang Herzog Providing customizable configuration data in computer systems
US20100287075A1 (en) * 2004-07-07 2010-11-11 Sap Aktiengesellschaft Configuring Computer Systems with Business Configuration Information
US20060080277A1 (en) * 2004-10-04 2006-04-13 Peter Nador Method and system for designing, implementing and documenting OLAP
US7478112B2 (en) * 2004-12-16 2009-01-13 International Business Machines Corporation Method and apparatus for initializing data propagation execution for large database replication
US20060136443A1 (en) * 2004-12-16 2006-06-22 International Business Machines Corporation Method and apparatus for initializing data propagation execution for large database replication
US20060167911A1 (en) * 2005-01-24 2006-07-27 Stephane Le Cam Automatic data pattern recognition and extraction
US20070143374A1 (en) * 2005-02-07 2007-06-21 D Souza Roy P Enterprise service availability through identity preservation
US8543542B2 (en) 2005-02-07 2013-09-24 Mimosa Systems, Inc. Synthetic full copies of data and dynamic bulk-to-brick transformation
EP1856637A2 (en) * 2005-02-07 2007-11-21 Mimosa Systems Inc. Multi-dimensional surrogates for data management
US20060179061A1 (en) * 2005-02-07 2006-08-10 D Souza Roy P Multi-dimensional surrogates for data management
US8271436B2 (en) 2005-02-07 2012-09-18 Mimosa Systems, Inc. Retro-fitting synthetic full copies of data
US7917475B2 (en) 2005-02-07 2011-03-29 Mimosa Systems, Inc. Enterprise server version migration through identity preservation
US8918366B2 (en) 2005-02-07 2014-12-23 Mimosa Systems, Inc. Synthetic full copies of data and dynamic bulk-to-brick transformation
US8812433B2 (en) 2005-02-07 2014-08-19 Mimosa Systems, Inc. Dynamic bulk-to-brick transformation of data
US20070168500A1 (en) * 2005-02-07 2007-07-19 D Souza Roy P Enterprise service availability through identity preservation
EP1856637A4 (en) * 2005-02-07 2009-04-22 Mimosa Systems Inc Multi-dimensional surrogates for data management
US8799206B2 (en) 2005-02-07 2014-08-05 Mimosa Systems, Inc. Dynamic bulk-to-brick transformation of data
US7657780B2 (en) 2005-02-07 2010-02-02 Mimosa Systems, Inc. Enterprise service availability through identity preservation
US20070156792A1 (en) * 2005-02-07 2007-07-05 D Souza Roy P Dynamic bulk-to-brick transformation of data
US20070233756A1 (en) * 2005-02-07 2007-10-04 D Souza Roy P Retro-fitting synthetic full copies of data
US20070156793A1 (en) * 2005-02-07 2007-07-05 D Souza Roy P Synthetic full copies of data and dynamic bulk-to-brick transformation
US7778976B2 (en) 2005-02-07 2010-08-17 Mimosa, Inc. Multi-dimensional surrogates for data management
US20070150499A1 (en) * 2005-02-07 2007-06-28 D Souza Roy P Dynamic bulk-to-brick transformation of data
US20070143373A1 (en) * 2005-02-07 2007-06-21 D Souza Roy P Enterprise server version migration through identity preservation
US20070143365A1 (en) * 2005-02-07 2007-06-21 D Souza Roy P Synthetic full copies of data and dynamic bulk-to-brick transformation
US8161318B2 (en) 2005-02-07 2012-04-17 Mimosa Systems, Inc. Enterprise service availability through identity preservation
US7870416B2 (en) 2005-02-07 2011-01-11 Mimosa Systems, Inc. Enterprise service availability through identity preservation
US20060190486A1 (en) * 2005-02-24 2006-08-24 Qi Zhou Configuring a computer application with preconfigured business content
US7325015B2 (en) * 2005-02-24 2008-01-29 Sap Aktiengesellschaft Configuring a computer application with preconfigured business content
US7917383B2 (en) * 2005-11-11 2011-03-29 Accenture Global Services Limited Method and system for boosting the average revenue per user of products or services
US20070112615A1 (en) * 2005-11-11 2007-05-17 Matteo Maga Method and system for boosting the average revenue per user of products or services
US20080059443A1 (en) * 2006-09-01 2008-03-06 France Telecom Method and system for the extraction of a data table from a data base, corresponding computer program product
EP1895410A1 (en) * 2006-09-01 2008-03-05 France Telecom Method and system for extraction of a data table from a database and corresponding computer program product
US8346725B2 (en) * 2006-09-15 2013-01-01 Oracle International Corporation Evolution of XML schemas involving partial data copy
US20080071812A1 (en) * 2006-09-15 2008-03-20 Oracle International Corporation Evolution of XML schemas involving partial data copy
US7870163B2 (en) 2006-09-28 2011-01-11 Oracle International Corporation Implementation of backward compatible XML schema evolution in a relational database system
US20080082560A1 (en) * 2006-09-28 2008-04-03 Oracle International Corporation Implementation of backward compatible XML schema evolution
US20090300280A1 (en) * 2008-06-02 2009-12-03 Curtis Edward Jutzi Detecting data mining processes to increase caching efficiency
US8019939B2 (en) * 2008-06-02 2011-09-13 Intel Corporation Detecting data mining processes to increase caching efficiency
US20100146510A1 (en) * 2008-12-10 2010-06-10 Jan Teichmann Automated Scheduling of Mass Data Run Objects
US8555241B2 (en) 2008-12-10 2013-10-08 Sap Ag Automated scheduling of mass data run objects
US8577833B2 (en) 2012-01-04 2013-11-05 International Business Machines Corporation Automated data analysis and transformation
US8768880B2 (en) 2012-01-04 2014-07-01 International Business Machines Corporation Automated data analysis and transformation
US10078685B1 (en) 2012-01-09 2018-09-18 W. C. Taylor, III Data gathering and data re-presentation tools
US20130198093A1 (en) * 2012-01-09 2013-08-01 W. C. Taylor, III Data mining and logic checking tools
US10885067B2 (en) 2012-01-09 2021-01-05 W. C. Taylor, III Data gathering and data re-presentation tools
US9361656B2 (en) * 2012-01-09 2016-06-07 W. C. Taylor, III Data mining and logic checking tools
US11030059B2 (en) 2012-03-23 2021-06-08 Commvault Systems, Inc. Automation of data storage activities
US10824515B2 (en) 2012-03-23 2020-11-03 Commvault Systems, Inc. Automation of data storage activities
US20130253977A1 (en) * 2012-03-23 2013-09-26 Commvault Systems, Inc. Automation of data storage activities
US9292815B2 (en) 2012-03-23 2016-03-22 Commvault Systems, Inc. Automation of data storage activities
US11550670B2 (en) 2012-03-23 2023-01-10 Commvault Systems, Inc. Automation of data storage activities
US20140012862A1 (en) * 2012-07-04 2014-01-09 Sony Corporation Information processing apparatus, information processing method, program, and information processing system
US9582555B2 (en) * 2012-09-06 2017-02-28 Sap Se Data enrichment using business compendium
US20140067803A1 (en) * 2012-09-06 2014-03-06 Sap Ag Data Enrichment Using Business Compendium
US9324033B2 (en) * 2012-09-13 2016-04-26 Nokia Technologies Oy Method and apparatus for providing standard data processing model through machine learning
US20140074760A1 (en) * 2012-09-13 2014-03-13 Nokia Corporation Method and apparatus for providing standard data processing model through machine learning
US9515793B2 (en) * 2012-11-29 2016-12-06 Telefonaktiebolaget Lm Ericsson (Publ) Network resource configuration
US20140177544A1 (en) * 2012-11-29 2014-06-26 Telefonaktiebolaget L M Ericsson (Publ) Network resource configuration
US10860401B2 (en) 2014-02-27 2020-12-08 Commvault Systems, Inc. Work flow management for an information management system
US9898515B1 (en) * 2014-10-29 2018-02-20 Jpmorgan Chase Bank, N.A. Data extraction and transformation method and system
US10515090B2 (en) * 2014-10-29 2019-12-24 Jpmorgan Chase Bank, N.A. Data extraction and transformation method and system
FR3032538A1 (en) * 2015-02-09 2016-08-12 Orbite COMPUTER SYSTEM FOR AUTOMATIC DATA COLLECTION
EP3276504A4 (en) * 2015-03-24 2018-12-12 Gixo Ltd. Data processing system, data processing method, program, and computer memory medium
US11263600B2 (en) 2015-03-24 2022-03-01 4 S Technologies, LLC Automated trustee payments system
US10762066B2 (en) 2015-03-24 2020-09-01 Gixo Ltd. Data processing system having an integration layer, aggregation layer, and analysis layer, data processing method for the same, program for the same, and computer storage medium for the same
US11138193B2 (en) * 2015-05-29 2021-10-05 International Business Machines Corporation Estimating the cost of data-mining services
US10306013B2 (en) * 2015-07-15 2019-05-28 Sap Se Churn risk scoring using call network analysis
US10592917B2 (en) * 2016-05-03 2020-03-17 Cox Automotive, Inc. Method and systems for determining programmatically expected performances
US20170323326A1 (en) * 2016-05-03 2017-11-09 Eric Kim Method and systems for determining programmatically expected performances
US20170337567A1 (en) * 2016-05-17 2017-11-23 Sap Se Real-time system to identify and analyze behavioral patterns to predict churn risk and increase retention
US10600063B2 (en) * 2016-05-17 2020-03-24 Sap Se Real-time system to identify and analyze behavioral patterns to predict churn risk and increase retention
US11126616B2 (en) 2016-10-19 2021-09-21 Salesforce.Com, Inc. Streamlined creation and updating of olap analytic databases
WO2018075817A1 (en) * 2016-10-19 2018-04-26 Salesforce.Com, Inc. Streamlined creation and updating of olap analytic databases
US10311047B2 (en) 2016-10-19 2019-06-04 Salesforce.Com, Inc. Streamlined creation and updating of OLAP analytic databases
US10599527B2 (en) 2017-03-29 2020-03-24 Commvault Systems, Inc. Information management cell health monitoring system
US11314602B2 (en) 2017-03-29 2022-04-26 Commvault Systems, Inc. Information management security health monitoring system
US11734127B2 (en) 2017-03-29 2023-08-22 Commvault Systems, Inc. Information management cell health monitoring system
US11829255B2 (en) 2017-03-29 2023-11-28 Commvault Systems, Inc. Information management security health monitoring system
US20180300388A1 (en) * 2017-04-17 2018-10-18 International Business Machines Corporation System and method for automatic data enrichment from multiple public datasets in data integration tools
US20190095513A1 (en) * 2017-04-17 2019-03-28 International Business Machines Corporation System and method for automatic data enrichment from multiple public datasets in data integration tools
US20180341889A1 (en) * 2017-05-25 2018-11-29 Centene Corporation Entity level classifier using machine learning
US11687567B2 (en) * 2017-09-21 2023-06-27 Vmware, Inc. Trigger based analytics database synchronization
US11113702B1 (en) * 2018-12-12 2021-09-07 Amazon Technologies, Inc. Online product subscription recommendations based on a customers failure to perform a computer-based action and a monetary value threshold

Also Published As

Publication number Publication date
US7571191B2 (en) 2009-08-04
WO2004097667A2 (en) 2004-11-11
US20050027683A1 (en) 2005-02-03
EP1623343A2 (en) 2006-02-08
WO2004097667A3 (en) 2005-02-17
US20040267751A1 (en) 2004-12-30

Similar Documents

Publication Publication Date Title
US20040215656A1 (en) Automated data mining runs
US20180225345A1 (en) Systems and methods for collection and consolidation of heterogeneous remote business data using dynamic data handling
US9684703B2 (en) Method and apparatus for automatically creating a data warehouse and OLAP cube
US7308704B2 (en) Data structure for access control
US6367077B1 (en) Method of upgrading a software application in the presence of user modifications
US10956422B2 (en) Integrating event processing with map-reduce
US7350237B2 (en) Managing access control information
US8671084B2 (en) Updating a data warehouse schema based on changes in an observation model
EP1610235B1 (en) A data processing system and method
US8626703B2 (en) Enterprise resource planning (ERP) system change data capture
US8407183B2 (en) Business intelligence data extraction on demand
WO2005106711A1 (en) Method and apparatus for automatically creating a data warehouse and olap cube
US20220092117A1 (en) System and method of creating different relationships between various entities using a graph database
US11327954B2 (en) Multitenant architecture for prior period adjustment processing
Kimball The evolving role of the enterprise data warehouse in the era of big data analytics
US8396847B2 (en) System and method to retrieve and analyze data for decision making
Saxena et al. Business intelligence
US20210334274A1 (en) Systems and methods for monitoring user-defined metrics
US7676443B2 (en) System and method for processing data elements in retail sales environment
JP2001195288A (en) Trunk task package and method for selling the same
US20220292420A1 (en) Survey and Result Analysis Cycle Using Experience and Operations Data
EP1324229A2 (en) Using point-in-time views to provide varying levels of data freshness
Sonnleitner et al. Persistence of workflow control data in temporal databases
Calderon et al. Leveraging Internet and Database Technologies to Enhance Internal Business Processes
Druzhinina et al. A subsystem for the management of working hours during the operation of automated technology for the processing of information about scientific activities

Legal Events

Date Code Title Description
AS Assignment

Owner name: SAP AKTIENGESELLSCHAFT, GERMANY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DILL, MARCUS;MAHABAL, HARISH HOSKERE;SHANKAR, LAKSHMI;AND OTHERS;REEL/FRAME:014030/0282;SIGNING DATES FROM 20030806 TO 20030821

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION