US20150120263A1 - Computer-Implemented Systems and Methods for Testing Large Scale Automatic Forecast Combinations - Google Patents

Publication number
US20150120263A1
Authority
US
United States
Prior art keywords
forecast
model
post
time series
combination model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/557,312
Inventor
Jerzy Michal Brzezicki
Dinesh P. Apte
Michael J. Leonard
Michael Ryan Chipley
Sagar Arun Mainkar
Edward Tilden Blair
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SAS Institute Inc
Original Assignee
SAS Institute Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/189,131 (US20130024167A1)
Application filed by SAS Institute Inc
Priority to US14/557,312 (US20150120263A1)
Assigned to SAS INSTITUTE INC. Assignment of assignors interest (see document for details). Assignors: APTE, DINESH P., BLAIR, EDWARD T., BRZEZICKI, JERZY MICHAL, CHIPLEY, MICHAEL RYAN, LEONARD, MICHAEL J., MAINKAR, SAGAR ARUN
Publication of US20150120263A1
Current legal status: Abandoned

Classifications

    • G06F17/5009
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/08 Probabilistic or stochastic CAD

Definitions

  • This document relates generally to computer-implemented forecasting and more particularly to testing a combined forecast that is generated using multiple forecasts.
  • Forecasting is a process of making statements about events whose actual outcomes typically have not yet been observed. A commonplace example might be estimation for some variable of interest at some specified future date. Forecasting often involves formal statistical methods employing time series, cross-sectional, or longitudinal data, or, alternatively, less formal judgmental methods. Forecasts are often generated by providing a number of input values to a predictive model, where the model outputs a forecast. While a well-designed model may give an accurate forecast, a configuration where predictions of multiple models are considered when generating a forecast may provide even stronger forecast results.
  • a forecast model selection graph is accessed, the forecast model selection graph comprising a hierarchy of nodes arranged in parent-child relationships.
  • a plurality of model forecast nodes are resolved, where resolving a model forecast node includes generating a node forecast for the one or more physical process attributes.
  • a combination node is processed, where a combination node transforms a plurality of node forecasts at child nodes of the combination node into a combined forecast.
  • a selection node is processed, where a selection node chooses a node forecast from among child nodes of the selection node based on a selection criteria.
  • a system for evaluating a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes, where data for evaluating the physical process is generated over time, may include one or more data processors and a computer-readable medium encoded with instructions for commanding the one or more data processors to execute steps.
  • In the steps, a forecast model selection graph is accessed, the forecast model selection graph comprising a hierarchy of nodes arranged in parent-child relationships.
  • a plurality of model forecast nodes are resolved, where resolving a model forecast node includes generating a node forecast for the one or more physical process attributes.
  • a combination node is processed, where a combination node transforms a plurality of node forecasts at child nodes of the combination node into a combined forecast.
  • a selection node is processed, where a selection node chooses a node forecast from among child nodes of the selection node based on a selection criteria.
  • a computer-readable storage medium may be encoded with instructions for commanding one or more data processors to execute a method.
  • a forecast model selection graph is accessed, the forecast model selection graph comprising a hierarchy of nodes arranged in parent-child relationships.
  • a plurality of model forecast nodes are resolved, where resolving a model forecast node includes generating a node forecast for the one or more physical process attributes.
  • a combination node is processed, where a combination node transforms a plurality of node forecasts at child nodes of the combination node into a combined forecast.
  • a selection node is processed, where a selection node chooses a node forecast from among child nodes of the selection node based on a selection criteria.
  • one or more computer-readable storage mediums may store data structures for access by an application program being executed on one or more data processors for evaluating a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes, where physical process data generated over time is used in the forecasts for the one or more physical process attributes.
  • the data structures may include a predictive models data structure, the predictive models data structure containing predictive data model records for specifying predictive data models and a forecast model selection graph data structure, where the forecast model selection graph data structure contains data about a hierarchical structure of nodes which specify how the forecasts for the one or more physical process attributes are combined, where the hierarchical structure of nodes has a root node wherein the nodes include model forecast nodes, one or more model combination nodes, and one or more model selection nodes.
  • the forecast model selection graph data structure may include model forecast node data which specifies for the model forecast nodes which particular predictive data models contained in the predictive models data structure are to be used for generating forecasts, model combination node data which specifies for the one or more model combination nodes which of the forecasts generated by the model forecast nodes are to be combined, and selection node data which specifies for the one or more model selection nodes model selection criteria for selecting, based upon model forecasting performance, models associated with the model forecast nodes or the one or more model combination nodes.
  • a plurality of forecasting models may be generated using a set of in-sample data.
  • a selection of two or more forecasting models may be received from the plurality of forecasting models for use in generating a combined forecast.
  • a set of actual out-of-sample data may be received.
  • An ex-ante combined forecast may be generated for an out-of-sample period using the selected two or more forecasting models.
  • the ex-ante combined forecast and the set of actual out-of-sample data may be provided for use in evaluating performance of the combined forecast.
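  • As an illustration of the out-of-sample evaluation outlined above, the following Python sketch builds forecasts from in-sample data only, combines them, and compares the ex-ante combined forecast with actual out-of-sample values. The two toy models (`last_value_model`, `drift_model`), the simple-average combination, and the mean absolute error metric are assumptions chosen for brevity; the source does not prescribe any of them.

```python
from typing import Callable, List, Sequence

# A forecasting model maps (history, horizon) to a list of forecasts.
Model = Callable[[Sequence[float], int], List[float]]


def last_value_model(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical model: repeat the last observed value."""
    return [float(history[-1])] * horizon


def drift_model(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical model: extend the average in-sample slope."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def ex_ante_combined_forecast(
    in_sample: Sequence[float], models: List[Model], horizon: int
) -> List[float]:
    """Forecast the out-of-sample period using in-sample data only,
    then combine the selected models with a simple average (an assumption)."""
    forecasts = [model(in_sample, horizon) for model in models]
    return [sum(values) / len(values) for values in zip(*forecasts)]


def mean_absolute_error(forecast: Sequence[float], actual: Sequence[float]) -> float:
    """One possible performance measure for the combined forecast."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


in_sample = [10.0, 12.0, 14.0, 16.0]
out_of_sample = [18.0, 21.0]  # actual values received later

combined = ex_ante_combined_forecast(
    in_sample, [last_value_model, drift_model], horizon=len(out_of_sample)
)
error = mean_absolute_error(combined, out_of_sample)
```

  • Because the models never see the out-of-sample values, `error` reflects how the combined forecast would have performed had it been issued at the end of the in-sample period.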
  • FIG. 1 is a block diagram depicting a computer-implemented combined forecast engine.
  • FIG. 2 is a block diagram depicting the generation of a combined forecast for a forecast variable.
  • FIG. 3 is a block diagram depicting steps that may be performed by a combined forecast engine in generating a combined forecast.
  • FIG. 4 depicts an example forecast model selection graph.
  • FIG. 5 depicts an example forecast model selection graph including selection nodes, combination nodes, and model forecast nodes.
  • FIG. 6 is a block diagram depicting example operations that may be performed by a combined forecast engine in combining one or more forecasts.
  • FIG. 7 is a flow diagram depicting an example redundancy test in the form of an encompassing test.
  • FIG. 8 depicts a forecast model selection graph having a selection node as a root node.
  • FIG. 9 depicts a forecast model selection graph having a combination node as a root node.
  • FIG. 10 depicts an example model repository for storing predictive models.
  • FIG. 11 depicts a link between a forecast model selection graph and a model repository.
  • FIG. 12 is a diagram depicting relationships among a forecast model selection graph data structure, a models data structure, and a combined forecast engine.
  • FIG. 13 depicts an example forecast model selection graph data structure.
  • FIG. 14 depicts an example node record.
  • FIGS. 15-32 depict graphical user interfaces that may be used in generating and comparing combined forecasts.
  • FIGS. 33A, 33B, and 33C depict example systems for use in implementing a combined forecast engine.
  • FIG. 34 is a block diagram depicting an example system for evaluating the performance of a combined forecast model using a rolling simulation analysis.
  • FIGS. 35-41 depict example user interfaces for evaluating the performance of a combined forecast model using a rolling simulation analysis.
  • FIG. 1 is a block diagram depicting a computer-implemented combined forecast engine.
  • FIG. 1 depicts a computer-implemented combined forecast engine 102 for facilitating the creation of combined forecasts and evaluation of created combined forecasts against individual forecasts as well as other combined forecasts.
  • Forecasts are predictions that are typically generated by a predictive model based on one or more inputs to the predictive model.
  • a combined forecast engine 102 combines predictions made by multiple models, of the same or different type, to generate a single, combined forecast that can incorporate the strengths of the multiple, individual models which comprise the combined forecast.
  • a combined forecast may be generated (e.g., to predict a manufacturing process output, to estimate product sales) by combining individual forecasts from two linear regression models and one autoregressive regression model.
  • the individual forecasts may be combined in a variety of ways, such as by a straight average, via a weighted average, or via another method.
  • automated analysis of the individual forecasts may be performed to identify weights to generate an optimum combined forecast that best utilizes the available individual forecasts.
  • the combined forecast engine 102 provides a platform for users 104 to generate combined forecasts based on individual forecasts generated by individual predictive models 106.
  • a user 104 accesses the combined forecast engine 102, which is hosted on one or more servers 108, via one or more networks 110.
  • the one or more servers 108 are responsive to one or more data stores 112.
  • the one or more data stores 112 may contain a variety of data that includes predictive models 106 and model forecasts 114.
  • FIG. 2 is a block diagram depicting the generation of a combined forecast for a forecast variable (e.g., one or more physical process attributes).
  • the combined forecast engine 202 receives an identification of a forecast variable 204 for which to generate a combined forecast 206 .
  • a user may command that the combined forecast engine 202 generate a combined forecast 206 of sales for a particular clothing item.
  • the combined forecast engine 202 may identify a number of individual predictive models. Those individual predictive models may be provided historic data 208 as input, and those individual predictive models provide individual forecasts based on the provided historic data 208 .
  • the combined forecast engine 202 performs operations to combine those individual predictions of sales of the particular clothing item to generate the combined forecast of sales for the particular clothing item.
  • FIG. 3 is a block diagram depicting steps that may be performed by a combined forecast engine in generating a combined forecast.
  • the combined forecast engine 302 receives a forecast variable 304 for which to generate a combined forecast as well as historic data 306 to be used as input to individual predictive models whose predictions become components of the combined forecast 308 .
  • the combined forecast engine 302 may utilize model selection and model combination operations to generate a combined forecast. For example, the combined forecast engine 302 may evaluate a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes. Data for evaluating the physical process may be generated over time, such as time series data.
  • the combined forecast engine accesses a forecast model selection graph.
  • a forecast model selection graph incorporates both model selection and model combination into a decision based framework that, when applied to a time series, automatically selects a forecast from an evaluation of independent, individual forecasts generated.
  • the forecast model selection graph can include forecasts from statistical models, external forecasts from outside agents (e.g., expert predictions, other forecasts generated outside of the combined forecast engine 302 ), or combinations thereof.
  • the forecast model selection graph may be used to generate combined forecasts as well as comparisons among competing generated forecasts to select a best forecast.
  • a forecast model selection graph for a forecast decision process of arbitrary complexity may be created, limited only by external factors such as computational power and machine resource limits.
  • a forecast model selection graph may include a hierarchy of nodes arranged in parent-child relationships including a root node.
  • the hierarchy may include one or more selection nodes, one or more combination nodes, and a plurality of model forecast nodes.
  • Each of the model forecast nodes is associated with a predictive model.
  • the combined forecast engine may resolve the plurality of model forecast nodes, as shown at 312 .
  • Resolving a model forecast node includes generating a node forecast for the forecast variable 304 using the predictive model for the model forecast node.
  • a first model forecast node may be associated with a regression model.
  • the combined forecast engine 302 provides the historic data 306 to the regression model, and the regression model generates a node forecast for the model forecast node.
  • a second model forecast node may be associated with a human expert prediction. In such a case, computation by the combined forecast engine 302 may be limited, such as simply accessing the human expert's prediction from storage.
  • a third model forecast node may be associated with a different combined model. To resolve the third model forecast node, the combined forecast engine 302 provides the historic data 306 to the different combined model, and the different combined model generates a node forecast for the model forecast node. Other types of models and forecasts may also be associated with a model forecast node.
  • the combined forecast engine processes a combination node.
  • the combined forecast engine 302 transforms a plurality of node forecasts at child nodes of the combination nodes into a combined forecast. For example, a combination node having three child nodes would have the node forecasts for those three child nodes combined into a combined forecast for the combination node.
  • Combining node forecasts may be done in a variety of ways, such as via a weighted average.
  • a weighted average may weight each of the three node forecasts equally, or the combined forecast engine 302 may implement more complex logic to identify a weight for each of the three node forecasts.
  • weight types may include a simple average, user-defined weights, rank weights, ranked user-weights, AICC weights, root mean square error weights, restricted least squares weights, OLS weights, and least absolute deviation weights.
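  • A minimal Python sketch of a weighted-average combination at a combination node follows. It covers only three of the listed weight types. The source names the weight types but does not give formulas, so the inverse-RMSE definition used for "root mean square error weights" here is an assumption, as are all function names.

```python
from math import sqrt
from typing import List, Sequence


def simple_average_weights(n: int) -> List[float]:
    """Equal weight for each of n node forecasts."""
    return [1.0 / n] * n


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale user-defined (or any positive) weights so they sum to one."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w / total for w in weights]


def inverse_rmse_weights(
    forecasts: Sequence[Sequence[float]], actuals: Sequence[float]
) -> List[float]:
    """Assumed definition: weight each forecast by 1 / RMSE against history."""
    inverse = []
    for forecast in forecasts:
        mse = sum((f - a) ** 2 for f, a in zip(forecast, actuals)) / len(actuals)
        rmse = sqrt(mse)
        if rmse == 0:
            raise ValueError("a perfect fit needs separate handling")
        inverse.append(1.0 / rmse)
    return normalize(inverse)


def combine(
    forecasts: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of the child node forecasts, period by period."""
    horizon = len(forecasts[0])
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(horizon)
    ]


child_forecasts = [[10.0, 20.0], [20.0, 40.0]]
equal = combine(child_forecasts, simple_average_weights(2))
user = combine(child_forecasts, normalize([3.0, 1.0]))
rmse_w = inverse_rmse_weights([[1.0, 1.0], [4.0, 4.0]], [2.0, 2.0])
```

  • Keeping weight computation separate from `combine` means a different weighting scheme can be swapped in without touching the aggregation step.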
  • the combined forecast engine processes a selection node.
  • the combined forecast engine 302 chooses a node forecast from among child nodes of the selection node based on a selection criteria.
  • the selection criteria may take a variety of forms. For example, the selection criteria may dictate selection of a node forecast associated with a node whose associated model performs best in a hold out sample analysis.
  • Metadata may be associated with models associated with node forecasts, where the metadata identifies a model characteristic of a model.
  • the selection criteria may dictate selection of a node forecast whose metadata model characteristic best matches a characteristic of the forecast variable 304 . For example, if the forecast variable 304 tends to behave in a seasonal pattern, then the selection criteria may dictate selection of a node forecast that was generated by a model whose metadata identifies it as handling seasonal data.
  • model metadata characteristics include trending model, intermittent model, and transformed model.
  • the selection criteria may dictate selection of a node forecast having the least amount of missing data.
  • a node forecast may include forecasts for the forecast variable 304 for a number of time periods in the future (e.g., forecast variable at t+1, forecast variable at t+2, . . . ).
  • a node forecast may be missing data for certain future time period forecasts (e.g., the node forecast is an expert's prediction, where the expert only makes one prediction at t+6 months).
  • the selection criteria may dictate that a selected node forecast must not be missing a forecast at the time period of interest (e.g., when the time period of interest is t+1 month, the node forecast including the expert's prediction may not be selected).
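  • The missing-data criteria described above can be sketched in Python as follows. Representing a node forecast as a list with `None` for missing periods, and breaking ties by the fewest missing values, are assumptions made for this illustration.

```python
from typing import Dict, List, Optional

# A node forecast holds one value per future period; None marks a missing value.
NodeForecast = List[Optional[float]]


def count_missing(forecast: NodeForecast) -> int:
    """Number of future periods for which the forecast has no value."""
    return sum(value is None for value in forecast)


def select_forecast(node_forecasts: Dict[str, NodeForecast], period_index: int) -> str:
    """Return the name of the node forecast that is present at the period of
    interest and, among those, has the least missing data."""
    eligible = {
        name: forecast
        for name, forecast in node_forecasts.items()
        if period_index < len(forecast) and forecast[period_index] is not None
    }
    if not eligible:
        raise ValueError("no node forecast covers the period of interest")
    return min(eligible, key=lambda name: count_missing(eligible[name]))


forecasts = {
    # Expert who only predicts the sixth future period (index 5).
    "expert": [None, None, None, None, None, 120.0],
    "regression": [100.0, 101.0, 102.0, 103.0, 104.0, 105.0],
    "partial": [100.0, None, 102.0, None, None, None],
}

chosen_next_period = select_forecast(forecasts, period_index=0)
chosen_sixth_period = select_forecast(
    {"expert": forecasts["expert"], "partial": forecasts["partial"]}, period_index=5
)
```

  • With the period of interest at index 0, the expert forecast is ineligible because it has no value there, which mirrors the t+1 month example in the text.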
  • the selection criteria may be based on a statistic of fit.
  • the combined forecast engine 302 may fit models associated with child nodes of a selection node with the historic data 306 and calculate statistics of fit for those models. Based on the determined statistics of fit, the combined forecast engine 302 selects the forecast node associated with the model that is a best fit.
  • the combined forecast engine 302 may continue resolving model forecast nodes 312 and processing combination and selection nodes 314 , 316 until a final combined forecast is generated.
  • the combined forecast engine may work from the leaves up to the root in the forecast model selection graph hierarchy, where the final combined forecast is generated at the root node.
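  • The leaves-to-root processing described above can be sketched as a recursive evaluation in Python. This is a simplified illustration, not the engine's implementation: the `Node` class, the two toy models, the simple-average combination, and the hold-out mean absolute error used as the selection criterion are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Model = Callable[[Sequence[float], int], List[float]]


@dataclass
class Node:
    name: str
    kind: str  # "model", "combination", or "selection"
    children: List["Node"] = field(default_factory=list)
    model: Optional[Model] = None  # only used by "model" nodes


def mean_model(history: Sequence[float], horizon: int) -> List[float]:
    return [sum(history) / len(history)] * horizon


def last_value_model(history: Sequence[float], horizon: int) -> List[float]:
    return [float(history[-1])] * horizon


def evaluate(node: Node, history: Sequence[float], horizon: int, holdout: int = 2) -> List[float]:
    """Resolve a node by first resolving its children (leaves up to the root)."""
    if node.kind == "model":
        return node.model(history, horizon)
    if node.kind == "combination":
        child_forecasts = [evaluate(c, history, horizon, holdout) for c in node.children]
        return [sum(values) / len(values) for values in zip(*child_forecasts)]
    if node.kind == "selection":
        fit, actual = history[:-holdout], history[-holdout:]

        def holdout_error(child: Node) -> float:
            predicted = evaluate(child, fit, holdout, holdout)
            return sum(abs(p - a) for p, a in zip(predicted, actual)) / holdout

        best = min(node.children, key=holdout_error)
        return evaluate(best, history, horizon, holdout)
    raise ValueError(f"unknown node kind: {node.kind}")


mf1 = Node("MF1", "model", model=mean_model)
mf2 = Node("MF2", "model", model=last_value_model)
mf3 = Node("MF3", "model", model=mean_model)
c1 = Node("C1", "combination", children=[mf1, mf2])
root = Node("S1", "selection", children=[c1, mf3])

history = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
final_forecast = evaluate(root, history, horizon=2)
```

  • In this example the root selection node prefers the combination node `C1` because its hold-out error is lower than that of `MF3`, so the final forecast is the average of the two child forecasts of `C1`.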
  • FIG. 4 depicts an example forecast model selection graph.
  • the forecast model selection graph includes a hierarchy of nodes arranged in parent-child relationships that includes a root node 402 .
  • the forecast model selection graph also includes two model forecast nodes 404 .
  • the model forecast nodes 404 may be associated with a model that can be used to forecast one or more values for a forecast variable.
  • a model associated with a model forecast node 404 may also be a combined model or a forecast generated outside of the combined forecast engine, such as an expert or other human generated forecast.
  • the model forecast nodes 404 are resolved to identify a node forecast (e.g., using an associated model to generate a node forecast, accessing an expert forecast from storage).
  • the forecast model selection graph also includes selection nodes 406 .
  • a selection node may include a selection criteria for choosing a node forecast from among child nodes (e.g., model forecast nodes 404) of the selection node 406. Certain of the depicted selection nodes S1, S2, Sn do not have their child nodes depicted in FIG. 4.
  • FIG. 5 depicts an example forecast model selection graph including selection nodes, combination nodes, and model forecast nodes.
  • model forecast nodes 502 are resolved to generate node forecasts for one or more forecast variables (e.g., physical process attributes).
  • a selection node 504 selects one of the node forecasts associated with the model forecast nodes 502 based on a selection criteria.
  • the selection criteria may dictate a model forecast based on metadata associated with a model used to generate the model forecast at the model forecast node 502 .
  • Additional model forecast nodes 506 may be resolved to generate node forecasts at those model forecast nodes 506 .
  • a first combined forecast node 508 combines a model forecast associated with model forecast node MF1_1 and the model forecast at the selection node 504 to generate a combined forecast at the combination node 508.
  • a second combined forecast node 510 combines a model forecast associated with model forecast node MF2_1 and the model forecast at the selection node 504 to generate a combined forecast at the combination node 510.
  • Another selection node 512 selects a model forecast from one of the two combination nodes 508, 510 based on a selection criteria as the final combined forecast for the forecast model selection graph 500.
  • a forecast model selection graph may take a variety of forms.
  • the forecast model selection graph may be represented in one or more records in a database or described in a file.
  • the forecast model selection graph may be represented via one or more XML based data structures.
  • the XML data structures may identify the forecast sources to combine, diagnostic tests used in the selection and filtering of forecasts, methods for determining weights to forecasts to be combined, treatment of missing values, and selection of methods for estimating forecast prediction error variance.
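  • To make the XML representation concrete, the Python sketch below parses a small, hypothetical graph description with the standard library. The element names (`selection`, `combination`, `model`) and attributes (`criterion`, `weights`, `missing`, `ref`) are invented for illustration; the source does not disclose the actual schema.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical XML layout: nesting expresses the parent-child hierarchy.
GRAPH_XML = """
<selection name="S_root" criterion="holdout_mae">
  <combination name="C1" weights="simple_average" missing="skip">
    <model name="MF1" ref="regression_01"/>
    <model name="MF2" ref="arima_02"/>
  </combination>
  <model name="MF3" ref="expert_forecast"/>
</selection>
"""


def describe(element: ET.Element, depth: int = 0) -> List[str]:
    """Return an indented outline of the graph, one line per node."""
    lines = [f"{'  ' * depth}{element.tag}:{element.get('name')}"]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(GRAPH_XML.strip())
outline = describe(root)
combination_weights = root.find("combination").get("weights")
```

  • Joining `outline` with newlines prints the hierarchy with the root selection node at the top and model forecast nodes as leaves.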
  • FIG. 6 is a block diagram depicting example operations that may be performed by a combined forecast engine in combining one or more forecasts (e.g., when processing a combination node).
  • an initial set of model forecasts is identified. In some implementations, all identified model forecasts may be combined to create a combined forecast. However, in some implementations, it may be desirable to filter the models used in creating a combined forecast.
  • the set of model forecasts may be reduced at 604 based on one or more forecast candidate tests.
  • the forecast candidate tests may take a variety of forms, such as analysis of the types of models used to generate the model forecasts identified at 602 and characteristics of the forecast variable. For example, if the forecast variable is a trending variable, the candidate tests may eliminate model forecasts generated by models that are designed to handle seasonal data.
  • the set of model forecasts may be reduced based on one or more forecast quality tests.
  • Forecast quality tests may take a variety of forms. For example, forecast quality tests may analyze missing values of model forecasts: model forecasts may be filtered from the set if they have missing values in an area of interest (e.g., a forecast horizon). In another example, a model forecast may be filtered from the set if it is missing more than a particular percentage of values in the forecast horizon.
  • the set of model forecasts may be reduced based on redundancy tests.
  • a redundancy test may analyze models associated with model forecast nodes to identify robust models and those models having a high degree of redundancy (e.g., models that are producing forecasts that are statistically too similar). Model forecasts having a high degree of redundancy may be excluded from the combined model being generated.
  • certain statistics for a combined forecast may be determined. For example, a prediction error variance estimate may be calculated.
  • the prediction error variance estimate may incorporate pair-wise correlation estimates between the individual forecast prediction errors for the predictions that make up the combined forecast and their associated prediction error variances.
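  • One standard way to build such an estimate is the variance of a weighted sum: the sum over all pairs (i, j) of w_i * w_j * rho_ij * sigma_i * sigma_j, where w are the combination weights, sigma^2 the individual prediction error variances, and rho the pair-wise error correlations. The Python sketch below applies that textbook formula; whether the engine uses exactly this estimator is an assumption.

```python
from math import sqrt
from typing import Sequence


def combined_error_variance(
    weights: Sequence[float],
    variances: Sequence[float],
    correlations: Sequence[Sequence[float]],
) -> float:
    """Variance of the combined prediction error for a weighted average.

    correlations[i][j] is the estimated correlation between the prediction
    errors of forecasts i and j (with 1.0 on the diagonal).
    """
    n = len(weights)
    total = 0.0
    for i in range(n):
        for j in range(n):
            covariance = correlations[i][j] * sqrt(variances[i] * variances[j])
            total += weights[i] * weights[j] * covariance
    return total


weights = [0.5, 0.5]
variances = [4.0, 4.0]

uncorrelated = combined_error_variance(weights, variances, [[1.0, 0.0], [0.0, 1.0]])
fully_correlated = combined_error_variance(weights, variances, [[1.0, 1.0], [1.0, 1.0]])
```

  • The two cases show why the correlations matter: averaging two uncorrelated forecasts halves the error variance, while averaging two perfectly correlated forecasts does not reduce it at all.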
  • FIG. 7 is a flow diagram depicting an example redundancy test in the form of an encompassing test.
  • the set of model forecasts is shown at 702 .
  • each model in the set 702 is analyzed to determine whether the current model forecast is redundant (e.g., whether the information in the current model forecast is already represented in the continuing set of forecasts 706). If the current model forecast is redundant, then it is excluded. If the current model forecast is not redundant, then it remains in the set of forecasts 706.
  • weights are assigned to the model forecasts remaining in the set. Weights may be assigned using a number of different algorithms. For example, weights may be assigned as a straight average of the set of remaining model forecasts, or more complex processes may be implemented, such as a least absolute deviation procedure. At 612, the weighted model forecasts are aggregated to generate a combined forecast.
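  • The filter-then-combine flow above can be sketched in Python as shown below. Note the substitution: instead of a formal encompassing test, this sketch uses a simple correlation threshold as a stand-in redundancy check, and it uses straight-average weights. The threshold value, function names, and sample data are assumptions for illustration only.

```python
from math import sqrt
from typing import Dict, List, Optional, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def drop_redundant(forecasts: Dict[str, List[float]], threshold: float) -> Dict[str, List[float]]:
    """Keep a forecast only if it is not too similar to one already kept.
    This correlation check is a stand-in, not an encompassing test."""
    kept: Dict[str, List[float]] = {}
    for name, values in forecasts.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


def combine_pipeline(
    forecasts: Dict[str, List[Optional[float]]], threshold: float = 0.99
) -> Tuple[List[str], List[float]]:
    # Quality test: drop forecasts with any missing value in the horizon.
    complete = {n: v for n, v in forecasts.items() if all(x is not None for x in v)}
    # Redundancy test (simplified stand-in).
    kept = drop_redundant(complete, threshold)
    if not kept:
        raise ValueError("no model forecasts left to combine")
    # Straight-average weights, then aggregation.
    weight = 1.0 / len(kept)
    horizon = len(next(iter(kept.values())))
    combined = [sum(weight * v[t] for v in kept.values()) for t in range(horizon)]
    return list(kept), combined


candidates = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],   # moves in lockstep with "a"
    "c": [4.0, 1.0, 3.0, 2.0],
    "d": [1.0, None, 3.0, 4.0],  # fails the quality test
}
kept_names, combined = combine_pipeline(candidates)
```

  • Here "d" is removed by the quality test and "b" by the redundancy check, so only "a" and "c" contribute to the combined forecast.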
  • FIG. 8 depicts a forecast model selection graph having a selection node as a root node.
  • a number of node forecasts 802 are resolved (e.g., by generating node forecasts using a model, accessing externally generated forecasts from computer memory).
  • a combination node 804 combines the model forecasts of child nodes 806 of the combination node 804 .
  • a selection node 808 selects a forecast from among the combination node 804 and model forecasts at child nodes 810 of the selection node 808 based on a selection criteria.
  • FIG. 9 depicts a forecast model selection graph having a combination node as a root node.
  • a number of node forecasts 902 are resolved (e.g., by generating node forecasts using a model, accessing externally generated forecasts from memory).
  • a selection node 904 selects a model forecast from the child nodes 906 of the selection node.
  • a combination node 908 combines the model forecast from the selection node 904 and model forecasts at child nodes 910 of the combination node 908 to generate a combined forecast.
  • a model forecast node may be associated with a predictive model that is used to generate a model forecast for the model forecast node.
  • the predictive models may be stored in a model repository for convenient access.
  • FIG. 10 depicts an example model repository for storing predictive models.
  • the model repository 1002 includes a number of model records 1004 .
  • a model record may contain model data for implementing a predictive model 1006 .
  • a model record 1004 may contain a reference to where data for implementing the predictive model 1006 can be found (e.g., a file location, a pointer to a memory location, a reference to a record in a database).
  • Other example details of a model repository are described in U.S. Pat. No. 7,809,729, entitled “Model Repository,” the entirety of which is herein incorporated by reference.
  • a model repository configuration may streamline the data contained in a forecast model selection graph.
  • FIG. 11 depicts a link between a forecast model selection graph and a model repository.
  • a forecast model selection graph 1102 includes a number of model forecast nodes MF1, MF2, MF3, MF4, a selection node S1, and a combination node C1.
  • the model forecast nodes are resolved to generate node forecasts.
  • One of the model forecast nodes, MF4, is associated with a model record 1104.
  • The model forecast node MF4 may contain an index value for the model record 1104.
  • the model record 1104 is stored in the model repository 1106 and may contain data for implementing a predictive model to generate the node forecast, or the model record may contain a reference to the location of such data 1108, such as a location in the model repository 1106.
  • When the model forecast node MF4 is to be resolved, the model record 1104 is located based on the index identified by the model forecast node MF4.
  • Data for the desired predictive model 1108 to be used to generate the node forecast is located in the model repository 1106 based on data contained in the model record 1104.
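  • The index-based lookup described for FIG. 11 might be sketched in Python as follows. The dictionary-backed repository, the `ModelRecord` fields, the index value 4, and the location string are hypothetical; they only illustrate a record that holds either the model itself or a reference to where the model data lives.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence

Model = Callable[[Sequence[float], int], List[float]]


@dataclass
class ModelRecord:
    """A record either embeds the model or points to where it is stored."""
    model: Optional[Model] = None
    location: Optional[str] = None


def naive_model(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical predictive model: repeat the last observation."""
    return [float(history[-1])] * horizon


# Hypothetical storage for model data, keyed by location.
MODEL_STORE: Dict[str, Model] = {"models/naive": naive_model}

# Hypothetical repository of model records, keyed by index value.
REPOSITORY: Dict[int, ModelRecord] = {4: ModelRecord(location="models/naive")}


def resolve_model_forecast_node(
    record_index: int, history: Sequence[float], horizon: int
) -> List[float]:
    """Locate the model record by index, load the model, generate the node forecast."""
    record = REPOSITORY[record_index]
    if record.model is not None:
        model = record.model
    elif record.location is not None:
        model = MODEL_STORE[record.location]
    else:
        raise ValueError("model record has neither model data nor a location")
    return model(history, horizon)


node_forecast = resolve_model_forecast_node(4, [5.0, 7.0, 9.0], horizon=2)
```

  • Because the graph stores only an index, the same predictive model can be shared by several model forecast nodes without duplicating its data in the graph.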
  • FIG. 12 is a diagram depicting relationships among a forecast model selection graph data structure, a models data structure, and a combined forecast engine.
  • a forecast model selection graph data structure 1202 and a models data structure 1204 may be stored on one or more computer-readable storage mediums for access by an application program, such as a combined forecast engine 1206 being executed on one or more data processors.
  • the data structures 1202, 1204 may be used as part of a process for evaluating a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes.
  • Physical process data generated over time (e.g., time series data) may be used in the forecasts for the one or more physical process attributes.
  • the forecast model selection graph data structure 1202 may contain data about a hierarchical structure of nodes which specify how forecasts for the one or more physical attributes are combined, where the hierarchical structure of nodes has a root node, and where the nodes include one or more selection nodes 1208 , one or more model combination nodes 1210 , and model forecast nodes 1212 .
  • the forecast model selection graph data structure 1202 may include selection node data 1208 that specifies, for the one or more model selection nodes, model selection criteria for selecting, based upon model forecasting performance, models associated with the model forecast nodes or the one or more model combination nodes.
  • the forecast model selection graph data structure 1202 may also include model combination node data 1210 that specifies, for the one or more model combination nodes, which of the forecasts generated by the model forecast nodes are to be combined.
  • the forecast model selection graph data structure 1202 may also include model forecast node data 1212 that specifies, for the model forecast nodes, which particular predictive data models contained in the models data structure are to be used for generating forecasts.
  • the model forecast node data 1212 may link which stored data model is associated with a specific model forecast node, such as via an index 1214 .
  • the stored data model 1216 identified by the model forecast node data 1212 may be accessed as part of a resolving process to generate a node forecast for a particular node of the model forecast selection graph.
  • the combined forecast engine 1206 may process the forecast model selection graph data structure 1202 , using stored data models 1216 identified by the models data structure 1204 via the link between the model forecast node data 1212 and the models data structure 1204 to generate a combined forecast 1218 .
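  • The resolving process described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the engine of the disclosure: the `Node` class, the `resolve` function, the use of callables as stored models, and the scoring of a selection node by mean absolute error against a small holdout are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Forecast = List[float]


@dataclass
class Node:
    kind: str                                # "model", "combine", or "select"
    model_index: Optional[int] = None        # link into the models data structure
    children: List["Node"] = field(default_factory=list)
    weights: Optional[List[float]] = None    # combine nodes; None means simple average


def resolve(node: Node,
            models: Dict[int, Callable[[], Forecast]],
            score: Callable[[Forecast], float]) -> Forecast:
    """Resolve a node into a node forecast, resolving its children first."""
    if node.kind == "model":
        # Model forecast node: look up the stored model via its index.
        return models[node.model_index]()
    child_forecasts = [resolve(child, models, score) for child in node.children]
    if node.kind == "combine":
        k = len(child_forecasts)
        w = node.weights or [1.0 / k] * k
        horizon = len(child_forecasts[0])
        return [sum(wi * f[t] for wi, f in zip(w, child_forecasts))
                for t in range(horizon)]
    if node.kind == "select":
        # Lower score is better (for example, an error statistic).
        return min(child_forecasts, key=score)
    raise ValueError(f"unknown node kind: {node.kind}")


# Toy usage: two stored models, a combination of both, and a root selection node.
models = {0: lambda: [10.0, 10.0], 1: lambda: [14.0, 12.0]}
holdout = [12.0, 12.0]  # hypothetical holdout values used only for scoring
mae = lambda f: sum(abs(a - x) for a, x in zip(holdout, f)) / len(holdout)

mf0, mf1 = Node("model", model_index=0), Node("model", model_index=1)
root = Node("select", children=[mf0, mf1, Node("combine", children=[mf0, mf1])])
combined_forecast = resolve(root, models, mae)
```

  • In this toy graph the equally weighted combination has the lowest holdout error, so the root selection node returns the combined forecast rather than either individual model forecast.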
  • FIG. 13 depicts an example forecast model selection graph data structure.
  • the forecast model selection graph data structure 1302 is a data structure that includes a number of node records 1304 as sub-data structures.
  • the node records 1304 may each be descriptive of a model forecast node, a combination node, or a selection node.
  • Each of the node records 1304 includes data.
  • FIG. 14 depicts an example node record.
  • the node record 1402 may contain data related to the type of a node 1404 and data for the node to be processed, such as an identification of a model to generate a node forecast 1406 or a selection criteria for selecting among child nodes.
  • a node record 1402 may include structure data that identifies, in whole or in part, a position of a node in the forecast model selection graph.
  • the node record data may contain data identifying child nodes 1408 of a node and a parent node 1410 of the node.
  • the node record 1402 may also identify a node as a root or a leaf node or the exact position of a node in the forecast model selection graph hierarchy (e.g., a pre-order or a post-order value).
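  • One possible layout for such a node record is sketched below in Python. The field names (`node_type`, `payload`, `parent`, `children`, `pre_order`) and the helper `assign_pre_order` are assumptions made for illustration; the disclosure does not prescribe a concrete layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NodeRecord:
    node_id: int
    node_type: str                       # "model_forecast", "combination", or "selection"
    payload: Optional[str] = None        # model identification or selection criterion
    parent: Optional[int] = None         # None marks the root node
    children: List[int] = field(default_factory=list)
    pre_order: Optional[int] = None      # position in a pre-order traversal


def is_root(record: NodeRecord) -> bool:
    return record.parent is None


def is_leaf(record: NodeRecord) -> bool:
    return not record.children


def assign_pre_order(records: Dict[int, NodeRecord], root_id: int) -> None:
    """Store each node's pre-order value using an explicit stack."""
    counter = 0
    stack = [root_id]
    while stack:
        node_id = stack.pop()
        records[node_id].pre_order = counter
        counter += 1
        # Push children in reverse so the first child is visited first.
        stack.extend(reversed(records[node_id].children))


records = {
    0: NodeRecord(0, "selection", payload="MAPE", children=[2, 1]),
    2: NodeRecord(2, "combination", parent=0, children=[3, 4]),
    3: NodeRecord(3, "model_forecast", payload="ARIMA", parent=2),
    4: NodeRecord(4, "model_forecast", payload="ESM", parent=2),
    1: NodeRecord(1, "model_forecast", payload="UCM", parent=0),
}
assign_pre_order(records, root_id=0)
```

  • Storing both parent and child links is redundant but lets a record be located and classified (root, leaf, or interior) without traversing the whole graph.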
  • FIGS. 15-32 depict graphical user interfaces that may be used in generating and comparing combined forecasts.
  • FIG. 15 depicts an example graphical user interface for identifying parameters related to time, where a user may specify parameters such as a time interval, a multiplier value, a shift value, a seasonal cycle length, and a date format.
  • FIG. 16 depicts an example forecasting settings graphical user interface for identifying parameters related to data preparation, where a user may specify how to prepare data for forecasting.
  • Example settings include how to interpret embedded missing values, which leading or trailing missing values to remove, which leading or trailing zero values to interpret as missing, and whether to ignore data points earlier than a specified date.
  • FIG. 17 depicts an example forecasting settings graphical user interface for identifying diagnostics settings.
  • Example settings include intermittency test settings, seasonality test settings, independent variable diagnostic settings, and outlier detection settings.
  • Such diagnostic settings may be used in a variety of contexts, including processing of combination nodes of a forecast model selection graph.
  • FIG. 18 depicts an example forecasting settings graphical user interface for identifying model generation settings.
  • Example settings include identifications of which models to fit to each time series.
  • Example models include system-generated ARIMA models, system-generated exponential smoothing models, system-generated unobserved components models, and models from an external list.
  • Such model generation settings may be used in a variety of contexts, including with model forecast nodes of a forecast model selection graph.
  • FIG. 19 depicts an example forecasting settings graphical user interface for identifying model selection settings.
  • Example settings include whether to use a holdout sample in performing model selection and a selection criteria for selecting a forecast.
  • Such model selection settings may be used in a variety of contexts, including with model selection nodes of a forecast model selection graph.
  • FIG. 20 depicts an example forecasting settings graphical user interface for identifying model forecast settings.
  • Example settings include a forecast horizon, calculation of statistics of fit settings, confidence limit settings, negative forecast settings, and component series data set settings.
  • FIG. 21 depicts an example forecasting settings graphical user interface for identification of hierarchical forecast reconciliation settings. Using the user interface of FIG. 21 , a preference for reconciliation of a forecast hierarchy may be selected along with a method for performing the reconciliation, such as a top-down, bottom-up, or middle-out process.
  • FIG. 22 depicts an example forecasting settings graphical user interface for combined model settings.
  • the combined model settings user interface allows selection of a combine model option.
  • the user interface of FIG. 22 also includes an advanced options control.
  • FIG. 23 depicts an example graphical user interface for specification of advanced combined model settings.
  • the settings of FIG. 23 may be used in a variety of contexts, including in processing of a combination node of a forecast model selection graph.
  • Example settings for advanced combined model settings include a method of combination setting.
  • Example parameters include a RANKWGT setting, where a combined forecast engine analyzes the forecasts to be combined and assigns weights to those forecasts based on the analysis.
  • the RANKWGT option may accept a set of user-defined weights that are substituted for the automatic rank weight settings for each ordinal position in the ranked set.
  • the combined forecast engine analyzes and ranks the forecasts to be combined and then assigns the user-defined weights to the forecasts according to the forecast's ordinal position in the ranking.
  • a user may directly assign weights to the individual forecasts, and as a further option, a mean-average of the individual forecasts may be utilized.
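  • The rank-based weighting described above can be illustrated as follows. This is a sketch only: the function names are hypothetical, ranking by a single error statistic is an assumption, and the automatic scheme shown (weights proportional to n, n-1, ..., 1) is a placeholder because the text does not state how the automatic rank weights are derived.

```python
from typing import Dict, List, Optional


def rank_weights(errors: Dict[str, float],
                 user_weights: Optional[List[float]] = None) -> Dict[str, float]:
    """Rank forecasts by an error statistic (lower is better) and assign
    one weight per ordinal position in the ranking."""
    ranked = sorted(errors, key=errors.get)
    n = len(ranked)
    if user_weights is None:
        # Hypothetical automatic scheme: best rank gets n, worst gets 1.
        raw = [float(n - i) for i in range(n)]
    else:
        if len(user_weights) != n:
            raise ValueError("need one weight per ordinal position")
        raw = list(user_weights)
    total = sum(raw)
    return {name: r / total for name, r in zip(ranked, raw)}


def combine(forecasts: Dict[str, List[float]],
            weights: Dict[str, float]) -> List[float]:
    """Weighted sum of the individual forecasts at each time index."""
    horizon = len(next(iter(forecasts.values())))
    return [sum(weights[m] * forecasts[m][t] for m in forecasts)
            for t in range(horizon)]


errors = {"arima": 4.0, "esm": 2.0, "ucm": 8.0}
forecasts = {"arima": [100.0, 104.0], "esm": [96.0, 100.0], "ucm": [120.0, 80.0]}

# User-defined weights for ordinal positions 1, 2, 3 of the ranking.
weights = rank_weights(errors, user_weights=[0.5, 0.25, 0.25])
combined = combine(forecasts, weights)
```

  • Here the best-ranked forecast ("esm") receives the first user-defined weight regardless of the order in which the forecasts were supplied.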
  • the advanced settings interface also includes an option for directing that a forecast encompassing test be performed. When selected, the combined forecast engine ranks individual forecasts for pairwise encompassing elimination.
  • the advanced setting interface further includes options related to treatment of missing values. For example, a rescale option may be selected for weight methods that incorporate a sum-to-one restriction for combination weights.
  • a further option directs a method of computation of prediction error variance series. This option is an allowance for treating scenarios where the cross-correlation between two forecast error series is localized over segments of time when it is assumed that the error series are not jointly stationary. DIAG may be the default setting, while ESTCORR presumes that the combination forecast error series are jointly stationary and estimates the pairwise cross-correlations over the complete time spans.
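  • The difference between the two settings can be sketched numerically. The sketch assumes that DIAG treats the component forecast errors as uncorrelated (only the diagonal of the error covariance is used) and that ESTCORR adds cross terms from sample correlations computed over the full span; the function names and the variance formula for a weighted sum are illustrative and are not taken from the disclosure.

```python
import math
from typing import List, Optional


def sample_corr(a: List[float], b: List[float]) -> float:
    """Sample cross-correlation of two equal-length error series.
    Assumes neither series is constant (non-zero variance)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def combined_error_variance(weights: List[float],
                            variances: List[float],
                            error_series: Optional[List[List[float]]] = None) -> float:
    """Variance of the weighted sum of forecast errors.
    error_series=None: diagonal form (errors treated as uncorrelated).
    error_series given: cross terms use correlations estimated over the full span."""
    k = len(weights)
    total = sum(weights[i] ** 2 * variances[i] for i in range(k))
    if error_series is not None:
        for i in range(k):
            for j in range(i + 1, k):
                rho = sample_corr(error_series[i], error_series[j])
                total += (2.0 * weights[i] * weights[j] * rho
                          * math.sqrt(variances[i] * variances[j]))
    return total


w = [0.5, 0.5]
var = [4.0, 4.0]
e1 = [1.0, -1.0, 2.0, -2.0]
e2 = [-1.0, 1.0, -2.0, 2.0]   # perfectly negatively correlated with e1

diag_variance = combined_error_variance(w, var)
estcorr_variance = combined_error_variance(w, var, [e1, e2])
```

  • With perfectly offsetting errors the estimated-correlation form reports a much smaller variance than the diagonal form, which shows why the choice matters for the width of the confidence limits.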
  • FIG. 24 depicts a model view graphical user interface.
  • the user interface is configured to enable graphical analysis of a model residual series plot, residual distribution, time domain analysis (e.g., ACF, PACF, IACF, white noise), and frequency domain analysis (e.g., spectral density, periodogram).
  • the user interface also enables exploration of parameter estimates, statistics of fit (e.g., RMSE, MAPE, AIC), and bias statistics.
  • FIG. 25 depicts example graphs that may be provided by a model view graphical interface.
  • Other options provided by a model view graphical user interface may include options for managing model combinations, such as adding a model for consideration, editing a previously added model, copying a model, and deleting a model (e.g., a previously added combined model).
  • FIG. 26 depicts an example graphical user interface for manually defining a combined model.
  • a manually defined combined model may be utilized with a model forecast node in a forecast model selection graph.
  • the graphical user interface may be configured to receive a selection of one or more models to be combined, weights to be applied to those combined models in generating the combination, as well as other parameters.
  • FIG. 27 depicts the manual entry of ranked weights to be applied to the selected models after they are ranked by a combined forecast engine.
  • FIG. 28 depicts an example interface for comparing models.
  • the example interface may be accessed via a model view interface.
  • the present interface enables comparison of selected model combinations in graphical form.
  • FIG. 29 depicts a table that enables comparison of selected model combinations statistically in text form.
  • FIG. 30 depicts a graphical user interface for performing scenario analysis using model combinations.
  • Using scenario analysis, scenarios can be generated in which an input time series is varied to better understand possible future outcomes and to evaluate a model's sensitivity to different input values.
  • a create new scenario menu may be accessed by selecting a new control in a scenario analysis view.
  • a model is selected for analysis.
  • a scenario is generated, and a graph depicting results of the scenario analysis is displayed, such as the graph of FIG. 32 .
  • forecast accuracy may often be significantly improved by combining forecasts of individual predictive models.
  • Combined forecasts also tend to produce reduced variability compared to the individual forecasts that are components of a combined forecast.
  • the disclosed combination process may automatically generate forecast combinations and vet them against other model and expert forecasts as directed by the forecast model selection graph processing.
  • Combined forecasts allow for better predicting systematic behavior of an underlying data generating process that cannot be captured by a single model forecast alone. Frequently, combinations of forecasts from simple models outperform a forecast from a single, complex model.
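  • A short worked equation illustrates why combining can reduce variability. It assumes two unbiased forecasts with error variances $\sigma_1^2$ and $\sigma_2^2$, error correlation $\rho$, and equal weights; these symbols are introduced here for illustration and do not appear in the disclosure.

```latex
\operatorname{Var}\!\left(\tfrac{1}{2}(e_1 + e_2)\right)
  = \tfrac{1}{4}\left(\sigma_1^2 + \sigma_2^2 + 2\rho\,\sigma_1\sigma_2\right),
\qquad
\sigma_1 = \sigma_2 = \sigma \;\Rightarrow\;
\operatorname{Var}\!\left(\tfrac{1}{2}(e_1 + e_2)\right)
  = \tfrac{1}{2}\,\sigma^2\,(1 + \rho) \le \sigma^2 .
```

  • Under these assumptions the combined error variance is never larger than that of either component, and it is strictly smaller whenever the errors are not perfectly positively correlated ($\rho < 1$).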
  • FIGS. 33A, 33B, and 33C depict example systems for use in implementing an enterprise data management system.
  • FIG. 33A depicts an exemplary system 3300 that includes a standalone computer architecture where a processing system 3302 (e.g., one or more computer processors) includes a combined forecast engine 3304 being executed on it.
  • the processing system 3302 has access to a computer-readable memory 3306 in addition to one or more data stores 3308 .
  • the one or more data stores 3308 may include models 3310 as well as model forecasts 3312 .
  • FIG. 33B depicts a system 3320 that includes a client server architecture.
  • One or more user PCs 3322 access one or more servers 3324 running a combined forecast engine 3326 on a processing system 3327 via one or more networks 3328 .
  • the one or more servers 3324 may access a computer readable memory 3330 as well as one or more data stores 3332 .
  • the one or more data stores 3332 may contain models 3334 as well as model forecasts 3336 .
  • FIG. 33C shows a block diagram of exemplary hardware for a standalone computer architecture 3350 , such as the architecture depicted in FIG. 33A that may be used to contain and/or implement the program instructions of system embodiments of the present invention.
  • a bus 3352 may serve as the information highway interconnecting the other illustrated components of the hardware.
  • a processing system 3354 , labeled CPU (central processing unit), may include one or more computer processors.
  • a processor-readable storage medium such as read only memory (ROM) 3356 and random access memory (RAM) 3358 , may be in communication with the processing system 3354 and may contain one or more programming instructions for performing the method of implementing a combined forecast engine.
  • program instructions may be stored on a computer readable storage medium such as a magnetic disk, optical disk, recordable memory device, flash memory, or other physical storage medium.
  • Computer instructions may also be communicated via a communications signal, or a modulated carrier wave.
  • a disk controller 3360 interfaces one or more optional disk drives to the system bus 3352 .
  • These disk drives may be external or internal floppy disk drives such as 3362 , external or internal CD-ROM, CD-R, CD-RW or DVD drives such as 3364 , or external or internal hard drives 3366 .
  • these various disk drives and disk controllers are optional devices.
  • Each of the element managers, real-time data buffer, conveyors, file input processor, database index shared access memory loader, reference data buffer and data managers may include a software application stored in one or more of the disk drives connected to the disk controller 3360 , the ROM 3356 and/or the RAM 3358 .
  • the processor 3354 may access each component as required.
  • a display interface 3368 may permit information from the bus 3352 to be displayed on a display 3370 in audio, graphic, or alphanumeric format. Communication with external devices may optionally occur using various communication ports 3372 .
  • the hardware may also include data input devices, such as a keyboard 3373 , or other input device 3374 , such as a microphone, remote control, pointer, mouse and/or joystick.
  • FIGS. 34-41 depict example systems and methods for evaluating the performance of a combined forecast model using rolling simulations.
  • Forecast model performance measures that are computed before the fact are commonly referred to as ex-ante model performance measures.
  • Ex-ante model performance measures are similar to ex-post (i.e., after the fact) forecast performance measures; however, ex-post forecast performance measures are used to evaluate the performance of forecasts regardless of the source (e.g., model-based, judgment-based, possible adjustments, etc.).
  • Rolling simulations can be used to measure ex-ante model performance by repeating the analyses over several forecast origins and lead times.
  • FIG. 34 is a block diagram depicting an example system 4000 for evaluating the performance of a combined forecast model using a rolling simulation analysis.
  • the combined forecast model may be generated from a plurality of individual forecast models 4010 using a combined forecast engine 4020 , for example using one or more of the systems and methods described above with reference to FIGS. 1-33 .
  • the plurality of forecast models 4010 may, for example, be generated based on sampled time-series data 4030 using a hierarchical time series forecasting system 4040 , such as the Forecast Studio and Forecast Server software sold by SAS Institute Inc. of Cary, N.C.
  • the example system 4000 includes a rolling simulation engine 4050 that works in combination with the combined forecast engine 4020 to select a model combination 4060 and evaluate its performance over an out-of-sample range 4065 .
  • the rolling simulation engine 4050 and/or the combined forecast engine 4020 may present a user interface to select two or more of the individual forecast models 4010 for use in generating a combined forecast.
  • the resulting ex-post forecasts 4070 may then be displayed on a user interface 4080 along with the actual out-of-sample data 4090 , such that the ex-post forecasts may be visually compared with the actual out-of-sample data over the specified period.
  • the rolling simulation engine 4050 may also calculate one or more performance statistics 4100 based on statistical comparisons of the actual out-of-sample data and the ex-post forecasts over the specified out-of-sample period 4065 .
  • the performance statistics 4100 may, for example, be displayed on a different tab of the user interface 4080 .
  • the performance statistics 4100 may include statistics that provide an indication of the average error between the forecast 4070 and the actual out-of-sample data 4090 , such as mean, mean absolute percentage error (MAPE), mean absolute error (MAE), median absolute deviation (MAD) and/or MAD/Mean ratio calculations.
  • a low variance for the statistics over different out-of-sample ranges may provide an indication that the combined model is forecasting with required accuracy.
  • a high variation in the forecast performance statistics 4100 over an out-of-sample range may indicate that the combined model is not performing at the required accuracy and may therefore result in larger errors when used over a wider horizon.
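  • The statistics named above can be computed as sketched below. The exact definitions used by the rolling simulation engine are not given here, so the sketch makes assumptions: errors are actual minus forecast, MAD is read as the median of the absolute errors, and the MAD/Mean ratio divides that value by the mean of the actual data. The function name `performance_stats` is hypothetical.

```python
from statistics import mean, median
from typing import Dict, List


def performance_stats(actual: List[float], forecast: List[float]) -> Dict[str, float]:
    """Error statistics comparing forecasts with actual out-of-sample data.
    MAPE assumes no actual value is zero."""
    errors = [a - f for a, f in zip(actual, forecast)]
    abs_errors = [abs(e) for e in errors]
    pct_errors = [abs(e) / abs(a) for e, a in zip(errors, actual)]
    mad = median(abs_errors)  # assumed reading of "median absolute deviation"
    return {
        "mean_error": mean(errors),
        "mae": mean(abs_errors),
        "mape": 100.0 * mean(pct_errors),
        "mad": mad,
        "mad_mean_ratio": mad / mean(actual),  # assumed definition
    }


stats = performance_stats([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

  • Computing the same dictionary for several out-of-sample ranges and comparing the values gives a simple way to see whether the statistics vary little (consistent accuracy) or widely (unstable accuracy) across ranges.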
  • the rolling simulation engine 4050 and/or the combined forecast engine 4020 may be further configured to dynamically adjust one or more characteristics of the combined model based on the forecast horizon.
  • the combined model list may be re-run for each (BACK, LEAD) pair such that the set of model candidates included in the combination can change from one pair of (BACK, LEAD) values to the next depending on how the combined model list is defined.
  • certain forecast quality tests such as an encompassing test can be specified.
  • An encompassing test examines forecasts produced by the component models of a combined model to determine whether the forecasts produced by one or more of the component models that make up the combined model are redundant when processing given data.
  • an encompassing test may be performed on the combined model based on the out-of-sample input data to be provided to the combined model for that (BACK, LEAD) pair. Should one or more of the component models be found to be redundant, those component models can be omitted from the combined model for that (BACK, LEAD) pair, with the weightings of the component models of the combined model being automatically adjusted accordingly. Component models can be dropped for other quality reasons as well, such as the inability or ineffectiveness of those component models in handling missing values present in the candidate model forecast's historical period and/or its horizon for a particular (BACK, LEAD) pair.
  • component models may be dropped if the chosen method of weight estimation fails to produce a valid weight estimate for that model's forecast.
  • Component model weights may also adaptively rescale over the span of the (BACK, LEAD) period to account for candidate models with missing values in their forecasts.
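  • The adaptive rescaling of weights can be sketched as follows. It assumes a sum-to-one weighting in which, at each time index, the weights of the component forecasts that are present are renormalized to sum to one; the function name and the use of `None` to mark a missing forecast value are illustrative choices, not details from the disclosure.

```python
from typing import Dict, List, Optional


def combine_with_rescale(forecasts: Dict[str, List[Optional[float]]],
                         weights: Dict[str, float]) -> List[Optional[float]]:
    """Weighted combination that rescales weights over the non-missing
    component forecasts at each time index."""
    horizon = len(next(iter(forecasts.values())))
    combined: List[Optional[float]] = []
    for t in range(horizon):
        available = [m for m in forecasts if forecasts[m][t] is not None]
        weight_sum = sum(weights[m] for m in available)
        if not available or weight_sum == 0:
            combined.append(None)   # nothing usable to combine at this index
            continue
        combined.append(sum(weights[m] / weight_sum * forecasts[m][t]
                            for m in available))
    return combined


component = {"a": [10.0, 20.0, None], "b": [20.0, None, None]}
result = combine_with_rescale(component, {"a": 0.75, "b": 0.25})
```

  • At the second index only model "a" has a value, so its weight is rescaled from 0.75 to 1.0; at the third index every component is missing and the combined value is left missing as well.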
  • FIG. 35 illustrates an example interface 4200 which may be generated to select a model or model combination for the rolling simulation analysis.
  • the interface 4200 includes a model selection region 4210 that lists the fitted individual models that are available for analysis.
  • the interface 4200 may also include a forecast summary region 4220 that displays forecast data and/or other information relating to a selected one of the fitted models listed in the selection region 4210 .
  • the interface 4200 may enable the user to highlight two or more of the individual models from the selection region 4210 and then select a combine models icon 4230 . Selecting the combine models icon 4230 may cause a combine models interface to be displayed, for example in a pop-up window, as illustrated in FIG. 36 .
  • FIG. 36 depicts an example of an interface 4300 which may be generated to define the characteristics of a combined model for the rolling simulation analysis.
  • the interface 4300 includes a model selection region 4310 for selecting the individual models for combination.
  • the model selection region 4310 may list the model names along with one or more characteristics of the model, such as the model type and a model performance statistic (e.g., MAPE).
  • two models, TOP_1 and TOP_2, are selected for combination.
  • the model combination interface 4300 provides a plurality of user-editable fields 4320 for defining the characteristics of the combined model.
  • the user-editable fields 4320 may include a field for defining the model combination method (e.g., average or specify weights), a field for editing any specific weights applied to each individual model, a field to define how the combined model will treat missing values, and/or other fields for defining the characteristics of the combined model.
  • the interface 4300 may further include fields 4330 that may be used to define the percentage of missing forecast values in the combination horizon and the percentage of missing forecast values in the combination estimation region.
  • the interface 4300 may also provide regions for naming the combined model and providing a model description.
  • the model selection interface is updated to list the newly defined combined model 4400 .
  • the combined model 4400 may then be selected for rolling simulation analysis, as shown in FIG. 37 .
  • the rolling simulation analysis may, for example, be executed on the selected combined model 4400 by selecting a rolling simulation icon 4410 on the interface 4200 .
  • FIG. 38 depicts an example rolling simulation interface 4500 that may be displayed upon executing a rolling simulation analysis on a selected combined model.
  • the user may begin the simulation, for example by selecting a Run Simulation icon 4520 .
  • a simulation has been executed after selecting 6 out-of-sample observations.
  • the rolling simulation interface 4500 enables the user to graphically view the predictions of the combined model at various forecast origins and compare the predictions to the actual out-of-sample data values.
  • ex-post forecasts (BACK:1 through BACK:6) are plotted in the display region 4530 for comparison with the actual out-of-sample data values.
  • For the first out-of-sample observation (BACK:1), the combined forecast is generated one period into the past (December 02).
  • For the next out-of-sample observation (BACK:2), the combined forecast is generated two periods into the past (December 02 thru November 02), and so on through the sixth out-of-sample observation (BACK:6).
  • the rolling simulation interface 4500 of FIG. 38 also includes a numerical display region 4540 that displays the forecast values for each out-of-sample observation (BACK:1 thru BACK:6) along with the actual values of the out-of-sample data.
  • the forecasted values in the illustrated example are set forth in the numerical display region 4540 in bold text.
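  • The rolling simulation over BACK values can be sketched in Python. This is an illustration under assumptions: the two component models (a naive last-value model and a historical-mean model), the fixed weights, and the function names are hypothetical, and the sketch recomputes each component forecast from the truncated history at every forecast origin, which the text does not confirm.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[List[float], int], List[float]]


def naive_forecast(history: List[float], horizon: int) -> List[float]:
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def mean_forecast(history: List[float], horizon: int) -> List[float]:
    """Repeat the historical mean."""
    return [sum(history) / len(history)] * horizon


def rolling_simulation(series: List[float], back_range: int,
                       models: List[Model],
                       weights: List[float]) -> Dict[int, List[Tuple[float, float]]]:
    """For BACK = 1..back_range, hold out the last BACK observations,
    forecast them with the combined model, and pair actual with forecast.
    Requires back_range < len(series)."""
    results = {}
    for back in range(1, back_range + 1):
        history, actual = series[:-back], series[-back:]
        component = [model(history, back) for model in models]
        combined = [sum(w * f[lead] for w, f in zip(weights, component))
                    for lead in range(back)]
        results[back] = list(zip(actual, combined))
    return results


series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
simulation = rolling_simulation(series, 2, [naive_forecast, mean_forecast], [0.5, 0.5])
```

  • Each entry of the result corresponds to one plotted simulation: the list for BACK:2, for example, holds the (actual, forecast) pairs for lead times 1 and 2 from a forecast origin two periods before the end of the data.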
  • the rolling simulation engine may calculate an optimal number of out-of-sample observations to automatically populate the interface field 4510 with a back range default value.
  • the default value for the back range field 4510 may, for example, be calculated using the following formula:
  • the rolling simulation interface 4500 may also include one or more fields for use in generating and displaying one or more forecast performance statistics.
  • the simulation statistics fields may, for example, be included in a separate tab 4600 of the rolling simulation interface 4500 , as illustrated in FIG. 39 .
  • the simulation statistics tab 4600 includes a field 4610 for selecting a particular forecast performance statistic for identifying the error between the forecasted values and the actual out-of-sample values, such as a Mean, MAE, MAPE, MAD or MAD/Mean ratio calculation.
  • a graph of the selected performance statistic at each lead time is displayed in a display region 4620 of the interface.
  • the simulation statistics tab 4600 may also include a numerical display region 4630 that displays the performance statistic values for each lead time.
  • the numerical display region 4630 includes values for a plurality of different performance statistics calculated by the rolling simulation engine. In this way, the user may simultaneously view the values for multiple performance statistics for the combined model in the numerical display region 4630 , and select a particular one of the performance statistics for graphical display 4620 .
  • the graph 4620 and numerical display 4630 in the example illustrated in FIG. 39 show the performance error in terms of the lead time of the ex-post forecast.
  • $e^{(m)}_{t-b}(l) = y_{t-b+l} - \hat{y}^{(m)}_{t-b+l}$
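  • Grouping the ex-post errors by lead time, as the statistics tab does, can be sketched as follows. The input layout (a mapping from each BACK value to its ordered list of actual and forecast pairs) and the function name `mae_by_lead` are assumptions for illustration, and mean absolute error stands in for whichever statistic is selected.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def mae_by_lead(simulations: Dict[int, List[Tuple[float, float]]]) -> Dict[int, float]:
    """Mean absolute error at each lead time, pooled over all forecast origins.
    simulations[back] lists (actual, forecast) pairs ordered by lead 1..back."""
    by_lead = defaultdict(list)
    for pairs in simulations.values():
        for lead, (actual, forecast) in enumerate(pairs, start=1):
            by_lead[lead].append(abs(actual - forecast))
    return {lead: sum(errs) / len(errs) for lead, errs in sorted(by_lead.items())}


example = {1: [(20.0, 16.0)], 2: [(18.0, 14.5), (20.0, 14.5)]}
lead_errors = mae_by_lead(example)
```

  • Longer lead times are covered by fewer forecast origins (only BACK values at least as large as the lead contribute), so statistics at the longest leads rest on the fewest observations.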
  • FIG. 40 illustrates the further ability of the rolling simulation engine to generate ex-ante (i.e., forward-looking) forecasts for the combined model.
  • the example interface 4500 includes a user input region 4700 to select the desired number of periods to forecast.
  • When a simulation is executed with both the number of out-of-sample observations 4510 and a number of forecast periods 4700 selected, the rolling simulation engine generates both ex-ante and ex-post forecasts for each of the out-of-sample observations.
  • the forecasts for each out-of-sample observation are plotted along with the actual out-of-sample data in the graphical display region 4530 of the interface 4500 .
  • a vertical line 4730 separates the ex-post forecasts 4720 (to the left of the line) from the ex-ante forecasts 4710 (to the right of the line).
  • actual out-of-sample data values are displayed only for past periods in which data has been recorded, which is why there are no actual out-of-sample values to the right of the vertical line 4730 in the illustrated example.
  • the example interface 4500 also includes numerical values for both the ex-ante and ex-post forecasts in the numerical display region 4540 , along with the actual out-of-sample data values.
  • the ex-ante and ex-post forecasted values in the illustrated example are set forth in the numerical display region 4540 in bold text.
  • FIG. 41 illustrates an example of how the graphical display on the simulation interface 4500 may be modified to compare specific out-of-sample observations.
  • the interface includes user input fields 4800 for selecting either all of the simulations or select simulations for display. If the “select simulations” field is selected, then the interface 4500 only displays the forecasts for out-of-sample observations that are selected in the numerical display region 4540 . In the illustrated example, the Back:2, Back:4 and Back:6 observations have been selected, and therefore only these three forecasts are displayed in the graphical display region 4530 along with the actual out-of-sample values.
  • the methods and systems described herein may be implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem.
  • the software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform the methods and operations described herein.
  • Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.
  • the systems' and methods' data may be stored and implemented in one or more different types of computer-implemented data stores, such as different types of storage devices and programming constructs (e.g., RAM, ROM, Flash memory, flat files, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.).
  • data structures describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
  • a module or processor includes but is not limited to a unit of code that performs a software operation, and can be implemented for example as a subroutine unit of code, or as a software function unit of code, or as an object (as in an object-oriented paradigm), or as an applet, or in a computer script language, or as another type of computer code.
  • the software components and/or functionality may be located on a single computer or distributed across multiple computers depending upon the situation at hand.

Abstract

Systems and methods are provided for evaluating performance of forecasting models. A plurality of forecasting models may be generated using a set of in-sample data. Two or more forecasting models from the plurality of forecasting models may be selected for use in generating a combined forecast. An ex-ante combined forecast may be generated for an out-of-sample period using the selected two or more forecasting models. The ex-ante combined forecast may then be compared with a set of actual out-of-sample data to evaluate performance of the combined forecast.

Description

  • This application is a continuation patent application of patent application Ser. No. 13/440,045, filed on Apr. 5, 2012 and entitled “Computer-Implemented Systems and Methods for Testing Large Scale Automatic Forecast Combinations,” which is a continuation-in-part of U.S. patent application Ser. No. 13/189,131, filed on Jul. 22, 2011; this application also claims priority to U.S. Provisional Application No. 61/594,442, filed on Feb. 3, 2012. The entirety of these priority applications are incorporated herein by reference.
  • TECHNICAL FIELD
  • This document relates generally to computer-implemented forecasting and more particularly to testing a combined forecast that is generated using multiple forecasts.
  • BACKGROUND
  • Forecasting is a process of making statements about events whose actual outcomes typically have not yet been observed. A commonplace example might be estimation for some variable of interest at some specified future date. Forecasting often involves formal statistical methods employing time series, cross-sectional or longitudinal data, or alternatively less formal judgmental methods. Forecasts are often generated by providing a number of input values to a predictive model, where the model outputs a forecast. While a well designed model may give an accurate forecast, a configuration where predictions of multiple models are considered when generating a forecast may provide even stronger forecast results.
  • SUMMARY
  • In accordance with the teachings herein, systems and methods are provided for evaluating a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes, where data for evaluating the physical process is generated over time. In one example, a forecast model selection graph is accessed, the forecast model selection graph comprising a hierarchy of nodes arranged in parent-child relationships. A plurality of model forecast nodes are resolved, where resolving a model forecast node includes generating a node forecast for the one or more physical process attributes. A combination node is processed, where a combination node transforms a plurality of node forecasts at child nodes of the combination node into a combined forecast. A selection node is processed, where a selection node chooses a node forecast from among child nodes of the selection node based on a selection criteria.
  • As another example, a system is provided for evaluating a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes, where data for evaluating the physical process is generated over time. The system may include one or more data processors and a computer-readable medium encoded with instructions for commanding the one or more data processors to execute steps. In the steps, a forecast model selection graph is accessed, the forecast model selection graph comprising a hierarchy of nodes arranged in parent-child relationships. A plurality of model forecast nodes are resolved, where resolving a model forecast node includes generating a node forecast for the one or more physical process attributes. A combination node is processed, where a combination node transforms a plurality of node forecasts at child nodes of the combination node into a combined forecast. A selection node is processed, where a selection node chooses a node forecast from among child nodes of the selection node based on a selection criteria.
  • As a further example, a computer-readable storage medium may be encoded with instructions for commanding one or more data processors to execute a method. In the method, a forecast model selection graph is accessed, the forecast model selection graph comprising a hierarchy of nodes arranged in parent-child relationships. A plurality of model forecast nodes are resolved, where resolving a model forecast node includes generating a node forecast for the one or more physical process attributes. A combination node is processed, where a combination node transforms a plurality of node forecasts at child nodes of the combination node into a combined forecast. A selection node is processed, where a selection node chooses a node forecast from among child nodes of the selection node based on a selection criteria.
  • As an additional example, one or more computer-readable storage mediums may store data structures for access by an application program being executed on one or more data processors for evaluating a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes, where physical process data generated over time is used in the forecasts for the one or more physical process attributes. The data structures may include a predictive models data structure, the predictive models data structure containing predictive data model records for specifying predictive data models and a forecast model selection graph data structure, where the forecast model selection graph data structure contains data about a hierarchical structure of nodes which specify how the forecasts for the one or more physical process attributes are combined, where the hierarchical structure of nodes has a root node wherein the nodes include model forecast nodes, one or more model combination nodes, and one or more model selection nodes. The forecast model selection graph data structure may include model forecast node data which specifies for the model forecast nodes which particular predictive data models contained in the predictive models data structure are to be used for generating forecasts, model combination node data which specifies for the one or more model combination nodes which of the forecasts generated by the model forecast nodes are to be combined, and selection node data which specifies for the one or more model selection nodes model selection criteria for selecting, based upon model forecasting performance, models associated with the model forecast nodes or the one or more model combination nodes.
  • In accordance with the teachings herein, systems and methods are provided for evaluating performance of forecasting models. A plurality of forecasting models may be generated using a set of in-sample data. A selection of two or more forecasting models may be received from the plurality of forecasting models for use in generating a combined forecast. A set of actual out-of-sample data may be received. An ex-ante combined forecast may be generated for an out-of-sample period using the selected two or more forecasting models. The ex-ante combined forecast and the set of actual out-of-sample data may be provided for use in evaluating performance of the combined forecast.
  • BRIEF DESCRIPTION OF THE FIGURES
  • FIG. 1 is a block diagram depicting a computer-implemented combined forecast engine.
  • FIG. 2 is a block diagram depicting the generation of a combined forecast for a forecast variable.
  • FIG. 3 is a block diagram depicting steps that may be performed by a combined forecast engine in generating a combined forecast.
  • FIG. 4 depicts an example forecast model selection graph.
  • FIG. 5 depicts an example forecast model selection graph including selection nodes, combination nodes, and model forecast nodes.
  • FIG. 6 is a block diagram depicting example operations that may be performed by a combined forecast engine in combining one or more forecasts.
  • FIG. 7 is a flow diagram depicting an example redundancy test in the form of an encompassing test.
  • FIG. 8 depicts a forecast model selection graph having a selection node as a root node.
  • FIG. 9 depicts a forecast model selection graph having a combination node as a root node.
  • FIG. 10 depicts an example model repository for storing predictive models.
  • FIG. 11 depicts a link between a forecast model selection graph and a model repository.
  • FIG. 12 is a diagram depicting relationships among a forecast model selection graph data structure, a models data structure, and a combined forecast engine.
  • FIG. 13 depicts an example forecast model selection graph data structure.
  • FIG. 14 depicts an example node record.
  • FIGS. 15-32 depict graphical user interfaces that may be used in generating and comparing combined forecasts.
  • FIGS. 33A, 33B, and 33C depict example systems for use in implementing a combined forecast engine.
  • FIG. 34 is a block diagram depicting an example system for evaluating the performance of a combined forecast model using a rolling simulation analysis.
  • FIGS. 35-41 depict example user interfaces for evaluating the performance of a combined forecast model using a rolling simulation analysis.
  • DETAILED DESCRIPTION
  • FIG. 1 is a block diagram depicting a computer-implemented combined forecast engine. FIG. 1 depicts a computer-implemented combined forecast engine 102 for facilitating the creation of combined forecasts and evaluation of created combined forecasts against individual forecasts as well as other combined forecasts. Forecasts are predictions that are typically generated by a predictive model based on one or more inputs to the predictive model. A combined forecast engine 102 combines predictions made by multiple models, of the same or different type, to generate a single, combined forecast that can incorporate the strengths of the multiple, individual models which comprise the combined forecast.
  • For example, a combined forecast may be generated (e.g., to predict a manufacturing process output, to estimate product sales) by combining individual forecasts from two linear regression models and one autoregressive regression model. The individual forecasts may be combined in a variety of ways, such as by a straight average, via a weighted average, or via another method. To generate a weighted forecast, automated analysis of the individual forecasts may be performed to identify weights to generate an optimum combined forecast that best utilizes the available individual forecasts.
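  • The straight-average and weighted-average combinations described above can be illustrated with a minimal Python sketch. The function name `combine_forecasts`, the sample values, and the weights are hypothetical and chosen only for illustration; the sketch does not attempt the automated weight analysis mentioned above.

```python
from typing import List, Optional, Sequence


def combine_forecasts(forecasts: Sequence[Sequence[float]],
                      weights: Optional[Sequence[float]] = None) -> List[float]:
    """Combine equal-length forecasts period by period.

    With no weights, a straight average is used; otherwise each forecast
    is multiplied by its weight and the products are summed.
    """
    if not forecasts:
        raise ValueError("at least one forecast is required")
    n = len(forecasts)
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("one weight per forecast is required")
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("forecasts must cover the same periods")
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(horizon)]


# Hypothetical forecasts from two regression models and one autoregressive model.
regression_a = [100.0, 110.0]
regression_b = [104.0, 108.0]
autoregressive = [96.0, 112.0]

straight = combine_forecasts([regression_a, regression_b, autoregressive])
weighted = combine_forecasts([regression_a, regression_b, autoregressive],
                             weights=[0.5, 0.25, 0.25])
```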
  • The combined forecast engine 102 provides a platform for users 104 to generate combined forecasts based on individual forecasts generated by individual predictive models 106. A user 104 accesses the combined forecast engine 102, which is hosted on one or more servers 108, via one or more networks 110. The one or more servers 108 are responsive to one or more data stores 112. The one or more data stores 112 may contain a variety of data that includes predictive models 106 and model forecasts 114.
  • FIG. 2 is a block diagram depicting the generation of a combined forecast for a forecast variable (e.g., one or more physical process attributes). The combined forecast engine 202 receives an identification of a forecast variable 204 for which to generate a combined forecast 206. For example, a user may command that the combined forecast engine 202 generate a combined forecast 206 of sales for a particular clothing item. To generate the combined forecast 206, the combined forecast engine 202 may identify a number of individual predictive models. Those individual predictive models may be provided historic data 208 as input, and those individual predictive models provide individual forecasts based on the provided historic data 208. The combined forecast engine 202 performs operations to combine those individual predictions of sales of the particular clothing item to generate the combined forecast of sales for the particular clothing item.
  • FIG. 3 is a block diagram depicting steps that may be performed by a combined forecast engine in generating a combined forecast. The combined forecast engine 302 receives a forecast variable 304 for which to generate a combined forecast as well as historic data 306 to be used as input to individual predictive models whose predictions become components of the combined forecast 308.
  • The combined forecast engine 302 may utilize model selection and model combination operations to generate a combined forecast. For example, the combined forecast engine 302 may evaluate a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes. Data for evaluating the physical process may be generated over time, such as time series data.
  • At 310, the combined forecast engine accesses a forecast model selection graph. A forecast model selection graph incorporates both model selection and model combination into a decision based framework that, when applied to a time series, automatically selects a forecast from an evaluation of independent, individual forecasts generated. The forecast model selection graph can include forecasts from statistical models, external forecasts from outside agents (e.g., expert predictions, other forecasts generated outside of the combined forecast engine 302), or combinations thereof. The forecast model selection graph may be used to generate combined forecasts as well as comparisons among competing generated forecasts to select a best forecast. A forecast model selection graph for a forecast decision process of arbitrary complexity may be created, limited only by external factors such as computational power and machine resource limits.
  • A forecast model selection graph may include a hierarchy of nodes arranged in parent-child relationships including a root node. The hierarchy may include one or more selection nodes, one or more combination nodes, and a plurality of model forecast nodes. Each of the model forecast nodes is associated with a predictive model. The combined forecast engine may resolve the plurality of model forecast nodes, as shown at 312. Resolving a model forecast node includes generating a node forecast for the forecast variable 304 using the predictive model for the model forecast node. For example, a first model forecast node may be associated with a regression model. To resolve the first model forecast node, the combined forecast engine 302 provides the historic data 306 to the regression model, and the regression model generates a node forecast for the model forecast node. A second model forecast node may be associated with a human expert prediction. In such a case, computation by the combined forecast engine 302 may be limited, such as simply accessing the human expert's prediction from storage. A third model forecast node may be associated with a different combined model. To resolve the third model forecast node, the combined forecast engine 302 provides the historic data 306 to the different combined model, and the different combined model generates a node forecast for the model forecast node. Other types of models and forecasts may also be associated with a model forecast node.
  • At 314, the combined forecast engine processes a combination node. In processing a combination node, the combined forecast engine 302 transforms a plurality of node forecasts at child nodes of the combination node into a combined forecast. For example, a combination node having three child nodes would have the node forecasts for those three child nodes combined into a combined forecast for the combination node. Combining node forecasts may be done in a variety of ways, such as via a weighted average. A weighted average may weight each of the three node forecasts equally, or the combined forecast engine 302 may implement more complex logic to identify a weight for each of the three node forecasts. For example, weight types may include a simple average, user-defined weights, rank weights, ranked user-weights, AICC weights, root mean square error weights, restricted least squares weights, OLS weights, and least absolute deviation weights.
  • At 316, the combined forecast engine processes a selection node. In processing a selection node, the combined forecast engine 302 chooses a node forecast from among child nodes of the selection node based on a selection criteria. The selection criteria may take a variety of forms. For example, the selection criteria may dictate selection of a node forecast associated with a node whose associated model performs best in a hold out sample analysis.
  • As another example, metadata may be associated with models associated with node forecasts, where the metadata identifies a model characteristic of a model. The selection criteria may dictate selection of a node forecast whose metadata model characteristic best matches a characteristic of the forecast variable 304. For example, if the forecast variable 304 tends to behave in a seasonal pattern, then the selection criteria may dictate selection of a node forecast that was generated by a model whose metadata identifies it as handling seasonal data. Other example model metadata characteristics include trending model, intermittent model, and transformed model.
  • As a further example, the selection criteria may dictate selection of a node forecast having the least amount of missing data. A node forecast may include forecasts for the forecast variable 304 for a number of time periods in the future (e.g., forecast variable at t+1, forecast variable at t+2, . . . ). In some circumstances, a node forecast may be missing data for certain future time period forecasts (e.g., the node forecast is an expert's prediction, where the expert only makes one prediction at t+6 months). If a certain time period in the future is of specific interest, the selection criteria may dictate that a selected node forecast must not be missing a forecast at the time period of interest (e.g., when the time period of interest is t+1 month, the node forecast including the expert's prediction may not be selected).
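  • A missing-data selection criterion of the kind just described might be sketched as follows. The representation of a missing value as `None`, the function name `select_least_missing`, and the sample forecasts are assumptions made for illustration only.

```python
from typing import Dict, List, Optional

# None marks a future period for which the node forecast has no value.
Forecast = List[Optional[float]]


def select_least_missing(candidates: Dict[str, Forecast],
                         period_of_interest: Optional[int] = None) -> str:
    """Return the name of the candidate with the fewest missing values.

    If period_of_interest is given (0-based index into the horizon),
    candidates missing a value at that period are not eligible.
    """
    eligible = {
        name: values for name, values in candidates.items()
        if period_of_interest is None or values[period_of_interest] is not None
    }
    if not eligible:
        raise ValueError("no candidate covers the period of interest")
    return min(eligible, key=lambda name: sum(v is None for v in eligible[name]))


candidates = {
    # Hypothetical expert prediction made only for t+6 months (index 5).
    "expert": [None, None, None, None, None, 120.0],
    # Hypothetical statistical model that covers t+1 through t+5 months.
    "arima": [101.0, 103.0, 104.0, 107.0, 111.0, None],
}

chosen_overall = select_least_missing(candidates)
chosen_next_month = select_least_missing(candidates, period_of_interest=0)
chosen_six_months = select_least_missing(candidates, period_of_interest=5)
```

  • In this sketch the expert forecast is never chosen when the period of interest is t+1 month, because it has no value there, but it is the only eligible candidate at t+6 months.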
  • As another example, the selection criteria may be based on a statistic of fit. For example, the combined forecast engine 302 may fit models associated with child nodes of a selection node with the historic data 306 and calculate statistics of fit for those models. Based on the determined statistics of fit, the combined forecast engine 302 selects the forecast node associated with the model that is a best fit.
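  • Selection by a statistic of fit can be sketched as below, using root mean square error (RMSE) as the statistic. The choice of RMSE, the node names, and the fitted values are illustrative assumptions; any other statistic of fit could be substituted.

```python
import math
from typing import Dict, Sequence, Tuple


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def select_best_fit(actual: Sequence[float],
                    fitted_by_node: Dict[str, Sequence[float]]
                    ) -> Tuple[str, Dict[str, float]]:
    """Return the child node whose fitted values have the lowest RMSE."""
    scores = {node: rmse(actual, fitted) for node, fitted in fitted_by_node.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical historic data and fitted values for two child nodes.
historic = [10.0, 12.0, 14.0, 16.0]
fitted = {
    "MF1": [11.0, 11.0, 15.0, 15.0],
    "MF2": [10.0, 12.0, 14.0, 20.0],
}
best_node, fit_scores = select_best_fit(historic, fitted)
```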
  • The combined forecast engine 302 may continue resolving model forecast nodes 312 and processing combination and selection nodes 314, 316 until a final combined forecast is generated. For example, the combined forecast engine may work from the leaves up to the root in the forecast model selection graph hierarchy, where the final combined forecast is generated at the root node.
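  • The leaves-to-root processing order can be sketched as a post-order traversal. The `Node` class, the three toy models, the equal-weight combination, and the scoring function used by the selection node are all hypothetical stand-ins, not the engine's actual data structures or criteria.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Series = List[float]


@dataclass
class Node:
    kind: str                                   # "model", "combine", or "select"
    name: str
    children: List["Node"] = field(default_factory=list)
    model: Optional[Callable[[Sequence[float]], Series]] = None  # model nodes
    score: Optional[Callable[[Series], float]] = None            # select nodes; lower is better


def resolve(node: Node, history: Sequence[float]) -> Series:
    """Post-order traversal: children are resolved before their parent."""
    if node.kind == "model":
        return node.model(history)
    child_forecasts = [resolve(child, history) for child in node.children]
    if node.kind == "combine":
        n = len(child_forecasts)
        return [sum(values) / n for values in zip(*child_forecasts)]
    if node.kind == "select":
        return min(child_forecasts, key=node.score)
    raise ValueError(f"unknown node kind: {node.kind}")


def naive(history: Sequence[float]) -> Series:
    return [history[-1]] * 2


def mean_model(history: Sequence[float]) -> Series:
    return [sum(history) / len(history)] * 2


def drift(history: Sequence[float]) -> Series:
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * step for step in (1, 2)]


combination = Node("combine", "C1", children=[
    Node("model", "MF1", model=naive),
    Node("model", "MF2", model=drift),
])
# Hypothetical criterion: distance of the first forecast from a held-out value.
root = Node("select", "S1",
            children=[combination, Node("model", "MF3", model=mean_model)],
            score=lambda forecast: abs(forecast[0] - 15.5))

final_forecast = resolve(root, [10.0, 12.0, 14.0])
```

  • Here the combination node averages the naive and drift forecasts, and the root selection node then prefers that combined forecast over the mean model because it scores lower on the illustrative criterion.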
  • FIG. 4 depicts an example forecast model selection graph. The forecast model selection graph includes a hierarchy of nodes arranged in parent-child relationships that includes a root node 402. The forecast model selection graph also includes two model forecast nodes 404. The model forecast nodes 404 may be associated with a model that can be used to forecast one or more values for a forecast variable. A model associated with a model forecast node 404 may also be a combined model or a forecast generated outside of the combined forecast engine, such as an expert or other human generated forecast. The model forecast nodes 404 are resolved to identify a node forecast (e.g., using an associated model to generate a node forecast, accessing an expert forecast from storage).
  • The forecast model selection graph also includes selection nodes 406. A selection node may include a selection criteria for choosing a node forecast from among child nodes (e.g., model forecast nodes 404) of the selection node 406. Certain of the depicted selection nodes S1, S2, Sn do not have their child nodes depicted in FIG. 4.
  • FIG. 5 depicts an example forecast model selection graph including selection nodes, combination nodes, and model forecast nodes. To generate a combined forecast for the forecast model selection graph 500, model forecast nodes 502 are resolved to generate node forecasts for one or more forecast variables (e.g., physical process attributes). With node forecasts resolved for the model forecast nodes 502, a selection node 504 selects one of the node forecasts associated with the model forecast nodes 502 based on a selection criteria. For example, the selection criteria may dictate a model forecast based on metadata associated with a model used to generate the model forecast at the model forecast node 502.
  • Additional model forecast nodes 506 may be resolved to generate node forecasts at those model forecast nodes 506. A first combination node 508 combines a model forecast associated with model forecast node MF1_1 and the model forecast at the selection node 504 to generate a combined forecast at the combination node 508. A second combination node 510 combines a model forecast associated with model forecast node MF2_1 and the model forecast at the selection node 504 to generate a combined forecast at the combination node 510. Another selection node 512 selects a model forecast from one of the two combination nodes 508, 510 based on a selection criteria as the final combined forecast for the forecast model selection graph 500.
  • A forecast model selection graph may take a variety of forms. For example, the forecast model selection graph may be represented in one or more records in a database or described in a file. In another implementation, the forecast model selection graph may be represented via one or more XML based data structures. The XML data structures may identify the forecast sources to combine, diagnostic tests used in the selection and filtering of forecasts, methods for determining weights to forecasts to be combined, treatment of missing values, and selection of methods for estimating forecast prediction error variance.
  • FIG. 6 is a block diagram depicting example operations that may be performed by a combined forecast engine in combining one or more forecasts (e.g., when processing a combination node). At 602, an initial set of model forecasts is identified. In some implementations, all identified model forecasts may be combined to create a combined forecast. However, in some implementations, it may be desirable to filter the models used in creating a combined forecast. For example, at 604, the set of model forecasts may be reduced based on one or more forecast candidate tests. The forecast candidate tests may take a variety of forms, such as analysis of the types of models used to generate the model forecasts identified at 602 and characteristics of the forecast variable. For example, if the forecast variable is a trending variable, the candidate tests may eliminate model forecasts generated by models that are designed to handle seasonal data.
  • At 606, the set of model forecasts may be reduced based on one or more forecast quality tests. Forecast quality tests may take a variety of forms. For example, forecast quality tests may analyze missing values of model forecasts. Model forecasts may be filtered from the set if they have missing values in an area of interest (e.g., a forecast horizon). In another example, a model forecast may be filtered from the set if it is missing more than a particular percentage of values in the forecast horizon.
  • At 608, the set of model forecasts may be reduced based on redundancy tests. A redundancy test may analyze models associated with model forecast nodes to identify robust models, and those models having a high degree of redundancy (e.g., models that are producing forecasts that are statistically too similar). Model forecasts having a high degree of redundancy may be excluded from the combined model being generated.
  • In addition to generating a combined forecast, certain statistics for a combined forecast may be determined. For example, a prediction error variance estimate may be calculated. The prediction error variance estimate may incorporate pair-wise correlation estimates between the individual forecast prediction errors for the predictions that make up the combined forecast and their associated prediction error variances.
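  • One common way to write such a prediction error variance estimate is shown below as an illustration. The symbols are assumptions introduced here: $w_i$ is the combination weight of forecast $i$, $\sigma_i^2$ is its prediction error variance, $\rho_{ij}$ is the pair-wise correlation between the prediction errors of forecasts $i$ and $j$ (with $\rho_{ii} = 1$), and the weights are taken to sum to one so that the combined error is the weighted sum of the individual errors.

```latex
e_c = \sum_{i=1}^{n} w_i e_i ,
\qquad
\operatorname{Var}(e_c) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i \, w_j \, \rho_{ij} \, \sigma_i \, \sigma_j .
```

  • If the cross-correlations are taken to be zero, the double sum reduces to $\sum_{i} w_i^2 \sigma_i^2$.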
  • FIG. 7 is a flow diagram depicting an example redundancy test in the form of an encompassing test. The set of model forecasts is shown at 702. At 704, each model in the set 702 is analyzed to determine whether the current model forecast is redundant (e.g., whether the information in the current model forecast is already represented in the continuing set of forecasts 706). If the current model forecast is redundant, then it is excluded. If the current model forecast is not redundant, then it remains in the set of forecasts 706.
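  • The loop of FIG. 7 might be sketched as follows. The specific statistical test is not detailed in the description above, so this sketch substitutes a textbook pairwise encompassing regression: the errors of an already retained forecast are regressed, without intercept, on the difference between the candidate and the retained forecast, and the candidate is treated as redundant when the slope is statistically indistinguishable from zero. The function names, the critical value of 2.0, and the sample data are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def encompasses(actual: Sequence[float], kept: Sequence[float],
                candidate: Sequence[float], t_critical: float = 2.0) -> bool:
    """Return True if `kept` appears to encompass `candidate`.

    Regresses the kept forecast's errors on (candidate - kept) with no
    intercept; a small t statistic suggests the candidate adds no information.
    """
    errors = [a - k for a, k in zip(actual, kept)]
    diffs = [c - k for c, k in zip(candidate, kept)]
    sum_dd = sum(d * d for d in diffs)
    if sum_dd == 0.0:
        return True  # identical forecasts are trivially redundant
    slope = sum(e * d for e, d in zip(errors, diffs)) / sum_dd
    residual_ss = sum((e - slope * d) ** 2 for e, d in zip(errors, diffs))
    std_error = math.sqrt(residual_ss / (len(actual) - 1) / sum_dd)
    if std_error == 0.0:
        return slope == 0.0
    return abs(slope / std_error) < t_critical


def drop_redundant(actual: Sequence[float],
                   forecasts: Dict[str, Sequence[float]],
                   t_critical: float = 2.0) -> List[str]:
    """Visit each forecast in order and keep it unless a kept forecast encompasses it."""
    kept: List[str] = []
    for name, values in forecasts.items():
        if any(encompasses(actual, forecasts[k], values, t_critical) for k in kept):
            continue
        kept.append(name)
    return kept


# Hypothetical in-sample actuals and three model forecasts.
actual = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
forecasts = {
    "model_a": [9.0, 13.0, 10.0, 14.0, 11.0, 15.0],
    "model_b": [10.1, 11.9, 11.2, 12.9, 12.1, 13.8],  # carries new information
    "model_c": [9.1, 13.1, 9.9, 13.9, 11.0, 15.0],    # near copy of model_a
}
remaining = drop_redundant(actual, forecasts)
```

  • The outcome of such a sequential pass depends on the order in which forecasts are visited, which is one reason an engine might rank the forecasts before applying the test.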
  • With reference back to FIG. 6, at 610, weights are assigned to the model forecasts remaining in the set. Weights may be assigned using a number of different algorithms. For example, weights may be assigned as a straight average of the set of remaining model forecasts, or more complex processes may be implemented, such as a least absolute deviation procedure. At 612, the weighted model forecasts are aggregated to generate a combined forecast.
  • FIG. 8 depicts a forecast model selection graph having a selection node as a root node. A number of node forecasts 802 are resolved (e.g., by generating node forecasts using a model, accessing externally generated forecasts from computer memory). A combination node 804 combines the model forecasts of child nodes 806 of the combination node 804. A selection node 808 selects a forecast from among the combination node 804 and model forecasts at child nodes 810 of the selection node 808 based on a selection criteria.
  • FIG. 9 depicts a forecast model selection graph having a combination node as a root node. A number of node forecasts 902 are resolved (e.g., by generating node forecasts using a model, accessing externally generated forecasts from memory). A selection node 904 selects a model forecast from the child nodes 906 of the selection node. A combination node 908 combines the model forecast from the selection node 904 and model forecasts at child nodes 910 of the combination node 908 to generate a combined forecast.
  • As noted previously, a model forecast node may be associated with a predictive model that is used to generate a model forecast for the model forecast node. In one embodiment, the predictive models may be stored in a model repository for convenient access. FIG. 10 depicts an example model repository for storing predictive models. The model repository 1002 includes a number of model records 1004. A model record may contain model data for implementing a predictive model 1006. In another embodiment, a model record 1004 may contain a reference to where data for implementing the predictive model 1006 can be found (e.g., a file location, a pointer to a memory location, a reference to a record in a database). Other example details of a model repository are described in U.S. Pat. No. 7,809,729, entitled “Model Repository,” the entirety of which is herein incorporated by reference.
  • A model repository configuration may streamline the data contained in a forecast model selection graph. FIG. 11 depicts a link between a forecast model selection graph and a model repository. A forecast model selection graph 1102 includes a number of model forecast nodes MF1, MF2, MF3, MF4, a selection node S1, and a combination node C1. The model forecast nodes are resolved to generate node forecasts. One of the model forecast nodes, MF4, is associated with a model record 1104. For example, model forecast node, MF4, may contain an index value for the model record 1104. The model record is stored in the model repository 1106 and may contain data for implementing a predictive model to generate the node forecast, or the model record may contain a reference to the location of such data 1108, such as a location in the model repository 1106. When the model forecast node, MF4, is to be resolved, the model record 1104 is located based on the index identified by the model forecast node, MF4. Data for the desired predictive model 1108 to be used to generate the node forecast is located in the model repository 1106 based on data contained in the model record 1104.
  • FIG. 12 is a diagram depicting relationships among a forecast model selection graph data structure, a models data structure, and a combined forecast engine. A forecast model selection graph data structure 1202 and a models data structure 1204 may be stored on one or more computer-readable storage mediums for access by an application program, such as a combined forecast engine 1206 being executed on one or more data processors. The data structures 1202, 1204 may be used as part of a process for evaluating a physical process with respect to one or more attributes of the physical process by combining forecasts for the one or more physical process attributes. Physical process data generated over time (e.g., time series data) may be used in the forecasts for the one or more physical attributes.
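  • The index-based lookup described for FIG. 11 can be sketched with plain dictionaries. The `ModelRecord` class, the location string, the index value 4, and the toy model are hypothetical; an actual repository could equally hold file paths or database keys.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class ModelRecord:
    index: int
    location: str  # reference to where the data for the predictive model is stored


def last_value_model(history: Sequence[float]) -> List[float]:
    """Toy predictive model: repeat the most recent observation once."""
    return [history[-1]]


# Hypothetical repository contents: model data keyed by location,
# and model records keyed by the index a model forecast node carries.
model_data: Dict[str, Callable[[Sequence[float]], List[float]]] = {
    "repository/models/last_value": last_value_model,
}
model_records: Dict[int, ModelRecord] = {
    4: ModelRecord(index=4, location="repository/models/last_value"),
}


def resolve_model_forecast_node(model_index: int,
                                history: Sequence[float]) -> List[float]:
    """Locate the model record by index, then the model data by location."""
    record = model_records[model_index]
    model = model_data[record.location]
    return model(history)


node_forecast = resolve_model_forecast_node(4, [98.0, 101.0, 105.0])
```

  • Keeping only an index in the node means the graph itself stays small, while the model definitions can be shared among several graphs.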
  • The forecast model selection graph data structure 1202 may contain data about a hierarchical structure of nodes which specify how forecasts for the one or more physical attributes are combined, where the hierarchical structure of nodes has a root node, and where the nodes include one or more selection nodes 1208, one or more model combination nodes 1210, and model forecast nodes 1212. The forecast model selection graph data structure 1202 may include selection node data 1208 that specifies, for the one or more model selection nodes, model selection criteria for selecting, based upon model forecasting performance, models associated with the model forecast nodes or the one or more model combination nodes. The forecast model selection graph data structure 1202 may also include model combination node data 1210 that specifies, for the one or more model combination nodes, which of the forecasts generated by the model forecast nodes are to be combined.
  • The forecast model selection graph data structure 1202 may also include model forecast node data 1212 that specifies, for the model forecast nodes, which particular predictive data models contained in the models data structure are to be used for generating forecasts. For example, the model forecast node data 1212 may link which stored data model is associated with a specific model forecast node, such as via an index 1214. The stored data model 1216 identified by the model forecast node data 1212 may be accessed as part of a resolving process to generate a node forecast for a particular node of the model forecast selection graph. The combined forecast engine 1206 may process the forecast model selection graph data structure 1202, using stored data models 1216 identified by the models data structure 1204 via the link between the model forecast node data 1212 and the models data structure 1204 to generate a combined forecast 1218.
  • FIG. 13 depicts an example forecast model selection graph data structure. In FIG. 13, the forecast model selection graph data structure 1302 is a data structure that includes a number of node records 1304 as sub-data structures. The node records 1304 may each be descriptive of a model forecast node, a combination node, or a selection node. Each of the node records 1304 includes data.
  • FIG. 14 depicts an example node record. For example, the node record 1402 may contain data related to the type of a node 1404 and data for the node to be processed, such as an identification of a model to generate a node forecast 1406 or a selection criteria for selecting among child nodes. Additionally, a node record 1402 may include structure data that identifies, in whole or in part, a position of a node in the forecast model selection graph. For example, the node record data may contain data identifying child nodes 1408 of a node and a parent node 1410 of the node. The node record 1402 may also identify a node as a root or a leaf node or the exact position of a node in the forecast model selection graph hierarchy (e.g., a pre-order or a post-order value).
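  • A node record of this kind might be laid out as in the following sketch. The field names, the node identifiers, and the small example hierarchy are assumptions for illustration; the helper assigns pre-order positions by walking the child links from the root.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NodeRecord:
    node_id: str
    node_type: str                              # "model_forecast", "combination", or "selection"
    parent: Optional[str] = None
    children: List[str] = field(default_factory=list)
    model_index: Optional[int] = None           # model forecast nodes only
    selection_criterion: Optional[str] = None   # selection nodes only
    preorder: Optional[int] = None

    @property
    def is_root(self) -> bool:
        return self.parent is None

    @property
    def is_leaf(self) -> bool:
        return not self.children


def assign_preorder(records: Dict[str, NodeRecord], root_id: str) -> None:
    """Number the nodes in pre-order (parent first, then children left to right)."""
    counter = 0
    stack = [root_id]
    while stack:
        node_id = stack.pop()
        records[node_id].preorder = counter
        counter += 1
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(records[node_id].children))


# Hypothetical hierarchy: selection root S1 over combination C1 and leaf MF3.
records = {
    "S1": NodeRecord("S1", "selection", children=["C1", "MF3"],
                     selection_criterion="RMSE"),
    "C1": NodeRecord("C1", "combination", parent="S1", children=["MF1", "MF2"]),
    "MF1": NodeRecord("MF1", "model_forecast", parent="C1", model_index=1),
    "MF2": NodeRecord("MF2", "model_forecast", parent="C1", model_index=2),
    "MF3": NodeRecord("MF3", "model_forecast", parent="S1", model_index=3),
}
assign_preorder(records, "S1")
```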
  • FIGS. 15-32 depict graphical user interfaces that may be used in generating and comparing combined forecasts. FIG. 15 depicts an example graphical user interface for identifying parameters related to time, where a user may specify parameters such as a time interval, a multiplier value, a shift value, a seasonal cycle length, and a date format.
  • FIG. 16 depicts an example forecasting settings graphical user interface for identifying parameters related to data preparation, where a user may specify how to prepare data for forecasting. Example settings include how to interpret embedded missing values, which leading or trailing missing values to remove, which leading or trailing zero values to interpret as missing, and whether to ignore data points earlier than a specified date.
  • FIG. 17 depicts an example forecasting settings graphical user interface for identifying diagnostics settings. Example settings include intermittency test settings, seasonality test settings, independent variable diagnostic settings, and outlier detection settings. Such diagnostic settings may be used in a variety of contexts, including processing of combination nodes of a forecast model selection graph.
  • FIG. 18 depicts an example forecasting settings graphical user interface for identifying model generation settings. Example settings include identifications of which models to fit to each time series. Example models include system-generated ARIMA models, system-generated exponential smoothing models, system-generated unobserved components models, and models from an external list. Such model generation settings may be used in a variety of contexts, including with model forecast nodes of a forecast model selection graph.
  • FIG. 19 depicts an example forecasting settings graphical user interface for identifying model selection settings. Example settings include whether to use a holdout sample in performing model selection and a selection criteria for selecting a forecast. Such model selection settings may be used in a variety of contexts, including with model selection nodes of a forecast model selection graph.
  • FIG. 20 depicts an example forecasting settings graphical user interface for identifying model forecast settings. Example settings include a forecast horizon, calculation of statistics of fit settings, confidence limit settings, negative forecast settings, and component series data set settings.
  • FIG. 21 depicts an example forecasting settings graphical user interface for identification of hierarchical forecast reconciliation settings. Using the user interface of FIG. 21, a preference for reconciliation of a forecast hierarchy may be selected along with a method for performing the reconciliation, such as a top-down, bottom-up, or middle-out process.
  • FIG. 22 depicts an example forecasting settings graphical user interface for combined model settings. The combined model settings user interface allows selection of a combine model option. The user interface of FIG. 22 also includes an advanced options control. FIG. 23 depicts an example graphical user interface for specification of advanced combined model settings. The settings of FIG. 23 may be used in a variety of contexts, including in processing of a combination node of a forecast model selection graph.
  • Example settings for advanced combined model settings include a method of combination setting. Example parameters include a RANKWGT setting, where a combined forecast engine analyzes the forecasts to be combined and assigns weights to those forecasts based on the analysis. In another example, the RANKWGT option may accept a set of user-defined weights that are substituted for the automatic rank weight settings for each ordinal position in the ranked set. The combined forecast engine analyzes and ranks the forecasts to be combined and then assigns the user-defined weights to the forecasts according to the forecast's ordinal position in the ranking. As another option, a user may directly assign weights to the individual forecasts, and as a further option, a mean-average of the individual forecasts may be utilized.
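  • The user-defined variant of rank weighting described above, in which weights are attached to ordinal positions rather than to particular forecasts, can be sketched as follows. The ranking criterion (a lower error statistic ranks higher), the function name, and the sample values are assumptions; the rule used for automatic rank weights is not specified above and is not reproduced here.

```python
from typing import Dict, Sequence


def assign_rank_weights(error_by_forecast: Dict[str, float],
                        position_weights: Sequence[float]) -> Dict[str, float]:
    """Rank forecasts by error (best first) and hand out weights by position."""
    if len(position_weights) != len(error_by_forecast):
        raise ValueError("one weight per ordinal position is required")
    ranked = sorted(error_by_forecast, key=error_by_forecast.get)
    return {name: weight for name, weight in zip(ranked, position_weights)}


# Hypothetical error statistics for three forecasts and user-defined
# weights for first, second, and third place.
errors = {"arima": 2.5, "esm": 1.8, "ucm": 3.1}
weights = assign_rank_weights(errors, [0.5, 0.3, 0.2])
```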
  • The advanced settings interface also includes an option for directing that a forecast encompassing test be performed. When selected, the combined forecast engine ranks individual forecasts for pairwise encompassing elimination. The advanced settings interface further includes options related to treatment of missing values. For example, a rescale option may be selected for weight methods that incorporate a sum-to-one restriction for combination weights. A further option directs the method used to compute the prediction error variance series. This option accommodates scenarios in which the cross-correlation between two forecast error series is localized over segments of time, as when the error series are assumed not to be jointly stationary. DIAG may be the default setting, while ESTCORR presumes that the combination forecast error series are jointly stationary and estimates the pairwise cross-correlations over the complete time spans.
  • FIG. 24 depicts a model view graphical user interface. Using the model view, a user can evaluate combined model residuals. The user interface is configured to enable graphical analysis of a model residual series plot, residual distribution, time domain analysis (e.g., ACF, PACF, IACF, white noise), and frequency domain analysis (e.g., spectral density, periodogram). The user interface also enables exploration of parameter estimates, statistics of fit (e.g., RMSE, MAPE, AIC), and bias statistics. FIG. 25 depicts example graphs that may be provided by a model view graphical interface. Other options provided by a model view graphical user interface may include options for managing model combinations, such as adding a model for consideration, editing a previously added model, copying a model, and deleting a model (e.g., a previously added combined model).
  • FIG. 26 depicts an example graphical user interface for manually defining a combined model. For example, a manually defined combined model may be utilized with a model forecast node in a forecast model selection graph. The graphical user interface may be configured to receive a selection of one or more models to be combined, weights to be applied to those combined models in generating the combination, as well as other parameters. For example, FIG. 27 depicts the manual entry of ranked weights to be applied to the selected models after they are ranked by a combined forecast engine.
  • FIG. 28 depicts an example interface for comparing models. The example interface may be accessed via a model view interface. The present interface enables comparison of selected model combinations in graphical form. FIG. 29 depicts a table that enables comparison of selected model combinations statistically in text form.
  • FIG. 30 depicts a graphical user interface for performing scenario analysis using model combinations. Using scenario analysis, scenarios can be generated, where an input time series can be varied to better understand possible future outcomes and to evaluate a model's sensitivity to different input values. A create new scenario menu may be accessed by selecting a new control in a scenario analysis view. Using the create new scenario menu, shown in further detail in FIG. 31, a model is selected for analysis. A scenario is generated, and a graph depicting results of the scenario analysis is displayed, such as the graph of FIG. 32.
  • The systems and methods described herein may, in some implementations, be utilized to achieve one or more of the following benefits. For example, forecast accuracy may often be significantly improved by combining forecasts of individual predictive models. Combined forecasts also tend to produce reduced variability compared to the individual forecasts that are components of a combined forecast. The disclosed combination process may automatically generate forecast combinations and vet them against other model and expert forecasts as directed by the forecast model selection graph processing. Combined forecasts allow for better predicting systematic behavior of an underlying data generating process that cannot be captured by a single model forecast alone. Frequently, combinations of forecasts from simple models outperform a forecast from a single, complex model.
  • FIGS. 33A, 33B, and 33C depict example systems for use in implementing an enterprise data management system. For example, FIG. 33A depicts an exemplary system 3300 that includes a standalone computer architecture where a processing system 3302 (e.g., one or more computer processors) includes a combined forecast engine 3304 being executed on it. The processing system 3302 has access to a computer-readable memory 3306 in addition to one or more data stores 3308. The one or more data stores 3308 may include models 3310 as well as model forecasts 3312.
  • FIG. 33B depicts a system 3320 that includes a client server architecture. One or more user PCs 3322 access one or more servers 3324 running a combined forecast engine 3326 on a processing system 3327 via one or more networks 3328. The one or more servers 3324 may access a computer readable memory 3330 as well as one or more data stores 3332. The one or more data stores 3332 may contain models 3334 as well as model forecasts 3336.
  • FIG. 33C shows a block diagram of exemplary hardware for a standalone computer architecture 3350, such as the architecture depicted in FIG. 33A that may be used to contain and/or implement the program instructions of system embodiments of the present invention. A bus 3352 may serve as the information highway interconnecting the other illustrated components of the hardware. A processing system 3354 labeled CPU (central processing unit) (e.g., one or more computer processors), may perform calculations and logic operations required to execute a program. A processor-readable storage medium, such as read only memory (ROM) 3356 and random access memory (RAM) 3358, may be in communication with the processing system 3354 and may contain one or more programming instructions for performing the method of implementing a combined forecast engine. Optionally, program instructions may be stored on a computer readable storage medium such as a magnetic disk, optical disk, recordable memory device, flash memory, or other physical storage medium. Computer instructions may also be communicated via a communications signal, or a modulated carrier wave.
  • A disk controller 3360 interfaces one or more optional disk drives to the system bus 3352. These disk drives may be external or internal floppy disk drives such as 3362, external or internal CD-ROM, CD-R, CD-RW or DVD drives such as 3364, or external or internal hard drives 3366. As indicated previously, these various disk drives and disk controllers are optional devices.
  • Each of the element managers, real-time data buffer, conveyors, file input processor, database index shared access memory loader, reference data buffer and data managers may include a software application stored in one or more of the disk drives connected to the disk controller 3360, the ROM 3356 and/or the RAM 3358. Preferably, the processor 3354 may access each component as required.
  • A display interface 3368 may permit information from the bus 3352 to be displayed on a display 3370 in audio, graphic, or alphanumeric format. Communication with external devices may optionally occur using various communication ports 3372.
  • In addition to the standard computer-type components, the hardware may also include data input devices, such as a keyboard 3373, or other input device 3374, such as a microphone, remote control, pointer, mouse and/or joystick.
  • FIGS. 34-41 depict example systems and methods for evaluating the performance of a combined forecast model using rolling simulations. Before using a time series model to forecast a time series, it is often important to evaluate the model's performance by validating the model's ability to forecast previously acquired data. Such performance measures are commonly referred to as ex-ante model performance measures. Ex-ante (i.e., before the fact) model performance measures are similar to ex-post (i.e., after the fact) forecast performance measures; however, ex-post forecast performance measures are used to evaluate the performance of forecasts regardless of the source (e.g., model-based, judgment-based, possible adjustments, etc.). Rolling simulations can be used to evaluate ex-ante model performance by repeating the analyses over several forecast origins and lead times.
  • FIG. 34 is a block diagram depicting an example system 4000 for evaluating the performance of a combined forecast model using a rolling simulation analysis. The combined forecast model may be generated from a plurality of individual forecast models 4010 using a combined forecast engine 4020, for example using one or more of the systems and methods described above with reference to FIGS. 1-33. The plurality of forecast models 4010 may, for example, be generated based on sampled time-series data 4030 using a hierarchical time series forecasting system 4040, such as the Forecast Studio and Forecast Server software sold by SAS Institute Inc. of Cary, N.C.
  • The example system 4000 includes a rolling simulation engine 4050 that works in combination with the combined forecast engine 4020 to select a model combination 4060 and evaluate its performance over an out-of-sample range 4065. For instance, the rolling simulation engine 4050 and/or the combined forecast engine 4020 may present a user interface to select two or more of the individual forecast models 4010 for use in generating a combined forecast. In addition, a user interface may be provided by the rolling simulation engine 4050 to define the out-of-sample range 4065 (e.g., BACK=) over which the combined forecast is to be evaluated. The combined forecast is then repeated over the entire out-of-sample range, e.g., over the range of BACK=0 to a specified “back” value. The resulting ex-post forecasts 4070 (e.g., the forecasts for each BACK= value in the out-of-sample range) may then be displayed on a user interface 4080 along with the actual out-of-sample data 4090, such that the ex-post forecasts may be visually compared with the actual out-of-sample data over the specified period. In certain embodiments, the rolling simulation engine 4050 may also be used to simulate and display ex-ante forecasts for the combined model over a rolling simulation horizon (e.g., a LEAD= value).
  • The rolling simulation engine 4050 may also calculate one or more performance statistics 4100 based on statistical comparisons of the actual out-of-sample data and the ex-post forecasts over the specified out-of-sample period 4065. The performance statistics 4100 may, for example, be displayed on a different tab of the user interface 4080. The performance statistics 4100 may include statistics that provide an indication of the average error between the forecast 4070 and the actual out-of-sample data 4090, such as mean, mean absolute percentage error (MAPE), mean absolute error (MAE), median absolute deviation (MAD) and/or MAD/Mean ratio calculations. Because these statistics indicate average errors between the actual and forecasted data, a low variance for the statistics over different out-of-sample ranges (e.g., different BACK= values) may provide an indication that the combined model is forecasting with required accuracy. On the other hand, a high variation in the forecast performance statistics 4100 over an out-of-sample range may indicate that the combined model is not performing at the required accuracy and may therefore result in larger errors when used over a wider horizon.
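  • By way of illustration only, several of the average-error statistics named above may be computed as in the following sketch. The function name forecast_error_stats and the sample values are assumptions for this sketch. Only the mean error, MAE, and MAPE are shown, and the error is taken as actual minus forecast.

```python
from statistics import mean
from typing import Dict, List


def forecast_error_stats(actual: List[float], forecast: List[float]) -> Dict[str, float]:
    """Average-error statistics comparing ex-post forecasts with actual values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    errors = [a - f for a, f in zip(actual, forecast)]
    abs_errors = [abs(e) for e in errors]
    # MAPE is undefined when an actual value is zero; this sketch assumes none are.
    pct_errors = [abs(e) / abs(a) for e, a in zip(errors, actual)]
    return {
        "ME": mean(errors),                 # mean error
        "MAE": mean(abs_errors),            # mean absolute error
        "MAPE": 100.0 * mean(pct_errors),   # mean absolute percentage error
    }


# Hypothetical out-of-sample actual values and ex-post forecasts.
stats = forecast_error_stats([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
```

  • Repeating such a calculation for each out-of-sample range and comparing the results is one way the stability of the statistics across different BACK= values might be examined.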
  • In certain embodiments, the rolling simulation engine 4050 and/or the combined forecast engine 4020 may be further configured to dynamically adjust one or more characteristics of the combined model based on the forecast horizon. For instance, the combined model list may be re-run for each (BACK, LEAD) pair such that the set of model candidates included in the combination can change from one pair of (BACK, LEAD) values to the next depending on how the combined model list is defined. For example, certain forecast quality tests, such as an encompassing test can be specified. An encompassing test examines forecasts produced by the component models of a combined model to determine whether the forecasts produced by one or more of the component models that make up the combined model are redundant when processing given data. For each (BACK, LEAD) pair, an encompassing test may be performed on the combined model based on the out-of-sample input data to be provided to the combined model for that (BACK, LEAD) pair. Should one or more of the component models be found to be redundant, those component models can be omitted from the combined model for that (BACK, LEAD) pair, with the weightings of the component models of the combined model being automatically adjusted accordingly. Component models can be dropped for other quality reasons as well, such as the inability or ineffectiveness of those component models in handling missing values present in the candidate model forecast's historical period and/or its horizon for a particular (BACK, LEAD) pair. In addition, component models may be dropped if the chosen method of weight estimation fails to produce a valid weight estimate for that model's forecast. Component model weights may also adaptively rescale over the span of the (BACK, LEAD) period to account for candidate models with missing values in their forecasts.
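  • By way of illustration only, the adaptive rescaling of component model weights described above may be sketched as follows. The function name combine_with_rescale, the model names, and the use of None to mark a missing forecast value are assumptions for this sketch and are not taken from the disclosed system.

```python
import math
from typing import Dict, List, Optional


def combine_with_rescale(
    forecasts: Dict[str, List[Optional[float]]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted combination that renormalizes the weights at each time step
    over the component forecasts that are present (not None)."""
    horizon = len(next(iter(forecasts.values())))
    combined = []
    for h in range(horizon):
        available = {
            name: series[h]
            for name, series in forecasts.items()
            if series[h] is not None
        }
        total = sum(weights[name] for name in available)
        if not available or total == 0:
            # No usable component forecast for this period.
            combined.append(math.nan)
            continue
        # Dividing by the remaining weight restores the sum-to-one restriction.
        combined.append(
            sum(weights[name] / total * value for name, value in available.items())
        )
    return combined


# Hypothetical component forecasts; m2 is missing in period 2, all are missing in period 3.
forecasts = {
    "m1": [10.0, 10.0, None],
    "m2": [20.0, None, None],
    "m3": [40.0, 40.0, None],
}
weights = {"m1": 0.5, "m2": 0.25, "m3": 0.25}
combined = combine_with_rescale(forecasts, weights)
```

  • The same renormalization applies when a component model is dropped entirely for a given (BACK, LEAD) pair, for example after an encompassing test, since that model then contributes no forecast values at any time step.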
  • An example operation of the system 4000 of FIG. 34 is further detailed with reference to the exemplary screen shots set forth in FIGS. 35-41. With reference first to FIG. 35, this figure illustrates an example interface 4200 which may be generated to select a model or model combination for the rolling simulation analysis. The interface 4200 includes a model selection region 4210 that lists the fitted individual models that are available for analysis. The interface 4200 may also include a forecast summary region 4220 that displays forecast data and/or other information relating to a selected one of the fitted models listed in the selection region 4210. In order to specify a combined model for evaluation, the interface 4200 may enable the user to highlight two or more of the individual models from the selection region 4210 and then select a combine models icon 4230. Selecting the combine models icon 4230 may cause a combine models interface to be displayed, for example in a pop-up window, as illustrated in FIG. 36.
  • FIG. 36 depicts an example of an interface 4300 which may be generated to define the characteristics of a combined model for the rolling simulation analysis. The interface 4300 includes a model selection region 4310 for selecting the individual models for combination. As illustrated, the model selection region 4310 may list the model names along with one or more characteristics of the model, such as the model type and a model performance statistic (e.g., MAPE). In the illustrated example, two models (TOP1 and TOP2) are selected for combination.
  • The model combination interface 4300 provides a plurality of user-editable fields 4320 for defining the characteristics of the combined model. For instance, the user-editable fields 4320 may include a field for defining the model combination method (e.g., average or specify weights), a field for editing any specific weights applied to each individual model, a field to define how the combined model will treat missing values, and/or other fields for defining the characteristics of the combined model. In addition, the interface 4300 may further include fields 4330 that may be used to define the percentage of missing forecast values in the combination horizon and the percentage of missing forecast values in the combination estimation region. The interface 4300 may also provide regions for naming the combined model and providing a model description.
  • As illustrated in FIG. 37, once the characteristics of the combined model have been established using the model combination interface, the model selection interface is updated to list the newly defined combined model 4400. The combined model 4400 may then be selected for rolling simulation analysis, as shown in FIG. 37. The rolling simulation analysis may, for example, be executed on the selected combined model 4400 by selecting a rolling simulation icon 4410 on the interface 4200.
  • FIG. 38 depicts an example rolling simulation interface 4500 that may be displayed upon executing a rolling simulation analysis on a selected combined model. The interface 4500 includes a user-editable region 4510 for defining the number of out-of-sample observations (e.g., the number of BACK= values) to be included in the simulation. Upon selecting a number of out-of-sample observations, the user may begin the simulation, for example by selecting a Run Simulation icon 4520. In the illustrated example, a simulation has been executed after selecting 6 out-of-sample observations. The rolling simulation engine then simulates an ex-post forecast for each out-of-sample observation (e.g., for each BACK= value 1-6), and plots the resulting forecasts on a display region 4530 of the interface 4500 along with the actual out-of-sample values. In this way, the rolling simulation interface 4500 enables the user to graphically view the predictions of the combined model at various forecast origins and compare the predictions to the actual out-of-sample data values.
  • For instance, in the illustrated example, six ex-post forecasts (BACK:1 through BACK:6) are plotted in the display region 4530 for comparison with the actual out-of-sample data values. Specifically, in the first out-of-sample observation (BACK:1) the combined forecast is generated one period into the past (December 02), in the next out-of-sample observation (BACK:2) the combined forecast is generated two periods into the past (December 02 through November 02), and so on through the sixth out-of-sample observation (BACK:6).
  • In addition, for further comparison, the rolling simulation interface 4500 of FIG. 38 also includes a numerical display region 4540 that displays the forecast values for each out-of-sample observation (BACK:1 through BACK:6) along with the actual values of the out-of-sample data. The forecasted values in the illustrated example are set forth in the numerical display region 4540 in bold text.
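  • By way of illustration only, the generation of one ex-post combined forecast per BACK value may be sketched as follows. The function names, the two stand-in component models (a last-value forecast and a historical-mean forecast), and the simple-average combination are assumptions for this sketch; they are not the models or the combination method of the illustrated example.

```python
from statistics import mean
from typing import Callable, Dict, List

# A component model maps (history, steps) to a list of `steps` forecasts.
Forecaster = Callable[[List[float], int], List[float]]


def naive_forecast(history: List[float], steps: int) -> List[float]:
    """Stand-in component model: repeat the last observed value."""
    return [history[-1]] * steps


def mean_forecast(history: List[float], steps: int) -> List[float]:
    """Stand-in component model: repeat the historical mean."""
    return [mean(history)] * steps


def rolling_simulation(
    series: List[float], back: int, models: List[Forecaster]
) -> Dict[int, List[float]]:
    """For each b = 1..back, hold out the last b observations, forecast them
    with every component model, and average the component forecasts."""
    if not 1 <= back < len(series):
        raise ValueError("back must leave at least one observation of history")
    results = {}
    for b in range(1, back + 1):
        history = series[:-b]  # data available at this forecast origin
        component = [model(history, b) for model in models]
        results[b] = [mean(step) for step in zip(*component)]
    return results


# Hypothetical series; BACK:1 forecasts the last period, BACK:2 the last two.
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
sim = rolling_simulation(series, 2, [naive_forecast, mean_forecast])
```

  • Each entry of the result holds the ex-post forecasts for one forecast origin, which may then be compared with the held-out actual values for the same periods.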
  • In certain embodiments, the rolling simulation engine may calculate an optimal number of out-of-sample observations to automatically populate the interface field 4510 with a back range default value. The default value for the back range field 4510 may, for example, be calculated using the following formula:

  • Default=min(lead,min(min(max(seasonality,4),52),min(t−6,max((int)(0.1*t),1))))
  • where:
      • lead: the LEAD option value
      • t (T): length of the series
      • seasonality (S): length of the seasonal cycle
      • Minimum value of BackRange is: Minimum=1
      • Maximum value of BackRange is: Maximum=(T minus 6)

  • defaultBackRange=Math.min(lead,Math.min(Math.min(Math.max(seasonality,4),52),Math.min(t-6,Math.max((int)(0.1*t),1))))
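  • By way of illustration only, the default back range formula above may be transcribed as in the following sketch. The function name default_back_range is an assumption for this sketch. The sketch applies the formula as written and does not separately enforce the stated minimum and maximum values of the back range.

```python
def default_back_range(lead: int, seasonality: int, t: int) -> int:
    """Default number of out-of-sample observations, per the stated formula.

    lead        -- the LEAD option value
    seasonality -- length of the seasonal cycle
    t           -- length of the series
    """
    # Seasonal cycle length, kept between 4 and 52 periods.
    seasonal_bound = min(max(seasonality, 4), 52)
    # About 10% of the series (at least 1), but never more than t - 6.
    length_bound = min(t - 6, max(int(0.1 * t), 1))
    return min(lead, min(seasonal_bound, length_bound))


# Hypothetical monthly series: 144 observations versus 40 observations.
long_series_default = default_back_range(lead=12, seasonality=12, t=144)
short_series_default = default_back_range(lead=12, seasonality=12, t=40)
```

  • For the longer hypothetical series the lead and seasonal terms bound the result, while for the shorter series the ten-percent-of-length term is the smallest and therefore determines the default.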
  • As illustrated in FIG. 39, the rolling simulation interface 4500 may also include one or more fields for use in generating and displaying one or more forecast performance statistics. The simulation statistics fields may, for example, be included in a separate tab 4600 of the rolling simulation interface 4500, as illustrated in FIG. 39. The simulation statistics tab 4600 includes a field 4610 for selecting a particular forecast performance statistic for identifying the error between the forecasted values and the actual out-of-sample values, such as a Mean, MAE, MAPE, MAD or MAD/Mean ratio calculation. A graph of the selected performance statistic at each lead time is displayed in a display region 4620 of the interface.
  • In addition, the simulation statistics tab 4600 may also include a numerical display region 4630 that displays the performance statistic values for each lead time. In the illustrated example, the numerical display region 4630 includes values for a plurality of different performance statistics calculated by the rolling simulation engine. In this way, the user may simultaneously view the values for multiple performance statistics for the combined model in the numerical display region 4630, and select a particular one of the performance statistics for graphical display 4620.
  • The graph 4620 and numerical display 4630 in the example illustrated in FIG. 39 show the performance error in terms of the lead time of the ex-post forecast. For illustration, let $e^{(m)}_{t-b+l|t-b} = y_{t-b+l} - \hat{y}^{(m)}_{t-b+l|t-b}$ represent the multi-step-ahead prediction error for the $(t-b+l)$th time period and for the $m$th time series model, for $b = 1, \ldots, B$ and $l = 1, \ldots, b$, where $b$ is the holdback index and $l$ is the lead index. The holdback index, $b$, rolls the forecast origin forward in time. For each holdback index, $b = 1, \ldots, B$, multi-step-ahead prediction errors are computed for each lead index, $l = 1, \ldots, b$. By gathering the prediction errors by the lead index, $l$, for all holdback indices, $b = 1, \ldots, B$, forecast performance can be evaluated by lead time. In other words, a model's $l$-step-ahead forecast performance can be evaluated across various forecast origins.
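  • By way of illustration only, the gathering of prediction errors by lead index may be sketched as follows. The function name mae_by_lead, the sample values, and the choice of the mean absolute error as the summary statistic are assumptions for this sketch. The input maps each holdback index b to the b forecasts generated for the last b observations of the series.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def mae_by_lead(actual: List[float], simulations: Dict[int, List[float]]) -> Dict[int, float]:
    """Group absolute prediction errors by lead index l across all holdback
    indices b, then average them for each lead."""
    by_lead = defaultdict(list)
    for b, forecasts in simulations.items():
        held_out = actual[-b:]  # observations t-b+1 .. t
        for lead, (y, y_hat) in enumerate(zip(held_out, forecasts), start=1):
            by_lead[lead].append(abs(y - y_hat))
    return {lead: mean(errs) for lead, errs in sorted(by_lead.items())}


# Hypothetical actual series and ex-post forecasts for b = 1 and b = 2.
actual = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
simulations = {1: [16.0], 2: [14.5, 14.5]}
lead_mae = mae_by_lead(actual, simulations)
```

  • In this sketch the one-step-ahead statistic pools errors from both forecast origins, while the two-step-ahead statistic draws only on the origin with b = 2, reflecting the restriction that l runs from 1 to b.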
  • With reference again to the simulation tab of the interface 4500, FIG. 40 illustrates the further ability of the rolling simulation engine to generate ex-ante (i.e., forward-looking) forecasts for the combined model. The example interface 4500 includes a user input region 4700 to select the desired number of periods to forecast. When a simulation is executed with both the number of out-of-sample observations 4510 and a number of forecast periods 4700 selected, the rolling simulation engine generates both ex-ante and ex-post forecasts for each of the out-of-sample observations. The forecasts for each out-of-sample observation are plotted along with the actual out-of-sample data in the graphical display region 4530 of the interface 4500. A vertical line 4730 separates the ex-post forecasts 4720 (to the left of the line) from the ex-ante forecasts 4710 (to the right of the line). Of course, actual out-of-sample data values are displayed only for past periods in which data has been recorded, which is why there are no actual out-of-sample values to the right of the vertical line 4730 in the illustrated example.
  • In addition, for further comparison, the example interface 4500 also includes numerical values for both the ex-ante and ex-post forecasts in the numerical display region 4540, along with the actual out-of-sample data values. The ex-ante and ex-post forecasted values in the illustrated example are set forth in the numerical display region 4540 in bold text.
  • FIG. 41 illustrates an example of how the graphical display on the simulation interface 4500 may be modified to compare specific out-of-sample observations. In the illustrated example, the interface includes user input fields 4800 for selecting either all of the simulations or select simulations for display. If the “select simulations” field is selected, then the interface 4500 only displays the forecasts for out-of-sample observations that are selected in the numerical display region 4540. In the illustrated example, the Back:2, Back:4 and Back:6 observations have been selected, and therefore only these three forecasts are displayed in the graphical display region 4530 along with the actual out-of-sample values.
  • The methods and systems described herein may be implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem. The software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform the methods and operations described herein. Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.
  • The systems' and methods' data (e.g., associations, mappings, data input, data output, intermediate data results, final data results, etc.) may be stored and implemented in one or more different types of computer-implemented data stores, such as different types of storage devices and programming constructs (e.g., RAM, ROM, Flash memory, flat files, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.). It is noted that data structures describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
  • The computer components, software modules, functions, data stores and data structures described herein may be connected directly or indirectly to each other in order to allow the flow of data needed for their operations. It is also noted that a module or processor includes but is not limited to a unit of code that performs a software operation, and can be implemented for example as a subroutine unit of code, or as a software function unit of code, or as an object (as in an object-oriented paradigm), or as an applet, or in a computer script language, or as another type of computer code. The software components and/or functionality may be located on a single computer or distributed across multiple computers depending upon the situation at hand.
  • It should be understood that as used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise. Further, as used in the description herein and throughout the claims that follow, the meaning of “each” does not require “each and every” unless the context clearly dictates otherwise. Finally, as used in the description herein and throughout the claims that follow, the meanings of “and” and “or” include both the conjunctive and disjunctive and may be used interchangeably unless the context expressly dictates otherwise; the phrase “exclusive or” may be used to indicate a situation where only the disjunctive meaning may apply.

Claims (26)

It is claimed:
1. A computer-program product tangibly embodied in a non-transitory, machine-readable storage medium having instructions stored thereon, the instructions operable to cause a data processing apparatus to perform operations including:
accessing an input that represents at least two models that are configured to provide information for ex-post forecasts and ex-ante forecasts with regard to a forecast variable;
accessing a historical time series that includes multiple historical observations of the forecast variable;
defining multiple distinct holdout time series within the historical time series, wherein each of the holdout time series includes a portion of the historical observations of the forecast variable and spans a corresponding time period;
generating a combination model by combining the models;
processing multi-step forecasting characteristics of the combination model by using ex-post forecasting at multiple different points of origin, wherein the processing of the multi-step forecasting characteristics of the combination model includes performing operations including:
using the combination model to generate ex-post forecasted values of the forecast variable at multiple times during the time period;
comparing the ex-post forecasted values to coinciding historical observations included in the holdout time series that spans the time period; and
calculating ex-post forecast errors based on the comparison; and
performing at least one of storing at least part of the multi-step forecasting characteristics of the combination model in a computer data store or transmitting at least part of the multi-step forecasting characteristics of the combination model.
2. The computer-program product of claim 1, wherein:
the ex-post forecast errors calculated with respect to each of the time periods are displayed with reference to the time periods spanned by the respective holdout time series,
the processing of the multi-step forecasting characteristics of the combination model occurs with respect to each of the time periods, and
the performing at least one of storing at least part of the multi-step forecasting characteristics of the combination model or transmitting at least part of the multi-step forecasting characteristics of the combination model comprises storing the ex-post forecast errors or transmitting the ex-post forecast errors.
3. The computer-program product of claim 1, wherein each of the time periods differs in duration from every other one of the time periods.
4. The computer-program product of claim 1, wherein the instructions are further operable to cause the data processing apparatus to perform operations including:
displaying an indication of multiple selectable holdout time series durations; and
receiving an input that represents a selection of at least two of the holdout time series durations, wherein the multiple holdout time series are defined based on the selected holdout time series durations.
5. The computer-program product of claim 1, wherein comparing the ex-post forecasted values to coinciding historical observations includes displaying the ex-post forecasted values and the coinciding historical observations on a same graph.
6. The computer-program product of claim 5, wherein the instructions are further operable to cause the data processing apparatus to perform operations including:
using the combination model to generate an ex-ante forecast of the forecasting variable, wherein the ex-ante forecast includes forecasted values of the forecasting variable with respect to at least one future time or future time period.
7. The computer-program product of claim 5, wherein the instructions are further operable to cause the data processing apparatus to perform operations including:
receiving an input representing an ex-ante forecasting step-ahead parameter, wherein generating the ex-ante forecast further includes setting a step-ahead time of the ex-ante forecast based on the input.
8. The computer-program product of claim 1, wherein:
defining multiple holdout time series within the historical time series is performed such that no two of the time periods begin at a same time,
the accessing the input that represents the at least two models is accessed from at least one computer data store, and
the instructions are further operable to cause the data processing apparatus to perform operations including displaying an output from a rolling simulation engine.
9. A computer-implemented method comprising:
accessing an input that represents at least two models that are configured to provide information for ex-post forecasts and ex-ante forecasts with regard to a forecast variable;
accessing a historical time series that includes multiple historical observations of the forecast variable;
defining multiple distinct holdout time series within the historical time series, wherein each of the holdout time series includes a portion of the historical observations of the forecast variable and spans a corresponding time period;
generating a combination model by combining the models;
processing multi-step forecasting characteristics of the combination model by using ex-post forecasting at multiple different points of origin, wherein the processing of the multi-step forecasting characteristics of the combination model includes performing operations including:
using the combination model to generate ex-post forecasted values of the forecast variable at multiple times during the time period;
comparing the ex-post forecasted values to coinciding historical observations included in the holdout time series that spans the time period; and
calculating ex-post forecast errors based on the comparison; and
performing at least one of storing at least part of the multi-step forecasting characteristics of the combination model in a computer data store or transmitting at least part of the multi-step forecasting characteristics of the combination model.
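The steps recited in claim 9 can be illustrated with a minimal sketch. It is not the applicant's implementation: the component models (`naive`, `mean_model`), the equal-weight `combination`, the end-anchored holdouts, and the function name `rolling_ex_post_errors` are all assumptions chosen for illustration. Each holdout length gives a different forecast origin, and the ex-post errors are computed as actual minus forecast.

```python
from typing import Callable, Dict, List, Sequence

# A forecaster takes the history and a number of steps, and returns that many values.
Forecaster = Callable[[Sequence[float], int], List[float]]


def naive(history: Sequence[float], steps: int) -> List[float]:
    """Hypothetical component model: repeat the last observed value."""
    return [history[-1]] * steps


def mean_model(history: Sequence[float], steps: int) -> List[float]:
    """Hypothetical component model: repeat the mean of the history."""
    return [sum(history) / len(history)] * steps


def combination(history: Sequence[float], steps: int) -> List[float]:
    """Equal-weight average of the two component forecasts (an assumption)."""
    a = naive(history, steps)
    b = mean_model(history, steps)
    return [(x + y) / 2 for x, y in zip(a, b)]


def rolling_ex_post_errors(
    series: Sequence[float],
    holdout_lengths: Sequence[int],
    model: Forecaster,
) -> Dict[int, List[float]]:
    """For each holdout length h, withhold the last h observations,
    forecast them from the remaining history, and return the
    ex-post errors (actual minus forecast) keyed by h."""
    errors: Dict[int, List[float]] = {}
    for h in holdout_lengths:
        if not 0 < h < len(series):
            raise ValueError("holdout length must be between 1 and len(series) - 1")
        history, holdout = series[:-h], series[-h:]
        forecast = model(history, h)
        errors[h] = [actual - predicted for actual, predicted in zip(holdout, forecast)]
    return errors


series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
errors = rolling_ex_post_errors(series, [1, 2, 3], combination)
for h, errs in errors.items():
    print(f"holdout of {h} observation(s): errors {errs}")
```

The resulting dictionary could then be stored or transmitted, which corresponds to the final step of the claim.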
10. The computer-implemented method of claim 9, wherein:
the ex-post forecast errors calculated with respect to each of the time periods are displayed with reference to the time periods spanned by the respective holdout time series,
the processing of the multi-step forecasting characteristics of the combination model occurs with respect to each of the time periods, and
the performing at least one of storing at least part of the multi-step forecasting characteristics of the combination model or transmitting at least part of the multi-step forecasting characteristics of the combination model comprises storing the ex-post forecast errors or transmitting the ex-post forecast errors.
11. The computer-implemented method of claim 9, wherein each of the time periods differs in duration from every other one of the time periods.
12. The computer-implemented method of claim 9, further comprising:
displaying an indication of multiple selectable holdout time series durations; and
receiving an input that represents a selection of at least two of the holdout time series durations, wherein the multiple holdout time series are defined based on the selected holdout time series durations.
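One way to turn selected holdout durations into holdout windows is sketched below. The function name `define_holdouts` and the choice to anchor every holdout at the end of the historical series are assumptions, not details taken from the claims. Under that assumption, distinct durations automatically give windows that start at different times.

```python
from typing import List, Sequence, Tuple


def define_holdouts(n_obs: int, durations: Sequence[int]) -> List[Tuple[int, int]]:
    """Return one (start, end) index pair per selected duration, end exclusive.

    Every holdout is anchored at the end of the history (an assumption),
    so each distinct duration yields a distinct start index.
    """
    unique = sorted(set(durations))
    if len(unique) < 2:
        raise ValueError("select at least two distinct holdout durations")
    if unique[0] < 1 or unique[-1] >= n_obs:
        raise ValueError("each duration must be between 1 and n_obs - 1")
    return [(n_obs - d, n_obs) for d in unique]


# Twelve historical observations; the user selects durations of 3, 6, and 2 periods.
windows = define_holdouts(12, [3, 6, 2])
print(windows)
```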
13. The computer-implemented method of claim 9, wherein comparing the ex-post forecasted values to coinciding historical observations includes displaying the ex-post forecasted values and the coinciding historical observations on a same graph.
14. The computer-implemented method of claim 13, further comprising:
using the combination model to generate an ex-ante forecast of the forecasting variable, wherein the ex-ante forecast includes forecasted values of the forecasting variable with respect to at least one future time or future time period.
15. The computer-implemented method of claim 13, further comprising:
receiving an input representing an ex-ante forecasting step-ahead parameter, wherein generating the ex-ante forecast further includes setting a step-ahead time of the ex-ante forecast based on the input.
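The ex-ante forecast with a user-supplied step-ahead parameter can be sketched as follows. The parameter name `lead`, the component models `drift` and `naive`, and the equal-weight average are hypothetical choices made only to show how the step-ahead input controls how far beyond the last observation the combination model forecasts.

```python
from typing import List, Sequence


def drift(history: Sequence[float], steps: int) -> List[float]:
    """Hypothetical component model: extend the average historical slope."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * k for k in range(1, steps + 1)]


def naive(history: Sequence[float], steps: int) -> List[float]:
    """Hypothetical component model: repeat the last observed value."""
    return [history[-1]] * steps


def ex_ante_forecast(history: Sequence[float], lead: int) -> List[float]:
    """Forecast `lead` future periods with an equal-weight combination.

    `lead` plays the role of the step-ahead parameter; the equal
    weighting is an assumption for illustration.
    """
    if lead < 1:
        raise ValueError("lead must be at least 1")
    if len(history) < 2:
        raise ValueError("need at least two observations to estimate drift")
    d = drift(history, lead)
    n = naive(history, lead)
    return [(a + b) / 2 for a, b in zip(d, n)]


future = ex_ante_forecast([10.0, 12.0, 14.0, 16.0], lead=3)
print(future)
```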
16. The computer-implemented method of claim 9, wherein:
defining multiple holdout time series within the historical time series is performed such that no two of the time periods begin at a same time,
the input that represents the at least two models is accessed from at least one computer data store, and
the instructions are further operable to cause the data processing apparatus to perform operations including displaying an output from a rolling simulation engine.
17. A computerized system comprising:
a processor configured to perform operations including:
accessing an input that represents at least two models that are configured to provide information for ex-post forecasts and ex-ante forecasts with regard to a forecast variable;
accessing a historical time series that includes multiple historical observations of the forecast variable;
defining multiple distinct holdout time series within the historical time series, wherein each of the holdout time series includes a portion of the historical observations of the forecast variable and spans a corresponding time period;
generating a combination model by combining the models;
processing multi-step forecasting characteristics of the combination model by using ex-post forecasting at multiple different points of origin, wherein the processing of the multi-step forecasting characteristics of the combination model includes performing operations including:
using the combination model to generate ex-post forecasted values of the forecast variable at multiple times during the time period;
comparing the ex-post forecasted values to coinciding historical observations included in the holdout time series that spans the time period; and
calculating ex-post forecast errors based on the comparison; and
performing at least one of storing at least part of the multi-step forecasting characteristics of the combination model in a computer data store or transmitting at least part of the multi-step forecasting characteristics of the combination model.
18. The system of claim 17, wherein:
the ex-post forecast errors calculated with respect to each of the time periods are displayed with reference to the time periods spanned by the respective holdout time series,
the operations for processing of the multi-step forecasting characteristics of the combination model occur with respect to each of the time periods, and
the operations for performing at least one of storing at least part of the multi-step forecasting characteristics of the combination model or transmitting at least part of the multi-step forecasting characteristics of the combination model comprise storing the ex-post forecast errors or transmitting the ex-post forecast errors.
19. The system of claim 17, wherein each of the time periods differs in duration from every other one of the time periods.
20. The system of claim 17, wherein the operations further include:
displaying an indication of multiple selectable holdout time series durations; and
receiving an input that represents a selection of at least two of the holdout time series durations, wherein the multiple holdout time series are defined based on the selected holdout time series durations.
21. The system of claim 17, wherein comparing the ex-post forecasted values to coinciding historical observations includes displaying the ex-post forecasted values and the coinciding historical observations on a same graph.
22. The system of claim 21, wherein the operations further include:
using the combination model to generate an ex-ante forecast of the forecasting variable, wherein the ex-ante forecast includes forecasted values of the forecasting variable with respect to at least one future time or future time period.
23. The system of claim 21, wherein the operations further include:
receiving an input representing an ex-ante forecasting step-ahead parameter, wherein generating the ex-ante forecast further includes setting a step-ahead time of the ex-ante forecast based on the input.
24. The system of claim 17, wherein:
defining multiple holdout time series within the historical time series is performed such that no two of the time periods begin at a same time,
the input that represents the at least two models is accessed from at least one computer data store, and
the instructions are further operable to cause the data processing apparatus to perform operations including displaying an output from a rolling simulation engine.
25. The system of claim 17, further comprising:
a combined forecast engine that is operable to combine predictions from the models to generate a single combined forecast.
26. The system of claim 25, further comprising:
a rolling simulation engine that is operable to interact with the combined forecast engine to define the characteristics of the combination model for a rolling simulation analysis.
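A combined forecast engine of the kind recited in claim 25 might, under simple assumptions, be a function that wraps several component models and returns one forecaster. The sketch below uses a normalized weighted average; the name `make_combined_forecaster`, the weighting scheme, and the two component models are illustrative assumptions and are not drawn from the claims.

```python
from typing import Callable, List, Sequence

Forecaster = Callable[[Sequence[float], int], List[float]]


def make_combined_forecaster(
    models: Sequence[Forecaster], weights: Sequence[float]
) -> Forecaster:
    """Return a forecaster whose output is the weighted average of the
    component forecasts. Weights are normalized to sum to one."""
    if len(models) != len(weights) or len(models) < 2:
        raise ValueError("need at least two models and one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]

    def combined(history: Sequence[float], steps: int) -> List[float]:
        forecasts = [m(history, steps) for m in models]
        return [sum(w * f[k] for w, f in zip(norm, forecasts)) for k in range(steps)]

    return combined


def naive(history: Sequence[float], steps: int) -> List[float]:
    """Hypothetical component model: repeat the last observed value."""
    return [history[-1]] * steps


def mean_model(history: Sequence[float], steps: int) -> List[float]:
    """Hypothetical component model: repeat the mean of the history."""
    return [sum(history) / len(history)] * steps


combined = make_combined_forecaster([naive, mean_model], weights=[3, 1])
single_forecast = combined([2.0, 4.0, 6.0], 2)
print(single_forecast)
```

Because the result is an ordinary callable, a rolling simulation routine could invoke it at each forecast origin without knowing how many component models sit behind it.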
US14/557,312 2011-07-22 2014-12-01 Computer-Implemented Systems and Methods for Testing Large Scale Automatic Forecast Combinations Abandoned US20150120263A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/557,312 US20150120263A1 (en) 2011-07-22 2014-12-01 Computer-Implemented Systems and Methods for Testing Large Scale Automatic Forecast Combinations

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US13/189,131 US20130024167A1 (en) 2011-07-22 2011-07-22 Computer-Implemented Systems And Methods For Large Scale Automatic Forecast Combinations
US201261594442P 2012-02-03 2012-02-03
US13/440,045 US9047559B2 (en) 2011-07-22 2012-04-05 Computer-implemented systems and methods for testing large scale automatic forecast combinations
US14/557,312 US20150120263A1 (en) 2011-07-22 2014-12-01 Computer-Implemented Systems and Methods for Testing Large Scale Automatic Forecast Combinations

Related Parent Applications (1)

Application Number Priority Date Filing Date Title
US13/440,045 Continuation US9047559B2 (en) 2011-07-22 2012-04-05 Computer-implemented systems and methods for testing large scale automatic forecast combinations

Publications (1)

Publication Number Publication Date
US20150120263A1 (en) 2015-04-30

Family

ID=47556391

Family Applications (2)

Application Number Priority Date Filing Date Title
US13/440,045 Active 2032-01-07 US9047559B2 (en) 2011-07-22 2012-04-05 Computer-implemented systems and methods for testing large scale automatic forecast combinations
US14/557,312 Abandoned US20150120263A1 (en) 2011-07-22 2014-12-01 Computer-Implemented Systems and Methods for Testing Large Scale Automatic Forecast Combinations

Family Applications Before (1)

Application Number Priority Date Filing Date Title
US13/440,045 Active 2032-01-07 US9047559B2 (en) 2011-07-22 2012-04-05 Computer-implemented systems and methods for testing large scale automatic forecast combinations

Country Status (1)

Country Link
US (2) US9047559B2 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9244887B2 (en) 2012-07-13 2016-01-26 Sas Institute Inc. Computer-implemented systems and methods for efficient structuring of time series data
US20170364614A1 (en) * 2016-06-16 2017-12-21 International Business Machines Corporation Adaptive forecasting of time-series
CN107798332A (en) * 2016-09-05 2018-03-13 华为技术有限公司 A kind of user's behavior prediction method and device
US9916282B2 (en) 2012-07-13 2018-03-13 Sas Institute Inc. Computer-implemented systems and methods for time series exploration
US10255085B1 (en) 2018-03-13 2019-04-09 Sas Institute Inc. Interactive graphical user interface with override guidance
US10331490B2 (en) 2017-11-16 2019-06-25 Sas Institute Inc. Scalable cloud-based time series analysis
US10338994B1 (en) 2018-02-22 2019-07-02 Sas Institute Inc. Predicting and adjusting computer functionality to avoid failures
US10438126B2 (en) 2015-12-31 2019-10-08 General Electric Company Systems and methods for data estimation and forecasting
US10560313B2 (en) 2018-06-26 2020-02-11 Sas Institute Inc. Pipeline system for time-series data forecasting
US10685283B2 (en) 2018-06-26 2020-06-16 Sas Institute Inc. Demand classification based pipeline system for time-series data forecasting
US10983682B2 (en) 2015-08-27 2021-04-20 Sas Institute Inc. Interactive graphical user-interface for analyzing and manipulating time-series projections
US11308414B2 (en) 2018-10-11 2022-04-19 International Business Machines Corporation Multi-step ahead forecasting using complex-valued vector autoregregression

Families Citing this family (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030200134A1 (en) * 2002-03-29 2003-10-23 Leonard Michael James System and method for large-scale automatic forecasting
US9547584B2 (en) * 2011-03-08 2017-01-17 Google Inc. Remote testing
US9147218B2 (en) 2013-03-06 2015-09-29 Sas Institute Inc. Devices for forecasting ratios in hierarchies
US9934259B2 (en) 2013-08-15 2018-04-03 Sas Institute Inc. In-memory time series database and processing in a distributed environment
US20150120381A1 (en) * 2013-10-24 2015-04-30 Oracle International Corporation Retail sales overlapping promotions forecasting using an optimized p-norm
US20150302432A1 (en) * 2014-04-17 2015-10-22 Sas Institute Inc. Classifying, Clustering, and Grouping Demand Series
US10169720B2 (en) * 2014-04-17 2019-01-01 Sas Institute Inc. Systems and methods for machine learning using classifying, clustering, and grouping time series data
US9892370B2 (en) 2014-06-12 2018-02-13 Sas Institute Inc. Systems and methods for resolving over multiple hierarchies
US11188946B2 (en) * 2014-07-14 2021-11-30 Nec Corporation Commercial message planning assistance system and sales prediction assistance system
US9208209B1 (en) 2014-10-02 2015-12-08 Sas Institute Inc. Techniques for monitoring transformation techniques using control charts
US9732593B2 (en) * 2014-11-05 2017-08-15 Saudi Arabian Oil Company Systems, methods, and computer medium to optimize storage for hydrocarbon reservoir simulation
US10373068B2 (en) 2014-11-10 2019-08-06 International Business Machines Corporation Weight adjusted composite model for forecasting in anomalous environments
US9418339B1 (en) 2015-01-26 2016-08-16 Sas Institute, Inc. Systems and methods for time series analysis techniques utilizing count data sets
WO2016129218A1 (en) * 2015-02-09 2016-08-18 日本電気株式会社 Display system for displaying analytical information, method, and program
WO2016152053A1 (en) * 2015-03-23 2016-09-29 日本電気株式会社 Accuracy-estimating-model generating system and accuracy estimating system
US10509869B2 (en) * 2015-09-15 2019-12-17 Baker Street Scientific Inc. System and method for heuristic predictive and nonpredictive modeling
US10394973B2 (en) * 2015-12-18 2019-08-27 Fisher-Rosemount Systems, Inc. Methods and apparatus for using analytical/statistical modeling for continued process verification (CPV)
US9940033B1 (en) * 2015-12-30 2018-04-10 EMC IP Holding Company LLC Method, system and computer readable medium for controlling performance of storage pools
US10726354B2 (en) 2016-01-29 2020-07-28 Splunk Inc. Concurrently forecasting multiple time series
US10650045B2 (en) 2016-02-05 2020-05-12 Sas Institute Inc. Staged training of neural networks for improved time series prediction performance
US10795935B2 (en) 2016-02-05 2020-10-06 Sas Institute Inc. Automated generation of job flow definitions
US10650046B2 (en) 2016-02-05 2020-05-12 Sas Institute Inc. Many task computing with distributed file system
US10642896B2 (en) 2016-02-05 2020-05-05 Sas Institute Inc. Handling of data sets during execution of task routines of multiple languages
US10885461B2 (en) 2016-02-29 2021-01-05 Oracle International Corporation Unsupervised method for classifying seasonal patterns
US10867421B2 (en) 2016-02-29 2020-12-15 Oracle International Corporation Seasonal aware method for forecasting and capacity planning
US10331802B2 (en) 2016-02-29 2019-06-25 Oracle International Corporation System for detecting and characterizing seasons
US10198339B2 (en) 2016-05-16 2019-02-05 Oracle International Corporation Correlation-based analytic for time-series data
US11082439B2 (en) 2016-08-04 2021-08-03 Oracle International Corporation Unsupervised method for baselining and anomaly detection in time-series data for enterprise systems
CN109791511A (en) * 2016-09-29 2019-05-21 惠普发展公司,有限责任合伙企业 Unit failure prediction
CN106599615B (en) * 2016-11-30 2019-04-05 广东顺德中山大学卡内基梅隆大学国际联合研究院 A kind of sequence signature analysis method for predicting miRNA target gene
USD898059S1 (en) 2017-02-06 2020-10-06 Sas Institute Inc. Display screen or portion thereof with graphical user interface
US10949436B2 (en) * 2017-02-24 2021-03-16 Oracle International Corporation Optimization for scalable analytics using time series models
US10915830B2 (en) 2017-02-24 2021-02-09 Oracle International Corporation Multiscale method for predictive alerting
US10817803B2 (en) 2017-06-02 2020-10-27 Oracle International Corporation Data driven methods and systems for what if analysis
USD898060S1 (en) 2017-06-05 2020-10-06 Sas Institute Inc. Display screen or portion thereof with graphical user interface
WO2019109338A1 (en) * 2017-12-08 2019-06-13 Nokia Shanghai Bell Co., Ltd Methods and systems for generation and adaptation of network baselines
JP6981539B2 (en) * 2018-03-30 2021-12-15 日本電気株式会社 Model estimation system, model estimation method and model estimation program
US11138090B2 (en) 2018-10-23 2021-10-05 Oracle International Corporation Systems and methods for forecasting time series with variable seasonality
US10855548B2 (en) 2019-02-15 2020-12-01 Oracle International Corporation Systems and methods for automatically detecting, summarizing, and responding to anomalies
US11106344B2 (en) * 2019-03-12 2021-08-31 DecisionNext, Inc. Methods and devices for capturing heuristic information via a weighting tool
US11533326B2 (en) 2019-05-01 2022-12-20 Oracle International Corporation Systems and methods for multivariate anomaly detection in software monitoring
US11537940B2 (en) 2019-05-13 2022-12-27 Oracle International Corporation Systems and methods for unsupervised anomaly detection using non-parametric tolerance intervals over a sliding window of t-digests
US11842252B2 (en) * 2019-06-27 2023-12-12 The Toronto-Dominion Bank System and method for examining data from a source used in downstream processes
WO2021168383A1 (en) * 2020-02-21 2021-08-26 Mission Bio, Inc. Using machine learning to optimize assays for single cell targeted sequencing
US20210034712A1 (en) * 2019-07-30 2021-02-04 Intuit Inc. Diagnostics framework for large scale hierarchical time-series forecasting models
US11887015B2 (en) 2019-09-13 2024-01-30 Oracle International Corporation Automatically-generated labels for time series data and numerical lists to use in analytic and machine learning systems
US11163435B2 (en) * 2019-10-06 2021-11-02 Td Ameritrade Ip Company, Inc. Systems and methods for computerized generation of user interface systems
US11409966B1 (en) 2021-12-17 2022-08-09 Sas Institute Inc. Automated trending input recognition and assimilation in forecast modeling
CN117130882B (en) * 2023-08-14 2024-03-08 中南民族大学 Node resource prediction method and system based on time sequence intervention analysis model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6745150B1 (en) * 2000-09-25 2004-06-01 Group 1 Software, Inc. Time series analysis and forecasting program
US20060178927A1 (en) * 2005-02-04 2006-08-10 Taiwan Semiconductor Manufacturing Co., Ltd. Demand forecast system and method

Family Cites Families (111)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5461699A (en) 1993-10-25 1995-10-24 International Business Machines Corporation Forecasting using a neural network and a statistical forecast
US6052481A (en) 1994-09-02 2000-04-18 Apple Computers, Inc. Automatic method for scoring and clustering prototypes of handwritten stroke-based data
JP3302522B2 (en) 1994-12-26 2002-07-15 富士通株式会社 Database system and its information utilization support device
US5615109A (en) 1995-05-24 1997-03-25 Eder; Jeff Method of and system for generating feasible, profit maximizing requisition sets
US5870746A (en) 1995-10-12 1999-02-09 Ncr Corporation System and method for segmenting a database based upon data attributes
EP0770967A3 (en) 1995-10-26 1998-12-30 Koninklijke Philips Electronics N.V. Decision support system for the management of an agile supply chain
US5995943A (en) 1996-04-01 1999-11-30 Sabre Inc. Information aggregation and synthesization system
US5901287A (en) 1996-04-01 1999-05-04 The Sabre Group Inc. Information aggregation and synthesization system
US6189029B1 (en) 1996-09-20 2001-02-13 Silicon Graphics, Inc. Web survey tool builder and result compiler
US6400853B1 (en) 1997-03-19 2002-06-04 Canon Kabushiki Kaisha Image retrieval apparatus and method
US6063028A (en) 1997-03-20 2000-05-16 Luciano; Joanne Sylvia Automated treatment selection method
JP2002513489A (en) 1997-05-21 2002-05-08 カイメトリクス・インコーポレーテッド Method of controlled optimization of corporate planning model
US5991740A (en) 1997-06-10 1999-11-23 Messer; Stephen Dale Data processing system for integrated tracking and management of commerce related activities on a public access network
US6169534B1 (en) 1997-06-26 2001-01-02 Upshot.Com Graphical user interface for customer information management
JP3699807B2 (en) 1997-06-30 2005-09-28 株式会社東芝 Correlation extractor
US6128624A (en) 1997-11-12 2000-10-03 Ncr Corporation Collection and integration of internet and electronic commerce data in a database during web browsing
US6151584A (en) 1997-11-20 2000-11-21 Ncr Corporation Computer architecture and method for validating and collecting and metadata and data about the internet and electronic commerce environments (data discoverer)
US5918232A (en) 1997-11-26 1999-06-29 Whitelight Systems, Inc. Multidimensional domain modeling method and system
US6286005B1 (en) 1998-03-11 2001-09-04 Cannon Holdings, L.L.C. Method and apparatus for analyzing data and advertising optimization
US7260550B1 (en) 1998-09-18 2007-08-21 I2 Technologies Us, Inc. System and method for multi-enterprise supply chain optimization
JP4098420B2 (en) 1998-11-04 2008-06-11 富士通株式会社 Synchronous reconstruction method and apparatus for acoustic data and moving image data
US6397166B1 (en) 1998-11-06 2002-05-28 International Business Machines Corporation Method and system for model-based clustering and signal-bearing medium for storing program of same
US6216129B1 (en) 1998-12-03 2001-04-10 Expanse Networks, Inc. Advertisement selection system supporting discretionary target market characteristics
US6334110B1 (en) 1999-03-10 2001-12-25 Ncr Corporation System and method for analyzing customer transactions and interactions
US6591255B1 (en) 1999-04-05 2003-07-08 Netuitive, Inc. Automatic data extraction, error correction and forecasting system
US7072863B1 (en) 1999-09-08 2006-07-04 C4Cast.Com, Inc. Forecasting using interpolation modeling
US6792399B1 (en) 1999-09-08 2004-09-14 C4Cast.Com, Inc. Combination forecasting using clusterization
US6611726B1 (en) 1999-09-17 2003-08-26 Carl E. Crosswhite Method for determining optimal time series forecasting parameters
US6850871B1 (en) 1999-10-18 2005-02-01 Agilent Technologies, Inc. Method and apparatus for extraction of nonlinear black-box behavioral models from embeddings of the time-domain measurements
US6526405B1 (en) 1999-12-17 2003-02-25 Microsoft Corporation Determining similarity between event types in sequences
US6564190B1 (en) 1999-12-30 2003-05-13 General Electric Capital Corporaton Method and formulating an investment strategy for real estate investment
US6775646B1 (en) 2000-02-23 2004-08-10 Agilent Technologies, Inc. Excitation signal and radial basis function methods for use in extraction of nonlinear black-box behavioral models
US6539392B1 (en) 2000-03-29 2003-03-25 Bizrate.Com System and method for data collection, evaluation, information generation, and presentation
US6356842B1 (en) 2000-04-18 2002-03-12 Carmel Systems, Llc Space weather prediction system and method
US6542869B1 (en) 2000-05-11 2003-04-01 Fuji Xerox Co., Ltd. Method for automatic analysis of audio including music and speech
US7194434B2 (en) 2000-06-15 2007-03-20 Sergio Piccioli Method for predictive determination of financial investment performance
US7222082B1 (en) 2000-06-28 2007-05-22 Kronos Technology Systems Limited Partnership Business volume and workforce requirements forecaster
US6978249B1 (en) 2000-07-28 2005-12-20 Hewlett-Packard Development Company, L.P. Profile-based product demand forecasting
US7130822B1 (en) 2000-07-31 2006-10-31 Cognos Incorporated Budget planning
US6640227B1 (en) 2000-09-05 2003-10-28 Leonid Andreev Unsupervised automated hierarchical data clustering based on simulation of a similarity matrix evolution
US6876988B2 (en) 2000-10-23 2005-04-05 Netuitive, Inc. Enhanced computer performance forecasting system
WO2002037376A1 (en) 2000-10-27 2002-05-10 Manugistics, Inc. Supply chain demand forecasting and planning
US7523048B1 (en) 2001-01-19 2009-04-21 Bluefire Systems, Inc. Multipurpose presentation demand calendar for integrated management decision support
US6928398B1 (en) 2000-11-09 2005-08-09 Spss, Inc. System and method for building a time series model
US7660734B1 (en) 2000-12-20 2010-02-09 Demandtec, Inc. System for creating optimized promotion event calendar
US8010404B1 (en) 2000-12-22 2011-08-30 Demandtec, Inc. Systems and methods for price and promotion response analysis
US7437307B2 (en) 2001-02-20 2008-10-14 Telmar Group, Inc. Method of relating multiple independent databases
US7433834B2 (en) 2001-03-16 2008-10-07 Raymond Anthony Joao Apparatus and method for facilitating transactions
US6553352B2 (en) 2001-05-04 2003-04-22 Demand Tec Inc. Interface for merchandise price optimization
US7236940B2 (en) 2001-05-16 2007-06-26 Perot Systems Corporation Method and system for assessing and planning business operations utilizing rule-based statistical modeling
US20040172225A1 (en) 2001-06-01 2004-09-02 Prosanos Corp. Information processing method and system for synchronization of biomedical data
US7024388B2 (en) 2001-06-29 2006-04-04 Barra Inc. Method and apparatus for an integrative model of multiple asset classes
US7216088B1 (en) 2001-07-26 2007-05-08 Perot Systems Corporation System and method for managing a project based on team member interdependency and impact relationships
US20030101009A1 (en) 2001-10-30 2003-05-29 Johnson Controls Technology Company Apparatus and method for determining days of the week with similar utility consumption profiles
US7761324B2 (en) 2001-11-09 2010-07-20 Siebel Systems, Inc. Forecasting and revenue management system
US8108249B2 (en) 2001-12-04 2012-01-31 Kimberly-Clark Worldwide, Inc. Business planner
US7357298B2 (en) 2001-12-28 2008-04-15 Kimberly-Clark Worldwide, Inc. Integrating event-based production information with financial and purchasing systems in product manufacturing
JP3873135B2 (en) 2002-03-08 2007-01-24 インターナショナル・ビジネス・マシーンズ・コーポレーション Data processing method, information processing system and program using the same
US20030200134A1 (en) 2002-03-29 2003-10-23 Leonard Michael James System and method for large-scale automatic forecasting
US7634423B2 (en) 2002-03-29 2009-12-15 Sas Institute Inc. Computer-implemented system and method for web activity assessment
US20030212590A1 (en) 2002-05-13 2003-11-13 Klingler Gregory L. Process for forecasting product demand
JP2004038428A (en) 2002-07-02 2004-02-05 Yamatake Corp Method for generating model to be controlled, method for adjusting control parameter, program for generating the model, and program for adjusting the parameter
US20040030667A1 (en) 2002-08-02 2004-02-12 Capital One Financial Corporation Automated systems and methods for generating statistical models
US7570262B2 (en) 2002-08-08 2009-08-04 Reuters Limited Method and system for displaying time-series data and correlated events derived from text mining
JP4223767B2 (en) 2002-08-30 2009-02-12 富士通株式会社 Crossover detection method, radar apparatus, and crossover detection program
US7103222B2 (en) 2002-11-01 2006-09-05 Mitsubishi Electric Research Laboratories, Inc. Pattern discovery in multi-dimensional time series using multi-resolution matching
AU2003300823A1 (en) 2002-12-06 2004-06-30 Sandia Corporation Outcome prediction and risk classification in childhood leukemia
US7617167B2 (en) 2003-04-09 2009-11-10 Avisere, Inc. Machine vision system for enterprise management
US20050209732A1 (en) 2003-04-28 2005-09-22 Srinivasaragavan Audimoolam Decision support system for supply chain management
WO2005001631A2 (en) 2003-06-10 2005-01-06 Citibank, N.A. System and method for analyzing marketing efforts
US6878891B1 (en) 2003-11-03 2005-04-12 Siemens Energy & Automation, Inc. Switchgear enclosure
US7328111B2 (en) 2003-11-07 2008-02-05 Mitsubishi Electric Research Laboratories, Inc. Method for determining similarities between data sequences using cross-correlation matrices and deformation functions
US7512616B2 (en) 2003-11-20 2009-03-31 International Business Machines Corporation Apparatus, system, and method for communicating a binary code image
EP1544765A1 (en) 2003-12-17 2005-06-22 Sap Ag Method and system for planning demand for a configurable product in a managed supply chain
US7280986B2 (en) 2004-02-09 2007-10-09 The Board Of Trustees Of The University Of Illinois Methods and program products for optimizing problem clustering
US7409407B2 (en) 2004-05-07 2008-08-05 Mitsubishi Electric Research Laboratories, Inc. Multimedia event detection and summarization
US7565417B2 (en) 2004-05-20 2009-07-21 Rowady Jr E Paul Event-driven financial analysis interface and system
US7664173B2 (en) 2004-06-07 2010-02-16 Nahava Inc. Method and apparatus for cached adaptive transforms for compressing data streams, computing similarity, and recognizing patterns
EP1763782A4 (en) 2004-06-18 2009-04-08 Cvidya Networks Ltd Methods, systems and computer readable code for forecasting time series and for forecasting commodity consumption
JP4171514B2 (en) 2004-09-14 2008-10-22 株式会社アイ・ピー・ビー Document correlation diagram creation device that arranges documents in time series
US7620644B2 (en) 2004-10-19 2009-11-17 Microsoft Corporation Reentrant database object wizard
US20060112028A1 (en) 2004-11-24 2006-05-25 Weimin Xiao Neural Network and Method of Training
US7941339B2 (en) 2004-12-23 2011-05-10 International Business Machines Corporation Method and system for managing customer network value
US7702482B2 (en) 2004-12-30 2010-04-20 Microsoft Corporation Dependency structure from temporal data
AU2006227177A1 (en) 2005-03-22 2006-09-28 Ticketmaster Apparatus and methods for providing queue messaging over a network
US7610214B1 (en) 2005-03-24 2009-10-27 Amazon Technologies, Inc. Robust forecasting techniques with reduced sensitivity to anomalous data
US7171340B2 (en) 2005-05-02 2007-01-30 Sas Institute Inc. Computer-implemented regression systems and methods for time series data analysis
US8005707B1 (en) 2005-05-09 2011-08-23 Sas Institute Inc. Computer-implemented systems and methods for defining events
US7530025B2 (en) 2005-05-09 2009-05-05 Sas Institute Inc. Systems and methods for handling time-stamped data
US7251589B1 (en) 2005-05-09 2007-07-31 Sas Institute Inc. Computer-implemented system and method for generating forecasts
US7849049B2 (en) 2005-07-05 2010-12-07 Clarabridge, Inc. Schema and ETL tools for structured and unstructured data
US7937344B2 (en) 2005-07-25 2011-05-03 Splunk Inc. Machine data web
US7502763B2 (en) 2005-07-29 2009-03-10 The Florida International University Board Of Trustees Artificial neural network design and evaluation tool
US7873535B2 (en) 2005-11-04 2011-01-18 Accenture Global Services Ltd. Method and system for modeling marketing data
US20070203783A1 (en) 2006-02-24 2007-08-30 Beltramo Mark A Market simulation model
US7813870B2 (en) 2006-03-03 2010-10-12 Inrix, Inc. Dynamic time series prediction of future traffic conditions
US7711734B2 (en) 2006-04-06 2010-05-04 Sas Institute Inc. Systems and methods for mining transactional and time series data
US7842874B2 (en) 2006-06-15 2010-11-30 Massachusetts Institute Of Technology Creating music by concatenative synthesis
EP2111593A2 (en) 2007-01-26 2009-10-28 Information Resources, Inc. Analytic platform
US8036999B2 (en) 2007-02-14 2011-10-11 Isagacity Method for analyzing and classifying process data that operates a knowledge base in an open-book mode before defining any clusters
US7970759B2 (en) 2007-02-26 2011-06-28 International Business Machines Corporation System and method for deriving a hierarchical event based database optimized for pharmaceutical analysis
US20080288537A1 (en) 2007-05-16 2008-11-20 Fuji Xerox Co., Ltd. System and method for slide stream indexing based on multi-dimensional content similarity
US20090172035A1 (en) 2007-12-31 2009-07-02 Pieter Lessing System and method for capturing and storing casino information in a relational database system
US8374903B2 (en) * 2008-06-20 2013-02-12 Sas Institute Inc. Information criterion-based systems and methods for constructing combining weights for multimodel forecasting and prediction
US8266148B2 (en) 2008-10-07 2012-09-11 Aumni Data, Inc. Method and system for business intelligence analytics on unstructured data
US8145669B2 (en) 2009-12-11 2012-03-27 At&T Intellectual Property I, L.P. Methods and apparatus for representing probabilistic data using a probabilistic histogram
US8631040B2 (en) 2010-02-23 2014-01-14 Sas Institute Inc. Computer-implemented systems and methods for flexible definition of time intervals
US8601013B2 (en) 2010-06-10 2013-12-03 Micron Technology, Inc. Analyzing data using a hierarchical structure
US8326677B1 (en) 2010-12-09 2012-12-04 Jianqing Fan System and method for selecting an optimal forecasting hierarchy
US20130024167A1 (en) 2011-07-22 2013-01-24 Edward Tilden Blair Computer-Implemented Systems And Methods For Large Scale Automatic Forecast Combinations
US20130268318A1 (en) 2012-04-04 2013-10-10 Sas Institute Inc. Systems and Methods for Temporal Reconciliation of Forecast Results

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6745150B1 (en) * 2000-09-25 2004-06-01 Group 1 Software, Inc. Time series analysis and forecasting program
US20060178927A1 (en) * 2005-02-04 2006-08-10 Taiwan Semiconductor Manufacturing Co., Ltd. Demand forecast system and method

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9916282B2 (en) 2012-07-13 2018-03-13 Sas Institute Inc. Computer-implemented systems and methods for time series exploration
US10025753B2 (en) 2012-07-13 2018-07-17 Sas Institute Inc. Computer-implemented systems and methods for time series exploration
US10037305B2 (en) 2012-07-13 2018-07-31 Sas Institute Inc. Computer-implemented systems and methods for time series exploration
US9244887B2 (en) 2012-07-13 2016-01-26 Sas Institute Inc. Computer-implemented systems and methods for efficient structuring of time series data
US10983682B2 (en) 2015-08-27 2021-04-20 Sas Institute Inc. Interactive graphical user-interface for analyzing and manipulating time-series projections
US10438126B2 (en) 2015-12-31 2019-10-08 General Electric Company Systems and methods for data estimation and forecasting
US20170364614A1 (en) * 2016-06-16 2017-12-21 International Business Machines Corporation Adaptive forecasting of time-series
US10318669B2 (en) * 2016-06-16 2019-06-11 International Business Machines Corporation Adaptive forecasting of time-series
CN107798332A (en) * 2016-09-05 2018-03-13 华为技术有限公司 A kind of user's behavior prediction method and device
CN107798332B (en) * 2016-09-05 2021-04-20 华为技术有限公司 User behavior prediction method and device
US10331490B2 (en) 2017-11-16 2019-06-25 Sas Institute Inc. Scalable cloud-based time series analysis
US10338994B1 (en) 2018-02-22 2019-07-02 Sas Institute Inc. Predicting and adjusting computer functionality to avoid failures
US10255085B1 (en) 2018-03-13 2019-04-09 Sas Institute Inc. Interactive graphical user interface with override guidance
US10560313B2 (en) 2018-06-26 2020-02-11 Sas Institute Inc. Pipeline system for time-series data forecasting
US10685283B2 (en) 2018-06-26 2020-06-16 Sas Institute Inc. Demand classification based pipeline system for time-series data forecasting
US11308414B2 (en) 2018-10-11 2022-04-19 International Business Machines Corporation Multi-step ahead forecasting using complex-valued vector autoregregression

Also Published As

Publication number Publication date
US20130024173A1 (en) 2013-01-24
US9047559B2 (en) 2015-06-02

Similar Documents

Publication Publication Date Title
US9047559B2 (en) Computer-implemented systems and methods for testing large scale automatic forecast combinations
US20130024167A1 (en) Computer-Implemented Systems And Methods For Large Scale Automatic Forecast Combinations
CA2940752C (en) Intelligent visualization munging
Shepperd Software project economics: a roadmap
US11216741B2 (en) Analysis apparatus, analysis method, and non-transitory computer readable medium
Hu et al. An integrative framework for intelligent software project risk planning
US20190251458A1 (en) System and method for particle swarm optimization and quantile regression based rule mining for regression techniques
US20230385034A1 (en) Automated decision making using staged machine learning
Popović et al. A comparative evaluation of effort estimation methods in the software life cycle
Munialo et al. A review of agile software effort estimation methods
Acebes et al. On the project risk baseline: Integrating aleatory uncertainty into project scheduling
Pereira et al. Towards a characterization of BPM tools' simulation support: the case of BPMN process models
Sakhrawi et al. Support vector regression for enhancement effort prediction of Scrum projects from COSMIC functional size
Große Kamphake et al. Digitalization in controlling
CN115699044A (en) Software project risk assessment method and device, computer equipment and storage medium
Raphael et al. Incremental development of CBR strategies for computing project cost probabilities
US11423045B2 (en) Augmented analytics techniques for generating data visualizations and actionable insights
Orea et al. Common methodological choices in nonparametric and parametric analyses of firms’ performance
Ziegenbein et al. Machine learning algorithms in machining: A guideline for efficient algorithm selection
US20180130002A1 (en) Requirements determination
Shapiro et al. DPCM: a method for modelling and analysing design process changes based on the applied signposting model
Satapathy Effort estimation methods in software development using machine learning algorithms
US20160147816A1 (en) Sample selection using hybrid clustering and exposure optimization
Simperl et al. Exploring the economical aspects of ontology engineering
Andriichuk et al. Usage of expert decision-making support systems in information operations detection

Legal Events

Date Code Title Description
AS Assignment Owner name: SAS INSTITUTE INC., NORTH CAROLINA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BRZEZICKI, JERZY MICHAL;APTE, DINESH P.;LEONARD, MICHAEL J.;AND OTHERS;REEL/FRAME:034425/0015; Effective date: 20120403
STCB Information on status: application discontinuation Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION