Rule Extraction using Genetic Programming for Accurate Sales Forecasting
2014 (English) Conference paper (Refereed)
The purpose of this paper is to propose and evaluate a method
for reducing the inherent tendency of genetic programming to
overfit small and noisy data sets. In addition, the use of different
optimization criteria for symbolic regression is demonstrated.
The key idea is to reduce the risk of overfitting noise in the training
data by introducing an intermediate predictive model in the
process. More specifically, instead of directly evolving a genetic
regression model based on labeled training data, the first step is
to generate a highly accurate ensemble model. Since ensembles
are very robust, the resulting predictions will contain less noise
than the original data set. In the second step, an interpretable
model is evolved, using the ensemble predictions, instead of the
true labels, as the target variable. Experiments on 175 sales forecasting
data sets, from one of Sweden’s largest wholesale companies,
show that the proposed technique obtained significantly
better predictive performance, compared to both straightforward
use of genetic programming and the standard M5P technique.
Naturally, the level of improvement depends critically on
the performance of the intermediate ensemble.
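The two-step idea in the abstract can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: a bagged k-nearest-neighbour regressor stands in for the "highly accurate ensemble", and an ordinary least-squares line stands in for the interpretable model that the paper evolves with genetic programming. All function names (`fit_ensemble`, `two_step_fit`, `make_toy_data`) and the synthetic data are hypothetical.

```python
import random
import statistics


def knn_predict(sample, x, k=5):
    """Average the targets of the k points in `sample` closest to x."""
    nearest = sorted(sample, key=lambda p: abs(p[0] - x))[:k]
    return statistics.fmean([y for _, y in nearest])


def fit_ensemble(train, n_members=25, seed=0):
    """Step 1 (stand-in): bagging. Each member is one bootstrap resample."""
    rng = random.Random(seed)
    return [[rng.choice(train) for _ in train] for _ in range(n_members)]


def ensemble_predict(members, x):
    """Average the k-NN predictions of all ensemble members."""
    return statistics.fmean([knn_predict(m, x) for m in members])


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (assumed surrogate)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def two_step_fit(train, seed=0):
    """Step 2: fit the interpretable model to ensemble predictions,
    not to the original (noisy) labels."""
    members = fit_ensemble(train, seed=seed)
    xs = [x for x, _ in train]
    ensemble_targets = [ensemble_predict(members, x) for x in xs]
    slope, intercept = fit_line(xs, ensemble_targets)
    return slope, intercept, ensemble_targets


def make_toy_data(n=40, seed=42):
    """Synthetic noisy data (an assumption; not the paper's sales data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 3.0)))
    return data


train = make_toy_data()
slope, intercept, ensemble_targets = two_step_fit(train)
print(f"surrogate rule: y = {slope:.2f} * x + {intercept:.2f}")
```

The only structural point the sketch is meant to show is in `two_step_fit`: the interpretable model never sees the true labels, only the smoothed ensemble predictions. Whether this improves accuracy on real data depends, as the abstract notes, on the quality of the intermediate ensemble.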
Place, publisher, year, edition, pages
IEEE, 2014.
Genetic programming, Rule extraction, Overfitting, Regression, Sales forecasting, Machine learning, Data mining
Computer Science Computer and Information Science
Identifiers
URN: urn:nbn:se:hb:diva-7320
Local ID: 2320/14624
ISBN: 978-1-4799-4518-4/14
OAI: oai:DiVA.org:hb-7320
DiVA: diva2:888033
5th IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2014), 9-12 December, Orlando, FL, USA
This work was supported by the Swedish Retail and Wholesale Development
Council through the project Innovative Business Intelligence Tools (2013:5).