Edwin Lughofer,
"Efficient Sample Selection in Data Stream Regression using Evolving Generalized Fuzzy Models"
in: Proceedings of the International FUZZ-IEEE Conference 2015, Series FUZZ-IEEE 2015, IEEE Press, Istanbul, 2015
Original title:
Efficient Sample Selection in Data Stream Regression using Evolving Generalized Fuzzy Models
Language of the title:
English
Original book title:
Proceedings of the International FUZZ-IEEE Conference 2015
Original abstract:
In this paper, we propose two criteria for efficient sample selection in case of data stream regression problems. The selection becomes apparent whenever the target values, which guide the update of the regressors as well as the implicit model structures, are costly to measure. Reducing the samples used for model updates as much as possible while keeping the predictive accuracy of the models on a high level is thus a central challenge, especially in non-stationary environments where (permanent) system changes or expansion can be expected. Our selection criteria rely on two aspects: (1) the extrapolation degree of the model combined with its non-linearity degree, and (2) the uncertainty in model outputs, which can be measured in terms of confidence intervals reflected by so-called adaptive error bars, which are updated over time synchronously to the model. The selection criteria are developed in combination with evolving generalized Takagi-Sugeno (TS) fuzzy models (containing rules in arbitrarily rotated position), which could be shown to outperform conventional evolving TS models (containing axis-parallel rules) and other stream regression techniques in previous publications. The results based on two high-dimensional real-world streaming problems show that a decrease of the number of model updates by about 80-85% (as only 15-20% of samples are selected) can still achieve similar accumulated model errors over time to the case when performing a full update on all samples. This may yield a significant reduction of computational demands and of costs whenever targets are costly to measure.
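
The general idea of uncertainty-driven sample selection can be illustrated with a minimal Python sketch. It is not the method of the paper: instead of an evolving generalized TS fuzzy model with adaptive error bars, it assumes a plain recursive least squares (RLS) linear model and uses the quadratic form x^T P x as a simple stand-in for output uncertainty. The names `RLSModel`, `run_stream`, `make_stream`, the threshold value and the synthetic data are all hypothetical and chosen for illustration only.

```python
import random


class RLSModel:
    """Plain recursive least squares regressor (stand-in, not a fuzzy model)."""

    def __init__(self, dim, delta=100.0):
        self.w = [0.0] * dim
        # P approximates the inverse input covariance; a large delta means
        # high initial uncertainty everywhere.
        self.P = [[delta if i == j else 0.0 for j in range(dim)]
                  for i in range(dim)]

    def _P_times(self, x):
        return [sum(p * xj for p, xj in zip(row, x)) for row in self.P]

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def uncertainty(self, x):
        # x^T P x: large for inputs unlike those seen in earlier updates.
        return sum(xi * pxi for xi, pxi in zip(x, self._P_times(x)))

    def update(self, x, y):
        Px = self._P_times(x)
        denom = 1.0 + sum(xi * pxi for xi, pxi in zip(x, Px))
        gain = [pxi / denom for pxi in Px]
        err = y - self.predict(x)
        self.w = [wi + gi * err for wi, gi in zip(self.w, gain)]
        n = len(x)
        # P stays symmetric, so x^T P equals (P x)^T.
        self.P = [[self.P[i][j] - gain[i] * Px[j] for j in range(n)]
                  for i in range(n)]


def run_stream(stream, dim, threshold):
    """Request a target (and update) only when the uncertainty is high."""
    model = RLSModel(dim)
    n_selected = 0
    for x, measure_target in stream:
        if model.uncertainty(x) > threshold:
            y = measure_target()  # the costly measurement happens only here
            model.update(x, y)
            n_selected += 1
    return model, n_selected


def make_stream(n, rng):
    """Synthetic stationary stream: y = 0.5 + 2*x1 - x2 + noise (assumed)."""
    for _ in range(n):
        x = [1.0, rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        y = 0.5 * x[0] + 2.0 * x[1] - 1.0 * x[2] + rng.gauss(0.0, 0.05)
        yield x, (lambda y=y: y)


n_samples = 1000
model, n_selected = run_stream(make_stream(n_samples, random.Random(0)),
                               dim=3, threshold=0.05)
print(f"selected {n_selected} of {n_samples} samples")
print("weights:", [round(w, 3) for w in model.w])
```

Because this sketch has no forgetting and the synthetic stream is stationary, the uncertainty shrinks monotonically and selection eventually stops. In the non-stationary setting the abstract addresses, the model and its uncertainty estimate would need to react to system changes, which this simplified stand-in does not attempt.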