Efficient Sample Selection in Data Stream Regression using Evolving Generalized Fuzzy Models
Language of the talk title:
English
Original conference title:
FUZZ-IEEE 2015
Language of the conference title:
English
Original abstract:
In this talk, we propose two criteria for efficient sample selection in data stream regression problems.
Sample selection becomes necessary whenever the target values, which guide the update of the regressors as well as the implicit model structures, are costly to measure.
Reducing the number of selected samples as much as possible while keeping the predictive accuracy of the models at a high level is thus a central challenge, especially in non-stationary environments where (permanent) system changes or expansions can be expected.
Our selection criteria rely on two aspects: (1) the extrapolation degree of the model combined with its degree of non-linearity (the higher the non-linearity, the more important a sample lying in the extrapolation region, outside the model's definition range, is for the model update); and (2) the uncertainty in the model outputs, which can be measured in terms of confidence intervals reflected by so-called adaptive error bars, which are updated over time synchronously with the model; we integrate a localization of an incremental, adaptive noise level estimator into the error bar calculation.
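The two criteria can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not give the exact definitions, so the per-feature range check, the product of extrapolation and non-linearity degree, the relative error bar width, the threshold values, and all function and parameter names below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SelectionThresholds:
    # Hypothetical threshold values; not taken from the abstract.
    extrapolation: float = 0.1
    uncertainty: float = 0.2


def extrapolation_degree(x: Sequence[float],
                         lower: Sequence[float],
                         upper: Sequence[float]) -> float:
    """Largest relative distance by which x leaves the per-feature
    range [lower, upper] seen so far; 0.0 if x lies inside the range."""
    degree = 0.0
    for xi, lo, hi in zip(x, lower, upper):
        span = (hi - lo) or 1.0  # avoid division by zero for constant features
        outside = max(lo - xi, xi - hi, 0.0)
        degree = max(degree, outside / span)
    return degree


def should_request_target(x: Sequence[float],
                          lower: Sequence[float],
                          upper: Sequence[float],
                          nonlinearity: float,
                          error_bar_width: float,
                          output_range: float,
                          thr: SelectionThresholds = SelectionThresholds()) -> bool:
    """Return True if the (costly) target value should be measured for x."""
    # Criterion 1 (assumed form): extrapolation weighted by non-linearity.
    crit1 = extrapolation_degree(x, lower, upper) * nonlinearity > thr.extrapolation
    # Criterion 2 (assumed form): error bar width relative to the output range.
    crit2 = error_bar_width / output_range > thr.uncertainty
    return crit1 or crit2
```

In a streaming loop, the model would be updated only for samples where `should_request_target` returns True; the ranges, the non-linearity degree, and the error bar width would have to be supplied by the evolving model itself.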
The selection criteria are developed in combination with evolving generalized Takagi-Sugeno (TS) fuzzy models (containing rules in arbitrarily rotated position), which were shown in previous publications to be superior to conventional evolving TS models (containing axis-parallel rules).
The results on two high-dimensional real-world streaming problems show that decreasing the number of model updates by about 80-85% (as only 15-20% of the samples are selected) still achieves accumulated model errors (RMSE, MAE) over time similar to those obtained when conducting a full update on all samples.