Title: On-line Active Learning in Data Stream Regression using Uncertainty Sampling based on Evolving Generalized Fuzzy Models

Author(s): Edwin Lughofer, Mahardhika Pratama

Abstract: In this paper, we propose three criteria for efficient sample selection in data stream regression problems within an on-line active learning context. Selection becomes important whenever the target values, which guide the update of the regressors as well as the implicit model structures, are costly or time-consuming to measure, and also when very fast model updates are required to cope with the real-time demands of stream mining. A central challenge is thus to reduce the number of selected samples as much as possible while keeping the predictive accuracy of the models at a high level; ideally, this should be achieved in an unsupervised, single-pass manner. Our selection criteria rely on three aspects: 1) the extrapolation degree combined with the model's non-linearity degree, measured in terms of a new specific homogeneity criterion among adjacent local approximators; 2) the uncertainty in the model outputs, measured in terms of confidence intervals using so-called adaptive local error bars, for which we integrate a weighted localization of an incremental noise level estimator and propose formulas for on-line merging of local error bars; 3) the uncertainty in the model parameters, estimated by the so-called A-optimality criterion, which relies on the Fisher information matrix. The selection criteria are developed in combination with evolving generalized Takagi-Sugeno (TS) fuzzy models (containing rules in arbitrarily rotated position), as previous publications have shown that these outperform conventional evolving TS models (containing axis-parallel rules).
The results on three high-dimensional real-world streaming problems show that a model update based on only 10%-20% of selected samples can still achieve accumulated model errors over time similar to those of a full model update on all samples. This can be achieved with negligible sensitivity to the size of the active learning latency buffer (ALLB).

Journal: IEEE Transactions on Fuzzy Systems
Publisher: IEEE Press
ISSN: 1941-0034
Pages: 292-309 (18 pages)
Year: 2018
Volume: 26
Number: 1
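The A-optimality idea mentioned in the abstract, selecting a sample when it noticeably reduces parameter uncertainty as measured by the trace of the inverse Fisher information matrix, can be sketched as follows. This is an illustrative toy version for a plain linear regression model, not the authors' implementation for evolving generalized TS fuzzy models; the function names, the `rel_gain` threshold, and the warm-up size are hypothetical choices.

```python
import numpy as np

def a_opt(X, ridge=1e-6):
    """A-optimality score: trace of the inverse (regularized) Fisher
    information matrix X^T X for a linear model with unit noise.
    Lower values mean better-determined parameters."""
    F = X.T @ X + ridge * np.eye(X.shape[1])
    return np.trace(np.linalg.inv(F))

def select(stream, rel_gain=0.05, warmup=5):
    """Return a boolean pick per stream sample: a sample is selected
    for labeling only if adding it reduces the A-optimality score by
    more than rel_gain (relative), i.e. it is informative enough."""
    X = stream[:warmup]                # bootstrap: take everything
    picks = [True] * warmup
    for x in stream[warmup:]:
        cand = np.vstack([X, x])
        gain = (a_opt(X) - a_opt(cand)) / a_opt(X)
        if gain > rel_gain:
            X = cand
            picks.append(True)
        else:
            picks.append(False)        # skip: target not requested
    return picks

picks = select(np.random.default_rng(0).standard_normal((100, 3)))
print(f"selected {sum(picks)} of {len(picks)} samples")
```

As the design intends, the selection rate drops as the Fisher information accumulates: early samples are almost always taken, while later ones are requested only when they fall in directions where the parameters are still uncertain, which is how a small labeling budget can retain most of the model accuracy.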
