
Incremental model selection and ensemble prediction under virtual concept drifting environments

  • Original Paper
  • Published in: Evolving Systems

Abstract

Model selection for machine learning systems is one of the most important issues to be addressed for obtaining greater generalization capability. This paper proposes a strategy for achieving model selection incrementally under virtual concept drifting environments, where the distribution of learning samples varies over time. To carry out incremental model selection, a system generally uses all the learning samples observed so far. Under virtual concept drifting environments, however, the distribution of the currently observed samples differs considerably from the distribution of the cumulative dataset, so model selection is usually unsuccessful. To overcome this problem, the author earlier proposed a weighted objective function and a model-selection criterion based on the predictive input density of the learning samples. Although that earlier method shows good performance on some datasets, it occasionally fails to yield appropriate learning results because of failures in predicting the actual input density. To reduce this adverse effect, the method proposed in this paper improves on the earlier one by yielding the desired outputs using an ensemble of the constructed radial basis function neural networks (RBFNNs).
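
The following sketch is a minimal illustration, not the implementation described in this paper: it shows the general idea of combining several trained RBFNNs into one weighted prediction. The function names, network parameters, and example networks are hypothetical; the only element taken from the paper is the inverse-error ensemble weighting derived in the appendix (Eq. 27).

    import numpy as np

    def rbfnn_predict(x, centers, widths, weights):
        # Output of a single RBF network: weighted sum of Gaussian basis functions.
        phi = np.exp(-np.sum((x - centers) ** 2, axis=1) / (2.0 * widths ** 2))
        return float(weights @ phi)

    def ensemble_predict(x, networks, estimated_errors):
        # Ensemble output: each network's weight pi_n is proportional to the
        # inverse of its estimated error (cf. Eq. 27 in the appendix).
        inv = 1.0 / np.asarray(estimated_errors, dtype=float)
        pi = inv / inv.sum()                     # weights sum to one
        outputs = np.array([rbfnn_predict(x, *net) for net in networks])
        return float(pi @ outputs)

    # Hypothetical usage: two small RBFNNs over a 2-dimensional input.
    net_a = (np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([0.5, 0.5]), np.array([1.0, -1.0]))
    net_b = (np.array([[0.5, 0.5]]), np.array([1.0]), np.array([0.7]))
    y = ensemble_predict(np.array([0.2, 0.3]), [net_a, net_b], estimated_errors=[0.1, 0.3])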



Notes

  1. We assume that the time interval for state transition is considerably longer than that for presenting each sample.

  2. Note that org-RBFNN is an optimal learning result under the assumption that the observed samples are i.i.d. samples from the original input distribution.


Author information

Corresponding author

Correspondence to Koichiro Yamauchi.

Appendices

Appendix: Derivation of Eq. 18

Under the condition ∑_n π_n = 1, the expected loss of the ensemble network can be rewritten as follows.

$$ \begin{aligned} E &= \int \left\{ F({\boldsymbol{x}}) - \sum_{n \in E_{x}} \pi_{n} f_{{\boldsymbol{\theta}}^{WLS}_{n}}({\boldsymbol{x}}) \right\}^{2} W_{1}({\boldsymbol{x}}) P({\boldsymbol{x}}) \, d{\boldsymbol{x}} \\ &= \int \left\{ \sum_{n \in E_{x}} \pi_{n} \left( F({\boldsymbol{x}}) - f_{{\boldsymbol{\theta}}^{WLS}_{n}}({\boldsymbol{x}}) \right) \right\}^{2} W_{1}({\boldsymbol{x}}) P({\boldsymbol{x}}) \, d{\boldsymbol{x}} \end{aligned} $$
(21)

Now, if we assume that there is no correlation between the errors of any two RBFNNs, the cross terms in Eq. 21 vanish and the above equation can be approximately rewritten as follows.

$$ \begin{aligned} E &\simeq \int \sum_{n \in E_{x}} \pi_{n}^{2} \left\{ F({\boldsymbol{x}}) - f_{{\boldsymbol{\theta}}^{WLS}_{n}}({\boldsymbol{x}}) \right\}^{2} W_{1}({\boldsymbol{x}}) P({\boldsymbol{x}}) \, d{\boldsymbol{x}} \\ &= \sum_{n \in E_{x}} \pi_{n}^{2} \int \left\{ F({\boldsymbol{x}}) - f_{{\boldsymbol{\theta}}^{WLS}_{n}}({\boldsymbol{x}}) \right\}^{2} W_{1}({\boldsymbol{x}}) P({\boldsymbol{x}}) \, d{\boldsymbol{x}} \end{aligned} $$
(22)

According to the definition of information criteria, the value of an information criterion represents the averaged expected error of the statistical model. Therefore, Eq. 22 can be approximated by

$$ E\simeq\sum_{n \in E_{x}}\pi_{n}^2 \hat{E}(\lambda_{n}). $$
(23)

From this discussion, the objective function for π_n is

$$ U \equiv \sum_{n \in E_{x}} \pi_{n}^2 \hat{E}(\lambda_{n}) + \lambda \left( 1-\sum_{n \in E_{x}} \pi_{n} \right), $$
(24)

where λ denotes a Lagrange multiplier. Setting the partial derivatives of Eq. 24 to zero yields

$$ \frac{{\partial U}}{\partial \pi_{n}}=0 \Longleftrightarrow \ \pi_{n}=\frac{{\lambda}}{2\hat{E}(\lambda_{n})} $$
(25)
$$ \frac{{\partial U}}{\partial \lambda}=0\ \Longleftrightarrow \sum_{n \in E_{x}}\pi_{n}=1. $$
(26)

From Eqs. 25 and 26, we obtain

$$ \pi_{n}^{*}=\frac{{1/\hat{E}(\lambda_{n})}}{\sum_{j \in E_{x}}1/ \hat{E}(\lambda_{j})}. $$
(27)
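
As a purely illustrative numerical check of Eq. 27 (the values of Ê are hypothetical), suppose an ensemble contains three networks with estimated errors Ê(λ_1) = 0.1, Ê(λ_2) = 0.2 and Ê(λ_3) = 0.4. Then

$$ \pi_{1}^{*} = \frac{10}{10 + 5 + 2.5} \approx 0.571, \quad \pi_{2}^{*} = \frac{5}{17.5} \approx 0.286, \quad \pi_{3}^{*} = \frac{2.5}{17.5} \approx 0.143, $$

so the network with the smallest estimated error receives the largest ensemble weight, and the weights sum to one.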

Derivation of Eq. 19

According to the definition of information criteria, IC_w predicts the mean log likelihood:

$$ N_{total}\log\hat{E}(\lambda_{n}) \simeq IC_{w}(\lambda_{n}) $$
(28)

Therefore, we obtain:

$$ \hat{E}(\lambda_{n})\simeq \exp \left(\frac{{1}}{N_{total}} IC_{w}(\lambda_{n}) \right). $$
(29)
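
Substituting Eq. 29 into Eq. 27 (a one-step consequence of the two equations above, not stated separately in the text) shows how the ensemble weights follow directly from the weighted information criterion of each network:

$$ \pi_{n}^{*} \simeq \frac{\exp\left(-IC_{w}(\lambda_{n})/N_{total}\right)}{\sum_{j \in E_{x}} \exp\left(-IC_{w}(\lambda_{j})/N_{total}\right)}, $$

so networks with smaller IC_w values, i.e. smaller estimated errors, receive larger weights.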


About this article

Cite this article

Yamauchi, K. Incremental model selection and ensemble prediction under virtual concept drifting environments. Evolving Systems 2, 249–260 (2011). https://doi.org/10.1007/s12530-011-9038-x
