Abstract
We present an approach to inductive concept learning that uses multiple models for time series. Our objective is to improve the efficiency and accuracy of concept learning by decomposing learning tasks that admit multiple types of learning architectures and mixture estimation methods. The decomposition method adapts attribute subset selection and constructive induction (cluster definition) to define new subproblems. To these subproblem definitions we apply metric-based model selection, choosing from a database of learning components to produce a specification for supervised learning with a mixture model. We report positive learning results using temporal artificial neural networks (ANNs) on a synthetic multiattribute learning problem and on a real-world time series monitoring application.
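The core idea of combining specialists trained on different attribute subsets can be illustrated with a minimal mixture-of-experts sketch. This is an assumption-laden toy, not the paper's actual architecture (which uses temporal ANNs and metric-based selection over a component database): the two linear "experts" and the gating weights below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable row-wise softmax.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Two hypothetical experts, each specialized to a different
# attribute subset of the input (toy stand-ins for the paper's
# temporal ANN components).
def expert_a(x):
    return x[:, : x.shape[1] // 2].mean(axis=1)   # first half of attributes

def expert_b(x):
    return x[:, x.shape[1] // 2 :].mean(axis=1)   # second half of attributes

def mixture_predict(x, gate_weights):
    # Gating network: softmax scores over the full input decide
    # how much weight each expert's prediction receives.
    g = softmax(x @ gate_weights)                 # (n, 2) mixing coefficients
    preds = np.stack([expert_a(x), expert_b(x)], axis=1)
    return (g * preds).sum(axis=1)                # convex combination

x = rng.normal(size=(4, 6))
w = rng.normal(size=(6, 2))     # hypothetical gating weights
y = mixture_predict(x, w)
print(y.shape)
```

Because the gate outputs sum to one, each mixture prediction is a convex combination of the expert outputs; in the paper's setting the analogous choice among components is made by metric-based model selection rather than fixed random weights.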
Cite this article
Hsu, W.H., Ray, S.R. & Wilkins, D.C. A Multistrategy Approach to Classifier Learning from Time Series. Machine Learning 38, 213–236 (2000). https://doi.org/10.1023/A:1007694209216