Abstract
Poly-transformation extends the idea of ensemble learning to the transformation step of Knowledge Discovery in Databases (KDD): multiple transformations of the data are made before learning (data mining) is applied. The theoretical basis for poly-transformation is the same as that for other combining methods: combining predictors whose errors are uncorrelated reduces the overall error. It is not possible to demonstrate the utility of poly-transformation on standard benchmark datasets, because no pre-transformed versions of such data exist. We therefore demonstrate its utility on a single well-known hard problem for which we have expertise: predicting protein secondary structure from primary structure. We applied four different transformations of the data, each justified by biological background knowledge. We then applied four different learning methods (linear discrimination, back-propagation, C5.0, and learning vector quantization) to each of the four transformations individually, and combined the predictions from the different transformations to form the poly-transformation predictions. Each of the learning methods produced significantly higher accuracy with poly-transformation than with only a single transformation. Poly-transformation is the basis of the secondary structure prediction method Prof, one of the most accurate existing methods for this problem.
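As a concrete illustration of the scheme described in the abstract, the Python sketch below trains one base learner per transformation of the raw data and combines their outputs by majority vote. It is a minimal sketch, not the implementation behind Prof: the list of transformation functions, the use of logistic regression as a stand-in base learner, and integer-encoded class labels are all assumptions made purely for illustration.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def poly_transform_fit(X_raw, y, transformations):
        # Fit one classifier per transformation (view) of the raw data.
        # `transformations` is a list of hypothetical feature-construction
        # functions; logistic regression is a stand-in base learner.
        models = []
        for transform in transformations:
            X_t = transform(X_raw)
            clf = LogisticRegression(max_iter=1000).fit(X_t, y)
            models.append((transform, clf))
        return models

    def poly_transform_predict(models, X_raw):
        # Combine the per-transformation predictions by majority vote
        # (assumes integer-encoded class labels, e.g. 0 = helix,
        # 1 = strand, 2 = coil).
        votes = np.stack([clf.predict(transform(X_raw))
                          for transform, clf in models])
        return np.apply_along_axis(
            lambda column: np.bincount(column).argmax(), 0, votes)

The abstract does not specify how the per-transformation predictions are combined, so majority voting here is simply one common choice; averaging class probabilities or a trained combining classifier would fit the same framework.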
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
King, R.D., Ouali, M. (2004). Poly-transformation. In: Yang, Z.R., Yin, H., Everson, R.M. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2004. IDEAL 2004. Lecture Notes in Computer Science, vol 3177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28651-6_15
DOI: https://doi.org/10.1007/978-3-540-28651-6_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22881-3
Online ISBN: 978-3-540-28651-6