Questionnaire Free Text Summarisation Using Hierarchical Classification

Garcia-Constantino, Matias; Coenen, Frans; Noble, P-J; Radford, Alan

doi:10.1007/978-1-4471-4739-8_3

Included in the following conference series:

International Conference on Innovative Techniques and Applications of Artificial Intelligence

Abstract

This paper presents an investigation into the summarisation of the free text element of questionnaire data using hierarchical text classification. The process assumes that text summarisation can be achieved through classification: several class labels are associated with each document, and these labels together constitute the summary. A hierarchical classification approach is proposed, which offers the advantage that different levels of classification can be used and that the summarisation can be customised according to the branch of the tree in which the current document is located. The approach is evaluated using free text from questionnaires used in the SAVSNET (Small Animal Veterinary Surveillance Network) project. The results demonstrate the viability of using hierarchical classification to generate free text summaries.
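
The idea of a summary built from class labels along a branch of a hierarchy can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node`, `keyword_classifier` and `summarise` names, the two-level hierarchy, and the veterinary labels and keywords are all hypothetical, and a simple keyword counter stands in for whatever trained classifier would sit at each node.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A local classifier maps a document to one child label, or None if nothing fits.
Classifier = Callable[[str], Optional[str]]


@dataclass
class Node:
    """One node of the class hierarchy, holding its own local classifier."""
    classify: Classifier
    children: Dict[str, "Node"] = field(default_factory=dict)


def keyword_classifier(keywords: Dict[str, List[str]]) -> Classifier:
    """Toy stand-in for a trained classifier: choose the label with most keyword hits."""
    def classify(text: str) -> Optional[str]:
        words = re.findall(r"[a-z]+", text.lower())
        scores = {label: sum(words.count(k) for k in kws)
                  for label, kws in keywords.items()}
        best = max(scores, key=lambda label: scores[label])
        return best if scores[best] > 0 else None
    return classify


def summarise(root: Node, text: str) -> List[str]:
    """Walk down the hierarchy; the labels collected along the path form the summary."""
    labels: List[str] = []
    node: Optional[Node] = root
    while node is not None:
        label = node.classify(text)
        if label is None:
            break
        labels.append(label)
        node = node.children.get(label)  # None for a leaf label ends the walk
    return labels


# Hypothetical two-level hierarchy (labels and keywords invented for illustration).
ROOT = Node(
    classify=keyword_classifier({
        "digestive": ["vomiting", "diarrhoea", "stomach"],
        "respiratory": ["cough", "sneezing", "nasal"],
    }),
    children={
        "digestive": Node(keyword_classifier({
            "vomiting": ["vomiting", "vomit"],
            "diarrhoea": ["diarrhoea"],
        })),
        "respiratory": Node(keyword_classifier({
            "coughing": ["cough", "coughing"],
            "sneezing": ["sneezing"],
        })),
    },
)

print(summarise(ROOT, "Dog presented with vomiting and an upset stomach."))
# prints ['digestive', 'vomiting']
```

Because each node carries its own classifier, the labels offered at the second level depend on the branch chosen at the first, which is one way to read the abstract's point about customising the summary per branch.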

Author information

Authors and Affiliations

Department of Computer Science, The University of Liverpool, Liverpool, L69 3BX, UK
Matias Garcia-Constantino & Frans Coenen
School of Veterinary Science, University of Liverpool, Leahurst, Neston, CH64 7TE, UK
P-J Noble & Alan Radford

Corresponding author

Correspondence to Matias Garcia-Constantino.

Editor information

Editors and Affiliations

School of Computing, University of Portsmouth, Whitepost Lane The Lilacs, Portsmouth, PO1 3AH, Hampshire, United Kingdom
Max Bramer
School of Computing, Engineering & Mathe, University of Brighton, Lewes Road, Brighton, BN2 4GJ, West Sussex, United Kingdom
Miltos Petridis

About this paper

Cite this paper

Garcia-Constantino, M., Coenen, F., Noble, PJ., Radford, A. (2012). Questionnaire Free Text Summarisation Using Hierarchical Classification. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXIX. SGAI 2012. Springer, London. https://doi.org/10.1007/978-1-4471-4739-8_3

DOI: https://doi.org/10.1007/978-1-4471-4739-8_3
Published: 09 October 2012
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4738-1
Online ISBN: 978-1-4471-4739-8
eBook Packages: Computer Science, Computer Science (R0)
