Authors:
Andrea Ciapetti 1; Rosario Di Florio 1; Luigi Lomasto 1; Giuseppe Miscione 1; Giulia Ruggiero 1 and Daniele Toti 2
Affiliations:
1 Innovation Engineering S.r.l., Rome, Italy
2 Innovation Engineering S.r.l., Rome, Italy; Department of Sciences, Roma Tre University, Rome, Italy
Keyword(s):
Machine Learning, Neural Networks, Taxonomies, Text Classification.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Artificial Intelligence and Decision Support Systems; Biomedical Engineering; Biomedical Signal Processing; Computational Intelligence; Enterprise Information Systems; Health Engineering and Technology Applications; Human-Computer Interaction; Industrial Applications of Artificial Intelligence; Methodologies and Methods; Neural Network Software and Applications; Neural Networks; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Sensor Networks; Signal Processing; Soft Computing; Theory and Methods
Abstract:
This paper presents NETHIC, a software system for the automatic classification of textual documents based on hierarchical taxonomies and artificial neural networks. The approach combines the advantages of highly structured hierarchies of textual labels with the versatility and scalability of neural networks, yielding a text classifier that performs well in terms of both effectiveness and efficiency. The system was first tested as a general-purpose classifier on a generic document corpus and then applied to the specific domain tackled by DANTE, a European project that addresses criminal and terrorist-related online content, showing consistent results across both application domains.