Classifying informative and imaginative prose using complex networks

de Arruda, Henrique F.; Costa, Luciano da F.; Amancio, Diego R.

doi:10.1209/0295-5075/113/28007


arXiv:1507.07826v1 (cs)

[Submitted on 28 Jul 2015]


Authors: Henrique F. de Arruda, Luciano da F. Costa, Diego R. Amancio


Abstract: Statistical methods have been widely employed in recent years to capture many properties of language. The application of such techniques has improved several linguistic applications, including machine translation, automatic summarization and document classification. In document classification, many approaches have emphasized the semantic content of texts, as is the case with bag-of-words language models. This approach has certainly yielded reasonable performance. However, potentially useful features such as the structural organization of texts have been used in only a few studies. In this context, we probe how features derived from textual structure analysis can be effectively employed in a classification task. More specifically, we performed a supervised classification aimed at discriminating informative from imaginative documents. Using a networked model that describes the local topological/dynamical properties of function words, we achieved an accuracy rate of up to 95%, which is much higher than that of similar networked approaches. A systematic analysis of feature relevance revealed that symmetry and accessibility are among the most prominent network measurements. Our results suggest that these measurements could be used in related language applications, as they play a complementary role in characterizing texts.
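
To make the idea of "local properties of function words in a networked model" concrete, here is a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the word-adjacency construction (consecutive words are linked), the tiny `FUNCTION_WORDS` list, the helper names, and the particular form of accessibility used (the exponential of the entropy of an ordinary random walk after a fixed number of steps). The symmetry measurements mentioned in the abstract are not covered.

```python
import math
import re
from collections import defaultdict

# Hypothetical, deliberately tiny function-word list (illustration only).
FUNCTION_WORDS = {"the", "of", "and", "a", "in", "to"}


def build_adjacency_network(text):
    """Link each pair of consecutive words with an undirected edge."""
    words = re.findall(r"[a-z']+", text.lower())
    graph = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # skip self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return graph


def accessibility(graph, node, steps=2):
    """exp(entropy) of an ordinary random walk's position after `steps` steps.

    Assumed variant: the walk may revisit nodes, which keeps the sketch short.
    """
    probs = {node: 1.0}
    for _ in range(steps):
        spread = defaultdict(float)
        for current, p in probs.items():
            share = p / len(graph[current])
            for neighbor in graph[current]:
                spread[neighbor] += share
        probs = spread
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return math.exp(entropy)


def function_word_features(text, steps=2):
    """Per-function-word local features that could feed a supervised classifier."""
    graph = build_adjacency_network(text)
    return {
        word: {
            "degree": len(graph[word]),
            "accessibility": accessibility(graph, word, steps),
        }
        for word in sorted(FUNCTION_WORDS)
        if word in graph
    }
```

Calling `function_word_features(document_text)` on each document would yield one small feature dictionary per function word; flattening these into a fixed-length vector and passing it to any off-the-shelf supervised classifier is one plausible way to set up the informative-versus-imaginative task described above.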

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1507.07826 [cs.CL]
	(or arXiv:1507.07826v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1507.07826
Journal reference:	Europhysics Letters (EPL) 113 (2016) 28007
Related DOI:	https://doi.org/10.1209/0295-5075/113/28007

Submission history

From: Diego Amancio
[v1] Tue, 28 Jul 2015 15:59:39 UTC (1,009 KB)
