research-article

A link-based approach to semantic relation analysis

Published: 22 April 2015 Publication History

Abstract

Semantic relation analysis is an important problem in natural language processing. To capture the semantic relation between terms (words or phrases), various approaches have been proposed that use co-occurrence statistics within a corpus. However, building a robust relation measure remains challenging because of the complexity of natural language. In this paper, we present a novel approach to semantic relation analysis that takes into account both the pairwise relation and the link-based relation between terms. The pairwise relation captures the relation between terms from a local view: it measures their relation through the co-occurrence pattern between them. The link-based relation brings global information into the measure: it derives the relation between terms from the similarity of their context information. Combining the two relations yields a model for robust and accurate semantic relation analysis. Experimental evaluation indicates that the proposed approach gives substantially better document clustering results than existing methods.
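The abstract does not give formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes (all of these are illustrative choices, not taken from the paper) that the pairwise relation is a Jaccard score over document-level co-occurrence, that the link-based relation is the cosine similarity between two terms' vectors of pairwise scores to their neighbors, and that the two are mixed linearly with a hypothetical weight `alpha`.

```python
import math
from itertools import combinations


def pairwise_relation(docs):
    """Local view (assumed form): Jaccard score of document co-occurrence."""
    df = {}   # term -> number of documents containing it
    co = {}   # (term_a, term_b) -> number of documents containing both
    for doc in docs:
        terms = sorted(set(doc))
        for t in terms:
            df[t] = df.get(t, 0) + 1
        for a, b in combinations(terms, 2):
            co[(a, b)] = co.get((a, b), 0) + 1
    vocab = sorted(df)
    rel = {t: {} for t in vocab}
    for (a, b), n in co.items():
        score = n / (df[a] + df[b] - n)
        rel[a][b] = score
        rel[b][a] = score
    return vocab, rel


def _cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def link_relation(vocab, rel):
    """Global view (assumed form): similarity of the terms' neighbor vectors."""
    return {a: {b: _cosine(rel[a], rel[b]) for b in vocab if b != a}
            for a in vocab}


def combined_relation(docs, alpha=0.5):
    """Linear mix of both relations; alpha is a hypothetical weight."""
    vocab, pw = pairwise_relation(docs)
    lk = link_relation(vocab, pw)
    return {a: {b: alpha * pw[a].get(b, 0.0) + (1 - alpha) * lk[a][b]
                for b in vocab if b != a}
            for a in vocab}


# Toy corpus: "car" and "automobile" never share a document,
# but they share the neighbors "engine" and "wheel".
docs = [
    ["car", "engine"],
    ["automobile", "engine"],
    ["car", "wheel"],
    ["automobile", "wheel"],
    ["banana", "fruit"],
]
scores = combined_relation(docs)
print(scores["car"]["automobile"], scores["car"]["banana"])
```

In this toy corpus the pairwise score for "car" and "automobile" is zero because they never co-occur, while the link-based score is high because their neighbors coincide. That is the kind of case where a global, context-based relation can complement a purely local one.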

Published In

Neurocomputing, Volume 154, Issue C
April 2015
347 pages

Publisher

Elsevier Science Publishers B. V.
Netherlands

Author Tags

1. Co-occurrence statistics
2. Link-based relation
3. Neighbor information
4. Semantic relation analysis
