Soto Montalvo

Universidad Rey Juan Carlos, Ciencias de la Computación, Faculty Member

Papers

URJCyUNED at ImageCLEF 2012 Photo Annotation Task

An International Platform for Teaching Support Based on Breaking News

Plataforma para el apoyo a la docencia basada en la Web 2.0 y la actualidad relevante

Multilingual document clustering

Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06, 2006

Una plataforma internacional de apoyo a la docencia basada en noticias

by Soto Montalvo and Jesus Palomo

Arbor: Ciencia, pensamiento y cultura, Jan 1, 2011

Multilingual Information Access on the Web

by Soto Montalvo and Rafael Capilla

Computer, 2015

ABSTRACT Named entities (NEs) can facilitate access to multilingual knowledge sources--which have...

Bridging the gap between teaching and breaking news: A new approach based on ESHE and ICT

by Soto Montalvo and Jesus Palomo

Procedia - Social and Behavioral Sciences, 2010

Exploiting named entities for bilingual news clustering

Journal of the Association for Information Science and Technology, 2014

ABSTRACT In this article, we present a new algorithm for clustering a bilingual collection of comparable news items into groups of specific topics. Our hypothesis is that named entities (NEs) are more informative than other features in the news when clustering fine-grained topics. The algorithm does not need any information about the number of clusters as input, and carries out the clustering based only on the named entities shared by the news items. This proposal is evaluated using different data sets and outperforms other state-of-the-art algorithms, thereby proving the plausibility of the approach. In addition, because the applicability of our approach depends on the possibility of identifying equivalent named entities among the news, we propose a heuristic system to identify equivalent named entities in the same and different languages, thereby obtaining good performance.
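The idea of clustering news items from their shared named entities alone, without fixing the number of clusters in advance, can be illustrated with a minimal sketch. This is not the algorithm from the article: it assumes the named entities have already been extracted and mapped to a common normalized form across languages, and it simply links two items when they share at least `min_shared` entities, then takes connected components as clusters. The function name, the `min_shared` parameter, and the sample data are all hypothetical.

```python
def cluster_by_shared_entities(docs, min_shared=2):
    """Group documents that share named entities.

    docs: dict mapping a document id to a set of normalized
          named-entity strings (extraction is assumed done elsewhere).
    min_shared: hypothetical threshold on the number of shared
          entities needed to link two documents.
    Returns a sorted list of clusters, each a sorted list of ids.
    """
    ids = list(docs)
    parent = {d: d for d in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair of documents that shares enough entities.
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if len(docs[a] & docs[b]) >= min_shared:
                parent[find(a)] = find(b)

    # The number of clusters is whatever the links produce.
    clusters = {}
    for d in ids:
        clusters.setdefault(find(d), []).append(d)
    return sorted(sorted(c) for c in clusters.values())


# Invented sample data: English and Spanish items with entities
# already mapped to a shared form.
news = {
    "en1": {"barack obama", "white house", "washington"},
    "es1": {"barack obama", "washington", "congreso"},
    "en2": {"real madrid", "champions league", "madrid"},
    "es2": {"real madrid", "champions league"},
    "en3": {"nasa", "mars"},
}

if __name__ == "__main__":
    print(cluster_by_shared_entities(news))
```

In this sketch the number of clusters is never given as input; it follows from which items end up linked. Identifying equivalent entities across languages, which the abstract treats as a separate heuristic step, is assumed to have happened before this function is called.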

Bilingual News Clustering Using Named Entities and Fuzzy Similarity

Lecture Notes in Computer Science, 2007

Multilingual News Document Clustering: Two Algorithms Based on Cognate Named Entities

Lecture Notes in Computer Science, 2006

Improving Web Page Clustering Through Selecting Appropriate Term Weighting Functions

2006 1st International Conference on Digital Information Management, 2007

Multilingual Document Clustering: An Heuristic Approach Based on Cognate Named Entities

Meeting of the Association for Computational Linguistics, 2006

This paper presents an approach for Multilingual Document Clustering in comparable corpora. The algorithm is heuristic in nature and uses, as its only evidence for clustering, the identification of cognate named entities between both sides of the comparable corpora. One of the main advantages of this approach is that it does not depend on bilingual or ...

Multilingual news clustering: Feature translation vs. identification of cognate named entities

Pattern Recognition Letters, 2007

Automatic cognate identification based on a fuzzy combination of string similarity measures

by Soto Montalvo and E. Pardo

ABSTRACT Cognates are words in different languages that have similar spelling and meaning. The identification of cognates is very useful for many different Natural Language Processing tasks, and also in the process of learning a second language. This paper presents a new approach to classify pairs of words into two classes: cognates/false friends, or not related. The proposed approach uses a fuzzy system to combine complementary string similarity measures in order to improve the cognate identification task. The underlying hypothesis is that combining different string measures by applying heuristic knowledge can outperform those measures working separately. The results obtained by the proposed system confirm this hypothesis, and the system also outperforms other systems that combine string measures using a supervised approach. As an additional contribution, we have created a bilingual test data set which includes pairs of cognates, false friends and unrelated words in Spanish and English, and which is freely available for research purposes.
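As a rough illustration of combining complementary string similarity measures with hand-written rules, the sketch below scores a word pair with three measures and then applies simple threshold rules. It is not the fuzzy system described in the abstract: the choice of measures (bigram Dice, normalized edit similarity, and `difflib.SequenceMatcher` ratio), the thresholds, and all function names are assumptions made for this example. Like the two-class setup in the abstract, it only separates orthographically similar pairs (cognates or false friends) from unrelated ones.

```python
from difflib import SequenceMatcher


def bigrams(word):
    """Return the adjacent character pairs of a word."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def dice(a, b):
    """Dice coefficient over character bigrams (multiset overlap)."""
    ba, bb = bigrams(a), bigrams(b)
    if not ba or not bb:
        return 1.0 if a == b else 0.0
    remaining = list(bb)
    shared = 0
    for g in ba:
        if g in remaining:
            remaining.remove(g)
            shared += 1
    return 2 * shared / (len(ba) + len(bb))


def edit_similarity(a, b):
    """One minus the Levenshtein distance divided by the longer length."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return 1 - prev[-1] / max(len(a), len(b))


def similarity_scores(a, b):
    """Score a word pair with three complementary measures."""
    a, b = a.lower(), b.lower()
    return [dice(a, b),
            edit_similarity(a, b),
            SequenceMatcher(None, a, b).ratio()]


def label_pair(a, b):
    """Apply hypothetical hand-written rules to the three scores."""
    scores = similarity_scores(a, b)
    if min(scores) >= 0.6:   # every measure agrees the words are close
        return "similar"
    if max(scores) <= 0.4:   # every measure agrees the words are far apart
        return "unrelated"
    # Measures disagree: fall back to their average.
    return "similar" if sum(scores) / len(scores) >= 0.5 else "unrelated"


if __name__ == "__main__":
    for pair in [("family", "familia"), ("house", "perro")]:
        print(pair, similarity_scores(*pair), label_pair(*pair))
```

Spelling similarity alone cannot tell a true cognate from a false friend, since that distinction depends on meaning; the sketch therefore stops at the "similar" versus "unrelated" decision.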