Jorge Morato
  • University Carlos III
    Dep. Computer Science
    Avda. Universidad, 30
    28911 Leganés
    Madrid, Spain
The glossariumBITri, planned as a central activity for the interdisciplinary study of information and developed by the BITrum group in cooperation with the University of Santa Elena (Ecuador), aims to serve as a tool for clarifying concepts, theories and problems concerning information. Intending to embrace the most relevant points of view on information, it is developed in an interdisciplinary manner by a board of experts from a wide variety of scientific fields. The glossariumBITri invites the scientific community to make contributions of any kind that help clarify the field of information studies.
This work presents an integrated view of the tools used to study connections between documents, publication patterns, content representation, and retrieval optimization. It brings together concepts from Cognitive Psychology, Linguistics, Scientometrics, Documentation, Statistics, Classification, and Computer Science, in the aspects most closely related to the processing, organization, and characterization of textual information.

The ultimate goal is to analyze the influence of genre analysis on the characterization of qualitative and quantitative parameters and, specifically, on the tools traditionally used for these studies, such as scientometric indicators and term classification.
Automatic Text Summarization (ATS) is in great demand as a way to cope with the ever-growing volume of text available online and to discover relevant information faster. In this research, an ATS methodology is proposed for the Hindi language using a Real Coded Genetic Algorithm (RCGA) over a health corpus available as a Kaggle dataset. The methodology comprises five phases: preprocessing, feature extraction, processing, sentence ranking, and summary generation. Rigorous experimentation is performed on varied feature sets, in which distinguishing features, namely sentence similarity and named entity features, are combined with others to compute the evaluation metrics. The top 14 feature combinations are evaluated with the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) measure. RCGA computes appropriate feature weights through strings of features, chromosome selection, and reproduction operators: Simulated Binary Crossover and Polynomial Mutation. To extr...
New technologies require an upgrade of the skills of Library and Information Science (LIS) professionals. This study analyzes changes in the field and suggests guidelines to improve the LIS curriculum. We carried out a content analysis of 20 curricula from LIS educational programs to identify terms associated with technological skills. In addition, we identified terms related to new competences in 735 job openings published on generic web sites and 170 on specialized web sites. These terms include marketing, management, and content management, mainly related to web applications. The results confirm a positive trend in the demand for technological skills related to LIS. Nevertheless, there is a need to increase awareness of the competences and abilities of LIS professionals, because many relevant job openings are listed in other job categories. We have therefore collected a list of key technological skills. In conclusion, it is clear that computer science and the Internet are bringing new opportunities to LIS professio...
"Automating the acquisition of knowledge from electronic text documents entails multiple difficulties. One of them is text processing itself, owing to the diversity of formats or their absence. A suitable methodology must therefore specify the guidelines under which these documents are processed. In addition, powerful linguistic processing is needed to overcome these difficulties. We present a methodology that is proving efficient across different domains and applications. Full automation has not proved feasible so far. Multiple applications to technology watch and software reuse are under study. According to the experiments carried out, a suitable methodology should be based on a faceted classification that facilitates the reuse and interoperability of these knowledge organization systems. All the experiments conducted indicate that the classifications proposed after applying software tools to the analysis of a domain must be assessed and validated by an expert. For now, automation entirely free of human intervention does not seem realistic. Future work should address the problems associated with validation by domain experts, since it causes delays and inconsistencies in obtaining the domain. In conclusion, the tasks assigned to experts should be brief and simple for the methodology to be effective, or else be simplified through data mining."
Title: Estructuración y clasificación automática de información: aplicación a una colección de textos médicos (Structuring and automatic classification of information: application to a collection of medical texts). Author: Morato, J. Journal: Revista Interamericana de Bibliotecología, Jan-Jun 2001; 24 (1). Pages: 117-136. ISSN: 01200976. Abstract: Describes a ...
Abstract. The suitability of the algorithms for recognition and classification of entities (NERC) is evaluated through competitions such as MUC, CONLL or ACE. In general, these competitions are limited to the recognition of predefined entity types in certain languages. ...
Abstract. We describe a new approach for computing the similarity between symbolic musical pieces, based on the differences in shape between the interpolating curves defining the pieces. We outline several requirements for a symbolic musical similarity system, and ...
ABSTRACT In this article we perform a qualitative analysis of well-known generic ontologies according to their retrieval potential, in order to implement a conceptual retrieval system. This retrieval system aims to search metadata schemes. The main problem in implementing the retrieval system has been finding a reference ontology that covers the domain and matches the system's requirements. We evaluated the characteristics of the ontologies for their suitability to represent the semantics of a specific metadata scheme. Finally, PROTON was selected as the reference ontology due to its extensibility, adaptability to the domain, adequacy to the retrieval system, and availability. The principal contribution of this study is a set of guidelines for selecting ontologies to be mapped, based on qualitative analysis and experience.
The terminology used in Biomedicine shows lexical peculiarities that have required the elaboration of terminological resources and information retrieval systems with specific functionalities. The main characteristics are the high rates of synonymy and homonymy, due to phenomena such as the proliferation of polysemic acronyms and their interaction with common language. Information retrieval systems in the biomedical domain use techniques oriented to the treatment of these lexical peculiarities. In this paper we review some of the techniques used in this domain, such as the application of Natural Language Processing (BioNLP), the incorporation of lexical-semantic resources, and the application of Named Entity Recognition (BioNER). Finally, we present the evaluation methods adopted to assess the suitability of these techniques for retrieving biomedical resources.
This paper presents the development of a low-cost mobile application for a library network. There is a need for applications that serve library users in line with current usage habits, building user loyalty and simplifying recurrent access to multiple websites. The Red Valenciana de Lectura Pública is used as a case study to illustrate the proposal. The starting point is an analysis of the characteristics of the organization and the requirements of the mobile application. For development, different platforms for building mobile applications are compared. The final product is then evaluated for efficiency and ease of use. The results indicate that using an application that integrates the information in a single point improves performance in terms of search time and error rate. The main contribution of this work highlights apps as ...
Introduction. The objective of this study is to understand the information needs that businesses have when seeking Library and Information Science professionals and to analyse how they formulate those needs. Method. The analysis is performed by examining the professional skills and capabilities demanded in published job offers. A total of 1,020 job offers collected from the website of a Spanish employment agency have been analysed for the period between 2006 and 2008. Analysis. Knowledge representation techniques using thesauri have been used for automatic content analysis based on natural language processing. Data extracted from the corpora have been analysed statistically. Results. Results of this study indicate a demand for skills related to technological advances and the management of electronic resources, as well as to technical aspects associated with the Informatics domain. The knowledge of languages and the possession of an academic title represent essential factors in the job offers...
Open source software is becoming more popular worldwide due to the quality of its products. Open source repositories are tools for accessing this kind of software, but when searching for a particular component it is not easy to find what is required quickly. This paper studies the feasibility of using alternative sorting algorithms to improve the results provided by open source software repositories; for this purpose, the use of sorting algorithms based on graphs of relationships between open source software projects is analyzed. The results of four different sorting algorithms have been compared with the opinion of a group of experts in the domain area where the experiment was conducted. The results show that there are slight discrepancies between the ranking provided by the open source repository, the sorting algorithms, and expert opinion. These results underscore the possibility of including new sorting techniques in open source repositories in order to obtain better results i...
In this article we compare two approaches to modeling the hierarchical structure of the real world: on the one hand, generic and whole-part relationships in a descriptor thesaurus; on the other hand, generalization and aggregation relationships in UML. Trying to shorten the distance between them leads to a new metamodel of relationships that can better reflect the mental habits of modelers when dealing with hierarchical trees.
Organizations today undergo continuous transformation, and technology is one of the main drivers of change; new systems must be developed to help legacy systems evolve. This change entails studying and reusing existing databases. Because software developers are responsible for the new design, they need to study old database designs.
Purpose – This paper seeks to analyze and evaluate different types of semantic web retrieval systems with respect to their ability to manage and retrieve semantic documents. Design/methodology/approach – The authors provide a brief overview of knowledge modeling and semantic retrieval systems in order to identify their major problems. They define a set of characteristics for evaluating the management of semantic documents. To this end, the authors select 12 retrieval systems and classify them according to these features. The evaluation methodology followed in this work is the one used in the DESMET project for the evaluation of qualitative characteristics. Findings – A review of the literature has shown deficiencies in the ability of the current semantic web to cope with known problems. Additionally, semantic retrieval systems show discrepancies in the way they are implemented. The authors analyze the presence of a set of functionalities in different type...
Researchers in indexing and retrieval systems have been advocating the inclusion of more contextual information to improve results. The proliferation of full-text databases and advances in computer storage capacity have made it possible to carry out text analysis by means of linguistic ...
This article presents a method for reaching consensus on the semantics of the concepts in metadata vocabularies. A conceptual analysis of ontology mappings and alignments made it possible to design a system for reaching consensus on the semantics of vocabularies of ...
This short paper describes our five submissions to the 2012 edition of the MIREX Symbolic Melodic Similarity task. All five submissions rely on a geometric model that represents melodies as spline curves in the pitch-time plane. The similarity between two melodies is then computed with a sequence alignment algorithm between sequences of spline spans: the more similar the shape of the curves, the more similar the melodies they represent. As in MIREX 2010 and 2011, our systems ranked first for all effectiveness measures used. However, this year there was only one competing system, so we employ this report mainly to describe and compare results within our systems.
Abstract: The third Information Retrieval Education through EXperimentation track (EIREX 2012) was run at the University Carlos III of Madrid, during the 2012 spring semester. EIREX 2012 is the third in a series of experiments designed to foster new Information Retrieval (IR) education methodologies and resources, with the specific goal of teaching undergraduate IR courses from an experimental perspective. For an introduction to the motivation behind the EIREX experiments, see the first sections of [Urbano et al., 2011a]. For information on ...
This short paper describes our three submissions to the MIREX 2011 Symbolic Melodic Similarity task. All three submissions rely on a geometric model that represents melodies as spline curves in the pitch-time plane. The similarity between two melodies is then computed with a sequence alignment algorithm between sequences of spline spans: the more similar the shape of the curves, the more similar the melodies they represent. As in MIREX 2010, our systems ranked first for all effectiveness measures used.
The goal of GTI is the semi-automatic generation of thesauri through the analysis of a corpus. After testing different methods of classifying information, from term co-occurrence to neural networks, it proved necessary to create new indicators ...
Abstract: Ontologies are a key component in the development of the Semantic Web. The emergence of ontologies on the Internet is a recent phenomenon, but one of far-reaching importance for the transmission and storage of data in technological and business settings. ...
The development of the Semantic Web depends on agreed and unambiguous knowledge representations, on the availability and accessibility of knowledge, as well as on retrieval capabilities. The scarce agreement on knowledge representation and the lack of ...
