Showing 1–6 of 6 results for author: Thiago, R

  1. arXiv:2403.10304  [pdf, ps, other]

    cs.AI cs.DB

    KIF: A Wikidata-Based Framework for Integrating Heterogeneous Knowledge Sources

    Authors: Guilherme Lima, João M. B. Rodrigues, Marcelo Machado, Elton Soares, Sandro R. Fiorini, Raphael Thiago, Leonardo G. Azevedo, Viviane T. da Silva, Renato Cerqueira

    Abstract: We present a Wikidata-based framework, called KIF, for virtually integrating heterogeneous knowledge sources. KIF is written in Python and is released as open-source. It leverages Wikidata's data model and vocabulary plus user-defined mappings to construct a unified view of the underlying sources while keeping track of the context and provenance of their statements. The underlying sources can be t…

    Submitted 24 July, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  2. arXiv:2308.03584  [pdf, other]

    cs.DB

    A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores

    Authors: Leonardo Guerreiro Azevedo, Renan Francisco Santos Souza, Elton F. de S. Soares, Raphael M. Thiago, Julio Cesar Cardoso Tesolin, Ann C. Oliveira, Marcio Ferreira Moreno

    Abstract: Modern applications commonly need to manage dataset types composed of heterogeneous data and schemas, making it difficult to access them in an integrated way. A single data store to manage heterogeneous data using a common data model is not effective in such a scenario, which results in the domain data being fragmented in the data stores that best fit their storage and access requirements (e.g., N…

    Submitted 15 March, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: Reference the paper as L. G. Azevedo, R. Souza, E. F. de S. Soares, R. M. Thiago, J. C. D. Tesolin, A. C. Oliveira, M. F. Moreno, A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores. Proceedings of 20th Brazilian Symposium in Information Systems, 2024 (to be published)

  3. DevOps and Microservices in Scientific System development

    Authors: Maximillien de Bayser, Vinicius Segura, Leonardo Guerreiro Azevedo, Leonardo P. Tizzei, Raphael Melo Thiago, Elton Soares, Renato Cerqueira

    Abstract: There is a gap in scientific information systems development concerning modern software engineering and scientific computing. Historically, software engineering methodologies have been perceived as an unwanted accidental complexity to computational scientists in their scientific systems development. More recent trends, like the end of Moore's law and the subsequent diversification of hardware plat…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: 14 pages, 4 figures, paper accepted as poster in ACM SAC 2022, ACM ISBN 978-1-4503-8713-2/22/04

  4. arXiv:2010.00330  [pdf, other]

    cs.DB cs.AI cs.DC cs.LG

    Workflow Provenance in the Lifecycle of Scientific Machine Learning

    Authors: Renan Souza, Leonardo G. Azevedo, Vítor Lourenço, Elton Soares, Raphael Thiago, Rafael Brandão, Daniel Civitarese, Emilio Vital Brazil, Marcio Moreno, Patrick Valduriez, Marta Mattoso, Renato Cerqueira, Marco A. S. Netto

    Abstract: Machine Learning (ML) has already fundamentally changed several businesses. More recently, it has also been profoundly impacting the computational science and engineering domains, like geoscience, climate science, and health science. In these domains, users need to perform comprehensive data analyses combining scientific data and ML models to provide for critical requirements, such as reproducibil…

    Submitted 25 August, 2021; v1 submitted 30 September, 2020; originally announced October 2020.

    Comments: 21 pages, 10 figures, text overlap with arXiv:1910.04223, a workshop paper being extended in this journal paper

    MSC Class: 65Y05; 68P15
    ACM Class: I.2; H.2; C.4; J.2

    Journal ref: Concurrency Computation Practice Experience. 2021;e6544

  5. arXiv:2003.04915  [pdf, other]

    cs.DB cs.CY cs.DC cs.LG

    Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case

    Authors: Raphael Thiago, Renan Souza, L. Azevedo, E. Soares, Rodrigo Santos, Wallas Santos, Max De Bayser, M. Cardoso, M. Moreno, Renato Cerqueira

    Abstract: Machine Learning (ML) has increased its role, becoming essential in several industries. However, questions around training data lineage, such as "where has the dataset used to train this model come from?"; the introduction of several new data protection legislation; and, the need for data governance requirements, have hindered the adoption of ML models in the real world. In this paper, we discuss…

    Submitted 10 March, 2020; originally announced March 2020.

    Comments: Author preprint of paper accepted at the 2020 European Association of Geoscientists and Engineers (EAGE) Digitalization Conference and Exhibition

    MSC Class: 65Y05; 68P15
    ACM Class: I.2; H.2; C.4; J.2

    Journal ref: 2020 European Association of Geoscientists and Engineers (EAGE) Digitalization Conference and Exhibition

  6. arXiv:1910.04223  [pdf, other]

    cs.DC cs.DB cs.LG

    Provenance Data in the Machine Learning Lifecycle in Computational Science and Engineering

    Authors: Renan Souza, Leonardo Azevedo, Vítor Lourenço, Elton Soares, Raphael Thiago, Rafael Brandão, Daniel Civitarese, Emilio Vital Brazil, Marcio Moreno, Patrick Valduriez, Marta Mattoso, Renato Cerqueira, Marco A. S. Netto

    Abstract: Machine Learning (ML) has become essential in several industries. In Computational Science and Engineering (CSE), the complexity of the ML lifecycle comes from the large variety of data, scientists' expertise, tools, and workflows. If data are not tracked properly during the lifecycle, it becomes unfeasible to recreate a ML model from scratch or to explain to stakeholders how it was created. The m…

    Submitted 21 October, 2019; v1 submitted 9 October, 2019; originally announced October 2019.

    Comments: 10 pages, 7 figures, Accepted at Workflows in Support of Large-scale Science (WORKS) co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC) 2019, Denver, Colorado

    MSC Class: 65Y05; 68P15
    ACM Class: I.2; H.2; C.4; J.2