Showing 1–30 of 30 results for author: Gama, J

Searching in archive cs.
  1. arXiv:2405.12785  [pdf, other]

    cs.AI

    Artificial Intelligence Approaches for Predictive Maintenance in the Steel Industry: A Survey

    Authors: Jakub Jakubowski, Natalia Wojak-Strzelecka, Rita P. Ribeiro, Sepideh Pashami, Szymon Bobek, Joao Gama, Grzegorz J Nalepa

    Abstract: Predictive Maintenance (PdM) emerged as one of the pillars of Industry 4.0, and became crucial for enhancing operational efficiency, making it possible to minimize downtime, extend the lifespan of equipment, and prevent failures. A wide range of PdM tasks can be performed using Artificial Intelligence (AI) methods, which often use data generated from industrial sensors. The steel industry, which is an important…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Preprint submitted to Engineering Applications of Artificial Intelligence

  2. arXiv:2405.05809  [pdf]

    cs.LG cs.AI cs.CY

    Aequitas Flow: Streamlining Fair ML Experimentation

    Authors: Sérgio Jesus, Pedro Saleiro, Inês Oliveira e Silva, Beatriz M. Jorge, Rita P. Ribeiro, João Gama, Pedro Bizarro, Rayid Ghani

    Abstract: Aequitas Flow is an open-source framework for end-to-end Fair Machine Learning (ML) experimentation in Python. This package fills the existing integration gaps in other Fair ML packages of complete and accessible experimentation. It provides a pipeline for fairness-aware model training, hyperparameter optimization, and evaluation, enabling rapid and simple experiments and result analysis. Aimed at…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2404.14455  [pdf, other]

    cs.LG cs.AI

    A Neuro-Symbolic Explainer for Rare Events: A Case Study on Predictive Maintenance

    Authors: João Gama, Rita P. Ribeiro, Saulo Mastelini, Narjes Davarid, Bruno Veloso

    Abstract: Predictive Maintenance applications are increasingly complex, with interactions between many components. Black box models are popular approaches based on deep learning techniques due to their predictive accuracy. This paper proposes a neural-symbolic architecture that uses an online rule-learning algorithm to explain when the black box model predicts failures. The proposed system solves two proble…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 26 pages

  4. arXiv:2404.01790  [pdf, other]

    cs.CV cs.LG

    Super-Resolution Analysis for Landfill Waste Classification

    Authors: Matias Molina, Rita P. Ribeiro, Bruno Veloso, João Gama

    Abstract: Illegal landfills are a critical issue due to their environmental, economic, and public health impacts. This study leverages aerial imagery for environmental crime monitoring. While advances in artificial intelligence and computer vision hold promise, the challenge lies in training models with high-resolution literature datasets and adapting them to open-access low-resolution images. Considering t…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: This article has been accepted by the Symposium on Intelligent Data Analysis (IDA 2024)

  5. S+t-SNE -- Bringing dimensionality reduction to data streams

    Authors: Pedro C. Vieira, João P. Montrezol, João T. Vieira, João Gama

    Abstract: We present S+t-SNE, an adaptation of the t-SNE algorithm designed to handle infinite data streams. The core idea behind S+t-SNE is to update the t-SNE embedding incrementally as new data arrives, ensuring scalability and adaptability to handle streaming scenarios. By selecting the most important points at each step, the algorithm ensures scalability while keeping informative visualisations. Employ…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. We will soon add a link to the final version of this contribution that underwent peer-review and post-acceptance improvements and was presented at IDA2024 (https://ida2024.org/)

    Journal ref: Advances in Intelligent Data Analysis XXII. IDA 2024. Lecture Notes in Computer Science, vol 14642, pp 95-106 (2024). Springer, Cham

  6. arXiv:2402.07586  [pdf, other]

    cs.LG

    Unveiling Group-Specific Distributed Concept Drift: A Fairness Imperative in Federated Learning

    Authors: Teresa Salazar, João Gama, Helder Araújo, Pedro Henriques Abreu

    Abstract: In the evolving field of machine learning, ensuring fairness has become a critical concern, prompting the development of algorithms designed to mitigate discriminatory outcomes in decision-making processes. However, achieving fairness in the presence of group-specific concept drift remains an unexplored frontier, and our research represents pioneering efforts in this regard. Group-specific concept…

    Submitted 13 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    MSC Class: 68T01 ACM Class: I.2.m

  7. arXiv:2306.05120  [pdf, other]

    cs.AI

    Explainable Predictive Maintenance

    Authors: Sepideh Pashami, Slawomir Nowaczyk, Yuantao Fan, Jakub Jakubowski, Nuno Paiva, Narjes Davari, Szymon Bobek, Samaneh Jamshidi, Hamid Sarmadi, Abdallah Alabdallah, Rita P. Ribeiro, Bruno Veloso, Moamar Sayed-Mouchaweh, Lala Rajaoarisoa, Grzegorz J. Nalepa, João Gama

    Abstract: Explainable Artificial Intelligence (XAI) fills the role of a critical interface fostering interactions between sophisticated intelligent systems and diverse individuals, including data scientists, domain experts, end-users, and more. It aids in deciphering the intricate internal mechanisms of "black box" Machine Learning (ML), rendering the reasons behind their decisions more understandable. Ho…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 51 pages, 9 figures

    ACM Class: I.2.1

  8. arXiv:2304.13267  [pdf, other]

    cs.LG cs.AI cs.DC

    Bayesian Federated Learning: A Survey

    Authors: Longbing Cao, Hui Chen, Xuhui Fan, Joao Gama, Yew-Soon Ong, Vipin Kumar

    Abstract: Federated learning (FL) demonstrates its advantages in integrating distributed infrastructure, communication, computing and learning in a privacy-preserving manner. However, the robustness and capabilities of existing FL methods are challenged by limited and dynamic data and conditions, complexities including heterogeneities and uncertainties, and analytical explainability. Bayesian federated lear…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted by IJCAI 2023 Survey Track, copyright is owned by IJCAI

  9. arXiv:2303.06067  [pdf, other]

    cs.LG stat.ML

    Modeling Events and Interactions through Temporal Processes -- A Survey

    Authors: Angelica Liguori, Luciano Caroprese, Marco Minici, Bruno Veloso, Francesco Spinnato, Mirco Nanni, Giuseppe Manco, Joao Gama

    Abstract: In real-world scenarios, many phenomena produce a collection of events that occur in continuous time. Point Processes provide a natural mathematical framework for modeling these sequences of events. In this survey, we investigate probabilistic models for modeling event sequences through temporal processes. We revise the notion of event modeling and provide the mathematical foundations that characte…

    Submitted 21 July, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: Image replacements

  10. arXiv:2211.13358  [pdf, other]

    cs.LG

    Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation

    Authors: Sérgio Jesus, José Pombal, Duarte Alves, André Cruz, Pedro Saleiro, Rita P. Ribeiro, João Gama, Pedro Bizarro

    Abstract: Evaluating new techniques on realistic datasets plays a crucial role in the development of ML research and its broader adoption by practitioners. In recent years, there has been a significant increase of publicly available unstructured data resources for computer vision and NLP tasks. However, tabular data -- which is prevalent in many high-stakes domains -- has been lagging behind. To bridge this…

    Submitted 28 November, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: Accepted at NeurIPS 2022. https://openreview.net/forum?id=UrAYT2QwOX8

  11. arXiv:2207.05466  [pdf, other]

    cs.LG cs.AI

    A Benchmark dataset for predictive maintenance

    Authors: Bruno Veloso, João Gama, Rita P. Ribeiro, Pedro M. Pereira

    Abstract: The paper describes the MetroPT data set, an outcome of an eXplainable Predictive Maintenance (XPM) project with an urban metro public transportation service in Porto, Portugal. The data was collected in 2022 with the aim of evaluating machine learning methods for online anomaly detection and failure prediction. By capturing several analog sensor signals (pressure, temperature, current consumption),…

    Submitted 18 July, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

  12. arXiv:2206.02632  [pdf, other]

    cs.IR cs.AI cs.CL stat.AP

    Contextualization for the Organization of Text Documents Streams

    Authors: Rui Portocarrero Sarmento, Douglas O. Cardoso, João Gama, Pavel Brazdil

    Abstract: There has been a significant effort by the research community to address the problem of providing methods to organize documentation with the help of information retrieval methods. In this report paper, we present several experiments with some stream analysis methods to explore streams of text documents. We use only dynamic algorithms to explore, analyze, and organize the flux of text documents. Th…

    Submitted 30 May, 2022; originally announced June 2022.

  13. arXiv:2205.07829  [pdf, ps, other]

    cs.LG cs.DC

    Federated Anomaly Detection over Distributed Data Streams

    Authors: Paula Raissa Silva, João Vinagre, João Gama

    Abstract: Sharing of telecommunication network data, for example, even at high aggregation levels, is nowadays highly restricted due to privacy legislation and regulations and other important ethical concerns. It leads to scattering data across institutions, regions, and states, inhibiting the usage of AI methods that could otherwise take advantage of data at scale. It creates the need to build a platform t…

    Submitted 17 May, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

    Comments: DSAA'2021 Conference - PhD Track

  14. Open challenges for Machine Learning based Early Decision-Making research

    Authors: Alexis Bondu, Youssef Achenchabe, Albert Bifet, Fabrice Clérot, Antoine Cornuéjols, Joao Gama, Georges Hébrail, Vincent Lemaire, Pierre-François Marteau

    Abstract: More and more applications require early decisions, i.e. taken as soon as possible from partially observed data. However, the later a decision is made, the more its accuracy tends to improve, since the description of the problem to hand is enriched over time. Such a compromise between the earliness and the accuracy of decisions has been particularly studied in the field of Early Time Series Classi…

    Submitted 20 May, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

  15. arXiv:2110.11751  [pdf, other]

    q-fin.CP cs.LG cs.SI q-fin.PM

    Forecasting Financial Market Structure from Network Features using Machine Learning

    Authors: Douglas Castilho, Tharsis T. P. Souza, Soong Moon Kang, João Gama, André C. P. L. F. de Carvalho

    Abstract: We propose a model that forecasts market correlation structure from link- and node-based financial network features using machine learning. For such, market structure is modeled as a dynamic asset network by quantifying time-dependent co-movement of asset price returns across company constituents of major global market indices. We provide empirical evidence using three different network filtering…

    Submitted 22 October, 2021; originally announced October 2021.

    Comments: 22 pages, 13 figures

  16. How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations

    Authors: Sérgio Jesus, Catarina Belém, Vladimir Balayan, João Bento, Pedro Saleiro, Pedro Bizarro, João Gama

    Abstract: There have been several research works proposing new Explainable AI (XAI) methods designed to generate model explanations having specific properties, or desiderata, such as fidelity, robustness, or human-interpretability. However, explanations are seldom evaluated based on their true practical impact on decision-making tasks. Without that assessment, explanations might be chosen that, in fact, hur…

    Submitted 22 January, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Comments: Accepted at FAccT'21, the ACM Conference on Fairness, Accountability, and Transparency

  17. arXiv:2004.09397  [pdf, other]

    cs.LG stat.ML

    Multi-label Stream Classification with Self-Organizing Maps

    Authors: Ricardo Cerri, Joel David Costa Júnior, Elaine Ribeiro de Faria Paiva, João Manuel Portela da Gama

    Abstract: Several learning algorithms have been proposed for offline multi-label classification. However, applications in areas such as traffic monitoring, social networks, and sensors produce data continuously, the so-called data streams, posing challenges to batch multi-label learning. With the lack of stationarity in the distribution of data streams, new algorithms are needed to online adapt to such chan…

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: 7 pages, 14 figures

    ACM Class: I.2.6

  18. Learning under Concept Drift: A Review

    Authors: Jie Lu, Anjin Liu, Fan Dong, Feng Gu, Joao Gama, Guangquan Zhang

    Abstract: Concept drift describes unforeseeable changes in the underlying distribution of streaming data over time. Concept drift research involves the development of methodologies and techniques for drift detection, understanding and adaptation. Data analysis has revealed that machine learning in a concept drift environment will result in poor learning results if the drift is not addressed. To help researc…

    Submitted 13 April, 2020; originally announced April 2020.

    Journal ref: IEEE Transactions on Knowledge and Data Engineering 31, no. 12 (2018): 2346-2363

  19. arXiv:1909.11406  [pdf, other]

    cs.LG cs.SI stat.ML

    Mining Human Mobility Data to Discover Locations and Habits

    Authors: Thiago Andrade, Brais Cancela, João Gama

    Abstract: Many aspects of life are associated with places of human mobility patterns and nowadays we are facing an increase in the pervasiveness of mobile devices these individuals carry. Positioning technologies that serve these devices such as the cellular antenna (GSM networks), global navigation satellite systems (GPS), and more recently the WiFi positioning system (WPS) provide large amounts of spatio-…

    Submitted 25 September, 2019; originally announced September 2019.

  20. arXiv:1907.04233  [pdf, other]

    cs.LG stat.ML

    Contextual One-Class Classification in Data Streams

    Authors: Richard Hugh Moulton, Herna L. Viktor, Nathalie Japkowicz, João Gama

    Abstract: In machine learning, the one-class classification problem occurs when training instances are only available from one class. It has been observed that making use of this class's structure, or its different contexts, may improve one-class classifier performance. Although this observation has been demonstrated for static data, a rigorous application of the idea within the data stream environment is l…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: 49 pages, 18 figures, 2 appendices

  21. A scalable saliency-based Feature selection method with instance level information

    Authors: Brais Cancela, Verónica Bolón-Canedo, Amparo Alonso-Betanzos, João Gama

    Abstract: Classic feature selection techniques remove those features that are either irrelevant or redundant, achieving a subset of relevant features that help to provide a better knowledge extraction. This allows the creation of compact models that are easier to interpret. Most of these techniques work over the whole dataset, but they are unable to provide the user with successful information when only ins…

    Submitted 30 April, 2019; originally announced April 2019.

  22. arXiv:1904.09357  [pdf, other]

    cs.LG stat.ML

    Identifying Points of Interest and Similar Individuals from Raw GPS Data

    Authors: Thiago Andrade, João Gama

    Abstract: Smartphones and portable devices have become ubiquitous and part of everyone's life. Because of their portability, these devices are perfect to record individuals' traces and life-logging, generating vast amounts of data at low costs. These data are emerging as a new source for studies in human mobility patterns raising the number of research projects and techniques aiming to analyze and retri…

    Submitted 19 April, 2019; originally announced April 2019.

    Comments: Conference paper at Mobility IoT 2018 - http://mobilityiot2018.eai-conferences.org/full-program/

  23. arXiv:1808.02960  [pdf, other]

    cs.SI physics.soc-ph

    Dynamic Laplace: Efficient Centrality Measure for Weighted or Unweighted Evolving Networks

    Authors: Mário Cordeiro, Rui Portocarrero Sarmento, Pavel Brazdil, João Gama

    Abstract: With its origin in sociology, Social Network Analysis (SNA) quickly emerged and spread to other areas of research, including anthropology, biology, information science, organizational studies, political science, and computer science. Its objective being the investigation of social structures through the use of networks and graph theory, Social Network Analysis is, nowadays, an important research…

    Submitted 8 August, 2018; originally announced August 2018.

  24. arXiv:1612.03772  [pdf, ps, other]

    cs.MS math.NA stat.ML

    SimTensor: A synthetic tensor data generator

    Authors: Hadi Fanaee-T, Joao Gama

    Abstract: SimTensor is a multi-platform, open-source software for generating artificial tensor data (either with CP/PARAFAC or Tucker structure) for reproducible research on tensor factorization algorithms. SimTensor is a stand-alone application based on MATLAB. It provides a wide range of facilities for generating tensor data with various configurations. It comes with a user-friendly graphical user interfa…

    Submitted 9 December, 2016; originally announced December 2016.

  25. Improving incremental recommenders with online bagging

    Authors: João Vinagre, Alípio Mário Jorge, João Gama

    Abstract: Online recommender systems often deal with continuous, potentially fast and unbounded flows of data. Ensemble methods for recommender systems have been used in the past in batch algorithms, however they have never been studied with incremental algorithms that learn from data streams. We evaluate online bagging with an incremental matrix factorization algorithm for top-N recommendation with positiv…

    Submitted 26 March, 2018; v1 submitted 2 November, 2016; originally announced November 2016.

    Comments: Submitted to EPIA 2017

    Journal ref: In: Oliveira E., Gama J., Vale Z., Lopes Cardoso H. (eds) Progress in Artificial Intelligence. EPIA 2017. Lecture Notes in Computer Science, vol 10423. Springer, Cham

  26. Evaluation of recommender systems in streaming environments

    Authors: João Vinagre, Alípio Mário Jorge, João Gama

    Abstract: Evaluation of recommender systems is typically done with finite datasets. This means that conventional evaluation methodologies are only applicable in offline experiments, where data and models are stationary. However, in real world systems, user feedback is continuously generated, at unpredictable rates. Given this setting, one important issue is how to evaluate algorithms in such a streaming dat…

    Submitted 30 April, 2015; originally announced April 2015.

    Comments: Workshop on 'Recommender Systems Evaluation: Dimensions and Design' (REDD 2014), held in conjunction with RecSys 2014. October 10, 2014, Silicon Valley, United States

  27. arXiv:1406.3506  [pdf, other]

    cs.AI stat.AP

    Eigenspace Method for Spatiotemporal Hotspot Detection

    Authors: Hadi Fanaee-T, João Gama

    Abstract: Hotspot detection aims at identifying subgroups in the observations that are unexpected, with respect to some baseline information. For instance, in disease surveillance, the purpose is to detect sub-regions in spatiotemporal space, where the count of reported diseases (e.g. Cancer) is higher than expected, with respect to the population. The state-of-the-art method for this kind of problem is…

    Submitted 13 June, 2014; originally announced June 2014.

    Comments: To appear in Expert Systems Journal

  28. arXiv:1406.3496  [pdf, other]

    cs.AI cs.LG stat.AP

    EigenEvent: An Algorithm for Event Detection from Complex Data Streams in Syndromic Surveillance

    Authors: Hadi Fanaee-T, João Gama

    Abstract: Syndromic surveillance systems continuously monitor multiple pre-diagnostic daily streams of indicators from different regions with the aim of early detection of disease outbreaks. The main objective of these systems is to detect outbreaks hours or days before the clinical and laboratory confirmation. The type of data that is being generated via these systems is usually multivariate and seasonal w…

    Submitted 13 June, 2014; originally announced June 2014.

    Comments: To appear in Intelligent Data Analysis Journal, vol. 19(3), 2015

    Journal ref: pp. 597-616, Vol. 19, No. 3, June 2015, Intelligent Data Analysis

  29. arXiv:1406.3266  [pdf, other]

    cs.AI

    Event and Anomaly Detection Using Tucker3 Decomposition

    Authors: Hadi Fanaee-T, Márcia D. B. Oliveira, João Gama, Simon Malinowski, Ricardo Morla

    Abstract: Failure detection in telecommunication networks is a vital task. So far, several supervised and unsupervised solutions have been provided for discovering failures in such networks. Among them, unsupervised approaches have attracted more attention since no label data is required. Often, network devices are not able to provide information about the type of failure. In such cases the type of failure is…

    Submitted 12 June, 2014; originally announced June 2014.

    Journal ref: In Proceedings of 20th European Conference on Artificial Intelligence (ECAI'2012) - Ubiquitous Data Mining Workshop, pp. 8-12, vol. 1, August 27-31, 2012

  30. arXiv:1406.3191  [pdf, other]

    cs.AI

    An eigenvector-based hotspot detection

    Authors: Hadi Fanaee-T, Joao Gama

    Abstract: Space and time are two critical components of many real world systems. For this reason, analysis of anomalies in spatiotemporal data has been of great interest. In this work, application of tensor decomposition and eigenspace techniques on spatiotemporal hotspot detection is investigated. An algorithm called SST-Hotspot is proposed which accounts for spatiotemporal variations in data and detect…

    Submitted 13 June, 2014; v1 submitted 12 June, 2014; originally announced June 2014.

    Journal ref: In Proceedings of 16th Portuguese Conference on Artificial Intelligence (EPIA 2013), Acores, Portugal, 9-12 September 2013, pp. 290-301