Showing 1–25 of 25 results for author: Albert, P

Searching in archive cs.
  1. arXiv:2407.05528  [pdf, other]

    cs.CV

    An accurate detection is not all you need to combat label noise in web-noisy datasets

    Authors: Paul Albert, Jack Valmadre, Eric Arazo, Tarun Krishna, Noel E. O'Connor, Kevin McGuinness

    Abstract: Training a classifier on web-crawled data demands learning algorithms that are robust to annotation errors and irrelevant examples. This paper builds upon the recent empirical observation that applying unsupervised contrastive learning to noisy, web-crawled datasets yields a feature representation under which the in-distribution (ID) and out-of-distribution (OOD) samples are linearly separable. We…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted in the European Conference on Computer Vision (ECCV) 2024

  2. arXiv:2407.02880  [pdf, other]

    cs.LG cs.AI cs.CV

    Knowledge Composition using Task Vectors with Learned Anisotropic Scaling

    Authors: Frederic Z. Zhang, Paul Albert, Cristian Rodriguez-Opazo, Anton van den Hengel, Ehsan Abbasnejad

    Abstract: Pre-trained models produce strong generic representations that can be adapted via fine-tuning. The learned weight difference relative to the pre-trained model, known as a task vector, characterises the direction and stride of fine-tuning. The significance of task vectors is such that simple arithmetic operations on them can be used to combine diverse representations from different domains. This pa…

    Submitted 3 July, 2024; originally announced July 2024.

  3. arXiv:2404.11230  [pdf, other]

    cs.CV cs.AI

    Energy-Efficient Uncertainty-Aware Biomass Composition Prediction at the Edge

    Authors: Muhammad Zawish, Paul Albert, Flavio Esposito, Steven Davy, Lizy Abraham

    Abstract: Clover fixates nitrogen from the atmosphere to the ground, making grass-clover mixtures highly desirable to reduce external nitrogen fertilization. Herbage containing clover additionally promotes higher food intake, resulting in higher milk production. Herbage probing however remains largely unused as it requires a time-intensive manual laboratory analysis. Without this information, farmers are un…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: The paper has been accepted to CVPR 2024 5th Workshop on Vision for Agriculture

  4. arXiv:2307.09288  [pdf, other]

    cs.CL cs.AI

    Llama 2: Open Foundation and Fine-Tuned Chat Models

    Authors: Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, et al. (43 additional authors not shown)

    Abstract: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be…

    Submitted 19 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

  5. arXiv:2304.09871  [pdf, other]

    cs.LG cs.AI math.OC

    A Theory on Adam Instability in Large-Scale Machine Learning

    Authors: Igor Molybog, Peter Albert, Moya Chen, Zachary DeVito, David Esiobu, Naman Goyal, Punit Singh Koura, Sharan Narang, Andrew Poulton, Ruan Silva, Binh Tang, Diana Liskovich, Puxin Xu, Yuchen Zhang, Melanie Kambadur, Stephen Roller, Susan Zhang

    Abstract: We present a theory for the previously unexplained divergent behavior noticed in the training of large language models. We argue that the phenomenon is an artifact of the dominant optimization algorithm used for training, called Adam. We observe that Adam can enter a state in which the parameter update vector has a relatively large norm and is essentially uncorrelated with the direction of descent…

    Submitted 25 April, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

  6. arXiv:2301.09164  [pdf, other]

    cs.LG cs.CV

    Unifying Synergies between Self-supervised Learning and Dynamic Computation

    Authors: Tarun Krishna, Ayush K Rai, Alexandru Drimbarean, Eric Arazo, Paul Albert, Alan F Smeaton, Kevin McGuinness, Noel E O'Connor

    Abstract: Computationally expensive training strategies make self-supervised learning (SSL) impractical for resource constrained industrial settings. Techniques like knowledge distillation (KD), dynamic computation (DC), and pruning are often used to obtain a lightweight model, which usually involves multiple epochs of fine-tuning (or distilling steps) of a large pre-trained model, making it more computation…

    Submitted 9 September, 2023; v1 submitted 22 January, 2023; originally announced January 2023.

    Comments: Accepted in BMVC 2023

  7. arXiv:2210.04578  [pdf, other]

    cs.CV cs.LG

    Is your noise correction noisy? PLS: Robustness to label noise with two stage detection

    Authors: Paul Albert, Eric Arazo, Tarun Krishna, Noel E. O'Connor, Kevin McGuinness

    Abstract: Designing robust algorithms capable of training accurate neural networks on uncurated datasets from the web has been the subject of much research as it reduces the need for time consuming human labor. The focus of many previous research contributions has been on the detection of different types of label noise; however, this paper proposes to improve the correction accuracy of noisy samples once th…

    Submitted 15 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: 9 pages 4 figures. Accepted at WACV 2023

  8. arXiv:2207.01573  [pdf, other]

    cs.CV

    Embedding contrastive unsupervised features to cluster in- and out-of-distribution noise in corrupted image datasets

    Authors: Paul Albert, Eric Arazo, Noel E. O'Connor, Kevin McGuinness

    Abstract: Using search engines for web image retrieval is a tempting alternative to manual curation when creating an image dataset, but their main drawback remains the proportion of incorrect (noisy) samples retrieved. These noisy samples have been evidenced by previous works to be a mixture of in-distribution (ID) samples, assigned to the incorrect category but presenting similar visual semantics to other…

    Submitted 18 July, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  9. arXiv:2204.09343  [pdf]

    cs.CV

    Utilizing unsupervised learning to improve sward content prediction and herbage mass estimation

    Authors: Paul Albert, Mohamed Saadeldin, Badri Narayanan, Brian Mac Namee, Deirdre Hennessy, Aisling H. O'Connor, Noel E. O'Connor, Kevin McGuinness

    Abstract: Sward species composition estimation is a tedious task. Herbage must be collected in the field, manually separated into components, dried and weighed to estimate species composition. Deep learning approaches using neural networks have been used in previous work to propose faster and more cost efficient alternatives to this process by estimating the biomass information from a picture of an area of p…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: 3 pages. Accepted at the 29th EGF General Meeting 2022

  10. arXiv:2204.08271  [pdf, other]

    cs.CV

    Unsupervised domain adaptation and super resolution on drone images for autonomous dry herbage biomass estimation

    Authors: Paul Albert, Mohamed Saadeldin, Badri Narayanan, Jaime Fernandez, Brian Mac Namee, Deirdre Hennessey, Noel E. O'Connor, Kevin McGuinness

    Abstract: Herbage mass yield and composition estimation is an important tool for dairy farmers to ensure an adequate supply of high quality herbage for grazing and subsequently milk production. By accurately estimating herbage mass and composition, targeted nitrogen fertiliser application strategies can be deployed to improve localised regions in a herbage field, effectively reducing the negative impacts of…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 11 pages, 5 figures. Accepted at the Agriculture-Vision CVPR 2022 Workshop

  11. arXiv:2110.14283  [pdf, other]

    cs.CV

    How Important is Importance Sampling for Deep Budgeted Training?

    Authors: Eric Arazo, Diego Ortego, Paul Albert, Noel E. O'Connor, Kevin McGuinness

    Abstract: Long iterative training processes for Deep Neural Networks (DNNs) are commonly required to achieve state-of-the-art performance in many computer vision tasks. Importance sampling approaches might play a key role in budgeted training regimes, i.e. when limiting the number of training iterations. These approaches aim at dynamically estimating the importance of each sample to focus on the most releva…

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: British Machine Vision Conference (BMVC) 2021, oral presentation

  12. arXiv:2110.13719  [pdf, other]

    cs.CV

    Semi-supervised dry herbage mass estimation using automatic data and synthetic images

    Authors: Paul Albert, Mohamed Saadeldin, Badri Narayanan, Brian Mac Namee, Deirdre Hennessy, Aisling O'Connor, Noel O'Connor, Kevin McGuinness

    Abstract: Monitoring species-specific dry herbage biomass is an important aspect of pasture-based milk production systems. Being aware of the herbage biomass in the field enables farmers to manage surpluses and deficits in herbage supply, as well as using targeted nitrogen fertilization when necessary. Deep learning for computer vision is a powerful tool in this context as it can accurately estimate the dry…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Published at CVPPA 2021, ICCVW 2021

  13. arXiv:2110.13699  [pdf, other]

    cs.CV

    Addressing out-of-distribution label noise in webly-labelled data

    Authors: Paul Albert, Diego Ortego, Eric Arazo, Noel O'Connor, Kevin McGuinness

    Abstract: A recurring focus of the deep learning community is towards reducing the labeling effort. Data gathering and annotation using a search engine is a simple alternative to generating a fully human-annotated and human-gathered dataset. Although web crawling is very time efficient, some of the retrieved images are unavoidably noisy, i.e. incorrectly labeled. Designing robust algorithms for training on…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Accepted at WACV 2022

  14. arXiv:2101.03198  [pdf, other]

    cs.CV cs.LG

    Extracting Pasture Phenotype and Biomass Percentages using Weakly Supervised Multi-target Deep Learning on a Small Dataset

    Authors: Badri Narayanan, Mohamed Saadeldin, Paul Albert, Kevin McGuinness, Brian Mac Namee

    Abstract: The dairy industry uses clover and grass as fodder for cows. Accurate estimation of grass and clover biomass yield enables smart decisions in optimizing fertilization and seeding density, resulting in increased productivity and positive environmental impact. Grass and clover are usually planted together, since clover is a nitrogen-fixing plant that brings nutrients to the soil. Adjusting the right…

    Submitted 8 January, 2021; originally announced January 2021.

    Journal ref: Irish Machine Vision and Image Processing Conference (2020) 21-28

  15. arXiv:2012.04462  [pdf, other]

    cs.CV

    Multi-Objective Interpolation Training for Robustness to Label Noise

    Authors: Diego Ortego, Eric Arazo, Paul Albert, Noel E. O'Connor, Kevin McGuinness

    Abstract: Deep neural networks trained with standard cross-entropy loss memorize noisy labels, which degrades their performance. Most research to mitigate this memorization proposes new robust classification loss functions. Conversely, we propose a Multi-Objective Interpolation Training (MOIT) approach that jointly exploits contrastive learning and classification to mutually help each other and boost perfor…

    Submitted 18 March, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: Accepted to CVPR 2021. 10 pages, 1 figure, and 9 tables

  16. arXiv:2009.14361  [pdf, other]

    cs.CL cs.CY

    Ethically Collecting Multi-Modal Spontaneous Conversations with People that have Cognitive Impairments

    Authors: Angus Addlesee, Pierre Albert

    Abstract: In order to make spoken dialogue systems (such as Amazon Alexa or Google Assistant) more accessible and naturally interactive for people with cognitive impairments, appropriate data must be obtainable. Recordings of multi-modal spontaneous conversations with vulnerable user groups are scarce however and this valuable data is challenging to collect. Researchers that call for this data are commonly…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: Published at LREC's Workshop on Legal and Ethical Issues in Human Language Technologies 2020

    Journal ref: LREC Workshop on Legal and Ethical Issues in Human Language Technologies (2020) 15-20

  17. arXiv:2007.11866  [pdf, other]

    cs.CV

    Reliable Label Bootstrapping for Semi-Supervised Learning

    Authors: Paul Albert, Diego Ortego, Eric Arazo, Noel E. O'Connor, Kevin McGuinness

    Abstract: Reducing the amount of labels required to train convolutional neural networks without performance degradation is key to effectively reduce human annotation efforts. We propose Reliable Label Bootstrapping (ReLaB), an unsupervised preprocessing algorithm which improves the performance of semi-supervised algorithms in extremely low supervision settings. Given a dataset with few labeled samples, we…

    Submitted 25 February, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: 10 pages, 3 figures

  18. arXiv:1912.08741  [pdf, other]

    cs.CV

    Towards Robust Learning with Different Label Noise Distributions

    Authors: Diego Ortego, Eric Arazo, Paul Albert, Noel E. O'Connor, Kevin McGuinness

    Abstract: Noisy labels are an unavoidable consequence of labeling processes and detecting them is an important step towards preventing performance degradations in Convolutional Neural Networks. Discarding noisy labels avoids a harmful memorization, while the associated image content can still be exploited in a semi-supervised learning (SSL) setup. Clean samples are usually identified using the small loss tr…

    Submitted 27 July, 2020; v1 submitted 18 December, 2019; originally announced December 2019.

  19. arXiv:1908.10623  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Emotion Recognition in Low-Resource Settings: An Evaluation of Automatic Feature Selection Methods

    Authors: Fasih Haider, Senja Pollak, Pierre Albert, Saturnino Luz

    Abstract: Research in automatic affect recognition has seldom addressed the issue of computational resource utilization. With the advent of ambient intelligence technology which employs a variety of low-power, resource-constrained devices, this issue is increasingly gaining interest. This is especially the case in the context of health and elderly care technologies, where interventions may rely on monitorin…

    Submitted 29 May, 2020; v1 submitted 28 August, 2019; originally announced August 2019.

  20. arXiv:1908.02983  [pdf, other]

    cs.CV

    Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning

    Authors: Eric Arazo, Diego Ortego, Paul Albert, Noel E. O'Connor, Kevin McGuinness

    Abstract: Semi-supervised learning, i.e. jointly learning from labeled and unlabeled samples, is an active research topic due to its key role in relaxing human supervision. In the context of image classification, recent advances to learn from unlabeled samples are mainly focused on consistency regularization methods that encourage invariant predictions for different perturbations of unlabeled samples. We, c…

    Submitted 29 June, 2020; v1 submitted 8 August, 2019; originally announced August 2019.

  21. arXiv:1904.11238  [pdf, other]

    cs.CV

    Unsupervised Label Noise Modeling and Loss Correction

    Authors: Eric Arazo, Diego Ortego, Paul Albert, Noel E. O'Connor, Kevin McGuinness

    Abstract: Despite being robust to small amounts of label noise, convolutional neural networks trained with stochastic gradient methods have been shown to easily fit random labels. When there is a mixture of correct and mislabelled targets, networks tend to fit the former before the latter. This suggests using a suitable two-component mixture model as an unsupervised generative model of sample loss values d…

    Submitted 5 June, 2019; v1 submitted 25 April, 2019; originally announced April 2019.

    Comments: Accepted to ICML 2019

  22. arXiv:1811.09919  [pdf, other]

    eess.AS cs.LG cs.SD

    A Method for Analysis of Patient Speech in Dialogue for Dementia Detection

    Authors: Saturnino Luz, Sofia de la Fuente, Pierre Albert

    Abstract: We present an approach to automatic detection of Alzheimer's type dementia based on characteristics of spontaneous spoken language dialogue consisting of interviews recorded in natural settings. The proposed method employs additive logistic regression (a machine learning boosting method) on content-free features extracted from dialogical interaction to build a predictive model. The model training…

    Submitted 24 November, 2018; originally announced November 2018.

    Comments: 8 pages, Resources and ProcessIng of linguistic, paralinguistic and extra-linguistic Data from people with various forms of cognitive impairment, LREC 2018

  23. arXiv:0709.2346  [pdf, ps, other]

    cs.IT cs.CC

    Pushdown Compression

    Authors: Pilar Albert, Elvira Mayordomo, Philippe Moser, Sylvain Perifel

    Abstract: The pressing need for efficient compression schemes for XML documents has recently been focused on stack computation [6, 9], and in particular calls for a formulation of information-lossless stack or pushdown compressors that allows a formal analysis of their performance and a more ambitious use of the stack in XML compression, where so far it is mainly connected to parsing mechanisms. In this pa…

    Submitted 17 September, 2007; v1 submitted 14 September, 2007; originally announced September 2007.

  24. arXiv:0704.2386  [pdf, ps, other]

    cs.CC cs.IT

    Bounded Pushdown dimension vs Lempel Ziv information density

    Authors: Pilar Albert, Elvira Mayordomo, Philippe Moser

    Abstract: In this paper we introduce a variant of pushdown dimension called bounded pushdown (BPD) dimension, that measures the density of information contained in a sequence, relative to a BPD automaton, i.e. a finite state machine equipped with an extra infinite memory stack, with the additional requirement that every input symbol only allows a bounded number of stack movements. BPD automata are a natura…

    Submitted 18 April, 2007; originally announced April 2007.

  25. arXiv:cs/0506031  [pdf, ps, other]

    cs.AI

    A Constrained Object Model for Configuration Based Workflow Composition

    Authors: Patrick Albert, Laurent Henocque, Mathias Kleiner

    Abstract: Automatic or assisted workflow composition is a field of intense research for applications to the world wide web or to business process modeling. Workflow composition is traditionally addressed in various ways, generally via theorem proving techniques. Recent research observed that building a composite workflow bears strong relationships with finite model search, and that some workflow languages…

    Submitted 9 June, 2005; originally announced June 2005.

    Comments: This is an extended version of the article published at BPM'05, Third International Conference on Business Process Management, Nancy France

    ACM Class: C.0; D.2.1; D.3.1; F.4.1