Search | arXiv e-print repository

arXiv:2406.19307 [pdf, other]

The Odyssey of Commonsense Causality: From Foundational Benchmarks to Cutting-Edge Reasoning

Authors: Shaobo Cui, Zhijing Jin, Bernhard Schölkopf, Boi Faltings

Abstract: Understanding commonsense causality is a unique mark of intelligence for humans. It helps people understand the principles of the real world better and benefits the decision-making process related to causation. For instance, commonsense causality is crucial in judging whether a defendant's action causes the plaintiff's loss in determining legal liability. Despite its significance, a systematic exp… ▽ More Understanding commonsense causality is a unique mark of intelligence for humans. It helps people understand the principles of the real world better and benefits the decision-making process related to causation. For instance, commonsense causality is crucial in judging whether a defendant's action causes the plaintiff's loss in determining legal liability. Despite its significance, a systematic exploration of this topic is notably lacking. Our comprehensive survey bridges this gap by focusing on taxonomies, benchmarks, acquisition methods, qualitative reasoning, and quantitative measurements in commonsense causality, synthesizing insights from over 200 representative articles. Our work aims to provide a systematic overview, update scholars on recent advancements, provide a pragmatic guide for beginners, and highlight promising future research directions in this vital field. △ Less

Submitted 27 June, 2024; originally announced June 2024.

Comments: 42 pages

arXiv:2402.13950 [pdf, other]

Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning

Authors: Debjit Paul, Robert West, Antoine Bosselut, Boi Faltings

Abstract: Large language models (LLMs) have been shown to perform better when asked to reason step-by-step before answering a question. However, it is unclear to what degree the model's final answer is faithful to the stated reasoning steps. In this paper, we perform a causal mediation analysis on twelve LLMs to examine how intermediate reasoning steps generated by the LLM influence the final outcome and fi… ▽ More Large language models (LLMs) have been shown to perform better when asked to reason step-by-step before answering a question. However, it is unclear to what degree the model's final answer is faithful to the stated reasoning steps. In this paper, we perform a causal mediation analysis on twelve LLMs to examine how intermediate reasoning steps generated by the LLM influence the final outcome and find that LLMs do not reliably use their intermediate reasoning steps when generating an answer. To address this issue, we introduce FRODO, a framework to tailor small-sized LMs to generate correct reasoning steps and robustly reason over these steps. FRODO consists of an inference module that learns to generate correct reasoning steps using an implicit causal reward function and a reasoning module that learns to faithfully reason over these intermediate inferences using a counterfactual and causal preference objective. Our experiments show that FRODO significantly outperforms four competitive baselines. Furthermore, FRODO improves the robustness and generalization ability of the reasoning LM, yielding higher performance on out-of-distribution test sets. Finally, we find that FRODO's rationales are more faithful to its final answer predictions than standard supervised fine-tuning. △ Less

Submitted 23 February, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

arXiv:2401.03183 [pdf, other]

Exploring Defeasibility in Causal Reasoning

Authors: Shaobo Cui, Lazar Milikic, Yiyang Feng, Mete Ismayilzada, Debjit Paul, Antoine Bosselut, Boi Faltings

Abstract: Defeasibility in causal reasoning implies that the causal relationship between cause and effect can be strengthened or weakened. Namely, the causal strength between cause and effect should increase or decrease with the incorporation of strengthening arguments (supporters) or weakening arguments (defeaters), respectively. However, existing works ignore defeasibility in causal reasoning and fail to… ▽ More Defeasibility in causal reasoning implies that the causal relationship between cause and effect can be strengthened or weakened. Namely, the causal strength between cause and effect should increase or decrease with the incorporation of strengthening arguments (supporters) or weakening arguments (defeaters), respectively. However, existing works ignore defeasibility in causal reasoning and fail to evaluate existing causal strength metrics in defeasible settings. In this work, we present $δ$-CAUSAL, the first benchmark dataset for studying defeasibility in causal reasoning. $δ$-CAUSAL includes around 11K events spanning ten domains, featuring defeasible causality pairs, i.e., cause-effect pairs accompanied by supporters and defeaters. We further show current causal strength metrics fail to reflect the change of causal strength with the incorporation of supporters or defeaters in $δ$-CAUSAL. To this end, we propose CESAR (Causal Embedding aSsociation with Attention Rating), a metric that measures causal strength based on token-level causal relationships. CESAR achieves a significant 69.7% relative improvement over existing metrics, increasing from 47.2% to 80.1% in capturing the causal strength change brought by supporters and defeaters. We further demonstrate even Large Language Models (LLMs) like GPT-3.5 still lag 4.5 and 10.7 points behind humans in generating supporters and defeaters, emphasizing the challenge posed by $δ$-CAUSAL. △ Less

Submitted 27 June, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

Comments: Accepted by ACL 2024 (Findings)

arXiv:2312.12303 [pdf, other]

Peer Neighborhood Mechanisms: A Framework for Mechanism Generalization

Authors: Adam Richardson, Boi Faltings

Abstract: Peer prediction incentive mechanisms for crowdsourcing are generally limited to eliciting samples from categorical distributions. Prior work on extending peer prediction to arbitrary distributions has largely relied on assumptions on the structures of the distributions or known properties of the data providers. We introduce a novel class of incentive mechanisms that extend peer prediction mechanis… ▽ More Peer prediction incentive mechanisms for crowdsourcing are generally limited to eliciting samples from categorical distributions. Prior work on extending peer prediction to arbitrary distributions has largely relied on assumptions on the structures of the distributions or known properties of the data providers. We introduce a novel class of incentive mechanisms that extend peer prediction mechanisms to arbitrary distributions by replacing the notion of an exact match with a concept of neighborhood matching. We present conditions on the belief updates of the data providers that guarantee incentive-compatibility for rational data providers, and admit a broad class of possible reasonable updates. △ Less

Submitted 19 December, 2023; originally announced December 2023.

Comments: Full paper with technical Appendix to reference from AAAI conference paper

arXiv:2304.01904 [pdf, other]

REFINER: Reasoning Feedback on Intermediate Representations

Authors: Debjit Paul, Mete Ismayilzada, Maxime Peyrard, Beatriz Borges, Antoine Bosselut, Robert West, Boi Faltings

Abstract: Language models (LMs) have recently shown remarkable performance on reasoning tasks by explicitly generating intermediate inferences, e.g., chain-of-thought prompting. However, these intermediate inference steps may be inappropriate deductions from the initial context and lead to incorrect final predictions. Here we introduce REFINER, a framework for finetuning LMs to explicitly generate intermedi… ▽ More Language models (LMs) have recently shown remarkable performance on reasoning tasks by explicitly generating intermediate inferences, e.g., chain-of-thought prompting. However, these intermediate inference steps may be inappropriate deductions from the initial context and lead to incorrect final predictions. Here we introduce REFINER, a framework for finetuning LMs to explicitly generate intermediate reasoning steps while interacting with a critic model that provides automated feedback on the reasoning. Specifically, the critic provides structured feedback that the reasoning LM uses to iteratively improve its intermediate arguments. Empirical evaluations of REFINER on three diverse reasoning tasks show significant improvements over baseline LMs of comparable scale. Furthermore, when using GPT-3.5 or ChatGPT as the reasoner, the trained critic significantly improves reasoning without finetuning the reasoner. Finally, our critic model is trained without expensive human-in-the-loop data but can be substituted with humans at inference time. △ Less

Submitted 4 February, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

Comments: Accepted at EACL 2024

arXiv:2210.07228 [pdf, other]

Language Model Decoding as Likelihood-Utility Alignment

Authors: Martin Josifoski, Maxime Peyrard, Frano Rajic, Jiheng Wei, Debjit Paul, Valentin Hartmann, Barun Patra, Vishrav Chaudhary, Emre Kıcıman, Boi Faltings, Robert West

Abstract: A critical component of a successful language generation pipeline is the decoding algorithm. However, the general principles that should guide the choice of a decoding algorithm remain unclear. Previous works only compare decoding algorithms in narrow scenarios, and their findings do not generalize across tasks. We argue that the misalignment between the model's likelihood and the task-specific no… ▽ More A critical component of a successful language generation pipeline is the decoding algorithm. However, the general principles that should guide the choice of a decoding algorithm remain unclear. Previous works only compare decoding algorithms in narrow scenarios, and their findings do not generalize across tasks. We argue that the misalignment between the model's likelihood and the task-specific notion of utility is the key factor to understanding the effectiveness of decoding algorithms. To structure the discussion, we introduce a taxonomy of misalignment mitigation strategies (MMSs), providing a unifying view of decoding as a tool for alignment. The MMS taxonomy groups decoding algorithms based on their implicit assumptions about likelihood--utility misalignment, yielding general statements about their applicability across tasks. Specifically, by analyzing the correlation between the likelihood and the utility of predictions across a diverse set of tasks, we provide empirical evidence supporting the proposed taxonomy and a set of principles to structure reasoning when choosing a decoding algorithm. Crucially, our analysis is the first to relate likelihood-based decoding algorithms with algorithms that rely on external information, such as value-guided methods and prompting, and covers the most diverse set of tasks to date. Code, data, and models are available at https://github.com/epfl-dlab/understanding-decoding. △ Less

Submitted 16 March, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

Comments: Accepted at EACL (Findings) 2023

arXiv:2205.11518 [pdf, other]

LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation

Authors: Ljubomir Rokvic, Panayiotis Danassis, Sai Praneeth Karimireddy, Boi Faltings

Abstract: In Federated Learning, it is crucial to handle low-quality, corrupted, or malicious data. However, traditional data valuation methods are not suitable due to privacy concerns. To address this, we propose a simple yet effective approach that utilizes a new influence approximation called "lazy influence" to filter and score data while preserving privacy. To do this, each participant uses their own d… ▽ More In Federated Learning, it is crucial to handle low-quality, corrupted, or malicious data. However, traditional data valuation methods are not suitable due to privacy concerns. To address this, we propose a simple yet effective approach that utilizes a new influence approximation called "lazy influence" to filter and score data while preserving privacy. To do this, each participant uses their own data to estimate the influence of another participant's batch and sends a differentially private obfuscated score to the central coordinator. Our method has been shown to successfully filter out biased and corrupted data in various simulated and real-world settings, achieving a recall rate of over $>90\%$ (sometimes up to $100\%$) while maintaining strong differential privacy guarantees with $\varepsilon \leq 1$. △ Less

Submitted 30 May, 2024; v1 submitted 23 May, 2022; originally announced May 2022.

Comments: A preliminary version of this work received the Best Paper Award at the International Workshop on Trustworthy Federated Learning at IJCAI (FL-IJCAI) 2023

arXiv:2205.06756 [pdf, other]

Interlock-Free Multi-Aspect Rationalization for Text Classification

Authors: Shuangqi Li, Diego Antognini, Boi Faltings

Abstract: Explanation is important for text classification tasks. One prevalent type of explanation is rationales, which are text snippets of input text that suffice to yield the prediction and are meaningful to humans. A lot of research on rationalization has been based on the selective rationalization framework, which has recently been shown to be problematic due to the interlocking dynamics. In this pape… ▽ More Explanation is important for text classification tasks. One prevalent type of explanation is rationales, which are text snippets of input text that suffice to yield the prediction and are meaningful to humans. A lot of research on rationalization has been based on the selective rationalization framework, which has recently been shown to be problematic due to the interlocking dynamics. In this paper, we show that we address the interlocking problem in the multi-aspect setting, where we aim to generate multiple rationales for multiple outputs. More specifically, we propose a multi-stage training method incorporating an additional self-supervised contrastive loss that helps to generate more semantically diverse rationales. Empirical results on the beer review dataset show that our method improves significantly the rationalization performance. △ Less

Submitted 13 May, 2022; originally announced May 2022.

Comments: Work in progress. 5 pages, 2 figures, 1 table

arXiv:2205.02454 [pdf, other]

Assistive Recipe Editing through Critiquing

Authors: Diego Antognini, Shuyang Li, Boi Faltings, Julian McAuley

Abstract: There has recently been growing interest in the automatic generation of cooking recipes that satisfy some form of dietary restrictions, thanks in part to the availability of online recipe data. Prior studies have used pre-trained language models, or relied on small paired recipe data (e.g., a recipe paired with a similar one that satisfies a dietary constraint). However, pre-trained language model… ▽ More There has recently been growing interest in the automatic generation of cooking recipes that satisfy some form of dietary restrictions, thanks in part to the availability of online recipe data. Prior studies have used pre-trained language models, or relied on small paired recipe data (e.g., a recipe paired with a similar one that satisfies a dietary constraint). However, pre-trained language models generate inconsistent or incoherent recipes, and paired datasets are not available at scale. We address these deficiencies with RecipeCrit, a hierarchical denoising auto-encoder that edits recipes given ingredient-level critiques. The model is trained for recipe completion to learn semantic relationships within recipes. Our work's main innovation is our unsupervised critiquing module that allows users to edit recipes by interacting with the predicted ingredients; the system iteratively rewrites recipes to satisfy users' feedback. Experiments on the Recipe1M recipe dataset show that our model can more effectively edit recipes compared to strong language-modeling baselines, creating recipes that satisfy user constraints and are more correct, serendipitous, coherent, and relevant as measured by human judges. △ Less

Submitted 26 January, 2023; v1 submitted 5 May, 2022; originally announced May 2022.

Comments: Accepted at EACL 2023. 10 pages, 2 figures, 6 tables, 1 algorithm

arXiv:2204.02162 [pdf, other]

Positive and Negative Critiquing for VAE-based Recommenders

Authors: Diego Antognini, Boi Faltings

Abstract: Providing explanations for recommended items allows users to refine the recommendations by critiquing parts of the explanations. As a result of revisiting critiquing from the perspective of multimodal generative models, recent work has proposed M&Ms-VAE, which achieves state-of-the-art performance in terms of recommendation, explanation, and critiquing. M&Ms-VAE and similar models allow users to n… ▽ More Providing explanations for recommended items allows users to refine the recommendations by critiquing parts of the explanations. As a result of revisiting critiquing from the perspective of multimodal generative models, recent work has proposed M&Ms-VAE, which achieves state-of-the-art performance in terms of recommendation, explanation, and critiquing. M&Ms-VAE and similar models allow users to negatively critique (i.e., explicitly disagree). However, they share a significant drawback: users cannot positively critique (i.e., highlight a desired feature). We address this deficiency with M&Ms-VAE+, an extension of M&Ms-VAE that enables positive and negative critiquing. In addition to modeling users' interactions and keyphrase-usage preferences, we model their keyphrase-usage dislikes. Moreover, we design a novel critiquing module that is trained in a self-supervised fashion. Our experiments on two datasets show that M&Ms-VAE+ matches or exceeds M&Ms-VAE in recommendation and explanation performance. Furthermore, our results demonstrate that representing positive and negative critiques differently enables M&Ms-VAE+ to significantly outperform M&Ms-VAE and other models in positive and negative multi-step critiquing. △ Less

Submitted 5 April, 2022; originally announced April 2022.

Comments: 5 pages, 2 figures, 2 tables

arXiv:2108.12596 [pdf, other]

Representation Memorization for Fast Learning New Knowledge without Forgetting

Authors: Fei Mi, Tao Lin, Boi Faltings

Abstract: The ability to quickly learn new knowledge (e.g. new classes or data distributions) is a big step towards human-level intelligence. In this paper, we consider scenarios that require learning new classes or data distributions quickly and incrementally over time, as it often occurs in real-world dynamic environments. We propose "Memory-based Hebbian Parameter Adaptation" (Hebb) to tackle the two maj… ▽ More The ability to quickly learn new knowledge (e.g. new classes or data distributions) is a big step towards human-level intelligence. In this paper, we consider scenarios that require learning new classes or data distributions quickly and incrementally over time, as it often occurs in real-world dynamic environments. We propose "Memory-based Hebbian Parameter Adaptation" (Hebb) to tackle the two major challenges (i.e., catastrophic forgetting and sample efficiency) towards this goal in a unified framework. To mitigate catastrophic forgetting, Hebb augments a regular neural classifier with a continuously updated memory module to store representations of previous data. To improve sample efficiency, we propose a parameter adaptation method based on the well-known Hebbian theory, which directly "wires" the output network's parameters with similar representations retrieved from the memory. We empirically verify the superior performance of Hebb through extensive experiments on a wide range of learning tasks (image classification, language model) and learning scenarios (continual, incremental, online). We demonstrate that Hebb effectively mitigates catastrophic forgetting, and it indeed learns new knowledge better and faster than the current state-of-the-art. △ Less

Submitted 10 September, 2021; v1 submitted 28 August, 2021; originally announced August 2021.

arXiv:2108.12589 [pdf, other]

Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems

Authors: Fei Mi, Wanhao Zhou, Fengyu Cai, Lingjing Kong, Minlie Huang, Boi Faltings

Abstract: As the labeling cost for different modules in task-oriented dialog (ToD) systems is expensive, a major challenge is to train different modules with the least amount of labeled data. Recently, large-scale pre-trained language models, have shown promising results for few-shot learning in ToD. In this paper, we devise a self-training approach to utilize the abundant unlabeled dialog data to further i… ▽ More As the labeling cost for different modules in task-oriented dialog (ToD) systems is expensive, a major challenge is to train different modules with the least amount of labeled data. Recently, large-scale pre-trained language models, have shown promising results for few-shot learning in ToD. In this paper, we devise a self-training approach to utilize the abundant unlabeled dialog data to further improve state-of-the-art pre-trained models in few-shot learning scenarios for ToD systems. Specifically, we propose a self-training approach that iteratively labels the most confident unlabeled data to train a stronger Student model. Moreover, a new text augmentation technique (GradAug) is proposed to better train the Student by replacing non-crucial tokens using a masked language model. We conduct extensive experiments and present analyses on four downstream tasks in ToD, including intent classification, dialog state tracking, dialog act prediction, and response selection. Empirical results demonstrate that the proposed self-training approach consistently improves state-of-the-art pre-trained models (BERT, ToD-BERT) when only a small number of labeled data are available. △ Less

Submitted 28 August, 2021; originally announced August 2021.

Comments: Accepted as Long Paper at "EMNLP, 2021"

arXiv:2108.11711 [pdf, other]

SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling

Authors: Fengyu Cai, Wanhao Zhou, Fei Mi, Boi Faltings

Abstract: Utterance-level intent detection and token-level slot filling are two key tasks for natural language understanding (NLU) in task-oriented systems. Most existing approaches assume that only a single intent exists in an utterance. However, there are often multiple intents within an utterance in real-life scenarios. In this paper, we propose a multi-intent NLU framework, called SLIM, to jointly learn… ▽ More Utterance-level intent detection and token-level slot filling are two key tasks for natural language understanding (NLU) in task-oriented systems. Most existing approaches assume that only a single intent exists in an utterance. However, there are often multiple intents within an utterance in real-life scenarios. In this paper, we propose a multi-intent NLU framework, called SLIM, to jointly learn multi-intent detection and slot filling based on BERT. To fully exploit the existing annotation data and capture the interactions between slots and intents, SLIM introduces an explicit slot-intent classifier to learn the many-to-one mapping between slots and intents. Empirical results on three public multi-intent datasets demonstrate (1) the superior performance of SLIM compared to the current state-of-the-art for NLU with multiple intents and (2) the benefits obtained from the slot-intent classifier. △ Less

Submitted 26 August, 2021; originally announced August 2021.

arXiv:2107.06416 [pdf, other]

Multi-Step Critiquing User Interface for Recommender Systems

Authors: Diana Petrescu, Diego Antognini, Boi Faltings

Abstract: Recommendations with personalized explanations have been shown to increase user trust and perceived quality and help users make better decisions. Moreover, such explanations allow users to provide feedback by critiquing them. Several algorithms for recommender systems with multi-step critiquing have therefore been developed. However, providing a user-friendly interface based on personalized explan… ▽ More Recommendations with personalized explanations have been shown to increase user trust and perceived quality and help users make better decisions. Moreover, such explanations allow users to provide feedback by critiquing them. Several algorithms for recommender systems with multi-step critiquing have therefore been developed. However, providing a user-friendly interface based on personalized explanations and critiquing has not been addressed in the last decade. In this paper, we introduce four different web interfaces (available under https://lia.epfl.ch/critiquing/) helping users making decisions and finding their ideal item. We have chosen the hotel recommendation domain as a use case even though our approach is trivially adaptable for other domains. Moreover, our system is model-agnostic (for both recommender systems and critiquing models) allowing a great flexibility and further extensions. Our interfaces are above all a useful tool to help research in recommendation with critiquing. They allow to test such systems on a real use case and also to highlight some limitations of these approaches to find solutions to overcome them. △ Less

Submitted 5 August, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

Comments: Accepted at RecSys 2021. 5 pages, 7 figures

arXiv:2106.06060 [pdf, other]

AI-driven Prices for Externalities and Sustainability in Production Markets

Authors: Panayiotis Danassis, Aris Filos-Ratsikas, Haipeng Chen, Milind Tambe, Boi Faltings

Abstract: Traditional competitive markets do not account for negative externalities; indirect costs that some participants impose on others, such as the cost of over-appropriating a common-pool resource (which diminishes future stock, and thus harvest, for everyone). Quantifying appropriate interventions to market prices has proven to be quite challenging. We propose a practical approach to computing market… ▽ More Traditional competitive markets do not account for negative externalities; indirect costs that some participants impose on others, such as the cost of over-appropriating a common-pool resource (which diminishes future stock, and thus harvest, for everyone). Quantifying appropriate interventions to market prices has proven to be quite challenging. We propose a practical approach to computing market prices and allocations via a deep reinforcement learning policymaker agent, operating in an environment of other learning agents. Our policymaker allows us to tune the prices with regard to diverse objectives such as sustainability and resource wastefulness, fairness, buyers' and sellers' welfare, etc. As a highlight of our findings, our policymaker is significantly more successful in maintaining resource sustainability, compared to the market equilibrium outcome, in scarce resource environments. △ Less

Submitted 12 January, 2023; v1 submitted 10 June, 2021; originally announced June 2021.

Comments: Accepted to the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023)

arXiv:2105.04837 [pdf, other]

Rationalization through Concepts

Authors: Diego Antognini, Boi Faltings

Abstract: Automated predictions require explanations to be interpretable by humans. One type of explanation is a rationale, i.e., a selection of input features such as relevant text snippets from which the model computes the outcome. However, a single overall selection does not provide a complete explanation, e.g., weighing several aspects for decisions. To this end, we present a novel self-interpretable mo… ▽ More Automated predictions require explanations to be interpretable by humans. One type of explanation is a rationale, i.e., a selection of input features such as relevant text snippets from which the model computes the outcome. However, a single overall selection does not provide a complete explanation, e.g., weighing several aspects for decisions. To this end, we present a novel self-interpretable model called ConRAT. Inspired by how human explanations for high-level decisions are often based on key concepts, ConRAT extracts a set of text snippets as concepts and infers which ones are described in the document. Then, it explains the outcome with a linear aggregation of concepts. Two regularizers drive ConRAT to build interpretable concepts. In addition, we propose two techniques to boost the rationale and predictive performance further. Experiments on both single- and multi-aspect sentiment classification tasks show that ConRAT is the first to generate concepts that align with human rationalization while using only the overall label. Further, it outperforms state-of-the-art methods trained on each aspect label independently. △ Less

Submitted 11 May, 2021; originally announced May 2021.

Comments: Accepted at ACL 2021 (findings). 15 pages, 10 figures, 7 tables

arXiv:2105.04027 [pdf, other]

Improving Multi-agent Coordination by Learning to Estimate Contention

Authors: Panayiotis Danassis, Florian Wiedemair, Boi Faltings

Abstract: We present a multi-agent learning algorithm, ALMA-Learning, for efficient and fair allocations in large-scale systems. We circumvent the traditional pitfalls of multi-agent learning (e.g., the moving target problem, the curse of dimensionality, or the need for mutually consistent actions) by relying on the ALMA heuristic as a coordination mechanism for each stage game. ALMA-Learning is decentraliz… ▽ More We present a multi-agent learning algorithm, ALMA-Learning, for efficient and fair allocations in large-scale systems. We circumvent the traditional pitfalls of multi-agent learning (e.g., the moving target problem, the curse of dimensionality, or the need for mutually consistent actions) by relying on the ALMA heuristic as a coordination mechanism for each stage game. ALMA-Learning is decentralized, observes only own action/reward pairs, requires no inter-agent communication, and achieves near-optimal (<5% loss) and fair coordination in a variety of synthetic scenarios and a real-world meeting scheduling problem. The lightweight nature and fast learning constitute ALMA-Learning ideal for on-device deployment. △ Less

Submitted 20 June, 2021; v1 submitted 9 May, 2021; originally announced May 2021.

Comments: Accepted to the 30th International Joint Conference on Artificial Intelligence (IJCAI-21)

arXiv:2105.00774 [pdf, other]

Fast Multi-Step Critiquing for VAE-based Recommender Systems

Authors: Diego Antognini, Boi Faltings

Abstract: Recent studies have shown that providing personalized explanations alongside recommendations increases trust and perceived quality. Furthermore, it gives users an opportunity to refine the recommendations by critiquing parts of the explanations. On one hand, current recommender systems model the recommendation, explanation, and critiquing objectives jointly, but this creates an inherent trade-off… ▽ More Recent studies have shown that providing personalized explanations alongside recommendations increases trust and perceived quality. Furthermore, it gives users an opportunity to refine the recommendations by critiquing parts of the explanations. On one hand, current recommender systems model the recommendation, explanation, and critiquing objectives jointly, but this creates an inherent trade-off between their respective performance. On the other hand, although recent latent linear critiquing approaches are built upon an existing recommender system, they suffer from computational inefficiency at inference due to the objective optimized at each conversation's turn. We address these deficiencies with M&Ms-VAE, a novel variational autoencoder for recommendation and explanation that is based on multimodal modeling assumptions. We train the model under a weak supervision scheme to simulate both fully and partially observed variables. Then, we leverage the generalization ability of a trained M&Ms-VAE model to embed the user preference and the critique separately. Our work's most important innovation is our critiquing module, which is built upon and trained in a self-supervised manner with a simple ranking objective. Experiments on four real-world datasets demonstrate that among state-of-the-art models, our system is the first to dominate or match the performance in terms of recommendation, explanation, and multi-step critiquing. Moreover, M&Ms-VAE processes the critiques up to 25.6x faster than the best baselines. Finally, we show that our model infers coherent joint and cross generation, even under weak supervision, thanks to our multimodal-based modeling and training scheme. △ Less

Submitted 7 July, 2021; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: Accepted at RecSys 2021. 19 pages, 7 figures, 5 tables

arXiv:2102.02304 [pdf, ps, other]

doi 10.1007/s10458-021-09541-7

Improved Cooperation by Exploiting a Common Signal

Authors: Panayiotis Danassis, Zeki Doruk Erden, Boi Faltings

Abstract: Can artificial agents benefit from human conventions? Human societies manage to successfully self-organize and resolve the tragedy of the commons in common-pool resources, in spite of the bleak prediction of non-cooperative game theory. On top of that, real-world problems are inherently large-scale and of low observability. One key concept that facilitates human coordination in such settings is th… ▽ More Can artificial agents benefit from human conventions? Human societies manage to successfully self-organize and resolve the tragedy of the commons in common-pool resources, in spite of the bleak prediction of non-cooperative game theory. On top of that, real-world problems are inherently large-scale and of low observability. One key concept that facilitates human coordination in such settings is the use of conventions. Inspired by human behavior, we investigate the learning dynamics and emergence of temporal conventions, focusing on common-pool resources. Extra emphasis was given in designing a realistic evaluation setting: (a) environment dynamics are modeled on real-world fisheries, (b) we assume decentralized learning, where agents can observe only their own history, and (c) we run large-scale simulations (up to 64 agents). Uncoupled policies and low observability make cooperation hard to achieve; as the number of agents grow, the probability of taking a correct gradient direction decreases exponentially. By introducing an arbitrary common signal (e.g., date, time, or any periodic set of numbers) as a means to couple the learning process, we show that temporal conventions can emerge and agents reach sustainable harvesting strategies. The introduction of the signal consistently improves the social welfare (by 258% on average, up to 3306%), the range of environmental parameters where sustainability can be achieved (by 46% on average, up to 300%), and the convergence speed in low abundance settings (by 13% on average, up to 53%). △ Less

Submitted 3 February, 2021; originally announced February 2021.

Comments: Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021)

Journal ref: An extended version of this paper has been published in the Autonomous Agents and Multi-Agent Systems (2022)

arXiv:2101.04825 [pdf, other]

Towards Mobile Distributed Ledgers

Authors: Dimitris Chatzopoulos, Anurag Jain, Sujit Gujar, Boi Faltings, Pan Hui

Abstract: Advances in mobile computing have paved the way for new types of distributed applications that can be executed solely by mobile devices on device-to-device (D2D) ecosystems (e.g., crowdsensing). Sophisticated applications, like cryptocurrencies, need distributed ledgers to function. Distributed ledgers, such as blockchains and directed acyclic graphs (DAGs), employ consensus protocols to add data… ▽ More Advances in mobile computing have paved the way for new types of distributed applications that can be executed solely by mobile devices on device-to-device (D2D) ecosystems (e.g., crowdsensing). Sophisticated applications, like cryptocurrencies, need distributed ledgers to function. Distributed ledgers, such as blockchains and directed acyclic graphs (DAGs), employ consensus protocols to add data in the form of blocks. However, such protocols are designed for resourceful devices that are interconnected via the Internet. Moreover, existing distributed ledgers are not deployable to D2D ecosystems since their storage needs are continuously increasing. In this work, we introduce and analyse Mneme, a DAG-based distributed ledger that can be maintained solely by mobile devices. Mneme utilizes two novel consensus protocols: Proof-of-Context (PoC) and Proof-of-Equivalence (PoE). PoC employs users' context to add data on Mneme. PoE is executed periodically to summarize data and produce equivalent blocks that require less storage. We analyze Mneme's security and justify the ability of PoC and PoE to guarantee the characteristics of distributed ledgers: persistence and liveness. Furthermore, we analyze potential attacks from malicious users and prove that the probability of a successful attack is inversely proportional to the square of the number of mobile users who maintain Mneme. △ Less

Submitted 12 January, 2021; originally announced January 2021.

Comments: Part of it was presented in IEEE Infocom 2020

arXiv:2011.07934 [pdf, other]

A Distributed Differentially Private Algorithm for Resource Allocation in Unboundedly Large Settings

Authors: Panayiotis Danassis, Aleksei Triastcyn, Boi Faltings

Abstract: We introduce a practical and scalable algorithm (PALMA) for solving one of the fundamental problems of multi-agent systems -- finding matches and allocations -- in unboundedly large settings (e.g., resource allocation in urban environments, mobility-on-demand systems, etc.), while providing strong worst-case privacy guarantees. PALMA is decentralized, runs on-device, requires no inter-agent commun… ▽ More We introduce a practical and scalable algorithm (PALMA) for solving one of the fundamental problems of multi-agent systems -- finding matches and allocations -- in unboundedly large settings (e.g., resource allocation in urban environments, mobility-on-demand systems, etc.), while providing strong worst-case privacy guarantees. PALMA is decentralized, runs on-device, requires no inter-agent communication, and converges in constant time under reasonable assumptions. We evaluate PALMA in a mobility-on-demand and a paper assignment scenario, using real data in both, and demonstrate that it provides a strong level of privacy ($\varepsilon \leq 1$ and median as low as $\varepsilon = 0.5$ across agents) and high-quality matchings (up to $86\%$ of the non-private optimal, outperforming even the privacy-preserving centralized maximum-weight matching baseline). △ Less

Submitted 13 March, 2022; v1 submitted 16 November, 2020; originally announced November 2020.

Comments: Accepted to the 21th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2022)

arXiv:2010.00910 [pdf, other]

Continual Learning for Natural Language Generation in Task-oriented Dialog Systems

Authors: Fei Mi, Liangwei Chen, Mengjie Zhao, Minlie Huang, Boi Faltings

Abstract: Natural language generation (NLG) is an essential component of task-oriented dialog systems. Despite the recent success of neural approaches for NLG, they are typically developed in an offline manner for particular domains. To better fit real-life applications where new data come in a stream, we study NLG in a "continual learning" setting to expand its knowledge to new domains or functionalities i… ▽ More Natural language generation (NLG) is an essential component of task-oriented dialog systems. Despite the recent success of neural approaches for NLG, they are typically developed in an offline manner for particular domains. To better fit real-life applications where new data come in a stream, we study NLG in a "continual learning" setting to expand its knowledge to new domains or functionalities incrementally. The major challenge towards this goal is catastrophic forgetting, meaning that a continually trained model tends to forget the knowledge it has learned before. To this end, we propose a method called ARPER (Adaptively Regularized Prioritized Exemplar Replay) by replaying prioritized historical exemplars, together with an adaptive regularization technique based on ElasticWeight Consolidation. Extensive experiments to continually learn new domains and intents are conducted on MultiWoZ-2.0 to benchmark ARPER with a wide range of techniques. Empirical results demonstrate that ARPER significantly outperforms other methods by effectively mitigating the detrimental catastrophic forgetting issue. △ Less

Submitted 2 October, 2020; originally announced October 2020.

Comments: Accepted as Long Paper at "Findgins of EMNLP, 2020"

arXiv:2009.08978 [pdf, other]

Modeling Online Behavior in Recommender Systems: The Importance of Temporal Context

Authors: Milena Filipovic, Blagoj Mitrevski, Diego Antognini, Emma Lejal Glaude, Boi Faltings, Claudiu Musat

Abstract: Recommender systems research tends to evaluate model performance offline and on randomly sampled targets, yet the same systems are later used to predict user behavior sequentially from a fixed point in time. Simulating online recommender system performance is notoriously difficult and the discrepancy between online and offline behaviors is typically not accounted for in offline evaluations. This d… ▽ More Recommender systems research tends to evaluate model performance offline and on randomly sampled targets, yet the same systems are later used to predict user behavior sequentially from a fixed point in time. Simulating online recommender system performance is notoriously difficult and the discrepancy between online and offline behaviors is typically not accounted for in offline evaluations. This disparity permits weaknesses to go unnoticed until the model is deployed in a production setting. In this paper, we first demonstrate how omitting temporal context when evaluating recommender system performance leads to false confidence. To overcome this, we postulate that offline evaluation protocols can only model real-life use-cases if they account for temporal context. Next, we propose a training procedure to further embed the temporal context in existing models. We use a multi-objective approach to introduce temporal context into traditionally time-unaware recommender systems and confirm its advantage via the proposed evaluation protocol. Finally, we validate that the Pareto Fronts obtained with the added objective dominate those produced by state-of-the-art models that are only optimized for accuracy on three real-world publicly available datasets. The results show that including our temporal objective can improve recall@20 by up to 20%. △ Less

Submitted 5 September, 2021; v1 submitted 19 September, 2020; originally announced September 2020.

Comments: 11 pages, 3 figures, 2 tables, accepted at RecSys 2021 - Workshop on Perspectives on the Evaluation of Recommender Systems (PERSPECTIVES)

arXiv:2009.04695 [pdf, other]

Momentum-based Gradient Methods in Multi-Objective Recommendation

Authors: Blagoj Mitrevski, Milena Filipovic, Diego Antognini, Emma Lejal Glaude, Boi Faltings, Claudiu Musat

Abstract: Multi-objective gradient methods are becoming the standard for solving multi-objective problems. Among others, they show promising results in developing multi-objective recommender systems with both correlated and conflicting objectives. Classic multi-gradient~descent usually relies on the combination of the gradients, not including the computation of first and second moments of the gradients. Thi… ▽ More Multi-objective gradient methods are becoming the standard for solving multi-objective problems. Among others, they show promising results in developing multi-objective recommender systems with both correlated and conflicting objectives. Classic multi-gradient~descent usually relies on the combination of the gradients, not including the computation of first and second moments of the gradients. This leads to a brittle behavior and misses important areas in the solution space. In this work, we create a multi-objective model-agnostic Adamize method that leverages the benefits of the Adam optimizer in single-objective problems. This corrects and stabilizes~the~gradients of every objective before calculating a common gradient descent vector that optimizes all the objectives simultaneously. We evaluate the benefits of Multi-objective Adamize on two multi-objective recommender systems and for three different objective combinations, both correlated or conflicting. We report significant improvements, measured with three different Pareto front metrics: hypervolume, coverage, and spacing. Finally, we show that the \textit{Adamized} Pareto front strictly dominates the previous one on multiple objective pairs. △ Less

Submitted 1 September, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: 10 pages, 2 figures, 2 tables, accepted at RecSys 2021 - Workshop on Multi-Objective Recommender Systems (MORS)

arXiv:2009.04441 [pdf, other]

Addressing Fairness in Classification with a Model-Agnostic Multi-Objective Algorithm

Authors: Kirtan Padh, Diego Antognini, Emma Lejal Glaude, Boi Faltings, Claudiu Musat

Abstract: The goal of fairness in classification is to learn a classifier that does not discriminate against groups of individuals based on sensitive attributes, such as race and gender. One approach to designing fair algorithms is to use relaxations of fairness notions as regularization terms or in a constrained optimization problem. We observe that the hyperbolic tangent function can approximate the indic… ▽ More The goal of fairness in classification is to learn a classifier that does not discriminate against groups of individuals based on sensitive attributes, such as race and gender. One approach to designing fair algorithms is to use relaxations of fairness notions as regularization terms or in a constrained optimization problem. We observe that the hyperbolic tangent function can approximate the indicator function. We leverage this property to define a differentiable relaxation that approximates fairness notions provably better than existing relaxations. In addition, we propose a model-agnostic multi-objective architecture that can simultaneously optimize for multiple fairness notions and multiple sensitive attributes and supports all statistical parity-based notions of fairness. We use our relaxation with the multi-objective architecture to learn fair classifiers. Experiments on public datasets show that our method suffers a significantly lower loss of accuracy than current debiasing algorithms relative to the unconstrained model. △ Less

Submitted 8 June, 2021; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: Accepted at UAI 2021. 14 pages, 5 figures, 4 tables

arXiv:2007.12000 [pdf, other]

ADER: Adaptively Distilled Exemplar Replay Towards Continual Learning for Session-based Recommendation

Authors: Fei Mi, Xiaoyu Lin, Boi Faltings

Abstract: Session-based recommendation has received growing attention recently due to the increasing privacy concern. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and requires "continual learning" in real-li… ▽ More Session-based recommendation has received growing attention recently due to the increasing privacy concern. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and requires "continual learning" in real-life applications. In this case, the recommender is updated continually and periodically with new data that arrives in each update cycle, and the updated model needs to provide recommendations for user activities before the next model update. A major challenge for continual learning with neural models is catastrophic forgetting, in which a continually trained model forgets user preference patterns it has learned before. To deal with this challenge, we propose a method called Adaptively Distilled Exemplar Replay (ADER) by periodically replaying previous training samples (i.e., exemplars) to the current model with an adaptive distillation loss. Experiments are conducted based on the state-of-the-art SASRec model using two widely used datasets to benchmark ADER with several well-known continual learning techniques. We empirically demonstrate that ADER consistently outperforms other baselines, and it even outperforms the method using all historical data at every update cycle. This result reveals that ADER is a promising solution to mitigate the catastrophic forgetting issue towards building more realistic and scalable session-based recommenders. △ Less

Submitted 23 July, 2020; originally announced July 2020.

Comments: Accepted at RecSys 2020

arXiv:2005.11067 [pdf, other]

Interacting with Explanations through Critiquing

Authors: Diego Antognini, Claudiu Musat, Boi Faltings

Abstract: Using personalized explanations to support recommendations has been shown to increase trust and perceived quality. However, to actually obtain better recommendations, there needs to be a means for users to modify the recommendation criteria by interacting with the explanation. We present a novel technique using aspect markers that learns to generate personalized explanations of recommendations fro… ▽ More Using personalized explanations to support recommendations has been shown to increase trust and perceived quality. However, to actually obtain better recommendations, there needs to be a means for users to modify the recommendation criteria by interacting with the explanation. We present a novel technique using aspect markers that learns to generate personalized explanations of recommendations from review texts, and we show that human users significantly prefer these explanations over those produced by state-of-the-art techniques. Our work's most important innovation is that it allows users to react to a recommendation by critiquing the textual explanation: removing (symmetrically adding) certain aspects they dislike or that are no longer relevant (symmetrically that are of interest). The system updates its user model and the resulting recommendations according to the critique. This is based on a novel unsupervised critiquing method for single- and multi-step critiquing with textual explanations. Experiments on two real-world datasets show that our system is the first to achieve good performance in adapting to the preferences expressed in multi-step critiquing. △ Less

Submitted 12 January, 2022; v1 submitted 22 May, 2020; originally announced May 2020.

Comments: Accepted at IJCAI 2021. 15 pages, 10 figures, 12 tables

arXiv:2005.01573 [pdf, other]

Memory Augmented Neural Model for Incremental Session-based Recommendation

Authors: Fei Mi, Boi Faltings

Abstract: Increasing concerns with privacy have stimulated interests in Session-based Recommendation (SR) using no personal data other than what is observed in the current browser session. Existing methods are evaluated in static settings which rarely occur in real-world applications. To better address the dynamic nature of SR tasks, we study an incremental SR scenario, where new items and preferences appea… ▽ More Increasing concerns with privacy have stimulated interests in Session-based Recommendation (SR) using no personal data other than what is observed in the current browser session. Existing methods are evaluated in static settings which rarely occur in real-world applications. To better address the dynamic nature of SR tasks, we study an incremental SR scenario, where new items and preferences appear continuously. We show that existing neural recommenders can be used in incremental SR scenarios with small incremental updates to alleviate computation overhead and catastrophic forgetting. More importantly, we propose a general framework called Memory Augmented Neural model (MAN). MAN augments a base neural recommender with a continuously queried and updated nonparametric memory, and the predictions from the neural and the memory components are combined through another lightweight gating network. We empirically show that MAN is well-suited for the incremental SR task, and it consistently outperforms state-of-the-art neural and nonparametric methods. We analyze the results and demonstrate that it is particularly good at incrementally learning preferences on new and infrequent items. △ Less

Submitted 28 April, 2020; originally announced May 2020.

Comments: Accepted as a full paper at IJCAI 2020

arXiv:2003.00997 [pdf, other]

Generating Higher-Fidelity Synthetic Datasets with Privacy Guarantees

Authors: Aleksei Triastcyn, Boi Faltings

Abstract: This paper considers the problem of enhancing user privacy in common machine learning development tasks, such as data annotation and inspection, by substituting the real data with samples form a generative adversarial network. We propose employing Bayesian differential privacy as the means to achieve a rigorous theoretical guarantee while providing a better privacy-utility trade-off. We demonstrat… ▽ More This paper considers the problem of enhancing user privacy in common machine learning development tasks, such as data annotation and inspection, by substituting the real data with samples form a generative adversarial network. We propose employing Bayesian differential privacy as the means to achieve a rigorous theoretical guarantee while providing a better privacy-utility trade-off. We demonstrate experimentally that our approach produces higher-fidelity samples, compared to prior work, allowing to (1) detect more subtle data errors and biases, and (2) reduce the need for real data labelling by achieving high accuracy when training directly on artificial samples. △ Less

Submitted 2 March, 2020; originally announced March 2020.

Comments: 7 pages, 4 figures, 1 table

arXiv:2002.06854 [pdf, other]

HotelRec: a Novel Very Large-Scale Hotel Recommendation Dataset

Authors: Diego Antognini, Boi Faltings

Abstract: Today, recommender systems are an inevitable part of everyone's daily digital routine and are present on most internet platforms. State-of-the-art deep learning-based models require a large number of data to achieve their best performance. Many datasets fulfilling this criterion have been proposed for multiple domains, such as Amazon products, restaurants, or beers. However, works and datasets in… ▽ More Today, recommender systems are an inevitable part of everyone's daily digital routine and are present on most internet platforms. State-of-the-art deep learning-based models require a large number of data to achieve their best performance. Many datasets fulfilling this criterion have been proposed for multiple domains, such as Amazon products, restaurants, or beers. However, works and datasets in the hotel domain are limited: the largest hotel review dataset is below the million samples. Additionally, the hotel domain suffers from a higher data sparsity than traditional recommendation datasets and therefore, traditional collaborative-filtering approaches cannot be applied to such data. In this paper, we propose HotelRec, a very large-scale hotel recommendation dataset, based on TripAdvisor, containing 50 million reviews. To the best of our knowledge, HotelRec is the largest publicly available dataset in the hotel domain (50M versus 0.9M) and additionally, the largest recommendation dataset in a single domain and with textual reviews (50M versus 22M). We release HotelRec for further research: https://github.com/Diego999/HotelRec. △ Less

Submitted 17 February, 2020; originally announced February 2020.

Comments: 7 pages, 3 figure, 5 tables. Accepted at LREC 2020

arXiv:2002.06851 [pdf, other]

GameWikiSum: a Novel Large Multi-Document Summarization Dataset

Authors: Diego Antognini, Boi Faltings

Abstract: Today's research progress in the field of multi-document summarization is obstructed by the small number of available datasets. Since the acquisition of reference summaries is costly, existing datasets contain only hundreds of samples at most, resulting in heavy reliance on hand-crafted features or necessitating additional, manually annotated data. The lack of large corpora therefore hinders the d… ▽ More Today's research progress in the field of multi-document summarization is obstructed by the small number of available datasets. Since the acquisition of reference summaries is costly, existing datasets contain only hundreds of samples at most, resulting in heavy reliance on hand-crafted features or necessitating additional, manually annotated data. The lack of large corpora therefore hinders the development of sophisticated models. Additionally, most publicly available multi-document summarization corpora are in the news domain, and no analogous dataset exists in the video game domain. In this paper, we propose GameWikiSum, a new domain-specific dataset for multi-document summarization, which is one hundred times larger than commonly used datasets, and in another domain than news. Input documents consist of long professional video game reviews as well as references of their gameplay sections in Wikipedia pages. We analyze the proposed dataset and show that both abstractive and extractive models can be trained on it. We release GameWikiSum for further research: https://github.com/Diego999/GameWikiSum. △ Less

Submitted 17 February, 2020; originally announced February 2020.

Comments: 6 pages, 1 figure, 4 tables. Accepted at LREC 2020

arXiv:2001.00846 [pdf, other]

Multi-Gradient Descent for Multi-Objective Recommender Systems

Authors: Nikola Milojkovic, Diego Antognini, Giancarlo Bergamin, Boi Faltings, Claudiu Musat

Abstract: Recommender systems need to mirror the complexity of the environment they are applied in. The more we know about what might benefit the user, the more objectives the recommender system has. In addition there may be multiple stakeholders - sellers, buyers, shareholders - in addition to legal and ethical constraints. Simultaneously optimizing for a multitude of objectives, correlated and not correla… ▽ More Recommender systems need to mirror the complexity of the environment they are applied in. The more we know about what might benefit the user, the more objectives the recommender system has. In addition there may be multiple stakeholders - sellers, buyers, shareholders - in addition to legal and ethical constraints. Simultaneously optimizing for a multitude of objectives, correlated and not correlated, having the same scale or not, has proven difficult so far. We introduce a stochastic multi-gradient descent approach to recommender systems (MGDRec) to solve this problem. We show that this exceeds state-of-the-art methods in traditional objective mixtures, like revenue and recall. Not only that, but through gradient normalization we can combine fundamentally different objectives, having diverse scales, into a single coherent framework. We show that uncorrelated objectives, like the proportion of quality products, can be improved alongside accuracy. Through the use of stochasticity, we avoid the pitfalls of calculating full gradients and provide a clear setting for its applicability. △ Less

Submitted 17 April, 2020; v1 submitted 9 December, 2019; originally announced January 2020.

Comments: 9 pages, 4 figures, Accepted at AAAI 2020 - Workshop on Interactive and Conversational Recommendation Systems (WICRS)

arXiv:1912.08066 [pdf, other]

doi 10.1007/s10462-022-10145-0

Putting Ridesharing to the Test: Efficient and Scalable Solutions and the Power of Dynamic Vehicle Relocation

Authors: Panayiotis Danassis, Marija Sakota, Aris Filos-Ratsikas, Boi Faltings

Abstract: We study the optimization of large-scale, real-time ridesharing systems and propose a modular design methodology, Component Algorithms for Ridesharing (CAR). We evaluate a diverse set of CARs (14 in total), focusing on the key algorithmic components of ridesharing. We take a multi-objective approach, evaluating 12 metrics related to global efficiency, complexity, passenger, driver, and platform in… ▽ More We study the optimization of large-scale, real-time ridesharing systems and propose a modular design methodology, Component Algorithms for Ridesharing (CAR). We evaluate a diverse set of CARs (14 in total), focusing on the key algorithmic components of ridesharing. We take a multi-objective approach, evaluating 12 metrics related to global efficiency, complexity, passenger, driver, and platform incentives, in settings designed to closely resemble reality in every aspect, focusing on vehicles of capacity two. To the best of our knowledge, this is the largest and most comprehensive evaluation to date. We (i) identify CARs that perform well on global, passenger, driver or platform metrics, (ii) demonstrate that lightweight relocation schemes can significantly improve the Quality of Service by up to $50\%$, and (iii) highlight a practical, scalable, on-device CAR that works well across all metrics. △ Less

Submitted 20 June, 2022; v1 submitted 17 December, 2019; originally announced December 2019.

Comments: A version of this paper has been published in the Artificial Intelligence Review (https://doi.org/10.1007/s10462-022-10145-0)

Journal ref: Artificial Intelligence Review (2022)

arXiv:1911.10071 [pdf, other]

doi 10.1109/BigData47090.2019.9005465

Federated Learning with Bayesian Differential Privacy

Authors: Aleksei Triastcyn, Boi Faltings

Abstract: We consider the problem of reinforcing federated learning with formal privacy guarantees. We propose to employ Bayesian differential privacy, a relaxation of differential privacy for similarly distributed data, to provide sharper privacy loss bounds. We adapt the Bayesian privacy accounting method to the federated setting and suggest multiple improvements for more efficient privacy budgeting at di… ▽ More We consider the problem of reinforcing federated learning with formal privacy guarantees. We propose to employ Bayesian differential privacy, a relaxation of differential privacy for similarly distributed data, to provide sharper privacy loss bounds. We adapt the Bayesian privacy accounting method to the federated setting and suggest multiple improvements for more efficient privacy budgeting at different levels. Our experiments show significant advantage over the state-of-the-art differential privacy bounds for federated learning on image classification tasks, including a medical application, bringing the privacy budget below 1 at the client level, and below 0.1 at the instance level. Lower amounts of noise also benefit the model accuracy and reduce the number of communication rounds. △ Less

Submitted 22 November, 2019; originally announced November 2019.

Comments: Accepted at 2019 IEEE International Conference on Big Data (IEEE Big Data 2019). 10 pages, 2 figures, 4 tables. arXiv admin note: text overlap with arXiv:1901.09697

arXiv:1910.08385 [pdf, other]

Federated Generative Privacy

Authors: Aleksei Triastcyn, Boi Faltings

Abstract: In this paper, we propose FedGP, a framework for privacy-preserving data release in the federated learning setting. We use generative adversarial networks, generator components of which are trained by FedAvg algorithm, to draw privacy-preserving artificial data samples and empirically assess the risk of information disclosure. Our experiments show that FedGP is able to generate labelled data of hi… ▽ More In this paper, we propose FedGP, a framework for privacy-preserving data release in the federated learning setting. We use generative adversarial networks, generator components of which are trained by FedAvg algorithm, to draw privacy-preserving artificial data samples and empirically assess the risk of information disclosure. Our experiments show that FedGP is able to generate labelled data of high quality to successfully train and validate supervised models. Finally, we demonstrate that our approach significantly reduces vulnerability of such models to model inversion attacks. △ Less

Submitted 18 October, 2019; originally announced October 2019.

Comments: IJCAI Workshop on Federated Machine Learning for User Privacy and Data Confidentiality (FL-IJCAI 2019). 7 pages, 2 figures, 3 tables

arXiv:1909.12231 [pdf, other]

Learning to Create Sentence Semantic Relation Graphs for Multi-Document Summarization

Authors: Diego Antognini, Boi Faltings

Abstract: Linking facts across documents is a challenging task, as the language used to express the same information in a sentence can vary significantly, which complicates the task of multi-document summarization. Consequently, existing approaches heavily rely on hand-crafted features, which are domain-dependent and hard to craft, or additional annotated data, which is costly to gather. To overcome these l… ▽ More Linking facts across documents is a challenging task, as the language used to express the same information in a sentence can vary significantly, which complicates the task of multi-document summarization. Consequently, existing approaches heavily rely on hand-crafted features, which are domain-dependent and hard to craft, or additional annotated data, which is costly to gather. To overcome these limitations, we present a novel method, which makes use of two types of sentence embeddings: universal embeddings, which are trained on a large unrelated corpus, and domain-specific embeddings, which are learned during training. To this end, we develop SemSentSum, a fully data-driven model able to leverage both types of sentence embeddings by building a sentence semantic relation graph. SemSentSum achieves competitive results on two types of summary, consisting of 665 bytes and 100 words. Unlike other state-of-the-art models, neither hand-crafted features nor additional annotated data are necessary, and the method is easily adaptable for other tasks. To our knowledge, we are the first to use multiple sentence embeddings for the task of multi-document summarization. △ Less

Submitted 20 September, 2019; originally announced September 2019.

Comments: 10 pages, 4 tables, 1 figure, Accepted at 2019 Empirical Methods in Natural Language Processing - Workshop on New Frontiers in Summarization

arXiv:1909.11386 [pdf, other]

Multi-Dimensional Explanation of Target Variables from Documents

Authors: Diego Antognini, Claudiu Musat, Boi Faltings

Abstract: Automated predictions require explanations to be interpretable by humans. Past work used attention and rationale mechanisms to find words that predict the target variable of a document. Often though, they result in a tradeoff between noisy explanations or a drop in accuracy. Furthermore, rationale methods cannot capture the multi-faceted nature of justifications for multiple targets, because of th… ▽ More Automated predictions require explanations to be interpretable by humans. Past work used attention and rationale mechanisms to find words that predict the target variable of a document. Often though, they result in a tradeoff between noisy explanations or a drop in accuracy. Furthermore, rationale methods cannot capture the multi-faceted nature of justifications for multiple targets, because of the non-probabilistic nature of the mask. In this paper, we propose the Multi-Target Masker (MTM) to address these shortcomings. The novelty lies in the soft multi-dimensional mask that models a relevance probability distribution over the set of target variables to handle ambiguities. Additionally, two regularizers guide MTM to induce long, meaningful explanations. We evaluate MTM on two datasets and show, using standard metrics and human annotations, that the resulting masks are more accurate and coherent than those generated by the state-of-the-art methods. Moreover, MTM is the first to also achieve the highest F1 scores for all the target variables simultaneously. △ Less

Submitted 21 December, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

Comments: Accepted in AAAI 2021. 18 pages, 14 figures, 9 tables

arXiv:1909.02789 [pdf, ps, other]

An Effective Upperbound on Treewidth Using Partial Fill-in of Separators

Authors: Boi Faltings, Martin Charles Golumbic

Abstract: Partitioning a graph using graph separators, and particularly clique separators, are well-known techniques to decompose a graph into smaller units which can be treated independently. It was previously known that the treewidth was bounded above by the sum of the size of the separator plus the treewidth of disjoint components, and this was obtained by the heuristic of filling in all edges of the sep… ▽ More Partitioning a graph using graph separators, and particularly clique separators, are well-known techniques to decompose a graph into smaller units which can be treated independently. It was previously known that the treewidth was bounded above by the sum of the size of the separator plus the treewidth of disjoint components, and this was obtained by the heuristic of filling in all edges of the separator making it into a clique. In this paper, we present a new, tighter upper bound on the treewidth of a graph obtained by only partially filling in the edges of a separator. In particular, the method completes just those pairs of separator vertices that are adjacent to a common component, and indicates a more effective heuristic than filling in the entire separator. We discuss the relevance of this result for combinatorial algorithms and give an example of how the tighter bound can be exploited in the domain of constraint satisfaction problems. △ Less

Submitted 6 September, 2019; originally announced September 2019.

Comments: Unpublished Research Report: January 22, 2010

arXiv:1908.11598 [pdf, other]

Rewarding High-Quality Data via Influence Functions

Authors: Adam Richardson, Aris Filos-Ratsikas, Boi Faltings

Abstract: We consider a crowdsourcing data acquisition scenario, such as federated learning, where a Center collects data points from a set of rational Agents, with the aim of training a model. For linear regression models, we show how a payment structure can be designed to incentivize the agents to provide high-quality data as early as possible, based on a characterization of the influence that data points… ▽ More We consider a crowdsourcing data acquisition scenario, such as federated learning, where a Center collects data points from a set of rational Agents, with the aim of training a model. For linear regression models, we show how a payment structure can be designed to incentivize the agents to provide high-quality data as early as possible, based on a characterization of the influence that data points have on the loss function of the model. Our contributions can be summarized as follows: (a) we prove theoretically that this scheme ensures truthful data reporting as a game-theoretic equilibrium and further demonstrate its robustness against mixtures of truthful and heuristic data reports, (b) we design a procedure according to which the influence computation can be efficiently approximated and processed sequentially in batches over time, (c) we develop a theory that allows correcting the difference between the influence and the overall change in loss and (d) we evaluate our approach on real datasets, confirming our theoretical findings. △ Less

Submitted 30 August, 2019; originally announced August 2019.

arXiv:1908.10258 [pdf, other]

Infochain: A Decentralized, Trustless and Transparent Oracle on Blockchain

Authors: Naman Goel, Cyril van Schreven, Aris Filos-Ratsikas, Boi Faltings

Abstract: Blockchain based systems allow various kinds of financial transactions to be executed in a decentralized manner. However, these systems often rely on a trusted third party (oracle) to get correct information about the real-world events, which trigger the financial transactions. In this paper, we identify two biggest challenges in building decentralized, trustless and transparent oracles. The first… ▽ More Blockchain based systems allow various kinds of financial transactions to be executed in a decentralized manner. However, these systems often rely on a trusted third party (oracle) to get correct information about the real-world events, which trigger the financial transactions. In this paper, we identify two biggest challenges in building decentralized, trustless and transparent oracles. The first challenge is acquiring correct information about the real-world events without relying on a trusted information provider. We show how a peer-consistency incentive mechanism can be used to acquire truthful information from an untrusted and self-interested crowd, even when the crowd has outside incentives to provide wrong information. The second is a system design and implementation challenge. For the first time, we show how to implement a trustless and transparent oracle in Ethereum. We discuss various non-trivial issues that arise in implementing peer-consistency mechanisms in Ethereum, suggest several optimizations to reduce gas cost and provide empirical analysis. △ Less

Submitted 26 July, 2020; v1 submitted 27 August, 2019; originally announced August 2019.

arXiv:1906.06567 [pdf, other]

A Practical Solution to Yao's Millionaires' Problem and Its Application in Designing Secure Combinatorial Auction

Authors: Sankarshan Damle, Boi Faltings, Sujit Gujar

Abstract: The emergence of e-commerce and e-voting platforms has resulted in the rise in the volume of sensitive information over the Internet. This has resulted in an increased demand for secure and private means of information computation. Towards this, the Yao's Millionaires' problem, i.e., to determine the richer among two millionaires' securely, finds an application. In this work, we present a new solu… ▽ More The emergence of e-commerce and e-voting platforms has resulted in the rise in the volume of sensitive information over the Internet. This has resulted in an increased demand for secure and private means of information computation. Towards this, the Yao's Millionaires' problem, i.e., to determine the richer among two millionaires' securely, finds an application. In this work, we present a new solution to the Yao's Millionaires' problem namely, Privacy Preserving Comparison (PPC). We show that PPC achieves this comparison in constant time as well as in one execution. PPC uses semi-honest third parties for the comparison who do not learn any information about the values. Further, we show that PPC is collusion-resistance. To demonstrate the significance of PPC, we present a secure, approximate single-minded combinatorial auction, which we call TPACAS, i.e., Truthful, Privacy-preserving Approximate Combinatorial Auction for Single-minded bidders. We show that TPACAS, unlike previous works, preserves the following privacies relevant to an auction: agent privacy, the identities of the losing bidders must not be revealed to any other agent except the auctioneer (AU), bid privacy, the bid values must be hidden from the other agents as well as the AU and bid-topology privacy, the items for which the agents are bidding must be hidden from the other agents as well as the AU. We demonstrate the practicality of TPACAS through simulations. Lastly, we also look at TPACAS' implementation over a publicly distributed ledger, such as the Ethereum blockchain. △ Less

Submitted 15 June, 2019; originally announced June 2019.

arXiv:1905.05644 [pdf, other]

Meta-Learning for Low-resource Natural Language Generation in Task-oriented Dialogue Systems

Authors: Fei Mi, Minlie Huang, Jiyong Zhang, Boi Faltings

Abstract: Natural language generation (NLG) is an essential component of task-oriented dialogue systems. Despite the recent success of neural approaches for NLG, they are typically developed for particular domains with rich annotated training examples. In this paper, we study NLG in a low-resource setting to generate sentences in new scenarios with handful training examples. We formulate the problem from a… ▽ More Natural language generation (NLG) is an essential component of task-oriented dialogue systems. Despite the recent success of neural approaches for NLG, they are typically developed for particular domains with rich annotated training examples. In this paper, we study NLG in a low-resource setting to generate sentences in new scenarios with handful training examples. We formulate the problem from a meta-learning perspective, and propose a generalized optimization-based approach (Meta-NLG) based on the well-recognized model-agnostic meta-learning (MAML) algorithm. Meta-NLG defines a set of meta tasks, and directly incorporates the objective of adapting to new low-resource NLG tasks into the meta-learning optimization process. Extensive experiments are conducted on a large multi-domain dataset (MultiWoz) with diverse linguistic variations. We show that Meta-NLG significantly outperforms other training procedures in various low-resource configurations. We analyze the results, and demonstrate that Meta-NLG adapts extremely fast and well to low-resource situations. △ Less

Submitted 14 May, 2019; originally announced May 2019.

Comments: Accepted as a full paper at IJCAI 2019

arXiv:1902.09359 [pdf, other]

doi 10.24963/ijcai.2019/31

Anytime Heuristic for Weighted Matching Through Altruism-Inspired Behavior

Authors: Panayiotis Danassis, Aris Filos-Ratsikas, Boi Faltings

Abstract: We present a novel anytime heuristic (ALMA), inspired by the human principle of altruism, for solving the assignment problem. ALMA is decentralized, completely uncoupled, and requires no communication between the participants. We prove an upper bound on the convergence speed that is polynomial in the desired number of resources and competing agents per resource; crucially, in the realistic case wh… ▽ More We present a novel anytime heuristic (ALMA), inspired by the human principle of altruism, for solving the assignment problem. ALMA is decentralized, completely uncoupled, and requires no communication between the participants. We prove an upper bound on the convergence speed that is polynomial in the desired number of resources and competing agents per resource; crucially, in the realistic case where the aforementioned quantities are bounded independently of the total number of agents/resources, the convergence time remains constant as the total problem size increases. We have evaluated ALMA under three test cases: (i) an anti-coordination scenario where agents with similar preferences compete over the same set of actions, (ii) a resource allocation scenario in an urban environment, under a constant-time constraint, and finally, (iii) an on-line matching scenario using real passenger-taxi data. In all of the cases, ALMA was able to reach high social welfare, while being orders of magnitude faster than the centralized, optimal algorithm. The latter allows our algorithm to scale to realistic scenarios with hundreds of thousands of agents, e.g., vehicle coordination in urban environments. △ Less

Submitted 25 February, 2019; originally announced February 2019.

arXiv:1901.09697 [pdf, other]

Bayesian Differential Privacy for Machine Learning

Authors: Aleksei Triastcyn, Boi Faltings

Abstract: Traditional differential privacy is independent of the data distribution. However, this is not well-matched with the modern machine learning context, where models are trained on specific data. As a result, achieving meaningful privacy guarantees in ML often excessively reduces accuracy. We propose Bayesian differential privacy (BDP), which takes into account the data distribution to provide more p… ▽ More Traditional differential privacy is independent of the data distribution. However, this is not well-matched with the modern machine learning context, where models are trained on specific data. As a result, achieving meaningful privacy guarantees in ML often excessively reduces accuracy. We propose Bayesian differential privacy (BDP), which takes into account the data distribution to provide more practical privacy guarantees. We also derive a general privacy accounting method under BDP, building upon the well-known moments accountant. Our experiments demonstrate that in-distribution samples in classic machine learning datasets, such as MNIST and CIFAR-10, enjoy significantly stronger privacy guarantees than postulated by DP, while models maintain high classification accuracy. △ Less

Submitted 19 August, 2020; v1 submitted 28 January, 2019; originally announced January 2019.

Comments: International Conference on Machine Learning (ICML 2020). 15 pages, 5 figures, 1 table

arXiv:1810.13314 [pdf, other]

Crowdsourcing with Fairness, Diversity and Budget Constraints

Authors: Naman Goel, Boi Faltings

Abstract: Recent studies have shown that the labels collected from crowdworkers can be discriminatory with respect to sensitive attributes such as gender and race. This raises questions about the suitability of using crowdsourced data for further use, such as for training machine learning algorithms. In this work, we address the problem of fair and diverse data collection from a crowd under budget constrain… ▽ More Recent studies have shown that the labels collected from crowdworkers can be discriminatory with respect to sensitive attributes such as gender and race. This raises questions about the suitability of using crowdsourced data for further use, such as for training machine learning algorithms. In this work, we address the problem of fair and diverse data collection from a crowd under budget constraints. We propose a novel algorithm which maximizes the expected accuracy of the collected data, while ensuring that the errors satisfy desired notions of fairness. We provide guarantees on the performance of our algorithm and show that the algorithm performs well in practice through experiments on a real dataset. △ Less

Submitted 1 March, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

arXiv:1808.04056 [pdf, other]

Privacy Preserving and Cost Optimal Mobile Crowdsensing using Smart Contracts on Blockchain

Authors: Dimitris Chatzopoulos, Sujit Gujar, Boi Faltings, Pan Hui

Abstract: The popularity and applicability of mobile crowdsensing applications are continuously increasing due to the widespread of mobile devices and their sensing and processing capabilities. However, we need to offer appropriate incentives to the mobile users who contribute their resources and preserve their privacy. Blockchain technologies enable semi-anonymous multi-party interactions and can be utiliz… ▽ More The popularity and applicability of mobile crowdsensing applications are continuously increasing due to the widespread of mobile devices and their sensing and processing capabilities. However, we need to offer appropriate incentives to the mobile users who contribute their resources and preserve their privacy. Blockchain technologies enable semi-anonymous multi-party interactions and can be utilized in crowdsensing applications to maintain the privacy of the mobile users while ensuring first-rate crowdsensed data. In this work, we propose to use blockchain technologies and smart contracts to orchestrate the interactions between mobile crowdsensing providers and mobile users for the case of spatial crowdsensing, where mobile users need to be at specific locations to perform the tasks. Smart contracts, by operating as processes that are executed on the blockchain, are used to preserve users' privacy and make payments. Furthermore, for the assignment of the crowdsensing tasks to the mobile users, we design a truthful, cost-optimal auction that minimizes the payments from the crowdsensing providers to the mobile users. Extensive experimental results show that the proposed privacy preserving auction outperforms state-of-the-art proposals regarding cost by ten times for high numbers of mobile users and tasks. △ Less

Submitted 20 August, 2018; v1 submitted 12 August, 2018; originally announced August 2018.

Comments: Extended version of the paper accepted in IEEE MASS 2018

arXiv:1806.03733 [pdf, other]

Context Tree for Adaptive Session-based Recommendation

Authors: Fei Mi, Boi Faltings

Abstract: There has been growing interests in recent years from both practical and research perspectives for session-based recommendation tasks as long-term user profiles do not often exist in many real-life recommendation applications. In this case, recommendations for user's immediate next actions need to be generated based on patterns in anonymous short sessions. An often overlooked aspect is that new it… ▽ More There has been growing interests in recent years from both practical and research perspectives for session-based recommendation tasks as long-term user profiles do not often exist in many real-life recommendation applications. In this case, recommendations for user's immediate next actions need to be generated based on patterns in anonymous short sessions. An often overlooked aspect is that new items with limited observations arrive continuously in many domains (e.g. news and discussion forums). Therefore, recommendations need to be adaptive to such frequent changes. In this paper, we benchmark a new nonparametric method called context tree (CT) against various state-of-the-art methods on extensive datasets for session-based recommendation task. Apart from the standard static evaluation protocol adopted by previous literatures, we include an adaptive configuration to mimic the situation when new items with limited observations arrives continuously. Our results show that CT outperforms two best-performing approaches (recurrent neural network; heuristic-based nearest neighbor) in majority of the tested configurations and datasets. We analyze reasons for this and demonstrate that it is because of the better adaptation to changes in the domain, as well as the remarkable capability to learn static sequential patterns. Moreover, our running time analysis illustrates the efficiency of using CT as other nonparametric methods. △ Less

Submitted 10 June, 2018; originally announced June 2018.

arXiv:1804.10316 [pdf, other]

CompNet: Neural networks growing via the compact network morphism

Authors: Jun Lu, Wei Ma, Boi Faltings

Abstract: It is often the case that the performance of a neural network can be improved by adding layers. In real-world practices, we always train dozens of neural network architectures in parallel which is a wasteful process. We explored $CompNet$, in which case we morph a well-trained neural network to a deeper one where network function can be preserved and the added layer is compact. The work of the pap… ▽ More It is often the case that the performance of a neural network can be improved by adding layers. In real-world practices, we always train dozens of neural network architectures in parallel which is a wasteful process. We explored $CompNet$, in which case we morph a well-trained neural network to a deeper one where network function can be preserved and the added layer is compact. The work of the paper makes two contributions: a). The modified network can converge fast and keep the same functionality so that we do not need to train from scratch again; b). The layer size of the added layer in the neural network is controlled by removing the redundant parameters with sparse optimization. This differs from previous network morphism approaches which tend to add more neurons or channels beyond the actual requirements and result in redundance of the model. The method is illustrated using several neural network structures on different data sets including MNIST and CIFAR10. △ Less

Submitted 26 April, 2018; originally announced April 2018.

arXiv:1804.05560 [pdf, other]

Deep Bayesian Trust : A Dominant and Fair Incentive Mechanism for Crowd

Authors: Naman Goel, Boi Faltings

Abstract: An important class of game-theoretic incentive mechanisms for eliciting effort from a crowd are the peer based mechanisms, in which workers are paid by matching their answers with one another. The other classic mechanism is to have the workers solve some gold standard tasks and pay them according to their accuracy on gold tasks. This mechanism ensures stronger incentive compatibility than the peer… ▽ More An important class of game-theoretic incentive mechanisms for eliciting effort from a crowd are the peer based mechanisms, in which workers are paid by matching their answers with one another. The other classic mechanism is to have the workers solve some gold standard tasks and pay them according to their accuracy on gold tasks. This mechanism ensures stronger incentive compatibility than the peer based mechanisms but assigning gold tasks to all workers becomes inefficient at large scale. We propose a novel mechanism that assigns gold tasks to only a few workers and exploits transitivity to derive accuracy of the rest of the workers from their peers' accuracy. We show that the resulting mechanism ensures a dominant notion of incentive compatibility and fairness. △ Less

Submitted 14 November, 2018; v1 submitted 16 April, 2018; originally announced April 2018.

arXiv:1803.03148 [pdf, other]

Generating Artificial Data for Private Deep Learning

Authors: Aleksei Triastcyn, Boi Faltings

Abstract: In this paper, we propose generating artificial data that retain statistical properties of real data as the means of providing privacy with respect to the original dataset. We use generative adversarial network to draw privacy-preserving artificial data samples and derive an empirical method to assess the risk of information disclosure in a differential-privacy-like way. Our experiments show that… ▽ More In this paper, we propose generating artificial data that retain statistical properties of real data as the means of providing privacy with respect to the original dataset. We use generative adversarial network to draw privacy-preserving artificial data samples and derive an empirical method to assess the risk of information disclosure in a differential-privacy-like way. Our experiments show that we are able to generate artificial data of high quality and successfully train and validate machine learning models on this data while limiting potential privacy loss. △ Less

Submitted 28 April, 2019; v1 submitted 8 March, 2018; originally announced March 2018.

Comments: Privacy-Enhancing Artificial Intelligence and Language Technologies, AAAI Spring Symposium Series, 2019

Showing 1–50 of 63 results for author: Faltings, B