-
Flexible Stellarator Physics Facility
Authors:
F. I. Parra,
S. -G. Baek,
M. Churchill,
D. R. Demers,
B. Dudson,
N. M. Ferraro,
B. Geiger,
S. Gerhardt,
K. C. Hammond,
S. Hudson,
R. Jorge,
E. Kolemen,
D. M. Kriete,
S. T. A. Kumar,
M. Landreman,
C. Lowe,
D. A. Maurer,
F. Nespoli,
N. Pablant,
M. J. Pueschel,
A. Punjabi,
J. A. Schwartz,
C. P. S. Swanson,
A. M. Wright
Abstract:
We propose to build a Flexible Stellarator Physics Facility to explore promising regions of the vast parameter space of disruption-free stellarator solutions for Fusion Pilot Plants (FPPs).
We propose to build a Flexible Stellarator Physics Facility to explore promising regions of the vast parameter space of disruption-free stellarator solutions for Fusion Pilot Plants (FPPs).
△ Less
Submitted 4 July, 2024;
originally announced July 2024.
-
Generalization of Hamiltonian algorithms
Authors:
Andreas Maurer
Abstract:
The paper proves generalization results for a class of stochastic learning algorithms. The method applies whenever the algorithm generates an absolutely continuous distribution relative to some a-priori measure and the Radon Nikodym derivative has subgaussian concentration. Applications are bounds for the Gibbs algorithm and randomizations of stable deterministic algorithms as well as PAC-Bayesian…
▽ More
The paper proves generalization results for a class of stochastic learning algorithms. The method applies whenever the algorithm generates an absolutely continuous distribution relative to some a-priori measure and the Radon Nikodym derivative has subgaussian concentration. Applications are bounds for the Gibbs algorithm and randomizations of stable deterministic algorithms as well as PAC-Bayesian bounds with data-dependent priors.
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
Auditing health-related recommendations in social media: A Case Study of Abortion on YouTube
Authors:
Mohammed Lahsaini,
Mohamed Lechiakh,
Alexandre Maurer
Abstract:
Recommendation algorithms (RS) used by social media, like YouTube, significantly shape our information consumption across various domains, especially in healthcare. Hence, algorithmic auditing becomes crucial to uncover their potential bias and misinformation, particularly in the context of controversial topics like abortion. We introduce a simple yet effective sock puppet auditing approach to inv…
▽ More
Recommendation algorithms (RS) used by social media, like YouTube, significantly shape our information consumption across various domains, especially in healthcare. Hence, algorithmic auditing becomes crucial to uncover their potential bias and misinformation, particularly in the context of controversial topics like abortion. We introduce a simple yet effective sock puppet auditing approach to investigate how YouTube recommends abortion-related videos to individuals with different backgrounds. Our framework allows for efficient auditing of RS, regardless of the complexity of the underlying algorithms
△ Less
Submitted 11 April, 2024;
originally announced April 2024.
-
Weighted Monte Carlo augmented spherical Fourier-Bessel convolutional layers for 3D abdominal organ segmentation
Authors:
Wenzhao Zhao,
Steffen Albert,
Barbara D. Wichtmann,
Angelika Maurer,
Ulrike Attenberger,
Frank G. Zöllner,
Jürgen Hesser
Abstract:
Filter-decomposition-based group equivariant convolutional neural networks show promising stability and data efficiency for 3D image feature extraction. However, the existing filter-decomposition-based 3D group equivariant neural networks rely on parameter-sharing designs and are mostly limited to rotation transformation groups, where the chosen spherical harmonic filter bases consider only angula…
▽ More
Filter-decomposition-based group equivariant convolutional neural networks show promising stability and data efficiency for 3D image feature extraction. However, the existing filter-decomposition-based 3D group equivariant neural networks rely on parameter-sharing designs and are mostly limited to rotation transformation groups, where the chosen spherical harmonic filter bases consider only angular orthogonality. These limitations hamper its application to deep neural network architectures for medical image segmentation. To address these issues, this paper describes a non-parameter-sharing affine group equivariant neural network for 3D medical image segmentation based on an adaptive aggregation of Monte Carlo augmented spherical Fourier Bessel filter bases. The efficiency and flexibility of the adopted non-parameter-sharing strategy enable for the first time an efficient implementation of 3D affine group equivariant convolutional neural networks for volumetric data. The introduced spherical Bessel Fourier filter basis combines both angular and radial orthogonality for better feature extraction. The 3D image segmentation experiments on two abdominal medical image sets, BTCV and the NIH Pancreas datasets, show that the proposed methods excel the state-of-the-art 3D neural networks with high training stability and data efficiency. The code will be available at https://github.com/ZhaoWenzhao/WMCSFB.
△ Less
Submitted 9 March, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Adaptive aggregation of Monte Carlo augmented decomposed filters for efficient group-equivariant convolutional neural network
Authors:
Wenzhao Zhao,
Barbara D. Wichtmann,
Steffen Albert,
Angelika Maurer,
Frank G. Zöllner,
Ulrike Attenberger,
Jürgen Hesser
Abstract:
Group-equivariant convolutional neural networks (G-CNN) heavily rely on parameter sharing to increase CNN's data efficiency and performance. However, the parameter-sharing strategy greatly increases the computational burden for each added parameter, which hampers its application to deep neural network models. In this paper, we address these problems by proposing a non-parameter-sharing approach fo…
▽ More
Group-equivariant convolutional neural networks (G-CNN) heavily rely on parameter sharing to increase CNN's data efficiency and performance. However, the parameter-sharing strategy greatly increases the computational burden for each added parameter, which hampers its application to deep neural network models. In this paper, we address these problems by proposing a non-parameter-sharing approach for group equivariant neural networks. The proposed methods adaptively aggregate a diverse range of filters by a weighted sum of stochastically augmented decomposed filters. We give theoretical proof about how the continuous group convolution can be approximated by our methods. Our method applies to both continuous and discrete groups, where the augmentation is implemented using Monte Carlo sampling and bootstrap resampling, respectively. We demonstrate that our methods serve as an efficient extension of standard CNN. Experiments on group equivariance tests show how our methods can achieve superior performance to parameter-sharing group equivariant networks. Experiments on image classification and image denoising tasks show that in certain scenarios, with a suitable set of filter bases, our method helps improve the performance of standard CNNs and build efficient lightweight image denoising networks. The code will be available at https://github.com/ZhaoWenzhao/MCG_CNN.
△ Less
Submitted 1 May, 2024; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Generalization for slowly mixing processes
Authors:
Andreas Maurer
Abstract:
A bound uniform over various loss-classes is given for data generated by stationary and phi-mixing processes, where the mixing time (the time needed to obtain approximate independence) enters the sample complexity only in an additive way. For slowly mixing processes this can be a considerable advantage over results with multiplicative dependence on the mixing time. The admissible loss-classes incl…
▽ More
A bound uniform over various loss-classes is given for data generated by stationary and phi-mixing processes, where the mixing time (the time needed to obtain approximate independence) enters the sample complexity only in an additive way. For slowly mixing processes this can be a considerable advantage over results with multiplicative dependence on the mixing time. The admissible loss-classes include functions with prescribed Lipschitz norms or smoothness parameters. The bound can also be applied to be uniform over unconstrained loss-classes, where it depends on local Lipschitz properties of the function on the sample path.
△ Less
Submitted 1 June, 2023; v1 submitted 28 April, 2023;
originally announced May 2023.
-
Exponential finite sample bounds for incomplete U-statistics
Authors:
Andreas Maurer
Abstract:
Incomplete U-statistics have been proposed to accelerate computation. They use only a subset of the subsamples required for kernel evaluations by complete U-statistics. This paper gives a finite sample bound in the style of Bernstein's inequality. Applied to complete U-statistics the resulting inequality improves over the bounds of both Hoeffding and Arcones. For randomly determined subsamples it…
▽ More
Incomplete U-statistics have been proposed to accelerate computation. They use only a subset of the subsamples required for kernel evaluations by complete U-statistics. This paper gives a finite sample bound in the style of Bernstein's inequality. Applied to complete U-statistics the resulting inequality improves over the bounds of both Hoeffding and Arcones. For randomly determined subsamples it is shown, that, as soon as the their number reaches the square of the sample-size, the same order bound is obtained as for the complete statistic.
△ Less
Submitted 7 July, 2022;
originally announced July 2022.
-
Concentration of the missing mass in metric spaces
Authors:
Andreas Maurer
Abstract:
We study the estimation and concentration on its expectation of the probability to observe data further than a specified distance from a given iid sample in a metric space. The problem extends the classical problem of estimation of the missing mass in discrete spaces. We give some estimators for the conditional missing mass and show that estimation of the expected missing mass is difficult in gene…
▽ More
We study the estimation and concentration on its expectation of the probability to observe data further than a specified distance from a given iid sample in a metric space. The problem extends the classical problem of estimation of the missing mass in discrete spaces. We give some estimators for the conditional missing mass and show that estimation of the expected missing mass is difficult in general. Conditions on the distribution, under which the Good-Turing estimator and the conditional missing mass concentrate on their expectations are identified. Applications to anomaly detection, coding, the Wasserstein distance between true and empirical measure and simple learning bounds are sketched.
△ Less
Submitted 22 November, 2022; v1 submitted 4 June, 2022;
originally announced June 2022.
-
Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces
Authors:
Vladimir Kostic,
Pietro Novelli,
Andreas Maurer,
Carlo Ciliberto,
Lorenzo Rosasco,
Massimiliano Pontil
Abstract:
We study a class of dynamical systems modelled as Markov chains that admit an invariant distribution via the corresponding transfer, or Koopman, operator. While data-driven algorithms to reconstruct such operators are well known, their relationship with statistical learning is largely unexplored. We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical…
▽ More
We study a class of dynamical systems modelled as Markov chains that admit an invariant distribution via the corresponding transfer, or Koopman, operator. While data-driven algorithms to reconstruct such operators are well known, their relationship with statistical learning is largely unexplored. We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical system. We consider the restriction of this operator to a reproducing kernel Hilbert space and introduce a notion of risk, from which different estimators naturally arise. We link the risk with the estimation of the spectral decomposition of the Koopman operator. These observations motivate a reduced-rank operator regression (RRR) estimator. We derive learning bounds for the proposed estimator, holding both in i.i.d. and non i.i.d. settings, the latter in terms of mixing coefficients. Our results suggest RRR might be beneficial over other widely used estimators as confirmed in numerical experiments both for forecasting and mode decomposition.
△ Less
Submitted 13 December, 2022; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Actionable Conversational Quality Indicators for Improving Task-Oriented Dialog Systems
Authors:
Michael Higgins,
Dominic Widdows,
Chris Brew,
Gwen Christian,
Andrew Maurer,
Matthew Dunn,
Sujit Mathi,
Akshay Hazare,
George Bonev,
Beth Ann Hockey,
Kristen Howell,
Joe Bradley
Abstract:
Automatic dialog systems have become a mainstream part of online customer service. Many such systems are built, maintained, and improved by customer service specialists, rather than dialog systems engineers and computer programmers. As conversations between people and machines become commonplace, it is critical to understand what is working, what is not, and what actions can be taken to reduce the…
▽ More
Automatic dialog systems have become a mainstream part of online customer service. Many such systems are built, maintained, and improved by customer service specialists, rather than dialog systems engineers and computer programmers. As conversations between people and machines become commonplace, it is critical to understand what is working, what is not, and what actions can be taken to reduce the frequency of inappropriate system responses. These analyses and recommendations need to be presented in terms that directly reflect the user experience rather than the internal dialog processing.
This paper introduces and explains the use of Actionable Conversational Quality Indicators (ACQIs), which are used both to recognize parts of dialogs that can be improved, and to recommend how to improve them. This combines benefits of previous approaches, some of which have focused on producing dialog quality scoring while others have sought to categorize the types of errors the dialog system is making.
We demonstrate the effectiveness of using ACQIs on LivePerson internal dialog systems used in commercial customer service applications, and on the publicly available CMU LEGOv2 conversational dataset (Raux et al. 2005). We report on the annotation and analysis of conversational datasets showing which ACQIs are important to fix in various situations.
The annotated datasets are then used to build a predictive model which uses a turn-based vector embedding of the message texts and achieves an 79% weighted average f1-measure at the task of finding the correct ACQI for a given conversation. We predict that if such a model worked perfectly, the range of potential improvement actions a bot-builder must consider at each turn could be reduced by an average of 81%.
△ Less
Submitted 22 September, 2021;
originally announced September 2021.
-
FEBR: Expert-Based Recommendation Framework for beneficial and personalized content
Authors:
Mohamed Lechiakh,
Alexandre Maurer
Abstract:
So far, most research on recommender systems focused on maintaining long-term user engagement and satisfaction, by promoting relevant and personalized content. However, it is still very challenging to evaluate the quality and the reliability of this content. In this paper, we propose FEBR (Expert-Based Recommendation Framework), an apprenticeship learning framework to assess the quality of the rec…
▽ More
So far, most research on recommender systems focused on maintaining long-term user engagement and satisfaction, by promoting relevant and personalized content. However, it is still very challenging to evaluate the quality and the reliability of this content. In this paper, we propose FEBR (Expert-Based Recommendation Framework), an apprenticeship learning framework to assess the quality of the recommended content on online platforms. The framework exploits the demonstrated trajectories of an expert (assumed to be reliable) in a recommendation evaluation environment, to recover an unknown utility function. This function is used to learn an optimal policy describing the expert's behavior, which is then used in the framework to provide high-quality and personalized recommendations. We evaluate the performance of our solution through a user interest simulation environment (using RecSim). We simulate interactions under the aforementioned expert policy for videos recommendation, and compare its efficiency with standard recommendation methods. The results show that our approach provides a significant gain in terms of content quality, evaluated by experts and watched by users, while maintaining almost the same watch time as the baseline approaches.
△ Less
Submitted 3 November, 2021; v1 submitted 17 July, 2021;
originally announced August 2021.
-
Tournesol: A quest for a large, secure and trustworthy database of reliable human judgments
Authors:
Lê-Nguyên Hoang,
Louis Faucon,
Aidan Jungo,
Sergei Volodin,
Dalia Papuc,
Orfeas Liossatos,
Ben Crulis,
Mariame Tighanimine,
Isabela Constantin,
Anastasiia Kucherenko,
Alexandre Maurer,
Felix Grimberg,
Vlad Nitu,
Chris Vossen,
Sébastien Rouault,
El-Mahdi El-Mhamdi
Abstract:
Today's large-scale algorithms have become immensely influential, as they recommend and moderate the content that billions of humans are exposed to on a daily basis. They are the de-facto regulators of our societies' information diet, from shaping opinions on public health to organizing groups for social movements. This creates serious concerns, but also great opportunities to promote quality info…
▽ More
Today's large-scale algorithms have become immensely influential, as they recommend and moderate the content that billions of humans are exposed to on a daily basis. They are the de-facto regulators of our societies' information diet, from shaping opinions on public health to organizing groups for social movements. This creates serious concerns, but also great opportunities to promote quality information. Addressing the concerns and seizing the opportunities is a challenging, enormous and fabulous endeavor, as intuitively appealing ideas often come with unwanted {\it side effects}, and as it requires us to think about what we deeply prefer.
Understanding how today's large-scale algorithms are built is critical to determine what interventions will be most effective. Given that these algorithms rely heavily on {\it machine learning}, we make the following key observation: \emph{any algorithm trained on uncontrolled data must not be trusted}. Indeed, a malicious entity could take control over the data, poison it with dangerously manipulative fabricated inputs, and thereby make the trained algorithm extremely unsafe. We thus argue that the first step towards safe and ethical large-scale algorithms must be the collection of a large, secure and trustworthy dataset of reliable human judgments.
To achieve this, we introduce \emph{Tournesol}, an open source platform available at \url{https://tournesol.app}. Tournesol aims to collect a large database of human judgments on what algorithms ought to widely recommend (and what they ought to stop widely recommending). We outline the structure of the Tournesol database, the key features of the Tournesol platform and the main hurdles that must be overcome to make it a successful project. Most importantly, we argue that, if successful, Tournesol may then serve as the essential foundation for any safe and ethical large-scale algorithm.
△ Less
Submitted 29 May, 2021;
originally announced July 2021.
-
Some Hoeffding- and Bernstein-type Concentration Inequalities
Authors:
Andreas Maurer,
Massimiliano Pontil
Abstract:
We prove concentration inequalities for functions of independent random variables {under} sub-gaussian and sub-exponential conditions. The utility of the inequalities is demonstrated by an extension of the now classical method of Rademacher complexities to Lipschitz function classes and unbounded sub-exponential distribution.
We prove concentration inequalities for functions of independent random variables {under} sub-gaussian and sub-exponential conditions. The utility of the inequalities is demonstrated by an extension of the now classical method of Rademacher complexities to Lipschitz function classes and unbounded sub-exponential distribution.
△ Less
Submitted 23 June, 2021; v1 submitted 11 February, 2021;
originally announced February 2021.
-
Robust Unsupervised Learning via L-Statistic Minimization
Authors:
Andreas Maurer,
Daniela A. Parletta,
Andrea Paudice,
Massimiliano Pontil
Abstract:
Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance. We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. This is exploited by…
▽ More
Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance. We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. This is exploited by a general descent algorithm which minimizes an $L$-statistic criterion over the model class, weighting small losses more. Our analysis characterizes the robustness of the method in terms of bounds on the reconstruction error relative to the underlying unperturbed distribution. As a byproduct, we prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning, a result which may be of independent interest.Numerical experiments with kmeans clustering and principal subspace analysis demonstrate the effectiveness of our approach.
△ Less
Submitted 18 February, 2021; v1 submitted 14 December, 2020;
originally announced December 2020.
-
Advances in neutral tungsten ultraviolet spectroscopy for the potential benefit to gross erosion diagnosis
Authors:
C. A. Johnson,
D. A. Ennis,
S. D. Loch,
G. J. Hartwell,
D. A. Maurer,
S. L. Allen,
B. S. Victor,
C. M. Samuell,
T. Abrams,
E. A. Unterberg,
R. T. Smyth
Abstract:
A spectral survey of tungsten emission in the ultraviolet region has been completed in the DIII-D tokamak and the CTH torsatron to assess the potential benefit of UV emission for the diagnosis of gross W erosion. A total of 29 W I spectral lines are observed from the two experiments using survey spectrometers between 200-400 nm with level identifications provided based on a structure calculation f…
▽ More
A spectral survey of tungsten emission in the ultraviolet region has been completed in the DIII-D tokamak and the CTH torsatron to assess the potential benefit of UV emission for the diagnosis of gross W erosion. A total of 29 W I spectral lines are observed from the two experiments using survey spectrometers between 200-400 nm with level identifications provided based on a structure calculation for many of the excited states that produce strong emission lines. Of the 29 observed lines, 20 have not previously been reported in fusion relevant plasmas, including an intense line at 265.65 nm which could be important for benchmarking the frequently exploited line at 400.88 nm. Nearly all of the observed spectral lines decay down to one of the six lowest energy levels for neutral W, which are likely to be long-lived metastable states. The impact of metastable level populations on the W I emission spectrum and any erosion measurement utilizing a spectroscopic technique is potentially significant. Nevertheless, the high density of W I emission in the UV region allows for the possibly of determining the relative metastable fractions and plasma parameters local to the erosion region. Additionally, the lines observed in this work could be used to perform multiple independent gross erosion measurements, leading to more accurate diagnosis of gross tungsten erosion.
△ Less
Submitted 16 December, 2019;
originally announced December 2019.
-
Learning Fair and Transferable Representations
Authors:
Luca Oneto,
Michele Donini,
Andreas Maurer,
Massimiliano Pontil
Abstract:
Developing learning methods which do not discriminate subgroups in the population is a central goal of algorithmic fairness. One way to reach this goal is by modifying the data representation in order to meet certain fairness constraints. In this work we measure fairness according to demographic parity. This requires the probability of the possible model decisions to be independent of the sensitiv…
▽ More
Developing learning methods which do not discriminate subgroups in the population is a central goal of algorithmic fairness. One way to reach this goal is by modifying the data representation in order to meet certain fairness constraints. In this work we measure fairness according to demographic parity. This requires the probability of the possible model decisions to be independent of the sensitive information. We argue that the goal of imposing demographic parity can be substantially facilitated within a multitask learning setting. We leverage task similarities by encouraging a shared fair representation across the tasks via low rank matrix factorization. We derive learning bounds establishing that the learned representation transfers well to novel tasks both in terms of prediction performance and fairness metrics. We present experiments on three real world datasets, showing that the proposed method outperforms state-of-the-art approaches by a significant margin.
△ Less
Submitted 31 January, 2020; v1 submitted 25 June, 2019;
originally announced June 2019.
-
Gathering with extremely restricted visibility
Authors:
Rachid Guerraoui,
Alexandre Maurer
Abstract:
We consider the classical problem of making mobile processes gather or converge at a same position (as performed by swarms of animals in Nature). Existing works assume that each process can see all other processes, or all processes within a certain radius. In this paper, we introduce a new model with an extremely restricted visibility: each process can only see one other process (its closest neigh…
▽ More
We consider the classical problem of making mobile processes gather or converge at a same position (as performed by swarms of animals in Nature). Existing works assume that each process can see all other processes, or all processes within a certain radius. In this paper, we introduce a new model with an extremely restricted visibility: each process can only see one other process (its closest neighbor). Our goal is to see if (and to what extent) the gathering and convergence problems can be solved in this setting. We first show that, surprisingly, the problem can be solved for a small number of processes (at most 5), but not beyond. This is due to indeterminacy in the case where there are several closest neighbors for a same process. By removing this indeterminacy with an additional hypothesis (choosing the closest neighbor according to an order on the positions of processes), we then show that the problem can be solved for any number of processes. We also show that up to one crash failure can be tolerated for the convergence problem.
△ Less
Submitted 14 June, 2019;
originally announced June 2019.
-
Uniform concentration and symmetrization for weak interactions
Authors:
Andreas Maurer,
Massimiliano Pontil
Abstract:
The method to derive uniform bounds with Gaussian and Rademacher complexities is extended to the case where the sample average is replaced by a nonlinear statistic. Tight bounds are obtained for U-statistics, smoothened L-statistics and error functionals of l2-regularized algorithms.
The method to derive uniform bounds with Gaussian and Rademacher complexities is extended to the case where the sample average is replaced by a nonlinear statistic. Tight bounds are obtained for U-statistics, smoothened L-statistics and error functionals of l2-regularized algorithms.
△ Less
Submitted 10 May, 2019; v1 submitted 5 February, 2019;
originally announced February 2019.
-
Limiting the Spread of Fake News on Social Media Platforms by Evaluating Users' Trustworthiness
Authors:
Oana Balmau,
Rachid Guerraoui,
Anne-Marie Kermarrec,
Alexandre Maurer,
Matej Pavlovic,
Willy Zwaenepoel
Abstract:
Today's social media platforms enable to spread both authentic and fake news very quickly. Some approaches have been proposed to automatically detect such "fake" news based on their content, but it is difficult to agree on universal criteria of authenticity (which can be bypassed by adversaries once known). Besides, it is obviously impossible to have each news item checked by a human.
In this pa…
▽ More
Today's social media platforms enable to spread both authentic and fake news very quickly. Some approaches have been proposed to automatically detect such "fake" news based on their content, but it is difficult to agree on universal criteria of authenticity (which can be bypassed by adversaries once known). Besides, it is obviously impossible to have each news item checked by a human.
In this paper, we a mechanism to limit the spread of fake news which is not based on content. It can be implemented as a plugin on a social media platform. The principle is as follows: a team of fact-checkers reviews a small number of news items (the most popular ones), which enables to have an estimation of each user's inclination to share fake news items. Then, using a Bayesian approach, we estimate the trustworthiness of future news items, and treat accordingly those of them that pass a certain "untrustworthiness" threshold.
We then evaluate the effectiveness and overhead of this technique on a large Twitter graph. We show that having a few thousands users exposed to one given news item enables to reach a very precise estimation of its reliability. We thus identify more than 99% of fake news items with no false positives. The performance impact is very small: the induced overhead on the 90th percentile latency is less than 3%, and less than 8% on the throughput of user operations.
△ Less
Submitted 29 August, 2018;
originally announced August 2018.
-
Removing Algorithmic Discrimination (With Minimal Individual Error)
Authors:
El Mahdi El Mhamdi,
Rachid Guerraoui,
Lê Nguyên Hoang,
Alexandre Maurer
Abstract:
We address the problem of correcting group discriminations within a score function, while minimizing the individual error. Each group is described by a probability density function on the set of profiles. We first solve the problem analytically in the case of two populations, with a uniform bonus-malus on the zones where each population is a majority. We then address the general case of n populati…
▽ More
We address the problem of correcting group discriminations within a score function, while minimizing the individual error. Each group is described by a probability density function on the set of profiles. We first solve the problem analytically in the case of two populations, with a uniform bonus-malus on the zones where each population is a majority. We then address the general case of n populations, where the entanglement of populations does not allow a similar analytical solution. We show that an approximate solution with an arbitrarily high level of precision can be computed with linear programming. Finally, we address the inverse problem where the error should not go beyond a certain value and we seek to minimize the discrimination.
△ Less
Submitted 7 June, 2018;
originally announced June 2018.
-
Virtuously Safe Reinforcement Learning
Authors:
Henrik Aslund,
El Mahdi El Mhamdi,
Rachid Guerraoui,
Alexandre Maurer
Abstract:
We show that when a third party, the adversary, steps into the two-party setting (agent and operator) of safely interruptible reinforcement learning, a trade-off has to be made between the probability of following the optimal policy in the limit, and the probability of escaping a dangerous situation created by the adversary. So far, the work on safely interruptible agents has assumed a perfect per…
▽ More
We show that when a third party, the adversary, steps into the two-party setting (agent and operator) of safely interruptible reinforcement learning, a trade-off has to be made between the probability of following the optimal policy in the limit, and the probability of escaping a dangerous situation created by the adversary. So far, the work on safely interruptible agents has assumed a perfect perception of the agent about its environment (no adversary), and therefore implicitly set the second probability to zero, by explicitly seeking a value of one for the first probability. We show that (1) agents can be made both interruptible and adversary-resilient, and (2) the interruptibility can be made safe in the sense that the agent itself will not seek to avoid it. We also solve the problem that arises when the agent does not go completely greedy, i.e. issues with safe exploration in the limit. Resilience to perturbed perception, safe exploration in the limit, and safe interruptibility are the three pillars of what we call \emph{virtuously safe reinforcement learning}.
△ Less
Submitted 29 May, 2018;
originally announced May 2018.
-
Empirical bounds for functions with weak interactions
Authors:
Andreas Maurer,
Massimiliano Pontil
Abstract:
We provide sharp empirical estimates of expectation, variance and normal approximation for a class of statistics whose variation in any argument does not change too much when another argument is modified. Examples of such weak interactions are furnished by U- and V-statistics, Lipschitz L-statistics and various error functionals of L2-regularized algorithms and Gibbs algorithms.
We provide sharp empirical estimates of expectation, variance and normal approximation for a class of statistics whose variation in any argument does not change too much when another argument is modified. Examples of such weak interactions are furnished by U- and V-statistics, Lipschitz L-statistics and various error functionals of L2-regularized algorithms and Gibbs algorithms.
△ Less
Submitted 11 March, 2018;
originally announced March 2018.
-
Learning to Gather without Communication
Authors:
El Mahdi El Mhamdi,
Rachid Guerraoui,
Alexandre Maurer,
Vladislav Tempez
Abstract:
A standard belief on emerging collective behavior is that it emerges from simple individual rules. Most of the mathematical research on such collective behavior starts from imperative individual rules, like always go to the center. But how could an (optimal) individual rule emerge during a short period within the group lifetime, especially if communication is not available. We argue that such rule…
▽ More
A standard belief on emerging collective behavior is that it emerges from simple individual rules. Most of the mathematical research on such collective behavior starts from imperative individual rules, like always go to the center. But how could an (optimal) individual rule emerge during a short period within the group lifetime, especially if communication is not available. We argue that such rules can actually emerge in a group in a short span of time via collective (multi-agent) reinforcement learning, i.e learning via rewards and punishments. We consider the gathering problem: several agents (social animals, swarming robots...) must gather around a same position, which is not determined in advance. They must do so without communication on their planned decision, just by looking at the position of other agents. We present the first experimental evidence that a gathering behavior can be learned without communication in a partially observable environment. The learned behavior has the same properties as a self-stabilizing distributed algorithm, as processes can gather from any initial state (and thus tolerate any transient failure). Besides, we show that it is possible to tolerate the brutal loss of up to 90\% of agents without significant impact on the behavior.
△ Less
Submitted 21 February, 2018;
originally announced February 2018.
-
On The Finite Generation Of Relative Cohomology For Lie Superalgebras
Authors:
Andrew Maurer
Abstract:
The author establishes finite-generation of the cohomology ring of a classical Lie superalgebra relative to an even subsuperalgebra. A spectral sequence is constructed to provide conditions for when this relative cohomology ring is Cohen-Macaulay. With finite generation established, support varieties for modules are defined via the relative cohomology, which generalize those of [BKN-1]
The author establishes finite-generation of the cohomology ring of a classical Lie superalgebra relative to an even subsuperalgebra. A spectral sequence is constructed to provide conditions for when this relative cohomology ring is Cohen-Macaulay. With finite generation established, support varieties for modules are defined via the relative cohomology, which generalize those of [BKN-1]
△ Less
Submitted 27 July, 2018; v1 submitted 6 November, 2017;
originally announced November 2017.
-
Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning
Authors:
El Mahdi El Mhamdi,
Rachid Guerraoui,
Hadrien Hendrikx,
Alexandre Maurer
Abstract:
In reinforcement learning, agents learn by performing actions and observing their outcomes. Sometimes, it is desirable for a human operator to \textit{interrupt} an agent in order to prevent dangerous situations from happening. Yet, as part of their learning process, agents may link these interruptions, that impact their reward, to specific states and deliberately avoid them. The situation is part…
▽ More
In reinforcement learning, agents learn by performing actions and observing their outcomes. Sometimes, it is desirable for a human operator to \textit{interrupt} an agent in order to prevent dangerous situations from happening. Yet, as part of their learning process, agents may link these interruptions, that impact their reward, to specific states and deliberately avoid them. The situation is particularly challenging in a multi-agent context because agents might not only learn from their own past interruptions, but also from those of other agents. Orseau and Armstrong defined \emph{safe interruptibility} for one learner, but their work does not naturally extend to multi-agent systems. This paper introduces \textit{dynamic safe interruptibility}, an alternative definition more suited to decentralized learning problems, and studies this notion in two learning frameworks: \textit{joint action learners} and \textit{independent learners}. We give realistic sufficient conditions on the learning algorithm to enable dynamic safe interruptibility in the case of joint action learners, yet show that these conditions are not sufficient for independent learners. We show however that if agents can detect interruptions, it is possible to prune the observations to ensure dynamic safe interruptibility even for independent learners.
△ Less
Submitted 22 May, 2017; v1 submitted 10 April, 2017;
originally announced April 2017.
-
A Bernstein-type inequality for functions of bounded interaction
Authors:
Andreas Maurer
Abstract:
We give a distribution-dependent concentration inequality for functions of independent variables. The result extends Bernstein's inequality from sums to more general functions, whose variation in any argument does not depend too much on the other arguments. Applications sharpen existing bounds for U-statistics and the generalization error of regularized least squares.
We give a distribution-dependent concentration inequality for functions of independent variables. The result extends Bernstein's inequality from sums to more general functions, whose variation in any argument does not depend too much on the other arguments. Applications sharpen existing bounds for U-statistics and the generalization error of regularized least squares.
△ Less
Submitted 11 May, 2017; v1 submitted 22 January, 2017;
originally announced January 2017.
-
Bounds for Vector-Valued Function Estimation
Authors:
Andreas Maurer,
Massimiliano Pontil
Abstract:
We present a framework to derive risk bounds for vector-valued learning with a broad class of feature maps and loss functions. Multi-task learning and one-vs-all multi-category learning are treated as examples. We discuss in detail vector-valued functions with one hidden layer, and demonstrate that the conditions under which shared representations are beneficial for multi- task learning are equall…
▽ More
We present a framework to derive risk bounds for vector-valued learning with a broad class of feature maps and loss functions. Multi-task learning and one-vs-all multi-category learning are treated as examples. We discuss in detail vector-valued functions with one hidden layer, and demonstrate that the conditions under which shared representations are beneficial for multi- task learning are equally applicable to multi-category learning.
△ Less
Submitted 5 June, 2016;
originally announced June 2016.
-
A vector-contraction inequality for Rademacher complexities
Authors:
Andreas Maurer
Abstract:
The contraction inequality for Rademacher averages is extended to Lipschitz functions with vector-valued domains, and it is also shown that in the bounding expression the Rademacher variables can be replaced by arbitrary iid symmetric and sub-gaussian variables. Example applications are given for multi-category learning, K-means clustering and learning-to-learn.
The contraction inequality for Rademacher averages is extended to Lipschitz functions with vector-valued domains, and it is also shown that in the bounding expression the Rademacher variables can be replaced by arbitrary iid symmetric and sub-gaussian variables. Example applications are given for multi-category learning, K-means clustering and learning-to-learn.
△ Less
Submitted 1 May, 2016;
originally announced May 2016.
-
The Benefit of Multitask Representation Learning
Authors:
Andreas Maurer,
Massimiliano Pontil,
Bernardino Romera-Paredes
Abstract:
We discuss a general method to learn data representations from multiple tasks. We provide a justification for this method in both settings of multitask learning and learning-to-learn. The method is illustrated in detail in the special case of linear feature learning. Conditions on the theoretical advantage offered by multitask representation learning over independent task learning are established.…
▽ More
We discuss a general method to learn data representations from multiple tasks. We provide a justification for this method in both settings of multitask learning and learning-to-learn. The method is illustrated in detail in the special case of linear feature learning. Conditions on the theoretical advantage offered by multitask representation learning over independent task learning are established. In particular, focusing on the important example of half-space learning, we derive the regime in which multitask representation learning is beneficial over independent task learning, as a function of the sample size, the number of tasks and the intrinsic data dimensionality. Other potential applications of our results include multitask feature learning in reproducing kernel Hilbert spaces and multilayer, deep networks.
△ Less
Submitted 25 March, 2016; v1 submitted 23 May, 2015;
originally announced May 2015.
-
Uniform Estimation Beyond the Mean
Authors:
Andreas Maurer
Abstract:
Finite sample bounds on the estimation error of the mean by the empirical mean, uniform over a class of functions, can often be conveniently obtained in terms of Rademacher or Gaussian averages of the class. If a function of n variables has suitably bounded partial derivatives, it can be substituted for the empirical mean, with uniform estimation again controlled by Gaussian averages. Up to a cons…
▽ More
Finite sample bounds on the estimation error of the mean by the empirical mean, uniform over a class of functions, can often be conveniently obtained in terms of Rademacher or Gaussian averages of the class. If a function of n variables has suitably bounded partial derivatives, it can be substituted for the empirical mean, with uniform estimation again controlled by Gaussian averages. Up to a constant the result recovers standard results for the empirical mean and more recent ones about U-statistics, and extends to a general class of estimation problems.
△ Less
Submitted 7 March, 2015;
originally announced March 2015.
-
A chain rule for the expected suprema of Gaussian processes
Authors:
Andreas Maurer
Abstract:
The expected supremum of a Gaussian process indexed by the image of an index set under a function class is bounded in terms of separate properties of the index set and the function class. The bound is relevant to the estimation of nonlinear transformations or the analysis of learning algorithms whenever hypotheses are chosen from composite classes, as is the case for multi-layer models.
The expected supremum of a Gaussian process indexed by the image of an index set under a function class is bounded in terms of separate properties of the index set and the function class. The bound is relevant to the estimation of nonlinear transformations or the analysis of learning algorithms whenever hypotheses are chosen from composite classes, as is the case for multi-layer models.
△ Less
Submitted 10 November, 2014;
originally announced November 2014.
-
An Inequality with Applications to Structured Sparsity and Multitask Dictionary Learning
Authors:
Andreas Maurer,
Massimiliano Pontil,
Bernardino Romera-Paredes
Abstract:
From concentration inequalities for the suprema of Gaussian or Rademacher processes an inequality is derived. It is applied to sharpen existing and to derive novel bounds on the empirical Rademacher complexities of unit balls in various norms appearing in the context of structured sparsity and multitask dictionary learning or matrix factorization. A key role is played by the largest eigenvalue of…
▽ More
From concentration inequalities for the suprema of Gaussian or Rademacher processes an inequality is derived. It is applied to sharpen existing and to derive novel bounds on the empirical Rademacher complexities of unit balls in various norms appearing in the context of structured sparsity and multitask dictionary learning or matrix factorization. A key role is played by the largest eigenvalue of the data covariance matrix.
△ Less
Submitted 7 June, 2014; v1 submitted 8 February, 2014;
originally announced February 2014.
-
Reliable Communication in a Dynamic Network in the Presence of Byzantine Faults
Authors:
Alexandre Maurer,
Sébastien Tixeuil,
Xavier Défago
Abstract:
We consider the following problem: two nodes want to reliably communicate in a dynamic multihop network where some nodes have been compromised, and may have a totally arbitrary and unpredictable behavior. These nodes are called Byzantine. We consider the two cases where cryptography is available and not available.
We prove the necessary and sufficient condition (that is, the weakest possible condi…
▽ More
We consider the following problem: two nodes want to reliably communicate in a dynamic multihop network where some nodes have been compromised, and may have a totally arbitrary and unpredictable behavior. These nodes are called Byzantine. We consider the two cases where cryptography is available and not available.
We prove the necessary and sufficient condition (that is, the weakest possible condition) to ensure reliable communication in this context. Our proof is constructive, as we provide Byzantine-resilient algorithms for reliable communication that are optimal with respect to our impossibility results.
In a second part, we investigate the impact of our conditions in three case studies: participants interacting in a conference, robots moving on a grid and agents in the subway. Our simulations indicate a clear benefit of using our algorithms for reliable communication in those contexts.
△ Less
Submitted 16 February, 2015; v1 submitted 1 February, 2014;
originally announced February 2014.
-
On the minimum order of k-cop-win graphs
Authors:
William Baird,
Andrew Beveridge,
Anthony Bonato,
Paolo Codenotti,
Aaron Maurer,
John McCauley,
Silviya Valeva
Abstract:
We consider the minimum order graphs with a given cop number. We prove that the minimum order of a connected graph with cop number 3 is 10, and show that the Petersen graph is the unique isomorphism type of graph with this property. We provide the results of a computational search on the cop number of all graphs up to and including order 10. A relationship is presented between the minimum order of…
▽ More
We consider the minimum order graphs with a given cop number. We prove that the minimum order of a connected graph with cop number 3 is 10, and show that the Petersen graph is the unique isomorphism type of graph with this property. We provide the results of a computational search on the cop number of all graphs up to and including order 10. A relationship is presented between the minimum order of graph with cop number $k$ and Meyniel's conjecture on the asymptotic maximum value of the cop number of a connected graph.
△ Less
Submitted 13 August, 2013;
originally announced August 2013.
-
CTA contributions to the 33rd International Cosmic Ray Conference (ICRC2013)
Authors:
The CTA Consortium,
:,
O. Abril,
B. S. Acharya,
M. Actis,
G. Agnetta,
J. A. Aguilar,
F. Aharonian,
M. Ajello,
A. Akhperjanian,
M. Alcubierre,
J. Aleksic,
R. Alfaro,
E. Aliu,
A. J. Allafort,
D. Allan,
I. Allekotte,
R. Aloisio,
E. Amato,
G. Ambrosi,
M. Ambrosio,
J. Anderson,
E. O. Angüner,
L. A. Antonelli,
V. Antonuccio
, et al. (1082 additional authors not shown)
Abstract:
Compilation of CTA contributions to the proceedings of the 33rd International Cosmic Ray Conference (ICRC2013), which took place in 2-9 July, 2013, in Rio de Janeiro, Brazil
Compilation of CTA contributions to the proceedings of the 33rd International Cosmic Ray Conference (ICRC2013), which took place in 2-9 July, 2013, in Rio de Janeiro, Brazil
△ Less
Submitted 29 July, 2013; v1 submitted 8 July, 2013;
originally announced July 2013.
-
Parameterizable Byzantine Broadcast in Loosely Connected Networks
Authors:
Alexandre Maurer,
Sébastien Tixeuil
Abstract:
We consider the problem of reliably broadcasting information in a multihop asynchronous network, despite the presence of Byzantine failures: some nodes are malicious and behave arbitrarly. We focus on non-cryptographic solutions. Most existing approaches give conditions for perfect reliable broadcast (all correct nodes deliver the good information), but require a highly connected network. A probab…
▽ More
We consider the problem of reliably broadcasting information in a multihop asynchronous network, despite the presence of Byzantine failures: some nodes are malicious and behave arbitrarly. We focus on non-cryptographic solutions. Most existing approaches give conditions for perfect reliable broadcast (all correct nodes deliver the good information), but require a highly connected network. A probabilistic approach was recently proposed for loosely connected networks: the Byzantine failures are randomly distributed, and the correct nodes deliver the good information with high probability. A first solution require the nodes to initially know their position on the network, which may be difficult or impossible in self-organizing or dynamic networks. A second solution relaxed this hypothesis but has much weaker Byzantine tolerance guarantees. In this paper, we propose a parameterizable broadcast protocol that does not require nodes to have any knowledge about the network. We give a deterministic technique to compute a set of nodes that always deliver authentic information, for a given set of Byzantine failures. Then, we use this technique to experimentally evaluate our protocol, and show that it significantely outperforms previous solutions with the same hypotheses. Important disclaimer: these results have NOT yet been published in an international conference or journal. This is just a technical report presenting intermediary and incomplete results. A generalized version of these results may be under submission.
△ Less
Submitted 8 December, 2013; v1 submitted 17 January, 2013;
originally announced January 2013.
-
On Byzantine Broadcast in Planar Graphs
Authors:
Alexandre Maurer,
Sébastien Tixeuil
Abstract:
We consider the problem of reliably broadcasting information in a multihop asynchronous network in the presence of Byzantine failures: some nodes may exhibit unpredictable malicious behavior. We focus on completely decentralized solutions. Few Byzantine-robust algorithms exist for loosely connected networks. A recent solution guarantees reliable broadcast on a torus when D > 4, D being the minimal…
▽ More
We consider the problem of reliably broadcasting information in a multihop asynchronous network in the presence of Byzantine failures: some nodes may exhibit unpredictable malicious behavior. We focus on completely decentralized solutions. Few Byzantine-robust algorithms exist for loosely connected networks. A recent solution guarantees reliable broadcast on a torus when D > 4, D being the minimal distance between two Byzantine nodes. In this paper, we generalize this result to 4-connected planar graphs. We show that reliable broadcast can be guaranteed when D > Z, Z being the maximal number of edges per polygon. We also show that this bound on D is a lower bound for this class of graphs. Our solution has the same time complexity as a simple broadcast. This is also the first solution where the memory required increases linearly (instead of exponentially) with the size of transmitted information. Important disclaimer: these results have NOT yet been published in an international conference or journal. This is just a technical report presenting intermediary and incomplete results. A generalized version of these results may be under submission.
△ Less
Submitted 7 December, 2013; v1 submitted 14 January, 2013;
originally announced January 2013.
-
Excess risk bounds for multitask learning with trace norm regularization
Authors:
Andreas Maurer,
Massimiliano Pontil
Abstract:
Trace norm regularization is a popular method of multitask learning. We give excess risk bounds with explicit dependence on the number of tasks, the number of examples per task and properties of the data distribution. The bounds are independent of the dimension of the input space, which may be infinite as in the case of reproducing kernel Hilbert spaces. A byproduct of the proof are bounds on the…
▽ More
Trace norm regularization is a popular method of multitask learning. We give excess risk bounds with explicit dependence on the number of tasks, the number of examples per task and properties of the data distribution. The bounds are independent of the dimension of the input space, which may be infinite as in the case of reproducing kernel Hilbert spaces. A byproduct of the proof are bounds on the expected norm of sums of random positive semidefinite matrices with subexponential moments.
△ Less
Submitted 14 January, 2013; v1 submitted 6 December, 2012;
originally announced December 2012.
-
A Scalable Byzantine Grid
Authors:
Alexandre Maurer,
Sébastien Tixeuil
Abstract:
Modern networks assemble an ever growing number of nodes. However, it remains difficult to increase the number of channels per node, thus the maximal degree of the network may be bounded. This is typically the case in grid topology networks, where each node has at most four neighbors. In this paper, we address the following issue: if each node is likely to fail in an unpredictable manner, how can…
▽ More
Modern networks assemble an ever growing number of nodes. However, it remains difficult to increase the number of channels per node, thus the maximal degree of the network may be bounded. This is typically the case in grid topology networks, where each node has at most four neighbors. In this paper, we address the following issue: if each node is likely to fail in an unpredictable manner, how can we preserve some global reliability guarantees when the number of nodes keeps increasing unboundedly ? To be more specific, we consider the problem or reliably broadcasting information on an asynchronous grid in the presence of Byzantine failures -- that is, some nodes may have an arbitrary and potentially malicious behavior. Our requirement is that a constant fraction of correct nodes remain able to achieve reliable communication. Existing solutions can only tolerate a fixed number of Byzantine failures if they adopt a worst-case placement scheme. Besides, if we assume a constant Byzantine ratio (each node has the same probability to be Byzantine), the probability to have a fatal placement approaches 1 when the number of nodes increases, and reliability guarantees collapse. In this paper, we propose the first broadcast protocol that overcomes these difficulties. First, the number of Byzantine failures that can be tolerated (if they adopt the worst-case placement) now increases with the number of nodes. Second, we are able to tolerate a constant Byzantine ratio, however large the grid may be. In other words, the grid becomes scalable. This result has important security applications in ultra-large networks, where each node has a given probability to misbehave.
△ Less
Submitted 17 October, 2012;
originally announced October 2012.
-
On Byzantine Broadcast in Loosely Connected Networks
Authors:
Alexandre Maurer,
Sébastien Tixeuil
Abstract:
We consider the problem of reliably broadcasting information in a multihop asynchronous network that is subject to Byzantine failures. Most existing approaches give conditions for perfect reliable broadcast (all correct nodes deliver the authentic message and nothing else), but they require a highly connected network. An approach giving only probabilistic guarantees (correct nodes deliver the auth…
▽ More
We consider the problem of reliably broadcasting information in a multihop asynchronous network that is subject to Byzantine failures. Most existing approaches give conditions for perfect reliable broadcast (all correct nodes deliver the authentic message and nothing else), but they require a highly connected network. An approach giving only probabilistic guarantees (correct nodes deliver the authentic message with high probability) was recently proposed for loosely connected networks, such as grids and tori. Yet, the proposed solution requires a specific initialization (that includes global knowledge) of each node, which may be difficult or impossible to guarantee in self-organizing networks - for instance, a wireless sensor network, especially if they are prone to Byzantine failures. In this paper, we propose a new protocol offering guarantees for loosely connected networks that does not require such global knowledge dependent initialization. In more details, we give a methodology to determine whether a set of nodes will always deliver the authentic message, in any execution. Then, we give conditions for perfect reliable broadcast in a torus network. Finally, we provide experimental evaluation for our solution, and determine the number of randomly distributed Byzantine failures than can be tolerated, for a given correct broadcast probability.
△ Less
Submitted 5 September, 2012;
originally announced September 2012.
-
Sparse coding for multitask and transfer learning
Authors:
Andreas Maurer,
Massimiliano Pontil,
Bernardino Romera-Paredes
Abstract:
We investigate the use of sparse coding and dictionary learning in the context of multitask and transfer learning. The central assumption of our learning method is that the tasks parameters are well approximated by sparse linear combinations of the atoms of a dictionary on a high or infinite dimensional space. This assumption, together with the large quantity of available data in the multitask and…
▽ More
We investigate the use of sparse coding and dictionary learning in the context of multitask and transfer learning. The central assumption of our learning method is that the tasks parameters are well approximated by sparse linear combinations of the atoms of a dictionary on a high or infinite dimensional space. This assumption, together with the large quantity of available data in the multitask and transfer learning settings, allows a principled choice of the dictionary. We provide bounds on the generalization error of this approach, for both settings. Numerical experiments on one synthetic and two real datasets show the advantage of our method over single task learning, a previous method based on orthogonal and dense representation of the tasks and a related method learning task grouping.
△ Less
Submitted 16 June, 2014; v1 submitted 4 September, 2012;
originally announced September 2012.
-
Thermodynamics and concentration
Authors:
Andreas Maurer
Abstract:
We show that the thermal subadditivity of entropy provides a common basis to derive a strong form of the bounded difference inequality and related results as well as more recent inequalities applicable to convex Lipschitz functions, random symmetric matrices, shortest travelling salesmen paths and weakly self-bounding functions. We also give two new concentration inequalities.
We show that the thermal subadditivity of entropy provides a common basis to derive a strong form of the bounded difference inequality and related results as well as more recent inequalities applicable to convex Lipschitz functions, random symmetric matrices, shortest travelling salesmen paths and weakly self-bounding functions. We also give two new concentration inequalities.
△ Less
Submitted 8 May, 2012;
originally announced May 2012.
-
Limiting Byzantien Influence in Multihop Asynchronous Networks
Authors:
Alexandre Maurer,
Sébastien Tixeuil
Abstract:
We consider the problem of reliably broadcasting information in a multihop asyn- chronous network that is subject to Byzantine failures. That is, some nodes of the network can exhibit arbitrary (and potentially malicious) behavior. Existing solutions provide de- terministic guarantees for broadcasting between all correct nodes, but require that the communication network is highly-connected (typica…
▽ More
We consider the problem of reliably broadcasting information in a multihop asyn- chronous network that is subject to Byzantine failures. That is, some nodes of the network can exhibit arbitrary (and potentially malicious) behavior. Existing solutions provide de- terministic guarantees for broadcasting between all correct nodes, but require that the communication network is highly-connected (typically, 2k + 1 connectivity is required, where k is the total number of Byzantine nodes in the network). In this paper, we investigate the possibility of Byzantine tolerant reliable broadcast be- tween most correct nodes in low-connectivity networks (typically, networks with constant connectivity). In more details, we propose a new broadcast protocol that is specifically designed for low-connectivity networks. We provide sufficient conditions for correct nodes using our protocol to reliably communicate despite Byzantine participants. We present experimental results that show that our approach is especially effective in low-connectivity networks when Byzantine nodes are randomly distributed.
△ Less
Submitted 27 January, 2012;
originally announced January 2012.
-
Dark matter powered stars: Constraints from the extragalactic background light
Authors:
A. Maurer,
M. Raue,
T. Kneiske,
D. Elsässer,
P. H. Hauschildt,
D. Horns
Abstract:
The existence of predominantly cold non-baryonic dark matter is unambiguously demonstrated by several observations (e.g., structure formation, big bang nucleosynthesis, gravitational lensing, and rotational curves of spiral galaxies). A candidate well motivated by particle physics is a weakly interacting massive particle (WIMP). Self-annihilating WIMPs would affect the stellar evolution especially…
▽ More
The existence of predominantly cold non-baryonic dark matter is unambiguously demonstrated by several observations (e.g., structure formation, big bang nucleosynthesis, gravitational lensing, and rotational curves of spiral galaxies). A candidate well motivated by particle physics is a weakly interacting massive particle (WIMP). Self-annihilating WIMPs would affect the stellar evolution especially in the early universe. Stars powered by self-annihilating WIMP dark matter should possess different properties compared with standard stars. While a direct detection of such dark matter powered stars seems very challenging, their cumulative emission might leave an imprint in the diffuse metagalactic radiation fields, in particular in the mid-infrared part of the electromagnetic spectrum. In this work the possible contributions of dark matter powered stars (dark stars; DSs) to the extragalactic background light (EBL) are calculated. It is shown that existing data and limits of the EBL intensity can already be used to rule out some DS parameter sets.
△ Less
Submitted 5 January, 2012;
originally announced January 2012.
-
The Petersen graph is the smallest 3-cop-win graph
Authors:
Andrew Beveridge,
Paolo Codenotti,
Aaron Maurer,
John McCauley,
Silviya Valeva
Abstract:
In the game of \emph{cops and robbers} on a graph $G = (V,E)$, $k$ cops try to catch a robber. On the cop turn, each cop may move to a neighboring vertex or remain in place. On the robber's turn, he moves similarly. The cops win if there is some time at which a cop is at the same vertex as the robber. Otherwise, the robber wins. The minimum number of cops required to catch the robber is called the…
▽ More
In the game of \emph{cops and robbers} on a graph $G = (V,E)$, $k$ cops try to catch a robber. On the cop turn, each cop may move to a neighboring vertex or remain in place. On the robber's turn, he moves similarly. The cops win if there is some time at which a cop is at the same vertex as the robber. Otherwise, the robber wins. The minimum number of cops required to catch the robber is called the \emph{cop number} of $G$, and is denoted $c(G)$. Let $m_k$ be the minimum order of a connected graph satisfying $c(G) \geq k$. Recently, Baird and Bonato determined via computer search that $m_3=10$ and that this value is attained uniquely by the Petersen graph. Herein, we give a self-contained mathematical proof of this result. Along the way, we give some characterizations of graphs with $c(G) >2$ and very high maximum degree.
△ Less
Submitted 2 April, 2012; v1 submitted 4 October, 2011;
originally announced October 2011.
-
Structured Sparsity and Generalization
Authors:
Andreas Maurer,
Massimiliano Pontil
Abstract:
We present a data dependent generalization bound for a large class of regularized algorithms which implement structured sparsity constraints. The bound can be applied to standard squared-norm regularization, the Lasso, the group Lasso, some versions of the group Lasso with overlapping groups, multiple kernel learning and other regularization schemes. In all these cases competitive results are obta…
▽ More
We present a data dependent generalization bound for a large class of regularized algorithms which implement structured sparsity constraints. The bound can be applied to standard squared-norm regularization, the Lasso, the group Lasso, some versions of the group Lasso with overlapping groups, multiple kernel learning and other regularization schemes. In all these cases competitive results are obtained. A novel feature of our bound is that it can be applied in an infinite dimensional setting such as the Lasso in a separable Hilbert space or multiple kernel learning with a countable number of kernels.
△ Less
Submitted 2 September, 2011; v1 submitted 17 August, 2011;
originally announced August 2011.
-
Empirical Bernstein Bounds and Sample Variance Penalization
Authors:
Andreas Maurer,
Massimiliano Pontil
Abstract:
We give improved constants for data dependent and variance sensitive confidence bounds, called empirical Bernstein bounds, and extend these inequalities to hold uniformly over classes of functionswhose growth function is polynomial in the sample size n. The bounds lead us to consider sample variance penalization, a novel learning method which takes into account the empirical variance of the loss…
▽ More
We give improved constants for data dependent and variance sensitive confidence bounds, called empirical Bernstein bounds, and extend these inequalities to hold uniformly over classes of functionswhose growth function is polynomial in the sample size n. The bounds lead us to consider sample variance penalization, a novel learning method which takes into account the empirical variance of the loss function. We give conditions under which sample variance penalization is effective. In particular, we present a bound on the excess risk incurred by the method. Using this, we argue that there are situations in which the excess risk of our method is of order 1/n, while the excess risk of empirical risk minimization is of order 1/sqrt/{n}. We show some experimental results, which confirm the theory. Finally, we discuss the potential application of our results to sample compression schemes.
△ Less
Submitted 21 July, 2009;
originally announced July 2009.
-
An optimization problem on the sphere
Authors:
Andreas Maurer
Abstract:
We prove existence and uniqueness of the minimizer for the average geodesic distance to the points of a geodesically convex set on the sphere. This implies a corresponding existence and uniqueness result for an optimal algorithm for halfspace learning, when data and target functions are drawn from the uniform distribution.
We prove existence and uniqueness of the minimizer for the average geodesic distance to the points of a geodesically convex set on the sphere. This implies a corresponding existence and uniqueness result for an optimal algorithm for halfspace learning, when data and target functions are drawn from the uniform distribution.
△ Less
Submitted 15 May, 2008;
originally announced May 2008.
-
A Note on the PAC Bayesian Theorem
Authors:
Andreas Maurer
Abstract:
We prove general exponential moment inequalities for averages of [0,1]-valued iid random variables and use them to tighten the PAC Bayesian Theorem. The logarithmic dependence on the sample count in the enumerator of the PAC Bayesian bound is halved.
We prove general exponential moment inequalities for averages of [0,1]-valued iid random variables and use them to tighten the PAC Bayesian Theorem. The logarithmic dependence on the sample count in the enumerator of the PAC Bayesian bound is halved.
△ Less
Submitted 30 November, 2004;
originally announced November 2004.