Search | arXiv e-print repository

doi 10.2172/2372903

Flexible Stellarator Physics Facility

Authors: F. I. Parra, S. -G. Baek, M. Churchill, D. R. Demers, B. Dudson, N. M. Ferraro, B. Geiger, S. Gerhardt, K. C. Hammond, S. Hudson, R. Jorge, E. Kolemen, D. M. Kriete, S. T. A. Kumar, M. Landreman, C. Lowe, D. A. Maurer, F. Nespoli, N. Pablant, M. J. Pueschel, A. Punjabi, J. A. Schwartz, C. P. S. Swanson, A. M. Wright

Abstract: We propose to build a Flexible Stellarator Physics Facility to explore promising regions of the vast parameter space of disruption-free stellarator solutions for Fusion Pilot Plants (FPPs).

Submitted 4 July, 2024; originally announced July 2024.

Comments: White paper submitted to FESAC subcommittee on Facilities, 8 pages

arXiv:2405.14469 [pdf, ps, other]

Generalization of Hamiltonian algorithms

Authors: Andreas Maurer

Abstract: The paper proves generalization results for a class of stochastic learning algorithms. The method applies whenever the algorithm generates an absolutely continuous distribution relative to some a-priori measure and the Radon-Nikodym derivative has subgaussian concentration. Applications are bounds for the Gibbs algorithm and randomizations of stable deterministic algorithms as well as PAC-Bayesian bounds with data-dependent priors.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2404.07896 [pdf, other]

Auditing health-related recommendations in social media: A Case Study of Abortion on YouTube

Authors: Mohammed Lahsaini, Mohamed Lechiakh, Alexandre Maurer

Abstract: Recommendation algorithms (RS) used by social media, like YouTube, significantly shape our information consumption across various domains, especially in healthcare. Hence, algorithmic auditing becomes crucial to uncover their potential bias and misinformation, particularly in the context of controversial topics like abortion. We introduce a simple yet effective sock puppet auditing approach to investigate how YouTube recommends abortion-related videos to individuals with different backgrounds. Our framework allows for efficient auditing of RS, regardless of the complexity of the underlying algorithms.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2402.16825 [pdf, ps, other]

Weighted Monte Carlo augmented spherical Fourier-Bessel convolutional layers for 3D abdominal organ segmentation

Authors: Wenzhao Zhao, Steffen Albert, Barbara D. Wichtmann, Angelika Maurer, Ulrike Attenberger, Frank G. Zöllner, Jürgen Hesser

Abstract: Filter-decomposition-based group equivariant convolutional neural networks show promising stability and data efficiency for 3D image feature extraction. However, the existing filter-decomposition-based 3D group equivariant neural networks rely on parameter-sharing designs and are mostly limited to rotation transformation groups, where the chosen spherical harmonic filter bases consider only angular orthogonality. These limitations hamper their application to deep neural network architectures for medical image segmentation. To address these issues, this paper describes a non-parameter-sharing affine group equivariant neural network for 3D medical image segmentation based on an adaptive aggregation of Monte Carlo augmented spherical Fourier-Bessel filter bases. The efficiency and flexibility of the adopted non-parameter-sharing strategy enable, for the first time, an efficient implementation of 3D affine group equivariant convolutional neural networks for volumetric data. The introduced spherical Fourier-Bessel filter basis combines both angular and radial orthogonality for better feature extraction. The 3D image segmentation experiments on two abdominal medical image sets, the BTCV and NIH Pancreas datasets, show that the proposed methods outperform state-of-the-art 3D neural networks, with high training stability and data efficiency. The code will be available at https://github.com/ZhaoWenzhao/WMCSFB.

Submitted 9 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

arXiv:2305.10110 [pdf, ps, other]

Adaptive aggregation of Monte Carlo augmented decomposed filters for efficient group-equivariant convolutional neural network

Authors: Wenzhao Zhao, Barbara D. Wichtmann, Steffen Albert, Angelika Maurer, Frank G. Zöllner, Ulrike Attenberger, Jürgen Hesser

Abstract: Group-equivariant convolutional neural networks (G-CNN) heavily rely on parameter sharing to increase CNN's data efficiency and performance. However, the parameter-sharing strategy greatly increases the computational burden for each added parameter, which hampers its application to deep neural network models. In this paper, we address these problems by proposing a non-parameter-sharing approach for group equivariant neural networks. The proposed methods adaptively aggregate a diverse range of filters by a weighted sum of stochastically augmented decomposed filters. We give a theoretical proof of how the continuous group convolution can be approximated by our methods. Our method applies to both continuous and discrete groups, where the augmentation is implemented using Monte Carlo sampling and bootstrap resampling, respectively. We demonstrate that our methods serve as an efficient extension of standard CNN. Experiments on group equivariance tests show how our methods can achieve superior performance to parameter-sharing group equivariant networks. Experiments on image classification and image denoising tasks show that in certain scenarios, with a suitable set of filter bases, our method helps improve the performance of standard CNNs and build efficient lightweight image denoising networks. The code will be available at https://github.com/ZhaoWenzhao/MCG_CNN.

Submitted 1 May, 2024; v1 submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.00977 [pdf, other]

Generalization for slowly mixing processes

Authors: Andreas Maurer

Abstract: A bound uniform over various loss-classes is given for data generated by stationary and phi-mixing processes, where the mixing time (the time needed to obtain approximate independence) enters the sample complexity only in an additive way. For slowly mixing processes this can be a considerable advantage over results with multiplicative dependence on the mixing time. The admissible loss-classes include functions with prescribed Lipschitz norms or smoothness parameters. The bound can also be applied to be uniform over unconstrained loss-classes, where it depends on local Lipschitz properties of the function on the sample path.

Submitted 1 June, 2023; v1 submitted 28 April, 2023; originally announced May 2023.

Comments: Improved version

MSC Class: 60G10 ACM Class: G.3

arXiv:2207.03136 [pdf, ps, other]

Exponential finite sample bounds for incomplete U-statistics

Authors: Andreas Maurer

Abstract: Incomplete U-statistics have been proposed to accelerate computation. They use only a subset of the subsamples required for kernel evaluations by complete U-statistics. This paper gives a finite sample bound in the style of Bernstein's inequality. Applied to complete U-statistics the resulting inequality improves over the bounds of both Hoeffding and Arcones. For randomly determined subsamples it is shown that, as soon as their number reaches the square of the sample-size, the same order bound is obtained as for the complete statistic.

Submitted 7 July, 2022; originally announced July 2022.

arXiv:2206.02012 [pdf, ps, other]

Concentration of the missing mass in metric spaces

Authors: Andreas Maurer

Abstract: We study the estimation and concentration on its expectation of the probability to observe data further than a specified distance from a given iid sample in a metric space. The problem extends the classical problem of estimation of the missing mass in discrete spaces. We give some estimators for the conditional missing mass and show that estimation of the expected missing mass is difficult in general. Conditions on the distribution, under which the Good-Turing estimator and the conditional missing mass concentrate on their expectations are identified. Applications to anomaly detection, coding, the Wasserstein distance between true and empirical measure and simple learning bounds are sketched.

Submitted 22 November, 2022; v1 submitted 4 June, 2022; originally announced June 2022.

Comments: 25 pages. The application to the Wasserstein metric has been added

MSC Class: 60E15

arXiv:2205.14027 [pdf, other]

Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces

Authors: Vladimir Kostic, Pietro Novelli, Andreas Maurer, Carlo Ciliberto, Lorenzo Rosasco, Massimiliano Pontil

Abstract: We study a class of dynamical systems modelled as Markov chains that admit an invariant distribution via the corresponding transfer, or Koopman, operator. While data-driven algorithms to reconstruct such operators are well known, their relationship with statistical learning is largely unexplored. We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical system. We consider the restriction of this operator to a reproducing kernel Hilbert space and introduce a notion of risk, from which different estimators naturally arise. We link the risk with the estimation of the spectral decomposition of the Koopman operator. These observations motivate a reduced-rank operator regression (RRR) estimator. We derive learning bounds for the proposed estimator, holding both in i.i.d. and non i.i.d. settings, the latter in terms of mixing coefficients. Our results suggest RRR might be beneficial over other widely used estimators as confirmed in numerical experiments both for forecasting and mode decomposition.

Submitted 13 December, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

Comments: Main text: 10 pages, 2 figures, 1 table. Supplementary information: 18 pages, 5 figures, 2 tables

arXiv:2109.11064 [pdf, other]

Actionable Conversational Quality Indicators for Improving Task-Oriented Dialog Systems

Authors: Michael Higgins, Dominic Widdows, Chris Brew, Gwen Christian, Andrew Maurer, Matthew Dunn, Sujit Mathi, Akshay Hazare, George Bonev, Beth Ann Hockey, Kristen Howell, Joe Bradley

Abstract: Automatic dialog systems have become a mainstream part of online customer service. Many such systems are built, maintained, and improved by customer service specialists, rather than dialog systems engineers and computer programmers. As conversations between people and machines become commonplace, it is critical to understand what is working, what is not, and what actions can be taken to reduce the frequency of inappropriate system responses. These analyses and recommendations need to be presented in terms that directly reflect the user experience rather than the internal dialog processing. This paper introduces and explains the use of Actionable Conversational Quality Indicators (ACQIs), which are used both to recognize parts of dialogs that can be improved, and to recommend how to improve them. This combines benefits of previous approaches, some of which have focused on producing dialog quality scoring while others have sought to categorize the types of errors the dialog system is making. We demonstrate the effectiveness of using ACQIs on LivePerson internal dialog systems used in commercial customer service applications, and on the publicly available CMU LEGOv2 conversational dataset (Raux et al. 2005). We report on the annotation and analysis of conversational datasets showing which ACQIs are important to fix in various situations. The annotated datasets are then used to build a predictive model which uses a turn-based vector embedding of the message texts and achieves a 79% weighted average f1-measure at the task of finding the correct ACQI for a given conversation. We predict that if such a model worked perfectly, the range of potential improvement actions a bot-builder must consider at each turn could be reduced by an average of 81%.

Submitted 22 September, 2021; originally announced September 2021.

arXiv:2108.01455 [pdf, other]

FEBR: Expert-Based Recommendation Framework for beneficial and personalized content

Authors: Mohamed Lechiakh, Alexandre Maurer

Abstract: So far, most research on recommender systems has focused on maintaining long-term user engagement and satisfaction, by promoting relevant and personalized content. However, it is still very challenging to evaluate the quality and the reliability of this content. In this paper, we propose FEBR (Expert-Based Recommendation Framework), an apprenticeship learning framework to assess the quality of the recommended content on online platforms. The framework exploits the demonstrated trajectories of an expert (assumed to be reliable) in a recommendation evaluation environment, to recover an unknown utility function. This function is used to learn an optimal policy describing the expert's behavior, which is then used in the framework to provide high-quality and personalized recommendations. We evaluate the performance of our solution through a user interest simulation environment (using RecSim). We simulate interactions under the aforementioned expert policy for video recommendation, and compare its efficiency with standard recommendation methods. The results show that our approach provides a significant gain in terms of content quality, evaluated by experts and watched by users, while maintaining almost the same watch time as the baseline approaches.

Submitted 3 November, 2021; v1 submitted 17 July, 2021; originally announced August 2021.

Comments: Preprint. Under review

arXiv:2107.07334 [pdf, other]

Tournesol: A quest for a large, secure and trustworthy database of reliable human judgments

Authors: Lê-Nguyên Hoang, Louis Faucon, Aidan Jungo, Sergei Volodin, Dalia Papuc, Orfeas Liossatos, Ben Crulis, Mariame Tighanimine, Isabela Constantin, Anastasiia Kucherenko, Alexandre Maurer, Felix Grimberg, Vlad Nitu, Chris Vossen, Sébastien Rouault, El-Mahdi El-Mhamdi

Abstract: Today's large-scale algorithms have become immensely influential, as they recommend and moderate the content that billions of humans are exposed to on a daily basis. They are the de-facto regulators of our societies' information diet, from shaping opinions on public health to organizing groups for social movements. This creates serious concerns, but also great opportunities to promote quality information. Addressing the concerns and seizing the opportunities is a challenging, enormous and fabulous endeavor, as intuitively appealing ideas often come with unwanted side effects, and as it requires us to think about what we deeply prefer. Understanding how today's large-scale algorithms are built is critical to determine what interventions will be most effective. Given that these algorithms rely heavily on machine learning, we make the following key observation: any algorithm trained on uncontrolled data must not be trusted. Indeed, a malicious entity could take control over the data, poison it with dangerously manipulative fabricated inputs, and thereby make the trained algorithm extremely unsafe. We thus argue that the first step towards safe and ethical large-scale algorithms must be the collection of a large, secure and trustworthy dataset of reliable human judgments. To achieve this, we introduce Tournesol, an open source platform available at https://tournesol.app. Tournesol aims to collect a large database of human judgments on what algorithms ought to widely recommend (and what they ought to stop widely recommending). We outline the structure of the Tournesol database, the key features of the Tournesol platform and the main hurdles that must be overcome to make it a successful project. Most importantly, we argue that, if successful, Tournesol may then serve as the essential foundation for any safe and ethical large-scale algorithm.

Submitted 29 May, 2021; originally announced July 2021.

Comments: 27 pages, 13 figures

arXiv:2102.06304 [pdf, ps, other]

Some Hoeffding- and Bernstein-type Concentration Inequalities

Authors: Andreas Maurer, Massimiliano Pontil

Abstract: We prove concentration inequalities for functions of independent random variables under sub-gaussian and sub-exponential conditions. The utility of the inequalities is demonstrated by an extension of the now classical method of Rademacher complexities to Lipschitz function classes and unbounded sub-exponential distribution.

Submitted 23 June, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

arXiv:2012.07399 [pdf, other]

Robust Unsupervised Learning via L-Statistic Minimization

Authors: Andreas Maurer, Daniela A. Parletta, Andrea Paudice, Massimiliano Pontil

Abstract: Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance. We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. This is exploited by a general descent algorithm which minimizes an L-statistic criterion over the model class, weighting small losses more. Our analysis characterizes the robustness of the method in terms of bounds on the reconstruction error relative to the underlying unperturbed distribution. As a byproduct, we prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning, a result which may be of independent interest. Numerical experiments with k-means clustering and principal subspace analysis demonstrate the effectiveness of our approach.

Submitted 18 February, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

Comments: We have just uploaded a new version of the paper with a more relevant title "Robust Unsupervised Learning via L-statistic Minimization"

arXiv:1912.07723 [pdf]

doi 10.1088/1361-6587/ab2b25

Advances in neutral tungsten ultraviolet spectroscopy for the potential benefit to gross erosion diagnosis

Authors: C. A. Johnson, D. A. Ennis, S. D. Loch, G. J. Hartwell, D. A. Maurer, S. L. Allen, B. S. Victor, C. M. Samuell, T. Abrams, E. A. Unterberg, R. T. Smyth

Abstract: A spectral survey of tungsten emission in the ultraviolet region has been completed in the DIII-D tokamak and the CTH torsatron to assess the potential benefit of UV emission for the diagnosis of gross W erosion. A total of 29 W I spectral lines are observed from the two experiments using survey spectrometers between 200-400 nm with level identifications provided based on a structure calculation for many of the excited states that produce strong emission lines. Of the 29 observed lines, 20 have not previously been reported in fusion-relevant plasmas, including an intense line at 265.65 nm which could be important for benchmarking the frequently exploited line at 400.88 nm. Nearly all of the observed spectral lines decay down to one of the six lowest energy levels for neutral W, which are likely to be long-lived metastable states. The impact of metastable level populations on the W I emission spectrum and any erosion measurement utilizing a spectroscopic technique is potentially significant. Nevertheless, the high density of W I emission in the UV region allows for the possibility of determining the relative metastable fractions and plasma parameters local to the erosion region. Additionally, the lines observed in this work could be used to perform multiple independent gross erosion measurements, leading to more accurate diagnosis of gross tungsten erosion.

Submitted 16 December, 2019; originally announced December 2019.

Comments: 16 pages, 6 figures

Journal ref: C. Johnson et al., Plasma Phys. Control. Fusion 61 (2019) 095006

arXiv:1906.10673 [pdf, ps, other]

Learning Fair and Transferable Representations

Authors: Luca Oneto, Michele Donini, Andreas Maurer, Massimiliano Pontil

Abstract: Developing learning methods which do not discriminate against subgroups in the population is a central goal of algorithmic fairness. One way to reach this goal is by modifying the data representation in order to meet certain fairness constraints. In this work we measure fairness according to demographic parity. This requires the probability of the possible model decisions to be independent of the sensitive information. We argue that the goal of imposing demographic parity can be substantially facilitated within a multitask learning setting. We leverage task similarities by encouraging a shared fair representation across the tasks via low rank matrix factorization. We derive learning bounds establishing that the learned representation transfers well to novel tasks both in terms of prediction performance and fairness metrics. We present experiments on three real world datasets, showing that the proposed method outperforms state-of-the-art approaches by a significant margin.

Submitted 31 January, 2020; v1 submitted 25 June, 2019; originally announced June 2019.

arXiv:1906.06239 [pdf, ps, other]

Gathering with extremely restricted visibility

Authors: Rachid Guerraoui, Alexandre Maurer

Abstract: We consider the classical problem of making mobile processes gather or converge at the same position (as performed by swarms of animals in nature). Existing works assume that each process can see all other processes, or all processes within a certain radius. In this paper, we introduce a new model with an extremely restricted visibility: each process can only see one other process (its closest neighbor). Our goal is to see if (and to what extent) the gathering and convergence problems can be solved in this setting. We first show that, surprisingly, the problem can be solved for a small number of processes (at most 5), but not beyond. This is due to indeterminacy in the case where there are several closest neighbors for the same process. By removing this indeterminacy with an additional hypothesis (choosing the closest neighbor according to an order on the positions of processes), we then show that the problem can be solved for any number of processes. We also show that up to one crash failure can be tolerated for the convergence problem.

Submitted 14 June, 2019; originally announced June 2019.

arXiv:1902.01911 [pdf, ps, other]

Uniform concentration and symmetrization for weak interactions

Authors: Andreas Maurer, Massimiliano Pontil

Abstract: The method to derive uniform bounds with Gaussian and Rademacher complexities is extended to the case where the sample average is replaced by a nonlinear statistic. Tight bounds are obtained for U-statistics, smoothened L-statistics and error functionals of l2-regularized algorithms.

Submitted 10 May, 2019; v1 submitted 5 February, 2019; originally announced February 2019.

arXiv:1808.09922 [pdf, other]

Limiting the Spread of Fake News on Social Media Platforms by Evaluating Users' Trustworthiness

Authors: Oana Balmau, Rachid Guerraoui, Anne-Marie Kermarrec, Alexandre Maurer, Matej Pavlovic, Willy Zwaenepoel

Abstract: Today's social media platforms enable both authentic and fake news to spread very quickly. Some approaches have been proposed to automatically detect such "fake" news based on their content, but it is difficult to agree on universal criteria of authenticity (which can be bypassed by adversaries once known). Besides, it is obviously impossible to have each news item checked by a human. In this paper, we propose a mechanism to limit the spread of fake news which is not based on content. It can be implemented as a plugin on a social media platform. The principle is as follows: a team of fact-checkers reviews a small number of news items (the most popular ones), which yields an estimate of each user's inclination to share fake news items. Then, using a Bayesian approach, we estimate the trustworthiness of future news items, and treat accordingly those of them that pass a certain "untrustworthiness" threshold. We then evaluate the effectiveness and overhead of this technique on a large Twitter graph. We show that having a few thousand users exposed to one given news item is enough to reach a very precise estimation of its reliability. We thus identify more than 99% of fake news items with no false positives. The performance impact is very small: the induced overhead on the 90th percentile latency is less than 3%, and less than 8% on the throughput of user operations.

Submitted 29 August, 2018; originally announced August 2018.

Comments: 10 pages, 9 figures

arXiv:1806.02510 [pdf, other]

Removing Algorithmic Discrimination (With Minimal Individual Error)

Authors: El Mahdi El Mhamdi, Rachid Guerraoui, Lê Nguyên Hoang, Alexandre Maurer

Abstract: We address the problem of correcting group discriminations within a score function, while minimizing the individual error. Each group is described by a probability density function on the set of profiles. We first solve the problem analytically in the case of two populations, with a uniform bonus-malus on the zones where each population is a majority. We then address the general case of n populations, where the entanglement of populations does not allow a similar analytical solution. We show that an approximate solution with an arbitrarily high level of precision can be computed with linear programming. Finally, we address the inverse problem where the error should not go beyond a certain value and we seek to minimize the discrimination.

Submitted 7 June, 2018; originally announced June 2018.

arXiv:1805.11447 [pdf, other]

Virtuously Safe Reinforcement Learning

Authors: Henrik Aslund, El Mahdi El Mhamdi, Rachid Guerraoui, Alexandre Maurer

Abstract: We show that when a third party, the adversary, steps into the two-party setting (agent and operator) of safely interruptible reinforcement learning, a trade-off has to be made between the probability of following the optimal policy in the limit, and the probability of escaping a dangerous situation created by the adversary. So far, the work on safely interruptible agents has assumed a perfect perception of the agent about its environment (no adversary), and therefore implicitly set the second probability to zero, by explicitly seeking a value of one for the first probability. We show that (1) agents can be made both interruptible and adversary-resilient, and (2) the interruptibility can be made safe in the sense that the agent itself will not seek to avoid it. We also solve the problem that arises when the agent does not go completely greedy, i.e. issues with safe exploration in the limit. Resilience to perturbed perception, safe exploration in the limit, and safe interruptibility are the three pillars of what we call virtuously safe reinforcement learning.

Submitted 29 May, 2018; originally announced May 2018.

arXiv:1803.03934 [pdf, ps, other]

Empirical bounds for functions with weak interactions

Authors: Andreas Maurer, Massimiliano Pontil

Abstract: We provide sharp empirical estimates of expectation, variance and normal approximation for a class of statistics whose variation in any argument does not change too much when another argument is modified. Examples of such weak interactions are furnished by U- and V-statistics, Lipschitz L-statistics and various error functionals of L2-regularized algorithms and Gibbs algorithms.

Submitted 11 March, 2018; originally announced March 2018.

arXiv:1802.07834 [pdf, other]

Learning to Gather without Communication

Authors: El Mahdi El Mhamdi, Rachid Guerraoui, Alexandre Maurer, Vladislav Tempez

Abstract: A standard belief on emerging collective behavior is that it emerges from simple individual rules. Most of the mathematical research on such collective behavior starts from imperative individual rules, like "always go to the center". But how could an (optimal) individual rule emerge during a short period within the group lifetime, especially if communication is not available? We argue that such rules can actually emerge in a group in a short span of time via collective (multi-agent) reinforcement learning, i.e., learning via rewards and punishments. We consider the gathering problem: several agents (social animals, swarming robots...) must gather around the same position, which is not determined in advance. They must do so without communication on their planned decision, just by looking at the position of other agents. We present the first experimental evidence that a gathering behavior can be learned without communication in a partially observable environment. The learned behavior has the same properties as a self-stabilizing distributed algorithm, as processes can gather from any initial state (and thus tolerate any transient failure). Besides, we show that it is possible to tolerate the abrupt loss of up to 90% of agents without significant impact on the behavior.

Submitted 21 February, 2018; originally announced February 2018.

Comments: Preliminary version, presented at the 5th Biological Distributed Algorithms Workshop. Washington D.C, July 28th, 2017

arXiv:1711.02112 [pdf, ps, other]

On The Finite Generation Of Relative Cohomology For Lie Superalgebras

Authors: Andrew Maurer

Abstract: The author establishes finite-generation of the cohomology ring of a classical Lie superalgebra relative to an even subsuperalgebra. A spectral sequence is constructed to provide conditions for when this relative cohomology ring is Cohen-Macaulay. With finite generation established, support varieties for modules are defined via the relative cohomology, which generalize those of [BKN-1].

Submitted 27 July, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

arXiv:1704.02882 [pdf, ps, other]

Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning

Authors: El Mahdi El Mhamdi, Rachid Guerraoui, Hadrien Hendrikx, Alexandre Maurer

Abstract: In reinforcement learning, agents learn by performing actions and observing their outcomes. Sometimes, it is desirable for a human operator to interrupt an agent in order to prevent dangerous situations from happening. Yet, as part of their learning process, agents may link these interruptions, which impact their reward, to specific states and deliberately avoid them. The situation is particularly challenging in a multi-agent context because agents might not only learn from their own past interruptions, but also from those of other agents. Orseau and Armstrong defined safe interruptibility for one learner, but their work does not naturally extend to multi-agent systems. This paper introduces dynamic safe interruptibility, an alternative definition more suited to decentralized learning problems, and studies this notion in two learning frameworks: joint action learners and independent learners. We give realistic sufficient conditions on the learning algorithm to enable dynamic safe interruptibility in the case of joint action learners, yet show that these conditions are not sufficient for independent learners. We show however that if agents can detect interruptions, it is possible to prune the observations to ensure dynamic safe interruptibility even for independent learners.

Submitted 22 May, 2017; v1 submitted 10 April, 2017; originally announced April 2017.

arXiv:1701.06191 [pdf, ps, other]

A Bernstein-type inequality for functions of bounded interaction

Authors: Andreas Maurer

Abstract: We give a distribution-dependent concentration inequality for functions of independent variables. The result extends Bernstein's inequality from sums to more general functions, whose variation in any argument does not depend too much on the other arguments. Applications sharpen existing bounds for U-statistics and the generalization error of regularized least squares.

Submitted 11 May, 2017; v1 submitted 22 January, 2017; originally announced January 2017.

Comments: This version contains an additional corollary of the main result. Proposition 4, Corollary 5 and Section 2.4 added. Rest unchanged

MSC Class: 60F10

arXiv:1606.01487 [pdf, ps, other]

Bounds for Vector-Valued Function Estimation

Authors: Andreas Maurer, Massimiliano Pontil

Abstract: We present a framework to derive risk bounds for vector-valued learning with a broad class of feature maps and loss functions. Multi-task learning and one-vs-all multi-category learning are treated as examples. We discuss in detail vector-valued functions with one hidden layer, and demonstrate that the conditions under which shared representations are beneficial for multi-task learning are equally applicable to multi-category learning.

Submitted 5 June, 2016; originally announced June 2016.

arXiv:1605.00251 [pdf, ps, other]

A vector-contraction inequality for Rademacher complexities

Authors: Andreas Maurer

Abstract: The contraction inequality for Rademacher averages is extended to Lipschitz functions with vector-valued domains, and it is also shown that in the bounding expression the Rademacher variables can be replaced by arbitrary iid symmetric and sub-gaussian variables. Example applications are given for multi-category learning, K-means clustering and learning-to-learn.

Submitted 1 May, 2016; originally announced May 2016.

arXiv:1505.06279 [pdf, ps, other]

The Benefit of Multitask Representation Learning

Authors: Andreas Maurer, Massimiliano Pontil, Bernardino Romera-Paredes

Abstract: We discuss a general method to learn data representations from multiple tasks. We provide a justification for this method in both settings of multitask learning and learning-to-learn. The method is illustrated in detail in the special case of linear feature learning. Conditions on the theoretical advantage offered by multitask representation learning over independent task learning are established. In particular, focusing on the important example of half-space learning, we derive the regime in which multitask representation learning is beneficial over independent task learning, as a function of the sample size, the number of tasks and the intrinsic data dimensionality. Other potential applications of our results include multitask feature learning in reproducing kernel Hilbert spaces and multilayer, deep networks.

Submitted 25 March, 2016; v1 submitted 23 May, 2015; originally announced May 2015.

Comments: To appear in Journal of Machine Learning Research (JMLR). 31 pages

arXiv:1503.02163 [pdf, ps, other]

Uniform Estimation Beyond the Mean

Authors: Andreas Maurer

Abstract: Finite sample bounds on the estimation error of the mean by the empirical mean, uniform over a class of functions, can often be conveniently obtained in terms of Rademacher or Gaussian averages of the class. If a function of n variables has suitably bounded partial derivatives, it can be substituted for the empirical mean, with uniform estimation again controlled by Gaussian averages. Up to a constant the result recovers standard results for the empirical mean and more recent ones about U-statistics, and extends to a general class of estimation problems.

Submitted 7 March, 2015; originally announced March 2015.

MSC Class: 60G05

arXiv:1411.2635 [pdf, ps, other]

doi 10.1007/978-3-319-11662-4_18

A chain rule for the expected suprema of Gaussian processes

Authors: Andreas Maurer

Abstract: The expected supremum of a Gaussian process indexed by the image of an index set under a function class is bounded in terms of separate properties of the index set and the function class. The bound is relevant to the estimation of nonlinear transformations or the analysis of learning algorithms whenever hypotheses are chosen from composite classes, as is the case for multi-layer models.

Submitted 10 November, 2014; originally announced November 2014.

Journal ref: Lecture Notes in Computer Science Volume 8776, 2014, pp 245-259

arXiv:1402.1864 [pdf, ps, other]

An Inequality with Applications to Structured Sparsity and Multitask Dictionary Learning

Authors: Andreas Maurer, Massimiliano Pontil, Bernardino Romera-Paredes

Abstract: From concentration inequalities for the suprema of Gaussian or Rademacher processes an inequality is derived. It is applied to sharpen existing and to derive novel bounds on the empirical Rademacher complexities of unit balls in various norms appearing in the context of structured sparsity and multitask dictionary learning or matrix factorization. A key role is played by the largest eigenvalue of the data covariance matrix.

Submitted 7 June, 2014; v1 submitted 8 February, 2014; originally announced February 2014.

arXiv:1402.0121 [pdf, other]

Reliable Communication in a Dynamic Network in the Presence of Byzantine Faults

Authors: Alexandre Maurer, Sébastien Tixeuil, Xavier Défago

Abstract: We consider the following problem: two nodes want to reliably communicate in a dynamic multihop network where some nodes have been compromised, and may have a totally arbitrary and unpredictable behavior. These nodes are called Byzantine. We consider the two cases where cryptography is available and not available. We prove the necessary and sufficient condition (that is, the weakest possible condition) to ensure reliable communication in this context. Our proof is constructive, as we provide Byzantine-resilient algorithms for reliable communication that are optimal with respect to our impossibility results. In a second part, we investigate the impact of our conditions in three case studies: participants interacting in a conference, robots moving on a grid and agents in the subway. Our simulations indicate a clear benefit of using our algorithms for reliable communication in those contexts.

Submitted 16 February, 2015; v1 submitted 1 February, 2014; originally announced February 2014.

arXiv:1308.2841 [pdf, ps, other]

On the minimum order of k-cop-win graphs

Authors: William Baird, Andrew Beveridge, Anthony Bonato, Paolo Codenotti, Aaron Maurer, John McCauley, Silviya Valeva

Abstract: We consider the minimum order graphs with a given cop number. We prove that the minimum order of a connected graph with cop number 3 is 10, and show that the Petersen graph is the unique isomorphism type of graph with this property. We provide the results of a computational search on the cop number of all graphs up to and including order 10. A relationship is presented between the minimum order of a graph with cop number $k$ and Meyniel's conjecture on the asymptotic maximum value of the cop number of a connected graph.

Submitted 13 August, 2013; originally announced August 2013.

Comments: arXiv admin note: substantial text overlap with arXiv:1110.0768

arXiv:1307.2232

CTA contributions to the 33rd International Cosmic Ray Conference (ICRC2013)

Authors: The CTA Consortium: O. Abril, B. S. Acharya, M. Actis, G. Agnetta, J. A. Aguilar, F. Aharonian, M. Ajello, A. Akhperjanian, M. Alcubierre, J. Aleksic, R. Alfaro, E. Aliu, A. J. Allafort, D. Allan, I. Allekotte, R. Aloisio, E. Amato, G. Ambrosi, M. Ambrosio, J. Anderson, E. O. Angüner, L. A. Antonelli, V. Antonuccio, et al. (1082 additional authors not shown)

Abstract: Compilation of CTA contributions to the proceedings of the 33rd International Cosmic Ray Conference (ICRC2013), which took place on 2-9 July 2013 in Rio de Janeiro, Brazil.

Submitted 29 July, 2013; v1 submitted 8 July, 2013; originally announced July 2013.

Comments: Index of CTA conference proceedings at the ICRC2013, Rio de Janeiro (Brazil). v1: placeholder with no arXiv links yet, to be replaced once individual contributions have been all submitted. v2: final with arXiv links to all CTA contributions and full author list

arXiv:1301.3996 [pdf, other]

Parameterizable Byzantine Broadcast in Loosely Connected Networks

Authors: Alexandre Maurer, Sébastien Tixeuil

Abstract: We consider the problem of reliably broadcasting information in a multihop asynchronous network, despite the presence of Byzantine failures: some nodes are malicious and behave arbitrarily. We focus on non-cryptographic solutions. Most existing approaches give conditions for perfect reliable broadcast (all correct nodes deliver the good information), but require a highly connected network. A probabilistic approach was recently proposed for loosely connected networks: the Byzantine failures are randomly distributed, and the correct nodes deliver the good information with high probability. A first solution requires the nodes to initially know their position on the network, which may be difficult or impossible in self-organizing or dynamic networks. A second solution relaxed this hypothesis but has much weaker Byzantine tolerance guarantees. In this paper, we propose a parameterizable broadcast protocol that does not require nodes to have any knowledge about the network. We give a deterministic technique to compute a set of nodes that always deliver authentic information, for a given set of Byzantine failures. Then, we use this technique to experimentally evaluate our protocol, and show that it significantly outperforms previous solutions with the same hypotheses. Important disclaimer: these results have NOT yet been published in an international conference or journal. This is just a technical report presenting intermediary and incomplete results. A generalized version of these results may be under submission.

Submitted 8 December, 2013; v1 submitted 17 January, 2013; originally announced January 2013.

arXiv:1301.2875 [pdf, other]

On Byzantine Broadcast in Planar Graphs

Authors: Alexandre Maurer, Sébastien Tixeuil

Abstract: We consider the problem of reliably broadcasting information in a multihop asynchronous network in the presence of Byzantine failures: some nodes may exhibit unpredictable malicious behavior. We focus on completely decentralized solutions. Few Byzantine-robust algorithms exist for loosely connected networks. A recent solution guarantees reliable broadcast on a torus when D > 4, D being the minimal distance between two Byzantine nodes. In this paper, we generalize this result to 4-connected planar graphs. We show that reliable broadcast can be guaranteed when D > Z, Z being the maximal number of edges per polygon. We also show that this bound on D is a lower bound for this class of graphs. Our solution has the same time complexity as a simple broadcast. This is also the first solution where the memory required increases linearly (instead of exponentially) with the size of transmitted information. Important disclaimer: these results have NOT yet been published in an international conference or journal. This is just a technical report presenting intermediary and incomplete results. A generalized version of these results may be under submission.

Submitted 7 December, 2013; v1 submitted 14 January, 2013; originally announced January 2013.

arXiv:1212.1496 [pdf, ps, other]

Excess risk bounds for multitask learning with trace norm regularization

Authors: Andreas Maurer, Massimiliano Pontil

Abstract: Trace norm regularization is a popular method of multitask learning. We give excess risk bounds with explicit dependence on the number of tasks, the number of examples per task and properties of the data distribution. The bounds are independent of the dimension of the input space, which may be infinite as in the case of reproducing kernel Hilbert spaces. A byproduct of the proof are bounds on the expected norm of sums of random positive semidefinite matrices with subexponential moments.

Submitted 14 January, 2013; v1 submitted 6 December, 2012; originally announced December 2012.

arXiv:1210.4640 [pdf, other]

A Scalable Byzantine Grid

Authors: Alexandre Maurer, Sébastien Tixeuil

Abstract: Modern networks assemble an ever-growing number of nodes. However, it remains difficult to increase the number of channels per node, thus the maximal degree of the network may be bounded. This is typically the case in grid topology networks, where each node has at most four neighbors. In this paper, we address the following issue: if each node is likely to fail in an unpredictable manner, how can we preserve some global reliability guarantees when the number of nodes keeps increasing unboundedly? To be more specific, we consider the problem of reliably broadcasting information on an asynchronous grid in the presence of Byzantine failures -- that is, some nodes may have an arbitrary and potentially malicious behavior. Our requirement is that a constant fraction of correct nodes remain able to achieve reliable communication. Existing solutions can only tolerate a fixed number of Byzantine failures if they adopt a worst-case placement scheme. Besides, if we assume a constant Byzantine ratio (each node has the same probability to be Byzantine), the probability to have a fatal placement approaches 1 when the number of nodes increases, and reliability guarantees collapse. In this paper, we propose the first broadcast protocol that overcomes these difficulties. First, the number of Byzantine failures that can be tolerated (if they adopt the worst-case placement) now increases with the number of nodes. Second, we are able to tolerate a constant Byzantine ratio, however large the grid may be. In other words, the grid becomes scalable. This result has important security applications in ultra-large networks, where each node has a given probability to misbehave.

Submitted 17 October, 2012; originally announced October 2012.

Comments: 17 pages

arXiv:1209.1358 [pdf, other]

doi 10.1007/978-3-642-33651-5_18

On Byzantine Broadcast in Loosely Connected Networks

Authors: Alexandre Maurer, Sébastien Tixeuil

Abstract: We consider the problem of reliably broadcasting information in a multihop asynchronous network that is subject to Byzantine failures. Most existing approaches give conditions for perfect reliable broadcast (all correct nodes deliver the authentic message and nothing else), but they require a highly connected network. An approach giving only probabilistic guarantees (correct nodes deliver the authentic message with high probability) was recently proposed for loosely connected networks, such as grids and tori. Yet, the proposed solution requires a specific initialization (that includes global knowledge) of each node, which may be difficult or impossible to guarantee in self-organizing networks - for instance, a wireless sensor network, especially if they are prone to Byzantine failures. In this paper, we propose a new protocol offering guarantees for loosely connected networks that does not require such global knowledge dependent initialization. In more detail, we give a methodology to determine whether a set of nodes will always deliver the authentic message, in any execution. Then, we give conditions for perfect reliable broadcast in a torus network. Finally, we provide experimental evaluation for our solution, and determine the number of randomly distributed Byzantine failures that can be tolerated, for a given correct broadcast probability.

Submitted 5 September, 2012; originally announced September 2012.

Comments: 14

arXiv:1209.0738 [pdf, ps, other]

Sparse coding for multitask and transfer learning

Authors: Andreas Maurer, Massimiliano Pontil, Bernardino Romera-Paredes

Abstract: We investigate the use of sparse coding and dictionary learning in the context of multitask and transfer learning. The central assumption of our learning method is that the task parameters are well approximated by sparse linear combinations of the atoms of a dictionary on a high or infinite dimensional space. This assumption, together with the large quantity of available data in the multitask and transfer learning settings, allows a principled choice of the dictionary. We provide bounds on the generalization error of this approach, for both settings. Numerical experiments on one synthetic and two real datasets show the advantage of our method over single task learning, a previous method based on orthogonal and dense representation of the tasks and a related method learning task grouping.

Submitted 16 June, 2014; v1 submitted 4 September, 2012; originally announced September 2012.

Comments: International Conference on Machine Learning 2013

MSC Class: 68Q32; 68T05; 97C30; 46N30

arXiv:1205.1595 [pdf, ps, other]

doi 10.3150/10-BEJ341

Thermodynamics and concentration

Authors: Andreas Maurer

Abstract: We show that the thermal subadditivity of entropy provides a common basis to derive a strong form of the bounded difference inequality and related results as well as more recent inequalities applicable to convex Lipschitz functions, random symmetric matrices, shortest travelling salesmen paths and weakly self-bounding functions. We also give two new concentration inequalities.

Submitted 8 May, 2012; originally announced May 2012.

Comments: Published at http://dx.doi.org/10.3150/10-BEJ341 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ341

Journal ref: Bernoulli 2012, Vol. 18, No. 2, 434-454

arXiv:1201.5824 [pdf, other]

Limiting Byzantine Influence in Multihop Asynchronous Networks

Authors: Alexandre Maurer, Sébastien Tixeuil

Abstract: We consider the problem of reliably broadcasting information in a multihop asynchronous network that is subject to Byzantine failures. That is, some nodes of the network can exhibit arbitrary (and potentially malicious) behavior. Existing solutions provide deterministic guarantees for broadcasting between all correct nodes, but require that the communication network is highly connected (typically, 2k + 1 connectivity is required, where k is the total number of Byzantine nodes in the network). In this paper, we investigate the possibility of Byzantine-tolerant reliable broadcast between most correct nodes in low-connectivity networks (typically, networks with constant connectivity). In more detail, we propose a new broadcast protocol that is specifically designed for low-connectivity networks. We provide sufficient conditions for correct nodes using our protocol to reliably communicate despite Byzantine participants. We present experimental results that show that our approach is especially effective in low-connectivity networks when Byzantine nodes are randomly distributed.

Submitted 27 January, 2012; originally announced January 2012.

Comments: 18 pages

arXiv:1201.1305 [pdf, other]

doi 10.1088/0004-637X/745/2/166

Dark matter powered stars: Constraints from the extragalactic background light

Authors: A. Maurer, M. Raue, T. Kneiske, D. Elsässer, P. H. Hauschildt, D. Horns

Abstract: The existence of predominantly cold non-baryonic dark matter is unambiguously demonstrated by several observations (e.g., structure formation, big bang nucleosynthesis, gravitational lensing, and rotational curves of spiral galaxies). A candidate well motivated by particle physics is a weakly interacting massive particle (WIMP). Self-annihilating WIMPs would affect the stellar evolution especially in the early universe. Stars powered by self-annihilating WIMP dark matter should possess different properties compared with standard stars. While a direct detection of such dark matter powered stars seems very challenging, their cumulative emission might leave an imprint in the diffuse metagalactic radiation fields, in particular in the mid-infrared part of the electromagnetic spectrum. In this work the possible contributions of dark matter powered stars (dark stars; DSs) to the extragalactic background light (EBL) are calculated. It is shown that existing data and limits of the EBL intensity can already be used to rule out some DS parameter sets.

Submitted 5 January, 2012; originally announced January 2012.

Comments: Accepted for publication in ApJ; 7 pages, 5 figures

arXiv:1110.0768 [pdf, ps, other]

The Petersen graph is the smallest 3-cop-win graph

Authors: Andrew Beveridge, Paolo Codenotti, Aaron Maurer, John McCauley, Silviya Valeva

Abstract: In the game of \emph{cops and robbers} on a graph $G = (V,E)$, $k$ cops try to catch a robber. On the cop turn, each cop may move to a neighboring vertex or remain in place. On the robber's turn, he moves similarly. The cops win if there is some time at which a cop is at the same vertex as the robber. Otherwise, the robber wins. The minimum number of cops required to catch the robber is called the \emph{cop number} of $G$, and is denoted $c(G)$. Let $m_k$ be the minimum order of a connected graph satisfying $c(G) \geq k$. Recently, Baird and Bonato determined via computer search that $m_3=10$ and that this value is attained uniquely by the Petersen graph. Herein, we give a self-contained mathematical proof of this result. Along the way, we give some characterizations of graphs with $c(G) >2$ and very high maximum degree.

Submitted 2 April, 2012; v1 submitted 4 October, 2011; originally announced October 2011.

Comments: 14 pages, 3 figures

arXiv:1108.3476 [pdf, ps, other]

Structured Sparsity and Generalization

Authors: Andreas Maurer, Massimiliano Pontil

Abstract: We present a data dependent generalization bound for a large class of regularized algorithms which implement structured sparsity constraints. The bound can be applied to standard squared-norm regularization, the Lasso, the group Lasso, some versions of the group Lasso with overlapping groups, multiple kernel learning and other regularization schemes. In all these cases competitive results are obtained. A novel feature of our bound is that it can be applied in an infinite dimensional setting such as the Lasso in a separable Hilbert space or multiple kernel learning with a countable number of kernels.

Submitted 2 September, 2011; v1 submitted 17 August, 2011; originally announced August 2011.

Journal ref: Journal of Machine Learning Research, 13:671-690, 2012

arXiv:0907.3740 [pdf, ps, other]

Empirical Bernstein Bounds and Sample Variance Penalization

Authors: Andreas Maurer, Massimiliano Pontil

Abstract: We give improved constants for data dependent and variance sensitive confidence bounds, called empirical Bernstein bounds, and extend these inequalities to hold uniformly over classes of functions whose growth function is polynomial in the sample size $n$. The bounds lead us to consider sample variance penalization, a novel learning method which takes into account the empirical variance of the loss function. We give conditions under which sample variance penalization is effective. In particular, we present a bound on the excess risk incurred by the method. Using this, we argue that there are situations in which the excess risk of our method is of order $1/n$, while the excess risk of empirical risk minimization is of order $1/\sqrt{n}$. We show some experimental results, which confirm the theory. Finally, we discuss the potential application of our results to sample compression schemes.

Submitted 21 July, 2009; originally announced July 2009.

Comments: 10 pages, 1 figure, Proc. Computational Learning Theory Conference (COLT 2009)

arXiv:0805.2362 [pdf, ps, other]

An optimization problem on the sphere

Authors: Andreas Maurer

Abstract: We prove existence and uniqueness of the minimizer for the average geodesic distance to the points of a geodesically convex set on the sphere. This implies a corresponding existence and uniqueness result for an optimal algorithm for halfspace learning, when data and target functions are drawn from the uniform distribution.

Submitted 15 May, 2008; originally announced May 2008.

arXiv:cs/0411099 [pdf, ps, other]

A Note on the PAC Bayesian Theorem

Authors: Andreas Maurer

Abstract: We prove general exponential moment inequalities for averages of [0,1]-valued iid random variables and use them to tighten the PAC Bayesian Theorem. The logarithmic dependence on the sample count in the numerator of the PAC Bayesian bound is halved.

Submitted 30 November, 2004; originally announced November 2004.

Comments: 9 pages

ACM Class: I.5.1

Showing 1–49 of 49 results for author: Maurer, A