-
Robust Semi-supervised Learning via $f$-Divergence and $α$-Rényi Divergence
Authors:
Gholamali Aminian,
Amirhossien Bagheri,
Mahyar JafariNodeh,
Radmehr Karimian,
Mohammad-Hossein Yassaee
Abstract:
This paper investigates a range of empirical risk functions and regularization methods suitable for self-training methods in semi-supervised learning. These approaches draw inspiration from various divergence measures, such as $f$-divergences and $α$-Rényi divergences. Building on the theoretical foundations of these divergences, we also provide insights that aid the understanding of our empirical risk functions and regularization techniques. In self-training methods for semi-supervised learning, such as pseudo-labeling and entropy minimization, there is an inherent mismatch between the true labels and the pseudo-labels (noisy pseudo-labels), and some of our empirical risk functions are robust to noisy pseudo-labels. Under some conditions, our empirical risk functions demonstrate better performance than traditional self-training methods.
Submitted 1 May, 2024;
originally announced May 2024.
-
Statistics of Random Binning Based on Tsallis Divergence
Authors:
Masoud Kavian,
Mohammad Mahdi Mojahedian,
Mohammad Hossein Yassaee,
Mahtab Mirmohseni,
Mohammad Reza Aref
Abstract:
Random binning is a widely used tool in information theory, with applications in various domains. In this paper, we focus on the output statistics of random binning (OSRB) using the Tsallis divergence $T_α$. Our investigation covers all values of $α$ in the range $(0,\infty)$. The proofs provided in this paper cover both the achievability and converse aspects. To accommodate the unbounded nature of $T_\infty$, we analyze the OSRB framework using the Rényi divergence criterion of order infinity, denoted as $D_\infty$. During our exploration of OSRB, we encounter a specific form of Rényi conditional entropy and study its properties. Additionally, we demonstrate the effectiveness of this framework in establishing achievability results for the wiretap channel, where Tsallis divergence serves as a security measure.
Submitted 13 February, 2024; v1 submitted 25 April, 2023;
originally announced April 2023.
-
f-divergences and their applications in lossy compression and bounding generalization error
Authors:
Saeed Masiha,
Amin Gohari,
Mohammad Hossein Yassaee
Abstract:
In this paper, we provide three applications for $f$-divergences: (i) we introduce Sanov's upper bound on the tail probability of the sum of independent random variables based on super-modular $f$-divergence and show that our generalized Sanov's bound strictly improves over the ordinary one; (ii) we consider the lossy compression problem, which studies the set of achievable rates for a given distortion and code length, extend the rate-distortion function using mutual $f$-information, and provide new and strictly better bounds on achievable rates in the finite blocklength regime using super-modular $f$-divergences; and (iii) we provide a connection between the generalization error of algorithms with bounded input/output mutual $f$-information and a generalized rate-distortion problem. This connection allows us to bound the generalization error of learning algorithms using lower bounds on the $f$-rate-distortion function. Our bound is based on a new lower bound on the rate-distortion function that (for some examples) strictly improves over previously best-known bounds.
Submitted 26 January, 2023; v1 submitted 21 June, 2022;
originally announced June 2022.
-
Sequential Estimation under Multiple Resources: a Bandit Point of View
Authors:
Alireza Masoumian,
Shayan Kiyani,
Mohammad Hossein Yassaee
Abstract:
The problem of Sequential Estimation under Multiple Resources (SEMR) is defined in a federated setting. SEMR can be considered as the intersection of statistical estimation and bandit theory. In this problem, an agent is confronted with $k$ resources to estimate a parameter $θ$. The agent should continuously learn the quality of the resources by wisely choosing among them and, at the end, propose an estimator based on the collected data. In this paper, we assume that the resources' distributions are Gaussian. The quality of the final estimator is evaluated by its mean squared error. Also, we restrict our class of estimators to unbiased estimators in order to define a meaningful notion of regret. The regret measures the performance of the agent by the variance of the final estimator in comparison to the optimal variance. We propose a lower bound to determine the fundamental limit of the setting even in the case that the distributions are not Gaussian. Also, we offer an order-optimal algorithm to achieve this lower bound.
Submitted 29 September, 2021;
originally announced September 2021.
-
Learning under Distribution Mismatch and Model Misspecification
Authors:
Saeed Masiha,
Amin Gohari,
Mohammad Hossein Yassaee,
Mohammad Reza Aref
Abstract:
We study learning algorithms when there is a mismatch between the distributions of the training and test datasets. The effect of this mismatch on the generalization error and model misspecification are quantified. Moreover, we provide a connection between the generalization error and rate-distortion theory, which allows one to utilize bounds from rate-distortion theory to derive new bounds on the generalization error and vice versa. In particular, the rate-distortion-based bound strictly improves over the earlier bound by Xu and Raginsky even when there is no mismatch. We also discuss how "auxiliary loss functions" can be utilized to obtain upper bounds on the generalization error.
Submitted 10 August, 2022; v1 submitted 10 February, 2021;
originally announced February 2021.
-
State Masking Over a Two-State Compound Channel
Authors:
Sadaf Salehkalaibar,
Mohammad Hossein Yassaee,
Vincent Y. F. Tan,
Mehrasa Ahmadipour
Abstract:
We consider fundamental limits for communicating over a compound channel when the state of the channel needs to be masked. Our model is closely related to an area of study known as covert communication, a setting in which the transmitter wishes to communicate to legitimate receiver(s) while ensuring that the communication is not detected by an adversary. The main contribution in our two-state masking setup is the establishment of bounds on the throughput-key region when the constraint that quantifies how much the states are masked is defined to be the total variation distance between the two channel-induced distributions. For the scenario in which the key length is infinite, we provide sufficient conditions for the bounds to coincide for the throughput, which follows the square-root law. Numerical examples, including that of a Gaussian channel, are provided to illustrate our results.
Submitted 3 December, 2020;
originally announced December 2020.
-
Covert Communication Over a Compound Channel
Authors:
Sadaf Salehkalaibar,
Mohammad Hossein Yassaee,
Vincent Y. F. Tan
Abstract:
In this paper, we consider fundamental communication limits over a compound channel. Covert communication in the information-theoretic context has been primarily concerned with fundamental limits when the transmitter wishes to communicate to legitimate receiver(s) while ensuring that the communication is not detected by an adversary. This paper, however, considers an alternative, and no less important, setting in which the object to be masked is the state of the compound channel. Such a communication model has applications in the prevention of malicious parties seeking to jam the communication signal when, for example, the signal-to-noise ratio of a wireless channel is found to be low. Our main contribution is the establishment of bounds on the throughput-key region when the covertness constraint is defined in terms of the total variation distance. In addition, for the scenario in which the key length is infinite, we provide a sufficient condition for when the bounds coincide for the scaling of the throughput, which follows the square-root law. Numerical examples, including that of a Gaussian channel, are provided to illustrate our results.
Submitted 16 June, 2019;
originally announced June 2019.
-
Almost exact analysis of soft covering lemma via large deviation
Authors:
Mohammad Hossein Yassaee
Abstract:
This paper investigates the soft covering lemma under both the relative entropy and the total variation distance as the measures of deviation. The exact order of the expected deviation of the random i.i.d. code for the soft covering problem is determined. The proof technique used in this paper differs significantly from previous techniques for deriving the exact exponent of the soft covering lemma. The achievability of the exact order follows from applying the change of measure trick (which has been broadly used in large deviations theory) to the known one-shot bounds in the literature. For the ensemble converse, some new inequalities of independent interest are derived and then the change of measure trick is applied again. The exact order of the total variation distance is similar to the exact order of the error probability; thus it adds another duality between channel coding and soft covering. Finally, the results of this paper are valid for any memoryless channel, not only channels with finite alphabets.
Submitted 21 February, 2019;
originally announced February 2019.
-
Sharp Bounds for Mutual Covering
Authors:
Jingbo Liu,
Mohammad H. Yassaee,
Sergio Verdú
Abstract:
A fundamental tool in network information theory is the covering lemma, which lower bounds the probability that there exists a pair of random variables, among a given number of independently generated candidates, falling within a given set. We use a weighted sum trick and Talagrand's concentration inequality to prove new mutual covering bounds. We identify two interesting applications: 1) When the probability of the set under the given joint distribution is bounded away from 0 and 1, the covering probability converges to 1 \emph{doubly} exponentially fast in the blocklength, which implies that the covering lemma does not induce penalties on the error exponents in the applications to coding theorems. 2) Using Hall's marriage lemma, we show that the maximum difference between the probability of the set under the joint distribution and the covering probability equals half the minimum total variation distance between the joint distribution and any distribution that can be simulated by selecting a pair from the candidates. Thus we use the mutual covering bound to derive the exact error exponent in the joint distribution simulation problem. In both applications, the determination of the exact exponential (or doubly exponential) behavior relies crucially on the sharp concentration inequality used in the proof of the mutual covering lemma.
Submitted 16 April, 2019; v1 submitted 1 January, 2019;
originally announced January 2019.
-
A Correlation Measure Based on Vector-Valued $L_p$-Norms
Authors:
Mohammad Mahdi Mojahedian,
Salman Beigi,
Amin Gohari,
Mohammad Hossein Yassaee,
Mohammad Reza Aref
Abstract:
In this paper, we introduce a new measure of correlation for bipartite quantum states. This measure depends on a parameter $α$, and is defined in terms of vector-valued $L_p$-norms. The measure is within a constant of the exponential of $α$-Rényi mutual information, and reduces to the trace norm (total variation distance) for $α=1$. We will prove some decoupling type theorems in terms of this measure of correlation, and present some applications in privacy amplification as well as in bounding the random coding exponents. In particular, we establish a bound on the secrecy exponent of the wiretap channel (under the total variation metric) in terms of the $α$-Rényi mutual information according to \emph{Csiszár's proposal}.
Submitted 21 May, 2018;
originally announced May 2018.
-
Simulation of a Channel with Another Channel
Authors:
Farzin Haddadpour,
Mohammad Hossein Yassaee,
Salman Beigi,
Amin Gohari,
Mohammad Reza Aref
Abstract:
In this paper, we study the problem of simulating a DMC channel from another DMC channel under an average-case and an exact model. We present several achievability and infeasibility results, with tight characterizations in special cases. In particular for the exact model, we fully characterize when a BSC channel can be simulated from a BEC channel when there is no shared randomness. We also provide infeasibility and achievability results for simulation of a binary channel from another binary channel in the case of no shared randomness. To do this, we use properties of Rényi capacity of a given order. We also introduce a notion of "channel diameter" which is shown to be additive and satisfy a data processing inequality.
Submitted 1 December, 2016; v1 submitted 25 May, 2013;
originally announced May 2013.
-
A Technique for Deriving One-Shot Achievability Results in Network Information Theory
Authors:
Mohammad Hossein Yassaee,
Mohammad Reza Aref,
Amin Gohari
Abstract:
This paper proposes a novel technique to prove a one-shot version of achievability results in network information theory. The technique is not based on covering and packing lemmas. In this technique, we use a stochastic encoder and decoder with a particular structure for coding that resembles both the ML and the joint-typicality coders. Although stochastic encoders and decoders do not usually enhance the capacity region, their use simplifies the analysis. The Jensen inequality lies at the heart of the error analysis, which enables us to deal with the expectation of many terms coming from stochastic encoders and decoders at once. The technique is illustrated via several examples: point-to-point channel coding, Gelfand-Pinsker, broadcast channel (Marton), Berger-Tung, Heegard-Berger/Kaspi, multiple description coding, and joint source-channel coding over a MAC. Most of our one-shot results are new. The asymptotic forms of these expressions are the same as those of the classical results. Our one-shot bounds, in conjunction with the multi-dimensional Berry-Esseen CLT, imply new results in the finite blocklength regime. In particular, applying the one-shot result for the memoryless broadcast channel in the asymptotic case, we get the entire region of Marton's inner bound without any need for time-sharing.
Submitted 4 March, 2013;
originally announced March 2013.
-
Non-Asymptotic Output Statistics of Random Binning and Its Applications
Authors:
Mohammad Hossein Yassaee,
Mohammad Reza Aref,
Amin Gohari
Abstract:
In this paper we develop a finite blocklength version of the Output Statistics of Random Binning (OSRB) framework. The framework is shown to be optimal in the point-to-point case. New second order regions for broadcast channel and wiretap channel with strong secrecy criterion are derived.
Submitted 4 March, 2013;
originally announced March 2013.
-
Secure Channel Simulation
Authors:
Amin Gohari,
Mohammad Hossein Yassaee,
Mohammad Reza Aref
Abstract:
In this paper, the Output Statistics of Random Binning (OSRB) framework is used to prove a new inner bound for the problem of secure channel simulation. Our results subsume some recent results on secure function computation. We also provide an achievability result for the problem of simultaneously simulating a channel and creating a shared secret key. A special case of this result generalizes the lower bound of Gohari and Anantharam on the source model to include constraints on the rates of the public discussion.
Submitted 15 July, 2012;
originally announced July 2012.
-
Channel simulation via interactive communications
Authors:
Mohammad Hossein Yassaee,
Amin Gohari,
Mohammad Reza Aref
Abstract:
In this paper, we study the problem of channel simulation via interactive communication, known as the coordination capacity, in a two-terminal network. We assume that two terminals observe i.i.d.\ copies of two random variables and would like to generate i.i.d.\ copies of two other random variables jointly distributed with the observed random variables. The terminals are provided with two-way communication links and shared common randomness, all at limited rates. Two special cases of this problem are the interactive function computation studied by Ma and Ishwar, and the tradeoff curve between one-way communication and shared randomness studied by Cuff. The latter work inspired Gohari and Anantharam to study the general problem of channel simulation via interactive communication stated above. However, only inner and outer bounds for the special case of no shared randomness were obtained in their work. In this paper, we settle this problem by providing an exact computable characterization of the multi-round problem. To show this, we employ the technique of "output statistics of random binning" that has been recently developed by the authors.
Submitted 18 April, 2014; v1 submitted 14 March, 2012;
originally announced March 2012.
-
Coordination via a relay
Authors:
Farzin Haddadpour,
Mohammad Hossein Yassaee,
Amin Gohari,
Mohammad Reza Aref
Abstract:
In this paper, we study the problem of coordinating two nodes which can only exchange information via a relay at limited rates. The nodes are allowed to do a two-round interactive two-way communication with the relay, after which they should be able to generate i.i.d. copies of two random variables with a given joint distribution within a vanishing total variation distance. We prove inner and outer bounds on the coordination capacity region for this problem. Our inner bound is proved using the technique of "output statistics of random binning" that has recently been developed by Yassaee, et al.
Submitted 4 March, 2012;
originally announced March 2012.
-
Achievability proof via output statistics of random binning
Authors:
Mohammad Hossein Yassaee,
Mohammad Reza Aref,
Amin Gohari
Abstract:
This paper introduces a new and ubiquitous framework for establishing achievability results in \emph{network information theory} (NIT) problems. The framework uses random binning arguments and is based on a duality between channel and source coding problems. Further, the framework uses pmf approximation arguments instead of counting and typicality. This allows for proving coordination and \emph{strong} secrecy problems where certain statistical conditions on the distribution of random variables need to be satisfied. These statistical conditions include independence between messages and the eavesdropper's observations in secrecy problems and closeness to a certain distribution (usually, an i.i.d. distribution) in coordination problems. One important feature of the framework is that it enables one to add an eavesdropper and obtain a result on the secrecy rates "for free."
We make a case for the generality of the framework by studying examples in a variety of settings, including channel coding, lossy source coding, joint source-channel coding, coordination, strong secrecy, feedback and relaying. In particular, by investigating the framework for the lossy source coding problem over a broadcast channel, it is shown that the new framework provides a simple alternative scheme to the \emph{hybrid} coding scheme. Also, new results on the secrecy rate region (under the strong secrecy criterion) of the wiretap broadcast channel and the wiretap relay channel are derived. In a set of accompanying papers, we have shown the usefulness of the framework to establish achievability results for coordination problems including interactive channel simulation, coordination via relay and channel simulation via another channel.
Submitted 21 August, 2014; v1 submitted 4 March, 2012;
originally announced March 2012.
-
Slepian-Wolf Coding Over Cooperative Relay Networks
Authors:
Mohammad Hossein Yassaee,
Mohammad Reza Aref
Abstract:
This paper deals with the problem of multicasting a set of discrete memoryless correlated sources (DMCS) over a cooperative relay network. Necessary conditions with cut-set interpretation are presented. A \emph{Joint source-Wyner-Ziv encoding/sliding window decoding} scheme is proposed, in which decoding at each receiver is done with respect to an ordered partition of other nodes. For each ordered partition, a set of feasibility constraints is derived. Then, utilizing the sub-modular property of the entropy function and a novel geometrical approach, the results of different ordered partitions are consolidated, which leads to sufficient conditions for our problem. The proposed scheme achieves operational separation between source coding and channel coding. It is shown that the sufficient conditions are indeed necessary conditions in two special cooperative networks, namely, the Aref network and the finite-field deterministic network. Also, in Gaussian cooperative networks, it is shown that reliable transmission of all DMCS whose Slepian-Wolf region intersects the cut-set bound region within a constant number of bits is feasible. In particular, all results of the paper are specialized to obtain an achievable rate region for cooperative relay networks, which include relay networks and two-way relay networks.
Submitted 6 December, 2010; v1 submitted 19 October, 2009;
originally announced October 2009.
-
Slepian-Wolf Coding over Cooperative Networks
Authors:
Mohammad Hossein Yassaee,
Mohammad Reza Aref
Abstract:
We present sufficient conditions for multicasting a set of correlated sources over cooperative networks. We propose a joint source-Wyner-Ziv encoding/sliding-window decoding scheme, in which each receiver considers an ordered partition of other nodes. Subject to this scheme, we obtain a set of feasibility constraints for each ordered partition. We consolidate the results of different ordered partitions by utilizing a result from a geometrical approach to obtain the sufficient conditions. We observe that these sufficient conditions are indeed necessary conditions for Aref networks. As a consequence of the main result, we obtain an achievable rate region for networks with multicast demands. Also, we deduce an achievability result for two-way relay networks, in which two nodes want to communicate over a relay network.
Submitted 15 January, 2009;
originally announced January 2009.